### In this section, I fine-tune a pretrained BERT model on the IMDB dataset for sentiment analysis, evaluate it with accuracy and F1, and run inference using the Hugging Face Transformers library.

In [None]:
# pip install transformers datasets evaluate accelerate scikit-learn
import torch
from datasets import load_dataset
from transformers import (AutoTokenizer, AutoModelForSequenceClassification,
                          DataCollatorWithPadding, Trainer, TrainingArguments)
import evaluate
import numpy as np

W1019 16:44:52.873000 440 Lib\site-packages\torch\distributed\elastic\multiprocessing\redirects.py:29] NOTE: Redirects are currently not supported in Windows or MacOs.


In [1]:
# --- GPU / mixed-precision setup ---
import torch  # repeated here so this cell also runs on its own

use_cuda = torch.cuda.is_available()
use_bf16 = use_cuda and torch.cuda.is_bf16_supported()     # Ampere+ GPUs
use_fp16 = use_cuda and not use_bf16                       # fall back to fp16 if bf16 is not available
print(f"CUDA available: {use_cuda} | bf16: {use_bf16} | fp16: {use_fp16} | GPUs: {torch.cuda.device_count()}")

In [2]:
MODEL_NAME = "bert-base-uncased"
NUM_LABELS = 2  # binary classification

# 1) Data: IMDB (binary sentiment)
dataset = load_dataset("imdb")  # splits: train/test

In [3]:
dataset.column_names

{'train': ['text', 'label'],
 'test': ['text', 'label'],
 'unsupervised': ['text', 'label']}

In [4]:
dataset["test"][:3]

{'text': ['I love sci-fi and am willing to put up with a lot. Sci-fi movies/TV are usually underfunded, under-appreciated and misunderstood. I tried to like this, I really did, but it is to good TV sci-fi as Babylon 5 is to Star Trek (the original). Silly prosthetics, cheap cardboard sets, stilted dialogues, CG that doesn\'t match the background, and painfully one-dimensional characters cannot be overcome with a \'sci-fi\' setting. (I\'m sure there are those of you out there who think Babylon 5 is good sci-fi TV. It\'s not. It\'s clichéd and uninspiring.) While US viewers might like emotion and character development, sci-fi is a genre that does not take itself seriously (cf. Star Trek). It may treat important issues, yet not as a serious philosophy. It\'s really difficult to care about the characters here as they are not simply foolish, just missing a spark of life. Their actions and reactions are wooden and predictable, often painful to watch. The makers of Earth KNOW it\'s rubbish as

In [5]:
dataset["train"][50:53]

{'text': ['I saw this film opening weekend in Australia, anticipating with an excellent cast of Ledger, Edgerton, Bloom, Watts and Rush that the definitive story of Ned Kelly would unfold before me. Unfortunately, despite an outstanding performance by Heath Ledger in the lead role, the plot was paper thin....which doesn\'t inspire me to read "Our Sunshine". There were some other plus points, the support acting from Edgerton in particular, assured direction from Jordan (confirming his talent on show in Buffalo Soldiers as well), and production design that gave a real feel of harshness to the Australian bush, much as the Irish immigrants of the early 19th century must have seen it. But I can\'t help feeling that another opportunity has been missed to tell the real story of an Australian folk hero (or was he?)....in what I suspect is a concession to Hollywood and selling the picture in the US. Oh well, at least Jordan and the producers didn\'t agree to lose the beards just to please Unive

In [6]:
dataset["unsupervised"][:3]

{'text': ['This is just a precious little diamond. The play, the script are excellent. I cant compare this movie with anything else, maybe except the movie "Leon" wonderfully played by Jean Reno and Natalie Portman. But... What can I say about this one? This is the best movie Anne Parillaud has ever played in (See please "Frankie Starlight", she\'s speaking English there) to see what I mean. The story of young punk girl Nikita, taken into the depraved world of the secret government forces has been exceptionally over used by Americans. Never mind the "Point of no return" and especially the "La femme Nikita" TV series. They cannot compare the original believe me! Trash these videos. Buy this one, do not rent it, BUY it. BTW beware of the subtitles of the LA company which "translate" the US release. What a disgrace! If you cant understand French, get a dubbed version. But you\'ll regret later :)',
  'When I say this is my favourite film of all time, that comment is not to be taken lightly

##### Supervised fine-tuning (SFT) needs labeled data, so the unlabeled `unsupervised` split is removed.

In [7]:
del dataset["unsupervised"]

In [8]:
# 2) Tokenizer
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)

In [9]:
def preprocess(batch):
    return tokenizer(batch["text"], truncation=True)

##### Tokenization splits the text into pieces from the tokenizer's vocabulary so that the model can process it.
##### Models like BERT or GPT do not read English words the way humans do. They operate on numbers, so the text must first be converted into token IDs, which are integer indices into the vocabulary. With `truncation=True`, reviews longer than the model's maximum input length (512 tokens for BERT) are cut to fit.
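To make the text-to-IDs step concrete, here is a minimal sketch of WordPiece-style tokenization in plain Python. The vocabulary `TOY_VOCAB` and the helpers `wordpiece` and `toy_encode` are hypothetical and exist only for illustration; the real `bert-base-uncased` tokenizer has a much larger vocabulary and more elaborate normalization, so its actual IDs will differ.

```python
# Toy vocabulary (hypothetical; the real BERT vocabulary has tens of thousands of entries).
TOY_VOCAB = {"[PAD]": 0, "[UNK]": 1, "[CLS]": 2, "[SEP]": 3,
             "this": 4, "movie": 5, "was": 6, "wonder": 7, "##ful": 8, "!": 9}

def wordpiece(word, vocab):
    """Greedy longest-match-first split of one word into sub-word pieces."""
    pieces, start = [], 0
    while start < len(word):
        end, piece = len(word), None
        while end > start:
            candidate = word[start:end]
            if start > 0:
                candidate = "##" + candidate   # continuation pieces carry a ## prefix
            if candidate in vocab:
                piece = candidate
                break
            end -= 1
        if piece is None:                      # no piece matched: whole word is unknown
            return ["[UNK]"]
        pieces.append(piece)
        start = end
    return pieces

def toy_encode(text, vocab=TOY_VOCAB):
    """Lowercase, split on whitespace, apply WordPiece, add [CLS]/[SEP], map to IDs."""
    words = text.lower().replace("!", " !").split()
    tokens = ["[CLS]"]
    for w in words:
        tokens.extend(wordpiece(w, vocab))
    tokens.append("[SEP]")
    return tokens, [vocab[t] for t in tokens]

tokens, input_ids = toy_encode("This movie was wonderful!")
print(tokens)     # ['[CLS]', 'this', 'movie', 'was', 'wonder', '##ful', '!', '[SEP]']
print(input_ids)  # [2, 4, 5, 6, 7, 8, 9, 3]
```

The model only ever sees the integer list; the special `[CLS]` and `[SEP]` tokens mark the start and end of the sequence.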

In [10]:
tokenized = dataset.map(preprocess, batched=True)


In [11]:
data_collator = DataCollatorWithPadding(tokenizer=tokenizer)
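
Because `preprocess` truncates but does not pad, the collator pads each batch to the length of its longest sequence at batch time. The sketch below imitates that idea in plain Python; `pad_batch` is a hypothetical helper, not the library's implementation, and the token IDs other than 101 (`[CLS]`), 102 (`[SEP]`), and 0 (`[PAD]`) are arbitrary placeholders.

```python
def pad_batch(features, pad_token_id=0):
    """Pad every sequence to the longest one in this batch only (dynamic padding)."""
    max_len = max(len(f["input_ids"]) for f in features)
    batch = {"input_ids": [], "attention_mask": []}
    for f in features:
        ids = f["input_ids"]
        n_pad = max_len - len(ids)
        batch["input_ids"].append(ids + [pad_token_id] * n_pad)
        # 1 marks real tokens, 0 marks padding the model should ignore
        batch["attention_mask"].append([1] * len(ids) + [0] * n_pad)
    return batch

features = [
    {"input_ids": [101, 7, 8, 102]},             # short review
    {"input_ids": [101, 7, 8, 9, 10, 102]},      # longer review
]
batch = pad_batch(features)
print(batch["input_ids"])
print(batch["attention_mask"])
```

Padding per batch rather than to a fixed global length keeps batches of short reviews small, which usually saves compute.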

In [12]:
# 3) Model
model = AutoModelForSequenceClassification.from_pretrained(MODEL_NAME, num_labels=NUM_LABELS)

Some weights of BertForSequenceClassification were not initialized from the model checkpoint at bert-base-uncased and are newly initialized: ['classifier.bias', 'classifier.weight']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.


In [None]:
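# Optional: gradient checkpointing recomputes activations in the backward pass,
# trading slower steps for lower memory use (helps fit larger batches)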
if use_cuda:
    model.gradient_checkpointing_enable()

In [13]:
# 4) Metrics
accuracy = evaluate.load("accuracy")
f1 = evaluate.load("f1")

In [14]:
def compute_metrics(eval_pred):
    logits, labels = eval_pred
    preds = np.argmax(logits, axis=-1)
    return {
        "accuracy": accuracy.compute(predictions=preds, references=labels)["accuracy"],
        "f1": f1.compute(predictions=preds, references=labels, average="weighted")["f1"]
    }
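
As a sanity check on what `compute_metrics` reports, the same quantities can be computed by hand. This is a sketch in plain Python on made-up logits and labels (both hypothetical); weighted F1 here means the per-class F1 scores averaged with weights proportional to each class's number of true examples.

```python
from collections import Counter

def argmax_rows(logits):
    """Index of the largest logit in each row, i.e. the predicted class."""
    return [max(range(len(row)), key=row.__getitem__) for row in logits]

def f1_for_class(preds, labels, cls):
    tp = sum(p == cls and y == cls for p, y in zip(preds, labels))
    fp = sum(p == cls and y != cls for p, y in zip(preds, labels))
    fn = sum(p != cls and y == cls for p, y in zip(preds, labels))
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0

def accuracy_and_weighted_f1(logits, labels):
    preds = argmax_rows(logits)
    acc = sum(p == y for p, y in zip(preds, labels)) / len(labels)
    support = Counter(labels)  # number of true examples per class
    weighted = sum(f1_for_class(preds, labels, c) * n for c, n in support.items())
    return {"accuracy": acc, "f1": weighted / len(labels)}

toy_logits = [[2.0, -1.0], [0.3, 0.9], [1.5, 0.2], [-0.5, 0.4]]  # made-up values
toy_labels = [0, 1, 1, 1]
result = accuracy_and_weighted_f1(toy_logits, toy_labels)
print(result)  # accuracy 0.75, weighted F1 about 0.767
```

On a balanced binary dataset such as IMDB, weighted F1 and accuracy tend to be close; they diverge more when classes are imbalanced.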

In [None]:
# 5) Training config (GPU-aware)
args = TrainingArguments(
    output_dir="bert-imdb",
    eval_strategy="epoch",
    save_strategy="epoch",
    learning_rate=2e-5,
    per_device_train_batch_size=16,   # adjust up if you have more GPU memory
    per_device_eval_batch_size=32,
    num_train_epochs=3,
    weight_decay=0.01,
    warmup_ratio=0.1,
    logging_steps=50,
    load_best_model_at_end=True,
    metric_for_best_model="f1",
    bf16=use_bf16,                    # mixed precision on Ampere+ (preferred)
    fp16=use_fp16,                    # else use fp16
    dataloader_pin_memory=True,
    optim="adamw_torch",              # standard PyTorch AdamW; "adamw_torch_fused" selects the fused variant
    torch_compile=use_cuda and torch.__version__.startswith("2."),  # torch.compile needs PyTorch 2.x
)
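
To see what `warmup_ratio=0.1` means in practice, the step counts and learning-rate curve can be worked out by hand. The sketch below assumes a single device, no gradient accumulation, the 25,000-example IMDB train split, and the Trainer's default linear schedule; `linear_schedule_lr` is a hypothetical helper written for illustration, not the library's scheduler.

```python
import math

def linear_schedule_lr(step, total_steps, warmup_steps, peak_lr):
    """Linear warmup from 0 to peak_lr, then linear decay back to 0."""
    if step < warmup_steps:
        return peak_lr * step / max(1, warmup_steps)
    remaining = (total_steps - step) / max(1, total_steps - warmup_steps)
    return peak_lr * max(0.0, remaining)

num_examples = 25_000      # IMDB train split
batch_size = 16            # per_device_train_batch_size, one device assumed
epochs = 3
peak_lr = 2e-5

steps_per_epoch = math.ceil(num_examples / batch_size)   # 1563
total_steps = steps_per_epoch * epochs                   # 4689
warmup_steps = math.ceil(0.1 * total_steps)              # 469

for step in (0, warmup_steps // 2, warmup_steps, total_steps // 2, total_steps):
    print(step, linear_schedule_lr(step, total_steps, warmup_steps, peak_lr))
```

With more GPUs or gradient accumulation, the number of optimizer steps shrinks accordingly, and so does the warmup length.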

In [None]:
# 6) Trainer (Trainer uses GPU automatically if available)
trainer = Trainer(
    model=model,
    args=args,
    train_dataset=tokenized["train"],
    eval_dataset=tokenized["test"],   # note: the test split doubles as validation here, so the best-checkpoint score is optimistic
    processing_class=tokenizer,
    data_collator=data_collator,
    compute_metrics=compute_metrics,
)

In [None]:
trainer.train()



In [None]:
# 7) Save + quick inference
trainer.save_model("bert-imdb/best")
tokenizer.save_pretrained("bert-imdb/best")

In [None]:
from transformers import pipeline
# device_map="auto" will place the model on GPU if available for inference
clf = pipeline("text-classification", model="bert-imdb/best", tokenizer="bert-imdb/best", device_map="auto")
print(clf("This movie was absolutely wonderful!"))
print(clf("Terrible plot and wooden acting."))
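
Each pipeline call returns a label and a score. For a two-class model like this one, the score is the softmax probability of the winning class. The sketch below reproduces that post-processing in plain Python on made-up logits; `to_prediction` and the `ID2LABEL` names are hypothetical. Since no `id2label` mapping was passed when the model was created, the pipeline above will most likely print generic names such as `LABEL_0` and `LABEL_1`, where IMDB uses 0 for negative and 1 for positive.

```python
import math

# Hypothetical readable names; IMDB convention is 0 = negative, 1 = positive.
ID2LABEL = {0: "NEGATIVE", 1: "POSITIVE"}

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def to_prediction(logits, id2label=ID2LABEL):
    """Turn raw class logits into a pipeline-style {'label', 'score'} dict."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return {"label": id2label[best], "score": probs[best]}

pred = to_prediction([-1.2, 2.3])   # made-up logits for a positive review
print(pred)
```

Passing `id2label` and `label2id` to `from_pretrained` before training is one way to get readable labels directly from the pipeline.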