# Guide to Fine-Tuning a Hugging Face Model (BERT for Text Classification)

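This notebook fine-tunes `bert-base-uncased` for binary sentiment classification on the IMDb movie review dataset, using the Hugging Face `transformers` library. To keep the run short, it trains on a 1,000-review subset and evaluates on 200 reviews.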

##  Setup

In [1]:
%pip install transformers datasets torch pandas scikit-learn



In [2]:
import torch
from transformers import (
    BertTokenizer,
    BertForSequenceClassification,
    TrainingArguments,
    Trainer,
)
from datasets import load_dataset
from sklearn.metrics import accuracy_score, f1_score
import numpy as np

##  Step 1: Load Dataset (IMDb Reviews)

In [3]:
dataset = load_dataset("imdb")
print(dataset["train"][0])

{'text': 'I rented I AM CURIOUS-YELLOW from my video store because of all the controversy that surrounded it when it was first released in 1967. I also heard that at first it was seized by U.S. customs if it ever tried to enter this country, therefore being a fan of films considered "controversial" I really had to see this for myself.<br /><br />The plot is centered around a young Swedish drama student named Lena who wants to learn everything she can about life. In particular she wants to focus her attentions to making some sort of documentary on what the average Swede thought about certain political issues such as the Vietnam War and race issues in the United States. In between asking politicians and ordinary denizens of Stockholm about their opinions on politics, she has sex with her drama teacher, classmates, and married men.<br /><br />What kills me about I AM CURIOUS-YELLOW is that 40 years ago, this was considered pornographic. Really, the sex and nudity scenes are few and far be

## Step 2: Tokenize the Data

In [4]:
tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")

def tokenize_function(examples):
    return tokenizer(examples["text"], padding="max_length", truncation=True, max_length=128)

tokenized_datasets = dataset.map(tokenize_function, batched=True)
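
To make the effect of `padding="max_length"` and `truncation=True` concrete, here is a minimal pure-Python sketch of the idea. It is not the tokenizer's implementation: `pad_or_truncate` is a hypothetical helper, the token ids are illustrative, and the real tokenizer also keeps the closing `[SEP]` token when it truncates.

```python
def pad_or_truncate(token_ids, max_length=128, pad_id=0):
    """Return (input_ids, attention_mask), each exactly max_length long."""
    # Truncation: keep at most max_length tokens (simplified; see note above).
    ids = list(token_ids[:max_length])
    # Attention mask: 1 marks a real token, 0 marks padding.
    mask = [1] * len(ids)
    # Padding: fill the remaining positions with pad_id.
    n_pad = max_length - len(ids)
    return ids + [pad_id] * n_pad, mask + [0] * n_pad


# Illustrative ids for a short review, padded to a length of 6.
short_ids, short_mask = pad_or_truncate([101, 2204, 3185, 102], max_length=6)
print(short_ids)   # [101, 2204, 3185, 102, 0, 0]
print(short_mask)  # [1, 1, 1, 1, 0, 0]
```

Every review therefore ends up as a fixed-length row, which lets the `Trainer` stack examples into batches without a custom collator. Longer reviews lose everything past the limit.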

##  Step 3: Split Dataset

In [5]:
train_dataset = tokenized_datasets["train"].shuffle(seed=42).select(range(1000))
eval_dataset = tokenized_datasets["test"].shuffle(seed=42).select(range(200))
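
Shuffling before `select` matters whenever a split is stored in label order, because taking the first N rows would then return a single class. The toy sketch below illustrates this with made-up labels and the standard library only; it does not touch the real dataset.

```python
import random
from collections import Counter

# Toy split ordered by label: 50 negatives (0) followed by 50 positives (1).
labels = [0] * 50 + [1] * 50

# Taking the first 20 rows without shuffling yields one class only.
unshuffled_counts = Counter(labels[:20])
print(unshuffled_counts)  # Counter({0: 20})

# Shuffling with a fixed seed first gives a reproducible, mixed subset.
shuffled = labels[:]
random.Random(42).shuffle(shuffled)
shuffled_counts = Counter(shuffled[:20])
print(shuffled_counts)
```

On the real subset, `Counter(train_dataset["label"])` gives the same kind of check.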

##  Step 4: Load Pre-trained BERT

In [6]:
model = BertForSequenceClassification.from_pretrained(
    "bert-base-uncased",
    num_labels=2
)

Some weights of BertForSequenceClassification were not initialized from the model checkpoint at bert-base-uncased and are newly initialized: ['classifier.bias', 'classifier.weight']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
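
The warning above is expected. The pre-trained checkpoint contains the BERT encoder but no classification head, so `classifier.weight` and `classifier.bias` start from fresh values and are learned during fine-tuning. The NumPy sketch below shows roughly what such a head computes: a single linear layer that maps the 768-dimensional pooled output to 2 logits. The random values and the initialization scale are assumptions for illustration, not the library's code.

```python
import numpy as np

hidden_size = 768  # hidden width of bert-base
num_labels = 2     # negative / positive

rng = np.random.default_rng(0)

# Assumed initialization for illustration: small random weights, zero bias.
classifier_weight = rng.normal(0.0, 0.02, size=(num_labels, hidden_size))
classifier_bias = np.zeros(num_labels)

# Random stand-in for the pooled representation of a batch of 4 reviews.
pooled_output = rng.normal(size=(4, hidden_size))

# One logit per class for every review in the batch.
logits = pooled_output @ classifier_weight.T + classifier_bias
print(logits.shape)  # (4, 2)
```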


##  Step 5: Define Training Arguments

In [11]:
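training_args = TrainingArguments(
    output_dir="./bert_imdb_finetuned",   # checkpoints are written here
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    num_train_epochs=5,
    # Newer transformers releases rename this argument to `eval_strategy`.
    evaluation_strategy="epoch",          # evaluate at the end of every epoch
    save_strategy="epoch",                # must match the evaluation strategy below
    logging_dir="./logs",
    logging_steps=10,
    load_best_model_at_end=True,          # reload the best checkpoint after training
)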
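
These settings determine how many optimizer steps the run takes. A quick arithmetic sketch, assuming a single device and no gradient accumulation (`total_training_steps` is a hypothetical helper, not a `transformers` function):

```python
import math


def total_training_steps(num_examples, batch_size, num_epochs):
    """Optimizer steps for a run on one device without gradient accumulation."""
    # The last batch of an epoch may be smaller, hence the ceiling.
    steps_per_epoch = math.ceil(num_examples / batch_size)
    return steps_per_epoch * num_epochs


# The 1,000-review subset from Step 3 with the arguments above.
subset_steps = total_training_steps(1000, batch_size=8, num_epochs=5)
print(subset_steps)  # 625

# The full 25,000-review training split with the same arguments.
full_steps = total_training_steps(25000, batch_size=8, num_epochs=5)
print(full_steps)  # 15625
```

The first figure matches the `global_step=625` reported by `trainer.train()` in Step 7.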

##  Step 6: Define Evaluation Metrics

In [13]:
def compute_metrics(eval_pred):
    logits, labels = eval_pred
    predictions = np.argmax(logits, axis=-1)
    return {
        "accuracy": accuracy_score(labels, predictions),
        "f1": f1_score(labels, predictions, average="weighted"),
    }
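
To show what `average="weighted"` means, the sketch below recomputes the two metrics by hand on four made-up examples, using plain Python instead of scikit-learn. `weighted_f1` is a hypothetical helper written for this illustration: each class's F1 score is weighted by the share of true labels belonging to that class.

```python
from collections import Counter


def weighted_f1(labels, predictions):
    """Per-class F1, averaged with weights proportional to class support."""
    support = Counter(labels)
    score = 0.0
    for cls, n_true in support.items():
        tp = sum(1 for y, p in zip(labels, predictions) if y == cls and p == cls)
        n_pred = sum(1 for p in predictions if p == cls)
        precision = tp / n_pred if n_pred else 0.0
        recall = tp / n_true
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        score += f1 * n_true / len(labels)
    return score


# Made-up logits for four reviews; argmax over each row gives the prediction.
toy_logits = [[2.0, -1.0], [0.5, 0.1], [-0.3, 0.9], [-1.2, 1.5]]
toy_labels = [0, 0, 0, 1]
toy_preds = [max(range(len(row)), key=row.__getitem__) for row in toy_logits]

toy_accuracy = sum(y == p for y, p in zip(toy_labels, toy_preds)) / len(toy_labels)
print(toy_preds)     # [0, 0, 1, 1]
print(toy_accuracy)  # 0.75
print(round(weighted_f1(toy_labels, toy_preds), 4))  # 0.7667
```

On a balanced evaluation set such as the one used here, weighted F1 and accuracy tend to be close, which is consistent with the training log in Step 7.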

##  Step 7: Fine-tune BERT

In [14]:
trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=train_dataset,
    eval_dataset=eval_dataset,
    compute_metrics=compute_metrics,
)

trainer.train()

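| Epoch | Training Loss | Validation Loss | Accuracy | F1 |
|-------|---------------|-----------------|----------|----------|
| 1 | 0.4443 | 0.588455 | 0.8 | 0.798796 |
| 2 | 0.2535 | 0.953213 | 0.78 | 0.776326 |
| 3 | 0.2383 | 0.893075 | 0.825 | 0.824538 |
| 4 | 0.2005 | 1.15258 | 0.815 | 0.81403 |
| 5 | 0.0007 | 1.00084 | 0.835 | 0.834864 |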


TrainOutput(global_step=625, training_loss=0.1878538741093129, metrics={'train_runtime': 1108.6003, 'train_samples_per_second': 4.51, 'train_steps_per_second': 0.564, 'total_flos': 328888819200000.0, 'train_loss': 0.1878538741093129, 'epoch': 5.0})
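
In this log the training loss falls to 0.0007 while the validation loss rises from about 0.59 to about 1.0, a pattern usually read as overfitting on the small 1,000-review subset. The "best" epoch therefore depends on the metric used to compare checkpoints. The sketch below copies the logged values into a list and picks the best epoch under two criteria; the dictionary keys are illustrative names.

```python
# Values copied from the training log above.
history = [
    {"epoch": 1, "eval_loss": 0.588455, "eval_accuracy": 0.8},
    {"epoch": 2, "eval_loss": 0.953213, "eval_accuracy": 0.78},
    {"epoch": 3, "eval_loss": 0.893075, "eval_accuracy": 0.825},
    {"epoch": 4, "eval_loss": 1.15258, "eval_accuracy": 0.815},
    {"epoch": 5, "eval_loss": 1.00084, "eval_accuracy": 0.835},
]

# Lower is better for loss, higher is better for accuracy.
best_by_loss = min(history, key=lambda row: row["eval_loss"])
best_by_accuracy = max(history, key=lambda row: row["eval_accuracy"])

print(best_by_loss["epoch"])      # 1
print(best_by_accuracy["epoch"])  # 5
```

Step 5 sets `load_best_model_at_end=True` without `metric_for_best_model`. As far as I understand the defaults, `Trainer` then compares checkpoints by validation loss, so the model kept after training is probably the epoch 1 checkpoint rather than the most accurate one. Passing `metric_for_best_model="accuracy"` to `TrainingArguments` is one way to select by accuracy instead.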

##  Step 8: Save & Reload Model

In [15]:
# Save model and tokenizer (using .bin instead of .safetensors)
model.save_pretrained("./bert_imdb_finetuned", safe_serialization=False)
tokenizer.save_pretrained("./bert_imdb_finetuned")

# Reload
model = BertForSequenceClassification.from_pretrained("./bert_imdb_finetuned")
tokenizer = BertTokenizer.from_pretrained("./bert_imdb_finetuned")


##  Step 9: Make Predictions

In [16]:
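def predict_sentiment(text):
    inputs = tokenizer(text, return_tensors="pt", padding=True, truncation=True)
    model.eval()  # make sure dropout is disabled for inference
    with torch.no_grad():  # gradients are not needed for prediction
        outputs = model(**inputs)
    # Take the argmax over the class dimension; in IMDb, 1 = positive, 0 = negative.
    predicted_class = torch.argmax(outputs.logits, dim=-1).item()
    return "Positive" if predicted_class == 1 else "Negative"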


print(predict_sentiment("Worst film ever."))
print(predict_sentiment(" Good movie ."))
print(predict_sentiment("loved it."))
print(predict_sentiment("Boring."))

Negative
Positive
Positive
Negative
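
The function above returns only a label. To see how confident the model is, the logits can be converted to probabilities with a softmax. The sketch below uses plain Python and made-up logits; with the real model, `torch.softmax(outputs.logits, dim=-1)` performs the same conversion.

```python
import math


def softmax(logits):
    """Convert a list of logits into probabilities that sum to 1."""
    # Subtracting the maximum keeps exp() numerically stable.
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical logits for one review: [negative, positive].
probs = softmax([-1.2, 2.3])
label = "Positive" if probs[1] > probs[0] else "Negative"
print(label, round(max(probs), 3))
```

A probability close to 0.5 signals an uncertain prediction, which the bare label hides.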


##  Task
- Train on the full IMDb training split (25,000 reviews).
- Try other Transformer models.
- Try other datasets.
