### Step-by-Step Guide to Fine-Tuning a BERT Model for Text Classification
- We will fine-tune a BERT model (Bidirectional Encoder Representations from Transformers) for a binary sentiment classification task (positive/negative reviews) using the IMDB movie reviews dataset.

### Step 1: Install Required Libraries
- First, install the necessary Python libraries:

- `transformers` → Contains pre-trained models like BERT.

- `datasets` → Provides easy access to datasets (e.g., IMDB dataset).

- `torch` → PyTorch, required for model training.

- `scikit-learn` → Used for evaluation metrics (imported as `sklearn`).

In [1]:
! pip install transformers datasets torch scikit-learn -q
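Before moving on, it can help to confirm that the packages are importable in the current environment. The following is a minimal sketch using only the standard library; the helper name `find_missing_modules` is hypothetical and not part of any of the libraries above.

```python
import importlib.util

def find_missing_modules(module_names):
    """Return the names in module_names that cannot be found for import."""
    return [name for name in module_names
            if importlib.util.find_spec(name) is None]

# Import names used in this notebook (scikit-learn is imported as "sklearn").
required = ["transformers", "datasets", "torch", "sklearn"]
missing = find_missing_modules(required)
print("Missing modules:", missing or "none")
```

If the list is not empty, re-run the install cell for the missing packages before continuing.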

In [2]:
import os
os.environ["WANDB_DISABLED"] = "true"

### Step 2: Import Necessary Libraries


In [3]:
# Hugging Face Libraries
from transformers import AutoModelForSequenceClassification, AutoTokenizer, Trainer, TrainingArguments
from datasets import load_dataset
from huggingface_hub import login
import torch

# Model Evaluation Libraries
from sklearn.metrics import accuracy_score, precision_recall_fscore_support

# Check if GPU is available
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
print("Using device:", device)


Using device: cpu


- `AutoModelForSequenceClassification` → Loads a BERT model for text classification.

- `AutoTokenizer` → Converts text into a format suitable for BERT.

- `Trainer` and `TrainingArguments` → Handle the training loop and its configuration.

- `load_dataset` → Loads datasets like IMDB easily.

- `torch.device` → Selects the GPU when one is available (otherwise the CPU); the model is moved to that device later with `model.to(device)`.

In [4]:
from huggingface_hub import notebook_login

notebook_login()


### Step 3: Load Pre-trained BERT Model and Tokenizer
- We will use `bert-base-uncased`, a BERT model pre-trained on lowercased English text.

In [5]:
# Load a pre-trained BERT model for text classification
model_name = "bert-base-uncased"
model = AutoModelForSequenceClassification.from_pretrained(model_name, num_labels=2)  # Binary classification
tokenizer = AutoTokenizer.from_pretrained(model_name)

Some weights of BertForSequenceClassification were not initialized from the model checkpoint at bert-base-uncased and are newly initialized: ['classifier.bias', 'classifier.weight']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.


- `bert-base-uncased` → A BERT model trained on lowercased English text.

- `num_labels=2` → Since we are doing binary classification (positive/negative), we set 2 output classes.
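With `num_labels=2`, the classification head produces two raw scores (logits) per review, one for each class. As a rough illustration of how those scores become probabilities and a predicted class, here is a pure-Python sketch. The logit values are made up, and the index convention (0 = negative, 1 = positive) follows the label mapping used later in this guide.

```python
import math

def softmax(logits):
    """Convert a list of raw scores into probabilities that sum to 1."""
    shift = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - shift) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical logits for one review: index 0 = negative, index 1 = positive.
example_logits = [-1.2, 2.3]
probs = softmax(example_logits)
predicted_class = probs.index(max(probs))  # same result as an argmax
print(probs, predicted_class)
```

The larger logit always wins, so taking the argmax of the logits gives the same class as taking the argmax of the probabilities.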

In [6]:
# Move the model to GPU if available
model.to(device)

BertForSequenceClassification(
  (bert): BertModel(
    (embeddings): BertEmbeddings(
      (word_embeddings): Embedding(30522, 768, padding_idx=0)
      (position_embeddings): Embedding(512, 768)
      (token_type_embeddings): Embedding(2, 768)
      (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
      (dropout): Dropout(p=0.1, inplace=False)
    )
    (encoder): BertEncoder(
      (layer): ModuleList(
        (0-11): 12 x BertLayer(
          (attention): BertAttention(
            (self): BertSdpaSelfAttention(
              (query): Linear(in_features=768, out_features=768, bias=True)
              (key): Linear(in_features=768, out_features=768, bias=True)
              (value): Linear(in_features=768, out_features=768, bias=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
            (output): BertSelfOutput(
              (dense): Linear(in_features=768, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e

### Step 4: Load and Preprocess the Dataset

In [7]:
# Load the IMDB dataset
dataset = load_dataset("imdb")

# Tokenize the dataset
def preprocess_function(examples):
    return tokenizer(examples["text"], truncation=True, padding="max_length", max_length=512)

# Apply the tokenization function to the dataset
encoded_dataset = dataset.map(preprocess_function, batched=True)

# Select the predefined train and test splits
train_dataset = encoded_dataset["train"]
test_dataset = encoded_dataset["test"]



- `load_dataset("imdb")` → Downloads the IMDB movie reviews dataset.

- `tokenizer()` → Converts text into token IDs that BERT understands.

- `truncation=True` → Cuts reviews longer than `max_length` (512 tokens, BERT’s limit).

- `padding="max_length"` → Pads shorter reviews up to 512 tokens so every sequence has the same length for batch training.

- `dataset.map()` → Applies tokenization to every text sample in the dataset.
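To make truncation and padding concrete, the sketch below mimics their effect on a list of token IDs. It is a simplification, not the tokenizer's actual implementation: the helper `pad_and_truncate` and the token IDs are made up for illustration, and the real tokenizer also takes care of special tokens, which this sketch ignores.

```python
def pad_and_truncate(token_ids, max_length, pad_id=0):
    """Cut token_ids to max_length, then right-pad with pad_id.

    Returns (input_ids, attention_mask); the mask is 1 for real tokens
    and 0 for padding.
    """
    kept = token_ids[:max_length]
    n_pad = max_length - len(kept)
    input_ids = kept + [pad_id] * n_pad
    attention_mask = [1] * len(kept) + [0] * n_pad
    return input_ids, attention_mask

# A short sequence is padded; a long one is cut (max_length=6 for readability).
short_ids, short_mask = pad_and_truncate([101, 2023, 3185, 102], max_length=6)
long_ids, long_mask = pad_and_truncate(list(range(1, 11)), max_length=6)
print(short_ids, short_mask)
print(long_ids, long_mask)
```

The `attention_mask` column that appears in the tokenized dataset plays the same role: it tells the model which positions hold real tokens and which are padding.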

In [8]:
dataset

DatasetDict({
    train: Dataset({
        features: ['text', 'label'],
        num_rows: 25000
    })
    test: Dataset({
        features: ['text', 'label'],
        num_rows: 25000
    })
    unsupervised: Dataset({
        features: ['text', 'label'],
        num_rows: 50000
    })
})

In [9]:
dataset.column_names

{'train': ['text', 'label'],
 'test': ['text', 'label'],
 'unsupervised': ['text', 'label']}

In [10]:
train_dataset

Dataset({
    features: ['text', 'label', 'input_ids', 'token_type_ids', 'attention_mask'],
    num_rows: 25000
})

In [11]:
print(dataset["train"].column_names)
print(dataset["train"][0])  # Print first example


['text', 'label']
{'text': 'I rented I AM CURIOUS-YELLOW from my video store because of all the controversy that surrounded it when it was first released in 1967. I also heard that at first it was seized by U.S. customs if it ever tried to enter this country, therefore being a fan of films considered "controversial" I really had to see this for myself.<br /><br />The plot is centered around a young Swedish drama student named Lena who wants to learn everything she can about life. In particular she wants to focus her attentions to making some sort of documentary on what the average Swede thought about certain political issues such as the Vietnam War and race issues in the United States. In between asking politicians and ordinary denizens of Stockholm about their opinions on politics, she has sex with her drama teacher, classmates, and married men.<br /><br />What kills me about I AM CURIOUS-YELLOW is that 40 years ago, this was considered pornographic. Really, the sex and nudity scenes 

In [12]:
print(dataset["test"].column_names)
print(dataset["test"][0])  # Print first example


['text', 'label']
{'text': 'I love sci-fi and am willing to put up with a lot. Sci-fi movies/TV are usually underfunded, under-appreciated and misunderstood. I tried to like this, I really did, but it is to good TV sci-fi as Babylon 5 is to Star Trek (the original). Silly prosthetics, cheap cardboard sets, stilted dialogues, CG that doesn\'t match the background, and painfully one-dimensional characters cannot be overcome with a \'sci-fi\' setting. (I\'m sure there are those of you out there who think Babylon 5 is good sci-fi TV. It\'s not. It\'s clichéd and uninspiring.) While US viewers might like emotion and character development, sci-fi is a genre that does not take itself seriously (cf. Star Trek). It may treat important issues, yet not as a serious philosophy. It\'s really difficult to care about the characters here as they are not simply foolish, just missing a spark of life. Their actions and reactions are wooden and predictable, often painful to watch. The makers of Earth KNOW

In [13]:
print(dataset["unsupervised"].column_names)
print(dataset["unsupervised"][0])  # Print first example


['text', 'label']
{'text': 'This is just a precious little diamond. The play, the script are excellent. I cant compare this movie with anything else, maybe except the movie "Leon" wonderfully played by Jean Reno and Natalie Portman. But... What can I say about this one? This is the best movie Anne Parillaud has ever played in (See please "Frankie Starlight", she\'s speaking English there) to see what I mean. The story of young punk girl Nikita, taken into the depraved world of the secret government forces has been exceptionally over used by Americans. Never mind the "Point of no return" and especially the "La femme Nikita" TV series. They cannot compare the original believe me! Trash these videos. Buy this one, do not rent it, BUY it. BTW beware of the subtitles of the LA company which "translate" the US release. What a disgrace! If you cant understand French, get a dubbed version. But you\'ll regret later :)', 'label': -1}


In [14]:
train_dataset.column_names

['text', 'label', 'input_ids', 'token_type_ids', 'attention_mask']

In [15]:
test_dataset.column_names

['text', 'label', 'input_ids', 'token_type_ids', 'attention_mask']

### Step 5: Define Evaluation Metrics


In [16]:

# Define a function to compute accuracy, precision, recall, and F1 score
def compute_metrics(eval_pred):
    logits, labels = eval_pred
    predictions = torch.argmax(torch.tensor(logits), dim=-1)
    precision, recall, f1, _ = precision_recall_fscore_support(labels, predictions, average="binary")
    acc = accuracy_score(labels, predictions)
    return {"accuracy": acc, "precision": precision, "recall": recall, "f1": f1}


- `accuracy_score` → Measures how many predictions are correct.

- `precision_recall_fscore_support` → Computes precision, recall, and F1-score.

- `argmax(logits)` → Converts model outputs into class predictions (0 or 1).
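To see what these metrics measure, the following pure-Python sketch computes them by hand on a toy batch. It is an illustration only, not a replacement for the scikit-learn functions used above; the helper `binary_metrics` and the toy values are hypothetical, and class 1 is treated as the positive class.

```python
def binary_metrics(logits, labels):
    """Accuracy, precision, recall, and F1 for class 1 from two-column logits."""
    # argmax over each row: index of the larger logit
    preds = [row.index(max(row)) for row in logits]
    tp = sum(p == 1 and y == 1 for p, y in zip(preds, labels))
    fp = sum(p == 1 and y == 0 for p, y in zip(preds, labels))
    fn = sum(p == 0 and y == 1 for p, y in zip(preds, labels))
    correct = sum(p == y for p, y in zip(preds, labels))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"accuracy": correct / len(labels), "precision": precision,
            "recall": recall, "f1": f1}

# Four toy reviews: predictions come out as [1, 0, 1, 0].
toy_logits = [[0.2, 1.5], [2.0, -1.0], [0.1, 0.3], [1.0, 0.5]]
toy_labels = [1, 0, 0, 1]
metrics = binary_metrics(toy_logits, toy_labels)
print(metrics)
```

Here one true positive, one false positive, and one false negative give a precision, recall, and F1 of 0.5 each, with two of four predictions correct.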

### Step 6: Define Training Arguments


In [17]:
training_args = TrainingArguments(
    output_dir="/content/BERT-IMDB",  # Where to save the model
    evaluation_strategy="epoch",  # Evaluate at the end of each epoch (newer transformers releases name this argument eval_strategy)
    save_strategy="epoch",  # Save model checkpoints
    learning_rate=2e-5,  # Standard learning rate for fine-tuning BERT
    per_device_train_batch_size=2,  # Batch size for training
    per_device_eval_batch_size=2,  # Batch size for evaluation
    num_train_epochs=1,  # Train for 1 epoch
    weight_decay=0.01,  # Regularization to prevent overfitting
    logging_dir="./logs",  # Directory for logging training metrics
    push_to_hub=True,  # Upload to Hugging Face Hub
    hub_model_id="Mohan-DS-1321/Fine-Tuning-of-BERT",  # Replace with your username and model name
)


Using the `WANDB_DISABLED` environment variable is deprecated and will be removed in v5. Use the --report_to flag to control the integrations used for logging result (for instance --report_to none).


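- `learning_rate=2e-5` → A small learning rate keeps updates gentle, so fine-tuning adjusts the pre-trained weights instead of overwriting them.

- `num_train_epochs=1` → Fine-tuning typically requires only a few epochs; this run uses one.

- `per_device_train_batch_size=2` → Defines the batch size per device (GPU or CPU).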
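Batch size and epoch count together determine how long training runs. As a rough estimate, assuming a single device and no gradient accumulation, the number of optimizer updates can be worked out as below. The helper `optimizer_steps` is hypothetical, and the exact count reported by `Trainer` may differ slightly in other setups.

```python
import math

def optimizer_steps(num_examples, per_device_batch_size, num_epochs,
                    num_devices=1, grad_accum_steps=1):
    """Estimate how many optimizer updates a training run performs."""
    effective_batch = per_device_batch_size * num_devices * grad_accum_steps
    steps_per_epoch = math.ceil(num_examples / effective_batch)
    return steps_per_epoch * num_epochs

# Values from the configuration cell: 25,000 training reviews, batch size 2, 1 epoch.
steps = optimizer_steps(num_examples=25_000, per_device_batch_size=2, num_epochs=1)
print(steps)
```

With a batch size of 8 and 3 epochs, the same formula gives 9,375 updates, so a larger batch can mean fewer updates even with more epochs.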

### Step 7: Train the Model


In [18]:
trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=train_dataset,
    eval_dataset=test_dataset,
    compute_metrics=compute_metrics,
)

# Start fine-tuning
trainer.train()


Epoch,Training Loss,Validation Loss


KeyboardInterrupt: 

- `Trainer` → A high-level API that handles training and evaluation automatically.

- `trainer.train()` → Starts the fine-tuning process.

### Step 8: Evaluate the Model

In [None]:
# Evaluate the fine-tuned model on the test dataset
results = trainer.evaluate()
print("Evaluation Results:", results)


### Step 9: Upload the Fine-Tuned Model to the Hugging Face Hub

In [None]:
# Push model to the Hugging Face Model Hub
trainer.push_to_hub()


### Step 10: Use the Fine-Tuned Model for Predictions
- After fine-tuning, you can load and use the model for inference:

In [None]:
from transformers import pipeline

# Load the fine-tuned model from Hugging Face
model_pipeline = pipeline("text-classification", model="Mohan-DS-1321/Fine-Tuning-of-BERT")

# Test the model on a new review
review = "This movie was absolutely amazing! I loved it."
result = model_pipeline(review)

print(result)


In [None]:
# 📌 Output Example:
# [{'label': 'LABEL_1', 'score': 0.98}]

# LABEL_1 → Positive review

# LABEL_0 → Negative review
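The generic `LABEL_0`/`LABEL_1` names can be translated into readable sentiment names after prediction. This is a small post-processing sketch in plain Python; the `LABEL_NAMES` mapping and the `readable` helper are hypothetical, and the sample prediction is the illustrative output shown above rather than a real result.

```python
# Mapping from the generic label names to sentiment names.
LABEL_NAMES = {"LABEL_0": "negative", "LABEL_1": "positive"}

def readable(predictions):
    """Replace generic LABEL_n names with sentiment names, keeping scores."""
    return [
        {"label": LABEL_NAMES.get(p["label"], p["label"]), "score": p["score"]}
        for p in predictions
    ]

# Same shape as the pipeline output: a list with one dict per input text.
example_output = [{"label": "LABEL_1", "score": 0.98}]
print(readable(example_output))
```

Unknown label names are passed through unchanged, so the helper stays safe if the model is later retrained with different classes.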

In [None]:
from huggingface_hub import HfApi

api = HfApi()
models = api.list_models(author="Mohan-DS-1321")  # Replace with your username

for model_info in models:  # Separate name so the fine-tuned `model` is not overwritten
    print(model_info.id)  # Prints the ID of each model uploaded under this account


### Summary of Steps
| Step | Description |
|------|-------------|
| 1. Install Packages | Install transformers, datasets, torch, huggingface_hub |
| 2. Import Libraries | Load required modules |
| 3. Authenticate with API Key | Log in using your Hugging Face token |
| 4. Load Pretrained Model | Load bert-base-uncased for text classification |
| 5. Load Dataset | Load the IMDB movie reviews dataset |
| 6. Define Metrics | Accuracy, Precision, Recall, F1-score |
| 7. Training Config | Set hyperparameters like batch size and learning rate |
| 8. Fine-Tune Model | Train using Hugging Face Trainer |
| 9. Evaluate Model | Check accuracy on test dataset |
| 10. Upload to Hugging Face | trainer.push_to_hub() to share model |
| 11. Use Model for Predictions | Load and test it on new text |