# Welcome to the BERT Project with Hugging Face!
This hands-on activity is designed to help you explore the capabilities of BERT (Bidirectional Encoder Representations from Transformers) using the Hugging Face library. The goal is to fine-tune, debug, and evaluate a BERT model for a natural language processing (NLP) task. Let’s get started.



# Objective

Use the techniques from this module to:

- Fine-tune BERT for a specific NLP task using Hugging Face.
- Debug issues during training or prediction.
- Evaluate the performance of the fine-tuned model using structured metrics.

## Part 1: Fine-Tuning BERT
Task: Fine-tune a pre-trained BERT model for a specific NLP task using Hugging Face.

In [3]:
!pip install datasets



In [1]:
!pip install --upgrade transformers



In [3]:
import transformers
print(transformers.__version__)

4.53.0


In [3]:
## 1. Choose an NLP task:
### Task: Sentiment Analysis (IMDb dataset)

## 2. Prepare your dataset:
from transformers import BertTokenizer, BertForSequenceClassification
from transformers import Trainer, TrainingArguments
from datasets import load_dataset
import torch

### Load the dataset
dataset = load_dataset('imdb')

### Ensure the dataset is preprocessed appropriately (e.g., tokenization using Hugging Face's tokenizer).
tokenizer = BertTokenizer.from_pretrained('bert-base-uncased')

## 3. Fine-tune BERT:

### Load a pre-trained BERT model from Hugging Face (e.g., bert-base-uncased).
model = BertForSequenceClassification.from_pretrained('bert-base-uncased', num_labels=2)

### Tokenize the dataset, then set up training with Hugging Face's Trainer API.
def tokenize_function(examples):
    return tokenizer(examples['text'], padding="max_length", truncation=True, max_length=128)

tokenized_datasets = dataset.map(tokenize_function, batched=True)
tokenized_datasets = tokenized_datasets.rename_column("label", "labels")
tokenized_datasets.set_format("torch", columns=["input_ids", "attention_mask", "labels"])

train_dataset = tokenized_datasets["train"].shuffle(seed=42).select(range(2000))
test_dataset = tokenized_datasets["test"].shuffle(seed=42).select(range(500))

### Specify hyperparameters such as batch size, learning rate, and number of epochs.
training_args = TrainingArguments(
    output_dir="./results",
    eval_strategy="epoch",  # renamed from evaluation_strategy, which transformers 4.53.0 rejects with a TypeError
    learning_rate=2e-5,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    num_train_epochs=3,
    weight_decay=0.01,
    logging_dir='./logs',
    logging_steps=10,
    save_steps=10,
)

## 4. Monitor training:
### Track loss during training (accuracy is only reported if a compute_metrics function is passed to Trainer).
trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=train_dataset,
    eval_dataset=test_dataset,
)

trainer.train()  ### Fine-tune BERT
trainer.save_model("./fine_tuned_bert")  ### Save the trained model
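
The `Trainer` above is built without a `compute_metrics` callback, so its per-epoch evaluation reports loss only. A minimal sketch of an accuracy callback follows. The name `compute_metrics` matches the `Trainer` keyword; the logits and labels at the bottom are made-up values used only to exercise the function, not results from this notebook.

```python
import numpy as np

def compute_metrics(eval_pred):
    """Turn (logits, labels) into an accuracy dict for the evaluation logs."""
    logits, labels = eval_pred            # EvalPrediction unpacks like a tuple
    preds = np.argmax(logits, axis=-1)    # highest-scoring class per example
    return {"accuracy": float(np.mean(preds == np.asarray(labels)))}

# Made-up logits for four reviews (columns: negative, positive).
fake_logits = np.array([[2.0, -1.0], [0.1, 0.9], [1.5, 0.2], [-0.3, 0.4]])
fake_labels = np.array([0, 1, 1, 1])
demo = compute_metrics((fake_logits, fake_labels))
print(demo)
```

Passing `compute_metrics=compute_metrics` when constructing the `Trainer` should add an `eval_accuracy` entry to each evaluation log alongside `eval_loss`.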

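To see what the hyperparameters in the cell above imply, the step counts can be worked out by hand. This is a rough sketch that assumes a single device, no gradient accumulation, and the default step-based checkpointing; the variable names are illustrative and not part of the Hugging Face API.

```python
import math

train_examples = 2000   # size of the shuffled training subset
batch_size = 16         # per_device_train_batch_size
epochs = 3              # num_train_epochs
save_steps = 10         # checkpoint interval from TrainingArguments

steps_per_epoch = math.ceil(train_examples / batch_size)
total_steps = steps_per_epoch * epochs
# Periodic saves happen at steps 10, 20, ..., up to the last multiple of 10.
periodic_checkpoints = total_steps // save_steps

print(steps_per_epoch, total_steps, periodic_checkpoints)
```

Under these assumptions training runs for 375 optimizer steps and writes a few dozen checkpoints, each holding a full copy of the model. Raising `save_steps` or setting `save_total_limit` in `TrainingArguments` keeps disk usage in check.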
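# Problem with evaluation_strategy

Passing `evaluation_strategy="epoch"` to `TrainingArguments` on transformers 4.53.0 produces the following output:

Some weights of BertForSequenceClassification were not initialized from the model checkpoint at bert-base-uncased and are newly initialized: ['classifier.bias', 'classifier.weight']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.

TypeError: TrainingArguments.__init__() got an unexpected keyword argument 'evaluation_strategy'

The first message is informational: `bert-base-uncased` ships without a classification head, so `classifier.weight` and `classifier.bias` start from random values and are learned during fine-tuning. The `TypeError` is the actual failure. Recent transformers releases expect the argument under the name `eval_strategy`, so renaming it in `TrainingArguments` resolves the error.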
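
If the same notebook has to run on both older and newer transformers releases, one option is to choose the keyword at runtime by inspecting the constructor. The helper `eval_strategy_kwarg` below is hypothetical (it is not part of transformers), and the two stand-in classes exist only so the sketch runs without the library installed.

```python
import inspect

def eval_strategy_kwarg(args_cls, value="epoch"):
    """Return {keyword: value} using whichever spelling args_cls accepts."""
    params = inspect.signature(args_cls.__init__).parameters
    for name in ("eval_strategy", "evaluation_strategy"):
        if name in params:
            return {name: value}
    raise TypeError(f"{args_cls.__name__} accepts neither spelling")

# Stand-ins that mimic the newer and older constructor signatures.
class NewStyleArgs:
    def __init__(self, output_dir, eval_strategy="no"):
        self.output_dir, self.eval_strategy = output_dir, eval_strategy

class OldStyleArgs:
    def __init__(self, output_dir, evaluation_strategy="no"):
        self.output_dir, self.evaluation_strategy = output_dir, evaluation_strategy

new_kwargs = eval_strategy_kwarg(NewStyleArgs)
old_kwargs = eval_strategy_kwarg(OldStyleArgs)
print(new_kwargs, old_kwargs)
```

With the real class, the call would look like `TrainingArguments(output_dir="./results", **eval_strategy_kwarg(TrainingArguments))`. Pinning a single transformers version is simpler when portability is not a concern.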

## Part 2: Evaluating the Model
Task: Use evaluation metrics to assess the fine-tuned BERT model.

In [None]:
from sklearn.metrics import accuracy_score, f1_score
import numpy as np

## 1. Generate predictions on a test set:
### Use the fine-tuned model to make predictions on unseen data.
predictions = trainer.predict(test_dataset)
print(predictions.metrics)  # summary metrics only; printing the whole object dumps every logit

### Get true labels and predicted labels
true_labels = predictions.label_ids
preds = np.argmax(predictions.predictions, axis=1)

## 2. Evaluate performance
accuracy = accuracy_score(true_labels, preds)
f1 = f1_score(true_labels, preds)

print(f"Accuracy: {accuracy:.4f}")
print(f"F1-Score: {f1:.4f}")
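
To make the two metrics concrete, the sketch below recomputes accuracy and binary F1 by hand from true-positive, false-positive, and false-negative counts. The function `binary_scores` and the toy label lists are illustrative assumptions, not output from the fine-tuned model; for binary labels with class 1 as the positive class, the result should agree with the default behaviour of `accuracy_score` and `f1_score`.

```python
def binary_scores(true_labels, preds, positive=1):
    """Compute accuracy, precision, recall, and F1 for binary labels."""
    pairs = list(zip(true_labels, preds))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    accuracy = sum(1 for t, p in pairs if t == p) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}

# Toy example: 8 reviews, 1 = positive sentiment, 0 = negative.
toy_true = [1, 1, 1, 0, 0, 0, 1, 0]
toy_pred = [1, 0, 1, 0, 1, 0, 1, 0]
scores = binary_scores(toy_true, toy_pred)
print(scores)
```

Here there are 3 true positives, 1 false positive, and 1 false negative, so precision, recall, F1, and accuracy all come out to 0.75. On the balanced IMDb test split accuracy and F1 tend to be close; they diverge when one class dominates.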