# Using a pre-trained NLP model for Question-Answering

In this notebook, we walk through an example of fine-tuning a pre-trained NLP model for a question-answering (QA) task. We use the Hugging Face `transformers` library and a pre-trained BERT model. The goal is to improve the model's ability to extract the answer to a question from a given context.

## Without Fine-Tuning
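First, let's see how the pre-trained BERT model performs on a QA task without any fine-tuning. We'll use an example where the model needs to find an answer within a given passage. Note that `bert-base-uncased` ships without a trained question-answering head, so its span predictions at this stage are essentially arbitrary.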

In [1]:
from transformers import BertTokenizer, BertForQuestionAnswering
import torch

model_name = "bert-base-uncased"
tokenizer = BertTokenizer.from_pretrained(model_name)
model = BertForQuestionAnswering.from_pretrained(model_name)


Some weights of the model checkpoint at bert-base-uncased were not used when initializing BertForQuestionAnswering: ['cls.predictions.transform.dense.weight', 'cls.seq_relationship.weight', 'cls.predictions.bias', 'cls.predictions.transform.LayerNorm.weight', 'cls.predictions.transform.LayerNorm.bias', 'cls.seq_relationship.bias', 'cls.predictions.transform.dense.bias']
- This IS expected if you are initializing BertForQuestionAnswering from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing BertForQuestionAnswering from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
Some weights of BertForQuestionAnswering were not initialized from the model checkpoint at bert-base-uncased and are newly initialized: ['qa_out

Define a context and a question on which the untuned model is likely to struggle:

In [2]:
context = "The University of California was founded in 1868, located in Berkeley."
question = "When was the University of California established?"

### Model Prediction
Tokenize the input, make a prediction, and decode the answer:

In [3]:
inputs = tokenizer(question, context, return_tensors='pt')
with torch.no_grad():
    outputs = model(**inputs)

# Find the tokens with the highest `start` and `end` scores
answer_start = torch.argmax(outputs.start_logits)
answer_end = torch.argmax(outputs.end_logits) + 1

# Convert tokens to answer string
answer = tokenizer.convert_tokens_to_string(tokenizer.convert_ids_to_tokens(inputs.input_ids[0, answer_start:answer_end]))
print("Answer:", answer)

Answer: 
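
The empty answer is consistent with the warning printed earlier: the question-answering head is newly initialized, so the start and end scores are effectively random. Most likely the highest-scoring end token precedes the highest-scoring start token, which makes the slice empty. Two independent `argmax` calls do not guard against that. Below is a minimal sketch of a safer decoding step, assuming plain Python lists of scores (for example `outputs.start_logits[0].tolist()`). The helper name `best_span` and the `max_len` cap are illustrative and not part of `transformers`.

```python
def best_span(start_scores, end_scores, max_len=30):
    """Return (start, end) maximizing start_scores[start] + end_scores[end]
    subject to start <= end < start + max_len (end is inclusive)."""
    best, best_score = (0, 0), float("-inf")
    for s, s_score in enumerate(start_scores):
        for e in range(s, min(s + max_len, len(end_scores))):
            score = s_score + end_scores[e]
            if score > best_score:
                best, best_score = (s, e), score
    return best


# Toy scores: the highest end score (index 1) comes before the
# highest start score (index 3), so naive argmax gives an empty slice.
start_scores = [0.1, 0.2, 0.3, 2.0, 0.0]
end_scores = [0.0, 1.5, 0.1, 0.5, 1.0]
start, end = best_span(start_scores, end_scores)
print(start, end)  # 3 4
```

Because `end` is inclusive here, the answer tokens would be taken from positions `start` through `end + 1` when slicing.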


## With Fine-Tuning

Now, let's fine-tune this BERT model on the example above and check whether its prediction improves.

First, locate the answer's token positions in the input IDs. This is critical because BERT's WordPiece tokenizer may split words into subwords, so word or character offsets do not map directly onto token indices.

In [4]:
import torch
from torch.optim import AdamW  # transformers.AdamW is deprecated; use PyTorch's optimizer
from torch.utils.data import DataLoader, RandomSampler, TensorDataset

context = "The University of California was founded in 1868, located in Berkeley."
question = "When was the University of California established?"
answer = "1868"

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Tokenize the question/context pair, and the answer on its own
encoded = tokenizer(question, context, return_tensors="pt")
input_ids = encoded["input_ids"].tolist()[0]
answer_tokens = tokenizer.encode(answer, add_special_tokens=False)

# Find the start and end positions of the answer tokens in the full token list.
# Search only the context, which begins after the first [SEP], so that a
# matching token inside the question is not picked up by mistake.
context_start = input_ids.index(tokenizer.sep_token_id) + 1
start_position = input_ids.index(answer_tokens[0], context_start)
end_position = start_position + len(answer_tokens) - 1
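
Matching only the first answer token can point at the wrong place when that token also occurs elsewhere, or when the answer spans several subword tokens. Below is a minimal sketch of a stricter lookup that matches the whole token sequence, using plain lists of token IDs. The helper `find_token_span` and the toy IDs are illustrative assumptions, not part of the tokenizer's API.

```python
def find_token_span(input_ids, answer_ids, search_from=0):
    """Return inclusive (start, end) indices of the first full occurrence of
    answer_ids in input_ids at or after search_from, or None if absent."""
    n = len(answer_ids)
    if n == 0:
        return None
    for i in range(search_from, len(input_ids) - n + 1):
        if input_ids[i:i + n] == answer_ids:
            return i, i + n - 1
    return None


# Toy IDs: the answer [7, 8] is split into two subword tokens, and its
# first token (7) also appears earlier on its own at index 1.
toy_input_ids = [101, 7, 102, 5, 7, 8, 9, 102]
print(find_token_span(toy_input_ids, [7, 8]))  # (4, 5)
print(find_token_span(toy_input_ids, [42]))    # None
```

In this notebook the call would be `find_token_span(input_ids, answer_tokens)`. A `None` result means the answer's tokens were not found as a contiguous run, which should be handled before building the training tensors.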

Next, package the encoded inputs and the answer's token positions into a dataset. The start and end positions serve as the training labels, so they must match the token indices computed above.

In [5]:
from torch.utils.data import DataLoader, RandomSampler, TensorDataset

# Prepare training data
train_encodings = {'input_ids': encoded['input_ids'],
                   'attention_mask': encoded['attention_mask'],
                   'start_positions': torch.tensor([start_position]),
                   'end_positions': torch.tensor([end_position])}

train_dataset = TensorDataset(train_encodings['input_ids'], train_encodings['attention_mask'],
                              train_encodings['start_positions'], train_encodings['end_positions'])

train_sampler = RandomSampler(train_dataset)
train_dataloader = DataLoader(train_dataset, sampler=train_sampler, batch_size=1)


### Fine-Tuning Loop
Run a short training loop and print the average loss after each epoch to monitor progress. When `start_positions` and `end_positions` are passed in, the model returns the loss directly as `outputs.loss`.

In [6]:
model.to(device)  # keep the model on the same device as the batches
optimizer = AdamW(model.parameters(), lr=5e-5)
model.train()

for epoch in range(3):  # Consider a small number of epochs
    total_loss = 0
    for batch in train_dataloader:
        batch = tuple(t.to(device) for t in batch)
        inputs = {'input_ids': batch[0],
                  'attention_mask': batch[1],
                  'start_positions': batch[2],
                  'end_positions': batch[3]}
        
        outputs = model(**inputs)
        loss = outputs.loss
        loss.backward()
        optimizer.step()
        optimizer.zero_grad()
        
        total_loss += loss.item()
    print(f"Average training loss for epoch {epoch + 1}: {total_loss / len(train_dataloader)}")




Average training loss for epoch 1: 3.333040237426758
Average training loss for epoch 2: 2.3545355796813965
Average training loss for epoch 3: 1.5951474905014038
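
To put these numbers in context: when start and end positions are supplied, `BertForQuestionAnswering` reports the mean of two cross-entropy losses, one over start positions and one over end positions. A head that spreads probability uniformly over an n-token input scores ln(n) on each term, so a loss near that level suggests an untrained head. Below is a minimal sketch of that arithmetic in plain Python. The 24-token length and the target index are assumed, illustrative figures, not values measured from the notebook's tokenizer.

```python
import math


def cross_entropy(logits, target):
    """Cross-entropy of one softmax distribution against a target index."""
    log_sum_exp = math.log(sum(math.exp(x) for x in logits))
    return log_sum_exp - logits[target]


def qa_loss(start_logits, end_logits, start_pos, end_pos):
    """Mean of the start and end cross-entropy terms."""
    start_term = cross_entropy(start_logits, start_pos)
    end_term = cross_entropy(end_logits, end_pos)
    return (start_term + end_term) / 2


n_tokens = 24  # assumed sequence length, for illustration only
target = 17    # assumed answer position, for illustration only
uniform = [0.0] * n_tokens
print(round(qa_loss(uniform, uniform, target, target), 3))  # 3.178
```

Under that assumption, the first-epoch loss of about 3.33 sits in the same range as the uniform baseline, and the later epochs fall well below it, which is the pattern one would expect as the head starts to fit the example.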


### Evaluation After Fine-Tuning

In [7]:
model.eval()
with torch.no_grad():
    # Move the inputs to whichever device the model is on
    outputs = model(**encoded.to(model.device))

answer_start = torch.argmax(outputs.start_logits)
answer_end = torch.argmax(outputs.end_logits) + 1

# Convert tokens to answer string
answer = tokenizer.convert_tokens_to_string(tokenizer.convert_ids_to_tokens(input_ids[answer_start:answer_end]))
print("Improved Answer:", answer)


Improved Answer: 1868


In [8]:
# Save the fine-tuned model and tokenizer. `save_pretrained` writes a directory
# rather than a single file, so "qa_model.pth" below is a directory name.
model_path = "qa_model.pth"

model.save_pretrained(model_path)
tokenizer.save_pretrained(model_path)


('qa_model.pth/tokenizer_config.json',
 'qa_model.pth/special_tokens_map.json',
 'qa_model.pth/vocab.txt',
 'qa_model.pth/added_tokens.json')
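
The saved directory can later be reloaded with `from_pretrained`, which accepts a local path as well as a hub model name. Below is a minimal sketch, assuming the directory written above still exists. The helper name `load_qa_model` is illustrative, and `transformers` is imported inside the function only so that defining it has no side effects.

```python
def load_qa_model(model_dir="qa_model.pth"):
    """Reload the fine-tuned model and tokenizer from a save_pretrained directory."""
    from transformers import BertForQuestionAnswering, BertTokenizer

    tokenizer = BertTokenizer.from_pretrained(model_dir)
    model = BertForQuestionAnswering.from_pretrained(model_dir)
    model.eval()  # disable dropout for inference
    return tokenizer, model
```

Calling `tokenizer, model = load_qa_model()` in a fresh session should then give back a model that can be used in the evaluation cell without retraining.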

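We have seen that just three epochs of fine-tuning on a single example are enough for the model to recover the correct answer for that example. Because the model was evaluated on the same example it was trained on, this shows that the training signal works, not that the model generalizes. On a realistic dataset, fine-tuning can improve the model's accuracy significantly, depending on the nature and amount of the fine-tuning data.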