# Question Answering with BERT
Source: most of the code is copied or adapted from The Fuzzy Scientist's [LLMs Mastery: Complete Guide to Transformers & Generative AI](https://udemy.com/course/llms-mastery-complete-guide-to-transformers-generative-ai) course on Udemy. The course is a lot more extensive than what is presented here and should be followed to understand all the concepts and the full context.

Source: Wikipedia article on James G. Blaine: https://en.wikipedia.org/wiki/James_G._Blaine (Context)

In [1]:
import torch

from transformers import BertForQuestionAnswering, BertTokenizerFast

from scipy.special import softmax
import pandas as pd
import numpy as np

## Context

In [123]:
context = """James G. Blaine (1830–1893) was an American statesman and Republican politician who represented Maine in the U.S. House of Representatives from 1863 to 1876, serving as Speaker of the House from 1869 to 1875, and then in the Senate from 1876 to 1881.
Born in Pennsylvania and a newspaper editor before entering politics, he twice served as the U.S. secretary of state, first in 1881 under President James A. Garfield and President Chester A. Arthur, and then from 1889 to 1892 under President Benjamin Harrison.
Blaine unsuccessfully sought the Republican presidential nomination in 1876 and 1880.
He gained the nomination in 1884, but in the election, he was narrowly defeated by Democratic nominee Grover Cleveland.
A charismatic speaker in an age that prized oratory, Blaine was a leading Republican of the late 19th century and a champion of the party's moderate reformist faction, later known as the "Half-Breeds"."""

In [117]:
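# Optional one-sentence context for quick tests. This cell ran before the
# full context above (execution count 117 vs 123); the outputs below use the full context.
context = """James G. Blaine (1830–1893) was an American statesman and Republican politician."""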

In [124]:
len(context)

919

### Question

In [125]:
question_1 = "When was James Blaine born?"
question_2 = "What was James Blaine's role from 1863 to 1876?"

## Model
### Import pre-trained model and tokeniser
Model source: https://huggingface.co/models

In [106]:
model_name = 'deepset/bert-base-cased-squad2'

tokenizer = BertTokenizerFast.from_pretrained(model_name)
model = BertForQuestionAnswering.from_pretrained(model_name)

Some weights of the model checkpoint at deepset/bert-base-cased-squad2 were not used when initializing BertForQuestionAnswering: ['bert.pooler.dense.bias', 'bert.pooler.dense.weight']
- This IS expected if you are initializing BertForQuestionAnswering from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing BertForQuestionAnswering from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).


### Answer the question
The question and the context must be encoded together as a single input sequence of token ids.
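
To make the layout of that joint input concrete, here is a minimal sketch of how a BERT-style pair is arranged: `[CLS] question [SEP] context [SEP]`, with segment ids (`token_type_ids`) of 0 for the question part and 1 for the context part. It is a toy only: whitespace splitting stands in for the real WordPiece tokenizer, and `build_pair_input` is a hypothetical helper, not part of `transformers`.

```python
# Toy illustration only: whitespace "tokens" stand in for BERT's WordPiece tokens.
def build_pair_input(question, context):
    q_tokens = question.split()
    c_tokens = context.split()

    # Pair layout: [CLS] question [SEP] context [SEP]
    tokens = ["[CLS]"] + q_tokens + ["[SEP]"] + c_tokens + ["[SEP]"]

    # Segment ids: 0 for [CLS], the question and the first [SEP];
    # 1 for the context and the final [SEP]
    token_type_ids = [0] * (len(q_tokens) + 2) + [1] * (len(c_tokens) + 1)
    return tokens, token_type_ids


tokens, token_type_ids = build_pair_input(
    "When was Blaine born ?", "Blaine was born in 1830 ."
)
for tok, seg in zip(tokens, token_type_ids):
    print(f"{seg}  {tok}")
```

In the notebook itself, `tokenizer(question, context, return_tensors='pt')` produces this layout, using real token ids instead of strings.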

In [107]:
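def chunk_sentences(sentences, chunk_size, overlap):
    # Group sentences into chunks of `chunk_size` items, where consecutive
    # chunks share `overlap` items. `overlap` must be smaller than `chunk_size`.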
    chunks = []
    num_sentences = len(sentences)

    for i in range(0, num_sentences, chunk_size-overlap):
        chunk = sentences[i:i+chunk_size]
        chunks.append(chunk)

    return chunks
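
A quick way to see what this function produces is to run it on placeholder sentences. The sketch below repeats the same logic so it runs on its own; the sentence labels `s1` to `s5` are made up for illustration.

```python
def chunk_sentences(sentences, chunk_size, overlap):
    # Same logic as the cell above, repeated so this block is self-contained
    chunks = []
    for i in range(0, len(sentences), chunk_size - overlap):
        chunks.append(sentences[i:i + chunk_size])
    return chunks


sentences = ["s1", "s2", "s3", "s4", "s5"]  # placeholder sentences
chunks = chunk_sentences(sentences, chunk_size=2, overlap=1)
for chunk in chunks:
    print(chunk)
```

With five sentences, `chunk_size=2` and `overlap=1`, this gives five chunks, and the last one holds only `s5`, which the previous chunk already covers. Note also that `overlap` has to be smaller than `chunk_size`: an equal value makes the `range` step zero (a `ValueError`), and a larger one yields no chunks at all.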

In [126]:
def answer_question(question, context):

    # Tokenise the question and context together as a single input sequence
    inputs = tokenizer(question, context, return_tensors='pt')

    # Disabling gradient calculation (torch.no_grad()) is useful for inference, when
    # you are sure that you will not call Tensor.backward(). It will reduce memory
    # consumption for computations that would otherwise have requires_grad=True.
    with torch.no_grad():
        outputs = model(**inputs)

    # Calculate the scores for the start and end of the answer
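    # (softmax over the token axis; [0] selects the single item in the batch)
    start_scores = softmax(outputs.start_logits.numpy(), axis=-1)[0]
    end_scores = softmax(outputs.end_logits.numpy(), axis=-1)[0]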

    # Extract the start and end indices of the most likely answer
    # selected by the model
    start_index = np.argmax(start_scores)
    end_index = np.argmax(end_scores)

    # Based on the start and end indices, retrieve:
    # 1. the token ids
    # 2. the tokens converted from the ids
    # 3. a string converted from the tokens
    answer_ids = inputs.input_ids[0][start_index:end_index+1]
    answer_tokens = tokenizer.convert_ids_to_tokens(answer_ids)
    answer = tokenizer.convert_tokens_to_string(answer_tokens)

    # If the model is not able to answer the question based
    # on the context, it will return the [CLS] token with a
    # high confidence score.
    # We replace this with a more human-friendly response.
    if answer == tokenizer.cls_token:
        answer = "I cannot answer based on the provided context"

    # Calculate the average of the start and end token
    # confidence scores to get an overall confidence.
    # Multiply by 100 to express the score as a percentage.
    confidence_score = 100*0.5*(start_scores[start_index] + end_scores[end_index])

    return answer, confidence_score
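
The post-processing steps in `answer_question` (softmax, argmax, span slicing, averaged confidence) can be tried without loading the model. The sketch below is a minimal illustration under stated assumptions: the tokens and logits are invented for the example and are not model output, and a small NumPy softmax stands in for `scipy.special.softmax`.

```python
import numpy as np


def softmax_1d(logits):
    # Numerically stable softmax over a 1-D array
    exps = np.exp(logits - np.max(logits))
    return exps / exps.sum()


# Hypothetical tokens and logits, invented for illustration (not model output)
tokens = ["[CLS]", "when", "was", "blaine", "born", "?", "[SEP]",
          "blaine", "was", "born", "in", "1830", ".", "[SEP]"]
start_logits = np.array([0.5, -2.0, -2.0, -1.0, -2.0, -3.0, -3.0,
                         0.0, -1.0, -1.0, 1.0, 6.0, -2.0, -3.0])
end_logits = np.array([0.5, -2.0, -2.0, -2.0, -1.0, -3.0, -3.0,
                       -1.0, -1.0, 0.0, 0.0, 7.0, 1.0, -3.0])

# Turn the logits into probabilities over token positions
start_scores = softmax_1d(start_logits)
end_scores = softmax_1d(end_logits)

# Most likely start and end positions, chosen independently
start_index = int(np.argmax(start_scores))
end_index = int(np.argmax(end_scores))

# Slice the answer span (inclusive of the end token)
answer_tokens = tokens[start_index:end_index + 1]

# Average the two probabilities and express the result as a percentage
confidence = 100 * 0.5 * (start_scores[start_index] + end_scores[end_index])
print(answer_tokens, f"{confidence:.1f}%")
```

Because the start and end positions are picked independently, nothing prevents `end_index` from being smaller than `start_index`; in that case the slice is empty and the answer is an empty string.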

In [129]:
# Select the question to ask
question = "who was James Blaine?"

# Split the context into separate sentences (one per line in the context string)
context_sentences = context.split('\n')

# Group the sentences into chunks of two sentences each, with an
# overlap of one sentence, so that an answer spanning two sentences
# is not split between two chunks
context_chunks = chunk_sentences(context_sentences, chunk_size=2, overlap=1)

answers = {'answer': '', 'score': 0}

for i, chunk in enumerate(context_chunks):
    # Group the sentences in a chunk into a single string
    sub_context = "\n".join(chunk)

    # Use the model to predict the answer based on each chunk as context
    answer, confidence_score = answer_question(question, sub_context)
    print(f"Chunk {i+1}: {answer} ({confidence_score:.1f}%)")

    # Check if the answer is valid
    if (answer != "I cannot answer based on the provided context"):
        print('valid answer')
        if confidence_score > answers['score']:
            answers['answer'] = answer
            answers['score'] = confidence_score

# Print the question and answer:
print(f"Q: '{question}'")
print(f"A: '{answers['answer']}'")

# Print the confidence score:
print(f"Confidence: {answers['score']:.1f}%")

Chunk 1: an American statesman and Republican politician (90.6%)
valid answer
Chunk 2: President (77.3%)
valid answer
Chunk 3: Grover Cleveland. (72.2%)
valid answer
Chunk 4: Republican of the late 19th century (64.3%)
valid answer
Chunk 5: a leading Republican of the late 19th century (73.9%)
valid answer
Q: 'who was James Blaine?'
A: 'an American statesman and Republican politician'
Confidence: 90.6%


### Comparison with a slightly different question
When we ask _"Who was Blaine?"_ instead of _"Who was **James** Blaine?"_, the answers we get are quite different, especially in terms of score.

#### Version 1
> Chunk 1: an American statesman and Republican politician (71.4%)<br/>
> Chunk 2:  (49.7%)<br/>
> Chunk 3: Grover Cleveland (77.7%)<br/>
> Chunk 4: Republican of the late 19th century (68.6%)<br/>
> Chunk 5: a leading Republican of the late 19th century (64.6%)<br/>


**Final answers**<br/>
> Q: 'who was Blaine?'<br/>
> A: 'Grover Cleveland'<br/>
> Confidence: 77.7%

#### Version 2
> Chunk 1: an American statesman and Republican politician (90.6%)<br/>
> Chunk 2: President (77.3%)<br/>
> Chunk 3: Grover Cleveland. (72.2%)<br/>
> Chunk 4: Republican of the late 19th century (64.3%)<br/>
> Chunk 5: a leading Republican of the late 19th century (73.9%)<br/>

**Final answers**<br/>
> Q: 'who was James Blaine?'<br/>
> A: 'an American statesman and Republican politician'<br/>
> Confidence: 90.6%
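
The per-chunk results above can be replayed without the model to check the selection logic. The sketch below copies the answers and scores from the two runs and picks the highest-scoring one. `pick_best` is a hypothetical helper, and skipping empty answers (such as Chunk 2 in Version 1) is an assumption added here; the loop in the notebook only filters out the fallback message.

```python
# Per-chunk (answer, score) pairs copied from the two runs above
version_1 = [
    ("an American statesman and Republican politician", 71.4),
    ("", 49.7),
    ("Grover Cleveland", 77.7),
    ("Republican of the late 19th century", 68.6),
    ("a leading Republican of the late 19th century", 64.6),
]
version_2 = [
    ("an American statesman and Republican politician", 90.6),
    ("President", 77.3),
    ("Grover Cleveland.", 72.2),
    ("Republican of the late 19th century", 64.3),
    ("a leading Republican of the late 19th century", 73.9),
]

NO_ANSWER = "I cannot answer based on the provided context"


def pick_best(candidates):
    # Hypothetical helper: keep the highest-scoring usable answer
    best = {"answer": "", "score": 0.0}
    for answer, score in candidates:
        # Assumption: treat empty strings like the fallback message and skip them
        if answer == NO_ANSWER or not answer.strip():
            continue
        if score > best["score"]:
            best = {"answer": answer, "score": score}
    return best


best_1 = pick_best(version_1)
best_2 = pick_best(version_2)
print(f"Version 1: '{best_1['answer']}' ({best_1['score']:.1f}%)")
print(f"Version 2: '{best_2['answer']}' ({best_2['score']:.1f}%)")
```

The replay reproduces the final answers reported for each version, which shows that the difference comes from the model's per-chunk scores and not from the selection step.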