## Original BERT QA model
- Adapted from the Hugging Face `BertForQuestionAnswering` documentation example

In [1]:
from transformers import AutoTokenizer, BertForQuestionAnswering
import torch

In [2]:
tokenizer = AutoTokenizer.from_pretrained("deepset/bert-base-cased-squad2")
model = BertForQuestionAnswering.from_pretrained("deepset/bert-base-cased-squad2")

Some weights of the model checkpoint at deepset/bert-base-cased-squad2 were not used when initializing BertForQuestionAnswering: ['bert.pooler.dense.bias', 'bert.pooler.dense.weight']
- This IS expected if you are initializing BertForQuestionAnswering from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing BertForQuestionAnswering from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).


In [3]:
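def ask(question):
    # Placeholder stub, not used in the cells below; as written it raises
    # TypeError when `question` is a str.
    return question + 2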


In [4]:
question, text = "Who was Jim Henson?", "Jim Henson was a nice holoo"

inputs = tokenizer(question, text, return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)


In [5]:
answer_start_index = outputs.start_logits.argmax()
answer_end_index = outputs.end_logits.argmax()

predict_answer_tokens = inputs.input_ids[0, answer_start_index : answer_end_index + 1]
tokenizer.decode(predict_answer_tokens, skip_special_tokens=True)

'a nice holoo'

In [6]:
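# Target span "nice puppet" (token indices 14-15), taken from the documentation example;
# the context above was edited to end in "holoo", so these indices may not line up with it.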
target_start_index = torch.tensor([14])
target_end_index = torch.tensor([15])

outputs = model(**inputs, start_positions=target_start_index, end_positions=target_end_index)
loss = outputs.loss
round(loss.item(), 2)

2.9
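
When `start_positions` and `end_positions` are supplied, `BertForQuestionAnswering` returns a loss that averages the cross-entropy over the start logits and the cross-entropy over the end logits. The sketch below reproduces that arithmetic on toy values in plain Python. The logits are made up for illustration, and the helper names `cross_entropy` and `qa_loss` are hypothetical.

```python
import math


def cross_entropy(logits, target):
    """Negative log-softmax probability of the target index."""
    m = max(logits)  # subtract the max for numerical stability
    log_sum_exp = m + math.log(sum(math.exp(x - m) for x in logits))
    return log_sum_exp - logits[target]


def qa_loss(start_logits, end_logits, start_pos, end_pos):
    """Average of the start-position and end-position cross-entropies."""
    start_loss = cross_entropy(start_logits, start_pos)
    end_loss = cross_entropy(end_logits, end_pos)
    return (start_loss + end_loss) / 2


# Uniform logits over 4 positions: each head contributes ln(4), so the mean is ln(4).
toy_start_logits = [0.0, 0.0, 0.0, 0.0]
toy_end_logits = [0.0, 0.0, 0.0, 0.0]
toy_loss = qa_loss(toy_start_logits, toy_end_logits, start_pos=1, end_pos=2)
print(round(toy_loss, 4))
```

A lower loss means the model assigns more probability to the target start and end positions.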

## Exploring the BERT model

In [7]:
qa_data = [
    ("Who was Jim Henson?", "Jim Henson was a nice puppet"),
    ("What did Jim Henson do?", "Jim Henson was a puppeteer and filmmaker"),
    ("Why is Jim Henson famous?", "Jim Henson is famous for creating the Muppets"),
    ("Where did Jim Henson live?", "Jim Henson lived in the United States"),
    ("When did Jim Henson die?", "Jim Henson passed away in 1990"),
    ("what is this project for?", "This project is for GERAD"),
]

In [8]:
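def answer_question(question, text):
    """Return the answer span BERT extracts from `text` for `question`."""
    inputs = tokenizer(question, text, return_tensors="pt")
    with torch.no_grad():
        outputs = model(**inputs)
    # Most likely start and end token positions, chosen independently
    answer_start_index = torch.argmax(outputs.start_logits)
    answer_end_index = torch.argmax(outputs.end_logits)
    # Decode the selected tokens; an end index before the start yields an empty string
    span = inputs["input_ids"][0, answer_start_index : answer_end_index + 1]
    return tokenizer.decode(span, skip_special_tokens=True)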

In [None]:
print("Chat with the BERT QA system (type 'exit' to stop):")
while True:
    user_input = input("You: ")
    
    if user_input.lower() == 'exit':
        print("Exiting the chat.")
        break
    
    found_answer = False
    for q, t in qa_data:
        if q.lower() in user_input.lower():
            answer = answer_question(q, t)
            print("BERT:", answer)
            found_answer = True
            break
    
    if not found_answer:
        print("BERT: I'm sorry, I don't have an answer for that question.")


Chat with the BERT QA system (type 'exit' to stop):
You: hey
BERT: I'm sorry, I don't have an answer for that question.
You: what is this project for?
BERT: GERAD
You: What did Jim Henson do?
BERT: puppeteer and filmmaker
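
The lookup inside the loop only succeeds when a stored question appears verbatim (ignoring case) in what the user typed, which is why "hey" falls through to the fallback message. A minimal sketch that isolates this matching step into a function, so it can be checked without loading the model, is shown below. The name `find_entry` is hypothetical, and `sample` is a two-item copy of `qa_data`.

```python
def find_entry(user_input, qa_data):
    """Return the first (question, text) pair whose question is a
    case-insensitive substring of user_input, or None if nothing matches."""
    lowered = user_input.lower()
    for question, text in qa_data:
        if question.lower() in lowered:
            return question, text
    return None


sample = [
    ("What did Jim Henson do?", "Jim Henson was a puppeteer and filmmaker"),
    ("what is this project for?", "This project is for GERAD"),
]

print(find_entry("Tell me, WHAT DID JIM HENSON DO?", sample))
print(find_entry("hey", sample))
```

In the chat loop, the returned pair could be passed straight to `answer_question`, with `None` triggering the fallback message.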
