### Question Answering (QA) with BERT and GPT

Question Answering (QA) is a task in Natural Language Processing (NLP) that involves developing models capable of understanding a question and providing an appropriate answer based on a given context. This task can be approached in several ways, including extracting answers from documents or generating answers based on pre-existing knowledge.

#### Types of Question Answering

1. **Extractive QA**: In extractive question answering, the system identifies and selects a portion of the input text that contains the correct answer. For instance, given a passage, the model extracts a span of text that directly answers the question. An example:

   - Question: "Who wrote 'Pride and Prejudice'?"
   - Context: "Jane Austen wrote 'Pride and Prejudice' in 1813."
   - Answer: "Jane Austen"

2. **Abstractive QA**: In abstractive question answering, the system generates a natural-language answer by summarizing or rephrasing the relevant information, rather than copying a span of text from the document. For instance:

   - Question: "What is the capital of France?"
   - Context: "France is a country located in Europe, and its capital is Paris."
   - Answer: "Paris is the capital of France."
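
The distinction can be checked mechanically: an extractive answer occurs verbatim in the context, while an abstractive answer generally does not. A minimal sketch using the two examples above, where `is_extractive` is a hypothetical helper introduced only for illustration:

```python
def is_extractive(answer: str, context: str) -> bool:
    """Return True if the answer appears verbatim as a span of the context."""
    return answer in context


# Extractive example: the answer is copied from the context
print(is_extractive(
    "Jane Austen",
    "Jane Austen wrote 'Pride and Prejudice' in 1813.",
))  # True

# Abstractive example: the answer is a newly written sentence
print(is_extractive(
    "Paris is the capital of France.",
    "France is a country located in Europe, and its capital is Paris.",
))  # False
```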

#### Key Components of QA Systems

- **Context**: The document, passage, or knowledge base from which the answer will be derived.
- **Question**: The user's query that requires an answer.
- **Answer**: The output of the QA model, which can either be a direct text extraction or a generated response.

#### Example of QA

- **Context**: "Albert Einstein was a theoretical physicist, widely recognized for developing the theory of relativity."
- **Question**: "What is Albert Einstein known for?"
- **Answer**: "Developing the theory of relativity."
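
Extractive QA datasets such as SQuAD store each answer together with its character offset in the context. The sketch below builds such a record for the example above. The `answers` layout (`text` and `answer_start` lists) follows the SQuAD convention, while `make_record` is a hypothetical helper. The search is case-insensitive because the answer above capitalizes "Developing" and the context does not.

```python
def make_record(context: str, question: str, answer: str) -> dict:
    """Build a SQuAD-style record; raise ValueError if the answer is not a span."""
    # Case-insensitive search (assumes lowercasing preserves string length,
    # which holds for ASCII text)
    start = context.lower().find(answer.lower())
    if start == -1:
        raise ValueError("answer is not a span of the context")
    # Store the span exactly as it appears in the context
    span = context[start : start + len(answer)]
    return {
        "context": context,
        "question": question,
        "answers": {"text": [span], "answer_start": [start]},
    }


record = make_record(
    context=(
        "Albert Einstein was a theoretical physicist, widely recognized "
        "for developing the theory of relativity."
    ),
    question="What is Albert Einstein known for?",
    answer="Developing the theory of relativity.",
)
print(record["answers"])
```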

#### Challenges in QA

- **Ambiguity**: A question may admit several interpretations, each with a different correct answer.
- **Complexity**: Long or compound questions may require reasoning over several pieces of information before an accurate answer can be extracted or generated.
- **Context Understanding**: The model must locate the relevant part of the context and interpret it correctly to return the most relevant answer.

Recent QA systems are built on transformer-based models such as BERT, T5, and GPT. Fine-tuning these pre-trained models on QA datasets has led to state-of-the-art performance. The sections below use BERT for extractive QA and GPT-2 for generative QA.
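
Performance on extractive QA datasets such as SQuAD is commonly reported as exact match (EM) and token-level F1. The following is a simplified sketch of both metrics. The normalization here (lowercasing and stripping punctuation) is an assumption that omits steps of the official SQuAD evaluation script, such as article removal, and the function names are illustrative.

```python
import string
from collections import Counter


def normalize(text: str) -> list:
    """Lowercase, strip ASCII punctuation, and split on whitespace."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    return text.split()


def exact_match(prediction: str, reference: str) -> bool:
    """True if prediction and reference are identical after normalization."""
    return normalize(prediction) == normalize(reference)


def f1_score(prediction: str, reference: str) -> float:
    """Harmonic mean of token-level precision and recall."""
    pred, ref = normalize(prediction), normalize(reference)
    # Multiset intersection counts tokens shared by both answers
    overlap = sum((Counter(pred) & Counter(ref)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred)
    recall = overlap / len(ref)
    return 2 * precision * recall / (precision + recall)


print(exact_match("paris", "Paris"))
print(f1_score("Paris is the capital of France.", "Paris"))
```

EM rewards only answers that match the reference after normalization, while F1 gives partial credit to a verbose answer that contains the reference tokens.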


---


#### BERT


In [None]:
import torch
from transformers import BertForQuestionAnswering, BertTokenizer

# BERT model fine-tuned on the SQuAD dataset for Question Answering
model_name = "bert-large-uncased-whole-word-masking-finetuned-squad"
tokenizer = BertTokenizer.from_pretrained(model_name)
model = BertForQuestionAnswering.from_pretrained(model_name)


def predict_answer(question, context):
    # Tokenize the question/context pair. If the pair exceeds 512 tokens,
    # truncate only the context so the question stays intact.
    encoding = tokenizer(
        question,
        context,
        return_tensors="pt",
        max_length=512,
        truncation="only_second",
    )

    # Run the model without tracking gradients (inference only).
    # Passing the full encoding also supplies token_type_ids, which mark
    # which tokens belong to the question and which to the context.
    with torch.no_grad():
        outputs = model(**encoding)

    # Pick the highest-scoring start and end token positions independently
    start_index = torch.argmax(outputs.start_logits, dim=1).item()
    end_index = torch.argmax(outputs.end_logits, dim=1).item()

    # Independent argmax can place the end before the start;
    # in that case there is no valid span to return
    if end_index < start_index:
        return ""

    # Convert the selected token ids back to text
    answer_tokens = tokenizer.convert_ids_to_tokens(
        encoding["input_ids"][0][start_index : end_index + 1]
    )
    answer = tokenizer.convert_tokens_to_string(answer_tokens)

    return answer

Some weights of the model checkpoint at bert-large-uncased-whole-word-masking-finetuned-squad were not used when initializing BertForQuestionAnswering: ['bert.pooler.dense.bias', 'bert.pooler.dense.weight']
- This IS expected if you are initializing BertForQuestionAnswering from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing BertForQuestionAnswering from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).


In [7]:
question = "What is the capital of France?"
context = "France, officially the French Republic, is a country whose capital is Paris."
answer = predict_answer(question=question, context=context)

print(question)
print(answer)

What is the capital of France?
paris
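
Taking the argmax of the start and end scores independently, as `predict_answer` does, can select an end position that comes before the start. A common alternative scores every span with `start <= end` up to a maximum length and keeps the best one. The following is a minimal pure-Python sketch on made-up scores; `best_span` is an illustrative helper, not a `transformers` function.

```python
def best_span(start_scores, end_scores, max_answer_len=30):
    """Return the (start, end) pair maximizing start_scores[start] + end_scores[end],
    subject to start <= end < start + max_answer_len."""
    best, best_score = (0, 0), float("-inf")
    for s, s_score in enumerate(start_scores):
        # Only consider end positions at or after the start
        for e in range(s, min(s + max_answer_len, len(end_scores))):
            score = s_score + end_scores[e]
            if score > best_score:
                best, best_score = (s, e), score
    return best


# Made-up scores where the independent argmaxes disagree:
# the best start is index 3, but the best end is index 1
start_scores = [0.1, 0.2, 0.0, 2.0, 0.3]
end_scores = [0.0, 2.5, 0.1, 0.4, 1.9]
print(best_span(start_scores, end_scores))  # (3, 4)
```

The double loop is quadratic in the sequence length, bounded by `max_answer_len`. That cost is acceptable for a single 512-token input, but it is the reason such searches are usually restricted to the top few start and end candidates.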


---


#### GPT


In [None]:
import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

# Base GPT-2 checkpoint. Separate variable names keep the BERT tokenizer and
# model used by predict_answer from being overwritten in the same notebook.
gpt_model_name = "gpt2"
gpt_tokenizer = GPT2Tokenizer.from_pretrained(gpt_model_name)
gpt_model = GPT2LMHeadModel.from_pretrained(gpt_model_name)


def generate_answer(question, context):
    # Format the input text by combining the question and context
    input_text = f"Question: {question} Context: {context} Please answer the Question according to Context."

    # Tokenize the prompt and convert it to a tensor
    inputs = gpt_tokenizer.encode(input_text, return_tensors="pt", truncation=True)

    # Attention mask of ones: a single unpadded sequence, so attend to every token
    attention_mask = torch.ones_like(inputs)

    # Run the model and generate the output
    with torch.no_grad():
        outputs = gpt_model.generate(
            inputs,
            max_length=256,  # Maximum total length: prompt plus generated tokens
            attention_mask=attention_mask,
            pad_token_id=gpt_tokenizer.eos_token_id,  # GPT-2 has no pad token; reuse EOS
            no_repeat_ngram_size=2,  # Block any 2-gram from appearing twice
        )

    # generate() returns the prompt followed by the continuation;
    # keep only the newly generated tokens
    generated_ids = outputs[0][inputs.shape[1] :]
    answer = gpt_tokenizer.decode(generated_ids, skip_special_tokens=True)

    # The prompt contains no "Answer:" marker; if the model emits one,
    # keep only the text after it
    answer = answer.split("Answer:")[-1].strip()

    return answer

In [46]:
question = "What is CPU?"
context = "A CPU (Central Processing Unit) is the primary component of a computer that performs most of the processing inside. It interprets and executes instructions from programs, making it essential for the operation of a computer."
answer = generate_answer(question=question, context=context)

print(question)
print(answer)

What is CPU?
CPU is a term used to describe a processor that is used for a specific purpose. For example, a CPU can be used as a "computer" for "programming", or as an "office" or "workstation".
. The term "CPU" is derived from the Latin word "cpu", which means "processor". The word is also used in the sense of "processor" and "system". In the context of computer programming, the term is often used interchangeably with "machine". For more information on the use of CPU, see the CPU FAQ.

Note that this answer ignores the supplied context and contains fabricated claims, such as a Latin origin for "CPU". The base `gpt2` checkpoint is a general-purpose language model that has not been fine-tuned to follow instructions or answer questions, so it continues the prompt with plausible-sounding text rather than grounding its answer in the context.
