<a href="https://colab.research.google.com/github/Sagaust/DH-Computational-Methodologies/blob/main/Question_Answering.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Question Answering (QA)

---

**Definition:**  
Question Answering is a field within Natural Language Processing (NLP) that focuses on building systems capable of answering questions posed by humans. Questions are asked in natural language, and answers are either extracted from a given text or knowledge base or generated by a model.

---

## 📌 **Why is Question Answering Important?**

1. **Information Retrieval**: Quickly extract precise information from vast amounts of text.
2. **User Interaction**: Enhance user interactions with systems, as seen with chatbots and virtual assistants.
3. **Knowledge Verification**: Check whether a text contains a specific piece of information, or test a reader's understanding of it.
4. **Research and Education**: Assist in academic research and facilitate learning through interactive Q&A platforms.

---

## 🛠 **How Does Question Answering Work?**

At a high level, a QA system takes a question and a source of information (like a text document) as inputs and returns an answer. The complexity lies in understanding the question and finding the most relevant and accurate answer in the source.
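
As a rough illustration of this input/output contract, the sketch below splits the source into sentences and returns the one that shares the most words with the question. The function name `answer_by_overlap` and the tiny stopword list are assumptions made for this example only; practical systems use much richer matching than word overlap.

```python
import re

# Small, illustrative stopword list (an assumption, not a standard resource)
STOPWORDS = {"the", "a", "an", "is", "of", "to", "in", "who", "what", "when", "where"}

def tokenize(text):
    """Lowercase the text and return its set of non-stopword word tokens."""
    return {w for w in re.findall(r"[a-z0-9]+", text.lower()) if w not in STOPWORDS}

def answer_by_overlap(question, context):
    """Return the context sentence that shares the most words with the question."""
    # Split on whitespace that follows sentence-ending punctuation
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", context) if s.strip()]
    if not sentences:
        return None
    question_words = tokenize(question)
    return max(sentences, key=lambda s: len(question_words & tokenize(s)))

context = (
    "The Odyssey follows Odysseus on his journey home. "
    "The Iliad is an ancient Greek epic poem attributed to Homer."
)
print(answer_by_overlap("Who wrote the Iliad?", context))
```

The sketch returns a whole sentence rather than the exact answer, which shows where the real difficulty lies: locating the relevant passage is only the first step, and isolating "Homer" within it requires understanding what the question asks for.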

---

## 🌐 **Approaches to Question Answering**:

- **Rule-Based QA**: Uses manually crafted rules and heuristics to find answers. Often relies on keyword matching.
- **Retrieval-Based QA**: Ranks a predefined set of candidate answers or passages against the question and returns the best match.
- **Generative QA**: Uses language models to generate answers in natural language rather than selecting existing text, which suits open-ended questions.
- **Neural QA**: Uses deep learning models, especially transformer-based architectures like BERT, to understand the context and semantics of questions and potential answers.
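
To make the rule-based approach concrete, here is a minimal sketch with a single hand-written rule: it recognizes questions of the form "Who wrote X?" and looks for "X ... attributed to / written by NAME" in the context. The rule, the `RULES` list, and the function `rule_based_answer` are hypothetical and cover only this one phrasing, which also illustrates why such systems are brittle.

```python
import re

# Each rule pairs a question pattern with an answer template.
# Both are illustrative assumptions that cover a single phrasing.
RULES = [
    (
        re.compile(r"who wrote (?P<work>.+?)\?*$", re.IGNORECASE),
        # (?i:...) matches the work title case-insensitively, while the
        # answer group still requires capitalized words (a crude name test).
        r"(?i:{work})\b.*?\b(?:attributed to|written by)\s+"
        r"(?P<answer>[A-Z]\w*(?:\s+[A-Z]\w*)*)",
    ),
]

def rule_based_answer(question, context):
    """Return the answer found by the first matching rule, or None."""
    for question_pattern, answer_template in RULES:
        match = question_pattern.match(question.strip())
        if not match:
            continue
        # Escape the captured title so it is treated as literal text
        work = re.escape(match.group("work"))
        found = re.search(answer_template.format(work=work), context)
        if found:
            return found.group("answer")
    return None

context = "The Iliad is an ancient Greek epic poem attributed to Homer."
print(rule_based_answer("Who wrote the Iliad?", context))
print(rule_based_answer("When was the Iliad written?", context))
```

The second question matches no rule, so the function returns `None`. Covering it would require writing another rule by hand, which is the main cost of this approach.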

---

## 📚 **Applications of Question Answering**:

1. **Virtual Assistants**: Siri, Alexa, and Google Assistant employ QA to respond to user queries.
2. **Customer Support**: Chatbots use QA to provide instant answers to customer queries.
3. **Educational Tools**: Platforms that help students find answers to their academic questions.
4. **Research**: Academic search tools such as Semantic Scholar apply QA-style techniques to help researchers find relevant information in academic papers.

---

## 💡 **Insights from Question Answering**:

1. **Complexity of Language**: QA showcases the intricacies of human language, such as ambiguity, context dependence, and the need for world knowledge.
2. **Contextual Understanding**: Effective QA requires understanding the broader context, not just the immediate content around keywords.
3. **Interactivity**: QA systems let users engage with information interactively and return a direct answer instead of a list of documents, which can make them easier to use than traditional keyword search.

---

## 🛑 **Challenges with Question Answering**:

1. **Ambiguity**: Some questions can be ambiguous, requiring clarification.
2. **Long Contexts**: Finding answers in long documents can be computationally challenging.
3. **Answer Veracity**: Extracted or generated answers are not guaranteed to be accurate, and a system can return a wrong answer with high confidence.
4. **Diverse Question Forms**: The same question can be posed in many different ways.
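
One common way to cope with long contexts is to split the document into overlapping windows and run the QA model on each window separately, since models such as BERT accept only a limited number of input tokens (512 for the standard BERT checkpoints). The sketch below shows the windowing step alone. The helper `split_into_windows` and its parameters `window_size` and `step` are assumptions for illustration, and whitespace-separated words stand in for real model tokens.

```python
def split_into_windows(tokens, window_size, step):
    """Split a token list into overlapping windows.

    Each window holds at most `window_size` tokens and starts `step`
    tokens after the previous one, so consecutive windows overlap by
    `window_size - step` tokens.
    """
    if window_size <= 0 or not 0 < step <= window_size:
        raise ValueError("require window_size > 0 and 0 < step <= window_size")
    if not tokens:
        return []
    windows = []
    start = 0
    while True:
        windows.append(tokens[start:start + window_size])
        # Stop once the current window reaches the end of the token list
        if start + window_size >= len(tokens):
            break
        start += step
    return windows

# Whitespace splitting is a simplification; a real pipeline would use
# the model's own tokenizer.
tokens = "the iliad is an ancient greek epic poem attributed to homer".split()
for window in split_into_windows(tokens, window_size=5, step=3):
    print(window)
```

The overlap matters because an answer that straddles a window boundary would otherwise be cut in two. With overlapping windows, the answer is more likely to appear whole in at least one of them, at the cost of running the model more times.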

---

## 🧪 **Question Answering in Python**:

Hugging Face's Transformers library provides pre-trained models for question answering. The example below loads a BERT model fine-tuned on SQuAD and extracts an answer span from a short context:

```python
import torch
from transformers import BertTokenizer, BertForQuestionAnswering

# Load a BERT model fine-tuned on SQuAD, along with its tokenizer
model_name = 'bert-large-uncased-whole-word-masking-finetuned-squad'
model = BertForQuestionAnswering.from_pretrained(model_name)
tokenizer = BertTokenizer.from_pretrained(model_name)

# Define the question and context
question = "Who wrote the Iliad?"
context = "The Iliad is an ancient Greek epic poem attributed to Homer."

# Encode the question and context as a single input pair
inputs = tokenizer(question, context, return_tensors="pt")

# Run the model without tracking gradients (inference only)
with torch.no_grad():
    outputs = model(**inputs)

# Pick the most likely start and end token positions of the answer span
answer_start = outputs.start_logits.argmax().item()
answer_end = outputs.end_logits.argmax().item() + 1

# Decode the answer tokens back into text
answer = tokenizer.decode(inputs["input_ids"][0][answer_start:answer_end])

print(f"Answer: {answer}")
```