[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/vuhung16au/hf-transformer-trove/blob/main/examples/basic1.4/question-answering.ipynb)
[![View on GitHub](https://img.shields.io/badge/View_on-GitHub-blue?logo=github)](https://github.com/vuhung16au/hf-transformer-trove/blob/main/examples/basic1.4/question-answering.ipynb)

# Simple Question Answering with BERT

## 🎯 Learning Objectives
By the end of this notebook, you will understand:
- How extractive question answering works with BERT-style models
- How to use the Hugging Face `pipeline` API for question answering
- How to extract answers from a text context
- How to interpret confidence scores in QA tasks

## 📋 Prerequisites
- Basic understanding of machine learning concepts
- Familiarity with Python and PyTorch
- Knowledge of NLP fundamentals (refer to [NLP Learning Journey](https://github.com/vuhung16au/nlp-learning-journey))

## 📚 What We'll Cover
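1. **Introduction**: How BERT performs question answering
2. **Setup**: Import libraries and prepare the environment
3. **Basic QA Pipeline**: Loading a question answering model
4. **Simple Examples**: Testing with a context and questions
5. **Understanding Results**: Interpreting answers and confidence scores
6. **More Examples**: Trying a different context
7. **Model Information**: Inspecting the model behind the pipeline
8. **Summary**: Key takeaways and next steps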

## 1. Introduction to Question Answering

**Question Answering (QA)** is a natural language processing task in which a model answers a question about a given text. This notebook covers *extractive* QA, where the answer is a span copied directly from that text. BERT-based models excel at this task because they can:

- **Understand Context**: Process the relationship between question and text
- **Find Relevant Spans**: Identify the most relevant text segments
- **Extract Answers**: Pull out specific answer spans from the context

### How BERT Does Question Answering

BERT for question answering adds a **span classification head** on top of the base BERT model. This linear layer:
1. Accepts the final hidden states from BERT
2. Applies a linear transformation to compute **span start and end logits** for every token
3. Is trained with cross-entropy loss between these logits and the labeled start and end positions
4. At inference time, returns the highest-scoring valid span as the answer, along with a confidence score
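
The span selection step can be sketched in plain Python. This is a simplified illustration, not the library's implementation: the logits are made up, `best_span` and `max_answer_len` are hypothetical names, and scoring a span as the product of its start and end probabilities is an assumption about how a confidence score could be formed.

```python
import math

def softmax(logits):
    """Convert raw logits to probabilities (numerically stable)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def best_span(start_logits, end_logits, max_answer_len=15):
    """Return (start, end, score) for the highest-scoring span with start <= end."""
    start_probs = softmax(start_logits)
    end_probs = softmax(end_logits)
    best = (0, 0, 0.0)
    for s, p_start in enumerate(start_probs):
        # Only consider spans that end at or after the start token
        for e in range(s, min(s + max_answer_len, len(end_probs))):
            score = p_start * end_probs[e]
            if score > best[2]:
                best = (s, e, score)
    return best

# Made-up logits for a six-token context (illustration only)
tokens = ["BERT", "was", "developed", "by", "Google", "."]
start_logits = [0.1, -1.0, -0.5, 0.2, 4.0, -2.0]
end_logits = [-0.5, -1.2, -0.8, -0.3, 3.5, 0.0]

start, end, score = best_span(start_logits, end_logits)
answer = " ".join(tokens[start:end + 1])
print(f"Answer: {answer} (score: {score:.3f})")
```

Restricting the search to `start <= end` and a maximum length is what keeps the selected span valid; taking the best start and best end independently could produce an end that precedes the start.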

## 2. Setup and Environment

Let's start by importing the necessary libraries and setting up our environment.

In [None]:
# Install required packages (uncomment if running for the first time)
# !pip install transformers torch

# Import essential libraries
import torch
from transformers import pipeline
import warnings

# Suppress warnings for cleaner output
warnings.filterwarnings('ignore')

# Device detection for optimal performance
def get_device() -> torch.device:
    """
    Automatically detect and return the best available device.
    
    Priority: CUDA > MPS (Apple Silicon) > CPU
    
    Returns:
        torch.device: The optimal device for current hardware
    """
    if torch.cuda.is_available():
        device = torch.device("cuda")
        print(f"🚀 Using CUDA GPU: {torch.cuda.get_device_name()}")
    elif torch.backends.mps.is_available():
        device = torch.device("mps")
        print("🍎 Using Apple MPS (Apple Silicon)")
    else:
        device = torch.device("cpu")
        print("💻 Using CPU (consider GPU for better performance)")
    
    return device

# Detect the best available device
device = get_device()

print("\n📚 Setup completed successfully!")
print(f"PyTorch version: {torch.__version__}")
print(f"Device: {device}")

## 3. Basic Question Answering with BERT

The Hugging Face `pipeline` function makes question answering simple. When no model is specified, it loads a default checkpoint, typically `distilbert-base-cased-distilled-squad`, a DistilBERT model fine-tuned on the SQuAD dataset for extractive question answering.

In [None]:
# Create a question-answering pipeline
# With no model argument, the pipeline loads its default checkpoint
# (typically distilbert-base-cased-distilled-squad, a distilled BERT variant)
print("🔄 Loading question answering model...")
# Pass the detected device directly so that MPS is used when it was selected above
qa_pipeline = pipeline("question-answering", device=device)

print("✅ Model loaded successfully!")
print("📊 Ready for question answering with BERT!")
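
Relying on the default checkpoint means the model can change between library versions. One way to make the choice explicit is to pin the model name, as in the sketch below. The helper names `build_qa_kwargs` and `load_pinned_qa_pipeline` are hypothetical, and passing the device as a string assumes a reasonably recent `transformers` release.

```python
MODEL_ID = "distilbert-base-cased-distilled-squad"  # checkpoint named in this notebook

def build_qa_kwargs(device_type: str) -> dict:
    """Build keyword arguments for pipeline() from a device type string."""
    # Unknown device types fall back to CPU
    device_arg = {"cuda": "cuda:0", "mps": "mps"}.get(device_type, "cpu")
    return {"task": "question-answering", "model": MODEL_ID, "device": device_arg}

def load_pinned_qa_pipeline(device_type: str = "cpu"):
    """Load the QA pipeline with an explicitly pinned checkpoint."""
    # Imported lazily so the kwargs helper can be used without transformers installed
    from transformers import pipeline
    return pipeline(**build_qa_kwargs(device_type))

print(build_qa_kwargs("cpu"))
```

In the notebook, `load_pinned_qa_pipeline(device.type)` could stand in for the default `pipeline("question-answering", ...)` call.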

## 4. Simple Question Answering Examples

Let's test our BERT-based QA model with some simple examples.

In [None]:
# Define a simple context about BERT and transformers
context = """
BERT (Bidirectional Encoder Representations from Transformers) is a transformer-based 
machine learning technique for natural language processing developed by Google. BERT was 
published in 2018 and stands for Bidirectional Encoder Representations from Transformers. 
Unlike previous models that read text sequentially, BERT reads the entire sequence of words 
at once, allowing it to learn the context of a word based on all of its surroundings. 
This makes BERT particularly effective for question answering tasks.
"""

# Define some questions about the context
questions = [
    "What does BERT stand for?",
    "Who developed BERT?",
    "When was BERT published?",
    "Why is BERT effective for question answering?"
]

print("🔍 QUESTION ANSWERING EXAMPLES")
print("=" * 50)
print(f"Context: {context.strip()}")
print("\n" + "=" * 50)

# Process each question
for i, question in enumerate(questions, 1):
    print(f"\n❓ Question {i}: {question}")
    
    # Get answer using BERT
    result = qa_pipeline(question=question, context=context)
    
    print(f"💡 Answer: {result['answer']}")
    print(f"📊 Confidence: {result['score']:.4f}")
    print(f"📍 Position: characters {result['start']}-{result['end']}")

## 5. Understanding the Results

Let's explore what the BERT model returns and how to interpret the results.

In [None]:
# Let's examine a single question in detail
sample_question = "What does BERT stand for?"
result = qa_pipeline(question=sample_question, context=context)

print("🔍 DETAILED RESULT ANALYSIS")
print("=" * 40)
print(f"Question: {sample_question}")
print(f"\nResult dictionary: {result}")

print(f"\n📊 EXPLANATION:")
print(f"• Answer: '{result['answer']}' - The extracted text span")
print(f"• Score: {result['score']:.4f} - Confidence level (0-1, higher is better)")
print(f"• Start: {result['start']} - Character position where answer begins")
print(f"• End: {result['end']} - Character position where answer ends")

# Show the context with the answer highlighted
print(f"\n🎯 ANSWER IN CONTEXT:")
start_pos = result['start']
end_pos = result['end']
before = context[:start_pos]
answer_span = context[start_pos:end_pos]
after = context[end_pos:]

print(f"{before}[**{answer_span}**]{after}")
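
The highlighting logic above can be wrapped in a small reusable helper. This is a sketch with a hypothetical function name, `highlight_answer`, and a hand-written result dictionary in place of real pipeline output. It assumes that `start` and `end` index into the same string that was passed as the context, and it checks that assumption before slicing.

```python
def highlight_answer(context: str, result: dict,
                     open_mark: str = "[**", close_mark: str = "**]") -> str:
    """Return the context with the answer span wrapped in markers."""
    start, end = result["start"], result["end"]
    # Guard against offsets that refer to a different string (e.g. a stripped copy)
    if context[start:end] != result["answer"]:
        raise ValueError("start/end offsets do not match the answer text")
    return context[:start] + open_mark + context[start:end] + close_mark + context[end:]

# Hand-written stand-in for a pipeline result (the score is made up)
demo_context = "BERT was developed by Google."
demo_start = demo_context.index("Google")
demo_result = {
    "answer": "Google",
    "score": 0.98,
    "start": demo_start,
    "end": demo_start + len("Google"),
}

highlighted = highlight_answer(demo_context, demo_result)
print(highlighted)
```

The offset check matters when the context is modified after the call, for example with `context.strip()`, because the character positions then shift.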

## 6. Testing with Different Examples

Let's try some additional examples to see how BERT handles different types of questions.

In [None]:
# Different context about machine learning
ml_context = """
Machine learning is a subset of artificial intelligence that enables computers to learn 
and make decisions without being explicitly programmed. The term was coined by Arthur Samuel 
in 1959. There are three main types of machine learning: supervised learning, unsupervised 
learning, and reinforcement learning. Deep learning is a specialized subset of machine learning 
that uses neural networks with multiple layers to analyze data.
"""

ml_questions = [
    "Who coined the term machine learning?",
    "What are the three main types of machine learning?",
    "What is deep learning?"
]

print("🧠 MACHINE LEARNING QA EXAMPLES")
print("=" * 50)

for question in ml_questions:
    result = qa_pipeline(question=question, context=ml_context)
    print(f"\n❓ Q: {question}")
    print(f"💡 A: {result['answer']} (confidence: {result['score']:.3f})")

## 7. Model Information

Let's check which model we're actually using for question answering.

In [None]:
# Get information about the model being used
print("🔍 MODEL INFORMATION")
print("=" * 30)

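# Read the checkpoint name and size from the loaded pipeline instead of hardcoding them.
# The default QA pipeline typically uses DistilBERT (a compressed version of BERT),
# i.e. distilbert-base-cased-distilled-squad with roughly 66M parameters.
num_params = sum(p.numel() for p in qa_pipeline.model.parameters())
print(f"Loaded Model: {qa_pipeline.model.name_or_path}")
print(f"Model Type: {qa_pipeline.model.config.model_type}")
print("Training Dataset (default model): SQuAD (Stanford Question Answering Dataset)")
print(f"Model Size: ~{num_params / 1e6:.0f}M parameters (vs ~110M for base BERT)")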
print("Use Case: Extractive Question Answering")

print("\n📊 MODEL CHARACTERISTICS:")
print("• Bidirectional: Reads text in both directions")
print("• Attention-based: Uses self-attention mechanisms")
print("• Pre-trained: Trained on large text corpora")
print("• Fine-tuned: Specialized for QA on SQuAD dataset")
print("• Extractive: Finds answer spans within the given context")

print("\n💡 HOW IT WORKS:")
print("1. Tokenizes question + context together")
print("2. BERT encoder processes the combined input")
print("3. A linear layer scores each token as a possible answer start or end")
print("4. Extracts text span with highest probability")
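
Step 1 above, combining the question and context into one input, can be illustrated with a toy sketch. Whitespace splitting stands in for the real WordPiece tokenizer, and `pack_qa_input` is a hypothetical helper, so the token boundaries here are only an approximation. The segment IDs follow the BERT convention; DistilBERT does not use them.

```python
def pack_qa_input(question: str, context: str):
    """Lay out a QA input as [CLS] question [SEP] context [SEP]."""
    q_tokens = question.split()  # stand-in for WordPiece tokenization
    c_tokens = context.split()
    tokens = ["[CLS]"] + q_tokens + ["[SEP]"] + c_tokens + ["[SEP]"]
    # Segment 0 covers [CLS] + question + first [SEP]; segment 1 covers the rest
    segment_ids = [0] * (len(q_tokens) + 2) + [1] * (len(c_tokens) + 1)
    # Index of the first context token; answer spans must start at or after it
    context_start = len(q_tokens) + 2
    return tokens, segment_ids, context_start

tokens, segment_ids, context_start = pack_qa_input(
    "Who developed BERT?", "BERT was developed by Google."
)
for token, segment in zip(tokens, segment_ids):
    print(f"{segment}  {token}")
print(f"Context starts at token index {context_start}")
```

Because the question and context share one sequence, the start and end logits cover every position, and only spans that fall inside the context portion are valid answers.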

## Summary

In this notebook, we learned about question answering with BERT:

### 🔑 Key Concepts Mastered
- **Question Answering**: Extracting answers from text contexts using questions
- **BERT for QA**: How BERT uses span classification for answer extraction
- **Pipeline Usage**: Simple interface for question answering tasks
- **Result Interpretation**: Understanding answers, confidence scores, and positions

### 📈 Best Practices Learned
- Use clear, specific questions for better results
- Provide sufficient context containing the answer information
- Check confidence scores to assess answer reliability
- Understand that extractive QA finds spans, not generated answers
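
The advice on confidence scores can be turned into a simple guard. The sketch below uses a hypothetical helper, `answer_or_abstain`, with hand-written result dictionaries; the threshold of 0.5 is an arbitrary assumption and would need tuning on real data.

```python
def answer_or_abstain(result: dict, min_score: float = 0.5,
                      fallback: str = "(no confident answer)") -> str:
    """Return the answer only if its score reaches the threshold."""
    return result["answer"] if result["score"] >= min_score else fallback

# Hand-written stand-ins for pipeline results (scores are made up)
confident = {"answer": "Google", "score": 0.97, "start": 22, "end": 28}
uncertain = {"answer": "machine learning", "score": 0.08, "start": 0, "end": 16}

print(answer_or_abstain(confident))
print(answer_or_abstain(uncertain))
```

An extractive model returns a span even when the context does not contain the answer, so a threshold like this is one way to avoid presenting a low-scoring guess as fact.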

### 🚀 Next Steps
- **Advanced QA**: Explore more sophisticated question-answering models
- **Custom Fine-tuning**: Learn to fine-tune QA models on specific domains
- **Documentation**: Check [Hugging Face Transformers Documentation](https://huggingface.co/docs/transformers/task_summary#question-answering)
- **Related Notebooks**: Explore other notebooks in this series for more NLP tasks

💡 **Notice how easy it is to use BERT for different tasks once it's been pretrained. You only need to add a task-specific head to the pretrained model and fine-tune it to turn the hidden states into your desired output!**

---

## About the Author

**Vu Hung Nguyen** - AI Engineer & Researcher

Connect with me:
- 🌐 **Website**: [vuhung16au.github.io](https://vuhung16au.github.io/)
- 💼 **LinkedIn**: [linkedin.com/in/nguyenvuhung](https://www.linkedin.com/in/nguyenvuhung/)
- 💻 **GitHub**: [github.com/vuhung16au](https://github.com/vuhung16au/)

*This notebook is part of the [HF Transformer Trove](https://github.com/vuhung16au/hf-transformer-trove) educational series.*