<a href="https://colab.research.google.com/github/dslmllab/dSL-Lab-Coding-Challenge/blob/main/6_question_answering_systems.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Question Answering Systems

## Learning Objectives

At the end of this notebook, you will be able to:

1. Understand different types of question answering systems
2. Implement rule-based and retrieval-based QA systems
3. Build reading comprehension models
4. Create knowledge-based question answering
5. Implement conversational QA systems
6. Handle multi-hop reasoning questions
7. Evaluate QA system performance
8. Deploy QA systems for production use

## Introduction to Question Answering

Question Answering (QA) is a computer science discipline that focuses on building systems that automatically answer questions posed by humans in natural language. QA systems combine techniques from information retrieval, natural language processing, and machine learning.

### Types of QA Systems:

1. **Factoid QA**: Answer specific factual questions ("Who is the president of France?")
2. **Reading Comprehension**: Answer questions based on given passages
3. **Open-domain QA**: Answer questions using large knowledge bases
4. **Conversational QA**: Multi-turn question answering in dialogue
5. **Visual QA**: Answer questions about images
6. **Multi-hop QA**: Questions requiring reasoning across multiple facts

### QA System Architecture:

1. **Question Analysis**: Parse and understand the question
2. **Document Retrieval**: Find relevant documents/passages
3. **Answer Extraction**: Extract or generate the answer
4. **Answer Ranking**: Score and rank potential answers
5. **Response Generation**: Format the final response

### Applications:

- **Search Engines**: Direct answers to search queries
- **Chatbots**: Customer service and virtual assistants
- **Educational Tools**: Tutoring and learning systems
- **Information Systems**: Enterprise knowledge bases
- **Healthcare**: Medical diagnosis assistance

In [24]:
# Install required packages!pip install numpy pandas matplotlib seaborn nltk spacy scikit-learn torch transformers tqdm networkx# Download spaCy English model if not presentimport spacytry:    spacy.load("en_core_web_sm")except OSError:    !python -m spacy download en_core_web_sm# Download required NLTK dataimport nltkfor item in ['punkt', 'stopwords', 'wordnet', 'averaged_perceptron_tagger', 'maxent_ne_chunker', 'words']:    nltk.download(item, quiet=True)

In [1]:
import re
import json
import nltk
import spacy
import numpy as np
import pandas as pd
from collections import defaultdict, Counter
import matplotlib.pyplot as plt
import seaborn as sns
from typing import List, Tuple, Dict, Any, Optional
import warnings
warnings.filterwarnings('ignore')

# Scikit-learn imports
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity
from sklearn.metrics import accuracy_score, f1_score

# Deep learning imports
import torch
import torch.nn as nn
import torch.optim as optim
from torch.utils.data import Dataset, DataLoader

# NLTK downloads
nltk_downloads = ['punkt', 'stopwords', 'wordnet','maxent_ne_chunker_tab', 'averaged_perceptron_tagger', 'maxent_ne_chunker', 'words','punkt_tab','averaged_perceptron_tagger_eng']
for item in nltk_downloads:
    nltk.download(item, quiet=True)

from nltk.tokenize import word_tokenize, sent_tokenize
from nltk.corpus import stopwords
from nltk.stem import WordNetLemmatizer
from nltk import ne_chunk, pos_tag

print("Libraries imported successfully!")

Libraries imported successfully!


## 1. Rule-Based Question Answering

Starting with a simple rule-based approach using patterns and templates.

In [26]:
class RuleBasedQA:
    def __init__(self):
        # Question patterns and their corresponding answer extractors
        self.question_patterns = {
            'who': {
                'patterns': [r'who is', r'who was', r'who are', r'who were'],
                'answer_type': 'PERSON',
                'extraction_method': 'named_entity'
            },
            'what': {
                'patterns': [r'what is', r'what was', r'what are', r'what were', r'what does'],
                'answer_type': 'DEFINITION',
                'extraction_method': 'definition'
            },
            'where': {
                'patterns': [r'where is', r'where was', r'where are', r'where were'],
                'answer_type': 'LOCATION',
                'extraction_method': 'named_entity'
            },
            'when': {
                'patterns': [r'when is', r'when was', r'when did', r'when will'],
                'answer_type': 'TIME',
                'extraction_method': 'temporal'
            },
            'how': {
                'patterns': [r'how many', r'how much', r'how long', r'how often'],
                'answer_type': 'QUANTITY',
                'extraction_method': 'numerical'
            }
        }

        # Simple knowledge base
        self.knowledge_base = {
            'facts': [
                "Albert Einstein was a German-born theoretical physicist.",
                "Paris is the capital of France.",
                "The Eiffel Tower is located in Paris, France.",
                "World War II ended in 1945.",
                "Shakespeare wrote Romeo and Juliet.",
                "The Great Wall of China is approximately 13,000 miles long.",
                "Marie Curie won Nobel prizes in Physics and Chemistry.",
                "The Amazon River is the longest river in the world.",
                "DNA stands for Deoxyribonucleic Acid.",
                "Mount Everest is the highest mountain in the world."
            ],
            'definitions': {
                'artificial intelligence': 'AI is the simulation of human intelligence in machines.',
                'machine learning': 'ML is a method of data analysis that automates analytical model building.',
                'natural language processing': 'NLP is a branch of AI that helps computers understand human language.',
                'deep learning': 'Deep learning is a subset of ML based on artificial neural networks.'
            }
        }

        self.lemmatizer = WordNetLemmatizer()
        self.stop_words = set(stopwords.words('english'))

    def analyze_question(self, question: str) -> Dict[str, Any]:
        """Analyze question to determine type and extract key information"""
        question_lower = question.lower().strip()

        # Determine question type
        question_type = None
        for q_type, info in self.question_patterns.items():
            for pattern in info['patterns']:
                if re.search(pattern, question_lower):
                    question_type = q_type
                    break
            if question_type:
                break

        # Extract keywords
        tokens = word_tokenize(question_lower)
        keywords = [self.lemmatizer.lemmatize(token) for token in tokens
                   if token.isalpha() and token not in self.stop_words]

        return {
            'type': question_type,
            'keywords': keywords,
            'original': question
        }

    def extract_named_entities(self, text: str, entity_type: str) -> List[str]:
        """Extract named entities of specified type"""
        tokens = word_tokenize(text)
        pos_tags = pos_tag(tokens)
        named_entities = ne_chunk(pos_tags)

        entities = []
        for chunk in named_entities:
            if hasattr(chunk, 'label') and chunk.label() == entity_type:
                entity_name = ' '.join([token for token, pos in chunk.leaves()])
                entities.append(entity_name)

        return entities

    def find_relevant_facts(self, keywords: List[str], top_k: int = 3) -> List[str]:
        """Find relevant facts from knowledge base"""
        relevant_facts = []

        for fact in self.knowledge_base['facts']:
            fact_lower = fact.lower()
            score = sum(1 for keyword in keywords if keyword in fact_lower)
            if score > 0:
                relevant_facts.append((fact, score))

        # Sort by relevance and return top-k
        relevant_facts.sort(key=lambda x: x[1], reverse=True)
        return [fact for fact, score in relevant_facts[:top_k]]

    def extract_answer(self, question_info: Dict[str, Any], relevant_facts: List[str]) -> str:
        """Extract answer based on question type and relevant facts"""
        if not relevant_facts:
            return "I don't have enough information to answer that question."

        question_type = question_info['type']
        keywords = question_info['keywords']

        if question_type == 'who':
            # Look for person names
            for fact in relevant_facts:
                persons = self.extract_named_entities(fact, 'PERSON')
                if persons:
                    return persons[0]

        elif question_type == 'where':
            # Look for locations
            for fact in relevant_facts:
                locations = self.extract_named_entities(fact, 'GPE')  # Geopolitical entity
                if locations:
                    return locations[0]
                # Simple pattern matching for location indicators
                if ' in ' in fact.lower():
                    parts = fact.lower().split(' in ')
                    if len(parts) > 1:
                        return parts[1].split('.')[0].strip().title()

        elif question_type == 'when':
            # Look for temporal expressions
            for fact in relevant_facts:
                # Simple regex for years
                years = re.findall(r'\b(19|20)\d{2}\b', fact)
                if years:
                    return years[0]

        elif question_type == 'what':
            # Check definitions first
            for keyword in keywords:
                if keyword in self.knowledge_base['definitions']:
                    return self.knowledge_base['definitions'][keyword]

            # Return most relevant fact
            return relevant_facts[0]

        elif question_type == 'how':
            # Look for numerical answers
            for fact in relevant_facts:
                numbers = re.findall(r'\b\d+[,\d]*\b', fact)
                if numbers:
                    return numbers[0]

        # Default: return most relevant fact
        return relevant_facts[0]

    def answer_question(self, question: str) -> Dict[str, Any]:
        """Main method to answer a question"""
        # Analyze question
        question_info = self.analyze_question(question)

        # Find relevant facts
        relevant_facts = self.find_relevant_facts(question_info['keywords'])

        # Extract answer
        answer = self.extract_answer(question_info, relevant_facts)

        return {
            'question': question,
            'question_type': question_info['type'],
            'keywords': question_info['keywords'],
            'relevant_facts': relevant_facts,
            'answer': answer
        }

# Initialize and test rule-based QA
rule_qa = RuleBasedQA()

# Test questions
test_questions = [
    "Who was Albert Einstein?",
    "What is the capital of France?",
    "Where is the Eiffel Tower located?",
    "When did World War II end?",
    "What is artificial intelligence?",
    "How long is the Great Wall of China?",
    "Who wrote Romeo and Juliet?"
]

print("Rule-Based Question Answering Results:")
print("=" * 60)

for question in test_questions:
    result = rule_qa.answer_question(question)
    print(f"Q: {question}")
    print(f"Type: {result['question_type']}")
    print(f"Keywords: {result['keywords']}")
    print(f"A: {result['answer']}")
    print("-" * 60)

Rule-Based Question Answering Results:
Q: Who was Albert Einstein?
Type: who
Keywords: ['albert', 'einstein']
A: Albert
------------------------------------------------------------
Q: What is the capital of France?
Type: what
Keywords: ['capital', 'france']
A: Paris is the capital of France.
------------------------------------------------------------
Q: Where is the Eiffel Tower located?
Type: where
Keywords: ['eiffel', 'tower', 'located']
A: Paris
------------------------------------------------------------
Q: When did World War II end?
Type: when
Keywords: ['world', 'war', 'ii', 'end']
A: 19
------------------------------------------------------------
Q: What is artificial intelligence?
Type: what
Keywords: ['artificial', 'intelligence']
A: I don't have enough information to answer that question.
------------------------------------------------------------
Q: How long is the Great Wall of China?
Type: how
Keywords: ['long', 'great', 'wall', 'china']
A: 13,000
-----------------------

## 2. Retrieval-Based Question Answering

Implementing a more sophisticated approach using document retrieval and similarity matching.

In [27]:
class RetrievalBasedQA:
    def __init__(self, corpus: List[str]):
        self.corpus = corpus
        self.vectorizer = TfidfVectorizer(
            stop_words='english',
            max_features=1000,
            ngram_range=(1, 2)
        )
        self.document_vectors = None
        self.build_index()

    def build_index(self):
        """Build TF-IDF index for the corpus"""
        print(f"Building index for {len(self.corpus)} documents...")
        self.document_vectors = self.vectorizer.fit_transform(self.corpus)
        print("Index built successfully!")

    def retrieve_documents(self, question: str, top_k: int = 5) -> List[Tuple[str, float]]:
        """Retrieve top-k most relevant documents for the question"""
        # Vectorize question
        question_vector = self.vectorizer.transform([question])

        # Calculate similarities
        similarities = cosine_similarity(question_vector, self.document_vectors).flatten()

        # Get top-k documents
        top_indices = np.argsort(similarities)[-top_k:][::-1]

        retrieved_docs = []
        for idx in top_indices:
            if similarities[idx] > 0:  # Only return documents with positive similarity
                retrieved_docs.append((self.corpus[idx], similarities[idx]))

        return retrieved_docs

    def extract_answer_from_passage(self, question: str, passage: str) -> str:
        """Extract answer from a passage using simple heuristics"""
        # Split passage into sentences
        sentences = sent_tokenize(passage)

        # Find sentence most similar to question
        if len(sentences) == 1:
            return sentences[0]

        # Vectorize question and sentences
        all_texts = [question] + sentences
        vectors = self.vectorizer.transform(all_texts)

        # Calculate similarities between question and each sentence
        question_vector = vectors[0:1]
        sentence_vectors = vectors[1:]

        similarities = cosine_similarity(question_vector, sentence_vectors).flatten()

        # Return most similar sentence
        best_sentence_idx = np.argmax(similarities)
        return sentences[best_sentence_idx]

    def answer_question(self, question: str, top_k: int = 3) -> Dict[str, Any]:
        """Answer question using retrieval-based approach"""
        # Retrieve relevant documents
        retrieved_docs = self.retrieve_documents(question, top_k)

        if not retrieved_docs:
            return {
                'question': question,
                'answer': "I couldn't find relevant information to answer your question.",
                'confidence': 0.0,
                'source_documents': []
            }

        # Extract answer from most relevant document
        best_passage, confidence = retrieved_docs[0]
        answer = self.extract_answer_from_passage(question, best_passage)

        return {
            'question': question,
            'answer': answer,
            'confidence': confidence,
            'source_documents': [doc for doc, score in retrieved_docs]
        }

# Create a larger corpus for retrieval-based QA
retrieval_corpus = [
    "Albert Einstein was a German-born theoretical physicist who developed the theory of relativity. He received the Nobel Prize in Physics in 1921.",
    "Paris is the capital and most populous city of France. It is known for landmarks like the Eiffel Tower and the Louvre Museum.",
    "The Eiffel Tower is a wrought-iron lattice tower on the Champ de Mars in Paris, France. It was completed in 1889 and stands 324 meters tall.",
    "World War II was a global war that lasted from 1939 to 1945. It ended with the surrender of Japan in September 1945.",
    "William Shakespeare was an English playwright and poet. He wrote famous plays including Romeo and Juliet, Hamlet, and Macbeth.",
    "The Great Wall of China is a series of fortifications built across northern China. It stretches approximately 13,000 miles in total length.",
    "Marie Curie was a Polish-French physicist and chemist. She was the first woman to win a Nobel Prize and the first person to win Nobel Prizes in two different sciences.",
    "The Amazon River is the longest river in the world, flowing approximately 4,000 miles through South America from Peru to Brazil.",
    "DNA, or Deoxyribonucleic Acid, is the hereditary material in humans and almost all other organisms. It contains genetic instructions for development and function.",
    "Mount Everest is Earth's highest mountain above sea level, located in the Himalayas between Nepal and Tibet. It stands 8,848.86 meters tall.",
    "Artificial Intelligence (AI) is the simulation of human intelligence processes by machines, especially computer systems. These processes include learning, reasoning, and self-correction.",
    "Machine Learning is a method of data analysis that automates analytical model building. It is a branch of artificial intelligence based on the idea that systems can learn from data.",
    "The Roman Empire was one of the largest empires in ancient history. At its peak, it controlled territory from Britain to the Middle East and North Africa.",
    "Leonardo da Vinci was an Italian Renaissance polymath known for his paintings, inventions, and scientific observations. His most famous works include the Mona Lisa and The Last Supper.",
    "The Pacific Ocean is the largest and deepest ocean on Earth. It covers about one-third of the Earth's surface and contains more than half of the free water on Earth."
]

# Initialize retrieval-based QA
retrieval_qa = RetrievalBasedQA(retrieval_corpus)

# Test questions
retrieval_test_questions = [
    "Who was Albert Einstein and what did he achieve?",
    "What is the height of Mount Everest?",
    "When was the Eiffel Tower completed?",
    "What is the length of the Amazon River?",
    "Who painted the Mona Lisa?",
    "What is machine learning?",
    "Which ocean is the largest?"
]

print("\nRetrieval-Based Question Answering Results:")
print("=" * 70)

for question in retrieval_test_questions:
    result = retrieval_qa.answer_question(question)
    print(f"Q: {question}")
    print(f"A: {result['answer']}")
    print(f"Confidence: {result['confidence']:.3f}")
    print(f"Source: {result['source_documents'][0][:100]}...")
    print("-" * 70)

Building index for 15 documents...
Index built successfully!

Retrieval-Based Question Answering Results:
Q: Who was Albert Einstein and what did he achieve?
A: Albert Einstein was a German-born theoretical physicist who developed the theory of relativity.
Confidence: 0.340
Source: Albert Einstein was a German-born theoretical physicist who developed the theory of relativity. He r...
----------------------------------------------------------------------
Q: What is the height of Mount Everest?
A: Mount Everest is Earth's highest mountain above sea level, located in the Himalayas between Nepal and Tibet.
Confidence: 0.317
Source: Mount Everest is Earth's highest mountain above sea level, located in the Himalayas between Nepal an...
----------------------------------------------------------------------
Q: When was the Eiffel Tower completed?
A: The Eiffel Tower is a wrought-iron lattice tower on the Champ de Mars in Paris, France.
Confidence: 0.406
Source: The Eiffel Tower is a wrought-ir

## 3. Reading Comprehension QA

Building a system that can answer questions based on a given passage.

In [28]:
class ReadingComprehensionQA:
    def __init__(self):
        self.vectorizer = TfidfVectorizer(stop_words='english', ngram_range=(1, 3))

        # Question type patterns
        self.question_types = {
            'who': r'^who\b',
            'what': r'^what\b',
            'where': r'^where\b',
            'when': r'^when\b',
            'why': r'^why\b',
            'how': r'^how\b',
            'which': r'^which\b',
            'yes_no': r'^(is|are|was|were|do|does|did|can|could|will|would)\b'
        }

    def identify_question_type(self, question: str) -> str:
        """Identify the type of question"""
        question_lower = question.lower().strip()

        for q_type, pattern in self.question_types.items():
            if re.match(pattern, question_lower):
                return q_type

        return 'unknown'

    def extract_keywords(self, text: str) -> List[str]:
        """Extract important keywords from text"""
        tokens = word_tokenize(text.lower())
        stop_words = set(stopwords.words('english'))

        # Filter out stop words and non-alphabetic tokens
        keywords = [token for token in tokens
                   if token.isalpha() and token not in stop_words and len(token) > 2]

        return keywords

    def find_answer_candidates(self, passage: str, question: str, question_type: str) -> List[Tuple[str, float]]:
        """Find potential answer spans in the passage"""
        sentences = sent_tokenize(passage)
        question_keywords = set(self.extract_keywords(question))

        candidates = []

        for sentence in sentences:
            sentence_keywords = set(self.extract_keywords(sentence))

            # Calculate keyword overlap
            overlap = len(question_keywords.intersection(sentence_keywords))
            overlap_score = overlap / len(question_keywords) if question_keywords else 0

            # Calculate semantic similarity using TF-IDF
            try:
                vectors = self.vectorizer.fit_transform([question, sentence])
                similarity = cosine_similarity(vectors[0:1], vectors[1:2])[0][0]
            except:
                similarity = 0

            # Combined score
            combined_score = 0.6 * similarity + 0.4 * overlap_score

            if combined_score > 0.1:  # Threshold for relevance
                candidates.append((sentence, combined_score))

        # Sort by score
        candidates.sort(key=lambda x: x[1], reverse=True)
        return candidates

    def extract_specific_answer(self, sentence: str, question: str, question_type: str) -> str:
        """Extract specific answer from sentence based on question type"""
        if question_type == 'yes_no':
            # For yes/no questions, return yes or no based on context
            return "Yes" if sentence else "No"

        elif question_type in ['who', 'what', 'where', 'when', 'why', 'how', 'which']:
            # For wh-questions, try to extract specific entities or phrases

            if question_type == 'who':
                # Look for person names (capitalized words)
                words = sentence.split()
                names = [word for word in words if word[0].isupper() and word.isalpha()]
                if names:
                    return ' '.join(names[:2])  # Return first two capitalized words

            elif question_type == 'when':
                # Look for temporal expressions
                time_patterns = [
                    r'\b(19|20)\d{2}\b',  # Years
                    r'\b(January|February|March|April|May|June|July|August|September|October|November|December)\b',  # Months
                    r'\b\d{1,2}\s+(January|February|March|April|May|June|July|August|September|October|November|December)\s+\d{4}\b'  # Dates
                ]

                for pattern in time_patterns:
                    matches = re.findall(pattern, sentence, re.IGNORECASE)
                    if matches:
                        return matches[0] if isinstance(matches[0], str) else ' '.join(matches[0])

            elif question_type == 'where':
                # Look for location indicators
                if ' in ' in sentence:
                    parts = sentence.split(' in ')
                    if len(parts) > 1:
                        location = parts[1].split(',')[0].split('.')[0].strip()
                        return location

            elif question_type == 'how':
                # Look for quantities or measurements
                quantity_patterns = [
                    r'\b\d+[,\d]*\s*(meters?|feet|miles?|kilometers?|years?|hours?|minutes?)\b',
                    r'\b\d+[,\d]*\b'
                ]

                for pattern in quantity_patterns:
                    matches = re.findall(pattern, sentence, re.IGNORECASE)
                    if matches:
                        return matches[0]

        # If no specific extraction, return the whole sentence
        return sentence

    def answer_question(self, passage: str, question: str) -> Dict[str, Any]:
        """Answer question based on given passage"""
        # Identify question type
        question_type = self.identify_question_type(question)

        # Find answer candidates
        candidates = self.find_answer_candidates(passage, question, question_type)

        if not candidates:
            return {
                'question': question,
                'answer': "I cannot find an answer to this question in the given passage.",
                'confidence': 0.0,
                'question_type': question_type,
                'source_sentence': None
            }

        # Get best candidate
        best_sentence, confidence = candidates[0]

        # Extract specific answer
        answer = self.extract_specific_answer(best_sentence, question, question_type)

        return {
            'question': question,
            'answer': answer,
            'confidence': confidence,
            'question_type': question_type,
            'source_sentence': best_sentence
        }

# Initialize reading comprehension QA
rc_qa = ReadingComprehensionQA()

# Test passage and questions
test_passage = """
Albert Einstein was born on March 14, 1879, in Ulm, Germany. He was a theoretical physicist who developed the theory of relativity, one of the two pillars of modern physics. Einstein received the Nobel Prize in Physics in 1921 for his services to theoretical physics, and especially for his discovery of the law of the photoelectric effect. He published more than 300 scientific papers and 150 non-scientific works. Einstein is widely regarded as one of the greatest physicists of all time. He died on April 18, 1955, in Princeton, New Jersey, at the age of 76.
"""

rc_test_questions = [
    "When was Albert Einstein born?",
    "Where was Einstein born?",
    "What did Einstein develop?",
    "When did Einstein receive the Nobel Prize?",
    "How many scientific papers did Einstein publish?",
    "Where did Einstein die?",
    "Was Einstein a physicist?"
]

print("\nReading Comprehension QA Results:")
print("=" * 70)
print(f"Passage: {test_passage.strip()}")
print("=" * 70)

for question in rc_test_questions:
    result = rc_qa.answer_question(test_passage, question)
    print(f"Q: {question}")
    print(f"A: {result['answer']}")
    print(f"Type: {result['question_type']}")
    print(f"Confidence: {result['confidence']:.3f}")
    print("-" * 70)


Reading Comprehension QA Results:
Passage: Albert Einstein was born on March 14, 1879, in Ulm, Germany. He was a theoretical physicist who developed the theory of relativity, one of the two pillars of modern physics. Einstein received the Nobel Prize in Physics in 1921 for his services to theoretical physics, and especially for his discovery of the law of the photoelectric effect. He published more than 300 scientific papers and 150 non-scientific works. Einstein is widely regarded as one of the greatest physicists of all time. He died on April 18, 1955, in Princeton, New Jersey, at the age of 76.
Q: When was Albert Einstein born?
A: March
Type: when
Confidence: 0.646
----------------------------------------------------------------------
Q: Where was Einstein born?
A: Ulm
Type: where
Confidence: 0.567
----------------------------------------------------------------------
Q: What did Einstein develop?
A: Einstein is widely regarded as one of the greatest physicists of all time.
Type: w

## 4. Neural Reading Comprehension Model

Implementing a simple neural network for reading comprehension.

In [29]:
class QADataset(Dataset):
    def __init__(self, contexts, questions, answers, vocab, max_length=200):
        self.contexts = contexts
        self.questions = questions
        self.answers = answers
        self.vocab = vocab
        self.max_length = max_length

    def __len__(self):
        return len(self.questions)

    def text_to_indices(self, text):
        """Convert text to indices using vocabulary"""
        tokens = word_tokenize(text.lower())
        indices = [self.vocab.get(token, self.vocab['<UNK>']) for token in tokens]

        # Pad or truncate
        if len(indices) < self.max_length:
            indices.extend([self.vocab['<PAD>']] * (self.max_length - len(indices)))
        else:
            indices = indices[:self.max_length]

        return indices

    def __getitem__(self, idx):
        context = self.contexts[idx]
        question = self.questions[idx]
        answer = self.answers[idx]

        # Convert to indices
        context_indices = self.text_to_indices(context)
        question_indices = self.text_to_indices(question)

        # For simplicity, we'll treat this as a classification problem
        # In a real scenario, you'd have start/end positions for the answer
        answer_class = hash(answer) % 100  # Simple hash-based classification

        return (
            torch.tensor(context_indices, dtype=torch.long),
            torch.tensor(question_indices, dtype=torch.long),
            torch.tensor(answer_class, dtype=torch.long)
        )

class NeuralQAModel(nn.Module):
    def __init__(self, vocab_size, embedding_dim=128, hidden_dim=256, num_classes=100):
        super(NeuralQAModel, self).__init__()

        self.embedding = nn.Embedding(vocab_size, embedding_dim)
        self.context_lstm = nn.LSTM(embedding_dim, hidden_dim, batch_first=True, bidirectional=True)
        self.question_lstm = nn.LSTM(embedding_dim, hidden_dim, batch_first=True, bidirectional=True)

        self.attention = nn.MultiheadAttention(hidden_dim * 2, num_heads=8, batch_first=True)
        self.classifier = nn.Linear(hidden_dim * 4, num_classes)
        self.dropout = nn.Dropout(0.3)

    def forward(self, context, question):
        # Embed inputs
        context_emb = self.embedding(context)  # (batch_size, seq_len, embedding_dim)
        question_emb = self.embedding(question)

        # LSTM encoding
        context_encoded, _ = self.context_lstm(context_emb)  # (batch_size, seq_len, hidden_dim*2)
        question_encoded, _ = self.question_lstm(question_emb)

        # Attention mechanism
        attended_context, _ = self.attention(question_encoded, context_encoded, context_encoded)

        # Pool representations
        context_pooled = torch.mean(context_encoded, dim=1)  # (batch_size, hidden_dim*2)
        question_pooled = torch.mean(attended_context, dim=1)

        # Combine representations
        combined = torch.cat([context_pooled, question_pooled], dim=1)  # (batch_size, hidden_dim*4)
        combined = self.dropout(combined)

        # Classification
        output = self.classifier(combined)

        return output

class NeuralReadingComprehension:
    def __init__(self):
        self.vocab = None
        self.model = None
        self.device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

    def build_vocabulary(self, texts, min_freq=2):
        """Build vocabulary from texts"""
        word_freq = Counter()

        for text in texts:
            tokens = word_tokenize(text.lower())
            word_freq.update(tokens)

        # Create vocabulary
        vocab = {'<PAD>': 0, '<UNK>': 1}
        idx = 2

        for word, freq in word_freq.items():
            if freq >= min_freq:
                vocab[word] = idx
                idx += 1

        self.vocab = vocab
        return vocab

    def create_training_data(self):
        """Create synthetic training data for demonstration"""
        contexts = [
            "Albert Einstein was born in Germany in 1879. He developed the theory of relativity.",
            "Paris is the capital of France. It is known for the Eiffel Tower.",
            "The Pacific Ocean is the largest ocean in the world. It covers one-third of Earth's surface.",
            "Shakespeare wrote many famous plays including Romeo and Juliet.",
            "The Great Wall of China is approximately 13,000 miles long."
        ]

        questions = [
            "Where was Einstein born?",
            "What is the capital of France?",
            "Which ocean is the largest?",
            "Who wrote Romeo and Juliet?",
            "How long is the Great Wall?"
        ]

        answers = [
            "Germany",
            "Paris",
            "Pacific Ocean",
            "Shakespeare",
            "13,000 miles"
        ]

        return contexts, questions, answers

    def train_model(self, contexts, questions, answers, epochs=10):
        """Train the neural QA model"""
        # Build vocabulary
        all_texts = contexts + questions + answers
        self.build_vocabulary(all_texts)

        # Create dataset
        dataset = QADataset(contexts, questions, answers, self.vocab)
        dataloader = DataLoader(dataset, batch_size=2, shuffle=True)

        # Initialize model
        vocab_size = len(self.vocab)
        self.model = NeuralQAModel(vocab_size).to(self.device)

        # Loss and optimizer
        criterion = nn.CrossEntropyLoss()
        optimizer = optim.Adam(self.model.parameters(), lr=0.001)

        # Training loop
        for epoch in range(epochs):
            total_loss = 0
            self.model.train()

            for context, question, answer in dataloader:
                context = context.to(self.device)
                question = question.to(self.device)
                answer = answer.to(self.device)

                optimizer.zero_grad()
                outputs = self.model(context, question)
                loss = criterion(outputs, answer)
                loss.backward()
                optimizer.step()

                total_loss += loss.item()

            avg_loss = total_loss / len(dataloader)
            print(f"Epoch {epoch+1}/{epochs}, Loss: {avg_loss:.4f}")

    def predict(self, context, question):
        """Predict answer for given context and question"""
        if self.model is None:
            return "Model not trained yet."

        self.model.eval()

        # Create dummy dataset for prediction
        dummy_answer = "dummy"
        dataset = QADataset([context], [question], [dummy_answer], self.vocab)
        context_tensor, question_tensor, _ = dataset[0]

        # Add batch dimension
        context_tensor = context_tensor.unsqueeze(0).to(self.device)
        question_tensor = question_tensor.unsqueeze(0).to(self.device)

        with torch.no_grad():
            outputs = self.model(context_tensor, question_tensor)
            predicted_class = torch.argmax(outputs, dim=1).item()

        return f"Predicted class: {predicted_class} (This is a simplified demo)"

# Demonstrate neural QA (simplified version)
neural_qa = NeuralReadingComprehension()

print("\nNeural Reading Comprehension Demo:")
print("=" * 50)

# Create training data
contexts, questions, answers = neural_qa.create_training_data()

print(f"Training on {len(contexts)} examples...")
neural_qa.train_model(contexts, questions, answers, epochs=5)

# Test prediction
test_context = "Leonardo da Vinci was an Italian artist born in 1452. He painted the Mona Lisa."
test_question = "Who painted the Mona Lisa?"

prediction = neural_qa.predict(test_context, test_question)
print(f"\nTest Context: {test_context}")
print(f"Test Question: {test_question}")
print(f"Prediction: {prediction}")

print("\nNote: This is a simplified neural QA model for demonstration.")
print("Production models like BERT-based QA systems are much more sophisticated.")


Neural Reading Comprehension Demo:
Training on 5 examples...
Epoch 1/5, Loss: 4.7051
Epoch 2/5, Loss: 3.2979
Epoch 3/5, Loss: 2.1885
Epoch 4/5, Loss: 2.3634
Epoch 5/5, Loss: 2.1271

Test Context: Leonardo da Vinci was an Italian artist born in 1452. He painted the Mona Lisa.
Test Question: Who painted the Mona Lisa?
Prediction: Predicted class: 82 (This is a simplified demo)

Note: This is a simplified neural QA model for demonstration.
Production models like BERT-based QA systems are much more sophisticated.


## 5. Conversational QA System

Building a multi-turn conversation system that maintains context across questions.

In [30]:
class ConversationalQA:
    def __init__(self, knowledge_base: List[str]):
        self.knowledge_base = knowledge_base
        self.conversation_history = []
        self.context_entities = set()
        self.current_topic = None

        # Initialize components
        self.retrieval_qa = RetrievalBasedQA(knowledge_base)
        self.rc_qa = ReadingComprehensionQA()

        # Coreference resolution patterns
        self.pronouns = {'he', 'she', 'it', 'they', 'him', 'her', 'them', 'his', 'hers', 'its', 'their'}
        self.demonstratives = {'this', 'that', 'these', 'those'}

    def extract_entities(self, text: str) -> List[str]:
        """Extract named entities from text"""
        tokens = word_tokenize(text)
        pos_tags = pos_tag(tokens)
        named_entities = ne_chunk(pos_tags)

        entities = []
        for chunk in named_entities:
            if hasattr(chunk, 'label'):
                entity_name = ' '.join([token for token, pos in chunk.leaves()])
                entities.append(entity_name)

        return entities

    def resolve_coreferences(self, question: str) -> str:
        """Simple coreference resolution using conversation history"""
        if not self.conversation_history:
            return question

        words = word_tokenize(question.lower())
        resolved_question = question

        # Replace pronouns and demonstratives with entities from context
        for word in words:
            if word in self.pronouns or word in self.demonstratives:
                if self.context_entities:
                    # Use the most recent entity as replacement
                    recent_entity = list(self.context_entities)[-1]
                    resolved_question = resolved_question.replace(word, recent_entity, 1)

        return resolved_question

    def update_context(self, question: str, answer: str):
        """Update conversation context with new information"""
        # Extract entities from question and answer
        question_entities = self.extract_entities(question)
        answer_entities = self.extract_entities(answer)

        # Update context entities
        self.context_entities.update(question_entities)
        self.context_entities.update(answer_entities)

        # Keep only recent entities to avoid context pollution
        if len(self.context_entities) > 10:
            self.context_entities = set(list(self.context_entities)[-10:])

        # Update conversation history
        self.conversation_history.append({
            'question': question,
            'answer': answer,
            'entities': question_entities + answer_entities
        })

        # Keep only recent history
        if len(self.conversation_history) > 5:
            self.conversation_history = self.conversation_history[-5:]

    def get_conversation_context(self) -> str:
        """Get relevant context from conversation history"""
        if not self.conversation_history:
            return ""

        context_parts = []
        for turn in self.conversation_history[-3:]:  # Last 3 turns
            context_parts.append(f"Q: {turn['question']} A: {turn['answer']}")

        return " ".join(context_parts)

    def ask_question(self, question: str) -> Dict[str, Any]:
        """Process a question in conversational context"""
        original_question = question

        # Resolve coreferences
        resolved_question = self.resolve_coreferences(question)

        # Get conversation context
        context = self.get_conversation_context()

        # Enhance question with context if needed
        if context and len(resolved_question.split()) < 5:  # Short questions might need context
            enhanced_question = f"{context} {resolved_question}"
        else:
            enhanced_question = resolved_question

        # Get answer using retrieval-based QA
        qa_result = self.retrieval_qa.answer_question(enhanced_question)
        answer = qa_result['answer']
        confidence = qa_result['confidence']

        # Update context
        self.update_context(original_question, answer)

        return {
            'original_question': original_question,
            'resolved_question': resolved_question,
            'enhanced_question': enhanced_question,
            'answer': answer,
            'confidence': confidence,
            'context_entities': list(self.context_entities),
            'conversation_length': len(self.conversation_history)
        }

    def reset_conversation(self):
        """Reset conversation state"""
        self.conversation_history = []
        self.context_entities = set()
        self.current_topic = None

    def get_conversation_summary(self) -> str:
        """Get a summary of the conversation"""
        if not self.conversation_history:
            return "No conversation yet."

        summary = f"Conversation with {len(self.conversation_history)} turns.\n"
        summary += f"Entities discussed: {', '.join(self.context_entities)}\n"
        summary += "Recent questions:\n"

        for i, turn in enumerate(self.conversation_history[-3:], 1):
            summary += f"{i}. {turn['question']}\n"

        return summary

# Initialize conversational QA
conv_qa = ConversationalQA(retrieval_corpus)

# Simulate a conversation
conversation_questions = [
    "Who was Albert Einstein?",
    "When was he born?",
    "What did he develop?",
    "When did he receive the Nobel Prize?",
    "Where did he die?"
]

print("\nConversational QA Demo:")
print("=" * 60)

for i, question in enumerate(conversation_questions, 1):
    result = conv_qa.ask_question(question)

    print(f"Turn {i}:")
    print(f"  User: {result['original_question']}")
    if result['resolved_question'] != result['original_question']:
        print(f"  Resolved: {result['resolved_question']}")
    print(f"  Bot: {result['answer']}")
    print(f"  Confidence: {result['confidence']:.3f}")
    print(f"  Context entities: {result['context_entities']}")
    print("-" * 60)

print("\nConversation Summary:")
print(conv_qa.get_conversation_summary())

# Test coreference resolution
print("\nTesting coreference resolution:")
print("=" * 40)

conv_qa.reset_conversation()
conv_qa.ask_question("Tell me about Marie Curie.")
result = conv_qa.ask_question("What did she accomplish?")

print(f"Original: What did she accomplish?")
print(f"Resolved: {result['resolved_question']}")
print(f"Answer: {result['answer']}")

Building index for 15 documents...
Index built successfully!

Conversational QA Demo:
Turn 1:
  User: Who was Albert Einstein?
  Bot: Albert Einstein was a German-born theoretical physicist who developed the theory of relativity.
  Confidence: 0.340
  Context entities: ['Albert', 'Albert Einstein', 'Einstein']
------------------------------------------------------------
Turn 2:
  User: When was he born?
  Resolved: WEinsteinn was he born?
  Bot: Albert Einstein was a German-born theoretical physicist who developed the theory of relativity.
  Confidence: 0.759
  Context entities: ['Albert', 'Albert Einstein', 'Einstein']
------------------------------------------------------------
Turn 3:
  User: What did he develop?
  Resolved: What did Einstein develop?
  Bot: Albert Einstein was a German-born theoretical physicist who developed the theory of relativity.
  Confidence: 0.779
  Context entities: ['Albert', 'Albert Einstein', 'Einstein']
--------------------------------------------------

## 6. QA System Evaluation

Implementing comprehensive evaluation metrics for QA systems.

In [31]:
class QAEvaluator:
    def __init__(self):
        self.metrics = {}

    def exact_match(self, predicted: str, ground_truth: str) -> float:
        """Calculate exact match score"""
        predicted_clean = predicted.strip().lower()
        ground_truth_clean = ground_truth.strip().lower()
        return 1.0 if predicted_clean == ground_truth_clean else 0.0

    def token_overlap_f1(self, predicted: str, ground_truth: str) -> float:
        """Calculate F1 score based on token overlap"""
        pred_tokens = set(word_tokenize(predicted.lower()))
        true_tokens = set(word_tokenize(ground_truth.lower()))

        if not pred_tokens and not true_tokens:
            return 1.0
        if not pred_tokens or not true_tokens:
            return 0.0

        overlap = pred_tokens.intersection(true_tokens)
        precision = len(overlap) / len(pred_tokens)
        recall = len(overlap) / len(true_tokens)

        if precision + recall == 0:
            return 0.0

        f1 = 2 * precision * recall / (precision + recall)
        return f1

    def semantic_similarity(self, predicted: str, ground_truth: str) -> float:
        """Calculate semantic similarity using TF-IDF cosine similarity"""
        try:
            vectorizer = TfidfVectorizer()
            vectors = vectorizer.fit_transform([predicted, ground_truth])
            similarity = cosine_similarity(vectors[0:1], vectors[1:2])[0][0]
            return similarity
        except:
            return 0.0

    def answer_relevance(self, predicted: str, question: str) -> float:
        """Calculate how relevant the answer is to the question"""
        try:
            vectorizer = TfidfVectorizer(stop_words='english')
            vectors = vectorizer.fit_transform([predicted, question])
            relevance = cosine_similarity(vectors[0:1], vectors[1:2])[0][0]
            return relevance
        except:
            return 0.0

    def evaluate_single(self, predicted: str, ground_truth: str, question: str) -> Dict[str, float]:
        """Evaluate a single QA prediction"""
        return {
            'exact_match': self.exact_match(predicted, ground_truth),
            'token_f1': self.token_overlap_f1(predicted, ground_truth),
            'semantic_similarity': self.semantic_similarity(predicted, ground_truth),
            'answer_relevance': self.answer_relevance(predicted, question)
        }

    def evaluate_dataset(self, predictions: List[str], ground_truths: List[str],
                        questions: List[str]) -> Dict[str, float]:
        """Evaluate entire dataset"""
        if len(predictions) != len(ground_truths) or len(predictions) != len(questions):
            raise ValueError("All lists must have the same length")

        all_metrics = defaultdict(list)

        for pred, truth, question in zip(predictions, ground_truths, questions):
            metrics = self.evaluate_single(pred, truth, question)
            for metric, value in metrics.items():
                all_metrics[metric].append(value)

        # Calculate average metrics
        avg_metrics = {}
        for metric, values in all_metrics.items():
            avg_metrics[f'avg_{metric}'] = np.mean(values)
            avg_metrics[f'std_{metric}'] = np.std(values)

        return avg_metrics

    def compare_systems(self, system_results: Dict[str, List[str]],
                       ground_truths: List[str], questions: List[str]) -> pd.DataFrame:
        """Compare multiple QA systems"""
        comparison_data = []

        for system_name, predictions in system_results.items():
            metrics = self.evaluate_dataset(predictions, ground_truths, questions)

            system_data = {'System': system_name}
            system_data.update(metrics)
            comparison_data.append(system_data)

        return pd.DataFrame(comparison_data)

    def detailed_error_analysis(self, predictions: List[str], ground_truths: List[str],
                               questions: List[str]) -> Dict[str, Any]:
        """Perform detailed error analysis"""
        error_types = {
            'exact_match_failures': [],
            'low_semantic_similarity': [],
            'irrelevant_answers': [],
            'empty_predictions': []
        }

        for i, (pred, truth, question) in enumerate(zip(predictions, ground_truths, questions)):
            metrics = self.evaluate_single(pred, truth, question)

            if metrics['exact_match'] == 0:
                error_types['exact_match_failures'].append({
                    'index': i,
                    'question': question,
                    'predicted': pred,
                    'ground_truth': truth,
                    'token_f1': metrics['token_f1']
                })

            if metrics['semantic_similarity'] < 0.3:
                error_types['low_semantic_similarity'].append({
                    'index': i,
                    'question': question,
                    'predicted': pred,
                    'ground_truth': truth,
                    'similarity': metrics['semantic_similarity']
                })

            if metrics['answer_relevance'] < 0.2:
                error_types['irrelevant_answers'].append({
                    'index': i,
                    'question': question,
                    'predicted': pred,
                    'relevance': metrics['answer_relevance']
                })

            if not pred.strip():
                error_types['empty_predictions'].append({
                    'index': i,
                    'question': question,
                    'ground_truth': truth
                })

        return error_types

# Create evaluation dataset
eval_questions = [
    "Who was Albert Einstein?",
    "What is the capital of France?",
    "When was the Eiffel Tower completed?",
    "How tall is Mount Everest?",
    "Who painted the Mona Lisa?"
]

ground_truth_answers = [
    "Albert Einstein was a German-born theoretical physicist",
    "Paris",
    "1889",
    "8,848.86 meters",
    "Leonardo da Vinci"
]

# Get predictions from different systems
rule_predictions = []
retrieval_predictions = []
rc_predictions = []

# Rule-based predictions
for question in eval_questions:
    result = rule_qa.answer_question(question)
    rule_predictions.append(result['answer'])

# Retrieval-based predictions
for question in eval_questions:
    result = retrieval_qa.answer_question(question)
    retrieval_predictions.append(result['answer'])

# Reading comprehension predictions (using Einstein passage for all questions)
for question in eval_questions:
    result = rc_qa.answer_question(test_passage, question)
    rc_predictions.append(result['answer'])

# Initialize evaluator
evaluator = QAEvaluator()

# Evaluate systems
system_results = {
    'Rule-based': rule_predictions,
    'Retrieval-based': retrieval_predictions,
    'Reading Comprehension': rc_predictions
}

print("\nQA System Evaluation Results:")
print("=" * 60)

comparison_df = evaluator.compare_systems(system_results, ground_truth_answers, eval_questions)
print(comparison_df.round(3))

# Detailed evaluation for retrieval-based system
print("\nDetailed Evaluation for Retrieval-based System:")
print("=" * 50)

for i, (question, pred, truth) in enumerate(zip(eval_questions, retrieval_predictions, ground_truth_answers)):
    metrics = evaluator.evaluate_single(pred, truth, question)
    print(f"\nQuestion {i+1}: {question}")
    print(f"Predicted: {pred}")
    print(f"Ground Truth: {truth}")
    print(f"Exact Match: {metrics['exact_match']:.3f}")
    print(f"Token F1: {metrics['token_f1']:.3f}")
    print(f"Semantic Similarity: {metrics['semantic_similarity']:.3f}")
    print(f"Answer Relevance: {metrics['answer_relevance']:.3f}")

# Error analysis
print("\nError Analysis for Retrieval-based System:")
print("=" * 50)

error_analysis = evaluator.detailed_error_analysis(retrieval_predictions, ground_truth_answers, eval_questions)

for error_type, errors in error_analysis.items():
    if errors:
        print(f"\n{error_type.replace('_', ' ').title()}: {len(errors)} cases")
        for error in errors[:2]:  # Show first 2 examples
            print(f"  Q: {error['question']}")
            print(f"  Predicted: {error.get('predicted', 'N/A')}")
            if 'ground_truth' in error:
                print(f"  Expected: {error['ground_truth']}")


QA System Evaluation Results:
                  System  avg_exact_match  std_exact_match  avg_token_f1  \
0             Rule-based              0.0              0.0         0.100   
1        Retrieval-based              0.0              0.0         0.167   
2  Reading Comprehension              0.0              0.0         0.089   

   std_token_f1  avg_semantic_similarity  std_semantic_similarity  \
0         0.122                    0.116                    0.143   
1         0.258                    0.168                    0.238   
2         0.178                    0.082                    0.164   

   avg_answer_relevance  std_answer_relevance  
0                  0.40                 0.242  
1                  0.35                 0.089  
2                  0.20                 0.400  

Detailed Evaluation for Retrieval-based System:

Question 1: Who was Albert Einstein?
Predicted: Albert Einstein was a German-born theoretical physicist who developed the theory of relativity.
G

---

# Question Answering Challenges




### Challenge 1: Enhanced Question Classification
Improve the question classification system to handle more complex question types and patterns.

**Requirements:**
- Add support for complex question types (why, how much, which one, etc.)
- Implement question complexity scoring
- Handle multi-part questions
- Add confidence scoring for question classification

**Success Criteria:**
- Support 15+ question types
- Achieve 90%+ accuracy on question classification
- Handle compound questions correctly

In [32]:
# Your solution for Challenge 1
class EnhancedQuestionClassifier:
    def __init__(self):
        # TODO: Define comprehensive question patterns
        self.question_patterns = {}
        self.complexity_indicators = []

    def classify_question_type(self, question):
        # TODO: Implement enhanced question classification
        pass

    def calculate_complexity_score(self, question):
        # TODO: Calculate question complexity
        pass

    def handle_compound_questions(self, question):
        # TODO: Split and handle compound questions
        pass

    def get_classification_confidence(self, question):
        # TODO: Calculate confidence score
        pass

# Test your enhanced classifier
test_questions = [
    "Who was Einstein and what did he discover?",
    "Why is the sky blue?",
    "How much does an elephant weigh?",
    "Which planet is closest to the sun?"
]

# TODO: Test your implementation

### Challenge 2: Knowledge Graph QA
Build a question answering system that uses a knowledge graph to answer questions.

**Requirements:**
- Create a simple knowledge graph structure
- Implement graph traversal for answer finding
- Handle relationship-based queries
- Support multi-hop reasoning

**Success Criteria:**
- Build knowledge graph with 100+ entities
- Support relationship queries ("Who is married to X?")
- Handle 2-hop reasoning questions

In [33]:
# Your solution for Challenge 2
import networkx as nx

class KnowledgeGraphQA:
    def __init__(self):
        # TODO: Initialize knowledge graph
        self.graph = nx.DiGraph()
        self.entities = {}
        self.relations = {}

    def build_knowledge_graph(self, facts):
        # TODO: Build graph from facts
        pass

    def parse_query(self, question):
        # TODO: Parse question into graph query
        pass

    def traverse_graph(self, start_entity, relation, hops=1):
        # TODO: Traverse graph to find answers
        pass

    def answer_relationship_query(self, question):
        # TODO: Answer relationship-based questions
        pass

# Create sample knowledge graph
kg_facts = [
    ("Einstein", "born_in", "Germany"),
    ("Einstein", "developed", "Relativity Theory"),
    ("Paris", "capital_of", "France"),
    ("Eiffel Tower", "located_in", "Paris")
]

# TODO: Test your knowledge graph QA


### Challenge 3: Multi-Document QA
Build a system that can answer questions by combining information from multiple documents.

**Requirements:**
- Retrieve relevant passages from multiple documents
- Combine information across documents
- Handle conflicting information
- Provide source attribution

**Success Criteria:**
- Work with 100+ documents
- Successfully combine information from 3+ sources
- Handle contradictory information appropriately

In [34]:
# Your solution for Challenge 3
class MultiDocumentQA:
    def __init__(self, document_collection):
        # TODO: Initialize multi-document QA system
        self.documents = document_collection
        self.passage_retriever = None
        self.information_aggregator = None

    def retrieve_relevant_passages(self, question, top_k=10):
        # TODO: Retrieve passages from multiple documents
        pass

    def detect_information_conflicts(self, passages):
        # TODO: Detect conflicting information
        pass

    def aggregate_information(self, passages, question):
        # TODO: Combine information from multiple sources
        pass

    def provide_source_attribution(self, answer, sources):
        # TODO: Attribute answer to sources
        pass

    def answer_multi_document_question(self, question):
        # TODO: Main method for multi-document QA
        pass

# TODO: Test with multiple documents

### Challenge 4: Visual Question Answering
Extend the QA system to answer questions about images.

**Requirements:**
- Extract visual features from images
- Combine visual and textual information
- Handle spatial reasoning questions
- Support counting and color recognition

**Success Criteria:**
- Answer questions about objects in images
- Handle spatial relationships ("What is to the left of X?")
- Count objects accurately

In [35]:
# Your solution for Challenge 4
from PIL import Image
import torch
import torchvision.transforms as transforms

class VisualQA:
    def __init__(self):
        # TODO: Initialize visual QA system
        self.feature_extractor = None
        self.object_detector = None
        self.spatial_reasoner = None

    def extract_visual_features(self, image_path):
        # TODO: Extract features from image
        pass

    def detect_objects(self, image):
        # TODO: Detect objects in image
        pass

    def analyze_spatial_relationships(self, objects):
        # TODO: Analyze spatial relationships
        pass

    def count_objects(self, image, object_type):
        # TODO: Count specific objects
        pass

    def answer_visual_question(self, image_path, question):
        # TODO: Answer questions about images
        pass

# TODO: Test with sample images and questions



### Challenge 5: Reasoning-Intensive QA
Build a QA system that can handle complex reasoning tasks requiring multiple inference steps.

**Requirements:**
- Implement logical reasoning capabilities
- Handle mathematical word problems
- Support causal reasoning
- Provide step-by-step explanations

**Success Criteria:**
- Solve multi-step mathematical problems
- Handle logical puzzles and inference chains
- Provide clear reasoning explanations

In [36]:
# Your solution for Challenge 5
class ReasoningQA:
    def __init__(self):
        # TODO: Initialize reasoning QA system
        self.logical_reasoner = None
        self.math_solver = None
        self.causal_reasoner = None

    def parse_mathematical_problem(self, question):
        # TODO: Parse math word problems
        pass

    def solve_mathematical_problem(self, problem):
        # TODO: Solve mathematical problems step by step
        pass

    def perform_logical_inference(self, premises, conclusion):
        # TODO: Perform logical reasoning
        pass

    def analyze_causal_relationships(self, events):
        # TODO: Analyze cause and effect
        pass

    def generate_explanation(self, reasoning_steps):
        # TODO: Generate step-by-step explanations
        pass

    def answer_reasoning_question(self, question):
        # TODO: Main method for reasoning-intensive QA
        pass

# Test reasoning questions
reasoning_questions = [
    "If John has 5 apples and gives 2 to Mary, how many does he have left?",
    "All birds can fly. Penguins are birds. Can penguins fly?",
    "If it rains, the ground gets wet. The ground is wet. Did it rain?"
]

# TODO: Test your reasoning QA system

### Challenge 6: Real-time Collaborative QA
Build a real-time QA system that can learn from user feedback and collaborate with human experts.

**Requirements:**
- Implement online learning from user feedback
- Route difficult questions to human experts
- Learn from expert annotations
- Maintain quality over time

**Success Criteria:**
- Improve accuracy through user feedback
- Efficiently route questions to appropriate experts
- Demonstrate continuous learning

In [37]:
# Your solution for Challenge 6
import asyncio
from queue import Queue
import threading

class CollaborativeQA:
    def __init__(self):
        # TODO: Initialize collaborative QA system
        self.base_qa_system = None
        self.expert_queue = Queue()
        self.feedback_learner = None
        self.confidence_threshold = 0.8

    def assess_question_difficulty(self, question):
        # TODO: Assess if question needs expert help
        pass

    def route_to_expert(self, question, domain):
        # TODO: Route question to appropriate expert
        pass

    def collect_user_feedback(self, question, answer, feedback):
        # TODO: Collect and process user feedback
        pass

    def update_model_from_feedback(self, feedback_data):
        # TODO: Update model based on feedback
        pass

    def learn_from_expert_annotations(self, annotations):
        # TODO: Learn from expert corrections
        pass

    def answer_with_collaboration(self, question):
        # TODO: Main collaborative answering method
        pass

# TODO: Implement collaborative QA system

## 🎁 **Bonus Challenge: Production QA System**

Build a complete, production-ready question answering system with enterprise features.

### Requirements:
1. **Multi-modal Support**: Text, images, and audio questions
2. **Real-time Performance**: < 200ms response time
3. **Scalable Architecture**: Handle 100K+ questions per day
4. **Quality Assurance**: Automated quality checks
5. **Analytics Dashboard**: Usage and performance metrics
6. **API Management**: Rate limiting, authentication
7. **Model Management**: A/B testing, versioning
8. **Monitoring**: Real-time alerts and logging
9. **Multi-language Support**: 5+ languages
10. **Security**: Data privacy and access control

### Success Criteria:
- Handle 1000+ concurrent users
- 99.9% uptime
- Comprehensive API documentation
- Automated testing pipeline
- Docker and Kubernetes deployment
- Real-time monitoring dashboard

In [38]:
!pip3 install redis



In [39]:
# Your solution for Bonus Challenge
from flask import Flask, request, jsonify
import redis
import logging
from datetime import datetime
import asyncio

class ProductionQASystem:
    def __init__(self):
        # TODO: Initialize production QA system
        self.app = Flask(__name__)
        self.qa_models = {}  # Model registry
        self.cache = redis.Redis()
        self.analytics = {}
        self.setup_logging()
        self.setup_routes()

    def setup_logging(self):
        # TODO: Set up comprehensive logging
        pass

    def setup_routes(self):
        # TODO: Define API endpoints
        @self.app.route('/ask', methods=['POST'])
        def ask_question():
            # TODO: Main QA endpoint
            pass

        @self.app.route('/batch_ask', methods=['POST'])
        def batch_questions():
            # TODO: Batch processing endpoint
            pass

        @self.app.route('/analytics', methods=['GET'])
        def get_analytics():
            # TODO: Analytics dashboard
            pass

    def authenticate_request(self, request):
        # TODO: Implement authentication
        pass

    def rate_limit_check(self, user_id):
        # TODO: Implement rate limiting
        pass

    def quality_check(self, question, answer):
        # TODO: Automated quality checks
        pass

    def log_interaction(self, question, answer, response_time):
        # TODO: Log all interactions
        pass

    def monitor_performance(self):
        # TODO: Real-time performance monitoring
        pass

# Docker configuration
dockerfile = """
FROM python:3.9-slim

WORKDIR /app

COPY requirements.txt .
RUN pip install -r requirements.txt

COPY . .

EXPOSE 5000

CMD ["gunicorn", "--bind", "0.0.0.0:5000", "app:app"]
"""

# Kubernetes deployment
k8s_config = """
apiVersion: apps/v1
kind: Deployment
metadata:
  name: qa-system
spec:
  replicas: 5
  selector:
    matchLabels:
      app: qa-system
  template:
    metadata:
      labels:
        app: qa-system
    spec:
      containers:
      - name: qa-system
        image: qa-system:latest
        ports:
        - containerPort: 5000
        resources:
          requests:
            memory: "1Gi"
            cpu: "500m"
          limits:
            memory: "2Gi"
            cpu: "1000m"
"""

# TODO: Complete the production system implementation