# Generation Module Verification

This notebook tests the `src.generation` module, which provides:
- **Query Rewriter** - Converts context-dependent questions to standalone form
- **Generator** - Produces answers with I_DONT_KNOW fallback
- **Retrieval Grader** - CRAG component for document relevance
- **Hallucination Grader** - Self-RAG component for answer grounding

**Note:** Full tests require a GPU (Llama 3.1 8B; Step 3 loads the smaller Llama 3.2 1B Instruct for faster testing). Run on Kaggle for complete testing.

In [1]:
import sys
import os

sys.path.append(os.path.abspath(".."))

from src.generation import (
    create_generation_components,
    GenerationComponents,
    GradeDocuments,
    GradeHallucinations
)



## Step 1: Verify Pydantic Models

These models define the JSON output schemas for the graders.

In [2]:
print("📊 Pydantic Models:")
print(f"   • GradeDocuments: {list(GradeDocuments.model_fields.keys())}")
print(f"   • GradeHallucinations: {list(GradeHallucinations.model_fields.keys())}")

# Test instantiation
doc_grade = GradeDocuments(binary_score="yes")
hal_grade = GradeHallucinations(binary_score="no")

print(f"\n   Example GradeDocuments: {doc_grade.model_dump()}")
print(f"   Example GradeHallucinations: {hal_grade.model_dump()}")

print("\n✅ Pydantic models work!")

📊 Pydantic Models:
   • GradeDocuments: ['binary_score']
   • GradeHallucinations: ['binary_score']

   Example GradeDocuments: {'binary_score': 'yes'}
   Example GradeHallucinations: {'binary_score': 'no'}

✅ Pydantic models work!
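Both schemas expose a single `binary_score` field. As a rough illustration of what such a schema implies, here is a plain-Python sketch that validates a grader's raw JSON text. It is a standalone stand-in, not the module's Pydantic classes: the helper `parse_grade` and the assumption that only `"yes"` and `"no"` are valid scores are hypothetical.

```python
import json

# Assumption: graders emit only these two values (not confirmed by the module).
ALLOWED_SCORES = {"yes", "no"}


def parse_grade(raw: str) -> dict:
    """Parse a grader's JSON text and validate its single binary_score field."""
    data = json.loads(raw)
    if not isinstance(data, dict) or set(data) != {"binary_score"}:
        raise ValueError(f"expected exactly one 'binary_score' field, got: {data!r}")
    # Normalize case and whitespace before checking the value.
    score = str(data["binary_score"]).strip().lower()
    if score not in ALLOWED_SCORES:
        raise ValueError(f"binary_score must be 'yes' or 'no', got {score!r}")
    return {"binary_score": score}


print(parse_grade('{"binary_score": "Yes"}'))
```

A malformed payload such as `'{"score": "yes"}'` raises `ValueError` instead of passing through silently.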


## Step 2: Verify GenerationComponents Dataclass

In [3]:
import dataclasses

print("📊 GenerationComponents:")
if dataclasses.is_dataclass(GenerationComponents):
    fields = [f.name for f in dataclasses.fields(GenerationComponents)]
    print(f"   • Type: dataclass")
    print(f"   • Fields: {fields}")
else:
    print(f"   • Type: {type(GenerationComponents).__name__}")

print("\n✅ GenerationComponents structure verified!")

📊 GenerationComponents:
   • Type: dataclass
   • Fields: ['llm', 'query_rewriter', 'generator', 'retrieval_grader', 'hallucination_grader']

✅ GenerationComponents structure verified!
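The cell above only prints the field names. A stricter check could assert that every expected field is present. The sketch below runs against a hypothetical stand-in dataclass (`StandInComponents`) so that it works without the module; in the notebook, the same helper could be pointed at `GenerationComponents`.

```python
import dataclasses
from typing import Any

# Field names taken from the output recorded above.
EXPECTED_FIELDS = [
    "llm",
    "query_rewriter",
    "generator",
    "retrieval_grader",
    "hallucination_grader",
]


@dataclasses.dataclass
class StandInComponents:
    """Hypothetical stand-in; the real class lives in src.generation."""
    llm: Any = None
    query_rewriter: Any = None
    generator: Any = None
    retrieval_grader: Any = None
    hallucination_grader: Any = None


def missing_fields(cls: type, expected: list) -> list:
    """Return the expected field names that the dataclass does not define."""
    present = {f.name for f in dataclasses.fields(cls)}
    return [name for name in expected if name not in present]


assert missing_fields(StandInComponents, EXPECTED_FIELDS) == []
```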


## Step 3: Test Factory Function (GPU Required)

Load the Llama model and create the components.

In [4]:
# Use smaller model for faster testing
MODEL_ID = "meta-llama/Llama-3.2-1B-Instruct"

print(f"🔄 Loading {MODEL_ID}...")
components = create_generation_components(model_id=MODEL_ID)

print(f"\n📊 Components Created:")
print(f"   • llm: {type(components.llm).__name__}")
print(f"   • query_rewriter: {type(components.query_rewriter).__name__}")
print(f"   • generator: {type(components.generator).__name__}")
print(f"   • retrieval_grader: {type(components.retrieval_grader).__name__}")
print(f"   • hallucination_grader: {type(components.hallucination_grader).__name__}")

print("\n✅ All components initialized!")

🔄 Loading meta-llama/Llama-3.2-1B-Instruct...
Creating Generation Components with model: meta-llama/Llama-3.2-1B-Instruct...


Device set to use cuda:0


Generation Components Ready.

📊 Components Created:
   • llm: HuggingFacePipeline
   • query_rewriter: RunnableSequence
   • generator: RunnableSequence
   • retrieval_grader: RunnableSequence
   • hallucination_grader: RunnableSequence

✅ All components initialized!


## Step 4: Test Chains

In [5]:
# Test Query Rewriter
print("🔄 Testing Query Rewriter...")

# NOTE: query_rewriter expects 'messages' key (not 'chat_history')
result = components.query_rewriter.invoke({
    "messages": [("user", "Who is the CEO of Apple?"), ("assistant", "Tim Cook.")],
    "question": "How old is he?"
})

print(f"   Input: 'How old is he?' (with context about Tim Cook)")
print(f"   Output: '{result}'")
print("✅ Query Rewriter works!")

🔄 Testing Query Rewriter...
   Input: 'How old is he?' (with context about Tim Cook)
   Output: '

Tim Cook'
✅ Query Rewriter works!
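The recorded output is `Tim Cook` preceded by two newlines, not a standalone question, so the success message only confirms that the chain ran. A minimal sketch of a stricter check follows. The helpers `clean_output` and `looks_like_question` are hypothetical, and treating a trailing `?` as the sign of a question is a crude heuristic, not the module's behavior.

```python
def clean_output(text: str) -> str:
    """Strip the leading and trailing whitespace the model tends to emit."""
    return text.strip()


def looks_like_question(text: str) -> bool:
    """Heuristic: a rewritten query should at least end with a question mark."""
    return clean_output(text).endswith("?")


recorded = "\n\nTim Cook"  # output recorded in the cell above
print(repr(clean_output(recorded)))
print(looks_like_question(recorded))
print(looks_like_question("How old is Tim Cook?"))
```

Applied to the recorded output, the heuristic returns `False`, which flags the rewrite for manual review.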


In [6]:
# Test Generator
print("🔄 Testing Generator...")

result = components.generator.invoke({
    "context": "Tim Cook is the CEO of Apple. He was born on November 1, 1960.",
    "question": "Who is Tim Cook?"
})

print(f"   Question: 'Who is Tim Cook?'")
print(f"   Answer: '{result[:200]}'")
print("✅ Generator works!")

🔄 Testing Generator...
   Question: 'Who is Tim Cook?'
   Answer: '

I_DONT_KNOW'
✅ Generator works!
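Here the generator returned the `I_DONT_KNOW` fallback even though the supplied context states who Tim Cook is, so the test shows that the chain runs but not that it answers from context. Downstream code probably needs to detect the fallback explicitly. The sketch below assumes the fallback arrives as the bare token surrounded only by whitespace, as in the recorded output; `is_fallback` is a hypothetical helper, not part of `src.generation`.

```python
FALLBACK_TOKEN = "I_DONT_KNOW"  # token seen in the recorded output


def is_fallback(answer: str) -> bool:
    """True when the answer is just the fallback token, ignoring whitespace."""
    return answer.strip() == FALLBACK_TOKEN


recorded = "\n\nI_DONT_KNOW"  # output recorded in the cell above
print(is_fallback(recorded))
print(is_fallback("Tim Cook is the CEO of Apple."))
```

The exact-match comparison is deliberate: an answer that merely mentions the token inside a longer sentence is not treated as a fallback.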


In [7]:
# Test Retrieval Grader
print("🔄 Testing Retrieval Grader...")

result = components.retrieval_grader.invoke({
    "document": "Tim Cook is the CEO of Apple since 2011.",
    "question": "Who is the CEO of Apple?"
})

print(f"   Document relevant? {result}")
print("✅ Retrieval Grader works!")

# Test Hallucination Grader
print("\n🔄 Testing Hallucination Grader...")

result = components.hallucination_grader.invoke({
    "documents": "Tim Cook was born on November 1, 1960.",
    "generation": "Tim Cook is currently 64 years old."
})

print(f"   Answer supported? {result}")
print("✅ Hallucination Grader works!")

🔄 Testing Retrieval Grader...
   Document relevant? {'binary_score': 'yes'}
✅ Retrieval Grader works!

🔄 Testing Hallucination Grader...
   Answer supported? {'binary_score': 'no'}
✅ Hallucination Grader works!
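Both graders return a dict with a `binary_score` of `'yes'` or `'no'`. Code that branches on these results may find a boolean easier to work with. The following sketch converts a grade dict to `bool`; `grade_to_bool` is a hypothetical helper, and rejecting any value other than yes/no is an assumption about how strict the pipeline should be.

```python
def grade_to_bool(grade: dict) -> bool:
    """Map {'binary_score': 'yes'|'no'} to True/False, rejecting anything else."""
    score = str(grade.get("binary_score", "")).strip().lower()
    if score not in ("yes", "no"):
        raise ValueError(f"unexpected binary_score: {grade!r}")
    return score == "yes"


# Grades recorded in the output above.
document_relevant = grade_to_bool({"binary_score": "yes"})
answer_supported = grade_to_bool({"binary_score": "no"})
print(document_relevant, answer_supported)
```

With the recorded grades, the document counts as relevant and the generated answer counts as unsupported by the documents.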


## Summary

**Components verified:**
- ✅ Pydantic models (GradeDocuments, GradeHallucinations)
- ✅ GenerationComponents dataclass
- ✅ Query Rewriter chain
- ✅ Generator chain
- ✅ Retrieval Grader chain
- ✅ Hallucination Grader chain