# The Shakespearean Scholar - Inference Notebook

This notebook demonstrates the RAG system's capabilities for answering questions about Julius Caesar.

**Make sure the Docker containers are running:**
```bash
docker-compose up -d
```

In [None]:
import requests
import json
from IPython.display import Markdown, display

BACKEND_URL = "http://localhost:8000"

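def query_rag(question: str, top_k: int = 5) -> dict:
    """Query the RAG system and return the parsed JSON response."""
    response = requests.post(
        f"{BACKEND_URL}/query",
        json={"query": question, "top_k": top_k, "include_sources": True},
        timeout=60,  # assumed limit so a stalled backend does not hang the notebook
    )
    response.raise_for_status()  # surface HTTP errors instead of failing later on missing keys
    return response.json()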

def display_result(question: str, result: dict):
    """Display query result nicely"""
    display(Markdown(f"## Question\n{question}"))
    display(Markdown(f"## Answer\n{result['answer']}"))
    display(Markdown(f"**Confidence:** {result['confidence']:.2f}"))
    
    display(Markdown("## Sources"))
    for i, source in enumerate(result['sources'], 1):
        meta = source['metadata']
        display(Markdown(
            f"**Source {i}:** Act {meta['act']}, Scene {meta['scene']} - {meta['speaker']}\n\n"
            f"> {source['chunk'][:200]}...\n"
        ))

print("✅ Utilities loaded")

## Check System Health

In [None]:
# Check if backend is running
response = requests.get(f"{BACKEND_URL}/health", timeout=10)
response.raise_for_status()  # fail early if the backend answers with an error status
health = response.json()
print(f"Status: {health['status']}")
print(f"Vector Store Count: {health['vector_store_count']} chunks")

# Get stats
stats = requests.get(f"{BACKEND_URL}/stats").json()
print(f"\nEmbedding Model: {stats['embedding_model']}")
print(f"Collection: {stats['collection_name']}")
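
Containers started in detached mode may need some time before the backend accepts requests, so the health check above can fail on a first attempt. The sketch below retries a readiness probe a fixed number of times. The helper name `wait_until_ready` and its defaults are illustrative assumptions, not part of the backend; the probe is passed in as a callable so the helper itself makes no assumptions about the API.

```python
import time

def wait_until_ready(check, attempts: int = 10, delay: float = 2.0, sleep=time.sleep) -> bool:
    """Call `check()` until it returns True or the attempts run out.

    `check` is any zero-argument callable; exceptions it raises (for example
    a refused connection) are treated as "not ready yet".
    """
    for attempt in range(1, attempts + 1):
        try:
            if check():
                return True
        except Exception:
            pass  # backend not reachable yet; try again
        if attempt < attempts:
            sleep(delay)  # wait before the next probe
    return False
```

With the utilities cell loaded, one possible probe is `wait_until_ready(lambda: requests.get(f"{BACKEND_URL}/health", timeout=5).ok)`.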

## Task 1: Factual Questions

Testing the system's ability to retrieve and answer direct factual questions.

In [None]:
# Question 1: Famous warning
question = "What does the Soothsayer say to Caesar?"
result = query_rag(question)
display_result(question, result)

In [None]:
# Question 2: Caesar's death
question = "What are Caesar's last words?"
result = query_rag(question)
display_result(question, result)

In [None]:
# Question 3: Final tribute
question = "What does Antony call Brutus at the end?"
result = query_rag(question)
display_result(question, result)
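
Because these three questions have well-known answers in the play, a simple keyword check can complement reading the output by eye. The sketch below is an addition layered on top of the notebook: `normalize`, `contains_expected`, and the `EXPECTED` phrases are hypothetical helpers, not part of the backend. It relies only on each result having an `answer` string, as `display_result` already assumes.

```python
import re
from typing import Iterable

def normalize(text: str) -> str:
    """Lowercase and collapse punctuation/whitespace so matching is forgiving."""
    return re.sub(r"[^a-z0-9]+", " ", text.lower()).strip()

def contains_expected(answer: str, expected_phrases: Iterable[str]) -> bool:
    """Return True if any expected phrase appears in the answer as whole words."""
    padded_answer = f" {normalize(answer)} "
    return any(f" {normalize(phrase)} " in padded_answer for phrase in expected_phrases)

# Hypothetical expected phrases for the three factual questions above.
EXPECTED = {
    "What does the Soothsayer say to Caesar?": ["Beware the ides of March"],
    "What are Caesar's last words?": ["Et tu, Brute"],
    "What does Antony call Brutus at the end?": ["noblest Roman of them all"],
}
```

After running one of the cells above, `contains_expected(result['answer'], EXPECTED[question])` gives a quick pass/fail signal. A paraphrased answer can be correct and still fail this check, so treat a miss as a prompt to read the answer rather than as proof of an error.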

## Task 2: Analytical Questions

Testing the system's ability to handle complex analytical questions requiring synthesis.

In [None]:
# Analytical Question 1: Character analysis
question = "What are Brutus's internal conflicts as shown in his soliloquy in Act 2, Scene 1?"
result = query_rag(question, top_k=5)
display_result(question, result)

In [None]:
# Analytical Question 2: Rhetorical analysis
question = "What rhetorical devices does Antony use in his funeral oration?"
result = query_rag(question, top_k=5)
display_result(question, result)

## Task 3: Comparative Questions

Testing the system's ability to compare and contrast elements from different parts of the play.

In [None]:
# Comparative Question
question = "Compare and contrast Brutus and Antony's speeches to the plebeians after Caesar's assassination."
result = query_rag(question, top_k=7)
display_result(question, result)
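
A comparison only works if the retrieved passages cover both speakers. As a rough check, the sketch below counts retrieved sources per speaker. It assumes the response shape that `display_result` already relies on (`sources` entries with a `metadata['speaker']` field). The helper names `speaker_coverage` and `missing_speakers` are hypothetical, and since the exact spelling of speaker names in the metadata is not known here, matching is case-insensitive.

```python
from collections import Counter

def speaker_coverage(result: dict) -> Counter:
    """Count retrieved sources per speaker (case-insensitive)."""
    return Counter(
        str(source.get("metadata", {}).get("speaker", "unknown")).strip().lower()
        for source in result.get("sources", [])
    )

def missing_speakers(result: dict, required: list) -> list:
    """Return the required speakers that have no retrieved source."""
    coverage = speaker_coverage(result)
    return [name for name in required if coverage[name.strip().lower()] == 0]
```

For the comparative question, `missing_speakers(result, ["Brutus", "Antony"])` returning a non-empty list suggests raising `top_k` or rephrasing the question before trusting the comparison.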

## Task 4: Thematic Questions

Testing understanding of broader themes in the play.

In [None]:
# Thematic Question
question = "What role do omens and supernatural elements play in the tragedy?"
result = query_rag(question, top_k=5)
display_result(question, result)

## Task 5: Batch Evaluation

Process multiple questions and analyze performance.

In [None]:
# Batch query test
test_questions = [
    "Who is Octavius?",
    "Why do Brutus and Cassius argue?",
    "What appears at Brutus's bedside in camp?",
    "How does Cassius die?",
    "How does Brutus die?"
]

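batch_response = requests.post(
    f"{BACKEND_URL}/batch_query",
    json={"queries": test_questions, "top_k": 3},
    timeout=120,  # assumed limit; a batch may take longer than a single query
)
batch_response.raise_for_status()  # stop here on an HTTP error rather than on a missing key below
batch_results = batch_response.json()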

print(f"Processed {batch_results['total']} questions\n")

# Show summary
for result in batch_results['results']:
    print(f"Q: {result['question']}")
    print(f"A: {result['answer'][:150]}...")
    print(f"Confidence: {result['confidence']:.2f}")
    print("-" * 80)

## Performance Analysis

In [None]:
# Calculate average confidence
confidences = [r['confidence'] for r in batch_results['results']]
avg_confidence = sum(confidences) / len(confidences)

print(f"Average Confidence: {avg_confidence:.3f}")
print(f"Min Confidence: {min(confidences):.3f}")
print(f"Max Confidence: {max(confidences):.3f}")

# Visualize
import matplotlib.pyplot as plt

plt.figure(figsize=(10, 5))
plt.bar(range(len(confidences)), confidences)
plt.axhline(y=avg_confidence, color='r', linestyle='--', label='Average')
plt.xlabel('Question Number')
plt.ylabel('Confidence Score')
plt.title('RAG System Confidence Scores')
plt.legend()
plt.ylim(0, 1)
plt.show()
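
Beyond the average, it can help to list which questions fall below a cutoff so they can be reviewed by hand. The sketch below assumes the `question` and `confidence` fields already used in the batch summary. The function name `flag_low_confidence` and the 0.5 threshold are illustrative assumptions, not values defined by the backend.

```python
def flag_low_confidence(results: list, threshold: float = 0.5) -> list:
    """Return (question, confidence) pairs below the threshold, lowest first."""
    flagged = [
        (r["question"], r["confidence"])
        for r in results
        if r["confidence"] < threshold
    ]
    return sorted(flagged, key=lambda pair: pair[1])
```

Calling `flag_low_confidence(batch_results['results'])` after the batch cell yields the questions most in need of a manual check against the play.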

## Conclusion

This notebook demonstrates the RAG system's capabilities across different question types:

1. **Factual Questions**: Direct retrieval from the text
2. **Analytical Questions**: Synthesis and interpretation
3. **Comparative Questions**: Cross-referencing multiple parts
4. **Thematic Questions**: Understanding broader patterns

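The system returns cited answers aimed at ICSE Class 10 students. The confidence scores reported above come from the backend itself, so answer accuracy should still be spot-checked against the text of the play.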