# Agent Demo - Multi-Agent Educational Tutor System

This notebook demonstrates a multi-agent system for educational tutoring:
1. **TutorAgent**: Provides explanations and educational content
2. **QuizAgent**: Generates quiz questions (stubbed in this notebook)
3. **EvaluatorAgent**: Evaluates explanations and quizzes in the LLM-as-judge style (stubbed in this notebook)

We'll walk through a complete flow: explanation → quiz generation → evaluation → memory storage.

In [2]:
# Standard-library and third-party imports
import json
import os
import sys
from pathlib import Path

from dotenv import load_dotenv

# Add the project root to sys.path before importing from src
project_root = Path().resolve().parent if Path().resolve().name == 'notebooks' else Path().resolve()
if str(project_root) not in sys.path:
    sys.path.insert(0, str(project_root))

# Import framework and agents
from src.agent_framework import Coordinator
from src.agents import TutorAgent
from src.memory import MemoryStore, initialize_demo_user_memory


In [4]:
# Load environment variables.
# __file__ is not defined inside a notebook, so this falls back to .env in the working directory.
env_path = Path(__file__).parent.parent / '.env' if '__file__' in globals() else Path('.env')
print(f"Environment Path: {env_path}")
# encoding='utf-16' assumes the .env file was saved as UTF-16; use 'utf-8' for a UTF-8 file.
load_dotenv(dotenv_path=env_path, verbose=True, encoding='utf-16')

print("✓ Imports successful")
print(f"✓ Project root: {project_root}")
print(f"✓ GEMINI_API_KEY configured: {os.getenv('GEMINI_API_KEY') is not None}")

Environment Path: .env
✓ Imports successful
✓ Project root: C:\Users\punno\Documents\GitHub\google-agent-intensive-capstone-project
✓ GEMINI_API_KEY configured: True


## Step 1: Agent Registration

We'll create and register three agents:
- **TutorAgent**: For generating explanations
- **QuizAgent**: For creating quiz questions (stub implementation)
- **EvaluatorAgent**: For evaluating content quality (stub implementation)

The Coordinator manages communication between these agents.
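
To make the routing idea concrete, here is a minimal sketch of what a coordinator of this kind might look like. It is not the `Coordinator` from `src.agent_framework`, whose internals are not shown in this notebook; `SketchAgent`, `EchoAgent`, and `SketchCoordinator` are hypothetical names, and only the `register`, `list_agents`, and `send` call shapes mirror the cells below.

```python
from typing import Any, Dict, List, Optional


class SketchAgent:
    """Hypothetical base class: an agent has a name and handles messages."""

    def __init__(self, name: str):
        self.name = name

    def handle_message(self, message: Dict[str, Any],
                       context: Optional[Dict[str, Any]] = None) -> Dict[str, Any]:
        raise NotImplementedError


class EchoAgent(SketchAgent):
    """Toy agent that returns the payload it was given."""

    def handle_message(self, message, context=None):
        return {
            "status": "ok",
            "payload": message.get("payload", {}),
            "request_id": message.get("request_id", "unknown"),
            "meta": {"agent": self.name},
        }


class SketchCoordinator:
    """Hypothetical registry that routes a message to a named agent."""

    def __init__(self):
        self._agents: Dict[str, SketchAgent] = {}

    def register(self, agent: SketchAgent) -> None:
        if agent.name in self._agents:
            raise ValueError(f"Agent already registered: {agent.name}")
        self._agents[agent.name] = agent

    def list_agents(self) -> List[str]:
        # Names come back in registration order
        return list(self._agents)

    def send(self, from_agent: str, to_agent: str,
             message: Dict[str, Any]) -> Dict[str, Any]:
        target = self._agents.get(to_agent)
        if target is None:
            # Unknown recipients get an error envelope instead of an exception
            return {
                "status": "error",
                "payload": {"error": f"Unknown agent: {to_agent}"},
                "request_id": message.get("request_id", "unknown"),
                "meta": {"agent": None},
            }
        return target.handle_message(message, context={"from_agent": from_agent})


sketch = SketchCoordinator()
sketch.register(EchoAgent("echo_agent"))
reply = sketch.send("caller", "echo_agent",
                    {"action": "ping", "payload": {"x": 1}, "request_id": "r-1"})
missing = sketch.send("caller", "nobody", {"action": "ping", "request_id": "r-2"})
```

Returning an error envelope for an unknown recipient keeps every reply in the same `status` / `payload` / `request_id` / `meta` shape, so callers can branch on `status` alone. The real framework may well handle this case differently.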

In [5]:
# Imports for the stub QuizAgent and EvaluatorAgent defined below (to be fully implemented later)
from src.agent_framework import Agent
from typing import Dict, Any, Optional

In [6]:
class QuizAgent(Agent):
    """Stub implementation of QuizAgent for demonstration."""
    
    def __init__(self, name: str = "quiz_agent"):
        super().__init__(name=name)
        # In a full implementation, this would initialize the Gemini API
    
    def handle_message(self, message: Dict[str, Any], context: Optional[Dict[str, Any]] = None) -> Dict[str, Any]:
        action = message.get("action", "unknown")
        payload = message.get("payload", {})
        request_id = message.get("request_id", "unknown")
        
        if action == "generate_quiz":
            topic = payload.get("topic", "")
            difficulty = payload.get("difficulty", "intermediate")
            n_questions = payload.get("n_questions", 5)
            
            # Stub response - in a full implementation, this would call the Gemini API
            return {
                "status": "ok",
                "payload": {
                    "questions": [
                        {
                            "id": f"q{i+1}",
                            "question": f"Sample question {i+1} about {topic}",
                            "options": ["Option A", "Option B", "Option C", "Option D"],
                            "correct_answer": "Option A",
                            "answer_index": 0,
                            "explanation": f"Explanation for question {i+1}"
                        }
                        for i in range(n_questions)
                    ],
                    "topic": topic,
                    "difficulty": difficulty
                },
                "request_id": request_id,
                "meta": {"agent": self.name}
            }
        else:
            return {
                "status": "error",
                "payload": {"error": f"Unknown action: {action}"},
                "request_id": request_id,
                "meta": {"agent": self.name}
            }

In [7]:
class EvaluatorAgent(Agent):
    """Stub implementation of EvaluatorAgent for demonstration."""
    
    def __init__(self, name: str = "evaluator_agent"):
        super().__init__(name=name)
        # In a full implementation, this would initialize the Gemini API
    
    def handle_message(self, message: Dict[str, Any], context: Optional[Dict[str, Any]] = None) -> Dict[str, Any]:
        action = message.get("action", "unknown")
        payload = message.get("payload", {})
        request_id = message.get("request_id", "unknown")
        
        if action == "evaluate":
            source_text = payload.get("source_text", "")
            candidate = payload.get("candidate", "")
            
            # Stub response - in a full implementation, this would call the Gemini API with EVALUATOR_PROMPT
            return {
                "status": "ok",
                "payload": {
                    "accuracy_score": 4,
                    "clarity_score": 5,
                    "completeness_score": 4,
                    "usefulness_score": 5,
                    "overall_score": 4.5,
                    "hallucinated_claims": [],
                    "strengths": ["Clear explanation", "Good examples"],
                    "weaknesses": ["Could be more detailed"],
                    "recommendations": ["Add more examples"]
                },
                "request_id": request_id,
                "meta": {"agent": self.name}
            }
        else:
            return {
                "status": "error",
                "payload": {"error": f"Unknown action: {action}"},
                "request_id": request_id,
                "meta": {"agent": self.name}
            }

In [15]:
# Create coordinator
coordinator = Coordinator(timeout=60.0)

# Create and register agents
tutor_agent = TutorAgent(name="tutor_agent", model_name="gemini-2.5-flash-lite")
quiz_agent = QuizAgent(name="quiz_agent")
evaluator_agent = EvaluatorAgent(name="evaluator_agent")

coordinator.register(tutor_agent)
coordinator.register(quiz_agent)
coordinator.register(evaluator_agent)

print("✓ Coordinator created")
print("✓ Agents registered:")
for agent_name in coordinator.list_agents():
    print(f"  - {agent_name}")

✓ Coordinator created
✓ Agents registered:
  - tutor_agent
  - quiz_agent
  - evaluator_agent


## Step 2: Request Explanation

We'll ask the TutorAgent to explain the "bias-variance tradeoff" at an intermediate level.
This demonstrates the core tutoring capability of the system.
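
Every request in this notebook uses the same envelope: an `action` string, a `payload` dict, and a `request_id`. As a rough sketch, a sender could check that shape before dispatching. The helper name `find_message_problems` is hypothetical and is not part of the project's framework.

```python
from typing import Any, Dict, List

# Keys taken from the request messages built in this notebook
REQUIRED_KEYS = ("action", "payload", "request_id")


def find_message_problems(message: Dict[str, Any]) -> List[str]:
    """Return a list of problems with a request envelope (empty if none)."""
    problems = []
    for key in REQUIRED_KEYS:
        if key not in message:
            problems.append(f"missing key: {key}")
    if "action" in message and not isinstance(message["action"], str):
        problems.append("action must be a string")
    if "payload" in message and not isinstance(message["payload"], dict):
        problems.append("payload must be a dict")
    return problems


good = find_message_problems({
    "action": "explain",
    "payload": {"topic": "bias-variance tradeoff", "level": "intermediate"},
    "request_id": "explain-001",
})
bad = find_message_problems({"action": "explain"})
```

Here `good` is an empty list, while `bad` reports the two missing keys.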

In [16]:
# Request explanation from TutorAgent
explain_message = {
    "action": "explain",
    "payload": {
        "topic": "bias-variance tradeoff",
        "level": "intermediate"
    },
    "request_id": "explain-001"
}

print("Requesting explanation from TutorAgent...")
print(f"Topic: {explain_message['payload']['topic']}")
print(f"Level: {explain_message['payload']['level']}\n")

Requesting explanation from TutorAgent...
Topic: bias-variance tradeoff
Level: intermediate



In [17]:
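# Only the three agents are registered, so tutor_agent is named as the sender as well as the recipient
explain_response = coordinator.send(
    from_agent="tutor_agent",
    to_agent="tutor_agent",
    message=explain_message
)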

if explain_response["status"] == "ok":
    explanation = explain_response["payload"]
    print("✓ Explanation received successfully\n")
    print("=" * 60)
    print("EXPLANATION SUMMARY")
    print("=" * 60)
    print(f"\n{explanation.get('summary', 'N/A')}\n")
    
    print("Step-by-step breakdown:")
    for i, step in enumerate(explanation.get('step_by_step', [])[:5], 1):
        print(f"  {i}. {step}")
    
    print(f"\nKey equations: {len(explanation.get('key_equations', []))} found")
    print(f"Examples provided: {len(explanation.get('examples', []))}")
    print(f"Confidence: {explanation.get('confidence', 'N/A')}")
else:
    print(f"✗ Error: {explain_response['payload']}")

✓ Explanation received successfully

EXPLANATION SUMMARY

The bias-variance tradeoff is a fundamental concept in machine learning that describes the relationship between two sources of error in a model: bias and variance.  Minimizing one often leads to an increase in the other, requiring a balance to achieve optimal predictive performance.

Step-by-step breakdown:
  1. Understand Bias: Bias represents the error introduced by approximating a real-world problem, which may be complex, by a simplified model. High bias means the model makes strong assumptions about the data, leading to underfitting.
  2. Understand Variance: Variance represents the error introduced by the model's sensitivity to small fluctuations in the training data. High variance means the model is too complex and learns the training data too well, including noise, leading to overfitting.
  3. The Tradeoff: As model complexity increases, bias typically decreases, but variance increases. Conversely, as model complexity dec

## Step 3: Generate Quiz Questions

After receiving the explanation, we'll ask the QuizAgent to generate quiz questions
on the same topic to test understanding.
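
Each question returned by the stub carries both a `correct_answer` string and an `answer_index`, so the two can drift apart once a real model produces them. The sketch below checks that they agree and scores a set of chosen answers. `check_question` and `grade` are hypothetical helpers, not functions from the project.

```python
from typing import Any, Dict, List


def check_question(question: Dict[str, Any]) -> List[str]:
    """Return consistency problems for one multiple-choice question."""
    problems = []
    options = question.get("options", [])
    index = question.get("answer_index")
    if not isinstance(index, int) or not 0 <= index < len(options):
        problems.append("answer_index is not a valid position in options")
    elif options[index] != question.get("correct_answer"):
        problems.append("correct_answer does not match options[answer_index]")
    return problems


def grade(questions: List[Dict[str, Any]], chosen: List[int]) -> int:
    """Count answers whose chosen index equals the question's answer_index."""
    return sum(1 for q, pick in zip(questions, chosen)
               if q.get("answer_index") == pick)


sample = [
    {"id": "q1", "options": ["Option A", "Option B"],
     "correct_answer": "Option A", "answer_index": 0},
    # Deliberately inconsistent: the index points at "Option A"
    {"id": "q2", "options": ["Option A", "Option B"],
     "correct_answer": "Option B", "answer_index": 0},
]
problems = [check_question(q) for q in sample]
score = grade(sample, [0, 1])
```

The first question passes the check and the second is flagged. With the picks `[0, 1]`, only the first answer matches its `answer_index`, so `score` is 1.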

In [18]:
# Request a quiz on the same topic as the explanation (the request carries the topic, not the explanation text)
quiz_message = {
    "action": "generate_quiz",
    "payload": {
        "topic": "bias-variance tradeoff",
        "difficulty": "intermediate",
        "n_questions": 5
    },
    "request_id": "quiz-001"
}

print("Requesting quiz generation from QuizAgent...")
print(f"Topic: {quiz_message['payload']['topic']}")
print(f"Number of questions: {quiz_message['payload']['n_questions']}\n")

Requesting quiz generation from QuizAgent...
Topic: bias-variance tradeoff
Number of questions: 5



In [19]:
quiz_response = coordinator.send(
    from_agent="tutor_agent",
    to_agent="quiz_agent",
    message=quiz_message
)

if quiz_response["status"] == "ok":
    quiz_data = quiz_response["payload"]
    questions = quiz_data.get("questions", [])
    print(f"✓ Quiz generated successfully: {len(questions)} questions\n")
    print("=" * 60)
    print("QUIZ QUESTIONS")
    print("=" * 60)
    
    for i, q in enumerate(questions[:3], 1):  # Show first 3 questions
        print(f"\nQuestion {i}: {q.get('question', 'N/A')}")
        if 'options' in q:
            for j, opt in enumerate(q['options'], 1):
                marker = "✓" if j-1 == q.get('answer_index', -1) else " "
                print(f"  {marker} {chr(64+j)}. {opt}")
        print(f"  Explanation: {q.get('explanation', 'N/A')}")
else:
    print(f"✗ Error: {quiz_response['payload']}")

✓ Quiz generated successfully: 5 questions

QUIZ QUESTIONS

Question 1: Sample question 1 about bias-variance tradeoff
  ✓ A. Option A
    B. Option B
    C. Option C
    D. Option D
  Explanation: Explanation for question 1

Question 2: Sample question 2 about bias-variance tradeoff
  ✓ A. Option A
    B. Option B
    C. Option C
    D. Option D
  Explanation: Explanation for question 2

Question 3: Sample question 3 about bias-variance tradeoff
  ✓ A. Option A
    B. Option B
    C. Option C
    D. Option D
  Explanation: Explanation for question 3


## Step 4: Evaluate Explanation Quality

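We'll use the EvaluatorAgent (LLM-as-judge) to evaluate the quality of the explanation
we received. Because the EvaluatorAgent here is a stub, the scores it returns are fixed
placeholders rather than a real judgment; the cells below demonstrate the message flow
for automatic quality assessment.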
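
The notebook does not say how `overall_score` is derived from the four rubric scores. The stub's fixed values (4, 5, 4, 5 and an overall of 4.5) are at least consistent with an unweighted mean, since (4 + 5 + 4 + 5) / 4 = 4.5. The sketch below assumes that aggregation; `overall_from_rubric` is a hypothetical helper, and the 0 to 5 range check is an assumption based on the "/5" in the printed results.

```python
from statistics import mean
from typing import Dict

# Rubric keys as they appear in the stub EvaluatorAgent's payload
RUBRIC_KEYS = ("accuracy_score", "clarity_score",
               "completeness_score", "usefulness_score")


def overall_from_rubric(evaluation: Dict[str, float]) -> float:
    """Assumed aggregation: unweighted mean of the four rubric scores."""
    scores = [evaluation[key] for key in RUBRIC_KEYS]
    for key, score in zip(RUBRIC_KEYS, scores):
        # Assumed valid range; the source only shows scores printed as "x/5"
        if not 0 <= score <= 5:
            raise ValueError(f"{key} out of range: {score}")
    return float(mean(scores))


stub_scores = {
    "accuracy_score": 4,
    "clarity_score": 5,
    "completeness_score": 4,
    "usefulness_score": 5,
}
overall = overall_from_rubric(stub_scores)  # (4 + 5 + 4 + 5) / 4 = 4.5
```

A real evaluator might weight accuracy more heavily or let the judge model report its own overall score, so treat the mean as one plausible choice.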

In [20]:
# Evaluate the explanation using EvaluatorAgent
# In a real scenario, we'd compare against source material
source_text = "Bias-variance tradeoff is a fundamental concept in machine learning..."

evaluate_message = {
    "action": "evaluate",
    "payload": {
        "source_text": source_text,
        "candidate": explanation.get("summary", "") + "\n" + "\n".join(explanation.get("step_by_step", []))
    },
    "request_id": "eval-001"
}

print("Requesting evaluation from EvaluatorAgent...\n")

Requesting evaluation from EvaluatorAgent...



In [21]:
eval_response = coordinator.send(
    from_agent="tutor_agent",
    to_agent="evaluator_agent",
    message=evaluate_message
)

if eval_response["status"] == "ok":
    evaluation = eval_response["payload"]
    print("✓ Evaluation completed successfully\n")
    print("=" * 60)
    print("EVALUATION RESULTS")
    print("=" * 60)
    print(f"\nAccuracy Score: {evaluation.get('accuracy_score', 'N/A')}/5")
    print(f"Clarity Score: {evaluation.get('clarity_score', 'N/A')}/5")
    print(f"Completeness Score: {evaluation.get('completeness_score', 'N/A')}/5")
    print(f"Usefulness Score: {evaluation.get('usefulness_score', 'N/A')}/5")
    print(f"Overall Score: {evaluation.get('overall_score', 'N/A')}/5")
    
    strengths = evaluation.get('strengths', [])
    if strengths:
        print("\nStrengths:")
        for strength in strengths:
            print(f"  ✓ {strength}")
    
    weaknesses = evaluation.get('weaknesses', [])
    if weaknesses:
        print("\nWeaknesses:")
        for weakness in weaknesses:
            print(f"  - {weakness}")
else:
    print(f"✗ Error: {eval_response['payload']}")

✓ Evaluation completed successfully

EVALUATION RESULTS

Accuracy Score: 4/5
Clarity Score: 5/5
Completeness Score: 4/5
Usefulness Score: 5/5
Overall Score: 4.5/5

Strengths:
  ✓ Clear explanation
  ✓ Good examples

Weaknesses:
  - Could be more detailed


## Step 5: Save Results to Memory

We'll save the explanation, quiz, and evaluation results to the memory store
for future reference and to track user progress.
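
For readers without access to `src.memory`, the sketch below shows one way a JSON-backed key-value store with tag search could work. It is an illustration under assumptions, not the project's `MemoryStore`: the class name `SketchMemoryStore` is hypothetical, and returning matching keys from `search_by_tag` is a guess, since the notebook only ever takes the length of that result.

```python
import json
import tempfile
from pathlib import Path
from typing import Any, Dict, List, Optional


class SketchMemoryStore:
    """Hypothetical key-value store persisted to a single JSON file."""

    def __init__(self, storage_path: str):
        self._path = Path(storage_path)
        self._data: Dict[str, Any] = {}
        if self._path.exists():
            self._data = json.loads(self._path.read_text(encoding="utf-8"))

    def save(self, key: str, value: Dict[str, Any]) -> None:
        self._data[key] = value
        self._path.parent.mkdir(parents=True, exist_ok=True)
        # Rewrite the whole file on every save; simple, but slow for large stores
        self._path.write_text(json.dumps(self._data, indent=2, default=str),
                              encoding="utf-8")

    def load(self, key: str) -> Optional[Dict[str, Any]]:
        return self._data.get(key)

    def search_by_tag(self, tag: str) -> List[str]:
        # Assumed behavior: return the keys of entries whose "tags" list contains the tag
        return [key for key, value in self._data.items()
                if isinstance(value, dict) and tag in value.get("tags", [])]


with tempfile.TemporaryDirectory() as tmp:
    path = str(Path(tmp) / "memory.json")
    store = SketchMemoryStore(path)
    store.save("session:001:quiz", {"tags": ["quiz", "bias-variance"]})
    store.save("session:001:summary", {"tags": ["session", "bias-variance"]})
    # Reopen from disk to confirm the data survived the round trip
    reopened = SketchMemoryStore(path)
    tagged = reopened.search_by_tag("bias-variance")
    quiz_only = reopened.search_by_tag("quiz")
```

Both entries carry the `bias-variance` tag, so `tagged` holds two keys and `quiz_only` holds one.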

In [22]:
# Initialize memory store
memory_store = MemoryStore(storage_path=str(project_root / "data" / "memory_store.json"))

# Save explanation
memory_store.save("session:001:explanation", {
    "topic": "bias-variance tradeoff",
    "level": "intermediate",
    "explanation": explanation,
    "timestamp": memory_store._get_timestamp(),
    "tags": ["explanation", "bias-variance", "intermediate"]
})

# Save quiz
memory_store.save("session:001:quiz", {
    "topic": "bias-variance tradeoff",
    "quiz_data": quiz_data,
    "timestamp": memory_store._get_timestamp(),
    "tags": ["quiz", "bias-variance", "intermediate"]
})

# Save evaluation
memory_store.save("session:001:evaluation", {
    "topic": "bias-variance tradeoff",
    "evaluation": evaluation,
    "timestamp": memory_store._get_timestamp(),
    "tags": ["evaluation", "bias-variance"]
})

# Save session summary
memory_store.save("session:001:summary", {
    "session_id": "001",
    "topic": "bias-variance tradeoff",
    "actions": ["explain", "generate_quiz", "evaluate"],
    "timestamp": memory_store._get_timestamp(),
    "tags": ["session", "bias-variance"]
})

print("✓ Results saved to memory store\n")
print("=" * 60)
print("MEMORY STORAGE")
print("=" * 60)

✓ Results saved to memory store

MEMORY STORAGE


In [23]:
# Verify saved data
saved_explanation = memory_store.load("session:001:explanation")
saved_quiz = memory_store.load("session:001:quiz")
saved_eval = memory_store.load("session:001:evaluation")

print(f"\n✓ Explanation saved: {saved_explanation is not None}")
print(f"✓ Quiz saved: {saved_quiz is not None}")
print(f"✓ Evaluation saved: {saved_eval is not None}")

# Search by tag
bias_variance_items = memory_store.search_by_tag("bias-variance")
print(f"\n✓ Found {len(bias_variance_items)} items tagged 'bias-variance'")

# Close memory store
memory_store.close()
print("\n✓ Memory store operations completed")


✓ Explanation saved: True
✓ Quiz saved: True
✓ Evaluation saved: True

✓ Found 4 items tagged 'bias-variance'

✓ Memory store operations completed


## Summary

This demonstration showed:

1. **Multi-Agent Coordination**: Three agents (Tutor, Quiz, Evaluator) registered with a single Coordinator
2. **Explanation Generation**: TutorAgent providing structured explanations
3. **Quiz Creation**: QuizAgent returning placeholder questions from its stub
4. **Quality Evaluation**: EvaluatorAgent returning fixed placeholder scores from its stub
5. **Memory Persistence**: Saving all results for future reference

The system demonstrates:
- ✅ Agent-to-agent communication via Coordinator
- ✅ Structured message passing
- ✅ LLM-powered content generation (TutorAgent only; the quiz and evaluation are stubbed)
- ✅ The message flow for automatic quality assessment
- ✅ Persistent memory storage

**Next Steps**: 
- Implement full QuizAgent and EvaluatorAgent with Gemini API
- Add SearchAgent for finding educational resources
- Integrate with user profiles for personalized tutoring
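
One piece a full EvaluatorAgent would likely need is turning the judge model's text reply into the score dictionary used above. The sketch below assumes the model is asked to answer with a single JSON object, possibly surrounded by extra prose. It makes no Gemini call, and `parse_judge_reply` is a hypothetical helper.

```python
import json
from typing import Any, Dict


def parse_judge_reply(reply_text: str) -> Dict[str, Any]:
    """Extract the outermost JSON object from a model reply, if there is one."""
    start = reply_text.find("{")
    end = reply_text.rfind("}")
    if start == -1 or end <= start:
        return {"status": "error", "payload": {"error": "no JSON object found"}}
    try:
        parsed = json.loads(reply_text[start:end + 1])
    except json.JSONDecodeError as exc:
        return {"status": "error", "payload": {"error": f"invalid JSON: {exc.msg}"}}
    return {"status": "ok", "payload": parsed}


ok_reply = parse_judge_reply('Here is my rating: {"accuracy_score": 4, "clarity_score": 5} Thanks.')
no_json = parse_judge_reply("I cannot rate this.")
broken = parse_judge_reply("{accuracy_score: 4}")
```

Returning the same `status` / `payload` envelope as the agents lets the caller pass a parse failure straight back through the Coordinator instead of raising.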

In [24]:
# Display complete results in JSON format
print("=" * 60)
print("COMPLETE SESSION RESULTS")
print("=" * 60)

results = {
    "explanation": explanation,
    "quiz": quiz_data,
    "evaluation": evaluation,
    "session_info": {
        "session_id": "001",
        "topic": "bias-variance tradeoff",
        "agents_used": coordinator.list_agents()
    }
}

print(json.dumps(results, indent=2, default=str))

COMPLETE SESSION RESULTS
{
  "explanation": {
    "summary": "The bias-variance tradeoff is a fundamental concept in machine learning that describes the relationship between two sources of error in a model: bias and variance.  Minimizing one often leads to an increase in the other, requiring a balance to achieve optimal predictive performance.",
    "step_by_step": [
      "Understand Bias: Bias represents the error introduced by approximating a real-world problem, which may be complex, by a simplified model. High bias means the model makes strong assumptions about the data, leading to underfitting.",
      "Understand Variance: Variance represents the error introduced by the model's sensitivity to small fluctuations in the training data. High variance means the model is too complex and learns the training data too well, including noise, leading to overfitting.",
      "The Tradeoff: As model complexity increases, bias typically decreases, but variance increases. Conversely, as model c