In [None]:
## Quickstart (How to run this notebook)

1. **Fork this notebook** on Kaggle.
2. In **Kaggle → Settings → Secrets**, create a secret called `GOOGLE_API_KEY` with your Gemini API key.
3. Run the cells in order:
   - Section 1: Environment & Core Imports
   - Sections 2–5: Data models, agents, session helpers, UX helpers
   - Section 6: Demo Cells
   - Sections 9–10: Evaluation + Metrics
4. Optionally:
   - Uncomment the `adk create ...` and `adk web ...` cells to experiment with the ADK Web UI in this environment.
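
If you also want to run the notebook outside Kaggle, step 2 can be sketched with a small fallback loader. This is only an illustration: `load_api_key` is a hypothetical helper (not part of the notebook), and falling back to an environment variable is our assumption about a reasonable local setup.

```python
import os


def load_api_key(secret_name: str = "GOOGLE_API_KEY") -> str:
    """Return the API key from Kaggle secrets, else from the environment.

    Hypothetical helper for illustration; the notebook itself reads the
    secret directly with UserSecretsClient.
    """
    try:
        # kaggle_secrets is only importable inside a Kaggle kernel.
        from kaggle_secrets import UserSecretsClient

        key = UserSecretsClient().get_secret(secret_name)
    except Exception:
        # Not on Kaggle, or the secret is missing: try the environment variable.
        key = os.environ.get(secret_name, "")

    if not key:
        raise RuntimeError(f"No value found for secret '{secret_name}'.")

    # Export the key so libraries that read the environment can find it.
    os.environ[secret_name] = key
    return key
```

Calling `load_api_key()` once at the top of the notebook fails early with a clear message instead of surfacing an authentication error on the first agent call.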


# Algorithm Mentor – Multi-Agent AI Tutor for Algorithms (Kaggle Demo)

## Problem & Value
Many undergrad students struggle with algorithms and data structures because:

- Explanations are either too formal or too shallow.
- Practice questions are scattered across textbooks and websites.
- Visual intuition for graphs, DP tables, and recursion is hard to build.
- ESL students and returning learners need simpler language and step-by-step help.

**Algorithm Mentor** uses a multi-agent AI tutor to:

- Give structured explanations (overview → intuition → trace → pseudocode → pitfalls).
- Generate synthetic practice problems and rubrics on demand.
- Visualize algorithms step-by-step on tiny examples.
- Adapt to a student's persona (overloaded undergrad, working parent, ESL learner) and track lightweight mastery.

In a real deployment, this could cut the time it takes to understand a topic such as Dijkstra's algorithm or knapsack
from **days of trial-and-error** to **one or two focused study sessions**.

**Algorithm Mentor** is a multi-agent AI tutor for algorithms and data structures, designed for stressed undergrads, returning learners, and ESL students.  
It combines a Concept Explainer, Problem Generator + Auto-Grader, Visualization Agent, and a Diagnostic + Personalization Orchestrator, all powered by Google’s ADK and Gemini.  

This notebook shows how these agents collaborate in one tutoring flow, how we track session state and mastery inside the notebook, and how we automatically evaluate the system with a Judge Agent.

---

## 1. How this project uses Agentic AI concepts

The competition asks us to apply several Agentic AI concepts.  
Here’s how this notebook maps to those requirements:

| Concept area                      | What is implemented in this notebook                                                                 |
|----------------------------------|--------------------------------------------------------------------------------------------------------|
| **Multi-agent system**           | Separate agents: Concept Explainer, ProblemGen + Auto-Grader, Visualization, Diagnostic + Personalization (orchestrator), and Judge. Each is an ADK `Agent` with its own system prompt + runner. |
| **Sequential / orchestrated flows** | The Diagnostic + Personalization Agent reads the session state and student message, then plans which specialist agents to call next (explain, practice, visualize). This simulates a sequential orchestration loop. |
| **Tools (built-in)**             | Agents are created with ADK and can use built-in tools (e.g., `google_search`) via the ADK `tools` interface. |
| **Sessions & state management**  | Custom `SessionState`, `StudentProfile`, `MasteryEntry`, and `MasteryUpdate` dataclasses track mode, topic, difficulty, mastery map, and recent intents inside the notebook kernel. |
| **Short-term context engineering** | The `SessionState.to_dict()` method exposes a compact JSON view (tail of `chat_history` + `rolling_summary`), which is injected into the Diagnostic Agent prompt to guide personalization. |
| **Long-term memory (lightweight)** | A simple JSON file (`algorithm_mentor_memory.json`) is used as a stand-in for a Memory Bank: we persist profile, mastery levels, and notes across kernel restarts and reload them on init. |
| **Context compaction**           | `compact_history_if_needed(...)` performs a basic compaction strategy: keep only the last N turns verbatim and fold earlier messages into a `rolling_summary` string. |
| **Observability (logs + metrics)** | `METRICS` dict tracks counts of explainer/problem/viz/diagnostic/judge calls and eval runs. Helper `print_metrics()` summarizes usage; agent helpers print traces to show what is being called. |
| **Agent evaluation**             | A separate Judge Agent (`judge_agent`) scores outputs from the Concept Explainer and ProblemGen agents using a JSON rubric. `run_eval_suite()` runs a tiny eval set and reports pass/fail + scores. |

> **Note:** In a real production system, these ideas would be wired into persistent `SessionService` / `MemoryService`, long-running operations, and full CI/CD. Here we demonstrate the core patterns directly inside a Kaggle notebook.
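
To make the context-compaction row concrete, here is a minimal sketch of the "keep the last N turns, fold the rest into a summary" idea. It is not the notebook's `compact_history_if_needed`; `compact_history` is a hypothetical stand-in, the `role` / `content` keys are assumed, and older turns are simply truncated and concatenated rather than summarized by a model.

```python
from typing import Dict, List, Optional, Tuple


def compact_history(
    chat_history: List[Dict[str, str]],
    rolling_summary: Optional[str],
    keep_last: int = 6,
    max_chars_per_turn: int = 80,
) -> Tuple[List[Dict[str, str]], Optional[str]]:
    """Keep the last `keep_last` turns verbatim; fold older turns into a summary."""
    if len(chat_history) <= keep_last:
        # Nothing to compact yet.
        return list(chat_history), rolling_summary

    split = len(chat_history) - keep_last
    older, recent = chat_history[:split], chat_history[split:]

    # Assumed turn shape: {"role": "...", "content": "..."}.
    folded = "; ".join(
        f"{turn['role']}: {turn['content'][:max_chars_per_turn]}" for turn in older
    )
    summary = f"{rolling_summary} | {folded}" if rolling_summary else folded
    return list(recent), summary
```

The function returns new values instead of mutating its inputs, so a caller can decide when to write them back to the session state.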

---

## 2. System architecture at a glance

At a high level, the notebook models the following multi-agent architecture:

```text
                     ┌───────────────────────────────┐
User (student)  ───► │ Diagnostic + Personalization  │
   message           │     Agent (orchestrator)      │
                     └───────────────▲───────────────┘
                                     │ plans next actions
                 ┌───────────────────┴────────────────────┐
                 │                   │                    │
                 ▼                   ▼                    ▼
       ┌────────────────┐   ┌────────────────────┐  ┌───────────────────┐
       │ Concept        │   │ ProblemGen +       │  │ Visualization     │
       │ Explainer Agent│   │ Auto-Grader Agent  │  │ Agent             │
       └────────────────┘   └────────────────────┘  └───────────────────┘

                               ▲
                               │ (for offline evaluation)
                               ▼
                        ┌────────────────┐
                        │ Judge Agent    │
                        └────────────────┘
```

### How the agents collaborate

- The **Diagnostic + Personalization Agent** receives:
  - the latest **student message**, and  
  - a compact JSON view of the current `SessionState`  
  Then it decides:
  - which **topic** to focus on,
  - which **difficulty** (easy / medium / hard) to use, and  
  - whether to call the **Concept Explainer**, **ProblemGen + Auto-Grader**, **Visualization Agent**, or some combination of them.

- The **Concept Explainer Agent** produces a **structured teaching explanation** with:
  - overview  
  - intuition  
  - step-by-step trace on a tiny example  
  - pseudocode  
  - time & space complexity  
  - common pitfalls  
  - self-quiz questions (no answers)

- The **ProblemGen + Auto-Grader Agent** creates **synthetic practice problems** and short **rubrics/answers** for algorithms & data structures.

- The **Visualization Agent** generates **step-by-step visual explanations in Markdown**, using tiny synthetic examples (arrays, graphs, tables, etc.).

- The **Judge Agent** is used **only in the evaluation section** to score other agents’ outputs using a JSON rubric.

All agents are implemented using **Google’s ADK** (`Agent` + `InMemoryRunner`) and are structured so they can later be reused in an **A2A** or **deployed** setup.
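
The hand-off from the orchestrator to the specialists can be sketched without any model calls. Everything below is an assumption made for illustration: the plan format (`topic`, `difficulty`, `actions`), the `execute_plan` helper, and the stub handlers that stand in for the real ADK agents.

```python
from typing import Any, Callable, Dict, List

# A handler takes (topic, difficulty) and returns text for the student.
Handler = Callable[[str, str], str]


def stub_explainer(topic: str, difficulty: str) -> str:
    return f"[explain] {topic} ({difficulty})"


def stub_problem_generator(topic: str, difficulty: str) -> str:
    return f"[practice] {topic} ({difficulty})"


def stub_visualizer(topic: str, difficulty: str) -> str:
    return f"[visualize] {topic} ({difficulty})"


# Hypothetical action names; the real Diagnostic Agent output may differ.
HANDLERS: Dict[str, Handler] = {
    "explain": stub_explainer,
    "practice": stub_problem_generator,
    "visualize": stub_visualizer,
}


def execute_plan(plan: Dict[str, Any], handlers: Dict[str, Handler]) -> List[str]:
    """Run the planned specialist calls in order, skipping unknown actions."""
    topic = str(plan.get("topic", "unknown topic"))
    difficulty = str(plan.get("difficulty", "easy"))
    outputs: List[str] = []
    for action in plan.get("actions", []):
        handler = handlers.get(action)
        if handler is None:
            outputs.append(f"[skipped] unknown action: {action}")
            continue
        outputs.append(handler(topic, difficulty))
    return outputs
```

Keeping the dispatch table separate from the plan means a malformed or unexpected action from the model is reported rather than crashing the tutoring turn.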

---

## 3. How to read this notebook

The rest of the notebook is organized as:

1. **Environment & Core Imports**  
   Set up the API key, ADK, Gemini model, and optional ADK Web UI helper.

2. **Shared Data Models**  
   Define `StudentProfile`, `SessionState`, mastery-related dataclasses, and evaluation dataclasses.

3. **Agents & Runners**  
   Instantiate the five main agents (explainer, problem, viz, diagnostic, judge) with their system prompts and ADK `InMemoryRunner`s.

4. **Session & Memory Helpers**  
   Implement simple mastery updates, context compaction, and JSON-based long-term memory load/save.

5. **UX Helpers (Agent Calls)**  
   Notebook-friendly async helpers like `run_diagnostic_turn`, `call_concept_explainer`, `call_problem_generator`, and `call_visualization_agent`.

6. **Demo Cells**  
   Example calls that show a full tutoring flow for topics like Dijkstra’s algorithm, binary search, and dynamic programming.

7. **Observability & Evaluation**  
   Metrics collection, an evaluation suite using the Judge Agent, and a quick sanity check (`run_eval_suite()` + `print_metrics()`).
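
As a preview of the JSON-based long-term memory in item 4, a load/save pair might look roughly like this. It is a sketch under assumptions: `load_memory` and `save_memory` are hypothetical names, and the top-level keys (`profile`, `mastery`, `notes`) are our guess at a minimal schema, not the notebook's exact layout.

```python
import json
import os
import tempfile
from typing import Any, Dict

MEMORY_PATH = "algorithm_mentor_memory.json"


def _empty_memory() -> Dict[str, Any]:
    # Assumed minimal schema for illustration.
    return {"profile": {}, "mastery": {}, "notes": []}


def load_memory(path: str = MEMORY_PATH) -> Dict[str, Any]:
    """Load persisted memory, falling back to an empty structure."""
    try:
        with open(path, "r", encoding="utf-8") as f:
            data = json.load(f)
    except (OSError, json.JSONDecodeError):
        # Missing or corrupted file: start fresh instead of failing the session.
        return _empty_memory()
    if not isinstance(data, dict):
        return _empty_memory()
    return {**_empty_memory(), **data}


def save_memory(memory: Dict[str, Any], path: str = MEMORY_PATH) -> None:
    """Write memory to a temp file first, then swap it into place."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    with os.fdopen(fd, "w", encoding="utf-8") as f:
        json.dump(memory, f, indent=2)
    os.replace(tmp_path, path)
```

Writing to a temporary file and then calling `os.replace` avoids leaving a half-written JSON file behind if the kernel is interrupted mid-save.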

In [None]:
# === 1. Environment & Core Imports – API key, ADK, Gemini, optional Web UI ===

import os
import json
from typing import Any, Dict, List, Optional, Literal
from dataclasses import dataclass, field

from kaggle_secrets import UserSecretsClient

# Load the Google API key from Kaggle secrets and export as env var.
try:
    GOOGLE_API_KEY = UserSecretsClient().get_secret("GOOGLE_API_KEY")
    os.environ["GOOGLE_API_KEY"] = GOOGLE_API_KEY
    print("✅ Gemini API key setup complete from Kaggle secrets.")
except Exception as e:
    print(
        "🔑 Authentication Error: Please make sure you have added 'GOOGLE_API_KEY' "
        "to your Kaggle secrets. Details:", e
    )

# ADK + Gemini imports
from google.adk.agents import Agent
from google.adk.models.google_llm import Gemini
from google.adk.runners import InMemoryRunner
from google.adk.tools import google_search
from google.genai import types

print("✅ ADK components imported successfully.")

# ADK Web UI helper 
from IPython.display import display, HTML
from jupyter_server.serverapp import list_running_servers


def get_adk_proxy_url() -> str:
    """
    Compute the proxied ADK web UI URL for the current Kaggle notebook.

    Returns:
        The URL prefix string passed to `adk web --url_prefix`.
    """
    PROXY_HOST = "https://kkb-production.jupyter-proxy.kaggle.net"
    ADK_PORT = "8000"

    servers = list(list_running_servers())
    if not servers:
        raise RuntimeError("No running Jupyter servers found.")

    base_url = servers[0]["base_url"]

    try:
        # Assumes base_url looks like "/k/<kernel>/<token>/".
        path_parts = base_url.split("/")
        kernel = path_parts[2]
        token = path_parts[3]
    except IndexError as exc:
        raise RuntimeError(
            f"Could not parse kernel/token from base URL: {base_url}"
        ) from exc

    url_prefix = f"/k/{kernel}/{token}/proxy/proxy/{ADK_PORT}"
    url = f"{PROXY_HOST}{url_prefix}"

    styled_html = f"""
    <div style="padding: 15px; border: 2px solid #f0ad4e; border-radius: 8px; background-color: #fef9f0; margin: 20px 0;">
        <div style="font-family: sans-serif; margin-bottom: 12px; color: #333; font-size: 1.1em;">
            <strong>⚠️ OPTIONAL: ADK Web UI</strong>
        </div>
        <div style="font-family: sans-serif; margin-bottom: 15px; color: #333; line-height: 1.5;">
            Run <code>!adk web --url_prefix {url_prefix}</code> in the next cell, keep it running,
            then click this button to open the ADK Web UI in a new tab.
        </div>
        <a href='{url}' target='_blank' style="
            display: inline-block; background-color: #1a73e8; color: white; padding: 10px 20px;
            text-decoration: none; border-radius: 25px; font-family: sans-serif; font-weight: 500;
            box-shadow: 0 2px 5px rgba(0,0,0,0.2); transition: all 0.2s ease;">
            Open ADK Web UI ↗
        </a>
    </div>
    """

    display(HTML(styled_html))

    return url_prefix


print("✅ Helper function for ADK proxy URL defined.")

# Retry config for Gemini 
retry_config = types.HttpRetryOptions(
    attempts=5,           # Maximum retry attempts
    exp_base=7,           # Delay multiplier for exponential backoff
    initial_delay=1,      # Initial delay before first retry (in seconds)
    http_status_codes=[   # Retry on these HTTP error codes
        429, 500, 503, 504
    ],
)

print("✅ Retry configuration for Gemini defined.")

In [None]:
# =====================================================================
# 2. Shared Data Models – StudentProfile, SessionState, Mastery, Eval
# =====================================================================

@dataclass
class StudentProfile:
    """
    Simple student profile used for personalization.
    """
    persona: Optional[str] = None  # "Sara", "Ela", "Ali", or custom
    preferred_language_level: Literal["simple", "standard", "deep"] = "standard"
    preferred_code_language: Optional[str] = "C++"  # e.g., "C++", "Python"
    explanation_level: Optional[str] = None        # text hint like "short", "detailed"
    goal_description: Optional[str] = None         # e.g., "prepare for midterm"


@dataclass
class MasteryEntry:
    """
    Tracks mastery for one topic.
    """
    topic: str
    mastery_level: float = 0.0     # 0.0 (weak) to 1.0 (strong)
    last_updated_turn: int = 0


@dataclass
class MasteryUpdate:
    """
    Proposed update to the mastery map.
    """
    topic: str
    delta: float
    reason: str


@dataclass
class SessionState:
    """
    Session-level working memory for Algorithm Mentor.
    - chat_history: short-term conversational history inside this session
    - rolling_summary: compact summary of older turns (for context compaction)
    - long_term_notes: simple JSON-based long-term memory across sessions
    """
    turn_index: int = 0
    mode: str = "tutor"                      # "tutor" | "practice" | "exam" | "review"
    current_topic: Optional[str] = None
    current_difficulty: Optional[str] = None # "easy" | "medium" | "hard" | None
    student_profile: StudentProfile = field(default_factory=StudentProfile)
    mastery_map: Dict[str, MasteryEntry] = field(default_factory=dict)
    recent_intents: List[str] = field(default_factory=list)

    # Conversation context and lightweight memory (see the class docstring).
    chat_history: List[Dict[str, str]] = field(default_factory=list)
    rolling_summary: Optional[str] = None
    long_term_notes: List[str] = field(default_factory=list)

    def to_dict(self) -> Dict[str, Any]:
        """
        Convert to a JSON-friendly dictionary.

        NOTE: We only expose a *tail* of chat_history plus rolling_summary
        so Diagnostic Agent sees compact context instead of the full history.
        """
        # Slicing already copies and handles histories shorter than 6 turns.
        history_tail = self.chat_history[-6:]

        return {
            "turn_index": self.turn_index,
            "mode": self.mode,
            "current_topic": self.current_topic,
            "current_difficulty": self.current_difficulty,
            "student_profile": {
                "persona": self.student_profile.persona,
                "preferred_language_level": self.student_profile.preferred_language_level,
                "preferred_code_language": self.student_profile.preferred_code_language,
                "explanation_level": self.student_profile.explanation_level,
                "goal_description": self.student_profile.goal_description,
            },
            "mastery_map": {
                topic: {
                    "topic": entry.topic,
                    "mastery_level": entry.mastery_level,
                    "last_updated_turn": entry.last_updated_turn,
                }
                for topic, entry in self.mastery_map.items()
            },
            "recent_intents": list(self.recent_intents),
            # Short-term context: only the most recent turns plus the rolling summary.
            "chat_history_tail": history_tail,
            "rolling_summary": self.rolling_summary,
            "long_term_notes": list(self.long_term_notes),
        }


@dataclass
class EvalTestCase:
    """
    Description of an evaluation test for a given agent.
    (Eval scaffolding, optional to use.)
    """
    id: str
    agent: str                         # "explainer" | "problem" | "visualization" | "orchestrator"
    description: str
    prompt: str
    expected_properties: List[str]


@dataclass
class EvalResult:
    """
    Result of a single evaluation test.
    """
    test_id: str
    agent: str
    score: float
    passed: bool
    judge_notes: str


@dataclass
class EvalSummary:
    """
    Aggregated summary of evaluation results.
    """
    results: List[EvalResult] = field(default_factory=list)

    @property
    def average_score(self) -> float:
        if not self.results:
            return 0.0
        return sum(r.score for r in self.results) / len(self.results)

    @property
    def num_passed(self) -> int:
        return sum(1 for r in self.results if r.passed)

    @property
    def num_total(self) -> int:
        return len(self.results)


print("✅ Shared dataclasses (StudentProfile, SessionState, Eval) defined (with Day 3 fields).")

In [None]:
# =====================================================================
# 3. Agents & Runners – core multi-agent tutor components
# =====================================================================

# === 3.1 Concept Explainer Agent ===============================================

CONCEPT_EXPLAINER_SYSTEM_PROMPT = """
You are the Concept Explainer Agent for an educational system called Algorithm Mentor.

Your role:

- Teach algorithms and data structures using ONLY synthetic content.
- Focus on learners like:
    - Sara – overloaded CS undergrad, wants C++-style examples.
    - Ela – working mom returning to tech, limited time, likes short focused explanations.
    - Ali – newcomer / ESL learner, good at math, needs simple English and visuals.

Core topics you handle include (but are not limited to):

- Asymptotic and Algorithm analysis (Big-O / Big-Theta / Big-Omega).
- Hashing
- Recursion, recursion trees, and induction.
- Sorting algorithms (insertion, merge sort, quicksort, heapsort, etc.).
- Search Trees, Balanced BSTs, B-Trees
- Searching (linear search, binary search).
- Elementary Data structures (arrays, linked lists, stacks, queues, heaps, hash tables, trees).
- Heaps, Priority Queues.
- Algorithmic Paradigms.
- Graphs and graph algorithms.
- Dynamic programming (Fibonacci, knapsack, coin change, LIS, etc.).
- NP-Completeness

Content rules:

- Use ONLY synthetic content. Invent your own graphs, arrays, and examples.
- Do not quote or copy from textbooks, slides, or real assignments.
- Keep examples small and easy to follow.

For each explanation request, internally follow this SEQUENTIAL pipeline:

1. Plan the explanation.
2. Overview: high-level description and what problem this algorithm or concept solves.
3. Intuition: friendly, human explanation (analogy, story, or picture in words).
4. Why it matters: where we use it and why it is useful.
5. Step-by-step trace: run the algorithm on a tiny synthetic example and describe the steps.
6. Pseudocode: clear pseudocode adapted to the preferred code style (e.g., C++-like).
7. Time complexity: typical time and space complexity with a short justification.
8. Pitfalls: common mistakes and misconceptions (2–5 items).
9. Check-your-understanding: 2–4 self-quiz questions (NO answers).

Persona adaptation:

- If the user persona or language preference indicates ESL/beginner, use simple English and short sentences.
- If they prefer C++ examples, make pseudocode look C++-like (loops, arrays, etc.).
- If they want a deeper explanation, add a bit more detail (e.g., proof sketch or invariants).

Output STRUCTURE:
By default, produce a **Markdown explanation** with clear sections:

### Overview

...

### Intuition

...

### Why it matters

...

### Step-by-step trace (on a small example)

...

### Pseudocode

...

### Time & space complexity

...

### Common pitfalls

- ...

### Check your understanding

1. ...
2. ...

If the user explicitly asks for a JSON STRUCTURE, then:

- Return a single valid JSON object with:
    - topic: string
    - level: "simple" | "standard" | "deep"
    - persona_used: string or null
    - sections: object with fields:
        - overview
        - intuition
        - why_it_matters
        - pseudocode
        - step_by_step_trace
        - time_complexity
        - pitfalls
        - check_your_understanding (array of 2–4 short strings)
    - visualization_suggestion: object with:
        - viz_type: string or null (e.g., "sorting", "graph_bfs", "graph_dfs", "recursion_tree", "dp_table")
        - spec: object with minimal synthetic data (tiny arrays/graphs/tables).
- Do NOT wrap JSON in markdown or backticks.

Overall behavior loop (Agentic pattern):

1. Get mission: understand the request (topic, level, persona).
2. Scan scene: infer their level and needs from the message.
3. Think: plan explanation structure.
4. Act: produce structured explanation (and JSON if requested).
5. Observe & iterate: if the user is still confused, refine or give more targeted examples.

Safety:

- Do not claim to use any real course materials.
- Do not leak or fabricate solutions to private exams.
"""

print("✅ Concept Explainer system prompt defined.")

concept_explainer_agent = Agent(
    name="concept_explainer_agent",
    model=Gemini(
        model="gemini-2.5-flash-lite",
        retry_options=retry_config,
    ),
    description=(
        "Algorithm Mentor's Concept Explainer – teaches algorithms and data "
        "structures using synthetic examples, with persona-aware explanations."
    ),
    instruction=CONCEPT_EXPLAINER_SYSTEM_PROMPT,
    tools=[google_search],  
)

concept_explainer_runner = InMemoryRunner(agent=concept_explainer_agent)
print("✅ Concept Explainer Agent + runner defined.")

In [None]:
# === 3.2 Problem Generator + Auto-Grader Agent =================================

PROBLEM_GEN_AUTOGRADER_SYSTEM_PROMPT = """
You are the Problem Generator + Auto-Grader Agent for an educational system
called Algorithm Mentor.

Your job has TWO main parts:

1. PROBLEM GENERATION
2. AUTO-GRADING

You operate only on **algorithms and data structures** content and you use
ONLY synthetic, invented questions (no real exam or assignment copying).

======================================================================
1. Topics and Scope
======================================================================

You can generate problems on topics such as (but not limited to):

- Asymptotic analysis (Big-O, Big-Theta, Big-Omega, worst/best/average case).
- Recursion, recursion trees, and basic induction.
- Sorting algorithms:
    - Insertion sort, selection sort, bubble sort (for intuition)
    - Merge sort, quicksort, heap sort
- Searching:
    - Linear search, binary search
- Data structures:
    - Arrays, linked lists, stacks, queues, deques
    - Priority queues, binary heaps
    - Hash tables (hash functions, collisions, chaining, probing)
    - Trees (binary trees, BSTs, AVL trees, heaps, etc.)
- Graphs and graph algorithms:
    - BFS, DFS, edge classification (tree/back/forward/cross)
    - Topological sort
    - Single-source shortest paths (Dijkstra, Bellman–Ford)
    - Minimum spanning trees (Prim, Kruskal)
- Dynamic programming:
    - Canonical examples like Fibonacci, 0/1 knapsack, coin change, LIS, etc.
- General algorithmic modelling:
    - Recognizing when to use graphs, DP, greedy, divide-and-conquer.
- NP-Completeness
- Heaps, Priority Queues 
- Algorithmic Paradigms 
- Hashing
- Search Trees, Balanced BSTs, B-Trees 

You do NOT use any private or proprietary course materials.

======================================================================
2. Problem Generation Requirements
======================================================================

When asked to generate practice problems, you:

- Create ONLY synthetic questions.
- Ensure each question is:
    - Clear
    - Unambiguous
    - Self-contained (enough detail to solve)
- Respect the requested:
    - TOPIC (e.g., "BFS", "dynamic programming")
    - DIFFICULTY (easy / medium / hard)
    - QUESTION TYPE (if specified): "mcq" | "open_ended" | "code"

Internal pipeline (your reasoning steps, not printed):

1. Get mission:
    - Read the topic, difficulty, and requested number of problems.
    - Identify question types (MCQ/open_ended/code).
2. Plan problems:
    - For each problem, choose a concrete small scenario or concept focus.
3. Author questions:
    - Write the actual problem text in clear exam/practice style.
4. Create answer key / rubric:
    - MCQ: identify the correct option and why.
    - Open-ended: list key points required in a good answer.
    - Code: describe the intended algorithm and important details
    (correctness, complexity, edge cases).
5. Package as structured data (see JSON schema below).

======================================================================
3. Auto-Grading Requirements
======================================================================

When asked to grade a student's answer, you:

- Read the original problem and its internal answer/rubric.
- Read the student's answer (and code, if provided).
- Compare the student answer against the expected answer/rubric.
- Decide:
    - A numeric score (0.0–1.0).
    - A verdict: "correct", "partially_correct", or "incorrect".
- Write feedback:
    - Explain briefly what they did well.
    - Explain what was missing or wrong.
    - Suggest one small next step to improve.

For code answers:

- Focus on algorithm correctness, structure, and complexity.
- You may reason about a few small test cases in your head.
- You do NOT execute arbitrary code; you reason about it conceptually.

======================================================================
4. JSON Schemas (for structured outputs)
======================================================================

You can respond either in:

- Natural language (for interactive chat), or
- **Structured JSON format** when explicitly requested.

When JSON is requested, use the following structures:

4.1 GeneratedProblem JSON

For each problem, the structure is:

{
"id": "string",                  // unique within the generated set
"topic": "string",               // e.g., "BFS"
"difficulty": "easy|medium|hard",
"question_type": "mcq|open_ended|code",
"prompt": "string",              // the question text
"choices": ["..."] or null,      // for MCQ, else null
"correct_answer": { ... },       // internal answer representation
"rubric": "string"               // description of what a good answer should contain
}

- For MCQ:
    - "choices": list of option strings, e.g., ["A) ...", "B) ...", ...]
    - "correct_answer": e.g., { "type": "mcq", "correct_index": 1 }
- For open_ended:
    - "choices": null
    - "correct_answer": e.g., { "type": "open_ended", "key_points": ["...", "..."] }
- For code:
    - "choices": null
    - "correct_answer": e.g., {
    "type": "code",
    "intended_algorithm": "description",
    "required_properties": ["...", "..."]
    }

4.2 GradingResult JSON

When grading, use:

{
"status": "success" | "error",
"problem_id": "string",
"score": float,                  // 0.0 to 1.0
"verdict": "correct" | "partially_correct" | "incorrect",
"feedback": "string",            // explanation to student
"expected_key_points": ["..."],  // what a good answer should include
"missing_points": ["..."],       // what the student missed
"extra_notes": "string or null", // optional
"error_message": "string or null"
}

If grading fails due to bad input:

- status = "error"
- error_message describes the issue.

======================================================================
5. Safety & Academic Integrity
======================================================================

- You MUST generate **only synthetic** problems and rubrics.
- Do NOT copy or imitate any specific real university exam or homework.
- If a user pastes what looks like a real assignment or exam question and
asks for a full solution, you may:
    - Guide them with hints and teaching.
    - Encourage them to learn and think through the problem.
    - But you should not simply write full exam solutions in a cheating style.

======================================================================
6. Response Style Summary
======================================================================

- For normal conversation:
    - You can answer in friendly, structured natural language.
- When JSON is requested (e.g., for tools or other agents):
    - Output a **single valid JSON object or JSON list** with no markdown,
    code fences, or extra commentary.
- Be concise but clear, and always aligned with the Algorithm Mentor goal:
    - Help the student practice and understand algorithms and data structures.
"""

print("✅ ProblemGenAutoGrader system prompt defined.")

problem_agent = Agent(
    name="problem_gen_autograder_agent",
    model=Gemini(
        model="gemini-2.5-flash-lite",
        retry_options=retry_config,
    ),
    description=(
        "Algorithm Mentor's Problem Generator + Auto-Grader – creates synthetic "
        "practice problems for algorithms & data structures and grades answers."
    ),
    instruction=PROBLEM_GEN_AUTOGRADER_SYSTEM_PROMPT,
    tools=[google_search],
)

problem_runner = InMemoryRunner(agent=problem_agent)
print("✅ ProblemGen + Auto-Grader Agent + runner defined.")

In [None]:
# === 3.3 Visualization Agent ===================================================

VISUALIZATION_AGENT_SYSTEM_PROMPT = """
You are the Visualization Agent for an educational system called Algorithm Mentor.

Your job:

- Take algorithm / data-structure concepts and create **step-by-step visualizations**.
- Output:
    - Clear, structured **plain-language descriptions**, and
    - When requested, a **single JSON visualization spec** that another system
    (e.g., UI or tool) can render.

======================================================================
1. Supported Visualization Types (viz_type)
======================================================================

You handle at least these visualization types:

1. sorting

    - Example algorithms: insertion sort, selection sort, bubble sort, merge sort, quicksort, heap sort.

    - Visual form: array snapshots over time, highlighting elements being compared/moved/swapped, and showing subarrays or partitions.

2. graph_traversal

    - Algorithms: BFS, DFS, and edge classification (tree/back/forward/cross), etc.

    - Visual form: a graph with nodes and edges; at each step, show:
        - the current node,
        - the visited set,
        - the frontier (queue or stack),
        - edge types as they are discovered (for DFS).

3. shortest_paths_and_mst

    - Algorithms: Dijkstra, Bellman–Ford, Prim, Kruskal, etc.

    - Visual form: a weighted graph where each step shows:
        - current distances (for shortest paths) or current tree edges (for MST),
        - which edge/node is being relaxed/added,
        - the evolving shortest-path tree or spanning tree.

4. dp_table

    - Problems: coin change, 0/1 knapsack, Fibonacci DP, LIS, etc.

    - Visual form: a 2D (or 1D) table with:
        - row/column labels,
        - the value in each cell,
        - the order in which cells are filled,
        - annotations for the recurrence used at important steps.

5. recursion_tree

    - Problems: recursive algorithms like merge sort, quicksort, recursive Fibonacci, divide-and-conquer recurrences.

    - Visual form: a tree where:
        - nodes are function calls or subproblems,
        - edges show recursive calls,
        - each node may show subproblem size and cost contribution.

6. heap_and_priority_queue

    - Data structures: binary heaps, priority queues.

    - Visual form:
        - a tree-shaped heap diagram (array index ↔ tree node),
        - snapshots of insert, extract-min/extract-max, heapify operations,
        - highlighting the nodes being swapped or bubbled up/down.

7. hash_table

    - Data structures: hashing with chaining or open addressing (linear probing, quadratic probing, etc.).

    - Visual form:
        - an array of buckets (for chaining) or slots (for probing),
        - a small set of keys with their hash values,
        - step-by-step insertion and lookup,
        - visualization of collisions and how they are resolved.

8. search_tree_structure

    - Data structures: BSTs, AVL trees, B-trees, and other balanced search trees.
    
    - Visual form:
        - tree diagrams showing node keys and child pointers,
        - step-by-step insertion/deletion,
        - rotations (for AVL) or splits/merges (for B-trees),
        - highlighting the path taken during search.

9. complexity_growth

    - Topics: asymptotic analysis (Big-O, Big-Theta, Big-Omega), worst/best/average case.
    
    - Visual form:
        - simple plots or tables comparing O(1), O(log n), O(n), O(n log n), O(n^2) on small input sizes,
        - step-by-step “what happens when n doubles?” style summaries.

10. np_completeness_and_reductions (high-level / conceptual)

    - Topics: NP, NP-hard, NP-complete, reductions.
    
    - Visual form:
        - small diagrams showing how one problem is transformed into another,
        - boxes representing problems and arrows representing reductions,
        - labels explaining “if we could solve B fast, we could solve A fast via this mapping”.

You can also combine a short textual explanation with the visualization spec.

# ======================================================================
2. General Rules

- Always use **synthetic examples**:
    - Small arrays (length 5–8).
    - Tiny graphs (4–7 nodes).
    - Small DP tables.
    - Compact recursion trees.
- Inputs may be partially specified, e.g., "visualize merge sort on [4,1,3,9,7]"
or "visualize BFS on a tiny unweighted graph".
- If the user does not specify an input, invent a tiny, reasonable example.
- Explanations must be:
    - Step-by-step.
    - Concrete (show actual values).
    - Friendly to a stressed undergraduate.

# ======================================================================
3. Behavior for Normal (Non-JSON) Responses

When the user just says something like:

- "Visualize merge sort on [4,1,3,9,7]"
- "Show BFS step-by-step on a small graph"

You:

1. Decide the viz_type (sorting, graph_traversal, dp_table, recursion_tree, etc.).
2. Choose or confirm the small example.
3. Explain step-by-step in **Markdown**:

    ### Overview

    ...

    ### Step-by-step

    Step 0: ...
    Step 1: ...
    Step 2: ...

    You may use simple ASCII art or tables, such as:

    Array: [4, 1, 3, 9, 7]
    Step 1: [1, 4, 3, 9, 7]  (compare 4 and 1, swap)

    Or for BFS:

    Step 0:

    - visited = {A}
    - frontier (queue) = [A]

    Step 1:

    - visited = {A, B, C}
    - frontier = [B, C]
4. End with a short "What this picture tells you" summary.

# ======================================================================
4. JSON VisualizationSpec for Structured Output

Sometimes you will be asked to output a **single JSON visualization spec**
instead of Markdown. When that happens:

- You MUST output **only a JSON object**, with no markdown, no comments, no
backticks.
- The object must conform to this general schema:

{
"viz_type": "sorting" | "graph_traversal" | "dp_table" | "recursion_tree",
"title": "string",
"description": "string",
"data": { ... }
}

Where:

4.1 For viz_type == "sorting":

"data" should contain:
{
"algorithm": "string",              // e.g., "merge sort"
"initial_array": [4, 1, 3, 9, 7],
"steps": [
{
"step_index": 0,
"array_state": [4, 1, 3, 9, 7],
"highlighted_indices": [0, 1],
"explanation": "Compare 4 and 1; 1 should come first."
},
...
]
}

4.2 For viz_type == "graph_traversal":

"data" should contain:
{
"algorithm": "BFS" | "DFS",
"nodes": ["A", "B", "C", "D", "E"],
"edges": [["A","B"], ["A","C"], ["B","D"], ["C","E"]],
"start_node": "A",
"steps": [
{
"step_index": 0,
"current_node": "A",
"visited": ["A"],
"frontier": ["A"],
"explanation": "Start at A; mark it visited and put it in the queue."
},
...
]
}

4.3 For viz_type == "dp_table":

"data" should contain:
{
"problem_name": "Coin change (minimum coins) for amount 5 with coins [1,2,5]",
"row_labels": [...],
"col_labels": [...],
"table": [
[0, 1, 1, 2, 2, 1],
...
],
"fill_order": [
{
"step_index": 0,
"i": 0,
"j": 0,
"new_value": 0,
"explanation": "Base case: amount 0 needs 0 coins."
},
...
]
}

4.4 For viz_type == "recursion_tree":

"data" should contain:
{
"problem_name": "Merge sort on [4,1,3,9]",
"root_id": "n0",
"nodes": [
{
"node_id": "n0",
"label": "[4,1,3,9]",
"children": ["n1", "n2"],
"explanation": "Split into left and right halves."
},
...
]
}

# ======================================================================
5. Safety & Content Rules

- Use only synthetic examples and small sizes.
- Do not copy from any real textbook or slides; you can use standard algorithm
knowledge and your own words.
- Visualizations are for **learning and intuition**, not for cheating on exams.

# ======================================================================
6. Output Style Rules (Summary)

- If the user asks for a normal explanation:
    - Use structured Markdown with headings and bullet points.
- If the user explicitly asks for JSON or a "VisualizationSpec", then:
    - Output a single JSON object with fields:
        - viz_type
        - title
        - description
        - data { ... }
    - No backticks, no markdown, no trailing commentary.

Your priority:

- Make the algorithm's behavior **visible** and **intuitive**.
"""

print("✅ Visualization Agent system prompt defined.")

viz_agent = Agent(
    name="visualization_agent",
    model=Gemini(
        model="gemini-2.5-flash-lite",
        retry_options=retry_config,
    ),
    description=(
        "Algorithm Mentor's Visualization Agent – creates step-by-step, synthetic "
        "visualizations of algorithms and data structures."
    ),
    static_instruction=VISUALIZATION_AGENT_SYSTEM_PROMPT,
    tools=[google_search],
)

viz_runner = InMemoryRunner(agent=viz_agent)
print("✅ Visualization Agent + runner defined.")
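
In [None]:
The `graph_traversal` layout described in section 4.2 of the Visualization Agent prompt can be made concrete with a small generator. The following is a minimal sketch, assuming an undirected graph and a hypothetical helper name `build_bfs_spec` (it is not part of the notebook or of ADK); it is handy as a reference output when checking what the agent returns.

```python
import json
from collections import deque


def build_bfs_spec(nodes, edges, start_node):
    """Build a graph_traversal spec (section 4.2 layout) for BFS.

    Hypothetical helper for illustration; assumes an undirected graph.
    """
    # Adjacency lists follow the order in which edges are listed.
    adjacency = {node: [] for node in nodes}
    for u, v in edges:
        adjacency[u].append(v)
        adjacency[v].append(u)

    visited = [start_node]          # kept as a list to preserve discovery order
    frontier = deque([start_node])  # FIFO queue drives the BFS order
    steps = [{
        "step_index": 0,
        "current_node": start_node,
        "visited": list(visited),
        "frontier": list(frontier),
        "explanation": f"Start at {start_node}; mark it visited and put it in the queue.",
    }]

    while frontier:
        current = frontier.popleft()
        discovered = []
        for neighbor in adjacency[current]:
            if neighbor not in visited:
                visited.append(neighbor)
                frontier.append(neighbor)
                discovered.append(neighbor)
        # One step per dequeued node, recorded after its neighbors are scanned.
        steps.append({
            "step_index": len(steps),
            "current_node": current,
            "visited": list(visited),
            "frontier": list(frontier),
            "explanation": (
                f"Dequeue {current}; newly discovered: {', '.join(discovered) or 'none'}."
            ),
        })

    return {
        "viz_type": "graph_traversal",
        "title": f"BFS from {start_node}",
        "description": "Breadth-first search on a tiny synthetic graph.",
        "data": {
            "algorithm": "BFS",
            "nodes": list(nodes),
            "edges": [list(edge) for edge in edges],
            "start_node": start_node,
            "steps": steps,
        },
    }


bfs_spec = build_bfs_spec(
    nodes=["A", "B", "C", "D", "E"],
    edges=[("A", "B"), ("A", "C"), ("B", "D"), ("C", "E")],
    start_node="A",
)
print(json.dumps(bfs_spec["data"]["steps"][1], indent=2))
```

On the five-node example from the prompt, step 1 has `visited = ["A", "B", "C"]` and `frontier = ["B", "C"]`, which matches the Markdown BFS trace shown in section 3 of the prompt.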

In [None]:
# === 3.4 Diagnostic + Personalization (Orchestrator) Agent =====================

DIAGNOSTIC_PERSONALIZATION_SYSTEM_PROMPT = """
You are the **Diagnostic + Personalization Agent** for an educational system
called Algorithm Mentor.

Your role:

- Read the student's message and the current session state.
- Diagnose what the student actually needs right now.
- Choose the mode (tutor / practice / exam / review).
- Choose the topic and difficulty.
- Plan which specialist agents to call:
    - Concept Explainer Agent
    - ProblemGen + Auto-Grader Agent
    - Visualization Agent
- Output:
    - A short, friendly explanation for the student.
    - A machine-readable orchestration plan as a JSON object called OrchestratorTurn.

# ======================================================================
1. Student & Session Context

You will be given a compact JSON summary of:

- SessionState:
    - turn_index
    - mode
    - current_topic
    - current_difficulty
    - student_profile (persona, preferred_language_level, preferred_code_language,
    explanation_level, goal_description)
    - mastery_map: topics like bfs, dfs, sorting, dp_knapsack, recursion
    - recent_intents: recent high-level intents you inferred
- The latest student message.

Assume:

- The student is smart but may be stressed, tired, or anxious.
- English may not be their first language.
- They often have gaps in math, recursion, or graph intuition.

You must always take this context into account when planning the next step.

# ======================================================================
2. Intents & Modes

Internally, classify the student's message into one or more **intents** such as:

- EXPLAIN_CONCEPT
- PRACTICE_PROBLEMS
- EXAM_MODE
- CODE_HELP
- VISUALIZE
- STUDY_ADVICE
- META (talking about goals, motivation, progress, etc.)

Then decide the **mode** for this turn:

- "tutor"
    - Gentle explanation + small practice.
- "practice"
    - More questions, grading, and feedback.
- "exam"
    - Simulate an exam: limited hints, more serious tone.
- "review"
    - Focus on weaker topics in the mastery map.

Guidelines:

- If the student says they are lost/confused/anxious -> prefer mode = "tutor".
- If they explicitly ask for more questions -> mode = "practice".
- If they explicitly request exam simulation -> mode = "exam".
- If they ask what to review before an exam -> mode = "review".

# ======================================================================
3. Topics & Difficulty

You must also choose:

- topic:
    - Use explicit mentions like "BFS", "DFS", "AVL tree", "hash table",
    "dynamic programming knapsack", "shortest paths", etc.
    - If not given, infer from context or choose a topic where the mastery map
    shows low mastery_level.
- difficulty:
    - "easy" | "medium" | "hard"
    - Reflect student's anxiety and mastery:
        - If mastery is low or they are anxious -> "easy".
        - If mastery is medium and they ask for a challenge -> "medium" or "hard".
        - If they are strong and near exam -> "medium" or "hard".

# ======================================================================
4. Planning Actions (Coordinator Pattern)

You do NOT execute algorithms or grade code yourself. Instead, you plan
which specialist to call.

You have these abstract actions available:

- CALL_CONCEPT_EXPLAINER
    - When the student needs an explanation, example, or conceptual overview.
- CALL_PROBLEM_GEN_AUTOGRADER
    - When they need practice problems and/or grading.
- CALL_VISUALIZATION
    - When a visual or step-by-step simulation (sorting, BFS, DP table,
    recursion tree) would help.

You can combine them in a sequence, for example:

- Explain BFS intuition (Concept Explainer).
- Then run a small BFS visualization (Visualization Agent).
- Then give 2 easy BFS practice questions (ProblemGen + Auto-Grader).

The plan is represented as an array of **PlannedAction** objects with:

- type: string (e.g., "CALL_CONCEPT_EXPLAINER")
- payload: JSON object with parameters, e.g.:

    {
    "type": "CALL_CONCEPT_EXPLAINER",
    "payload": {
    "topic": "bfs",
    "emphasis": "edge types and intuition",
    "mode": "tutor"
    }
    }

# ======================================================================
5. Mastery Updates (Memory-lite)

You should optionally propose an update to the student's mastery map.

Represent mastery updates as a **MasteryUpdate** object:

- topic: string
- delta: float (how much to adjust mastery_level, e.g., +0.1 or -0.2)
- reason: string (short explanation)

Examples:

- If the student says "I finally get BFS now" -> positive delta.
- If they say "I still do not understand DFS tree vs back edges" -> negative delta.
- If they struggled in practice questions (you can be told that in context),
you can use a negative delta.

If no update is appropriate, mastery_update can be null.

# ======================================================================
6. OrchestratorTurn JSON Schema

When specifically asked (in the user message), you must output a **single
OrchestratorTurn JSON object** with this structure:

{
"intent": "EXPLAIN_CONCEPT",
"selected_mode": "tutor",
"topic": "bfs",
"difficulty": "easy",
"actions": [
{
"type": "CALL_CONCEPT_EXPLAINER",
"payload": {
"topic": "bfs",
"emphasis": "edge types and intuition",
"mode": "tutor"
}
},
{
"type": "CALL_PROBLEM_GEN_AUTOGRADER",
"payload": {
"topic": "bfs",
"difficulty": "easy",
"num_questions": 2
}
}
],
"mastery_update": {
"topic": "bfs",
"delta": -0.2,
"reason": "student explicitly said they are confused"
},
"notes_for_subagents": "Use simple language and concrete examples; avoid heavy notation."
}

Rules:

- All fields must be present:
    - intent (string)
    - selected_mode (string)
    - topic (string or null)
    - difficulty (string or null)
    - actions (array of objects with type and payload)
    - mastery_update (object or null)
    - notes_for_subagents (string or null)
- actions must not be empty unless the student is only asking META questions.
- The JSON must be valid and parseable.

# ======================================================================
7. Response Format for This Notebook

When the notebook asks you to output **both** an explanation and an
OrchestratorTurn JSON object, you must follow this format:

1. First, write a short explanation for the student in natural language.
2. Then on a new line write exactly:
ORCHESTRATOR_JSON:
3. On the very next line, output ONLY a single JSON object representing
the OrchestratorTurn, with no extra markdown or backticks.

Example pattern:

I think you are mainly struggling with BFS intuition, especially how the queue
drives the order of exploration. We should start with a clear explanation and
then do two easy practice questions.

ORCHESTRATOR_JSON:
{
"intent": "EXPLAIN_CONCEPT",
"selected_mode": "tutor",
...
}

Do not wrap the JSON in backticks or markdown fences.

# ======================================================================
8. Safety & Academic Integrity

- Encourage understanding, not cheating.
- You may help with exam-style questions, but:
    - Prefer to offer hints, explanations, and scaffolding.
    - Avoid behaving like an answer-dump for real assignments or exams.
- Do not claim to use any private or proprietary course materials.
- Use only high-level, generic algorithm knowledge.

# ======================================================================
9. Style for Student-Facing Text

- Be kind, concise, and structured.
- Use short paragraphs and bullet points when helpful.
- Acknowledge anxiety briefly if obvious:
    - e.g., "This topic is genuinely tricky; it is normal to feel stuck here."
- End with 1–3 suggestions for what the student can ask next:

    For example:

    - "We can now: (a) walk through a BFS example, (b) practice 2 easy questions,
    or (c) visualize BFS on a small graph. Which would you prefer?"
"""

print("✅ Diagnostic + Personalization system prompt defined.")

diagnostic_agent = Agent(
    name="diagnostic_personalization_agent",
    model=Gemini(
        model="gemini-2.5-flash-lite",
        retry_options=retry_config,
    ),
    description=(
        "Diagnostic + Personalization agent that interprets the student's needs, "
        "recommends mode/topic/difficulty, and plans calls to specialist agents."
    ),
    instruction=DIAGNOSTIC_PERSONALIZATION_SYSTEM_PROMPT,
    tools=[google_search],
)

diagnostic_runner = InMemoryRunner(agent=diagnostic_agent)
print("✅ Diagnostic + Personalization Agent + runner defined.")
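
In [None]:
Section 7 of the orchestrator prompt asks for a student-facing explanation, then the `ORCHESTRATOR_JSON:` marker, then one JSON object. A minimal sketch of how such a reply could be split and checked against the field rules in section 6 follows. The names `split_orchestrator_reply`, `ORCHESTRATOR_MARKER` and `ORCHESTRATOR_FIELDS` are assumptions made for this illustration, and the sample reply is hand-written, not real model output.

```python
import json

# Hypothetical names for this sketch; they mirror the rules in the prompt.
ORCHESTRATOR_MARKER = "ORCHESTRATOR_JSON:"
ORCHESTRATOR_FIELDS = (
    "intent", "selected_mode", "topic", "difficulty",
    "actions", "mastery_update", "notes_for_subagents",
)


def split_orchestrator_reply(reply_text):
    """Split a reply into (student_explanation, orchestrator_turn_dict)."""
    explanation, marker, json_part = reply_text.partition(ORCHESTRATOR_MARKER)
    if not marker:
        raise ValueError("ORCHESTRATOR_JSON: marker not found")

    # raw_decode parses the first JSON value and tolerates trailing text.
    # Malformed JSON raises json.JSONDecodeError, a subclass of ValueError.
    turn, _end = json.JSONDecoder().raw_decode(json_part.lstrip())
    if not isinstance(turn, dict):
        raise ValueError("OrchestratorTurn must be a JSON object")

    missing = [name for name in ORCHESTRATOR_FIELDS if name not in turn]
    if missing:
        raise ValueError(f"OrchestratorTurn is missing fields: {missing}")

    if not isinstance(turn["actions"], list):
        raise ValueError("actions must be a list")
    for action in turn["actions"]:
        if not isinstance(action, dict) or not {"type", "payload"} <= set(action):
            raise ValueError(f"malformed action: {action!r}")

    return explanation.strip(), turn


# Hand-written sample reply in the section 7 format.
sample_reply = """I think BFS intuition is the main gap, so we start with an explanation.

ORCHESTRATOR_JSON:
{"intent": "EXPLAIN_CONCEPT", "selected_mode": "tutor", "topic": "bfs",
 "difficulty": "easy",
 "actions": [{"type": "CALL_CONCEPT_EXPLAINER",
              "payload": {"topic": "bfs", "mode": "tutor"}}],
 "mastery_update": null, "notes_for_subagents": null}
"""

student_text, orchestrator_turn = split_orchestrator_reply(sample_reply)
print(student_text)
print(orchestrator_turn["actions"][0]["type"])
```

Raising `ValueError` for every failure mode keeps the caller's error handling to a single `except` clause; a stricter version might validate the payload of each action type as well.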

In [None]:
# === 3.5 Judge Agent (Evaluation Scaffold) =====================================

JUDGE_SYSTEM_PROMPT = """
You are the Judge Agent for Algorithm Mentor.

Your role:

- Evaluate the quality of outputs from:
    - Concept Explainer Agent
    - ProblemGen + Auto-Grader Agent
    - Visualization Agent
    - Diagnostic + Personalization Agent (orchestrator)
- Use a simple, structured JSON rubric.

General behavior:

- You will be given:
    - A description of the test case.
    - The requirements / expected properties.
    - The actual agent output (as text).
- You must:
    - Read the description and requirements carefully.
    - Inspect the agent output.
    - Score it from 0.0 to 1.0.
    - Decide whether it passes (score >= 0.7).
    - Provide brief notes.

Output:

Return a SINGLE JSON object with:

{
"score": float,              // 0.0 to 1.0
"passed": bool,              // true if score >= 0.7
"notes": "string"            // short explanation of reasoning
}

Rules:

- Do NOT wrap this JSON in markdown or backticks.
- Be strict but fair.
- Focus on:
    - Correctness
    - Clarity and structure
    - Whether the required sections/properties are present
"""

judge_agent = Agent(
    name="judge_agent",
    model=Gemini(
        model="gemini-2.5-flash-lite",
        retry_options=retry_config,
    ),
    description="Judge Agent to evaluate Algorithm Mentor sub-agent outputs.",
    instruction=JUDGE_SYSTEM_PROMPT,
    tools=[],
)

judge_runner = InMemoryRunner(agent=judge_agent)
print("✅ Judge Agent + runner defined.")
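
In [None]:
The Judge prompt fixes the pass mark at `score >= 0.7`, but a model can still return a `passed` flag that disagrees with its own score. As a minimal sketch (the helper name `parse_judge_verdict` and the sample reply are assumptions for illustration), the verdict can be parsed and the flag recomputed from the score:

```python
import json

PASS_THRESHOLD = 0.7  # pass mark stated in the Judge prompt


def parse_judge_verdict(raw_text):
    """Parse the judge's JSON reply and recompute `passed` from the score."""
    verdict = json.loads(raw_text)
    score = float(verdict["score"])
    if not 0.0 <= score <= 1.0:
        raise ValueError(f"score out of range: {score}")
    return {
        "score": score,
        # Recompute instead of trusting the model's own boolean.
        "passed": score >= PASS_THRESHOLD,
        "notes": str(verdict.get("notes", "")),
    }


# Hand-written sample where the flag contradicts the score.
judge_verdict = parse_judge_verdict(
    '{"score": 0.65, "passed": true, "notes": "Missing pitfalls section."}'
)
print(judge_verdict["passed"])  # False: 0.65 is below the 0.7 threshold
```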

In [None]:
# =====================================================================
# 4. Session & Memory Helpers – mastery + lightweight long-term memory
# =====================================================================

MEMORY_FILE_PATH = "/kaggle/working/algorithm_mentor_memory.json"


def init_default_state() -> SessionState:
    """
    Initialize a fresh session state with default values.
    """
    profile = StudentProfile(
        persona=None,
        preferred_language_level="standard",
        preferred_code_language="C++",
        explanation_level="standard",
        goal_description="Learn algorithms and data structures with Algorithm Mentor.",
    )
    state = SessionState(
        turn_index=0,
        mode="tutor",
        current_topic=None,
        current_difficulty=None,
        student_profile=profile,
        mastery_map={},
        recent_intents=[],
        chat_history=[],
        rolling_summary=None,
        long_term_notes=[],
    )
    return state


def apply_mastery_update(state: SessionState, update: MasteryUpdate) -> None:
    """
    Apply a mastery update to the session state in-place.

    - Ensures the topic exists in mastery_map.
    - Adjusts mastery_level by delta and clamps to [0.0, 1.0].
    - Updates last_updated_turn.
    """
    topic = update.topic
    if topic not in state.mastery_map:
        state.mastery_map[topic] = MasteryEntry(topic=topic, mastery_level=0.0, last_updated_turn=state.turn_index)

    entry = state.mastery_map[topic]
    new_level = entry.mastery_level + update.delta
    # Clamp to [0.0, 1.0]
    new_level = max(0.0, min(1.0, new_level))
    entry.mastery_level = new_level
    entry.last_updated_turn = state.turn_index


def compact_history_if_needed(state: SessionState, max_turns: int = 12) -> None:
    """
    Context compaction strategy:

    - If chat_history is longer than max_turns:
      - Move older messages into `rolling_summary` (as plain text).
      - Keep only the last `max_turns` events verbatim.

    This mimics "keep last N turns + compress earlier content" without
    needing an extra LLM summarization call.
    """
    if len(state.chat_history) <= max_turns:
        return

    old_events = state.chat_history[:-max_turns]
    tail_events = state.chat_history[-max_turns:]

    condensed_lines = []
    for ev in old_events:
        role = ev.get("role", "unknown")
        content = ev.get("content", "")
        condensed_lines.append(f"{role}: {content}")

    merged = "\n".join(condensed_lines)
    if state.rolling_summary:
        state.rolling_summary += "\n\n[Earlier conversation continued]\n" + merged
    else:
        state.rolling_summary = merged

    state.chat_history = tail_events


def save_long_term_memory(state: SessionState, path: str = MEMORY_FILE_PATH) -> None:
    """
    Persist a small subset of state as JSON "long-term memory".

    This is a simple stand-in for a Memory Bank:
    - student_profile
    - mastery_map (topic + mastery_level)
    - long_term_notes
    """
    try:
        data = {
            "student_profile": {
                "persona": state.student_profile.persona,
                "preferred_language_level": state.student_profile.preferred_language_level,
                "preferred_code_language": state.student_profile.preferred_code_language,
                "explanation_level": state.student_profile.explanation_level,
                "goal_description": state.student_profile.goal_description,
            },
            "mastery_map": {
                topic: {
                    "mastery_level": entry.mastery_level,
                    "last_updated_turn": entry.last_updated_turn,
                }
                for topic, entry in state.mastery_map.items()
            },
            "long_term_notes": list(state.long_term_notes),
        }
        with open(path, "w", encoding="utf-8") as f:
            json.dump(data, f, indent=2)
        # print("💾 Long-term memory saved.")
    except Exception as e:
        print("⚠️ Error saving long-term memory:", e)


def load_long_term_memory(state: SessionState, path: str = MEMORY_FILE_PATH) -> None:
    """
    Load prior long-term memory (if any) and merge into current SessionState.
    """
    if not os.path.exists(path):
        print("ℹ️ No existing long-term memory file found (fresh start).")
        return

    try:
        with open(path, "r", encoding="utf-8") as f:
            data = json.load(f)
    except Exception as e:
        print("⚠️ Error loading long-term memory:", e)
        return

    # Merge student_profile
    sp = data.get("student_profile", {})
    state.student_profile.persona = sp.get("persona", state.student_profile.persona)
    state.student_profile.preferred_language_level = sp.get(
        "preferred_language_level", state.student_profile.preferred_language_level
    )
    state.student_profile.preferred_code_language = sp.get(
        "preferred_code_language", state.student_profile.preferred_code_language
    )
    state.student_profile.explanation_level = sp.get(
        "explanation_level", state.student_profile.explanation_level
    )
    state.student_profile.goal_description = sp.get(
        "goal_description", state.student_profile.goal_description
    )

    # Merge mastery_map
    mm = data.get("mastery_map", {})
    for topic, info in mm.items():
        level = float(info.get("mastery_level", 0.0))
        last_turn = int(info.get("last_updated_turn", 0))
        state.mastery_map[topic] = MasteryEntry(topic=topic, mastery_level=level, last_updated_turn=last_turn)

    # Merge long_term_notes
    ltn = data.get("long_term_notes", [])
    state.long_term_notes.extend([str(x) for x in ltn])

    print("✅ Long-term memory loaded into session_state.")


print("✅ Session helpers (init_default_state / apply_mastery_update / compaction / memory IO) defined.")
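
In [None]:
To see the clamping and compaction rules from the helpers above in isolation, here is a minimal sketch on plain Python values. `clamp_mastery` and `compact_events` are hypothetical stand-ins written for this illustration; they deliberately avoid the notebook's `SessionState` and `MasteryEntry` classes so the cell can run on its own without overwriting anything.

```python
def clamp_mastery(level, delta):
    """Add delta to a mastery level and clamp the result to [0.0, 1.0]."""
    return max(0.0, min(1.0, level + delta))


def compact_events(history, rolling_summary, max_turns):
    """Fold all but the last max_turns events into a plain-text summary.

    Returns (kept_events, new_summary); no LLM call is involved.
    """
    if len(history) <= max_turns:
        return history, rolling_summary

    old_events = history[:-max_turns]
    tail_events = history[-max_turns:]
    merged = "\n".join(
        f"{ev.get('role', 'unknown')}: {ev.get('content', '')}" for ev in old_events
    )
    if rolling_summary:
        merged = rolling_summary + "\n\n[Earlier conversation continued]\n" + merged
    return tail_events, merged


print(clamp_mastery(0.9, 0.3))   # 1.0 (capped at the upper bound)
print(clamp_mastery(0.1, -0.2))  # 0.0 (capped at the lower bound)

demo_events = [{"role": "user", "content": f"msg {i}"} for i in range(5)]
demo_tail, demo_summary = compact_events(demo_events, None, max_turns=3)
print([ev["content"] for ev in demo_tail])  # ['msg 2', 'msg 3', 'msg 4']
print(demo_summary)
```

With five events and `max_turns=3`, the two oldest events end up in the summary as `user: msg 0` and `user: msg 1`, and the last three stay verbatim.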

In [None]:
# 4.1 Initialize global session_state once per kernel and load memory (Day 3)

try:
    session_state
    print("ℹ️ session_state already exists.")
except NameError:
    session_state = init_default_state()
    load_long_term_memory(session_state)
    print("✅ session_state initialized with init_default_state() + long-term memory load.")

In [None]:
# =====================================================================
# 5. UX Helpers (Agent Calls) – notebook-friendly wrappers
# =====================================================================

def build_diagnostic_prompt(user_message: str, state: "SessionState") -> str:
    """
    Build the user message for the Diagnostic Agent.

    We include:
    - A JSON dump of the current session state (already compacted).
    - The student's latest natural-language message.
    - A reminder of the required ORCHESTRATOR_JSON format.
    """
    state_json = json.dumps(state.to_dict(), indent=2)

    prompt = (
        "You are the Diagnostic + Personalization Agent.\n\n"
        "Here is the current session state in JSON (already compacted):\n\n"
        f"{state_json}\n\n"
        "Student message:\n"
        f"\"\"\"{user_message}\"\"\"\n\n"
        "Your tasks:\n\n"
        "1. Briefly explain (in 2–4 sentences) what you think this student needs next.\n"
        "   - Consider the mastery_map, mode, and student_profile from the JSON.\n"
        "2. Then output a single OrchestratorTurn JSON object exactly as described\n"
        "   in your system prompt, using this pattern:\n\n"
        "Explanation for the student.\n\n"
        "ORCHESTRATOR_JSON:\n"
        "{ ... one valid JSON object ... }\n\n"
        "Rules:\n"
        "- Do NOT wrap the JSON in backticks or markdown fences.\n"
        "- The JSON must include:\n"
        "  - intent\n"
        "  - selected_mode\n"
        "  - topic\n"
        "  - difficulty\n"
        "  - actions (non-empty, unless pure META)\n"
        "  - mastery_update (object or null)\n"
        "  - notes_for_subagents (string or null)\n"
        "- At least one action should call:\n"
        "  - CALL_CONCEPT_EXPLAINER, CALL_PROBLEM_GEN_AUTOGRADER, or CALL_VISUALIZATION,\n"
        "    depending on what the student needs.\n\n"
        "Now respond following this format.\n"
    )
    return prompt


print("✅ build_diagnostic_prompt(...) defined.")

In [None]:
from typing import Any as _Any

async def run_diagnostic_turn(user_message: str) -> _Any:
    """
    Run a single turn of the Diagnostic + Personalization Agent.

    Uses:
    - global `session_state`
    - global `diagnostic_runner` (InMemoryRunner for `diagnostic_agent`)

    Also:
    - updates turn_index
    - appends to chat_history
    - compacts history if needed
    - saves long-term memory snapshot
    """
    global session_state

    if "diagnostic_runner" not in globals():
        print("❌ diagnostic_runner not found. Make sure you defined it earlier.")
        return None
    if "session_state" not in globals():
        print("❌ session_state not found. Make sure you initialized it earlier.")
        return None

    session_state.turn_index += 1
    session_state.chat_history.append({"role": "user", "content": user_message})
    compact_history_if_needed(session_state)
    save_long_term_memory(session_state)

    print("\n🚀 Running Diagnostic + Personalization turn...")
    prompt = build_diagnostic_prompt(user_message, session_state)
    response = await diagnostic_runner.run_debug(prompt)

    # We don't have structured access to the LLM text here, but we still record a placeholder.
    session_state.chat_history.append({"role": "assistant", "content": "[Diagnostic response above]"})
    compact_history_if_needed(session_state)
    save_long_term_memory(session_state)

    # Track a recent intent tag for analytics / personalization
    session_state.recent_intents.append("DIAGNOSTIC_TURN")
    if len(session_state.recent_intents) > 12:
        session_state.recent_intents = session_state.recent_intents[-12:]

    return response


print("✅ run_diagnostic_turn(...) defined.")

In [None]:
async def call_concept_explainer(
    topic: str,
    level: str = "standard",
    persona_hint: str = "",
) -> _Any:
    """
    Call the Concept Explainer Agent via its runner.

    Also:
    - updates session_state.turn_index
    - sets current_topic
    - appends "EXPLAIN_CONCEPT" intent
    - updates chat_history + compaction
    - saves long-term memory
    """
    global session_state

    if "concept_explainer_runner" not in globals():
        print("❌ concept_explainer_runner not found. Define it before calling this helper.")
        return None

    session_state.turn_index += 1
    session_state.current_topic = topic
    session_state.recent_intents.append("EXPLAIN_CONCEPT")
    if len(session_state.recent_intents) > 12:
        session_state.recent_intents = session_state.recent_intents[-12:]

    # Log a pseudo-user request for history purposes
    session_state.chat_history.append(
        {"role": "user", "content": f"[Request explanation for topic '{topic}' at level '{level}']"}
    )
    compact_history_if_needed(session_state)
    save_long_term_memory(session_state)

    user_prompt = (
        "You are Algorithm Mentor's Concept Explainer Agent.\n\n"
        "Student persona hint:\n"
        f"{persona_hint}\n\n"
        "Task:\n"
        f'Explain the topic "{topic}" at a {level.upper()} level, following your\n'
        "sequential pipeline (Overview, Intuition, Why it matters, Step-by-step trace,\n"
        "Pseudocode, Time & space complexity, Common pitfalls, Check your understanding).\n\n"
        "Use Markdown sections.\n"
    )

    print(f"\n🚀 Calling Concept Explainer for topic: {topic}, level: {level}")
    response = await concept_explainer_runner.run_debug(user_prompt)

    session_state.chat_history.append(
        {"role": "assistant", "content": f"[Concept explanation for {topic} shown above]"}
    )
    compact_history_if_needed(session_state)
    save_long_term_memory(session_state)

    return response


async def call_problem_generator(
    topic: str,
    difficulty: str = "easy",
    num_questions: int = 2,
) -> _Any:
    """
    Call the ProblemGen + Auto-Grader Agent to generate practice problems.

    Also:
    - updates session_state.turn_index
    - sets current_topic/current_difficulty
    - appends "PRACTICE_PROBLEMS" intent
    - updates chat_history + compaction
    - saves long-term memory
    """
    global session_state

    if "problem_runner" not in globals():
        print("❌ problem_runner not found. Define it before calling this helper.")
        return None

    session_state.turn_index += 1
    session_state.current_topic = topic
    session_state.current_difficulty = difficulty
    session_state.recent_intents.append("PRACTICE_PROBLEMS")
    if len(session_state.recent_intents) > 12:
        session_state.recent_intents = session_state.recent_intents[-12:]

    session_state.chat_history.append(
        {
            "role": "user",
            "content": f"[Request {num_questions} {difficulty} practice problems on '{topic}']",
        }
    )
    compact_history_if_needed(session_state)
    save_long_term_memory(session_state)

    user_prompt = (
        "You are the Problem Generator + Auto-Grader Agent for Algorithm Mentor.\n\n"
        "Task:\n"
        f"Generate {num_questions} {difficulty} practice problems on \"{topic}\".\n\n"
        "Requirements:\n"
        "- Use only synthetic problems (no copying real exams).\n"
        "- Make the questions clear and self-contained.\n"
        "- For each problem, provide:\n"
        "  - The question text.\n"
        "  - A brief internal answer/rubric.\n"
        "You may respond in structured, well-formatted natural language (no need for JSON here).\n"
    )

    print(f"\n🚀 Calling ProblemGen + Auto-Grader for topic: {topic}, difficulty: {difficulty}")
    response = await problem_runner.run_debug(user_prompt)

    session_state.chat_history.append(
        {"role": "assistant", "content": f"[Generated practice problems for {topic} shown above]"}
    )
    compact_history_if_needed(session_state)
    save_long_term_memory(session_state)

    return response


async def call_visualization_agent(
    viz_request: str,
) -> _Any:
    """
    Call the Visualization Agent via its runner.

    Also:
    - updates session_state.turn_index
    - appends "VISUALIZE" intent
    - updates chat_history + compaction
    - saves long-term memory
    """
    global session_state

    if "viz_runner" not in globals():
        print("❌ viz_runner not found. Define it before calling this helper.")
        return None

    session_state.turn_index += 1
    session_state.recent_intents.append("VISUALIZE")
    if len(session_state.recent_intents) > 12:
        session_state.recent_intents = session_state.recent_intents[-12:]

    session_state.chat_history.append(
        {"role": "user", "content": f"[Request visualization: {viz_request}]"}
    )
    compact_history_if_needed(session_state)
    save_long_term_memory(session_state)

    user_prompt = (
        f"{viz_request}\n\n"
        "Please:\n\n"
        "Choose the appropriate viz_type (sorting, graph_traversal, dp_table,\n"
        "recursion_tree, heap_and_priority_queue, hash_table, search_tree_structure,\n"
        "complexity_growth, or np_completeness_and_reductions).\n\n"
        "Use a tiny synthetic example.\n\n"
        "Produce a step-by-step visualization in Markdown, with sections:\n\n"
        "Overview\n\n"
        "Step-by-step\n\n"
        "What this picture tells you\n"
    )

    print("\n🚀 Calling Visualization Agent...")
    response = await viz_runner.run_debug(user_prompt)

    session_state.chat_history.append(
        {"role": "assistant", "content": "[Visualization explanation shown above]"}
    )
    compact_history_if_needed(session_state)
    save_long_term_memory(session_state)

    return response


print(
    "✅ call_concept_explainer(...), call_problem_generator(...), "
    "call_visualization_agent(...) defined."
)
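
In [None]:
The helper above caps `session_state.recent_intents` at 12 entries by slicing after each append. The same cap can be sketched with `collections.deque(maxlen=...)`, which drops the oldest entry automatically. This is only an illustration on a standalone variable: whether `recent_intents` can be stored as a deque depends on how the session model is declared and serialized, which this cell does not show. `MAX_RECENT_INTENTS` is a hypothetical name.

```python
from collections import deque

MAX_RECENT_INTENTS = 12  # same cap as the slice-based version above

# A deque with maxlen discards the oldest item once the cap is reached.
recent_intents = deque(maxlen=MAX_RECENT_INTENTS)
for turn in range(15):
    recent_intents.append(f"INTENT_{turn}")

print(len(recent_intents))
print(recent_intents[0], recent_intents[-1])
```

If the session state is saved as JSON, the deque would need converting with `list(recent_intents)` before saving.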

In [None]:
# =====================================================================
# 6. Demo Cells – end-to-end tutoring examples
# =====================================================================

# Example – Concept Explainer
await call_concept_explainer(
    topic="Dijkstra's algorithm",
    level="standard",
    persona_hint="Sara, overloaded CS undergrad, prefers C++-style pseudocode."
)

In [None]:
# Example – Problem Generator
await call_problem_generator(
    topic="binary search",
    difficulty="easy",
    num_questions=2,
)

In [None]:
# Example – Diagnostic (Orchestrator)
await run_diagnostic_turn(
    "I'm confused about dynamic programming, especially 0/1 knapsack tables."
)

In [None]:
# Example – Visualization Agent
await call_visualization_agent(
    viz_request="Visualize merge sort on [4, 1, 3, 9, 7].",
)

In [None]:
# OPTIONAL: create ADK agent package 
# !adk create algorithm-mentor-agent --model gemini-2.5-flash-lite --api_key $GOOGLE_API_KEY

In [None]:
# OPTIONAL: get ADK Web UI URL
# url_prefix = get_adk_proxy_url()
# print("URL prefix:", url_prefix)

In [None]:
# OPTIONAL: run ADK Web UI server (keep this cell running)
# !adk web --url_prefix {url_prefix}

In [None]:
# === 6. Helper: extract plain text from ADK runner responses ================

import json
from typing import Any as _Any

def _extract_text(response: _Any) -> str:
    """
    Best-effort extraction of plain text from an ADK runner response.

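    It tries, in order:
    - response itself, if it is already a string
    - the first string-valued attribute among:
      response.text, response.output_text, response.output, response.content
    - str(response) as a last resort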
    """
    if isinstance(response, str):
        return response

    for attr in ("text", "output_text", "output", "content"):
        if hasattr(response, attr):
            try:
                value = getattr(response, attr)
                if isinstance(value, str):
                    return value
            except Exception:
                pass

    # Fallback: whatever __str__ gives
    return str(response)

print("✅ _extract_text(...) helper defined.")
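
In [None]:
`_extract_text` falls back to `str(response)` when the response is not a string and has no string-valued attribute. If a runner returns a list of event objects rather than a single object, that fallback produces a list repr instead of the model's text. The sketch below shows a more tolerant extractor. The function name `extract_text_from_events` and the `event.content.parts[i].text` shape are assumptions for illustration, not something this notebook confirms about the ADK response type, so the demo uses stand-in objects.

```python
from types import SimpleNamespace
from typing import Any, Iterable, List


def extract_text_from_events(response: Any) -> str:
    """Collect text from a response that may be a string, one event, or a list of events.

    Assumed (hypothetical) shape: each event has .content.parts, each part has .text.
    """
    if isinstance(response, str):
        return response

    events: Iterable[Any] = response if isinstance(response, (list, tuple)) else [response]
    chunks: List[str] = []
    for event in events:
        content = getattr(event, "content", None)
        parts = getattr(content, "parts", None) or []
        for part in parts:
            text = getattr(part, "text", None)
            if isinstance(text, str) and text:
                chunks.append(text)

    # Fall back to str() only when no text part was found.
    return "\n".join(chunks) if chunks else str(response)


# Stand-in objects that mimic the assumed event shape.
fake_events = [
    SimpleNamespace(content=SimpleNamespace(parts=[SimpleNamespace(text="Hello")])),
    SimpleNamespace(content=None),
    SimpleNamespace(content=SimpleNamespace(parts=[SimpleNamespace(text="world")])),
]
print(extract_text_from_events(fake_events))
```

Events without content are skipped, so tool-call or status events would not pollute the text passed to the Judge Agent.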

In [None]:
# === 7. Observability – metrics store for basic logging =======================

from dataclasses import asdict

METRICS = {
    "num_explainer_calls": 0,
    "num_problem_gen_calls": 0,
    "num_viz_calls": 0,
    "num_diagnostic_turns": 0,
    "num_judge_calls": 0,
    "eval_runs": 0,
}

def print_metrics() -> None:
    """
    Print current metrics in a compact, human-friendly way.
    """
    print("\n📊 Current Algorithm Mentor metrics:")
    for k, v in METRICS.items():
        print(f"  - {k}: {v}")

print("✅ METRICS dict and print_metrics() defined.")

In [None]:
# === 8. Instrument existing helpers with metrics ============================

# We assume:
# - concept_explainer_runner, problem_runner, viz_runner, diagnostic_runner exist
# - METRICS and _extract_text are defined

async def run_diagnostic_turn(user_message: str) -> _Any:
    """
    Run a single turn of the Diagnostic + Personalization Agent.

    Uses:
    - global `session_state`
    - global `diagnostic_runner`

    Adds:
    - metrics bump for num_diagnostic_turns
    """
    if "diagnostic_runner" not in globals():
        print("❌ diagnostic_runner not found. Make sure you defined it earlier.")
        return None
    if "session_state" not in globals():
        print("❌ session_state not found. Make sure you initialized it earlier.")
        return None

    METRICS["num_diagnostic_turns"] += 1

    print("\n🚀 Running Diagnostic + Personalization turn...")
    prompt = build_diagnostic_prompt(user_message, session_state)
    response = await diagnostic_runner.run_debug(prompt)
    return response


async def call_concept_explainer(
    topic: str,
    level: str = "standard",
    persona_hint: str = "",
) -> _Any:
    """
    Call the Concept Explainer Agent via its runner.

    Adds:
    - metrics bump for num_explainer_calls
    """
    if "concept_explainer_runner" not in globals():
        print("❌ concept_explainer_runner not found. Define it before calling this helper.")
        return None

    METRICS["num_explainer_calls"] += 1

    user_prompt = (
        "You are Algorithm Mentor's Concept Explainer Agent.\n\n"
        "Student persona hint:\n"
        f"{persona_hint}\n\n"
        "Task:\n"
        f'Explain the topic "{topic}" at a {level.upper()} level, following your\n'
        "sequential pipeline (Overview, Intuition, Why it matters, Step-by-step trace,\n"
        "Pseudocode, Time & space complexity, Common pitfalls, Check your understanding).\n\n"
        "Use Markdown sections.\n"
    )

    print(f"\n🚀 Calling Concept Explainer for topic: {topic}, level: {level}")
    response = await concept_explainer_runner.run_debug(user_prompt)
    return response


async def call_problem_generator(
    topic: str,
    difficulty: str = "easy",
    num_questions: int = 2,
) -> _Any:
    """
    Call the ProblemGen + Auto-Grader Agent to generate practice problems.

    Adds:
    - metrics bump for num_problem_gen_calls
    """
    if "problem_runner" not in globals():
        print("❌ problem_runner not found. Define it before calling this helper.")
        return None

    METRICS["num_problem_gen_calls"] += 1

    user_prompt = (
        "You are the Problem Generator + Auto-Grader Agent for Algorithm Mentor.\n\n"
        "Task:\n"
        f"Generate {num_questions} {difficulty} practice problems on \"{topic}\".\n\n"
        "Requirements:\n"
        "- Use only synthetic problems (no copying real exams).\n"
        "- Make the questions clear and self-contained.\n"
        "- For each problem, provide:\n"
        "  - The question text.\n"
        "  - A brief internal answer/rubric.\n"
        "You may respond in structured, well-formatted natural language (no need for JSON here).\n"
    )

    print(f"\n🚀 Calling ProblemGen + Auto-Grader for topic: {topic}, difficulty: {difficulty}")
    response = await problem_runner.run_debug(user_prompt)
    return response


async def call_visualization_agent(
    viz_request: str,
) -> _Any:
    """
    Call the Visualization Agent via its runner.

    Adds:
    - metrics bump for num_viz_calls
    """
    if "viz_runner" not in globals():
        print("❌ viz_runner not found. Define it before calling this helper.")
        return None

    METRICS["num_viz_calls"] += 1

    user_prompt = (
        f"{viz_request}\n\n"
        "Please:\n\n"
        "Choose the appropriate viz_type (sorting, graph_traversal, dp_table,\n"
        "recursion_tree, heap_and_priority_queue, hash_table, search_tree_structure,\n"
        "complexity_growth, or np_completeness_and_reductions).\n\n"
        "Use a tiny synthetic example.\n\n"
        "Produce a step-by-step visualization in Markdown, with sections:\n\n"
        "Overview\n\n"
        "Step-by-step\n\n"
        "What this picture tells you\n"
    )

    print("\n🚀 Calling Visualization Agent...")
    response = await viz_runner.run_debug(user_prompt)
    return response


print("✅ Existing helpers wrapped with basic metrics.")
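
In [None]:
The cell above redefines each helper from scratch. As a result, the session-state bookkeeping in the earlier `call_visualization_agent` (turn index, recent intents, chat history, long-term memory) is not repeated in the instrumented version. One alternative is to wrap the existing coroutine so that the metric bump is added without replacing the body. The sketch below uses a local `metrics` dict and a hypothetical `count_calls` decorator. The stand-in coroutine does not contact any agent.

```python
import asyncio
import functools
from typing import Any, Dict

metrics: Dict[str, int] = {"num_viz_calls": 0}  # local stand-in for METRICS


def count_calls(key: str):
    """Return a decorator that bumps metrics[key] before each awaited call."""
    def decorator(func):
        @functools.wraps(func)  # keep the wrapped helper's name and docstring
        async def wrapper(*args: Any, **kwargs: Any) -> Any:
            metrics[key] = metrics.get(key, 0) + 1
            return await func(*args, **kwargs)
        return wrapper
    return decorator


@count_calls("num_viz_calls")
async def fake_visualization_call(viz_request: str) -> str:
    # Stand-in for the real helper; the original body would stay untouched.
    return f"[visualization for: {viz_request}]"


async def _demo() -> str:
    await fake_visualization_call("merge sort on [4, 1, 3]")
    return await fake_visualization_call("BFS on a 4-node graph")


result = asyncio.run(_demo())
print(result, metrics)
```

Inside a notebook cell, where an event loop is already running, `result = await _demo()` would replace the `asyncio.run(...)` line. An already-defined helper could be wrapped the same way, for example `call_visualization_agent = count_calls("num_viz_calls")(call_visualization_agent)`.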

In [None]:
# === 9. Evaluation – Judge Agent + tiny eval suite ===========================
from typing import List

# Small eval set – you can add more later
EVAL_TESTS: List[EvalTestCase] = [
    EvalTestCase(
        id="explainer_dijkstra",
        agent="concept_explainer_agent",
        description="Checks Dijkstra explanation structure and clarity.",
        prompt=(
            "Explain Dijkstra's algorithm for single-source shortest paths on a graph "
            "with non-negative edge weights. Follow your standard Algorithm Mentor "
            "section structure."
        ),
        expected_properties=[
            "Has sections: Overview, Intuition, Why it matters, Step-by-step trace, "
            "Pseudocode, Time & space complexity, Common pitfalls, Check your understanding.",
            "Uses a small synthetic example graph.",
            "Avoids copying any textbook wording.",
        ],
    ),
    EvalTestCase(
        id="problemgen_binary_search_easy",
        agent="problem_gen_autograder_agent",
        description="Generate easy binary search practice problems.",
        prompt=(
            "Generate 2 **easy** practice problems on binary search, including "
            "the question text and a short answer or rubric for each."
        ),
        expected_properties=[
            "Questions are clearly about binary search.",
            "Problems are easy-level and self-contained.",
            "Includes at least a short solution or rubric per question.",
        ],
    ),
]


async def run_eval_suite() -> EvalSummary:
    """
    Run the small evaluation suite using the Judge Agent.

    - Calls the target agent to get output (via runner.run_debug()).
    - Sends (requirements + agent output) to Judge Agent (via judge_runner.run_debug()).
    - Parses JSON from Judge and builds an EvalSummary.
    - Updates METRICS and prints a short report.
    """
    if "judge_runner" not in globals():
        print("❌ judge_runner not found. Make sure you defined the Judge agent earlier.")
        return EvalSummary()

    results: List[EvalResult] = []

    print("\n🧪 Running evaluation suite...")
    for test in EVAL_TESTS:
        print(f"\n--- Test: {test.id} ({test.agent}) ---")

        # 1) Choose the correct runner for the target agent
        if test.agent == "concept_explainer_agent":
            target_runner = concept_explainer_runner
        elif test.agent == "problem_gen_autograder_agent":
            target_runner = problem_runner
        else:
            print(f"⚠️ Unknown agent in eval test: {test.agent}, skipping.")
            continue

        # 2) Call the target agent via run_debug(prompt)
        agent_resp = await target_runner.run_debug(test.prompt)
        agent_text = _extract_text(agent_resp)

        # 3) Build judge prompt
        judge_prompt = (
            "You are the Judge Agent for Algorithm Mentor.\n\n"
            f"Test case description:\n{test.description}\n\n"
            "Expected properties:\n"
            + "\n".join(f"- {prop}" for prop in test.expected_properties)
            + "\n\n"
            "Here is the agent output you must evaluate:\n\n"
            f"--- BEGIN AGENT OUTPUT ---\n{agent_text}\n--- END AGENT OUTPUT ---\n\n"
            "Now respond ONLY with a single JSON object:\n"
            "{\n"
            '  \"score\": float,              // 0.0 to 1.0\n'
            '  \"passed\": bool,              // true if score >= 0.7\n'
            '  \"notes\": \"string\"          // short explanation\n'
            "}\n"
        )

        # 4) Call Judge Agent via run_debug(prompt)
        METRICS["num_judge_calls"] += 1
        judge_resp = await judge_runner.run_debug(judge_prompt)
        judge_text = _extract_text(judge_resp)

        # 5) Parse JSON and build EvalResult
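        try:
            judge_json = json.loads(judge_text)
            if not isinstance(judge_json, dict):
                raise ValueError("judge response is not a JSON object")
            score = float(judge_json.get("score", 0.0))
        except (TypeError, ValueError) as e:
            # json.JSONDecodeError is a subclass of ValueError.
            print("❌ Failed to parse judge JSON:", e)
            print("Raw judge response:")
            print(judge_text)
            continue

        # Accept only a real boolean for "passed"; otherwise apply the 0.7 threshold
        # stated in the judge prompt (bool("false") would be True).
        passed_raw = judge_json.get("passed")
        passed = passed_raw if isinstance(passed_raw, bool) else score >= 0.7
        notes = str(judge_json.get("notes", ""))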

        result = EvalResult(
            test_id=test.id,
            agent=test.agent,
            score=score,
            passed=passed,
            judge_notes=notes,
        )
        results.append(result)

        status = "✅ PASS" if passed else "❌ FAIL"
        print(f"{status} – score={score:.2f}")
        print("Notes:", notes)

    summary = EvalSummary(results=results)
    METRICS["eval_runs"] += 1

    print("\n📋 EVAL SUMMARY")
    print(f"- Tests run: {summary.num_total}")
    print(f"- Passed:    {summary.num_passed}")
    print(f"- Avg score: {summary.average_score:.2f}")

    return summary


print("✅ Evaluation suite (EVAL_TESTS, run_eval_suite) defined with run_debug().")
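
In [None]:
`run_eval_suite` passes the judge's raw text straight to `json.loads`, which fails if the model adds any prose or Markdown fencing around the JSON object. The sketch below is a more forgiving parser that scans for the first decodable JSON object in the text. `parse_judge_json` and `coerce_passed` are hypothetical helper names and are not part of the notebook. The 0.7 threshold is the one stated in the judge prompt.

```python
import json
from typing import Any, Dict, Optional


def parse_judge_json(text: str) -> Optional[Dict[str, Any]]:
    """Return the first JSON object that decodes from text, or None."""
    decoder = json.JSONDecoder()
    start = text.find("{")
    while start != -1:
        try:
            # raw_decode parses one JSON value and ignores anything after it.
            obj, _ = decoder.raw_decode(text[start:])
        except json.JSONDecodeError:
            obj = None
        if isinstance(obj, dict):
            return obj
        start = text.find("{", start + 1)
    return None


def coerce_passed(value: Any, score: float, threshold: float = 0.7) -> bool:
    """Interpret 'passed' strictly; fall back to the score threshold."""
    if isinstance(value, bool):
        return value
    if isinstance(value, str) and value.strip().lower() in ("true", "false"):
        return value.strip().lower() == "true"
    return score >= threshold


sample = (
    "Verdict follows.\n"
    '{"score": 0.85, "passed": "true", "notes": "Clear structure."}\n'
    "Thanks!"
)
parsed = parse_judge_json(sample)
print(parsed)
print(coerce_passed(parsed["passed"], float(parsed["score"])))
```

A response that contains no decodable object yields `None`, which the caller can treat the same way the suite currently treats a parse failure.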

In [None]:
# === 10. Sanity Check – run eval suite + print metrics =======================
eval_summary = await run_eval_suite()
print_metrics()