# [STARTER] UdaPlay Project

## Part 02 - Agent

In this part of the project, you'll plug your VectorDB into your Agent as a tool.

You're building UdaPlay, an AI Research Agent for the video game industry. The agent will:
1. Answer questions using internal knowledge (RAG)
2. Search the web when needed
3. Maintain conversation state
4. Return structured outputs
5. Store useful information for future use

### Setup

In [None]:
# Only needed for Udacity workspace

import importlib.util
import sys

# Check if 'pysqlite3' is available before importing
if importlib.util.find_spec("pysqlite3") is not None:
    import pysqlite3
    sys.modules['sqlite3'] = sys.modules.pop('pysqlite3')

In [None]:
# TODO: Import the necessary libs
# For example: 
# import os

# from lib.agents import Agent
# from lib.llm import LLM
# from lib.messages import UserMessage, SystemMessage, ToolMessage, AIMessage
# from lib.tooling import tool

import os
import json
import uuid
from pathlib import Path
from typing import List, Dict, Any
from datetime import datetime

import chromadb
from chromadb.utils import embedding_functions
from dotenv import load_dotenv
from tavily import TavilyClient
from pydantic import BaseModel, Field

from lib.agents import Agent
from lib.llm import LLM
from lib.messages import BaseMessage
from lib.tooling import tool

In [None]:
# TODO: Load environment variables
# load_dotenv()

# OPENAI_API_KEY = os.getenv("OPENAI_API_KEY")
# TAVILY_API_KEY = os.getenv("TAVILY_API_KEY")

load_dotenv()

OPENAI_API_KEY = os.getenv("OPENAI_API_KEY")
TAVILY_API_KEY = os.getenv("TAVILY_API_KEY")
CHROMA_OPENAI_API_KEY = os.getenv("CHROMA_OPENAI_API_KEY") or OPENAI_API_KEY

assert OPENAI_API_KEY is not None, "Missing OPENAI_API_KEY in .env"
assert TAVILY_API_KEY is not None, "Missing TAVILY_API_KEY in .env"
assert CHROMA_OPENAI_API_KEY is not None, "Missing CHROMA_OPENAI_API_KEY (or OPENAI_API_KEY) in .env"

### Tools

Build at least 3 tools:
- retrieve_game: To search the vector DB
- evaluate_retrieval: To assess the retrieval performance
- game_web_search: To search the web when the retrieved documents are not good enough
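
One way the three tools can fit together is a retrieve, evaluate, then fall back pipeline. The sketch below only illustrates that control flow: `answer_with_fallback` and the three `fake_*` functions are hypothetical stand-ins with made-up data, not the project's tools, and in the real agent the LLM decides when to call each tool.

```python
from typing import Any, Callable, Dict, List

Docs = List[Dict[str, Any]]


def answer_with_fallback(
    question: str,
    retrieve: Callable[[str], Docs],
    evaluate: Callable[[str, Docs], Dict[str, Any]],
    web_search: Callable[[str], Dict[str, Any]],
) -> Dict[str, Any]:
    """Run retrieve -> evaluate -> (optional) web search and report which source was used."""
    docs = retrieve(question)
    report = evaluate(question, docs)
    # Default to the web when the report does not say otherwise
    if report.get("should_web_search", True):
        return {"source": "web", "payload": web_search(question), "report": report}
    return {"source": "local", "payload": docs, "report": report}


# Hypothetical stubs standing in for the real tools (illustrative data only)
def fake_retrieve(question: str) -> Docs:
    return [{"Name": "Game A", "Platform": "Console X"}]


def fake_evaluate(question: str, docs: Docs) -> Dict[str, Any]:
    useful = any(d["Name"].lower() in question.lower() for d in docs)
    return {"useful": useful, "should_web_search": not useful}


def fake_web_search(question: str) -> Dict[str, Any]:
    return {"answer": "(stubbed web answer)", "results": []}


local = answer_with_fallback("Which platform is Game A on?", fake_retrieve, fake_evaluate, fake_web_search)
web = answer_with_fallback("Which platform is Game B on?", fake_retrieve, fake_evaluate, fake_web_search)
print(local["source"], web["source"])
```

Passing the tools in as callables keeps the flow easy to test without an API key or a database.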


#### Retrieve Game Tool

In [None]:
# TODO: Create retrieve_game tool
# It should use chroma client and collection you created
# chroma_client = chromadb.PersistentClient(path="chromadb")
# collection = chroma_client.get_collection("udaplay")
# Tool Docstring:
#    Semantic search: Finds the most similar results in the vector DB
#    args:
#    - query: a question about the game industry.
#
#    You'll receive the results as a list. Each element contains:
#    - Platform: the platform (e.g. Game Boy, PlayStation 5, Xbox 360)
#    - Name: Name of the Game
#    - YearOfRelease: Year when that game was released for that platform
#    - Description: Additional details about the game

CHROMA_DIR = Path(os.getenv("CHROMA_PATH") or "chromadb").resolve()
COLLECTION_NAME = "udaplay"
MEMORY_COLLECTION_NAME = "udaplay_long_term_memory"

chroma_client = chromadb.PersistentClient(path=str(CHROMA_DIR))
try:
    embedding_fn = embedding_functions.OpenAIEmbeddingFunction(
        api_key=CHROMA_OPENAI_API_KEY,
        model_name="text-embedding-3-small",
    )
except TypeError:
    embedding_fn = embedding_functions.OpenAIEmbeddingFunction(api_key=CHROMA_OPENAI_API_KEY)

collection = chroma_client.get_or_create_collection(
    name=COLLECTION_NAME,
    embedding_function=embedding_fn,
)
memory_collection = chroma_client.get_or_create_collection(
    name=MEMORY_COLLECTION_NAME,
    embedding_function=embedding_fn,
)

@tool
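def retrieve_game(query: str, n_results: int = 5) -> List[Dict[str, Any]]:
    """Semantic search: finds the most similar results in the local game vector DB.

    Args:
        query: A question about the game industry.
        n_results: Maximum number of results to return.

    Returns:
        A list of dicts. Each one holds the stored metadata (e.g. Platform, Name,
        YearOfRelease, Description) plus the indexed text, distance, and source.
    """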
    res = collection.query(
        query_texts=[query],
        n_results=n_results,
        include=["documents", "metadatas", "distances"],
    )

    docs = res.get("documents", [[]])[0]
    metas = res.get("metadatas", [[]])[0]
    dists = res.get("distances", [[]])[0]

    out: List[Dict[str, Any]] = []
    for doc, meta, dist in zip(docs, metas, dists):
        item = dict(meta or {})
        item["_indexed_text"] = doc
        item["_distance"] = dist
        item["_source"] = {"type": "local_vector_db", "collection": COLLECTION_NAME}
        out.append(item)

    return out
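
Chroma's `query` returns one inner list per query text, which is why the tool indexes `[0]` on each field. The sketch below flattens a hand-written dict in that shape (the values are made up and no database is involved). The `max_distance` cutoff is a hypothetical extra, not part of the tool above, and a sensible value depends on the distance metric of the collection.

```python
from typing import Any, Dict, List, Optional


def flatten_query_result(
    res: Dict[str, Any], max_distance: Optional[float] = None
) -> List[Dict[str, Any]]:
    """Turn a Chroma-style result for ONE query text into a flat list of records."""
    # `or [[]]` also covers fields that are present but set to None
    docs = (res.get("documents") or [[]])[0]
    metas = (res.get("metadatas") or [[]])[0]
    dists = (res.get("distances") or [[]])[0]

    out: List[Dict[str, Any]] = []
    for doc, meta, dist in zip(docs, metas, dists):
        if max_distance is not None and dist > max_distance:
            continue  # drop weak matches when a cutoff is requested
        out.append({**(meta or {}), "_indexed_text": doc, "_distance": dist})
    return out


# Illustrative result for a single query text
sample = {
    "documents": [["doc about game A", "doc about game B"]],
    "metadatas": [[{"Name": "Game A"}, {"Name": "Game B"}]],
    "distances": [[0.21, 0.87]],
}

all_hits = flatten_query_result(sample)
close_hits = flatten_query_result(sample, max_distance=0.5)
print(len(all_hits), len(close_hits))
```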

#### Evaluate Retrieval Tool

In [None]:
# TODO: Create evaluate_retrieval tool
# You might use an LLM as judge in this tool to evaluate the performance
# You need to prompt that LLM with something like:
# "Your task is to evaluate whether the documents are sufficient to answer the query. "
# "Give a detailed explanation, so it's possible to take an action to accept it or not."
# Use EvaluationReport to parse the result
# Tool Docstring:
#    Based on the user's question and on the list of retrieved documents, 
#    it will analyze the usability of the documents to respond to that question. 
#    args: 
#    - question: original question from user
#    - retrieved_docs: retrieved documents most similar to the user query in the Vector Database
#    The result includes:
#    - useful: whether the documents are useful to answer the question
#    - description: description about the evaluation result

class EvaluationReport(BaseModel):
    useful: bool = Field(description="Whether the documents are useful to answer the question")
    confidence: float = Field(description="Confidence score 0-1", ge=0.0, le=1.0)
    description: str = Field(description="Explanation for the decision")
    should_web_search: bool = Field(description="Whether to fall back to web search")

@tool
def evaluate_retrieval(question: str, retrieved_docs: List[Dict[str, Any]]) -> Dict[str, Any]:
    """Evaluate whether retrieved docs are sufficient; otherwise recommend web search."""
    judge = LLM(model="gpt-4o-mini", temperature=0.0)

    prompt = (
        "You are a strict retrieval evaluator.\n"
        "Decide whether the retrieved documents are sufficient to answer the user's question accurately.\n\n"
        f"Question:\n{question}\n\n"
        "Retrieved documents (JSON):\n"
        f"{json.dumps(retrieved_docs, ensure_ascii=False)}\n\n"
        "Return ONLY a JSON object matching this schema:\n"
        "{useful: bool, confidence: float (0-1), description: str, should_web_search: bool}\n\n"
        "Rules:\n"
        "- useful=true only if the docs directly contain the needed facts.\n"
        "- If the question is time-sensitive (e.g., 'right now/currently'), prefer web search.\n"
        "- should_web_search=true when useful is false OR confidence < 0.65.\n"
    )

    try:
        ai = judge.invoke(prompt, response_format=EvaluationReport)
        report = EvaluationReport.model_validate_json(ai.content)
    except Exception:
        # Safe fallback heuristic if structured parsing fails
        useful = bool(retrieved_docs)
        report = EvaluationReport(
            useful=useful,
            confidence=0.5 if useful else 0.0,
            description="Fallback evaluation (structured parsing failed).",
            should_web_search=not useful,
        )

    if (not report.useful) or (report.confidence < 0.65):
        report.should_web_search = True
    return report.model_dump()
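
The parse-then-fallback logic can also be exercised without calling an LLM. The following is a minimal standard-library sketch, assuming the judge replies with a JSON string. `parse_judge_output` and `CONFIDENCE_THRESHOLD` are hypothetical names, and plain dicts stand in for `EvaluationReport`.

```python
import json
from typing import Any, Dict

CONFIDENCE_THRESHOLD = 0.65  # same cutoff the tool above applies


def parse_judge_output(raw: str, has_docs: bool) -> Dict[str, Any]:
    """Parse the judge's JSON reply; fall back to a conservative report if it is malformed."""
    try:
        data = json.loads(raw)
        report = {
            "useful": bool(data["useful"]),
            # Clamp to [0, 1] in case the judge returns an out-of-range score
            "confidence": min(1.0, max(0.0, float(data["confidence"]))),
            "description": str(data.get("description", "")),
        }
    except (ValueError, KeyError, TypeError):
        report = {
            "useful": has_docs,
            "confidence": 0.5 if has_docs else 0.0,
            "description": "Fallback evaluation (could not parse judge output).",
        }
    # Enforce the rule in code rather than trusting the judge to apply it
    report["should_web_search"] = (
        not report["useful"] or report["confidence"] < CONFIDENCE_THRESHOLD
    )
    return report


good = parse_judge_output('{"useful": true, "confidence": 0.9, "description": "ok"}', True)
weak = parse_judge_output('{"useful": true, "confidence": 0.4}', True)
broken = parse_judge_output("not json", True)
print(good["should_web_search"], weak["should_web_search"], broken["should_web_search"])
```

Because the fallback confidence of 0.5 sits below the threshold, a parsing failure always routes to web search.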

#### Game Web Search Tool

In [None]:
# TODO: Create game_web_search tool
# Please use Tavily client to search the web
# Tool Docstring:
#    Web search: Finds up-to-date results on the web using Tavily
#    args:
#    - question: a question about the game industry.

@tool
def game_web_search(question: str, search_depth: str = "advanced", max_results: int = 5) -> Dict[str, Any]:
    """Search the web using Tavily API (fallback) and store a compact memory fragment."""
    client = TavilyClient(api_key=TAVILY_API_KEY)
    search_result = client.search(
        query=question,
        search_depth=search_depth,
        include_answer=True,
        include_raw_content=False,
        include_images=False,
        max_results=max_results,
    )

    formatted = {
        "answer": search_result.get("answer", ""),
        "results": search_result.get("results", []),
        "search_metadata": {"timestamp": datetime.now().isoformat(), "query": question},
        "_source": {"type": "tavily"},
    }

    # Persist a compact memory fragment for future use
    urls = [r.get("url") for r in formatted.get("results", []) if isinstance(r, dict) and r.get("url")]
    memory_text = (
        f"Question: {question}\n"
        f"Web answer: {formatted.get('answer','')}\n"
        f"URLs: {json.dumps(urls, ensure_ascii=False)}"
    )
    metadata = {
        "kind": "web_search_result",
        "question": question,
        "timestamp": datetime.now().isoformat(),
        "urls": json.dumps(urls, ensure_ascii=False),
    }
    doc_id = f"web_{uuid.uuid4().hex}"
    try:
        if hasattr(memory_collection, "upsert"):
            memory_collection.upsert(ids=[doc_id], documents=[memory_text], metadatas=[metadata])
        else:
            memory_collection.add(ids=[doc_id], documents=[memory_text], metadatas=[metadata])
    except Exception:
        # Best-effort persistence: a storage failure should not break the web search itself
        pass

    return formatted
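
Building the memory fragment is independent of both Tavily and Chroma, so it can be factored out and checked in isolation. This is a sketch under that assumption: `build_memory_fragment` is a hypothetical helper, and the sample response is hand-written in the same shape the tool above reads (`answer`, plus `results` entries with a `url`).

```python
import json
from datetime import datetime
from typing import Any, Dict, Tuple


def build_memory_fragment(
    question: str, search_result: Dict[str, Any]
) -> Tuple[str, Dict[str, str]]:
    """Return (document text, metadata) for one web search result."""
    results = search_result.get("results") or []
    # Keep only entries that are dicts and carry a non-empty URL
    urls = [r["url"] for r in results if isinstance(r, dict) and r.get("url")]
    urls_json = json.dumps(urls, ensure_ascii=False)

    text = (
        f"Question: {question}\n"
        f"Web answer: {search_result.get('answer') or ''}\n"
        f"URLs: {urls_json}"
    )
    metadata = {
        "kind": "web_search_result",
        "question": question,
        "timestamp": datetime.now().isoformat(),
        "urls": urls_json,  # stored as a JSON string, mirroring the tool above
    }
    return text, metadata


# Hand-written response; example.com is a placeholder domain
sample_response = {
    "answer": "(example answer)",
    "results": [{"url": "https://example.com/a", "title": "A"}, {"title": "no url here"}],
}
memory_text, memory_meta = build_memory_fragment("Example question?", sample_response)
print(memory_text)
```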

### Agent

In [None]:
# TODO: Create your Agent abstraction using StateMachine
# Equip with an appropriate model
# Craft a good set of instructions 
# Plug all Tools you developed

tools = [retrieve_game, evaluate_retrieval, game_web_search]

INSTRUCTIONS = (
    "You are UdaPlay, an AI research agent for the video game industry.\n"
    "You have tools to retrieve from a local VectorDB and to search the web.\n\n"
    "Workflow you MUST follow:\n"
    "1) Call retrieve_game first.\n"
    "2) Call evaluate_retrieval(question, retrieved_docs).\n"
    "3) If should_web_search is true, call game_web_search(question).\n\n"
    "Final answer requirements:\n"
    "- Provide a concise response and include citations (URLs) if web search was used.\n"
    "- If you cannot find the answer, say you don't know. Do not fabricate.\n"
)

agent = Agent(
    model_name="gpt-4o-mini",
    instructions=INSTRUCTIONS,
    tools=tools,
    temperature=0.2,
)

agent

In [None]:
# TODO: Invoke your agent
# - When was Pokémon Gold and Silver released?
# - Which one was the first 3D platformer Mario game?
# - Was Mortal Kombat X released for PlayStation 5?

def print_trace(messages: List[BaseMessage]):
    for m in messages:
        tool_calls = getattr(m, "tool_calls", None)
        if tool_calls:
            print(f"-> role={m.role} tool_calls={[tc.function.name for tc in tool_calls]}")
        else:
            print(f"-> role={m.role} content={str(m.content)[:200]}")

queries = [
    "When was Pokémon Gold and Silver released?",
    "Which one was the first 3D platformer Mario game?",
    "Was Mortal Kombat X released for PlayStation 5?",
]

for q in queries:
    print("\n" + "=" * 90)
    print("USER:", q)
    run = agent.invoke(q, session_id="demo")
    final_state = run.get_final_state()
    messages = final_state.get("messages", []) if final_state else []
    print("\nTRACE:")
    print_trace(messages)
    if messages:
        print("\nFINAL ANSWER:")
        print(messages[-1].content)

### (Optional) Advanced

In [None]:
# TODO: Update your agent with long-term memory
# TODO: Convert the agent to be a state machine, with the tools being pre-defined nodes

# Note: In this implementation, long-term memory is already persisted via the
# `udaplay_long_term_memory` ChromaDB collection inside `game_web_search()`.
# The `Agent` is already implemented using a StateMachine in lib/agents.py.
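
Stored web results age, so a memory lookup may want to ignore old fragments before reusing them. A minimal sketch of such a freshness filter follows, assuming each record carries the ISO `timestamp` metadata written by `game_web_search()`. `fresh_memories` and the 30-day default are hypothetical choices, not part of the project.

```python
from datetime import datetime, timedelta
from typing import Any, Dict, List, Optional


def fresh_memories(
    records: List[Dict[str, Any]],
    max_age_days: int = 30,
    now: Optional[datetime] = None,
) -> List[Dict[str, Any]]:
    """Keep only records whose ISO timestamp is at most `max_age_days` old."""
    now = now or datetime.now()
    cutoff = now - timedelta(days=max_age_days)

    kept: List[Dict[str, Any]] = []
    for rec in records:
        try:
            ts = datetime.fromisoformat(rec["timestamp"])
            is_fresh = ts >= cutoff
        except (KeyError, TypeError, ValueError):
            # Missing, malformed, or timezone-mismatched timestamps are skipped
            continue
        if is_fresh:
            kept.append(rec)
    return kept


# Illustrative metadata records with a fixed "now" so the result is reproducible
records = [
    {"question": "recent", "timestamp": "2025-01-20T10:00:00"},
    {"question": "stale", "timestamp": "2024-06-01T10:00:00"},
    {"question": "broken", "timestamp": "garbage"},
    {"question": "missing"},
]
kept = fresh_memories(records, max_age_days=30, now=datetime(2025, 1, 31))
print([r["question"] for r in kept])
```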