# [STARTER] Udaplay Project

## Part 02 - Agent

In this part of the project, you'll expose your vector database to your Agent as a tool.

You're building UdaPlay, an AI Research Agent for the video game industry. The agent will:
1. Answer questions using internal knowledge (RAG)
2. Search the web when needed
3. Maintain conversation state
4. Return structured outputs
5. Store useful information for future use

### Setup

In [17]:
# Only needed for Udacity workspace

import importlib.util
import sys

# Check if 'pysqlite3' is available before importing
if importlib.util.find_spec("pysqlite3") is not None:
    import pysqlite3
    sys.modules['sqlite3'] = sys.modules.pop('pysqlite3')
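The swap above exists because Chroma refuses to start when the interpreter's bundled SQLite is too old. As a quick diagnostic, the sketch below compares the bundled version against a minimum of 3.35.0. That minimum is an assumption taken from Chroma's documented requirement, so check it against the release you have installed. The helper names are hypothetical.

```python
import sqlite3

# Assumed minimum SQLite version for Chroma; verify against your installed release.
MIN_SQLITE = (3, 35, 0)


def sqlite_version_tuple(version: str = sqlite3.sqlite_version) -> tuple:
    """Turn a version string such as '3.45.1' into a comparable tuple of ints."""
    return tuple(int(part) for part in version.split("."))


def sqlite_is_recent_enough(version: str = sqlite3.sqlite_version) -> bool:
    """Return True when the given SQLite version meets the assumed minimum."""
    return sqlite_version_tuple(version) >= MIN_SQLITE


# Report the SQLite library version bundled with this interpreter.
print(sqlite3.sqlite_version, sqlite_is_recent_enough())
```

If this reports `False` outside the Udacity workspace, installing `pysqlite3-binary` and keeping the swap cell is one common workaround.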

In [18]:
# Import the necessary libs
import os
import chromadb
from chromadb.utils import embedding_functions
from lib.agents import Agent
from lib.llm import LLM
from lib.messages import BaseMessage, UserMessage, SystemMessage, ToolMessage, AIMessage
from lib.tooling import tool
from dotenv import load_dotenv
from typing import List, Optional, Dict, Any
from tavily import TavilyClient
from datetime import datetime

In [19]:
# Load environment variables
load_dotenv()

OPENAI_API_KEY = os.getenv("OPENAI_API_KEY")
TAVILY_API_KEY = os.getenv("TAVILY_API_KEY")
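`os.getenv` returns `None` when a variable is missing, and the failure then surfaces later as an authentication error. A small fail-fast helper can make the cause obvious. This is a minimal sketch; `require_env` is a hypothetical name, not part of the project's `lib` package.

```python
import os


def require_env(name: str) -> str:
    """Return the value of an environment variable, or raise a clear error."""
    value = os.getenv(name)
    if not value:
        raise RuntimeError(
            f"Missing environment variable: {name}. Add it to your .env file."
        )
    return value


# Example usage (commented out so the sketch runs without real keys):
# OPENAI_API_KEY = require_env("OPENAI_API_KEY")
# TAVILY_API_KEY = require_env("TAVILY_API_KEY")
```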

In [20]:
tavily_client = TavilyClient(api_key=TAVILY_API_KEY)

### Tools

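Build at least 3 tools:
- retrieve_game: Searches the vector DB
- evaluate_retrieval: Assesses whether the retrieved results are good enough to answer the question
- game_web_search: Searches the web when the retrieved results are insufficient (implemented below as web_search)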
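The three tools form a retrieve, evaluate, fall-back pipeline. In the notebook the LLM decides when to call each tool, guided by the agent's instructions. The sketch below hard-codes the same order with plain callables so the control flow is easy to see. All names here (`answer_with_fallback` and the `demo_*` stubs) are hypothetical stand-ins, not the real tools.

```python
from typing import Callable, Dict, List


def answer_with_fallback(
    question: str,
    retrieve: Callable[[str], List[str]],
    evaluate: Callable[[str, List[str]], Dict],
    search_web: Callable[[str], Dict],
) -> Dict:
    """Retrieve first, judge the results, and only then fall back to the web."""
    docs = retrieve(question)
    report = evaluate(question, docs)
    if report["useful"]:
        return {"source": "vector_db", "context": docs, "reason": report["description"]}
    web = search_web(question)
    return {"source": "web", "context": web, "reason": report["description"]}


# Stub tools that stand in for the vector DB, the LLM judge, and Tavily.
def demo_retrieve(question: str) -> List[str]:
    return ["[Game Boy Color] Pokémon Gold and Silver (1999)"]


def demo_evaluate(question: str, docs: List[str]) -> Dict:
    return {"useful": False, "description": "No platform details in the documents."}


def demo_search(question: str) -> Dict:
    return {"answer": "stub web answer", "results": []}


result = answer_with_fallback(
    "Was Mortal Kombat X released for PlayStation 5?",
    demo_retrieve,
    demo_evaluate,
    demo_search,
)
print(result["source"])  # prints: web
```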


#### Retrieve Game Tool

In [21]:
chroma_client = chromadb.PersistentClient(path="chromadb")
embedding_fn = embedding_functions.OpenAIEmbeddingFunction(
    api_base="https://openai.vocareum.com/v1",
    api_key=OPENAI_API_KEY,
)
collection = chroma_client.get_collection("udaplay", embedding_function=embedding_fn)

In [None]:
@tool
def retrieve_game(query: str, n_results: int = 5):
    """
    Perform semantic search query and return list of game info dicts.

    Args:
        query (str): Query string about the game industry.
        n_results (int): Number of top results to return.

    Returns:
        List[dict]: List of games where each dict contains:
            - Platform
            - Name
            - YearOfRelease
            - Description
    """
    # Ensure single query wrapped in list
    query_texts = [query]

    results = collection.query(
        query_texts=query_texts,
        n_results=n_results,
        include=['documents', 'distances', 'metadatas']
    )

    # Chroma returns one sublist per query; only one query was sent,
    # so take the first (and only) sublist of metadata
    metadatas = results.get('metadatas', [[]])[0]

    # Build the list of game dictionaries with requested fields
    games = []
    for metadata in metadatas:
        game_info = {
            "Platform": metadata.get("Platform"),
            "Name": metadata.get("Name"),
            "YearOfRelease": metadata.get("YearOfRelease"),
            "Description": metadata.get("Description")
        }
        games.append(game_info)

    return games
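The tool above requests `distances` but returns only metadata. A Chroma query result is a dict of parallel lists with one sublist per query, so distances can be paired with metadata and used to drop weak matches. The sketch below runs on a hand-written dict of that shape. The distance values and the `max_distance` threshold are made up for illustration, and `flatten_query_results` is a hypothetical helper.

```python
from typing import Any, Dict, List


def flatten_query_results(
    results: Dict[str, Any], max_distance: float = 1.0
) -> List[Dict[str, Any]]:
    """Pair each metadata dict with its distance and keep only close matches."""
    # "or [[]]" also covers the case where the key exists but holds None.
    metadatas = (results.get("metadatas") or [[]])[0]
    distances = (results.get("distances") or [[]])[0]
    hits = []
    for metadata, distance in zip(metadatas, distances):
        if distance <= max_distance:
            hits.append({**metadata, "distance": distance})
    return hits


# Hand-written stand-in for a single-query result (values are illustrative).
fake_results = {
    "metadatas": [[{"Name": "Pokémon Gold and Silver"}, {"Name": "Super Mario 64"}]],
    "distances": [[0.21, 1.35]],
}

close_hits = flatten_query_results(fake_results, max_distance=1.0)
print([hit["Name"] for hit in close_hits])
```

A sensible threshold depends on the embedding model and the distance metric of the collection, so it is worth tuning against real queries.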

#### Testing the retrieve_game tool

In [23]:
results = retrieve_game(query="Best RPG games on Game Boy", n_results=3)

In [24]:
results

[{'Platform': 'Game Boy Color',
  'Name': 'Pokémon Gold and Silver',
  'YearOfRelease': 1999,
  'Description': 'Second-generation Pokémon games introducing new regions, Pokémon, and gameplay mechanics.'},
 {'Platform': 'Game Boy Advance',
  'Name': 'Pokémon Ruby and Sapphire',
  'YearOfRelease': 2002,
  'Description': 'Third-generation Pokémon games set in the Hoenn region, featuring new Pokémon and double battles.'},
 {'Platform': 'Nintendo 64',
  'Name': 'Super Mario 64',
  'YearOfRelease': 1996,
  'Description': "A groundbreaking 3D platformer that set new standards for the genre, featuring Mario's quest to rescue Princess Peach."}]
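`retrieve_game` returns dicts, while `evaluate_retrieval` expects a list of strings. A small formatter can bridge the two. The sketch below produces the `[Platform] Name (Year) - Description` layout used by the sample documents in the evaluation test. `format_game_doc` is a hypothetical helper name.

```python
def format_game_doc(game: dict) -> str:
    """Render one game dict as a single line of text for the LLM judge."""
    return (
        f"[{game['Platform']}] {game['Name']} "
        f"({game['YearOfRelease']}) - {game['Description']}"
    )


sample_game = {
    "Platform": "Game Boy Color",
    "Name": "Pokémon Gold and Silver",
    "YearOfRelease": 1999,
    "Description": "Second-generation Pokémon games introducing new regions, "
                   "Pokémon, and gameplay mechanics.",
}

print(format_game_doc(sample_game))
```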

#### Evaluate Retrieval Tool

In [25]:
from pydantic import BaseModel, Field
from lib.llm import LLM
from lib.parsers import PydanticOutputParser

In [26]:
llm = LLM(model="gpt-4o", base_url="https://openai.vocareum.com/v1", api_key=OPENAI_API_KEY)

In [27]:
class EvaluationReport(BaseModel):
    useful: bool = Field(description="Whether the documents are useful to answer the question")
    description: str = Field(description="Detailed explanation about the evaluation result")

In [28]:
class RetrievalEvaluator:
    """
    Tool to evaluate if retrieved documents are sufficient to answer the query.
    Uses an LLM as a judge with structured output parsing.
    """

    def __init__(self, llm):
        self.llm = llm
        self.parser = PydanticOutputParser(model_class=EvaluationReport)

    def evaluate_retrieval(self, question: str, retrieved_docs: list[str]) -> EvaluationReport:
        """
        Evaluate the usability of retrieved documents for answering the question.

        Args:
            question (str): Original user question.
            retrieved_docs (list[str]): List of retrieved documents (texts).

        Returns:
            EvaluationReport: Structured evaluation with usefulness and explanation.
        """

        # Compose prompt to instruct the LLM judge
        prompt = f"""
        Your task is to evaluate if the documents are enough to respond to the query.

        User Query:
        {question}

        Retrieved Documents:
        {chr(10).join(f"- {doc}" for doc in retrieved_docs)}

        Please answer in JSON format with the following fields:
        - useful: true if the documents are sufficient to answer the question, false otherwise
        - description: provide a detailed explanation to justify your assessment, so that it is possible to decide whether to accept or reject these documents for answering the query.
        """

        # Invoke LLM with prompt, asking for structured output
        response = self.llm.invoke(input=prompt, response_format=EvaluationReport)

        # Parse the LLM response into EvaluationReport, with fallback for errors
        try:
            evaluation = self.parser.parse(response)
        except Exception as e:
            # Fallback: if parsing fails, return a default negative evaluation with explanation
            evaluation = EvaluationReport(
                useful=False,
                description=f"Failed to parse LLM evaluation response: {e}. Raw response: {response.content}"
            )

        return evaluation
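The fallback in `evaluate_retrieval` treats an unparseable judge response as "not useful", which pushes the agent toward the web search instead of crashing. The same pattern can be sketched with only the standard library. Here `Report` and `parse_report` are hypothetical stand-ins for `EvaluationReport` and `PydanticOutputParser`, whose internals are not shown in this notebook.

```python
import json
from dataclasses import dataclass


@dataclass
class Report:
    useful: bool
    description: str


def parse_report(raw: str) -> Report:
    """Parse the judge's JSON; default to a negative verdict on any failure."""
    try:
        data = json.loads(raw)
        # Only a real JSON boolean true counts; the string "true" does not.
        return Report(useful=data["useful"] is True, description=str(data["description"]))
    except (json.JSONDecodeError, KeyError, TypeError) as exc:
        return Report(useful=False, description=f"Could not parse judge output: {exc}")


good = parse_report('{"useful": true, "description": "Covers the release year."}')
bad = parse_report("Sure! Here is my evaluation...")
print(good.useful, bad.useful)  # prints: True False
```

Defaulting to `useful=False` is the conservative choice: a parsing failure costs one extra web search rather than an answer built on unverified documents.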

In [None]:
@tool
def evaluate_retrieval(question: str, retrieved_docs: list[str]) -> dict:
    """
    Based on the user's question and the list of retrieved documents,
    analyze the usability of the documents to respond to that question.

    Args:
        question (str): Original user question.
        retrieved_docs (list[str]): Retrieved documents most similar to the user query.

    Returns:
        dict: {
            "useful": bool,           # Whether the documents are useful to answer the question
            "description": str        # Detailed explanation about the evaluation result
        }
    """
    evaluator = RetrievalEvaluator(llm)

    evaluation_report = evaluator.evaluate_retrieval(question, retrieved_docs)

    # Convert EvaluationReport Pydantic model to dict and return relevant fields
    return {
        "useful": evaluation_report.useful,
        "description": evaluation_report.description
    }



#### Testing the evaluate retrieval tool

In [30]:
sample_question = "What are the most popular RPG games on Game Boy?"
sample_docs = [
    "[Game Boy Color] Pokémon Gold and Silver (1999) - Second-generation Pokémon games introducing new regions, Pokémon, and gameplay mechanics.",
    "[Game Boy Advance] Pokémon Ruby and Sapphire (2002) - Third-generation Pokémon games set in the Hoenn region, featuring new Pokémon and double battles."
]

report = evaluate_retrieval(sample_question, sample_docs)

In [31]:
print(report)

{'useful': False, 'description': "The retrieved documents focus on Pokémon games, specifically Pokémon Gold and Silver for the Game Boy Color and Pokémon Ruby and Sapphire for the Game Boy Advance. While Pokémon games are indeed popular RPGs on the Game Boy platform, the query asks for the most popular RPG games on the Game Boy, which includes both the original Game Boy and Game Boy Color. \n\nThe documents do not provide a comprehensive list or mention other popular RPGs that were available on the original Game Boy, such as:\n\n1. **Final Fantasy Legend Series**: Known as the SaGa series in Japan, these games were among the first RPGs available on the Game Boy and were quite popular.\n2. **The Legend of Zelda: Link's Awakening**: Although more of an action-adventure game, it is often included in discussions about RPGs due to its gameplay style.\n3. **Dragon Warrior Monsters**: A spin-off of the Dragon Quest series, this game was well-received and popular among RPG fans.\n\nThe documen

#### Game Web Search Tool

In [None]:
@tool
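def web_search(query: str, search_depth: str = "advanced") -> Dict:
    """
    Search the web using the Tavily API.

    Args:
        query (str): Search query.
        search_depth (str): Type of search, either 'basic' or 'advanced' (default: 'advanced').

    Returns:
        Dict: The Tavily answer, the list of result entries, and search
            metadata (timestamp and original query).
    """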

    # Perform the search
    search_result = tavily_client.search(
        query=query,
        search_depth=search_depth,
        include_answer=True,
        include_raw_content=False,
        include_images=False
    )
    
    # Format the results
    formatted_results = {
        "answer": search_result.get("answer", ""),
        "results": search_result.get("results", []),
        "search_metadata": {
            "timestamp": datetime.now().isoformat(),
            "query": query
        }
    }
    
    return formatted_results 
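`web_search` passes every result through unchanged, and the raw payload can be long. One option is to keep only the top-scoring entries and shorten their text before handing them to the LLM. The sketch below assumes each result carries the `title`, `url`, `content`, and `score` fields visible in the test output further down. `top_web_results` and its defaults are hypothetical.

```python
from typing import Any, Dict, List


def top_web_results(
    search_result: Dict[str, Any], k: int = 3, max_chars: int = 300
) -> List[Dict[str, Any]]:
    """Keep the k highest-scoring results and truncate their content."""
    ranked = sorted(
        search_result.get("results", []),
        key=lambda r: r.get("score", 0.0),
        reverse=True,
    )
    return [
        {
            "title": r.get("title", ""),
            "url": r.get("url", ""),
            "content": r.get("content", "")[:max_chars],
            "score": r.get("score", 0.0),
        }
        for r in ranked[:k]
    ]


# Hand-written stand-in for a search response (values are illustrative).
fake_search = {
    "answer": "stub",
    "results": [
        {"title": "B", "url": "https://example.com/b", "content": "y" * 500, "score": 0.4},
        {"title": "A", "url": "https://example.com/a", "content": "x" * 500, "score": 0.9},
    ],
}

trimmed = top_web_results(fake_search, k=1, max_chars=50)
print(trimmed[0]["title"], len(trimmed[0]["content"]))  # prints: A 50
```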

#### Testing the web_search tool

In [33]:
result = web_search(query="Who is the current president in 2025?")

In [34]:
print(result)

{'answer': 'Donald Trump is the current president in 2025. He is the 47th president, inaugurated on January 20, 2025. His administration includes Vice President J.D. Vance.', 'results': [{'url': 'https://en.wikipedia.org/wiki/President_of_the_United_States', 'title': 'President of the United States', 'content': 'Donald Trump is the 47th and current president since January 20, 2025.', 'score': 0.8740022, 'raw_content': None}, {'url': 'https://www.usa.gov/presidents', 'title': 'Presidents, vice presidents, and first ladies | USAGov', 'content': 'Learn about the duties of president, vice president, and first lady of the United States. Find out how to contact and learn more about current and past leaders.\n\n## President of the United States\n\nThe president of the United States is the:\n\n### Current president\n\nThe 47th and current president of the United States is Donald John Trump. He was sworn into office on January 20, 2025.\n\n### Former U.S. presidents [...] The vice president of 

### Agent

In [35]:
tools = [retrieve_game, evaluate_retrieval, web_search]

In [36]:
# Create your Agent abstraction using StateMachine
# Equip with an appropriate model
# Craft a good set of instructions 
# Plug all Tools you developed

simple_agent = Agent(
    model_name="gpt-4o",
    instructions=(
        "You are a web-aware assistant that can search for up-to-date information. "
        "For each query, you must use your retrieve_game tool to get relevant information from the vector database to answer the question. "
        "After you receive the results of that tool, use the evaluate_retrieval tool to see if the results are sufficient to answer the question. "
        "If they are deemed sufficient and useful is true, then return the answer. "
        "If they are deemed insufficient and useful is false, fall back to using the web_search tool to look up the query on the internet. "
        "The web search results can then be used to draft your final answer. "
        "Always cite your sources and explain any discrepancies found.\n"
    ),
    tools=tools,
    base_url="https://openai.vocareum.com/v1",
    api_key=OPENAI_API_KEY,
)

In [37]:
def print_messages(messages: List[BaseMessage]):
    for m in messages:
        print(f" -> (role = {m.role}, content = {m.content}, tool_calls = {getattr(m, 'tool_calls', None)})")
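Tool results and system prompts can make each printed line very long. A variant that truncates the content keeps the trace readable. This sketch uses a small dataclass as a hypothetical stand-in for `BaseMessage` and assumes only the `role`, `content`, and `tool_calls` attributes that `print_messages` already reads.

```python
from dataclasses import dataclass
from typing import Any, List, Optional


@dataclass
class DemoMessage:
    """Hypothetical stand-in for lib.messages.BaseMessage."""
    role: str
    content: Optional[str] = None
    tool_calls: Optional[List[Any]] = None


def format_message(m: Any, max_chars: int = 120) -> str:
    """Build one trace line, truncating long content and flagging tool calls."""
    content = m.content or ""
    if len(content) > max_chars:
        content = content[:max_chars] + "..."
    has_tool_calls = bool(getattr(m, "tool_calls", None))
    return f" -> (role = {m.role}, content = {content}, has_tool_calls = {has_tool_calls})"


demo = [
    DemoMessage(role="user", content="When was Pokémon Gold and Silver released?"),
    DemoMessage(role="tool", content="x" * 1000),
]
for message in demo:
    print(format_message(message, max_chars=60))
```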

In [38]:
queries = [
    "When was Pokémon Gold and Silver released?",
    "Which one was the first 3D platformer Mario game?",
    "Was Mortal Kombat X released for Playstation 5?"
]

for i, query in enumerate(queries):
    print(f"Query: {query}")
    run = simple_agent.invoke(query, session_id = i)
    messages = run.get_final_state()['messages']
    print_messages(messages)


Query: When was Pokémon Gold and Silver released?
[StateMachine] Starting: __entry__
[StateMachine] Executing step: message_prep
[StateMachine] Executing step: llm_processor
[StateMachine] Executing step: tool_executor
[StateMachine] Executing step: llm_processor
[StateMachine] Executing step: tool_executor
[StateMachine] Executing step: llm_processor
[StateMachine] Terminating: __termination__
 -> (role = system, content = You are a web-aware assistant that can search for update information For each query, you must use your retrieve_game tool to get relevant information from the vector database to answer the question. After you receive the results of that tool, use the evaluate_retrieval tool to see if the results are sufficient to answer the question. If they are deemed sufficient and useful is true, then return the answer. If they are deemed insufficient and useful is false, fall back to using the web_search tool to look up the query on the internet. The web search results can then 

#### Let's verify that our agent kept short-term memory of each session
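Short-term memory here means that messages are stored per `session_id`, so a follow-up question only sees the history of its own session. The `Agent` class's actual storage is not shown in this notebook. The sketch below is one simple way such a store could work, with `SessionMemory` as a hypothetical name.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


class SessionMemory:
    """Keep an ordered list of (role, content) pairs for each session."""

    def __init__(self) -> None:
        self._history: Dict[str, List[Tuple[str, str]]] = defaultdict(list)

    def append(self, session_id, role: str, content: str) -> None:
        # Normalise ids so that 0 and "0" refer to the same session.
        self._history[str(session_id)].append((role, content))

    def history(self, session_id) -> List[Tuple[str, str]]:
        # Return a copy so callers cannot mutate the stored history.
        return list(self._history[str(session_id)])


memory = SessionMemory()
memory.append(0, "user", "When was Pokémon Gold and Silver released?")
memory.append(0, "assistant", "In 1999, for the Game Boy Color.")
memory.append(1, "user", "Which one was the first 3D platformer Mario game?")

print(len(memory.history(0)), len(memory.history(1)))  # prints: 2 1
```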

In [39]:
for i in range(len(queries)):
    print(f"Session {i}:")
    query = "What have we talked about so far?"
    run = simple_agent.invoke(query, session_id = i)
    print(run.get_final_state()["messages"][-1].content)


Session 0:
[StateMachine] Starting: __entry__
[StateMachine] Executing step: message_prep
[StateMachine] Executing step: llm_processor
[StateMachine] Terminating: __termination__
You asked about the release date of Pokémon Gold and Silver, and I provided the information that they were released for the Game Boy Color in 1999.
Session 1:
[StateMachine] Starting: __entry__
[StateMachine] Executing step: message_prep
[StateMachine] Executing step: llm_processor
[StateMachine] Terminating: __termination__
We discussed the first 3D platformer Mario game, which was identified as *Super Mario 64*, released in 1996 for the Nintendo 64. It was notable for setting new standards in the 3D platform genre with its innovative camera system and control mechanics. Although earlier games like *Jumping Flash!* were also 3D platformers, *Super Mario 64* is often credited with defining the modern experience of the genre.
Session 2:
[StateMachine] Starting: __entry__
[StateMachine] Executing step: message_p