# [STARTER] UdaPlay Project

## Part 02 - Agent

In this part of the project, you'll plug your VectorDB into your Agent as a tool.

You're building UdaPlay, an AI Research Agent for the video game industry. The agent will:
1. Answer questions using internal knowledge (RAG)
2. Search the web when needed
3. Maintain conversation state
4. Return structured outputs
5. Store useful information for future use

### Setup

In [None]:
# Only needed for Udacity workspace

import importlib.util
import sys

# Check if 'pysqlite3' is available before importing
if importlib.util.find_spec("pysqlite3") is not None:
    import pysqlite3
    sys.modules['sqlite3'] = sys.modules.pop('pysqlite3')

In [1]:
# TODO: Import the necessary libs
# For example: 
import os

from lib.agents import Agent
from lib.llm import LLM
from lib.messages import UserMessage, SystemMessage, ToolMessage, AIMessage
from lib.state_machine import StateMachine, Step, EntryPoint, Termination, Run
from lib.tooling import ToolCall, tool
from dotenv import load_dotenv
from typing import List, Dict, Any, Optional, TypedDict
import chromadb
from chromadb.utils import embedding_functions
import json
from tavily import TavilyClient
from pydantic import BaseModel, Field

In [2]:
# TODO: Load environment variables
load_dotenv()

OPENAI_API_KEY = os.getenv("OPENAI_API_KEY")
TAVILY_API_KEY = os.getenv("TAVILY_API_KEY")

### Tools

Build at least 3 tools:
- retrieve_game: To search the vector DB
- evaluate_retrieval: To assess the retrieval performance
- game_web_search: To search the web when the retrieved documents are not good enough
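
The three tools are meant to chain: retrieve, judge the retrieval, and fall back to the web only when the judgment is negative. A minimal sketch of that control flow follows. The functions `fake_retrieve`, `fake_evaluate`, `fake_web_search`, and `answer` are hypothetical stand-ins that only mimic the tools' input and output shapes; they are not part of the project's `lib`, and the sample record is made up.

```python
from typing import Any, Callable, Dict, List

# Hypothetical stand-in for retrieve_game: matches on a made-up record.
def fake_retrieve(query: str) -> List[Dict[str, Any]]:
    docs = [{"Name": "Example Game", "Platform": "Example Console", "YearOfRelease": 2000}]
    return [d for d in docs if d["Name"].lower() in query.lower()]

# Hypothetical stand-in for evaluate_retrieval: "useful" if anything came back.
def fake_evaluate(question: str, docs: List[Dict[str, Any]]) -> Dict[str, Any]:
    useful = len(docs) > 0
    return {"useful": useful, "description": "found documents" if useful else "no documents"}

# Hypothetical stand-in for game_web_search.
def fake_web_search(question: str) -> str:
    return f"Web Search Results for: {question}"

def answer(
    question: str,
    retrieve: Callable[[str], List[Dict[str, Any]]],
    evaluate: Callable[[str, List[Dict[str, Any]]], Dict[str, Any]],
    web_search: Callable[[str], str],
) -> Dict[str, Any]:
    """Retrieve first, evaluate, and only then fall back to the web."""
    docs = retrieve(question)
    report = evaluate(question, docs)
    if report["useful"]:
        return {"source": "vector_db", "context": docs}
    return {"source": "web", "context": web_search(question)}

hit = answer("When was Example Game released?", fake_retrieve, fake_evaluate, fake_web_search)
miss = answer("Any recent industry news?", fake_retrieve, fake_evaluate, fake_web_search)
print(hit["source"], miss["source"])
```

In the real agent the LLM decides when to call each tool, so this ordering is enforced through the instructions rather than through a fixed function like `answer`.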


#### Retrieve Game Tool

In [3]:
# TODO: Create retrieve_game tool
# It should use chroma client and collection you created
# chroma_client = chromadb.PersistentClient(path="chromadb")
# collection = chroma_client.get_collection("udaplay")
# Tool Docstring:
#    Semantic search: Finds the most similar results in the vector DB
#    args:
#    - query: a question about the game industry.
#
#    You'll receive results as a list. Each element contains:
#    - Platform: like Game Boy, Playstation 5, Xbox 360...
#    - Name: Name of the Game
#    - YearOfRelease: Year when that game was released for that platform
#    - Description: Additional details about the game

chroma_client = chromadb.PersistentClient(path="chromadb")
collection = chroma_client.get_collection("udaplay")

@tool
def retrieve_game(query: str) -> List[Dict[str, Any]]:
    """
    Semantic search: Finds the most similar results in the vector DB.

    Args:
        query: a question about the game industry.

    Returns:
        A list of dictionaries. Each element contains:
        - Platform: like Game Boy, Playstation 5, Xbox 360...
        - Name: Name of the Game
        - YearOfRelease: Year when that game was released for that platform
        - Description: Additional details about the game
    """
    
    # Query the collection
    results = collection.query(
        query_texts=[query],
        n_results=5, # Retrieve top 5 similar documents
        include=['metadatas', 'documents']
    )
    
    formatted_results = []
    
    # Parse the results safely
    if results and results.get('metadatas') and results.get('documents'):
        # The query returns a list of lists (batch processing), we take the first index [0]
        for metadata, document in zip(results['metadatas'][0], results['documents'][0]):
            formatted_results.append({
                "Platform": metadata.get("Platform", "Unknown"),
                "Name": metadata.get("Name", "Unknown"),
                "YearOfRelease": metadata.get("YearOfRelease", "Unknown"),
                "Description": document 
            })
            
    return formatted_results
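
`collection.query` returns one inner list per query text, which is why the function indexes `[0]` before zipping metadata with documents. The sketch below reproduces that flattening on a hand-written dictionary in the same shape. No Chroma client is involved, the values are made up for illustration, and `flatten` is a hypothetical helper name.

```python
from typing import Any, Dict, List

# Hand-written stand-in for a Chroma query result with a single query text.
sample_results = {
    "metadatas": [[
        {"Platform": "Example Console", "Name": "Example Game", "YearOfRelease": 2000},
        {"Name": "Another Game"},  # missing keys fall back to "Unknown"
    ]],
    "documents": [["First description.", "Second description."]],
}

def flatten(results: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Pair each metadata entry with its document for the first (only) query."""
    rows = []
    for metadata, document in zip(results["metadatas"][0], results["documents"][0]):
        rows.append({
            "Platform": metadata.get("Platform", "Unknown"),
            "Name": metadata.get("Name", "Unknown"),
            "YearOfRelease": metadata.get("YearOfRelease", "Unknown"),
            "Description": document,
        })
    return rows

rows = flatten(sample_results)
for row in rows:
    print(row)
```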

#### Evaluate Retrieval Tool

In [None]:
# TODO: Create evaluate_retrieval tool
# You might use an LLM as judge in this tool to evaluate the performance
# You need to prompt that LLM with something like:
# "Your task is to evaluate if the documents are enough to respond to the query. "
# "Give a detailed explanation, so it's possible to take an action to accept it or not."
# Use EvaluationReport to parse the result
# Tool Docstring:
#    Based on the user's question and on the list of retrieved documents, 
#    it will analyze the usability of the documents to respond to that question. 
#    args: 
#    - question: original question from user
#    - retrieved_docs: retrieved documents most similar to the user query in the Vector Database
#    The result includes:
#    - useful: whether the documents are useful to answer the question
#    - description: description about the evaluation result

class EvaluationReport(BaseModel):
    useful: bool = Field(description="whether the documents are useful to answer the question")
    description: str = Field(description="description about the evaluation result")

@tool
def evaluate_retrieval(question: str, retrieved_docs: Optional[List[Dict[str, Any]]] = None) -> Dict[str, Any]:
    """
    Based on the user's question and on the list of retrieved documents,
    analyzes whether the documents are usable to answer that question.

    Args:
        question: original question from the user
        retrieved_docs: documents retrieved from the vector DB for that question

    Returns:
        A dictionary with:
        - useful: whether the documents are useful to answer the question
        - description: description of the evaluation result
    """
    # 1. Safety check for a missing argument.
    #    Return a dict (not a JSON string) so every branch matches the annotation.
    if retrieved_docs is None:
        return {
            "useful": False,
            "description": "Error: 'retrieved_docs' was missing. Please pass the output of 'retrieve_game'."
        }

    # 2. Initialize the judge LLM (temperature 0 for repeatable verdicts)
    eval_llm = LLM(model="gpt-4o-mini", temperature=0)

    # 3. Build the message list (system prompt + user payload)
    docs_content = json.dumps(retrieved_docs, indent=2)
    messages = [
        SystemMessage(content="Your task is to evaluate if the documents are enough to respond to the query. Give a detailed explanation. Return a JSON with 'useful': true/false and a description."),
        UserMessage(content=f"User Question: {question}\n\nRetrieved Documents:\n{docs_content}")
    ]

    # 4. Invoke the LLM and parse the structured response
    try:
        response = eval_llm.invoke(
            input=messages,
            response_format=EvaluationReport
        )

        # Validate the JSON in the response content against EvaluationReport
        report = EvaluationReport.model_validate_json(response.content)
        return report.model_dump()

    except Exception as e:
        # Report the failure in the same shape as a normal evaluation
        return {
            "useful": False,
            "description": f"Evaluation failed: {str(e)}"
        }

#### Game Web Search Tool

In [6]:
# TODO: Create game_web_search tool
# Please use Tavily client to search the web
# Tool Docstring:
#    Web search: Finds the most relevant results on the web
#    args:
#    - question: a question about the game industry.

tavily_client = TavilyClient(api_key=os.getenv("TAVILY_API_KEY"))

@tool
def game_web_search(question: str) -> str:
    """
    Search the web for current or external information about the game industry.
    Use this when the internal database lacks information or for recent news.

    Args:
        question: a specific question about the game industry. 
        
    Returns:
        A summarized string of relevant web search results.
    """
    
    # Perform the search
    response = tavily_client.search(
        query=question,
        search_depth="advanced", 
        include_images=False,
        max_results=5,
        include_raw_content=False,
        include_answer=True # Tries to get a direct answer generated by Tavily
    )
    
    # Format the output
    search_results = ""
    
    # Check for a direct answer first
    answer = response.get("answer")
    if answer:
        search_results += f"Web Search Answer: {answer}\n\n"
    else:
        search_results += "Web Search Results:\n"
        
    # Append specific snippets for context
    for result in response.get("results", []):
        search_results += f"- Title: {result.get('title', 'No Title')}\n"
        search_results += f"  Snippet: {result.get('content', 'No Content')}\n"
        search_results += f"  Source: {result.get('url', 'No URL')}\n"
        search_results += "---\n"
        
    return search_results

### Agent

In [12]:
# TODO: Create your Agent abstraction using StateMachine
# Equip with an appropriate model
# Craft a good set of instructions 
# Plug all Tools you developed

# Define the Agent's behavior
AGENT_INSTRUCTIONS = """
You are UdaPlay, an AI Research Agent specializing in the video game industry. 
Your goal is to answer user questions accurately using the available tools.

**Your workflow must be:**
1. **Initial Check:** For any question about games, history, or platforms, first use the `retrieve_game` tool to check internal knowledge.
2. **Evaluation:** Use the `evaluate_retrieval` tool to check if the retrieved documents are sufficient to answer the question.
3. **Action:**
    - If the documents are marked as `useful: true`, synthesize the final answer ONLY from those documents and cite the specific document used.
    - If the documents are marked as `useful: false` or if the question is about recent news or external topics not found in the DB, proceed to use the `game_web_search` tool.
4. **Final Answer:** Always provide a clear, concise, and direct answer to the user's question based on the most reliable source (VectorDB or Web Search). If the answer comes from the web, include the source URL in the response.
5. **Memory:** Maintain context from previous turns to answer follow-up questions effectively.
"""

# Equip the agent with the necessary model and tools
udaplay_agent = Agent(
    model_name="gpt-4o", # Use a powerful model for complex reasoning and tool use
    instructions=AGENT_INSTRUCTIONS,
    tools=[retrieve_game, evaluate_retrieval, game_web_search],
)

In [13]:
# TODO: Invoke your agent
# - When Pokémon Gold and Silver was released?
# - Which one was the first 3D platformer Mario game?
# - Was Mortal Kombat X released for Playstation 5?

questions = [
    "When Pokémon Gold and Silver was released?",          
    "Which one was the first 3D platformer Mario game?", 
    "Was Mortal Kombat X released for Playstation 5?",
    "What was the release year of that Mario game you just mentioned?"
]

print(f"Testing UdaPlay Agent with {len(questions)} queries...\n")

for q in questions:
    print(f"--- USER: {q} ---")
    
    run_object = udaplay_agent.invoke(q)
    
    final_state = run_object.get_final_state()
    final_response = final_state["messages"][-1].content
    
    print(f"AGENT: {final_response}\n")

Testing UdaPlay Agent with 4 queries...

--- USER: When Pokémon Gold and Silver was released? ---
[StateMachine] Starting: __entry__
[StateMachine] Executing step: message_prep
[StateMachine] Executing step: llm_processor
[StateMachine] Executing step: tool_executor
[StateMachine] Executing step: llm_processor
[StateMachine] Executing step: tool_executor
[StateMachine] Executing step: llm_processor
[StateMachine] Executing step: tool_executor
[StateMachine] Executing step: llm_processor
[StateMachine] Executing step: tool_executor
[StateMachine] Executing step: llm_processor
[StateMachine] Executing step: tool_executor
[StateMachine] Executing step: llm_processor
[StateMachine] Executing step: tool_executor
[StateMachine] Executing step: llm_processor
[StateMachine] Terminating: __termination__
AGENT: Pokémon Gold and Silver were released in Japan on November 21, 1999, and in North America on October 15, 2000. [Source](https://www.pokemon.com/us/pokemon-video-games/pokemon-gold-version

In [14]:
# Check if memory is maintained between runs that share the same session id
SESSION_ID = "my_udaplay_session_001"

q1 = "What are the most famous games from 2010-2015?"
q2 = "On which platforms did those games run?"
q3 = "If you had to pick only the best game out of those, which one would you pick?"

print(udaplay_agent.invoke(q1, session_id=SESSION_ID).get_final_state()["messages"][-1].content)
print(udaplay_agent.invoke(q2, session_id=SESSION_ID).get_final_state()["messages"][-1].content)
print(udaplay_agent.invoke(q3, session_id=SESSION_ID).get_final_state()["messages"][-1].content)

[StateMachine] Starting: __entry__
[StateMachine] Executing step: message_prep
[StateMachine] Executing step: llm_processor
[StateMachine] Executing step: tool_executor
[StateMachine] Executing step: llm_processor
[StateMachine] Executing step: tool_executor
[StateMachine] Executing step: llm_processor
[StateMachine] Executing step: tool_executor
[StateMachine] Executing step: llm_processor
[StateMachine] Terminating: __termination__
The most famous games from 2010 to 2015 include:

1. **Red Dead Redemption** (2010) - Celebrated for its storytelling and open-world gameplay.
2. **The Last of Us** (2013) - Known for its emotional narrative and character development.
3. **Grand Theft Auto V** (2013) - Famous for its expansive open world and engaging missions.
4. **The Witcher 3: Wild Hunt** (2015) - Acclaimed for its rich story and open-world exploration.
5. **Dark Souls** (2011) - Renowned for its challenging gameplay and intricate world design.
6. **Elder Scrolls V: Skyrim** (2011) - Po

### (Optional) Advanced

In [None]:
# TODO: Update your agent with long-term memory
# TODO: Convert the agent to be a state machine, with the tools being pre-defined nodes
# TODO: Visualization: Create a dashboard or visualization of the agent’s retrieval process or knowledge base.
# TODO: Structured Output: Return answers in both natural language and structured JSON for easy integration.
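
For the structured-output item, one possible shape is a small record that carries the natural-language answer alongside machine-readable fields. This is only a sketch: `AgentAnswer` and its fields are hypothetical names, the values are placeholders, and it uses standard-library dataclasses rather than any class from the project's `lib`.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import List

@dataclass
class AgentAnswer:
    """Hypothetical container pairing the prose answer with structured fields."""
    answer: str                     # natural-language reply shown to the user
    source: str                     # assumed values: "vector_db" or "web"
    citations: List[str] = field(default_factory=list)  # URLs or document names

    def to_json(self) -> str:
        # ensure_ascii=False keeps characters such as "é" readable in the output
        return json.dumps(asdict(self), ensure_ascii=False)

result = AgentAnswer(
    answer="Example answer text.",
    source="web",
    citations=["https://example.com"],
)
payload = result.to_json()
print(result.answer)   # natural-language form
print(payload)         # structured JSON form
```

A Pydantic model, as already used for `EvaluationReport`, would work equally well here and adds validation; the dataclass is used only to keep the sketch dependency-free.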
