# [STARTER] UdaPlay Project

## Part 02 - Agent

In this part of the project, you'll expose your VectorDB to your Agent as a tool.

You're building UdaPlay, an AI Research Agent for the video game industry. The agent will:
1. Answer questions using internal knowledge (RAG)
2. Search the web when needed
3. Maintain conversation state
4. Return structured outputs
5. Store useful information for future use

### Setup

In [19]:
# Only needed for Udacity workspace

import importlib.util
import sys

# Check if 'pysqlite3' is available before importing
if importlib.util.find_spec("pysqlite3") is not None:
    import pysqlite3
    sys.modules['sqlite3'] = sys.modules.pop('pysqlite3')

In [20]:
import os
import json
import chromadb
from chromadb.utils import embedding_functions
from dotenv import load_dotenv
from tavily import TavilyClient
from pydantic import BaseModel, Field

from lib.agents import Agent
from lib.llm import LLM
from lib.messages import UserMessage, SystemMessage, ToolMessage, AIMessage
from lib.tooling import tool
from lib.parsers import PydanticOutputParser

In [21]:
load_dotenv()

OPENAI_API_KEY = os.getenv("OPENAI_API_KEY")
TAVILY_API_KEY = os.getenv("TAVILY_API_KEY")

### Tools

Build at least 3 tools:
- retrieve_game: To search the vector DB
- evaluate_retrieval: To assess the retrieval performance
- game_web_search: To search the web when the retrieved documents are not sufficient
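
The three tools are meant to chain: retrieve first, judge the retrieval, and fall back to the web only when the judgment is negative. The sketch below shows that control flow in plain Python. `answer_with_fallback` and the stub callables are hypothetical stand-ins and are not part of `lib`; in the notebook, the LLM agent decides the order by following its instructions, and the `@tool`-decorated objects may not be directly callable like this.

```python
import json
from typing import Callable


def answer_with_fallback(
    question: str,
    retrieve: Callable[[str], str],
    evaluate: Callable[[str, str], str],
    web_search: Callable[[str], str],
) -> dict:
    """Run retrieve -> evaluate -> (optional) web search and report the source used."""
    docs = retrieve(question)
    verdict = json.loads(evaluate(question, docs))
    if verdict.get("useful"):
        return {"source": "internal", "context": docs}
    # Internal knowledge was judged insufficient, so fall back to the web.
    return {"source": "web", "context": web_search(question)}


# Stub tools (hypothetical) so the flow can be exercised without API keys.
def fake_retrieve(question: str) -> str:
    return json.dumps([{"Name": "Pokémon Gold and Silver", "YearOfRelease": 1999}])


def fake_evaluate(question: str, docs: str) -> str:
    return json.dumps({"useful": "Pokémon" in question, "description": "stub verdict"})


def fake_web_search(question: str) -> str:
    return json.dumps([{"title": "stub", "url": "https://example.com", "content": "stub"}])


internal = answer_with_fallback(
    "When was Pokémon Gold released?", fake_retrieve, fake_evaluate, fake_web_search
)
fallback = answer_with_fallback(
    "Was Mortal Kombat X released for PS5?", fake_retrieve, fake_evaluate, fake_web_search
)
print(internal["source"], fallback["source"])
```

Passing the tools in as callables keeps the decision logic testable without OpenAI or Tavily credentials.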


#### Retrieve Game Tool

In [22]:
# Initialize ChromaDB client and collection
chroma_client = chromadb.PersistentClient(path="chromadb")
embedding_fn = embedding_functions.OpenAIEmbeddingFunction(
    api_key=OPENAI_API_KEY,
    model_name="text-embedding-3-small"
)
collection = chroma_client.get_collection("udaplay_games", embedding_function=embedding_fn)

@tool
def retrieve_game(query: str) -> str:
    """
    Semantic search: Finds most similar results in the vector DB
    
    Args:
        query: a question about the game industry
    
    Returns:
        Results as a list. Each element contains:
        - Platform: like Game Boy, PlayStation 5, Xbox 360...
        - Name: Name of the Game
        - YearOfRelease: Year when that game was released for that platform
        - Description: Additional details about the game
    """
    results = collection.query(
        query_texts=[query],
        n_results=5
    )
    
    # Format results
    formatted_results = []
    if results['metadatas'] and results['metadatas'][0]:
        for metadata in results['metadatas'][0]:
            formatted_results.append({
                "Platform": metadata.get("Platform", "Unknown"),
                "Name": metadata.get("Name", "Unknown"),
                "YearOfRelease": metadata.get("YearOfRelease", "Unknown"),
                "Description": metadata.get("Description", "No description available")
            })
    
    return json.dumps(formatted_results, indent=2)

#### Evaluate Retrieval Tool

In [23]:
# Define EvaluationReport model for structured output
class EvaluationReport(BaseModel):
    """Evaluation of retrieved documents"""
    useful: bool = Field(description="Whether the documents are useful to answer the question")
    description: str = Field(description="Detailed explanation about the evaluation result")

@tool
def evaluate_retrieval(question: str, retrieved_docs: str) -> str:
    """
    Based on the user's question and on the list of retrieved documents,
    it will analyze the usability of the documents to respond to that question.
    
    Args:
        question: original question from user
        retrieved_docs: retrieved documents most similar to the user query in the Vector Database
    
    Returns:
        The result includes:
        - useful: whether the documents are useful to answer the question
        - description: description about the evaluation result
    """
    # Create LLM judge
    llm_judge = LLM(model="gpt-4o-mini", temperature=0.0, api_key=OPENAI_API_KEY)
    
    # Prompt the LLM to evaluate
    evaluation_prompt = f"""Your task is to evaluate whether the documents are sufficient to answer the query.
Give a detailed explanation so that a decision can be made to accept or reject them.

User Question: {question}

Retrieved Documents:
{retrieved_docs}

Evaluate whether these documents contain sufficient information to answer the user's question."""
    
    # Use structured output
    response = llm_judge.invoke(
        input=evaluation_prompt,
        response_format=EvaluationReport
    )
    
    # Parse the response
    parser = PydanticOutputParser(model_class=EvaluationReport)
    evaluation = parser.parse(response)
    
    return json.dumps({
        "useful": evaluation.useful,
        "description": evaluation.description
    }, indent=2)

#### Game Web Search Tool

In [24]:
# Initialize Tavily client
tavily_client = TavilyClient(api_key=TAVILY_API_KEY)

@tool
def game_web_search(question: str) -> str:
    """
    Web search: Searches the web for information about games
    
    Args:
        question: a question about the game industry
    
    Returns:
        Search results from the web with relevant information
    """
    # Search the web using Tavily
    response = tavily_client.search(
        query=question,
        max_results=5
    )
    
    # Format the results
    results = []
    for result in response.get('results', []):
        results.append({
            "title": result.get("title", ""),
            "url": result.get("url", ""),
            "content": result.get("content", "")
        })
    
    return json.dumps(results, indent=2)

### Agent

In [25]:
# Create the UdaPlay Agent
udaplay_agent = Agent(
    model_name="gpt-4o-mini",
    instructions="""You are UdaPlay, an AI Research Agent specialized in the video game industry.

Your role is to help users find information about video games by:
1. First searching your internal knowledge base (vector database) using retrieve_game
2. Evaluating if the retrieved information is sufficient using evaluate_retrieval
3. If the internal knowledge is not sufficient, searching the web using game_web_search
4. Providing clear, accurate, and helpful answers

Always follow this workflow:
- Start by retrieving information from the internal database
- Evaluate if the retrieved documents are useful
- Only search the web if the internal knowledge is insufficient
- Synthesize information from multiple sources when needed
- Be concise but informative in your responses""",
    tools=[retrieve_game, evaluate_retrieval, game_web_search],
    temperature=0.7
)

print("✓ UdaPlay Agent created successfully")
print("✓ Model: gpt-4o-mini")
print(f"✓ Tools: {len(udaplay_agent.tools)} tools available")

✓ UdaPlay Agent created successfully
✓ Model: gpt-4o-mini
✓ Tools: 3 tools available


In [26]:
# Test the agent with sample queries using the same session to demonstrate conversation state
queries = [
    "When was Pokémon Gold and Silver released?",
    "Which one was the first 3D platformer Mario game?",
    "Was Mortal Kombat X released for Playstation 5?"
]

# Use the same session_id for all queries to maintain conversation state
session_id = "demo_session"

for i, query in enumerate(queries, 1):
    print(f"\n{'='*80}")
    print(f"Query {i}: {query}")
    print('='*80)

    # Invoke the agent with the same session_id
    run = udaplay_agent.invoke(query, session_id=session_id)

    # Get the final state
    final_state = run.get_final_state()

    if final_state and final_state.get("messages"):
        messages = final_state["messages"]

        # Track sources for citations
        sources = []
        used_internal_db = False
        used_web_search = False
        web_urls = []

        print(f"\n{'─'*80}")
        print("AGENT REASONING & TOOL USAGE:")
        print('─'*80)

        # Display reasoning and tool usage
        for msg in messages:
            # Show tool calls (agent's reasoning/actions)
            if isinstance(msg, AIMessage) and hasattr(msg, 'tool_calls') and msg.tool_calls:
                for tool_call in msg.tool_calls:
                    # Access attributes directly, not as dict
                    tool_name = getattr(tool_call, 'name', 'unknown') if hasattr(tool_call, 'name') else tool_call.function.name
                    tool_args = getattr(tool_call, 'args', {}) if hasattr(tool_call, 'args') else json.loads(tool_call.function.arguments)
                    
                    print(f"\n🔧 Tool: {tool_name}")
                    print(f"   Arguments: {json.dumps(tool_args, indent=6)}")

                    if tool_name == 'retrieve_game':
                        used_internal_db = True
                    elif tool_name == 'game_web_search':
                        used_web_search = True

            # Show tool results
            if isinstance(msg, ToolMessage):
                tool_name = msg.name if hasattr(msg, 'name') else 'unknown'
                print(f"\n📊 Result from {tool_name}:")

                # Parse and display result nicely
                try:
                    result_data = json.loads(msg.content)

                    # Handle different tool outputs
                    if tool_name == 'retrieve_game':
                        print(f"   Retrieved {len(result_data)} documents from internal database")
                        for idx, doc in enumerate(result_data[:3], 1):  # Show first 3
                            print(f"   {idx}. {doc.get('Name')} ({doc.get('YearOfRelease')}) - {doc.get('Platform')}")

                    elif tool_name == 'evaluate_retrieval':
                        print(f"   Useful: {result_data.get('useful')}")
                        print(f"   Evaluation: {result_data.get('description')}")

                    elif tool_name == 'game_web_search':
                        print(f"   Found {len(result_data)} web results")
                        for idx, res in enumerate(result_data[:3], 1):  # Show first 3
                            url = res.get('url', '')
                            if url:
                                web_urls.append(url)
                            print(f"   {idx}. {res.get('title')} - {url}")

                except (json.JSONDecodeError, TypeError, AttributeError):
                    # If the content is not JSON in the expected shape, show raw content (truncated)
                    content_preview = msg.content[:200] + "..." if len(msg.content) > 200 else msg.content
                    print(f"   {content_preview}")

        # Build citation sources
        if used_internal_db:
            sources.append("Internal Game Database (ChromaDB)")
        if used_web_search:
            sources.extend(web_urls[:3])  # Add up to 3 URLs

        # Display final answer with citations
        print(f"\n{'─'*80}")
        print("FINAL ANSWER:")
        print('─'*80)

        # Find the last AI message with content
        final_answer = None
        for msg in reversed(messages):
            if isinstance(msg, AIMessage) and msg.content and not (hasattr(msg, 'tool_calls') and msg.tool_calls):
                final_answer = msg.content
                break

        if final_answer:
            print(f"\n{final_answer}")

            # Add citations
            if sources:
                print(f"\n{'─'*40}")
                print("📚 CITATIONS & SOURCES:")
                for idx, source in enumerate(sources, 1):
                    print(f"   [{idx}] {source}")

        # Print token usage
        if final_state.get("total_tokens"):
            print(f"\n[Tokens used: {final_state['total_tokens']}]")

print(f"\n\n{'='*80}")
print(f"REPORT SUMMARY")
print(f"{'='*80}")
print(f"✓ All {len(queries)} queries were processed in session: '{session_id}'")
print(f"✓ Agent maintained conversation context across all queries")
print(f"✓ Each query shows: reasoning, tool usage, final answer, and citations")
print('='*80)


Query 1: When was Pokémon Gold and Silver released?
[StateMachine] Starting: __entry__
[StateMachine] Executing step: message_prep
[StateMachine] Executing step: llm_processor
[StateMachine] Executing step: tool_executor
[StateMachine] Executing step: llm_processor
[StateMachine] Executing step: tool_executor
[StateMachine] Executing step: llm_processor
[StateMachine] Terminating: __termination__

────────────────────────────────────────────────────────────────────────────────
AGENT REASONING & TOOL USAGE:
────────────────────────────────────────────────────────────────────────────────

🔧 Tool: retrieve_game
   Arguments: {
      "query": "Pok\u00e9mon Gold and Silver release date"
}

📊 Result from retrieve_game:
   Retrieved 1182 documents from internal database
   "[\n  {\n    \"Platform\": \"Game Boy Color\",\n    \"Name\": \"Pok\\u00e9mon Gold and Silver\",\n    \"YearOfRelease\": 1999,\n    \"Description\": \"Second-generation Pok\\u00e9mon games introducing...

🔧 Tool: evaluate_
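
In the output above, `retrieve_game` appears to return 1182 "documents", and the preview is an escaped JSON string. One plausible reading, which is an assumption because the `lib` internals are not shown here, is that the tool message content is JSON-encoded twice: the first `json.loads` yields a `str`, `len()` then counts characters, and the display falls through to the raw preview. A hypothetical helper such as `decode_tool_content` could unwrap that kind of content before it is displayed:

```python
import json
from typing import Any


def decode_tool_content(content: Any, max_depth: int = 2) -> Any:
    """Decode JSON that may have been encoded more than once.

    Keeps calling json.loads while the value is still a string that parses,
    up to max_depth times; returns the value unchanged if it is not JSON.
    """
    value = content
    for _ in range(max_depth):
        if not isinstance(value, str):
            break
        try:
            value = json.loads(value)
        except json.JSONDecodeError:
            break
    return value


docs = [{"Name": "Pokémon Gold and Silver", "YearOfRelease": 1999}]
once = json.dumps(docs)
twice = json.dumps(once)  # simulates double encoding

print(len(json.loads(twice)))           # length of a string: counts characters
print(len(decode_tool_content(twice)))  # length of a list: counts documents
```

In the display loop, calling this helper in place of the bare `json.loads(msg.content)` would be one way to get a list back in both the single- and double-encoded cases.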

### (Optional) Advanced

In [None]:
# TODO: Update your agent with long-term memory
# TODO: Convert the agent to be a state machine, with the tools being pre-defined nodes
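
For the long-term memory TODO, one option is to write useful web results back into the vector store so that later questions can be answered from internal knowledge. The sketch below is a minimal illustration under stated assumptions: `store_web_results` is a hypothetical helper, the metadata keys (`source`, `url`, `title`) are made up for this example and differ from the game schema read by `retrieve_game`, and the collection only needs an `upsert(ids=..., documents=..., metadatas=...)` method, which ChromaDB collections provide. A small in-memory stand-in is used so the example runs without a database.

```python
import hashlib


def store_web_results(collection, results: list[dict]) -> int:
    """Upsert web search results into a collection; returns how many were stored."""
    ids, documents, metadatas = [], [], []
    seen = set()
    for result in results:
        url = result.get("url", "")
        content = result.get("content", "")
        if not url or not content or url in seen:
            continue  # skip empty entries and duplicates within the batch
        seen.add(url)
        # A stable ID derived from the URL makes repeated stores idempotent.
        ids.append("web-" + hashlib.sha256(url.encode("utf-8")).hexdigest()[:16])
        documents.append(content)
        metadatas.append({"source": "web", "url": url, "title": result.get("title", "")})
    if ids:
        collection.upsert(ids=ids, documents=documents, metadatas=metadatas)
    return len(ids)


class InMemoryCollection:
    """Hypothetical stand-in exposing only the upsert call used above."""

    def __init__(self):
        self.records = {}

    def upsert(self, ids, documents, metadatas):
        for id_, doc, meta in zip(ids, documents, metadatas):
            self.records[id_] = (doc, meta)


memory = InMemoryCollection()
results = [
    {"title": "A", "url": "https://example.com/a", "content": "First result"},
    {"title": "B", "url": "", "content": "No URL, skipped"},
]
stored_first = store_web_results(memory, results)
stored_again = store_web_results(memory, results)  # same URL, same ID
print(stored_first, stored_again, len(memory.records))
```

Records stored this way lack the `Platform`, `Name`, and `YearOfRelease` metadata, so `retrieve_game` as written would format them with its `"Unknown"` defaults unless its formatting is extended.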