# [SOLUTION] Udaplay Project - Part 02

## Part 02 - Agent

In this part of the project, you'll expose your VectorDB to your Agent as a tool.

You're building UdaPlay, an AI Research Agent for the video game industry. The agent will:
1. Answer questions using internal knowledge (RAG)
2. Search the web when needed
3. Maintain conversation state
4. Return structured outputs
5. Store useful information for future use

### Setup

In [1]:
# Only needed for Udacity workspace

import importlib.util
import sys

# Check if 'pysqlite3' is available before importing
if importlib.util.find_spec("pysqlite3") is not None:
    import pysqlite3
    sys.modules['sqlite3'] = sys.modules.pop('pysqlite3')

In [2]:
# Import necessary libraries
import os
import json
import chromadb
from chromadb.utils import embedding_functions
from dotenv import load_dotenv
from tavily import TavilyClient
from pydantic import BaseModel, Field

from lib.agents import Agent
from lib.llm import LLM
from lib.messages import UserMessage, SystemMessage, ToolMessage, AIMessage
from lib.tooling import tool

In [3]:
# Load environment variables
load_dotenv()

OPENAI_API_KEY = os.getenv("OPENAI_API_KEY")
TAVILY_API_KEY = os.getenv("TAVILY_API_KEY")

# Verify API keys are loaded
assert OPENAI_API_KEY is not None, "OPENAI_API_KEY not found"
assert TAVILY_API_KEY is not None, "TAVILY_API_KEY not found"

print("✓ Environment variables loaded successfully")

✓ Environment variables loaded successfully


In [4]:
# Initialize ChromaDB client and get the collection
chroma_client = chromadb.PersistentClient(path="chromadb")

embedding_fn = embedding_functions.OpenAIEmbeddingFunction(
    api_key=OPENAI_API_KEY,
    model_name="text-embedding-ada-002"
)

collection = chroma_client.get_collection(
    name="udaplay",
    embedding_function=embedding_fn
)

print(f"✓ Connected to collection with {collection.count()} documents")

✓ Connected to collection with 15 documents


In [5]:
# Initialize Tavily client for web search
tavily_client = TavilyClient(api_key=TAVILY_API_KEY)

print("✓ Tavily client initialized")

✓ Tavily client initialized


### Tools

Build at least 3 tools:
- retrieve_game: To search the vector DB
- evaluate_retrieval: To assess the retrieval performance
- game_web_search: To search the web when the retrieved documents are not sufficient


#### Retrieve Game Tool

In [6]:
@tool
def retrieve_game(query: str) -> str:
    """
    Semantic search: Finds most relevant results in the vector DB.
    
    args:
    - query: a question about the game industry.

    Returns a JSON list. Each element contains:
    - Platform: the platform (e.g. Game Boy, PlayStation 5, Xbox 360)
    - Name: Name of the Game
    - YearOfRelease: Year when that game was released for that platform
    - Description: Additional details about the game
    - Genre: The genre of the game
    - Publisher: The publisher of the game
    """
    try:
        # Query the collection for relevant games
        results = collection.query(
            query_texts=[query],
            n_results=5  # Get top 5 results
        )
        
        # Format the results
        if results['metadatas'] and len(results['metadatas'][0]) > 0:
            formatted_results = []
            for metadata in results['metadatas'][0]:
                formatted_results.append({
                    "Name": metadata.get('Name', 'Unknown'),
                    "Platform": metadata.get('Platform', 'Unknown'),
                    "YearOfRelease": metadata.get('YearOfRelease', 'Unknown'),
                    "Genre": metadata.get('Genre', 'Unknown'),
                    "Publisher": metadata.get('Publisher', 'Unknown'),
                    "Description": metadata.get('Description', 'No description available')
                })
            return json.dumps(formatted_results, indent=2)
        else:
            return json.dumps({"message": "No results found in the database"})
    except Exception as e:
        return json.dumps({"error": str(e)})

print("✓ retrieve_game tool created")

✓ retrieve_game tool created
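
`collection.query` returns one inner list per query text, which is why the tool indexes `results['metadatas'][0]`. The sketch below uses a hand-written dictionary in that shape to show how distances could be attached to each hit so that weak matches are easier to spot. The entries and distance values are made up for illustration, and `format_hits` and `max_distance` are hypothetical names, not part of the notebook:

```python
import json

# Hand-written stand-in for a Chroma query result with a single query text.
# The values are illustrative, not taken from the real collection.
mock_results = {
    "ids": [["001", "002"]],
    "metadatas": [[
        {"Name": "Gran Turismo", "Platform": "PlayStation 1", "YearOfRelease": 1997},
        {"Name": "Super Mario 64", "Platform": "Nintendo 64", "YearOfRelease": 1996},
    ]],
    "distances": [[0.21, 0.58]],
}


def format_hits(results: dict, max_distance: float = 0.5) -> list[dict]:
    """Pair each metadata entry with its distance and drop distant matches."""
    metadatas = results["metadatas"][0]   # first (and only) query
    distances = results["distances"][0]
    hits = []
    for metadata, distance in zip(metadatas, distances):
        if distance <= max_distance:      # smaller distance = closer match
            hits.append({**metadata, "distance": distance})
    return hits


hits = format_hits(mock_results)
print(json.dumps(hits, indent=2))
```

The 0.5 cutoff is arbitrary; a sensible value depends on the embedding model and on the distance metric the collection was created with.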


#### Evaluate Retrieval Tool

In [7]:
# Define the evaluation report structure
class EvaluationReport(BaseModel):
    """Evaluation report for retrieved documents"""
    useful: bool = Field(description="Whether the documents are useful to answer the question")
    description: str = Field(description="Detailed explanation of the evaluation")
    confidence: float = Field(description="Confidence level (0-1) in the evaluation")

print("✓ EvaluationReport model defined")

✓ EvaluationReport model defined


In [8]:
@tool
def evaluate_retrieval(question: str, retrieved_docs: str) -> str:
    """
    Analyzes whether the retrieved documents are sufficient
    to answer the user's question.
    
    args: 
    - question: original question from user
    - retrieved_docs: retrieved documents most similar to the user query in the Vector Database
    
    The result includes:
    - useful: whether the documents are useful to answer the question
    - description: description about the evaluation result
    - confidence: confidence level in the evaluation (0-1)
    """
    try:
        # Use LLM as a judge to evaluate the retrieval quality
        evaluation_llm = LLM(model="gpt-4o-mini", temperature=0.0)
        
        evaluation_prompt = f"""
Your task is to evaluate if the retrieved documents are sufficient to answer the user's question.

User Question: {question}

Retrieved Documents:
{retrieved_docs}

Analyze the documents and determine:
1. Are the documents relevant to the question?
2. Do they contain enough information to answer the question?
3. What is your confidence level in this evaluation?

Provide a detailed explanation so it's possible to take an action to accept it or search the web for more information.
"""
        
        response = evaluation_llm.invoke(
            evaluation_prompt,
            response_format=EvaluationReport
        )
        
        # Parse the response
        evaluation = EvaluationReport.model_validate_json(response.content)
        
        return json.dumps({
            "useful": evaluation.useful,
            "description": evaluation.description,
            "confidence": evaluation.confidence
        }, indent=2)
        
    except Exception as e:
        return json.dumps({"error": str(e)})

print("✓ evaluate_retrieval tool created")

✓ evaluate_retrieval tool created
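
Because `evaluate_retrieval` returns a JSON string, the accept-or-fall-back decision can also be expressed in plain code. This is a minimal sketch, assuming a hypothetical helper `should_search_web` and an arbitrary confidence threshold of 0.7; in the notebook itself the LLM makes this decision from the agent instructions:

```python
import json


def should_search_web(evaluation_json: str, min_confidence: float = 0.7) -> bool:
    """Return True when the evaluation says the retrieved docs are not enough."""
    try:
        report = json.loads(evaluation_json)
    except json.JSONDecodeError:
        return True  # unreadable evaluation: fall back to the web
    if not isinstance(report, dict) or "error" in report:
        return True  # the tool reported a failure
    useful = bool(report.get("useful", False))
    confidence = float(report.get("confidence", 0.0))
    return not useful or confidence < min_confidence


# Made-up evaluation payloads in the shape the tool returns
good = json.dumps({"useful": True, "description": "Exact match", "confidence": 0.95})
weak = json.dumps({"useful": True, "description": "Partial match", "confidence": 0.4})
failed = json.dumps({"error": "rate limit"})

print(should_search_web(good), should_search_web(weak), should_search_web(failed))
```

Treating a failed or unparseable evaluation as "search the web" is one possible design choice; it favors answering the question over saving a search call.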


#### Game Web Search Tool

In [9]:
@tool
def game_web_search(question: str) -> str:
    """
    Performs a web search to find information about video games.
    Use this when the internal database doesn't have sufficient information.
    
    args:
    - question: a question about the game industry.
    
    Returns:
    - Web search results with relevant information
    """
    try:
        # Use Tavily to search the web
        search_results = tavily_client.search(
            query=question,
            max_results=5,
            search_depth="advanced"
        )
        
        # Format the results
        formatted_results = []
        for result in search_results.get('results', []):
            formatted_results.append({
                "title": result.get('title', 'No title'),
                "url": result.get('url', ''),
                "content": result.get('content', 'No content'),
                "score": result.get('score', 0)
            })
        
        return json.dumps({
            "query": question,
            "results": formatted_results,
            "answer": search_results.get('answer', 'No direct answer available')
        }, indent=2)
        
    except Exception as e:
        return json.dumps({"error": str(e)})

print("✓ game_web_search tool created")

✓ game_web_search tool created
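
Web results can be long, and everything a tool returns is fed back to the model. A minimal sketch of trimming results before serializing them, assuming dictionaries with the same `title`, `url`, `content`, and `score` keys that the tool above builds. `compact_results`, `max_chars`, and `min_score` are hypothetical names, and the sample data is made up:

```python
import json


def compact_results(results: list[dict], max_chars: int = 200,
                    min_score: float = 0.5) -> list[dict]:
    """Keep higher-scoring results, best first, with truncated content."""
    kept = [r for r in results if r.get("score", 0) >= min_score]
    kept.sort(key=lambda r: r.get("score", 0), reverse=True)
    compact = []
    for r in kept:
        content = r.get("content", "")
        if len(content) > max_chars:
            content = content[:max_chars].rstrip() + "..."
        compact.append({
            "title": r.get("title", "No title"),
            "url": r.get("url", ""),
            "content": content,
        })
    return compact


sample = [
    {"title": "A", "url": "https://example.com/a", "content": "x" * 500, "score": 0.6},
    {"title": "B", "url": "https://example.com/b", "content": "short", "score": 0.9},
    {"title": "C", "url": "https://example.com/c", "content": "noise", "score": 0.1},
]
compact = compact_results(sample)
print(json.dumps(compact, indent=2))
```

Both thresholds are illustrative and would need tuning against real search output.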


### Agent

In [10]:
# Create the UdaPlay agent with all tools
instructions = """
You are UdaPlay, an AI Research Agent specialized in video game industry information.

Your workflow:
1. First, try to answer questions using the retrieve_game tool to search the internal database
2. Use evaluate_retrieval to assess if the retrieved information is sufficient
3. If the information is not sufficient (useful=false or low confidence), use game_web_search to find additional information
4. Combine information from multiple sources when needed
5. Always cite your sources (internal database or web)
6. Provide clear, well-structured answers

Guidelines:
- Be accurate and factual
- If you're not sure, say so and explain what information is missing
- When using web search results, mention that the information comes from external sources
- Format your answers in a clear, readable way
- Include relevant details like release dates, platforms, publishers, and descriptions
"""

udaplay_agent = Agent(
    model_name="gpt-4o-mini",
    instructions=instructions,
    tools=[retrieve_game, evaluate_retrieval, game_web_search],
    temperature=0.7
)

print("✓ UdaPlay agent created successfully")

✓ UdaPlay agent created successfully


### Test the Agent

In [11]:
# Test Query 1: Information available in database
print("="*80)
print("TEST 1: When was Pokémon Gold and Silver released?")
print("="*80)

run1 = udaplay_agent.invoke("When was Pokémon Gold and Silver released?")
final_state1 = run1.get_final_state()

if final_state1 and final_state1['messages']:
    # Get the last AI message
    for msg in reversed(final_state1['messages']):
        if msg.role == 'assistant' and msg.content:
            print("\nAgent Response:")
            print(msg.content)
            break

print("\n" + "="*80)

TEST 1: When was Pokémon Gold and Silver released?
[StateMachine] Starting: __entry__
[StateMachine] Executing step: message_prep
[StateMachine] Executing step: llm_processor
[StateMachine] Executing step: tool_executor
[StateMachine] Executing step: llm_processor
[StateMachine] Executing step: tool_executor
[StateMachine] Executing step: llm_processor
[StateMachine] Terminating: __termination__

Agent Response:
**Pokémon Gold and Silver** were released in **1999** for the **Game Boy Color**. These games are part of the second generation of Pokémon games and introduced new regions, Pokémon, and gameplay mechanics. They were published by **Nintendo** and are classified as role-playing games. 

If you need more details or have further questions, feel free to ask!
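
The loop that walks the messages backwards to find the final answer is repeated in every test cell. A small helper would keep those cells shorter. This is a sketch: `last_assistant_message` is a hypothetical name, and the `SimpleNamespace` objects stand in for the notebook's message classes, assuming only that they expose `role` and `content`:

```python
from types import SimpleNamespace
from typing import Optional


def last_assistant_message(messages) -> Optional[str]:
    """Return the content of the most recent non-empty assistant message."""
    for msg in reversed(messages):
        if getattr(msg, "role", None) == "assistant" and getattr(msg, "content", None):
            return msg.content
    return None


# Stand-in messages for illustration; the assistant turn that requests
# a tool call is modeled here with no text content.
demo_messages = [
    SimpleNamespace(role="user", content="When was Pokémon Gold and Silver released?"),
    SimpleNamespace(role="assistant", content=None),
    SimpleNamespace(role="tool", content="[...]"),
    SimpleNamespace(role="assistant", content="They were released in 1999."),
]
print(last_assistant_message(demo_messages))
```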



In [12]:
# Test Query 2: Another database query
print("="*80)
print("TEST 2: Which one was the first 3D platformer Mario game?")
print("="*80)

run2 = udaplay_agent.invoke("Which one was the first 3D platformer Mario game?")
final_state2 = run2.get_final_state()

if final_state2 and final_state2['messages']:
    for msg in reversed(final_state2['messages']):
        if msg.role == 'assistant' and msg.content:
            print("\nAgent Response:")
            print(msg.content)
            break

print("\n" + "="*80)

TEST 2: Which one was the first 3D platformer Mario game?
[StateMachine] Starting: __entry__
[StateMachine] Executing step: message_prep
[StateMachine] Executing step: llm_processor
[StateMachine] Executing step: tool_executor
[StateMachine] Executing step: llm_processor
[StateMachine] Executing step: tool_executor
[StateMachine] Executing step: llm_processor
[StateMachine] Terminating: __termination__

Agent Response:
The first 3D platformer Mario game is **Super Mario 64**, which was released in **1996** for the **Nintendo 64**. This game is considered groundbreaking as it set new standards for the platforming genre and features Mario's quest to rescue Princess Peach.

If you have any more questions or need further information, feel free to ask!



In [13]:
# Test Query 3: May require web search
print("="*80)
print("TEST 3: Was Mortal Kombat X released for PlayStation 5?")
print("="*80)

run3 = udaplay_agent.invoke("Was Mortal Kombat X released for PlayStation 5?")
final_state3 = run3.get_final_state()

if final_state3 and final_state3['messages']:
    for msg in reversed(final_state3['messages']):
        if msg.role == 'assistant' and msg.content:
            print("\nAgent Response:")
            print(msg.content)
            break

print("\n" + "="*80)

TEST 3: Was Mortal Kombat X released for PlayStation 5?
[StateMachine] Starting: __entry__
[StateMachine] Executing step: message_prep
[StateMachine] Executing step: llm_processor
[StateMachine] Executing step: tool_executor
[StateMachine] Executing step: llm_processor
[StateMachine] Executing step: tool_executor
[StateMachine] Executing step: llm_processor
[StateMachine] Executing step: tool_executor
[StateMachine] Executing step: llm_processor
[StateMachine] Terminating: __termination__

Agent Response:
**Mortal Kombat X** was released on **April 14, 2015**, for the **PlayStation 4**, Xbox One, and PC. However, it was not released for the **PlayStation 5**. There was a later version called **Mortal Kombat XL**, which included all previously released downloadable content and was released for PlayStation 4, but again, no official version for PlayStation 5 was launched.

For more details, you can refer to the [Wikipedia page for Mortal Kombat X](https://en.wikipedia.org/wiki/Mortal_Komb

### Agent Performance Analysis

In [14]:
# Analyze the agent's tool usage across all runs
def analyze_run(run, query_name):
    print(f"\n{query_name}:")
    print("-" * 60)
    
    final_state = run.get_final_state()
    if not final_state:
        print("No final state available")
        return
    
    # Count tool calls (the final state may include messages from earlier runs in the same session)
    tool_calls = {}
    for msg in final_state['messages']:
        if hasattr(msg, 'tool_calls') and msg.tool_calls:
            for call in msg.tool_calls:
                tool_name = call.function.name
                tool_calls[tool_name] = tool_calls.get(tool_name, 0) + 1
    
    print("Tool Usage:")
    for name, count in tool_calls.items():  # `name` avoids reusing the imported `tool` decorator's name
        print(f"  - {name}: {count} time(s)")
    
    # Token usage
    if 'total_tokens' in final_state:
        print(f"Total Tokens Used: {final_state['total_tokens']}")
    
    print(f"Total Messages: {len(final_state['messages'])}")

print("="*80)
print("AGENT PERFORMANCE ANALYSIS")
print("="*80)

analyze_run(run1, "Query 1 (Pokémon Gold and Silver)")
analyze_run(run2, "Query 2 (First 3D Mario)")
analyze_run(run3, "Query 3 (Mortal Kombat X on PS5)")

AGENT PERFORMANCE ANALYSIS

Query 1 (Pokémon Gold and Silver):
------------------------------------------------------------
Tool Usage:
  - retrieve_game: 1 time(s)
  - evaluate_retrieval: 1 time(s)
Total Tokens Used: 2878
Total Messages: 7

Query 2 (First 3D Mario):
------------------------------------------------------------
Tool Usage:
  - retrieve_game: 2 time(s)
  - evaluate_retrieval: 2 time(s)
Total Tokens Used: 5287
Total Messages: 13

Query 3 (Mortal Kombat X on PS5):
------------------------------------------------------------
Tool Usage:
  - retrieve_game: 3 time(s)
  - evaluate_retrieval: 3 time(s)
  - game_web_search: 1 time(s)
Total Tokens Used: 12138
Total Messages: 21
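
The message totals above grow from 7 to 13 to 21, which suggests that each final state holds the whole session rather than a single run, so the tool counts are likely cumulative. Under that assumption, per-run usage is the difference between consecutive snapshots. A minimal sketch using `collections.Counter`, with the cumulative counts copied from the output above (`per_run_usage` is a hypothetical helper):

```python
from collections import Counter

# Cumulative tool counts as printed by analyze_run for the three queries.
snapshots = [
    Counter({"retrieve_game": 1, "evaluate_retrieval": 1}),
    Counter({"retrieve_game": 2, "evaluate_retrieval": 2}),
    Counter({"retrieve_game": 3, "evaluate_retrieval": 3, "game_web_search": 1}),
]


def per_run_usage(cumulative: list[Counter]) -> list[Counter]:
    """Subtract each snapshot from the next to recover per-run tool calls."""
    usage = []
    previous = Counter()
    for current in cumulative:
        usage.append(current - previous)  # Counter subtraction drops zero counts
        previous = current
    return usage


for i, counts in enumerate(per_run_usage(snapshots), start=1):
    print(f"Query {i}: {dict(counts)}")
```

Under this reading, each query triggered one retrieval and one evaluation, and only the third needed a web search, which matches the number of `tool_executor` steps in the logs for each test.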


### Additional Test Queries

In [15]:
# Test with custom queries
custom_query = "What racing games are available for PlayStation?"

print("="*80)
print(f"CUSTOM TEST: {custom_query}")
print("="*80)

run_custom = udaplay_agent.invoke(custom_query)
final_state_custom = run_custom.get_final_state()

if final_state_custom and final_state_custom['messages']:
    for msg in reversed(final_state_custom['messages']):
        if msg.role == 'assistant' and msg.content:
            print("\nAgent Response:")
            print(msg.content)
            break

print("\n" + "="*80)

CUSTOM TEST: What racing games are available for PlayStation?
[StateMachine] Starting: __entry__
[StateMachine] Executing step: message_prep
[StateMachine] Executing step: llm_processor
[StateMachine] Executing step: tool_executor
[StateMachine] Executing step: llm_processor
[StateMachine] Executing step: tool_executor
[StateMachine] Executing step: llm_processor
[StateMachine] Executing step: tool_executor
[StateMachine] Executing step: llm_processor
[StateMachine] Terminating: __termination__

Agent Response:
Here are some notable racing games available for PlayStation:

1. **Gran Turismo**
   - **Platform:** PlayStation 1
   - **Year of Release:** 1997
   - **Publisher:** Sony Computer Entertainment
   - **Description:** A realistic racing simulator featuring a wide array of cars and tracks, setting a new standard for the genre.

2. **Gran Turismo 5**
   - **Platform:** PlayStation 3
   - **Year of Release:** 2010
   - **Publisher:** Sony Computer Entertainment
   - **Description:**

## Summary

In this notebook, we successfully implemented:

### ✓ Three Core Tools:
1. **retrieve_game**: Searches the vector database for game information
2. **evaluate_retrieval**: Uses LLM as a judge to assess retrieval quality
3. **game_web_search**: Falls back to web search using Tavily API

### ✓ Intelligent Agent:
- Implements a state machine workflow
- Maintains conversation state across queries
- Makes intelligent decisions about when to use web search
- Combines information from multiple sources
- Provides well-structured, cited answers

### ✓ Testing & Evaluation:
- Tested with multiple query types
- Analyzed tool usage and performance
- Demonstrated both database retrieval and web search capabilities

The UdaPlay agent now answers questions from the internal database and falls back to web search when the retrieved documents are not sufficient.