# [STARTER] UdaPlay Project

## Part 02 - Agent

In this part of the project, you'll give your Agent access to your VectorDB as a tool.

You're building UdaPlay, an AI Research Agent for the video game industry. The agent will:
1. Answer questions using internal knowledge (RAG)
2. Search the web when needed
3. Maintain conversation state
4. Return structured outputs
5. Store useful information for future use

### Setup

In [1]:

import os

import chromadb
from openai import OpenAI
from tavily import TavilyClient
from dotenv import load_dotenv

In [2]:

load_dotenv()

OPENAI_API_KEY = os.getenv("OPENAI_API_KEY")
TAVILY_API_KEY = os.getenv("TAVILY_API_KEY")

openai_client = OpenAI(api_key=OPENAI_API_KEY, base_url="https://openai.vocareum.com/v1")
tavily_client = TavilyClient(api_key=TAVILY_API_KEY)

print("✅ OpenAI and Tavily clients initialized")



✅ OpenAI and Tavily clients initialized


In [3]:
# Import additional libraries for the AI Agent
import json
from typing import List, Dict, Any
from pydantic import BaseModel, Field

# Import the custom library modules
from lib.agents import Agent
from lib.llm import LLM
from lib.messages import UserMessage, SystemMessage, ToolMessage, AIMessage
from lib.tooling import Tool
from lib.state_machine import StateMachine, Step, EntryPoint, Termination
from lib.memory import ShortTermMemory

print("✅ Additional libraries imported successfully")


✅ Additional libraries imported successfully


In [4]:
from chromadb.utils import embedding_functions
embedding_fn = embedding_functions.OpenAIEmbeddingFunction(
    model_name="text-embedding-3-small",
    api_key=os.getenv('OPENAI_API_KEY'),
    api_base="https://openai.vocareum.com/v1"
)
chroma_client = chromadb.PersistentClient(path="chromadb")
collection = chroma_client.get_collection("udaplay", embedding_function=embedding_fn)

### Tools

Build at least 3 tools:
- retrieve_game: Searches the vector DB
- evaluate_retrieval: Assesses whether the retrieved results are good enough to answer the question
- game_web_search: Searches the web when the retrieved results are not good enough
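
The way these three tools fit together can be sketched in plain Python. This is only an illustration of the intended fallback order, not how the Agent is implemented: in the notebook the LLM decides which tool to call. The function `answer_sources` and the three `fake_*` stand-ins are hypothetical names introduced here so the sketch runs without API keys.

```python
from typing import Any, Callable, Dict, List

Docs = List[Dict[str, Any]]


def answer_sources(
    question: str,
    retrieve: Callable[[str], Docs],
    evaluate: Callable[[str, Docs], Dict[str, Any]],
    web_search: Callable[[str], Docs],
) -> Dict[str, Any]:
    """Return the documents to answer from, preferring the vector DB."""
    docs = retrieve(question)
    verdict = evaluate(question, docs)
    if verdict.get("useful"):
        return {"source": "vector_db", "documents": docs}
    # Retrieval was judged insufficient, so fall back to the web.
    return {"source": "web", "documents": web_search(question)}


# Hypothetical stand-ins for the real tools.
def fake_retrieve(question: str) -> Docs:
    return [{"name": "Gran Turismo", "platform": "PlayStation 1"}]


def fake_evaluate(question: str, docs: Docs) -> Dict[str, Any]:
    useful = any(d["name"].lower() in question.lower() for d in docs)
    return {"useful": useful, "description": "keyword match"}


def fake_web_search(question: str) -> Docs:
    return [{"title": "stub result", "url": "", "content": question}]


hit = answer_sources(
    "What platform was Gran Turismo launched on?",
    fake_retrieve, fake_evaluate, fake_web_search,
)
miss = answer_sources(
    "Who founded Atari?",
    fake_retrieve, fake_evaluate, fake_web_search,
)
print(hit["source"], miss["source"])
```

Passing the tools in as callables keeps the control flow testable on its own; the real `retrieve_game`, `evaluate_retrieval` and `game_web_search` functions defined in this section have compatible signatures.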


#### Retrieve Game Tool

In [5]:

def retrieve_game(query: str) -> List[Dict[str, Any]]:
    """
    Semantic search: finds the most similar results in the vector DB
    
    Args:
        query: a question about the game industry
        
    Returns:
        List of dictionaries with game information in the format:
        [{'name': 'Game Name', 'platform': 'Platform', 'year': year, 'publisher': 'Publisher', 'score': score}, ...]
    """
    results = collection.query(
        query_texts=[query],
        n_results=5
    )
    
    # Format results in the desired format
    formatted_results = []
    if 'documents' in results and 'metadatas' in results and 'distances' in results:
        for i in range(len(results['documents'][0])):
            metadata = results['metadatas'][0][i]
            distance = results['distances'][0][i]
            
            # Convert distance to a similarity score (lower distance = higher similarity).
            # Distances above 1 are clipped to a score of 0.
            score = 1 - distance if distance <= 1 else 0
            
            formatted_result = {
                'name': metadata['Name'],
                'platform': metadata['Platform'],
                'year': metadata['YearOfRelease'],
                'publisher': metadata['Publisher'],
                'score': round(score, 4)
            }
            formatted_results.append(formatted_result)
    
    return formatted_results

# Test the retrieve_game tool
print("🔍 Testing retrieve_game tool:")
test_results = retrieve_game("What platform was Gran Turismo launched on?")
print("Results:")
for result in test_results:
    print(f"  {result}")

# Create the Tool object
retrieve_game_tool = Tool(retrieve_game)
print(f"\n✅ retrieve_game tool created: {retrieve_game_tool}")

🔍 Testing retrieve_game tool:
Results:
  {'name': 'Gran Turismo', 'platform': 'PlayStation 1', 'year': 1997, 'publisher': 'Sony Computer Entertainment', 'score': 0.2703}
  {'name': 'Gran Turismo 5', 'platform': 'PlayStation 3', 'year': 2010, 'publisher': 'Sony Computer Entertainment', 'score': 0.1987}
  {'name': 'Grand Theft Auto: San Andreas', 'platform': 'PlayStation 2', 'year': 2004, 'publisher': 'Rockstar Games', 'score': 0}
  {'name': 'Super Mario 64', 'platform': 'Nintendo 64', 'year': 1996, 'publisher': 'Nintendo', 'score': 0}
  {'name': "Marvel's Spider-Man 2", 'platform': 'PlayStation 5', 'year': 2023, 'publisher': 'Sony Interactive Entertainment', 'score': 0}

✅ retrieve_game tool created: <Tool name=retrieve_game params=['query']>
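
Three of the five results above have a score of exactly 0, because every distance above 1 is clipped, so those results can no longer be ranked against each other. The sketch below compares the notebook's rule with one possible alternative, `1 / (1 + distance)`. Both function names are hypothetical, and the alternative assumes the collection returns non-negative distances; it is an illustration, not a recommendation from the project.

```python
def clipped_similarity(distance: float) -> float:
    """The notebook's rule: 1 - distance, clipped to 0 once distance exceeds 1."""
    return round(1 - distance, 4) if distance <= 1 else 0.0


def smooth_similarity(distance: float) -> float:
    """Alternative: 1 / (1 + distance), which stays in (0, 1] for distance >= 0."""
    if distance < 0:
        raise ValueError("distance must be non-negative")
    return round(1.0 / (1.0 + distance), 4)


# Compare both rules on a few illustrative distances.
for d in (0.0, 0.7297, 1.0, 1.5, 3.0):
    print(d, clipped_similarity(d), smooth_similarity(d))
```

With the smooth rule, a result at distance 1.5 still scores higher than one at distance 3.0, whereas the clipped rule reports 0 for both.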


#### Evaluate Retrieval Tool

In [6]:
# Create evaluate_retrieval tool
class EvaluationOutput(BaseModel):
    useful: bool = Field(description="Whether the documents are useful to answer the question")
    description: str = Field(description="Detailed explanation of the evaluation result")

def evaluate_retrieval(question: str, retrieved_docs: List[Dict[str, Any]]) -> Dict[str, Any]:
    """
    Evaluate the quality of retrieved documents for answering a question
    
    Args:
        question: The original question from user
        retrieved_docs: Retrieved documents from the vector database
        
    Returns:
        Dictionary with evaluation results including 'useful' and 'description'
    """
    # Use an LLM to evaluate the retrieval
    response = openai_client.chat.completions.create(
        model="gpt-4",
        messages=[{
            "role": "system",
            "content": "Your task is to evaluate whether the documents are sufficient to answer the query. Provide your evaluation in JSON format with two fields: 'useful' (boolean), indicating whether the documents are useful to answer the question, and 'description' (string), giving a detailed explanation of your evaluation so that a decision can be made to accept the documents or not."
        }, {
            "role": "user",
            "content": f"Question: {question}\nRetrieved documents: {retrieved_docs}\nPlease provide your evaluation in valid JSON format matching the following schema:\n{EvaluationOutput.model_json_schema()}"
        }]
    )
    
    # Parse the JSON response
    try:
        evaluation = json.loads(response.choices[0].message.content)
        return evaluation
    except json.JSONDecodeError:
        return {"useful": False, "description": "Failed to parse evaluation response"}

# Test the evaluation tool
print("🔍 Testing evaluate_retrieval tool:")
test_docs = retrieve_game("What platform was Gran Turismo launched on?")
evaluation = evaluate_retrieval("What platform was Gran Turismo launched on?", test_docs)
print(f"Evaluation: {evaluation}")

# Create the Tool object
evaluate_retrieval_tool = Tool(evaluate_retrieval)
print(f"\n✅ evaluate_retrieval tool created: {evaluate_retrieval_tool}")


🔍 Testing evaluate_retrieval tool:
Evaluation: {'useful': True, 'description': 'The first document retrieved provides relevant information regarding the question. It shows that Gran Turismo was launched on PlayStation 1. Although other documents provide information about subsequent games in the Gran Turismo series or different games entirely, the first document is sufficient to answer the given query.'}

✅ evaluate_retrieval tool created: <Tool name=evaluate_retrieval params=['question', 'retrieved_docs']>
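
`evaluate_retrieval` calls `json.loads` on the raw reply, which fails if the model wraps its JSON in a code fence or adds a sentence around it. The sketch below shows one tolerant way to parse such a reply using only the standard library. The helper name `parse_evaluation` is hypothetical, and the field checks mirror the two fields of `EvaluationOutput`.

```python
import json
import re
from typing import Any, Dict

FALLBACK = {"useful": False, "description": "Failed to parse evaluation response"}


def parse_evaluation(raw: str) -> Dict[str, Any]:
    """Extract a JSON object from an LLM reply and check its two fields."""
    # Take everything from the first '{' to the last '}' so that code fences
    # or surrounding prose do not break json.loads.
    match = re.search(r"\{.*\}", raw, flags=re.DOTALL)
    if match is None:
        return dict(FALLBACK)
    try:
        data = json.loads(match.group(0))
    except json.JSONDecodeError:
        return dict(FALLBACK)
    # Reject replies whose fields are missing or have the wrong type.
    if not isinstance(data.get("useful"), bool):
        return dict(FALLBACK)
    if not isinstance(data.get("description"), str):
        return dict(FALLBACK)
    return {"useful": data["useful"], "description": data["description"]}


fence = "`" * 3  # built at runtime to keep this example readable
reply = (
    "Here is my evaluation:\n"
    f"{fence}json\n"
    '{"useful": true, "description": "The first document answers it."}\n'
    f"{fence}"
)
print(parse_evaluation(reply))
print(parse_evaluation("I could not decide."))
```

Returning the same fallback dictionary as the notebook keeps the tool's output shape stable, so the agent always sees a `useful` flag it can act on.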


#### Game Web Search Tool

In [7]:
# Create game_web_search tool
def game_web_search(question: str) -> List[Dict[str, Any]]:
    """
    Search the web for game industry information using Tavily
    
    Args:
        question: A question about the game industry
        
    Returns:
        List of search results with relevant information
    """
    try:
        # Use Tavily to search the web
        response = tavily_client.search(
            query=question,
            search_depth="basic",
            max_results=5
        )
        
        # Format results in a consistent format
        formatted_results = []
        if 'results' in response:
            for result in response['results']:
                formatted_result = {
                    'title': result.get('title', 'No title'),
                    'url': result.get('url', 'No URL'),
                    'content': result.get('content', 'No content'),
                    'score': result.get('score', 0.0)
                }
                formatted_results.append(formatted_result)
        
        return formatted_results
        
    except Exception as e:
        print(f"Error in web search: {e}")
        return [{"title": "Search Error", "url": "", "content": f"Failed to search: {str(e)}", "score": 0.0}]

# Test the web search tool
print("🔍 Testing game_web_search tool:")
web_results = game_web_search("What platform was Gran Turismo launched on?")
print("Web search results:")
for result in web_results[:2]:  # Show first 2 results
    print(f"  Title: {result['title']}")
    print(f"  URL: {result['url']}")
    print(f"  Content: {result['content'][:100]}...")
    print()

# Create the Tool object
game_web_search_tool = Tool(game_web_search)
print(f"✅ game_web_search tool created: {game_web_search_tool}")


🔍 Testing game_web_search tool:
Web search results:
  Title: Gran Turismo (1997 video game) - Wikipedia
  URL: https://en.wikipedia.org/wiki/Gran_Turismo_(1997_video_game)
  Content: ***Gran Turismo*** is a 1997 sim racing video game developed and published by Sony Computer Entertai...

  Title: My First Gran Turismo launches on PS5 and PS4 December 6
  URL: https://blog.playstation.com/2024/12/04/my-first-gran-turismo-launches-on-ps5-and-ps4-december-6/
  Content: [Skip to content](https://blog.playstation.com/2024/12/04/my-first-gran-turismo-launches-on-ps5-and-...

✅ game_web_search tool created: <Tool name=game_web_search params=['question']>


### Agent

In [8]:
# Create the UdaPlay Agent using the proper StateMachine architecture
from lib.agents import Agent
from lib.llm import LLM
from lib.messages import AIMessage
from typing import Dict

# The Vocareum endpoint needs a custom base_url. This subclass shows one way to
# pass it to the OpenAI client. It is not used below: the agent's LLM step is
# overridden instead.
class VocareumLLM(LLM):
    def __init__(self, model: str = "gpt-4o-mini", temperature: float = 0.0, 
                 tools=None, api_key: str = None, base_url: str = None):
        self.model = model
        self.temperature = temperature
        self.client = OpenAI(api_key=api_key, base_url=base_url) if api_key else OpenAI()
        self.tools: Dict[str, Tool] = {
            tool.name: tool for tool in (tools or [])
        }

# Create the UdaPlay Agent with proper StateMachine workflow
udaplay_agent = Agent(
    model_name="gpt-4",
    instructions="""
    You are UdaPlay, an AI Research Agent specializing in the video game industry.
    
    Your capabilities:
    1. Answer questions using internal knowledge (RAG) via retrieve_game tool
    2. Evaluate the quality of retrieved information via evaluate_retrieval tool
    3. Search the web for additional information via game_web_search tool when needed
    4. Maintain conversation context and provide well-cited answers
    
    Workflow:
    1. First, try to answer using internal knowledge (retrieve_game)
    2. Evaluate if the retrieved information is sufficient (evaluate_retrieval)
    3. If not sufficient, search the web for additional information (game_web_search)
    4. Provide a comprehensive, well-cited answer
    
    Always cite your sources and provide clear, structured responses.
    """,
    tools=[retrieve_game_tool, evaluate_retrieval_tool, game_web_search_tool],
    temperature=0.7
)

# Override the Agent's LLM creation to use Vocareum base URL
def _llm_step_vocareum(self, state):
    """Step logic: Process the current state through the LLM with Vocareum base URL"""
    # Initialize LLM with Vocareum base URL - use the base LLM class with proper parameters
    llm = LLM(
        model=self.model_name,
        temperature=self.temperature,
        tools=self.tools,
        api_key=OPENAI_API_KEY,
        base_url="https://openai.vocareum.com/v1"
    )

    response = llm.invoke(state["messages"])
    tool_calls = response.tool_calls if response.tool_calls else None

    current_total = state.get("total_tokens", 0)
    if response.token_usage:
        current_total += response.token_usage.total_tokens

    # Create AI message with content and tool calls
    ai_message = AIMessage(
        content=response.content, 
        tool_calls=tool_calls,
    )

    return {
        "messages": state["messages"] + [ai_message],
        "current_tool_calls": tool_calls,
        "session_id": state["session_id"],
        "total_tokens": current_total,
    }

# Apply the custom LLM step to the agent
udaplay_agent._llm_step = _llm_step_vocareum.__get__(udaplay_agent, Agent)

# Recreate the StateMachine with the updated _llm_step method
udaplay_agent.workflow = udaplay_agent._create_state_machine()

print("✅ UdaPlay Agent created with StateMachine architecture!")
print(f"Agent tools: {[tool.name for tool in udaplay_agent.tools]}")
print(f"Agent model: {udaplay_agent.model_name}")
print(f"Agent temperature: {udaplay_agent.temperature}")
print(f"Agent workflow: {udaplay_agent.workflow}")

✅ UdaPlay Agent created with StateMachine architecture!
Agent tools: ['retrieve_game', 'evaluate_retrieval', 'game_web_search']
Agent model: gpt-4
Agent temperature: 0.7
Agent workflow: StateMachine(schema=['user_query', 'instructions', 'messages', 'current_tool_calls', 'total_tokens'])
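
The workflow alternates between an LLM step and a tool step until the LLM replies without requesting a tool. The sketch below illustrates that loop with a scripted stand-in for the model. It is not the `lib.state_machine` implementation: `run_agent_loop` and `scripted_llm` are hypothetical names, messages are plain dictionaries, and the step names in `trace` are borrowed from the `[StateMachine]` log lines printed when the agent runs.

```python
from typing import Any, Callable, Dict, List

Message = Dict[str, Any]


def run_agent_loop(
    messages: List[Message],
    llm: Callable[[List[Message]], Message],
    tools: Dict[str, Callable[..., Any]],
    max_turns: int = 5,
) -> Dict[str, Any]:
    """Alternate between the LLM and its tools until no tool is requested."""
    trace = ["message_prep"]
    history = list(messages)
    for _ in range(max_turns):
        trace.append("llm_processor")
        reply = llm(history)
        history.append(reply)
        calls = reply.get("tool_calls") or []
        if not calls:
            break  # plain answer: nothing left to execute
        trace.append("tool_executor")
        for call in calls:
            result = tools[call["name"]](**call["arguments"])
            history.append(
                {"role": "tool", "name": call["name"], "content": str(result)}
            )
    return {"messages": history, "trace": trace}


def scripted_llm(replies: List[Message]) -> Callable[[List[Message]], Message]:
    """Hypothetical stand-in for a real model: returns canned replies in order."""
    remaining = iter(replies)
    return lambda _history: next(remaining)


llm = scripted_llm([
    {"role": "assistant", "content": None,
     "tool_calls": [{"name": "retrieve_game",
                     "arguments": {"query": "Pokémon Gold and Silver"}}]},
    {"role": "assistant", "content": "Pokémon Gold and Silver was released in 1999."},
])
tools = {"retrieve_game": lambda query: [{"name": query, "year": 1999}]}

final = run_agent_loop(
    [{"role": "user", "content": "When was Pokémon Gold and Silver released?"}],
    llm,
    tools,
)
print(final["trace"])
```

The `max_turns` cap is an assumption added here as a guard against a model that keeps requesting tools indefinitely.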


In [9]:
# Demonstrate conversation state management using proper Agent class
print("💬 Demonstrating conversation state management...")
print("=" * 50)

# Create a new session
session_id = "demo_session"
# try:
#     udaplay_agent.reset_session(session_id)
# except SessionNotFoundError:
#     # Initialize the session first if it doesn't exist
#     udaplay_agent.memory.sessions[session_id] = []

# First query
print("🔍 Query 1: What is Gran Turismo?")
run1 = udaplay_agent.invoke("What is Gran Turismo?", session_id)
final_state1 = run1.get_final_state()
if final_state1 and "messages" in final_state1:
    ai_messages1 = [msg for msg in final_state1["messages"] if hasattr(msg, 'role') and msg.role == "assistant"]
    response1 = ai_messages1[-1].content if ai_messages1 else "No response"
    print(f"🤖 Response: {response1}")
else:
    print("🤖 Response: No response generated")
print()

# Follow-up query that references previous context
print("🔍 Query 2: What platform was it first released on?")
run2 = udaplay_agent.invoke("What platform was it first released on?", session_id)
final_state2 = run2.get_final_state()
if final_state2 and "messages" in final_state2:
    ai_messages2 = [msg for msg in final_state2["messages"] if hasattr(msg, 'role') and msg.role == "assistant"]
    response2 = ai_messages2[-1].content if ai_messages2 else "No response"
    print(f"🤖 Response: {response2}")
else:
    print("🤖 Response: No response generated")
print()

# Show conversation history using proper Agent methods
print("📚 Conversation History:")
session_runs = udaplay_agent.get_session_runs(session_id)
for i, run in enumerate(session_runs, 1):
    final_state = run.get_final_state()
    if final_state and "messages" in final_state:
        # Find user and assistant messages - handle both dict and object formats
        user_messages = []
        ai_messages = []
        for msg in final_state["messages"]:
            if hasattr(msg, 'role'):
                if msg.role == "user":
                    user_messages.append(msg)
                elif msg.role == "assistant":
                    ai_messages.append(msg)
            elif isinstance(msg, dict):
                if msg.get("role") == "user":
                    user_messages.append(msg)
                elif msg.get("role") == "assistant":
                    ai_messages.append(msg)
        
        if user_messages and ai_messages:
            # Extract content from user message
            user_msg = user_messages[-1]
            query = user_msg.content if hasattr(user_msg, 'content') else user_msg.get("content", "No query")
            
            # Extract content from AI message
            ai_msg = ai_messages[-1]
            response = ai_msg.content if hasattr(ai_msg, 'content') else ai_msg.get("content", "No response")
            
            print(f"  Turn {i}:")
            print(f"    Query: {query}")
            print(f"    Response: {response[:100]}...")
            print()

print("✅ Conversation state management demonstrated!")
print(f"📊 Total runs in session: {len(session_runs)}")
print(f"🔧 Workflow steps executed: {[snapshot.step_id for snapshot in run2.snapshots]}")


💬 Demonstrating conversation state management...
🔍 Query 1: What is Gran Turismo?
[StateMachine] Starting: __entry__
[StateMachine] Executing step: message_prep
[StateMachine] Executing step: llm_processor
[StateMachine] Executing step: tool_executor
[StateMachine] Executing step: llm_processor
[StateMachine] Executing step: tool_executor
[StateMachine] Executing step: llm_processor
[StateMachine] Executing step: tool_executor
[StateMachine] Executing step: llm_processor
[StateMachine] Terminating: __termination__
🤖 Response: Gran Turismo (GT) is a series of sim racing video games developed by Polyphony Digital, under the wing of Sony Computer Entertainment. The origins of the series can be traced back to 1992, started by Kazunori Yamauchi, an employee at Sony Computer Entertainment Japan at that time. The first Gran Turismo game was developed by a team named Polys Entertainment, which began with five members and eventually grew to seventeen[^1^].

The term "Gran Turismo" is Italian fo

In [11]:
# def game_web_search(question, max_results=5):
#     r = tavily_client.search(
#         query=question,
#         max_results=max_results,
#         include_answer=True
#     )
#     answer = r.get("answer")
#     results = [{"title": x.get("title"), "url": x.get("url")} for x in r.get("results", [])]
#     return {"answer": answer, "results": results}


# w = game_web_search("What platform was Gran Turismo launched on?", max_results=5)
# print("Answer:", w["answer"])
# print("Results shown:", len(w["results"]))

In [None]:
# Test the UdaPlay Agent with sample queries
print("🎮 Testing UdaPlay Agent with sample queries...")
print("=" * 60)

# Test queries
test_queries = [
    "When was Pokémon Gold and Silver released?",
    "Which one was the first 3D platformer Mario game?", 
    "Was Mortal Kombat X released for PlayStation 5?"
]

# Test each query using the proper Agent.invoke() method
test_session = "test_session"
for i, query in enumerate(test_queries, 1):
    print(f"\n🔍 Query {i}: {query}")
    print("-" * 40)
    
    try:
        # Use the proper Agent.invoke() method with StateMachine workflow
        run_result = udaplay_agent.invoke(query, test_session)
        
        # Get the final response from the run result
        final_state = run_result.get_final_state()
        if final_state and "messages" in final_state:
            # Print the agent's reasoning and tool usage
            print("\n🤔 Agent's Workflow:")
            
            # Extract tool usage from messages
            tool_usage_found = False
            for msg in final_state["messages"]:
                # Check if it's a tool message
                if hasattr(msg, 'role') and msg.role == "tool":
                    print(f"📌 Using tool: {getattr(msg, 'name', 'Unknown tool')}")
                    print(f"🔍 Tool Call ID: {getattr(msg, 'tool_call_id', 'N/A')}")
                    print(f"📝 Tool Output: {getattr(msg, 'content', 'No output')}\n")
                    tool_usage_found = True
                elif isinstance(msg, dict) and msg.get("role") == "tool":
                    print(f"📌 Using tool: {msg.get('name', 'Unknown tool')}")
                    print(f"🔍 Tool Call ID: {msg.get('tool_call_id', 'N/A')}")
                    print(f"📝 Tool Output: {msg.get('content', 'No output')}\n")
                    tool_usage_found = True
                # Check if it's an AI message with tool calls
                elif hasattr(msg, 'tool_calls') and msg.tool_calls:
                    for tool_call in msg.tool_calls:
                        print(f"🔧 Tool Call: {tool_call.function.name}")
                        print(f"🔍 Arguments: {tool_call.function.arguments}")
                        print(f"🆔 Call ID: {tool_call.id}\n")
                        tool_usage_found = True
                elif isinstance(msg, dict) and msg.get("tool_calls"):
                    for tool_call in msg["tool_calls"]:
                        print(f"🔧 Tool Call: {tool_call['function']['name']}")
                        print(f"🔍 Arguments: {tool_call['function']['arguments']}")
                        print(f"🆔 Call ID: {tool_call['id']}\n")
                        tool_usage_found = True
            
            if not tool_usage_found:
                print("No tool usage detected in this workflow")
            
            # Find and print the final response
            ai_messages = [msg for msg in final_state["messages"] 
                         if (hasattr(msg, 'role') and msg.role == "assistant") or
                            (isinstance(msg, dict) and msg.get("role") == "assistant")]
            
            if ai_messages:
                last_msg = ai_messages[-1]
                response_content = (last_msg.content if hasattr(last_msg, 'content') 
                                 else last_msg.get("content", "No response generated"))
                print("🎯 Final Answer:")
                print(response_content)
            else:
                print("❌ No final answer found")
        else:
            print("❌ No final state available")
            
    except Exception as e:
        print(f"❌ Error: {e}")
        import traceback
        traceback.print_exc()
    
    print("\n" + "=" * 60)

print("\n✅ Agent testing completed!")
print(f"📊 Session runs: {len(udaplay_agent.get_session_runs(test_session))}")
print(f"🔧 Agent workflow steps: {list(udaplay_agent.workflow.steps.keys())}")


🎮 Testing UdaPlay Agent with sample queries...

🔍 Query 1: When was Pokémon Gold and Silver released?
----------------------------------------
[StateMachine] Starting: __entry__
[StateMachine] Executing step: message_prep
[StateMachine] Executing step: llm_processor
[StateMachine] Executing step: tool_executor
[StateMachine] Executing step: llm_processor
[StateMachine] Terminating: __termination__

🤔 Agent's Workflow:
🔧 Tool Call: retrieve_game
🔍 Arguments: {
  "query": "When was Pokémon Gold and Silver released?"
}
🆔 Call ID: call_g6MOdpcBSeWMqIy9c9N8HGYf

📌 Using tool: retrieve_game
🔍 Tool Call ID: call_g6MOdpcBSeWMqIy9c9N8HGYf
📝 Tool Output: "[{'name': 'Pok\u00e9mon Gold and Silver', 'platform': 'Game Boy Color', 'year': 1999, 'publisher': 'Nintendo', 'score': 0.3226}, {'name': 'Pok\u00e9mon Ruby and Sapphire', 'platform': 'Game Boy Advance', 'year': 2002, 'publisher': 'Nintendo', 'score': 0}, {'name': 'Super Mario World', 'platform': 'Super Nintendo Entertainment System (SNES)', 'y

### (Optional) Advanced

In [13]:
# TODO: Update your agent with long-term memory
# TODO: Convert the agent to be a state machine, with the tools being pre-defined nodes
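
One possible starting point for the long-term memory TODO is to turn useful web search results into a payload that can be written to a Chroma collection. The sketch below is a sketch under stated assumptions: `build_memory_payload` and the `min_score` threshold are hypothetical, the input uses the same keys that `game_web_search` returns, and the scores in `sample` are made up for illustration.

```python
import hashlib
from typing import Any, Dict, List


def build_memory_payload(
    question: str,
    results: List[Dict[str, Any]],
    min_score: float = 0.5,
) -> Dict[str, List[Any]]:
    """Keep relevant web results and shape them as ids/documents/metadatas."""
    ids: List[str] = []
    documents: List[str] = []
    metadatas: List[Dict[str, Any]] = []
    for result in results:
        url = result.get("url", "")
        content = result.get("content", "")
        # Skip error placeholders, empty pages and low-relevance hits.
        if not url or not content or result.get("score", 0.0) < min_score:
            continue
        # Hash the URL so that storing the same page twice yields the same id.
        doc_id = "web-" + hashlib.sha256(url.encode("utf-8")).hexdigest()[:16]
        if doc_id in ids:
            continue
        ids.append(doc_id)
        documents.append(content)
        metadatas.append({
            "source": "web",
            "url": url,
            "title": result.get("title", ""),
            "question": question,
        })
    return {"ids": ids, "documents": documents, "metadatas": metadatas}


# Illustrative input; the scores are invented for this example.
sample = [
    {"title": "Gran Turismo (1997 video game) - Wikipedia",
     "url": "https://en.wikipedia.org/wiki/Gran_Turismo_(1997_video_game)",
     "content": "Gran Turismo is a 1997 sim racing video game.",
     "score": 0.9},
    {"title": "Low relevance", "url": "https://example.com/low",
     "content": "Unrelated text.", "score": 0.1},
    {"title": "Search Error", "url": "", "content": "Failed to search", "score": 0.0},
]
payload = build_memory_payload("What platform was Gran Turismo launched on?", sample)
print(len(payload["ids"]), payload["metadatas"][0]["url"])
```

When `payload["ids"]` is non-empty, it could be written with `collection.upsert(**payload)`. Note that `retrieve_game` reads `metadata['Name']` and similar keys, which these web entries do not have, so a separate collection (for example one obtained with `chroma_client.get_or_create_collection`, under a name of your choosing) is probably the safer place to store them.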