# [STARTER] Udaplay Project

## Part 02 - Agent

In this part of the project, you'll expose your vector database (VectorDB) to your agent as a tool.

You're building UdaPlay, an AI Research Agent for the video game industry. The agent will:
1. Answer questions using internal knowledge (RAG)
2. Search the web when needed
3. Maintain conversation state
4. Return structured outputs
5. Store useful information for future use

### Setup

In [1]:
# Import tools and libraries
from dotenv import load_dotenv
from tavily import TavilyClient
import os
import json
import re
from pydantic import BaseModel
from openai import OpenAI
import chromadb
from sentence_transformers import SentenceTransformer
from langchain.tools import tool



In [2]:
# Load environment variables
load_dotenv()

OPENAI_API_KEY = os.getenv("OPENAI_API_KEY")
TAVILY_API_KEY = os.getenv("TAVILY_API_KEY")

### Tools

Build at least three tools:
- retrieve_game: searches the vector DB
- evaluate_retrieval: assesses whether the retrieved results are sufficient to answer the question
- game_web_search: searches the web when the retrieved results are not sufficient
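
The three tools fit together as a retrieve, evaluate, then fall back pipeline. The sketch below illustrates that control flow with hypothetical stand-in functions (`fake_retrieve`, `fake_evaluate`, `fake_web_search`, and `answer_question` are illustrative names, and the sample record is placeholder data); the real tools call Chroma, an LLM judge, and Tavily instead.

```python
# Hypothetical stand-ins for the three tools; each mimics only the return shape.
def fake_retrieve(question: str) -> list[dict]:
    # Placeholder record, not taken from the project's dataset.
    return [{"Name": "Gran Turismo", "Platform": "PlayStation"}]

def fake_evaluate(question: str, docs: list[dict]) -> dict:
    # Toy rule: the docs count as useful if a game name appears in the question.
    useful = any(d["Name"].lower() in question.lower() for d in docs)
    return {"useful": useful, "description": "toy keyword check"}

def fake_web_search(question: str) -> dict:
    return {"answer": f"(web answer for: {question})", "results": []}

def answer_question(question, retrieve, evaluate, web_search) -> dict:
    """Use local results when the judge accepts them, otherwise go to the web."""
    docs = retrieve(question)
    report = evaluate(question, docs)
    if report["useful"]:
        return {"source": "local", "data": docs}
    return {"source": "web", "data": web_search(question)["answer"]}

local = answer_question("Tell me about Gran Turismo",
                        fake_retrieve, fake_evaluate, fake_web_search)
web = answer_question("Who founded Nintendo?",
                      fake_retrieve, fake_evaluate, fake_web_search)
print(local["source"], web["source"])
```

Passing the tools in as arguments keeps the routing logic independent of any particular backend, which is the same design the agent class uses later in this notebook.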


#### Retrieve Game Tool

In [3]:


# ------------------------------
# Embedding + Chroma setup
# ------------------------------
embedder = SentenceTransformer('all-MiniLM-L6-v2')
chroma_client = chromadb.PersistentClient(path="chroma_db_jupiter")

def embed(texts: list[str]):
    """Return list of embedding vectors."""
    vectors = embedder.encode(texts, convert_to_numpy=True)
    return vectors.tolist()

# ------------------------------
# RETRIEVE_GAME TOOL
# ------------------------------
@tool("retrieve_game", return_direct=False)
def retrieve_game(query: str, n_results: int = 3):
    """
    Semantic search: Finds the most relevant games from the vector DB.

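    Args:
        query (str): A question about the game industry.
        n_results (int): Maximum number of results to return (default 3).

    Returns:
        List[Dict]: Each result contains:
            - Platform (e.g., Game Boy, PS5, Xbox 360)
            - Name (name of the game)
            - YearOfRelease
            - Description
            - document (the stored document text)
        On failure, a one-element list of the form [{"error": "<message>"}].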
    """
    try:
        query_vector = embed([query])[0]
        collection = chroma_client.get_collection("games_collection_new")

        results = collection.query(
            query_embeddings=[query_vector],
            n_results=n_results,
            include=["documents", "metadatas"]
        )

        metadatas = results.get("metadatas", [[]])[0]
        documents = results.get("documents", [[]])[0]

        output = []
        for meta, doc in zip(metadatas, documents):
            output.append({
                "Platform": meta.get("Platform"),
                "Name": meta.get("Name"),
                "YearOfRelease": meta.get("YearOfRelease"),
                "Description": meta.get("Description"),
                "document": doc
            })

        return output

    except Exception as e:
        return [{"error": str(e)}]
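
`collection.query` returns one inner list per query embedding, which is why the tool indexes `[0]` before zipping metadata with documents. A minimal sketch of that unpacking on a hand-written dictionary follows; the sample records are made up for illustration, and `flatten_first_query` is a hypothetical helper, not a Chroma function.

```python
def flatten_first_query(results: dict) -> list[dict]:
    """Pair each metadata dict with its document, for the first query only."""
    metadatas = (results.get("metadatas") or [[]])[0]
    documents = (results.get("documents") or [[]])[0]
    return [{**meta, "document": doc} for meta, doc in zip(metadatas, documents)]

# Hand-written stand-in for a query result: one query, two hits.
sample = {
    "ids": [["001", "002"]],
    "metadatas": [[
        {"Name": "Game A", "Platform": "PS1"},
        {"Name": "Game B", "Platform": "N64"},
    ]],
    "documents": [["Text for game A", "Text for game B"]],
}

rows = flatten_first_query(sample)
print(rows[0]["Name"], "|", rows[0]["document"])
```

The `or [[]]` guard means an empty or missing field yields an empty list instead of raising an error.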


#### Evaluate Retrieval Tool

In [13]:

class EvaluationReport(BaseModel):
    useful: bool
    description: str

@tool("evaluate_retrieval", return_direct=False)
def evaluate_retrieval(question: str, retrieved_docs: list[dict]):
    """
    LLM judge to evaluate whether retrieved docs are sufficient.
    """

    prompt = f"""
You are an evaluation assistant. Your task is to decide whether the retrieved documents are sufficient to answer the user's question.

Question:
{question}

Retrieved Documents:
{json.dumps(retrieved_docs, indent=2)}

Return ONLY valid JSON in this format:

{{"useful": true, "description": "..."}}
"""

    client = OpenAI(api_key=OPENAI_API_KEY, base_url=os.getenv("OPENAI_BASE_URL"))

    resp = client.chat.completions.create(
        model=os.getenv("EVAL_MODEL", "gpt-4o-mini"),
        messages=[{"role": "user", "content": prompt}],
        temperature=0.0
    )

    raw = resp.choices[0].message.content

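    # --- JSON extraction ---
    # Try a direct parse first; fall back to the outermost {...} span in case
    # the model wrapped the JSON in extra text.
    try:
        parsed = json.loads(raw)
    except (json.JSONDecodeError, TypeError):
        match = re.search(r"\{[\s\S]*\}", raw or "")
        try:
            parsed = json.loads(match.group(0)) if match else None
        except json.JSONDecodeError:
            parsed = None
        if parsed is None:
            return {
                "useful": False,
                "description": f"Failed to extract JSON. Raw: {raw}"
            }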

    try:
        report = EvaluationReport.model_validate(parsed)
    except Exception as e:
        return {
            "useful": False,
            "description": f"Invalid JSON schema: {str(e)}. Raw: {parsed}"
        }

    return report.model_dump()
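
Models sometimes wrap the requested JSON in surrounding prose, so the judge falls back to a regular expression when a direct parse fails. The helper below is a standalone sketch of that fallback using only the standard library; `extract_json` is a hypothetical name and the sample strings are invented.

```python
import json
import re
from typing import Optional

def extract_json(raw: str) -> Optional[dict]:
    """Return a JSON object parsed from raw, or None if none can be extracted."""
    try:
        parsed = json.loads(raw)
    except json.JSONDecodeError:
        # Greedy match from the first "{" to the last "}" in the text.
        match = re.search(r"\{[\s\S]*\}", raw)
        if not match:
            return None
        try:
            parsed = json.loads(match.group(0))
        except json.JSONDecodeError:
            return None
    return parsed if isinstance(parsed, dict) else None

clean = extract_json('{"useful": true, "description": "ok"}')
wrapped = extract_json('Sure! {"useful": false, "description": "no date"} Hope that helps.')
missing = extract_json("no json here")
print(clean, wrapped, missing)
```

Because the pattern is greedy, text containing two separate JSON objects would be captured as one span and fail to parse; in that case the sketch returns `None` instead of guessing.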



#### Game Web Search Tool

In [10]:

@tool("game_web_search", return_direct=False)
def game_web_search(question: str, max_results: int = 3):
    """
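    Search the web with the Tavily client for a gaming-related question.

    Args:
        question (str): A question about the game industry.
        max_results (int): Maximum number of web results to return (default 3).

    Returns:
        Dict: "answer" (Tavily's generated answer, or None) and "results"
        (the list of web results). On failure, "error" holds the message.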
    """
    try:
        tavily = TavilyClient(api_key=TAVILY_API_KEY)
        resp = tavily.search(
            query=question,
            include_answer=True,
            max_results=max_results
        )
        return {
            "answer": resp.get("answer"),
            "results": resp.get("results")
        }
    except Exception as e:
        return {
            "answer": None,
            "results": [],
            "error": str(e)
        }
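
The agent later serializes the whole web response into its prompt. One possible refinement is to trim it to a compact context string first. The sketch below assumes each result item carries `title`, `url`, and `content` keys; treat those field names as an assumption to check against the response you actually receive. `format_web_results` is a hypothetical helper and the sample data is invented.

```python
def format_web_results(web: dict, max_chars: int = 300) -> str:
    """Build a compact, numbered context string from a search response."""
    lines = []
    if web.get("answer"):
        lines.append(f"Answer: {web['answer']}")
    for i, item in enumerate(web.get("results") or [], start=1):
        title = item.get("title", "untitled")
        url = item.get("url", "")
        snippet = (item.get("content") or "")[:max_chars]  # cap each snippet
        lines.append(f"[{i}] {title} ({url}): {snippet}")
    return "\n".join(lines)

# Invented response in the shape returned by game_web_search.
sample_web = {
    "answer": "Example answer.",
    "results": [
        {"title": "Example page", "url": "https://example.com", "content": "x" * 500},
    ],
}

context = format_web_results(sample_web, max_chars=20)
print(context)
```

Capping each snippet keeps the prompt size predictable when `max_results` is raised.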


### Agent

In [11]:
class AgentStateMachine:
    def __init__(self, retrieve_tool, evaluate_tool, web_tool, llm_tool):
        self.retrieve_tool = retrieve_tool
        self.evaluate_tool = evaluate_tool
        self.web_tool = web_tool
        self.llm_tool = llm_tool
        self.state = "start"
        self.last_retrieved = None
        self.last_evaluation = None

    def step(self, question):

        # START → RETRIEVE
        if self.state == "start":
            print("🔎 Retrieving from local DB...")
            results = self.retrieve_tool.run(question)   # invoke the tool via .run()
            self.last_retrieved = results
            self.state = "retrieved"
            return {"state": self.state, "retrieved": results}

        # RETRIEVED → EVALUATE
        elif self.state == "retrieved":
            print("🧠 Evaluating retrieval quality...")
            evaluation = self.evaluate_tool.run({
                "question": question,
                "retrieved_docs": self.last_retrieved
            })                                           # multiple arguments are passed as a dict
            self.last_evaluation = evaluation

            if evaluation["useful"]:
                self.state = "done"
                return {
                    "state": "done",
                    "source": "local",
                    "data": self.last_retrieved,
                    "explanation": evaluation["description"]
                }
            else:
                self.state = "need_web"
                return {"state": "need_web"}

        # NEED_WEB → FINAL ANSWER
        elif self.state == "need_web":
            print("🌐 Local insufficient → searching the web...")
            web = self.web_tool.run(question)             # invoke the tool via .run()

            prompt = f"""
Local DB was insufficient.

Question: {question}

Local:
{json.dumps(self.last_retrieved, indent=2)}

Web:
{json.dumps(web, indent=2)}

Provide a concise, accurate answer.
"""

            answer = self.llm_tool(prompt)                # llm_tool is a plain callable, not a LangChain tool
            self.state = "done"

            return {
                "state": "done",
                "source": "web",
                "data": answer,
                "explanation": self.last_evaluation["description"]
            }

    def run(self, question):
        self.state = "start"

        s1 = self.step(question)
        if self.state == "done": return s1

        s2 = self.step(question)
        if self.state == "done": return s2

        return self.step(question)
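
`run` calls `step` at most three times because the longest path through the machine is start, retrieved, need_web, done. An alternative sketch loops until the terminal state, with a step cap as a safety net in case a future state never reaches `done`. `TinyMachine` and `run_until_done` are hypothetical stand-ins so the example runs without the real tools.

```python
class TinyMachine:
    """Toy machine with the same states as AgentStateMachine but no real tools."""

    def __init__(self, local_is_useful: bool):
        self.local_is_useful = local_is_useful
        self.state = "start"

    def step(self, question: str) -> dict:
        if self.state == "start":
            self.state = "retrieved"
            return {"state": self.state}
        if self.state == "retrieved":
            if self.local_is_useful:
                self.state = "done"
                return {"state": "done", "source": "local"}
            self.state = "need_web"
            return {"state": self.state}
        # need_web: the real agent would call the web tool and the LLM here.
        self.state = "done"
        return {"state": "done", "source": "web"}


def run_until_done(machine, question: str, max_steps: int = 10) -> dict:
    """Step the machine from "start" until it reports "done"."""
    machine.state = "start"
    for _ in range(max_steps):
        result = machine.step(question)
        if machine.state == "done":
            return result
    raise RuntimeError(f"no terminal state after {max_steps} steps")

local_result = run_until_done(TinyMachine(local_is_useful=True), "any question")
web_result = run_until_done(TinyMachine(local_is_useful=False), "any question")
print(local_result["source"], web_result["source"])
```

With a loop, adding a new intermediate state does not require editing the driver, only the `step` method.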



In [15]:
client = OpenAI(
    api_key=OPENAI_API_KEY,
    base_url=os.getenv("OPENAI_BASE_URL")
)
agent = AgentStateMachine(
    retrieve_tool=retrieve_game,
    evaluate_tool=evaluate_retrieval,
    web_tool=game_web_search,
    llm_tool=lambda prompt: client.chat.completions.create(
        model=os.getenv("ANSWER_MODEL", "gpt-4.1-mini"),
        messages=[
            {"role": "user", "content": prompt}
        ],
        max_tokens=300
    ).choices[0].message.content
)

queries = [
    "When Pokémon Gold and Silver was released?",
    "Which one was the first 3D platformer Mario game?",
    "Was Mortal Kombat X released for Playstation 5",
    "What are some popular racing games released for PlayStation?",
    "Tell me about the game that introduced the character 'Solid Snake'.",
    "Tell me something"
]

for q in queries:
    print("\n========================================")
    print("QUESTION:", q)
    print("========================================\n")

    result = agent.run(q)

    print('\n=== RESULT ===')
    print('Source:', result['source'])
    print('Explanation:', result['explanation'])
    print('Data:', result['data'])
    print('\n')


QUESTION: When Pokémon Gold and Silver was released?

🔎 Retrieving from local DB...
🧠 Evaluating retrieval quality...

===== FULL RAW RESPONSE =====
ChatCompletion(id='chatcmpl-CcdvVNfSUk1oHAP68j0GcxXFN48uX', choices=[Choice(finish_reason='stop', index=0, logprobs=None, message=ChatCompletionMessage(content='{"useful": true, "description": "The retrieved documents provide the release year of Pokémon Gold and Silver, which is 1999, thus answering the user\'s question."}', refusal=None, role='assistant', annotations=[], audio=None, function_call=None, tool_calls=None))], created=1763325977, model='gpt-4o-mini-2024-07-18', object='chat.completion', service_tier='default', system_fingerprint='fp_560af6e559', usage=CompletionUsage(completion_tokens=36, prompt_tokens=405, total_tokens=441, completion_tokens_details=CompletionTokensDetails(accepted_prediction_tokens=0, audio_tokens=0, reasoning_tokens=0, rejected_prediction_tokens=0), prompt_tokens_details=PromptTokensDetails(audio_t

### (Optional) Advanced

In [None]:
# TODO: Update your agent with long-term memory
# TODO: Convert the agent to be a state machine, with the tools being pre-defined nodes