<a href="https://colab.research.google.com/github/ntrajic/AgenticRoomReservation/blob/main/ADK_Eval_Guide.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# 🤖 Building Your First AI Agent with the Google ADK

Welcome! This notebook is your first step into the exciting world of **AI Agents**. An agent is more than just a chatbot; it's a smart program that uses a Large Language Model (LLM) like Gemini to **reason, plan, and use tools** to accomplish tasks.

In this guide, we will build a simple "Product Research Assistant" agent. This agent will be able to answer questions about product details and prices by using specialized tools we provide it.

We'll cover three key stages of professional agent development:
1.  **💻 Implementation**: Defining the agent's logic and tools.
2.  **🔧 Unit Testing**: Verifying that each tool works correctly in isolation.
3.  **🧪 Evaluation**: Testing the entire agent to see if it can reason correctly and choose the right tool for a given query.

Let's get started!

---
## Author

Hi, I'm Qingyue (Annie) Wang, a Developer Advocate and AI Engineer at **Google**. I'm passionate about helping developers build amazing things with AI and cloud technologies.

If you have questions about this notebook, feel free to reach out on [LinkedIn](https://www.linkedin.com/in/qingyuewang/) or [X (formerly Twitter)](https://twitter.com/qingyuewang).

```
 (\__/)
 (•ㅅ•)
 /づ  📚      Enjoy learning about AI Agents!
```

---
### 🎁 🛑 Important Prerequisite: Set Up Your Environment! 🛑 🎁
-----------------------------------------------------------------------------

You will need a **Google AI API Key** to run this notebook.

👉 **Get Your Key HERE**: [https://codelabs.developers.google.com/onramp/instructions#1](https://codelabs.developers.google.com/onramp/instructions#1)


-----------------------------------------------------------------------------
```
 ⬆️  ⬆️  ⬆️  ⬆️  ⬆️  ⬆️  ⬆️  ⬆️  ⬆️  ⬆️  ⬆️  ⬆️  ⬆️  ⬆️  ⬆️
   /\_/\     /\_/\     /\_/\      /\_/\      /\_/\
  ( ^_^ )   ( -.- )   ( >_< )   ( =^.^= )   ( o_o )
```

## 1. Setup: Installing Libraries

First things first, we need to install the necessary Python libraries. We'll use the `pip` command to do this.

* `google-adk`: This is the **Agent Development Kit (ADK)**. It provides the core building blocks for creating agents, like the `LlmAgent` and `FunctionTool` classes we'll use later.
* `google-genai`: This library allows our Python code to communicate with the Google Gemini family of models, which will be the "brain" of our agent.
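Once the install cell has run, a quick check like the one below can confirm that both packages are visible to the notebook's Python environment. It is a minimal sketch that relies only on the standard library; `installed_version` is a hypothetical helper name, not something the ADK provides.

```python
from importlib import metadata
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of `package`, or None if it is missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


# Report on the two packages this notebook depends on.
for name in ("google-adk", "google-genai"):
    print(f"{name}: {installed_version(name) or 'not installed'}")
```

If either line reports `not installed`, re-run the install cell before continuing.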

In [None]:
# @title 1. Install Libraries
!pip install google-adk google-genai -q


---
## 2. Configuration: Setting Your API Key

To use Google's AI models, you need an API key. This key is like a password that proves to Google that you have permission to use its services. 🔑

This code block does two things:
1.  It uses `getpass` to create a secure input field for your API key. This prevents your key from being displayed on the screen or saved in the notebook's output.
2.  It sets the API key as an **environment variable** (`os.environ`). This is a standard and secure practice that allows the Google libraries to automatically find and use the key without you having to paste it into your code directly.

> **Get your API Key**: You can get a free API key from [Google AI for Developers](https://makersuite.google.com/app/apikey).
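If you re-run the notebook often, you may prefer to reuse a key that is already in the environment instead of typing it each time. The sketch below shows one way to do that. `resolve_api_key` and its parameters are hypothetical names introduced for illustration; only `getpass.getpass`, `os.environ`, and the `GOOGLE_API_KEY` variable name come from this notebook.

```python
import getpass
import os
from typing import Callable, MutableMapping


def resolve_api_key(
    env: MutableMapping[str, str] = os.environ,
    prompt: Callable[[str], str] = getpass.getpass,
    var_name: str = "GOOGLE_API_KEY",
) -> str:
    """Reuse a key already in the environment; otherwise prompt for one."""
    key = env.get(var_name, "").strip()
    if not key:
        # Nothing stored yet, so ask the user without echoing the input.
        key = prompt("   API key: ").strip()
    if not key:
        raise ValueError("A Google AI API key is required to run this notebook.")
    env[var_name] = key
    return key
```

Calling `resolve_api_key()` with no arguments prompts only when `GOOGLE_API_KEY` is unset or empty. The `env` and `prompt` parameters exist so the function can be exercised with a plain dictionary and a fake prompt.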

In [None]:
# @title 2. Enter Your API Key
import getpass
import os

# Securely get the API key from the user
print("🔑 Enter your Google AI API key to continue.")
google_api_key = getpass.getpass("   API key: ").strip()

# Check if the key was provided and set it as an environment variable
if not google_api_key:
  raise ValueError("A Google AI API key is required to run this notebook.")

os.environ['GOOGLE_API_KEY'] = google_api_key
print("✅ API key configured.")

🔑 Enter your Google AI API key to continue.
   API key: ··········
✅ API key configured.


---
## 3. Implementation: Defining the Agent and Its Tools

This is the core of our project! We'll define the agent's capabilities here. The `%%writefile agent.py` command is a special "magic" command in notebooks that saves the content of this cell into a new file named `agent.py`. This helps keep our code organized.

Let's break down what's inside `agent.py`:

### Agent Tools 🛠️
A **Tool** is a function that an agent can call to get information or perform an action. Here, we create two simple Python functions that act as our tools:
- `get_product_details()`: Simulates looking up product information from a database.
- `get_product_price()`: Simulates looking up a product's price.

We then wrap these Python functions inside `FunctionTool(...)`. This ADK class inspects our function, understands its name and arguments (`product_name: str`), and makes it usable by the LLM.
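To get a feel for what "inspecting a function" means, here is a small sketch that uses Python's standard `inspect` module to read a function's name, docstring, and annotated parameters. It is not the ADK's implementation, and `describe_function` is a hypothetical helper. It only illustrates the kind of metadata a wrapper can recover from a well-annotated function.

```python
import inspect


def describe_function(func) -> dict:
    """Collect the name, docstring, and annotated parameters of a function."""
    signature = inspect.signature(func)
    return {
        "name": func.__name__,
        "description": inspect.getdoc(func) or "",
        "parameters": {
            # Use the type's short name when available (e.g. "str").
            name: getattr(param.annotation, "__name__", str(param.annotation))
            for name, param in signature.parameters.items()
        },
    }


def get_product_price(product_name: str) -> str:
    """Gathers the price of a product."""
    prices = {"smartphone": "$999", "laptop": "$1299", "headphones": "$199"}
    return prices.get(product_name.lower(), "Product price not found.")


print(describe_function(get_product_price))
```

This is why clear function names, type hints, and docstrings matter: they are the information available for describing the tool to the model.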

### The Agent's Brain 🧠
The `LlmAgent` is the central component. We configure it with:
- **`model`**: We specify `gemini-2.5-flash`, a fast and powerful model perfect for this kind of task.
- **`tools`**: We give the agent the list of tools it's allowed to use.
- **`instruction`**: This is the most critical part! The instruction (or "system prompt") is a set of rules and guidelines that tells the agent how to behave. It's how we steer the LLM's reasoning. We explicitly tell it **when** to use each tool based on keywords in the user's query.
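
The rules in the instruction can also be written down as plain Python, which is handy for predicting which tool a query *should* trigger when you design evaluation cases. This is only a sketch of the intended behavior: the LLM interprets the instruction rather than running substring checks, and `expected_tool` and the keyword tuples are hypothetical names, not part of the ADK.

```python
from typing import Optional

PRICE_KEYWORDS = ("price", "cost", "how much")
DETAIL_KEYWORDS = ("details", "about")


def expected_tool(query: str) -> Optional[str]:
    """Return the tool name the instruction's rules point to, or None."""
    text = query.lower()
    # Price is checked first, so it wins when both kinds of keyword appear.
    if any(keyword in text for keyword in PRICE_KEYWORDS):
        return "get_product_price"
    if any(keyword in text for keyword in DETAIL_KEYWORDS):
        return "get_product_details"
    return None


for query in (
    "How much does the smartphone cost?",
    "Tell me about the laptop",
    "Give me details and the price of the headphones",
):
    print(f"{query!r} -> {expected_tool(query)}")
```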

In [None]:
# @title 3. Define the Agent and Tools (agent.py)
%%writefile agent.py
import textwrap
from google.adk.agents import LlmAgent
from google.adk.tools import FunctionTool

# --- Tool Definitions ---
# These are simple Python functions that our agent can call.
# In a real application, these might connect to a database or a live API.

def get_product_details(product_name: str) -> str:
    """Gathers basic details about a product."""
    details = {
        "smartphone": "A cutting-edge smartphone with advanced features.",
        "laptop": "A high-performance laptop for work and play.",
        "headphones": "Wireless headphones with noise cancellation.",
    }
    return details.get(product_name.lower(), "Product details not found.")

def get_product_price(product_name: str) -> str:
    """Gathers the price of a product."""
    prices = {"smartphone": "$999", "laptop": "$1299", "headphones": "$199"}
    return prices.get(product_name.lower(), "Product price not found.")

# --- Tool Objects ---
# We wrap our functions in the `FunctionTool` class so the Agent can use them.
get_product_details_tool = FunctionTool(get_product_details)
get_product_price_tool = FunctionTool(get_product_price)

# --- Agent Definition ---
# These instructions are the agent's guide. They tell the LLM how to behave
# and when to use the tools we've provided.
AGENT_INSTRUCTIONS = textwrap.dedent("""
    You are a product research assistant.
    - If the user asks for "price", "cost", or "how much", you MUST use the `get_product_price` tool.
    - If the user asks for "details" or "about", you MUST use the `get_product_details` tool.
    - If a query mentions both price and details, you MUST prioritize the `get_product_price` tool.
    - Respond only with the direct output from the tool.
""")

# We create the agent, providing the model, instructions, and tools.
root_agent = LlmAgent(
    name="ProductAgent",
    model="gemini-2.5-flash",
    instruction=AGENT_INSTRUCTIONS,
    tools=[get_product_details_tool, get_product_price_tool],
)

Writing agent.py


---
## 4. Unit Testing: Verifying Your Tools

Before we test the whole agent, it's a best practice to test its individual components. This is called **Unit Testing**. If the tools don't work, the agent can't work!

We use Python's built-in `unittest` framework to write a few simple checks:
- We test that our functions return the correct data when a product is found (e.g., `get_product_price("smartphone")` should be `"$999"`).
- We test the "unhappy path": what happens when a product is *not* found (e.g., `get_product_price("toaster")` should return the "not found" message).
- We also check that the functions are case-insensitive, as we designed them to be.

Seeing `OK` in the output means all our tool functions are behaving exactly as we expect. ✅
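
As the product list grows, the same checks can be written as a table of inputs and expected outputs. The sketch below uses `unittest`'s `subTest` so that each row is reported separately on failure. It includes a stand-in copy of `get_product_price` so that it runs on its own, and `TestPriceTable` and `run_price_table_tests` are hypothetical names.

```python
import unittest


def get_product_price(product_name: str) -> str:
    """Stand-in copy of the tool from agent.py so this sketch runs on its own."""
    prices = {"smartphone": "$999", "laptop": "$1299", "headphones": "$199"}
    return prices.get(product_name.lower(), "Product price not found.")


class TestPriceTable(unittest.TestCase):
    """Table-driven checks: one row per (input, expected output) pair."""

    CASES = [
        ("smartphone", "$999"),
        ("LAPTOP", "$1299"),
        ("Headphones", "$199"),
        ("toaster", "Product price not found."),
    ]

    def test_price_lookup(self):
        for product, expected in self.CASES:
            # subTest labels each row, so a failure names the offending input.
            with self.subTest(product=product):
                self.assertEqual(get_product_price(product), expected)


def run_price_table_tests() -> unittest.TestResult:
    """Run only this test class and return the result object."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(TestPriceTable)
    return unittest.TextTestRunner(verbosity=0).run(suite)
```

Adding a new product then means adding one row to `CASES` rather than writing another test method.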

In [None]:
# @title 4. Run Unit Tests for Tools
import unittest
# We import the functions directly from the agent.py file we just created.
from agent import get_product_price, get_product_details

class TestProductTools(unittest.TestCase):
    """A test suite for the agent's individual tools."""
    def test_get_price_found(self):
        self.assertEqual(get_product_price("smartphone"), "$999")
        self.assertEqual(get_product_price("LAPTOP"), "$1299")

    def test_get_price_not_found(self):
        self.assertEqual(get_product_price("toaster"), "Product price not found.")

    def test_get_details_found(self):
        self.assertEqual(get_product_details("headphones"), "Wireless headphones with noise cancellation.")

    def test_get_details_not_found(self):
        self.assertEqual(get_product_details("watch"), "Product details not found.")

print("--- Running Tool Unit Tests ---")
# This command runs the tests defined in the class above.
unittest.main(argv=['first-arg-is-ignored'], exit=False, verbosity=2)

test_get_details_found (__main__.TestProductTools.test_get_details_found) ... ok
test_get_details_not_found (__main__.TestProductTools.test_get_details_not_found) ... ok
test_get_price_found (__main__.TestProductTools.test_get_price_found) ... ok
test_get_price_not_found (__main__.TestProductTools.test_get_price_not_found) ... ok

----------------------------------------------------------------------
Ran 4 tests in 0.006s

OK


--- Running Tool Unit Tests ---


<unittest.main.TestProgram at 0x7f7cbdff3090>

---
## 5. Evaluation: Creating Test Cases for the Agent

Now for the exciting part: testing the agent's reasoning. While unit tests check if a function works, **evaluation** checks if the agent *chooses the right function* and produces the correct final answer.

We create a JSON file (`evaluation.test.json`) to define our test cases, modeled on the layout that ADK evaluation sets use. Each entry in `eval_cases` has an `eval_id` and a `conversation`, and each conversation turn holds:
- **`user_content`**: The question we will ask the agent.
- **`final_response`**: The *exact* answer we expect the agent to give.

This file acts as our "answer key." We'll use it in the next step to automatically grade our agent's performance.
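
To make the nesting concrete, here is a small sketch that parses one case in this format and flattens it into `(eval_id, query, expected)` tuples. `extract_cases` is a hypothetical helper, and it reads only the first turn of each conversation, which is all this notebook's cases contain.

```python
import json

SAMPLE_EVAL_SET = """
{
  "eval_set_id": "product_agent_eval_set",
  "eval_cases": [
    {
      "eval_id": "get_smartphone_price",
      "conversation": [
        {
          "user_content": { "parts": [{ "text": "How much does the smartphone cost?" }] },
          "final_response": { "parts": [{ "text": "$999" }] }
        }
      ]
    }
  ]
}
"""


def extract_cases(eval_set: dict) -> list:
    """Flatten an eval set into (eval_id, query, expected) tuples (first turn only)."""
    cases = []
    for case in eval_set["eval_cases"]:
        turn = case["conversation"][0]
        query = turn["user_content"]["parts"][0]["text"]
        expected = turn["final_response"]["parts"][0]["text"]
        cases.append((case["eval_id"], query, expected))
    return cases


print(extract_cases(json.loads(SAMPLE_EVAL_SET)))
```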

In [None]:
# @title 5. Create Agent Evaluation Set (JSON)
%%writefile evaluation.test.json
{
  "eval_set_id": "product_agent_eval_set",
  "eval_cases": [
    {
      "eval_id": "get_smartphone_price",
      "conversation": [
        {
          "user_content": { "parts": [{ "text": "How much does the smartphone cost?" }] },
          "final_response": { "parts": [{ "text": "$999" }] }
        }
      ]
    },
    {
      "eval_id": "get_laptop_details",
      "conversation": [
        {
          "user_content": { "parts": [{ "text": "Tell me about the laptop" }] },
          "final_response": { "parts": [{ "text": "A high-performance laptop for work and play." }] }
        }
      ]
    },
    {
      "eval_id": "get_unknown_product_price",
      "conversation": [
        {
          "user_content": { "parts": [{ "text": "What is the price of a toaster?" }] },
          "final_response": { "parts": [{ "text": "Product price not found." }] }
        }
      ]
    }
  ]
}

Writing evaluation.test.json


---
## 6. Execution: Running the Evaluation

This final code block brings everything together. It runs our evaluation by systematically comparing the agent's performance against the "answer key" we created in `evaluation.test.json`.

Here’s the process:
1.  **Load Data**: It opens and reads the `evaluation.test.json` file.
2.  **Setup Runner**: It uses the `Runner` from the ADK, which is the engine for executing agent interactions.
3.  **Loop Through Cases**: The code iterates through each test case from the JSON file.
4.  **Run Query**: For each case, it sends the `user_content` (the query) to our agent.
5.  **Compare Results**: It takes the agent's final text response (`actual`) and compares it directly to the `final_response` from our file (`expected`).
6.  **Display Report**: Finally, it prints a clean report showing which tests `PASSED` and which `FAILED`.

If all tests pass, congratulations! 🎉 You've successfully built, tested, and evaluated your first AI agent.
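
Before spending API calls, it can help to see the compare-and-report steps in isolation. The sketch below runs the same exact-match grading against a fake agent, so no model is involved. `grade_cases`, `fake_agent`, and the flattened case dictionaries are hypothetical and exist only for this illustration.

```python
from typing import Callable, Dict, List


def grade_cases(cases: List[Dict[str, str]], ask: Callable[[str], str]) -> List[Dict[str, object]]:
    """Send each query to `ask` and compare the answer to the expected text."""
    results = []
    for case in cases:
        actual = ask(case["query"])
        results.append({
            "eval_id": case["eval_id"],
            # Exact match after trimming surrounding whitespace.
            "passed": case["expected"].strip() == actual.strip(),
            "expected": case["expected"],
            "actual": actual,
        })
    return results


def fake_agent(query: str) -> str:
    """Hypothetical stand-in for the real agent; no model call is made."""
    return "$999" if "smartphone" in query.lower() else "Product price not found."


cases = [
    {"eval_id": "get_smartphone_price",
     "query": "How much does the smartphone cost?",
     "expected": "$999"},
    {"eval_id": "get_laptop_details",
     "query": "Tell me about the laptop",
     "expected": "A high-performance laptop for work and play."},
]

for result in grade_cases(cases, fake_agent):
    print(result["eval_id"], "PASSED" if result["passed"] else "FAILED")
```

The second case fails on purpose, because the fake agent knows nothing about laptop details. That makes it easy to see what a `FAILED` row looks like.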

In [None]:
# @title 6. Run the Agent Evaluation
import json
import uuid
from google.adk import Runner
from google.adk.sessions import InMemorySessionService
from google.genai.types import Content, Part
from IPython.display import Markdown

# Import the agent we defined in agent.py
from agent import root_agent

# --- Setup Dependencies ---
# The Runner needs a session service to manage conversations.
# InMemorySessionService is a simple one for local testing.
session_service = InMemorySessionService()
my_user_id = f"user-{uuid.uuid4()}" # A unique ID for the user

# --- Helper function to run a single query ---
async def run_agent_query(agent, query: str, session, user_id: str):
    runner = Runner(agent=agent, session_service=session_service, app_name=agent.name)
    final_response = ""
    try:
        # Run the agent and wait for events
        async for event in runner.run_async(
            user_id=user_id,
            session_id=session.id,
            new_message=Content(parts=[Part(text=query)], role="user")
        ):
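            # We only care about the final text response for this test.
            # Guard against final events that carry no content or parts.
            if event.is_final_response() and event.content and event.content.parts:
                final_response = event.content.parts[0].text or ""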
    except Exception as e:
        final_response = f"An error occurred: {e}"
    return final_response

# --- Main Evaluation Loop ---
async def run_agent_evaluation(agent, eval_cases: list, user_id: str):
    results = []
    print(f"--- 🧪 Starting Evaluation for Agent: {agent.name} ---")
    for case in eval_cases:
        query = case["conversation"][0]["user_content"]["parts"][0]["text"]
        expected = case["conversation"][0]["final_response"]["parts"][0]["text"]
        eval_id = case["eval_id"]

        print(f"\nRunning Case: '{eval_id}'...")
        print(f"Query: '{query}'")

        # Create a new session for each evaluation case to ensure they are independent
        session = await session_service.create_session(app_name=agent.name, user_id=user_id)
        actual = await run_agent_query(agent, query, session, user_id)

        # The core of the evaluation: is the actual response what we expected?
        passed = (expected.strip() == actual.strip())

        results.append({
            "eval_id": eval_id,
            "passed": passed,
            "expected": expected,
            "actual": actual
        })

    print("\n--- ✅ Evaluation Complete ---")
    return results

# --- Execute the Evaluation ---
with open("evaluation.test.json") as f:
    eval_data = json.load(f)

results = await run_agent_evaluation(root_agent, eval_data["eval_cases"], user_id=my_user_id)

# --- Display the Results ---
print("\n" + "="*40)
print("          EVALUATION RESULTS")
print("="*40)
for result in results:
    status = "✅ PASSED" if result["passed"] else "❌ FAILED"
    print(f"\n🧪 Eval ID:  {result['eval_id']} ({status})")
    print(f"  - Expected: {result['expected']}")
    print(f"  - Actual:   {result['actual']}")
    print("—" * 40)

--- 🧪 Starting Evaluation for Agent: ProductAgent ---

Running Case: 'get_smartphone_price'...
Query: 'How much does the smartphone cost?'





Running Case: 'get_laptop_details'...
Query: 'Tell me about the laptop'





Running Case: 'get_unknown_product_price'...
Query: 'What is the price of a toaster?'





--- ✅ Evaluation Complete ---

          EVALUATION RESULTS

🧪 Eval ID:  get_smartphone_price (✅ PASSED)
  - Expected: $999
  - Actual:   $999

————————————————————————————————————————

🧪 Eval ID:  get_laptop_details (✅ PASSED)
  - Expected: A high-performance laptop for work and play.
  - Actual:   A high-performance laptop for work and play.

————————————————————————————————————————

🧪 Eval ID:  get_unknown_product_price (✅ PASSED)
  - Expected: Product price not found.
  - Actual:   Product price not found.

————————————————————————————————————————


---
## 🎓 Next Steps & Further Learning

Excellent work! You've learned the fundamentals of the agent development lifecycle. From here, you can explore more advanced topics:

- **More Complex Tools**: Your tools could interact with real-world APIs, databases, or files.
- **Chaining Agents**: You can have one agent (a "manager") that delegates tasks to other, more specialized agents.
- **Advanced Evaluation**: Instead of exact-match, you can use an LLM to grade the agent's responses for quality and correctness.
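
A useful intermediate step between exact match and LLM-based grading is a more forgiving string comparison. The sketch below is not LLM grading. It is a simpler, swapped-in technique that normalizes case, whitespace, and trailing punctuation, then checks whether the expected text appears in the response. `normalize` and `lenient_match` are hypothetical names.

```python
import re


def normalize(text: str) -> str:
    """Lowercase, collapse whitespace, and drop trailing sentence punctuation."""
    collapsed = re.sub(r"\s+", " ", text.lower()).strip()
    return collapsed.rstrip(".!? ")


def lenient_match(expected: str, actual: str) -> bool:
    """True when the normalized expected text appears inside the normalized actual text."""
    return normalize(expected) in normalize(actual)


print(lenient_match("$999", "The smartphone costs $999."))
print(lenient_match("Product price not found.", "product price  not found"))
print(lenient_match("$999", "$1299"))
```

Substring matching has limits: an expected value of `$99` would also match a response containing `$999`, so treat this as a first filter rather than a final grader.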

To continue your learning journey, these YouTube searches are a good starting point:

* [YouTube search: Google AI Agent Development Kit (ADK) tutorial](https://www.youtube.com/results?search_query=Google+AI+Agent+Development+Kit+tutorial)
* [YouTube search: LLM Agents Explained](https://www.youtube.com/results?search_query=LLM+Agents+Explained)