# Use Vertex Session Service for Free with Agent Engine, ADK, and Vertex Express Mode

This tutorial extends the [Quickstart example](https://google.github.io/adk-docs/get-started/quickstart/) for [Agent Development Kit](https://google.github.io/adk-docs/get-started/).

We'll build a **Weather Bot agent**: a single agent that can look up the weather. We'll then connect it to free Vertex AI services such as `VertexAiSessionService`, which lets users save their sessions on Vertex AI.

**What is ADK Again?**

As a reminder, ADK is a Python framework designed to streamline the development of applications powered by Large Language Models (LLMs). It offers robust building blocks for creating agents that can reason, plan, utilize tools, interact dynamically with users, and collaborate effectively within a team.

**In this tutorial, you will master:**

*   ✅ **Tool Definition & Usage:** Crafting Python functions (`tools`) that grant agents specific abilities (like fetching data) and instructing agents on how to use them effectively.
*   ✅ **Agent Engine APIs:** Creating an Agent Engine and using its capabilities, such as `VertexAiSessionService`, from a local agent for free.

**End State Expectation:**

By completing this tutorial, you will have built a functional weather agent that can utilize the agent engine building blocks.

**Prerequisites:**

*   ✅ **Solid understanding of Python programming.**
*   ✅ **Familiarity with Large Language Models (LLMs), APIs, and the concept of agents.**
*   ❗ **Crucially: Completion of the ADK Quickstart tutorial(s) or equivalent foundational knowledge of ADK basics (Agent, Runner, SessionService, basic Tool usage).** This tutorial builds directly upon those concepts.
*   ✅ **An API key** from Vertex Express Mode (sign-up is covered below).


---

**Ready to build your agent? Let's dive in!**

In [None]:
# Setup and Installation
# Install ADK

!pip install google-adk -q

print("Installation complete.")

In [None]:
# @title Import necessary libraries
import os
import asyncio
from google import adk
from google.adk.agents import Agent
from google.adk.sessions import VertexAiSessionService
from google.adk.memory import VertexAiMemoryBankService
from google.adk.runners import Runner
from google.genai import types # For creating message Content/Parts

import warnings
# Ignore all warnings
warnings.filterwarnings("ignore")

# Configure API Keys (Replace with your actual keys!)
We can use Vertex AI services such as `VertexAiSessionService` and access models for free through Vertex Express Mode. Sign up with your Google account here: https://cloud.google.com/vertex-ai/generative-ai/docs/start/express-mode/overview to unlock free access to certain services and models without adding a credit card.

Vertex Express Mode combined with ADK allows you to create advanced agents for free. You get access to certain Gemini models and Agent Engine services such as Session and Memory, all without a billing account.

In [None]:
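# Vertex Express Mode API key (get it from Vertex Express Mode)
# The placeholder must match the string checked in the verification step below
easygcp_api_key = "INSERT API KEY HERE" #@param {type:"string"}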
os.environ["GOOGLE_API_KEY"] = easygcp_api_key
# Set vertex to true
os.environ["GOOGLE_GENAI_USE_VERTEXAI"] = "True"

# --- Verify Keys (Optional Check) ---
print("API Keys Set:")
print(f"Google API Key set: {'Yes' if os.environ.get('GOOGLE_API_KEY') and os.environ['GOOGLE_API_KEY'] != 'INSERT API KEY HERE' else 'No (REPLACE PLACEHOLDER!)'}")

When creating an agent, we need to choose the model we want. The Vertex Express Mode API key allows the use of several Gemini models for free. You can use any of the models listed here: https://cloud.google.com/vertex-ai/generative-ai/docs/start/express-mode/overview#models

In [None]:
# --- Define Model Constants for easier use ---

# Use a model that is allowlisted for Vertex Express Mode; we will use Gemini 2.5 Pro
MODEL_GEMINI_2_5_PRO = "gemini-2.5-pro"

print("\nEnvironment configured.")

---

## Step 1: Your First Agent \- Basic Weather Lookup

Let's begin by building the fundamental component of our Weather Bot: a single agent capable of performing a specific task – looking up weather information. This involves creating two core pieces:

1. **A Tool:** A Python function that equips the agent with the *ability* to fetch weather data.  
2. **An Agent:** The AI "brain" that understands the user's request, knows it has a weather tool, and decides when and how to use it.

---

**1\. Define the Tool (`get_weather`)**

In ADK, **Tools** are the building blocks that give agents concrete capabilities beyond just text generation. They are typically regular Python functions that perform specific actions, like calling an API, querying a database, or performing calculations.

Our first tool will provide a *mock* weather report. This allows us to focus on the agent structure without needing external API keys yet. Later, you could easily swap this mock function with one that calls a real weather service.

**Key Concept: Docstrings are Crucial\!** The agent's LLM relies heavily on the function's **docstring** to understand:

* *What* the tool does.  
* *When* to use it.  
* *What arguments* it requires (`city: str`).  
* *What information* it returns.

**Best Practice:** Write clear, descriptive, and accurate docstrings for your tools. This is essential for the LLM to use the tool correctly.
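
To make this concrete, the sketch below uses Python's standard `inspect` module to show the kind of metadata that can be read from a tool function at runtime: its name, its typed parameters, and its docstring. It only illustrates what information is available; it is not a description of how ADK processes tools internally. The names `get_time_zone` and `describe_tool` are hypothetical and exist only for this example.

```python
import inspect


def get_time_zone(city: str) -> dict:
    """Returns the time zone for a specified city.

    Args:
        city (str): The name of the city (e.g., "London").

    Returns:
        dict: A dictionary with a 'status' key and a 'time_zone' key.
    """
    # Hypothetical stub: always answers as if the city were London.
    return {"status": "success", "time_zone": "Europe/London"}


def describe_tool(func) -> dict:
    """Collects the name, typed parameters, and docstring summary of a function."""
    signature = inspect.signature(func)
    return {
        "name": func.__name__,
        "parameters": {
            name: param.annotation.__name__
            for name, param in signature.parameters.items()
        },
        # The first docstring line is the short summary of what the tool does.
        "summary": inspect.getdoc(func).splitlines()[0],
    }


tool_info = describe_tool(get_time_zone)
print(tool_info)
```

If the docstring were missing or vague, the summary would carry little information, which is why a precise first line and a clear `Args` section matter.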

In [None]:
# @title Define the get_weather Tool
def get_weather(city: str) -> dict:
    """Retrieves the current weather report for a specified city.

    Args:
        city (str): The name of the city (e.g., "New York", "London", "Tokyo").

    Returns:
        dict: A dictionary containing the weather information.
              Includes a 'status' key ('success' or 'error').
              If 'success', includes a 'report' key with weather details.
              If 'error', includes an 'error_message' key.
    """
    print(f"--- Tool: get_weather called for city: {city} ---") # Log tool execution
    city_normalized = city.lower().replace(" ", "") # Basic normalization

    # Mock weather data
    mock_weather_db = {
        "newyork": {"status": "success", "report": "The weather in New York is sunny with a temperature of 25°C."},
        "london": {"status": "success", "report": "It's cloudy in London with a temperature of 15°C."},
        "tokyo": {"status": "success", "report": "Tokyo is experiencing light rain and a temperature of 18°C."},
    }

    if city_normalized in mock_weather_db:
        return mock_weather_db[city_normalized]
    else:
        return {"status": "error", "error_message": f"Sorry, I don't have weather information for '{city}'."}

# Example tool usage (optional test)
print(get_weather("New York"))
print(get_weather("Paris"))

---

**2\. Define the Agent (`weather_agent`)**

Now, let's create the **Agent** itself. An `Agent` in ADK orchestrates the interaction between the user, the LLM, and the available tools.

We configure it with several key parameters:

* `name`: A unique identifier for this agent (e.g., "weather\_agent\_v1").  
* `model`: Specifies which LLM to use (e.g., `MODEL_GEMINI_2_5_PRO`). We'll start with a specific Gemini model.  
* `description`: A concise summary of the agent's overall purpose. This becomes crucial in multi-agent setups, where other agents need to decide whether to delegate tasks to *this* agent.  
* `instruction`: Detailed guidance for the LLM on how to behave, its persona, its goals, and specifically *how and when* to utilize its assigned `tools`.  
* `tools`: A list containing the actual Python tool functions the agent is allowed to use (e.g., `[get_weather]`).

**Best Practice:** Provide clear and specific `instruction` prompts. The more detailed the instructions, the better the LLM can understand its role and how to use its tools effectively. Be explicit about error handling if needed.

**Best Practice:** Choose descriptive `name` and `description` values. These are used internally by ADK and are vital for features like automatic delegation.

In [None]:
# @title Define the Weather Agent

weather_agent = Agent(
    name="weather_agent_v1",
    model=MODEL_GEMINI_2_5_PRO,
    description="Provides weather information for specific cities.",
    instruction="You are a helpful weather assistant. "
                "When the user asks for the weather in a specific city, "
                "use the 'get_weather' tool to find the information. "
                "If the tool returns an error, inform the user politely. "
                "If the tool is successful, present the weather report clearly.",
    tools=[get_weather, adk.tools.preload_memory_tool.PreloadMemoryTool()], # Pass the function directly, plus the memory preload tool
)

print(f"Agent '{weather_agent.name}' created using model '{MODEL_GEMINI_2_5_PRO}'.")

---

**3\. Define the Agent Engine**

An Agent Engine is a set of services that enables developers to deploy, manage, and scale AI agents in production. Agent Engine handles the infrastructure to scale agents in production so you can focus on creating applications. In Vertex Express mode, we have access to certain Agent Engine services for free, mainly the Session and Memory services, which allow for context management. Each session and memory is associated with an Agent Engine.

We configure our Agent Engine with several key parameters:

* `displayName`: A human-readable name for this Agent Engine (e.g., "Express-Mode-Agent-Engine").  
* `description`: A concise summary of the Agent Engine's overall purpose. This can help you remember what it does.

In [None]:
# @title Create the Agent Engine
from google import genai
import json

# Create an Agent Engine with the GenAI SDK
client = genai.Client(vertexai=True)._api_client
string_response = client.request(
    http_method='POST',
    path='reasoningEngines',
    request_dict={"displayName": "Express-Mode-Agent-Engine", "description": "Test Agent Engine demo"},
).body
response = json.loads(string_response)
response

In [None]:
APP_NAME="/".join(response['name'].split("/")[:6])
APP_NAME

In [None]:
APP_ID=APP_NAME.split('/')[-1]
APP_ID
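
The two cells above derive the Agent Engine resource name and ID by slicing the `name` field of the response. The sketch below walks through that slicing on a made-up value. It assumes the returned name has the shape `projects/.../locations/.../reasoningEngines/.../operations/...`, which is what keeping the first six segments implies; the project number, location, and IDs are hypothetical.

```python
# Hypothetical operation name, made up for illustration only.
operation_name = (
    "projects/123456/locations/us-central1/"
    "reasoningEngines/7890/operations/4242"
)

segments = operation_name.split("/")

# The first six segments form the Agent Engine resource name.
app_name = "/".join(segments[:6])

# The last segment of that resource name is the Agent Engine ID.
app_id = app_name.split("/")[-1]

print(app_name)  # projects/123456/locations/us-central1/reasoningEngines/7890
print(app_id)    # 7890
```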

---

**4\. Setup Runner and Session Service**

To manage conversations and execute the agent, we need two more components:

* `SessionService`: Responsible for managing conversation history and state for different users and sessions. The `VertexAiSessionService` is an implementation that stores everything in Vertex AI, allowing for persistent session storage. It keeps track of the messages exchanged.  
* `Runner`: The engine that orchestrates the interaction flow. It takes user input, routes it to the appropriate agent, manages calls to the LLM and tools based on the agent's logic, handles session updates via the `SessionService`, and yields events representing the progress of the interaction.

In [None]:
# @title Create Our Initial Session

# Create Vertex AI Session through ADK
session_service = VertexAiSessionService(agent_engine_id=APP_ID)
memory_service = VertexAiMemoryBankService(agent_engine_id=APP_ID)

USER_ID = "INSERT_USER_ID" #@param {type:"string"}
session = await session_service.create_session(app_name=APP_ID, user_id=USER_ID)
SESSION_ID = session.id
session

In [None]:
# @title Create an Agent Runner

# Connect with ADK. ADK will also use the Vertex Express Mode API key to generate content
print(f"Session created: App='{APP_ID}', User='{USER_ID}', Session='{SESSION_ID}'")
# --- Runner ---
# Key Concept: Runner orchestrates the agent execution loop.
runner = Runner(
    agent=weather_agent, # The agent we want to run
    app_name=APP_ID,   # Associates runs with our app
    session_service=session_service, # Uses vertex session service
    memory_service=memory_service # Uses vertex memory service
)
print(f"Runner created for agent '{runner.agent.name}'.")

---

**5\. Interact with the Agent**

We need a way to send messages to our agent and receive its responses. Since LLM calls and tool executions can take time, ADK's `Runner` operates asynchronously.

We'll define an `async` helper function (`call_agent_async`) that:

1. Takes a user query string.  
2. Packages it into the ADK `Content` format.  
3. Calls `runner.run_async`, providing the user/session context and the new message.  
4. Iterates through the **Events** yielded by the runner. Events represent steps in the agent's execution (e.g., tool call requested, tool result received, intermediate LLM thought, final response).  
5. Identifies and prints the **final response** event using `event.is_final_response()`.
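
The event loop in steps 3 to 5 can be sketched without ADK by using a toy async generator. Everything here is a hypothetical stand-in: `ToyEvent` and `toy_run_async` only mimic the shape of the real events and of `runner.run_async`, so the iteration pattern can be seen in isolation.

```python
import asyncio
from dataclasses import dataclass


@dataclass
class ToyEvent:
    """Hypothetical stand-in for an ADK event; not the real class."""
    author: str
    text: str
    final: bool = False

    def is_final_response(self) -> bool:
        return self.final


async def toy_run_async(query: str):
    """Yields a fixed sequence of events, ending with a final response."""
    yield ToyEvent(author="agent", text="calling get_weather")
    yield ToyEvent(author="tool", text="tool result received")
    yield ToyEvent(author="agent", text=f"Answer to: {query}", final=True)


async def final_text(query: str) -> str:
    """Iterates over the events and returns the text of the final one."""
    result = "No final response."
    async for event in toy_run_async(query):
        if event.is_final_response():
            result = event.text
            break  # Stop once the concluding event is found
    return result


# asyncio.run suits a plain script; in a notebook, use `await final_text(...)`.
answer = asyncio.run(final_text("Weather in London?"))
print(answer)
```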

**Why `async`?** Interactions with LLMs and potentially tools (like external APIs) are I/O-bound operations. Using `asyncio` allows the program to handle these operations efficiently without blocking execution.
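
As a small illustration of the non-blocking behaviour, the sketch below simulates two I/O-bound lookups with `asyncio.sleep` and runs them concurrently with `asyncio.gather`. The `slow_lookup` function is a hypothetical placeholder for a real API call, and the delays are arbitrary.

```python
import asyncio
import time


async def slow_lookup(city: str, delay: float) -> str:
    """Simulates an I/O-bound call (such as an API request) with a sleep."""
    await asyncio.sleep(delay)
    return f"report for {city}"


async def main() -> list:
    # Both lookups wait at the same time instead of one after the other.
    return await asyncio.gather(
        slow_lookup("London", 0.2),
        slow_lookup("Tokyo", 0.2),
    )


start = time.perf_counter()
# asyncio.run suits a plain script; in a notebook, use `await main()`.
reports = asyncio.run(main())
elapsed = time.perf_counter() - start

print(reports)
print(f"elapsed: {elapsed:.2f}s")
```

Because the two waits overlap, the total elapsed time should be close to the longest single delay rather than the sum of both.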

In [None]:
# @title Define Agent Interaction Function

from google.genai import types # For creating message Content/Parts

async def call_agent_async(query: str, runner, user_id, session_id):
  """Sends a query to the agent and prints the final response."""
  print(f"\n>>> User Query: {query}")

  # Prepare the user's message in ADK format
  content = types.Content(role='user', parts=[types.Part(text=query)])

  final_response_text = "Agent did not produce a final response." # Default

  # Key Concept: run_async executes the agent logic and yields Events.
  # We iterate through events to find the final answer.
  async for event in runner.run_async(user_id=user_id, session_id=session_id, new_message=content):
      # The line below prints *all* events during execution; comment it out for quieter output
      print(f"  [Event] Author: {event.author}, Type: {type(event).__name__}, Final: {event.is_final_response()}, Content: {event.content}")

      # Key Concept: is_final_response() marks the concluding message for the turn.
      if event.is_final_response():
          if event.content and event.content.parts:
             # Assuming text response in the first part
             final_response_text = event.content.parts[0].text
          elif event.actions and event.actions.escalate: # Handle potential errors/escalations
             final_response_text = f"Agent escalated: {event.error_message or 'No specific message.'}"
          # Add more checks here if needed (e.g., specific error codes)
          break # Stop processing events once the final response is found

  print(f"<<< Agent Response: {final_response_text}")

---

**6\. Run the Conversation**

Finally, let's test our setup by sending a few queries to the agent. We wrap our `async` calls in a main `async` function and run it using `await`.

Watch the output:

* See the user queries.  
* Notice the `--- Tool: get_weather called... ---` logs when the agent uses the tool.  
* Observe the agent's final responses labelled with "Agent Response", including how it handles the case where weather data isn't available (for Paris).
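
Top-level `await` works in Colab and Jupyter because an event loop is already running. In a plain Python script there is none, so the usual approach is to start one with `asyncio.run`. The sketch below shows that pattern with a hypothetical `ask` coroutine standing in for `call_agent_async`; it simply echoes the query instead of calling an agent.

```python
import asyncio


async def ask(query: str) -> str:
    """Hypothetical stand-in for call_agent_async that echoes the query."""
    await asyncio.sleep(0)  # Yield control once, as a real I/O call would
    return f"echo: {query}"


async def run_conversation() -> list:
    """Sends the queries one after another, preserving turn order."""
    replies = []
    for query in ["Weather in London?", "How about Paris?"]:
        replies.append(await ask(query))
    return replies


# In a plain script there is no running event loop, so start one here.
replies = asyncio.run(run_conversation())
print(replies)
```

The turns are awaited sequentially on purpose: each query in a conversation depends on the session state left by the previous one.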

In [None]:
# @title Run the Initial Conversation

# We need an async function to await our interaction helper
async def run_conversation():
    await call_agent_async("What is the weather like in London?",
                                       runner=runner,
                                       user_id=USER_ID,
                                       session_id=SESSION_ID)

    await call_agent_async("How about Paris?",
                                       runner=runner,
                                       user_id=USER_ID,
                                       session_id=SESSION_ID) # Expecting the tool's error message

    await call_agent_async("Tell me the weather in New York",
                                       runner=runner,
                                       user_id=USER_ID,
                                       session_id=SESSION_ID)
    await call_agent_async("I prefer the weather in New York, that sounds nicer than the weather in London",
                                       runner=runner,
                                       user_id=USER_ID,
                                       session_id=SESSION_ID)
    await call_agent_async("What cities did I ask you about previously?",
                                       runner=runner,
                                       user_id=USER_ID,
                                       session_id=SESSION_ID)

# Execute the conversation using await in an async context (like Colab/Jupyter)
await run_conversation()

---

Congratulations\! You've successfully built and interacted with your first ADK agent, and used the Vertex Session Service for free!

---

**7\. Test out Agent Memory**

Let's see if the agent remembers our preferences from the previous session.
- Here we create a new session, then ask the agent about something we talked about in the previous session. In this example, we ask about our weather preferences after saying we prefer New York weather in the previous conversation.
- The agent should use the Vertex memory service to retrieve relevant details about the user, then use them in its responses.

In [None]:
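# @title Create a Memory Based on the Previous Session

# Re-fetch the previous session so that it includes the conversation events
completed_session = await session_service.get_session(
    app_name=APP_ID, user_id=USER_ID, session_id=SESSION_ID
)
# add_session_to_memory is a coroutine, so it must be awaited
await memory_service.add_session_to_memory(completed_session)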

In [None]:
# @title Test the Agent Memory

# Create a new session, and let's see if the agent remembers our preferences based on our user id
# Use APP_ID here to match the app_name given to the first session and the Runner
session = await session_service.create_session(app_name=APP_ID, user_id=USER_ID)
SESSION_ID = session.id

print(f"New Session created: App='{APP_ID}', User='{USER_ID}', Session='{SESSION_ID}'")

await call_agent_async("What weather do I prefer?",
                                       runner=runner,
                                       user_id=USER_ID,
                                       session_id=SESSION_ID)