# Intro to Building and Deploying an Agent with Agent Engine in Vertex AI
       


## Overview

### Agent Engine in Vertex AI

[Agent Engine](https://cloud.google.com/vertex-ai/generative-ai/docs/agent-engine/overview) is a managed service that helps you build and deploy agents. It lets you choose how much reasoning to delegate to the LLM and how much to handle in custom code. Agent Engine integrates closely with the Python SDK for the Gemini model in Vertex AI, and it manages prompts, agents, and examples in a modular way. It is compatible with LangChain, LlamaIndex, and other Python frameworks.


### Objectives

In this tutorial, you will learn how to build and deploy an agent (model, tools, and reasoning) using the Vertex AI SDK for Python.

You'll build and deploy an agent that uses the Gemini model, Python functions as tools, and LangChain for orchestration.

You will complete the following tasks:

- Install the Vertex AI SDK for Python
- Use the Vertex AI SDK to build components of a simple agent
- Test your agent locally before deploying
- Deploy and test your agent on Vertex AI
- Customize each layer of your agent (model, tools, orchestration)

## Getting Started


### Install Vertex AI SDK for Python

Install the latest version of the Vertex AI SDK for Python as well as extra dependencies related to Agent Engine and LangChain:

In [1]:
%pip install --upgrade --quiet \
    "google-cloud-aiplatform[agent_engines,langchain]" \
    cloudpickle==3.0.0 \
    "pydantic>=2.10" \
    requests

ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
apache-beam 2.46.0 requires cloudpickle~=2.2.1, but you have cloudpickle 3.0.0 which is incompatible.
apache-beam 2.46.0 requires pyarrow<10.0.0,>=3.0.0, but you have pyarrow 19.0.1 which is incompatible.
asl 0.1 requires google-cloud-aiplatform==1.85.0, but you have google-cloud-aiplatform 1.97.0 which is incompatible.
asl 0.1 requires google-genai==1.7.0, but you have google-genai 1.19.0 which is incompatible.
asl 0.1 requires langchain-core==0.3.47, but you have langchain-core 0.3.63 which is incompatible.
asl 0.1 requires langchain-google-vertexai==2.0.15, but you have langchain-google-vertexai 2.0.24 which is incompatible.
asl 0.1 requires pydantic==2.9.2, but you have pydantic 2.11.5 which is incompatible.
asl 0.1 requires pyyaml==5.3.1, but you have pyyaml 6.0.2 which is incompatible.
datasets 2.14.5 r

### Restart current runtime

To use the newly installed packages in this Jupyter runtime, you must restart the runtime. You can do this by running the cell below, which will restart the current kernel.

In [2]:
# Restart kernel after installs so that your environment can access the new packages
import IPython

app = IPython.Application.instance()
app.kernel.do_shutdown(True)

{'status': 'ok', 'restart': True}

<div class="alert alert-block alert-warning">
<b>⚠️ The kernel is going to restart. Please wait until it is finished before continuing to the next step. ⚠️</b>
</div>


In [1]:
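import vertexai

# Read the active project from the gcloud configuration (IPython shell syntax)
PROJECT_ID = !(gcloud config get-value project)
PROJECT_ID = PROJECT_ID[0]

# Staging bucket for deployment artifacts; this assumes that a Cloud Storage
# bucket named after the project already exists
BUCKET_NAME = PROJECT_ID
STAGING_BUCKET = f"gs://{BUCKET_NAME}"
LOCATION = "us-central1"

# Initialize the Vertex AI SDK with the project, region, and staging bucket
vertexai.init(
    project=PROJECT_ID, location=LOCATION, staging_bucket=STAGING_BUCKET
)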

## Example: Build and deploy an agent

### Import libraries

In [2]:
from vertexai import agent_engines
from vertexai.preview.reasoning_engines import LangchainAgent

### Define model

As you construct your agent from the bottom up, the first component deals with which generative model you want to use in your agent.

<img width="40%" src="https://storage.googleapis.com/github-repo/generative-ai/gemini/agent-engine/images/agent-stack-1.png" alt="Components of an agent in Agent Engine on Vertex AI" />

Here you'll use the Gemini 2.0 Flash model:

In [3]:
model = "gemini-2.0-flash"

### Define Python functions (tools)

The second component of your agent includes tools and functions, which enable the generative model to interact with external systems, databases, document stores, and other APIs so that the model can get the most up-to-date information or take action with those systems.

<img width="40%" src="https://storage.googleapis.com/github-repo/generative-ai/gemini/agent-engine/images/agent-stack-2.png" alt="Components of an agent in Agent Engine on Vertex AI" />

In this example, you'll define a function called `get_exchange_rate` that uses the `requests` library to retrieve real-time currency exchange information from an API:

In [4]:
def get_exchange_rate(
    currency_from: str = "USD",
    currency_to: str = "EUR",
    currency_date: str = "latest",
):
    """Retrieves the exchange rate between two currencies on a specified date."""
    import requests

    response = requests.get(
        f"https://api.frankfurter.app/{currency_date}",
        params={"from": currency_from, "to": currency_to},
    )
    return response.json()

Test the function with sample inputs to ensure that it's working as expected:

In [5]:
get_exchange_rate(currency_from="USD", currency_to="SEK")

{'amount': 1.0, 'base': 'USD', 'date': '2025-06-11', 'rates': {'SEK': 9.6077}}
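The function returns a plain dictionary, so calling code can read the numeric rate directly. The following is a minimal sketch that assumes the payload shape shown in the output above. `extract_rate` is a hypothetical helper introduced for illustration and is not part of the notebook or of any API:

```python
def extract_rate(payload: dict, currency_to: str) -> float:
    """Return the rate for `currency_to` from a payload shaped like the output above.

    Hypothetical helper: it assumes a top-level "rates" mapping of
    currency code to rate.
    """
    rates = payload.get("rates", {})
    if currency_to not in rates:
        raise KeyError(f"No rate for {currency_to!r} in response: {payload}")
    return float(rates[currency_to])


# Sample payload copied from the cell output above
sample = {"amount": 1.0, "base": "USD", "date": "2025-06-11", "rates": {"SEK": 9.6077}}
print(extract_rate(sample, "SEK"))
```

Raising an error when the currency is missing makes a malformed or unexpected response visible immediately instead of returning `None`.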

### Define agent

The third component of your agent is a reasoning layer, which lets your agent use the tools that you provided to help the end user achieve a higher-level goal.

<img width="40%" src="https://storage.googleapis.com/github-repo/generative-ai/gemini/agent-engine/images/agent-stack-3.png" alt="Components of an agent in Agent Engine on Vertex AI" />

If you used Gemini and Function Calling on their own, without a reasoning layer, your application code would have to call the functions and APIs itself. You would also need to implement retries and additional logic to make that code resilient to failures and malformed requests.
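To make that burden concrete, here is a rough sketch of the kind of dispatch code you would otherwise write by hand. It assumes the model's function call arrives as a dictionary with a `name` and a JSON-encoded `arguments` string, which is the shape that appears in the agent output later in this notebook. `dispatch_function_call` and `fake_exchange_rate` are hypothetical names, and the stub stands in for a real tool so that the example runs without network access:

```python
import json
import time


def dispatch_function_call(function_call, tools, max_attempts=3, delay_s=0.0):
    """Run the tool named in a model's function call, retrying on failure."""
    name = function_call["name"]
    if name not in tools:
        raise ValueError(f"Model requested unknown tool: {name!r}")

    # The arguments are assumed to arrive as a JSON string
    try:
        args = json.loads(function_call["arguments"])
    except json.JSONDecodeError as exc:
        raise ValueError(f"Malformed arguments for {name!r}") from exc

    last_error = None
    for attempt in range(1, max_attempts + 1):
        try:
            return tools[name](**args)
        except Exception as exc:  # broad on purpose: any tool failure is retried
            last_error = exc
            if attempt < max_attempts:
                time.sleep(delay_s)
    raise RuntimeError(f"{name!r} failed after {max_attempts} attempts") from last_error


def fake_exchange_rate(currency_from="USD", currency_to="EUR"):
    """Stub tool used only for this illustration."""
    return {"base": currency_from, "rates": {currency_to: 1.0}}


call = {
    "name": "get_exchange_rate",
    "arguments": '{"currency_from": "USD", "currency_to": "INR"}',
}
result = dispatch_function_call(call, {"get_exchange_rate": fake_exchange_rate})
print(result)
```

Even this small version has to validate the tool name, parse arguments, and bound the retries. It still does not feed the tool result back to the model, which is part of what the reasoning layer handles for you.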

Here, you'll use the LangChain agent template provided in the Vertex AI SDK for Agent Engine, which brings together the model, tools, and reasoning that you've built up so far:

In [6]:
agent = LangchainAgent(
    model=model,
    tools=[get_exchange_rate],
    agent_executor_kwargs={"return_intermediate_steps": True},
)

Next, test the agent locally to confirm that it behaves as expected before you deploy it.

### Test your agent locally

With all of the core components of your agent in place, you can send a prompt to your agent using `.query` to test that it's working as expected. Because the agent was created with `return_intermediate_steps` set to `True`, the response includes the intermediate steps that the agent performed between the input prompt and the final output. In the default mode, the agent processes your input and returns the **entire agent output in a single response when complete**:

In [7]:
agent.query(
    input="What's the exchange rate from US dollars to Indian currency latest ?"
)

{'input': "What's the exchange rate from US dollars to Indian currency latest ?",
 'output': 'The current exchange rate from USD to INR is 85.54.\n',
 'intermediate_steps': [[{'lc': 1,
    'type': 'constructor',
    'id': ['langchain', 'schema', 'agent', 'ToolAgentAction'],
    'kwargs': {'tool': 'get_exchange_rate',
     'tool_input': {'currency_from': 'USD', 'currency_to': 'INR'},
     'log': "\nInvoking: `get_exchange_rate` with `{'currency_from': 'USD', 'currency_to': 'INR'}`\n\n\n",
     'type': 'AgentActionMessageLog',
     'message_log': [{'lc': 1,
       'type': 'constructor',
       'id': ['langchain', 'schema', 'messages', 'AIMessageChunk'],
       'kwargs': {'content': '',
        'additional_kwargs': {'function_call': {'name': 'get_exchange_rate',
          'arguments': '{"currency_from": "USD", "currency_to": "INR"}'}},
        'response_metadata': {'safety_ratings': [],
         'usage_metadata': {},
         'finish_reason': 'STOP',
         'model_name': 'gemini-2.0-fla
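The nested response can be hard to scan by eye. As a sketch, the helper below pulls the tool name and arguments out of each intermediate step. It assumes, based on the output above, that each step is a list whose first element is the serialized action with `tool` and `tool_input` under `kwargs`. `list_tool_calls` is a hypothetical helper, and the sample response is trimmed to only the fields that it reads:

```python
def list_tool_calls(response: dict) -> list:
    """Return (tool_name, tool_input) pairs from a query response."""
    calls = []
    for step in response.get("intermediate_steps", []):
        action = step[0]  # assumed: the first element of a step is the action
        kwargs = action.get("kwargs", {})
        calls.append((kwargs.get("tool"), kwargs.get("tool_input")))
    return calls


# Trimmed sample based on the output above
sample_response = {
    "input": "What's the exchange rate from US dollars to Indian currency latest ?",
    "output": "The current exchange rate from USD to INR is 85.54.\n",
    "intermediate_steps": [
        [
            {
                "type": "constructor",
                "kwargs": {
                    "tool": "get_exchange_rate",
                    "tool_input": {"currency_from": "USD", "currency_to": "INR"},
                },
            }
        ]
    ],
}
print(list_tool_calls(sample_response))
```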

In addition to the default query mode, the `.stream_query` method allows you to **see the agent's intermediate steps and final output from the chain**.

Instead of waiting for the agent to complete all sub-tasks, the agent sends back the response in **chunks as it's being generated**:

In [8]:
message_types = {"actions": "Action", "messages": "Message", "output": "Output"}
for chunk in agent.stream_query(
    input="What's the exchange rate from US dollars to Indian currency latest ?"
):
    for key, label in message_types.items():
        if key in chunk:
            print("\n------\n")
            print(f"{label}:")
            print()
            print(chunk[key])


------

Action:

[{'lc': 1, 'type': 'constructor', 'id': ['langchain', 'schema', 'agent', 'ToolAgentAction'], 'kwargs': {'tool': 'get_exchange_rate', 'tool_input': {'currency_to': 'INR', 'currency_from': 'USD'}, 'log': "\nInvoking: `get_exchange_rate` with `{'currency_to': 'INR', 'currency_from': 'USD'}`\n\n\n", 'type': 'AgentActionMessageLog', 'message_log': [{'lc': 1, 'type': 'constructor', 'id': ['langchain', 'schema', 'messages', 'AIMessageChunk'], 'kwargs': {'content': '', 'additional_kwargs': {'function_call': {'name': 'get_exchange_rate', 'arguments': '{"currency_to": "INR", "currency_from": "USD"}'}}, 'response_metadata': {'safety_ratings': [], 'usage_metadata': {}, 'finish_reason': 'STOP', 'model_name': 'gemini-2.0-flash'}, 'type': 'AIMessageChunk', 'id': 'run--7fad9f95-decb-4730-bdf6-f223727935ca', 'tool_calls': [{'name': 'get_exchange_rate', 'args': {'currency_to': 'INR', 'currency_from': 'USD'}, 'id': 'e5a9e8db-2622-4aa0-b838-b5cb27c2b7d1', 'type': 'tool_call'}], 'usage_me

This allows you to observe the agent's actions in real time (such as function calls and intermediate steps), which is helpful for debugging or for providing live updates to the end user.
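For user-facing updates, you would typically turn each chunk into a short status line rather than printing it raw. The sketch below assumes the chunk layout seen in the output above: an `actions` list of serialized actions, and an `output` key that holds the final answer as text. `describe_chunk` is a hypothetical helper, and the sample chunks are trimmed stand-ins for real streamed data:

```python
def describe_chunk(chunk: dict) -> list:
    """Turn one streamed chunk into short, human-readable status lines."""
    updates = []
    for action in chunk.get("actions", []):
        kwargs = action.get("kwargs", {})
        updates.append(f"Calling {kwargs.get('tool')} with {kwargs.get('tool_input')}")
    if "output" in chunk:
        # Assumed: the final answer is plain text under "output"
        updates.append(f"Answer: {str(chunk['output']).strip()}")
    return updates


# Trimmed stand-ins for the chunks that stream_query yields
sample_chunks = [
    {
        "actions": [
            {
                "kwargs": {
                    "tool": "get_exchange_rate",
                    "tool_input": {"currency_from": "USD", "currency_to": "INR"},
                }
            }
        ]
    },
    {"output": "The current exchange rate from USD to INR is 85.54.\n"},
]
for sample_chunk in sample_chunks:
    for line in describe_chunk(sample_chunk):
        print(line)
```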

### Deploy your agent on Vertex AI

Now that you've specified a model, tools, and reasoning for your agent and tested it out, you're ready to deploy your agent as a remote service in Vertex AI!

<img width="40%" src="https://storage.googleapis.com/github-repo/generative-ai/gemini/agent-engine/images/agent-stack-4.png" alt="Components of an agent in Agent Engine on Vertex AI" />

Re-create the agent so that the deployed instance carries no state from the local tests in the previous cells:

In [9]:
agent = LangchainAgent(
    model=model,
    tools=[get_exchange_rate],
)

Now you're ready to deploy your agent to Agent Engine in Vertex AI by calling `agent_engines.create()` along with:

1. The instance of your agent class
2. The Python packages and versions that your agent requires at runtime, similar to how you would define packages and versions in a `requirements.txt` file.

In [10]:
remote_agent = agent_engines.create(
    agent,
    requirements=[
        "google-cloud-aiplatform[agent_engines,langchain]",
        "cloudpickle==3.0.0",
        "pydantic>=2.10",
        "requests",
    ],
    display_name="Currency Exchange Agent",
)

Identified the following requirements: {'cloudpickle': '3.1.1', 'google-cloud-aiplatform': '1.97.0', 'pydantic': '2.11.5'}
The following requirements are incompatible: {'cloudpickle==3.1.1 (required: ==3.0.0)'}
The final list of requirements: ['google-cloud-aiplatform[agent_engines,langchain]', 'cloudpickle==3.0.0', 'pydantic>=2.10', 'requests']
Using bucket sanjana-sandbox-012024
Wrote to gs://sanjana-sandbox-012024/agent_engine/agent_engine.pkl
Writing to gs://sanjana-sandbox-012024/agent_engine/requirements.txt
Creating in-memory tarfile of extra_packages
Writing to gs://sanjana-sandbox-012024/agent_engine/dependencies.tar.gz
Creating AgentEngine
Create AgentEngine backing LRO: projects/27138301346/locations/us-central1/reasoningEngines/6291643028645871616/operations/2357353857664679936
View progress and logs at https://console.cloud.google.com/logs/query?project=sanjana-sandbox-012024
AgentEngine created. Resource name: projects/27138301346/locations/us-central1/reasoningEngines/62
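The deployment log above flags a mismatch between the locally installed `cloudpickle` (3.1.1) and the pinned requirement (`==3.0.0`). You can catch this kind of drift before deploying. The sketch below is one possible check, not the SDK's own logic. `find_pin_mismatches` is a hypothetical helper that only inspects exact `==` pins:

```python
def find_pin_mismatches(requirements: list, installed: dict) -> list:
    """Report exact pins in `requirements` that differ from `installed` versions."""
    mismatches = []
    for requirement in requirements:
        if "==" not in requirement:
            continue  # this sketch ignores ranges such as ">=" and unpinned names
        name, pinned = requirement.split("==", 1)
        name = name.split("[", 1)[0].strip()  # drop extras such as "[langchain]"
        pinned = pinned.strip()
        found = installed.get(name)
        if found is not None and found != pinned:
            mismatches.append(f"{name}=={found} (required: =={pinned})")
    return mismatches


# Values copied from the deployment cell and its log output
requirements = [
    "google-cloud-aiplatform[agent_engines,langchain]",
    "cloudpickle==3.0.0",
    "pydantic>=2.10",
    "requests",
]
installed = {
    "cloudpickle": "3.1.1",
    "google-cloud-aiplatform": "1.97.0",
    "pydantic": "2.11.5",
}
print(find_pin_mismatches(requirements, installed))
```

In a live environment, you could build the `installed` dictionary with `importlib.metadata.version` from the standard library instead of hard-coding it.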

Now you can send a prompt to your remote agent using `.query` to test that it's working as expected:

In [11]:
remote_agent.query(input="What's the exchange rate from US dollars to Euro ?")

{'output': 'The current exchange rate from USD to EUR is 1 USD = 0.87466 EUR.',
 'input': "What's the exchange rate from US dollars to Euro ?"}

You can also stream results back from the remote agent with `.stream_query`, which takes the same `input` argument that you used with the local agent.

### Querying your deployed agent

You've now deployed your agent and can [interact with it in multiple ways](https://cloud.google.com/vertex-ai/generative-ai/docs/agent-engine/use/overview), both within this notebook and from other applications or environments. The primary ways to access a deployed agent are the Python client library and REST API calls. The cells below use the Python client library:


In [12]:
# List all agent engines
all_agent_engines = agent_engines.list()
print("All Agent Engines:")
for agent in all_agent_engines:
    print(f"- {agent.display_name} : {agent.resource_name}")

All Agent Engines:
- Currency Exchange Agent : projects/27138301346/locations/us-central1/reasoningEngines/6291643028645871616


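Use the resource name to load the agent in another notebook or Python script, then query the remote agent as usual. The `RESOURCE_ID` value below is an example; substitute the ID from the end of your own agent's resource name: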

In [None]:
RESOURCE_ID = "2598515412341620736"
remote_agent = agent_engines.get(RESOURCE_ID)
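The resource names printed by `agent_engines.list()` end with the numeric resource ID. As a small sketch, the hypothetical helper `resource_id_from_name` extracts that ID. It assumes the `projects/.../locations/.../reasoningEngines/<id>` layout shown in the listing above:

```python
def resource_id_from_name(resource_name: str) -> str:
    """Return the trailing ID of a reasoningEngines resource name."""
    parts = resource_name.strip("/").split("/")
    if len(parts) < 2 or parts[-2] != "reasoningEngines":
        raise ValueError(f"Unexpected resource name: {resource_name!r}")
    return parts[-1]


# Resource name copied from the listing output above
listed_name = (
    "projects/27138301346/locations/us-central1/"
    "reasoningEngines/6291643028645871616"
)
print(resource_id_from_name(listed_name))
```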

In [None]:
remote_agent.query(input="What's the exchange rate from US dollars to Euro ?")

{'input': "What's the exchange rate from US dollars to Euro ?",
 'output': 'The current exchange rate from USD to EUR is 1 USD = 0.88794 EUR.'}

## Cleaning up

After you've finished, it's a good practice to clean up your cloud resources. You can delete the deployed Agent Engine instance to avoid any unexpected charges on your Google Cloud account.

In [None]:
remote_agent.delete()