# Get started with Zawadi Tunu, built with Memory Bank on ADK

<table align="left">
  <td style="text-align: center">
    <a href="https://colab.research.google.com/github/mwanyumba7/Zawadi-Tunu/blob/main/zawadi_tunu_with_memory_bank_on_adk.ipynb">
      <img width="32px" src="https://www.gstatic.com/pantheon/images/bigquery/welcome_page/colab-logo.svg" alt="Google Colaboratory logo"><br> Open in Colab
    </a>
  </td>
</table>

## Overview
We will build `Zawadi Tunu`, an agent specialized in gift recommendations. It is designed to elevate the act of gift-giving, transforming it from a chore into a deeply personalized and meaningful experience. Built as a demonstration of the Agent Development Kit (ADK) for Gemini and the Vertex AI Memory Bank, the agent's core strength is its ability to retain and recall nuanced information over time.

Instead of generic suggestions, the agent uses its long-term memory to remember key details gleaned from previous conversations: a niece's age and past birthday presents (like the "red bike" in the demo), a spouse's stated preference for aisle seats, or a parent's hint about a desired hobby.

This tutorial demonstrates how to build ADK agents with Memory, including both `InMemoryMemoryService` and `VertexAiMemoryBankService`.

This tutorial will cover:
* Comparing `InMemoryMemoryService` vs. `VertexAiMemoryBankService`.
* Generating memories with ADK and Agent Engine Memory Bank.
* Retrieving memories with ADK and Agent Engine Memory Bank.
* Customizing your Memory Bank instance's behavior.

## Get started

### Install ADK, the Vertex AI SDK, and other required packages


In [1]:
%pip install google-adk google-cloud-aiplatform --upgrade --quiet

### Authenticate your notebook environment (Colab only)

If you're running this notebook on Google Colab, run the cell below to authenticate your environment.

In [2]:
import sys

if "google.colab" in sys.modules:
    from google.colab import auth

    auth.authenticate_user()

### Set Google Cloud project information

To get started using Vertex AI, you must have an existing Google Cloud project and [enable the Vertex AI API](https://console.cloud.google.com/flows/enableapi?apiid=aiplatform.googleapis.com).

Learn more about [setting up a project and a development environment](https://cloud.google.com/vertex-ai/docs/start/cloud-environment).

In [4]:
# Use the environment variable if the user doesn't provide Project ID.
import os

import vertexai

# fmt: off
PROJECT = "zawadi-tunu-475816"  # @param {type: "string", placeholder: "[your-project-id]", isTemplate: true}
# fmt: on
if not PROJECT or PROJECT == "[your-project-id]":
    PROJECT = str(os.environ.get("GOOGLE_CLOUD_PROJECT"))

LOCATION = os.environ.get("GOOGLE_CLOUD_REGION", "us-central1")

client = vertexai.Client(project=PROJECT, location=LOCATION)

# Set environment variables for ADK.
os.environ["GOOGLE_GENAI_USE_VERTEXAI"] = "TRUE"
os.environ["GOOGLE_CLOUD_PROJECT"] = PROJECT
os.environ["GOOGLE_CLOUD_LOCATION"] = LOCATION
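One caveat in the cell above: wrapping the lookup in `str(...)` turns a missing `GOOGLE_CLOUD_PROJECT` variable into the literal string `"None"`, which is likely to surface later as a confusing error. A minimal sketch of a stricter fallback follows; `resolve_project` is a hypothetical helper, not part of any SDK.

```python
import os
from typing import Mapping, Optional

PLACEHOLDER = "[your-project-id]"


def resolve_project(
    project: Optional[str], env: Optional[Mapping[str, str]] = None
) -> str:
    """Return an explicit project ID, else fall back to GOOGLE_CLOUD_PROJECT."""
    env = os.environ if env is None else env
    if project and project != PLACEHOLDER:
        return project
    from_env = env.get("GOOGLE_CLOUD_PROJECT")
    if not from_env:
        # Fail early with a clear message instead of passing "None" along.
        raise ValueError(
            "Set PROJECT or the GOOGLE_CLOUD_PROJECT environment variable."
        )
    return from_env
```

In the notebook this could replace the `if not PROJECT ...` branch with `PROJECT = resolve_project(PROJECT)`.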

## (Optional) Set up logging
To surface ADK logs in your notebook, you may need to configure the logging level of ADK's logger.

In [5]:
import logging
import sys

# Set up a general handler for all logs to be printed in your Colab output.
# We'll set the overall level to INFO.
logging.basicConfig(
    level=logging.INFO,
    format="%(asctime)s - %(name)s - %(levelname)s - %(message)s",
    stream=sys.stdout,
)

# Get the specific logger used by ADK.
adk_logger = logging.getLogger("google_adk")

# Set its level to DEBUG to ensure you capture ALL messages from the SDK,
# even if the general level is higher (like INFO).
adk_logger.setLevel(logging.DEBUG)
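To see why setting the level on one named logger is enough, here is a small standard-library sketch. The logger names `demo`, `demo.sdk`, and `demo.other` are made up for illustration: `demo` plays the role of the root logger and `demo.sdk` plays the role of `google_adk`.

```python
import logging


class ListHandler(logging.Handler):
    """Collects (logger name, level name) pairs so the result is easy to inspect."""

    def __init__(self) -> None:
        super().__init__()
        self.seen = []

    def emit(self, record: logging.LogRecord) -> None:
        self.seen.append((record.name, record.levelname))


def demo_logger_levels():
    # propagate=False keeps this sketch away from the notebook's real
    # logging configuration.
    parent = logging.getLogger("demo")
    parent.propagate = False
    parent.setLevel(logging.INFO)
    handler = ListHandler()
    parent.addHandler(handler)

    sdk = logging.getLogger("demo.sdk")  # explicit DEBUG, like adk_logger
    sdk.setLevel(logging.DEBUG)
    other = logging.getLogger("demo.other")  # inherits INFO from its parent

    sdk.debug("kept: the originating logger allows DEBUG")
    other.debug("dropped: the effective level is INFO")
    other.info("kept")

    parent.removeHandler(handler)
    return handler.seen
```

Calling `demo_logger_levels()` keeps the DEBUG record from `demo.sdk` and drops the DEBUG record from `demo.other`, because the level check happens on the logger that creates the record, not on its parent.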

## Define a simple agent

The ADK Runner defines what session service is used by the agent. The session service is responsible for saving the conversation history.

In this example, we're using the in-memory session service (`InMemorySessionService`). However, you can use any session service with Memory.

If you want to use Google's pre-built Memory tools, you can also provide a Memory service to the Runner. In this example, we'll start by calling the memory services directly so that we can compare them, so we don't set `memory_service` on the Runner yet.

In [6]:
from google.adk.agents import LlmAgent
from google.adk.runners import Runner
from google.adk.sessions import InMemorySessionService
from google.genai.types import Content, Part

APP_NAME = "memory_example_app"
MODEL = "gemini-2.5-flash"

agent = LlmAgent(
    model=MODEL,
    name="Generic_QA_Agent",
    instruction="Answer the user's questions",
)

session_service = InMemorySessionService()

runner = Runner(agent=agent, app_name=APP_NAME, session_service=session_service)


def call_agent(query, session, user_id):
    content = Content(role="user", parts=[Part(text=query)])
    events = runner.run(user_id=user_id, session_id=session, new_message=content)

    for event in events:
        if event.is_final_response():
            final_response = event.content.parts[0].text
            print("Agent Response: ", final_response)

## Introduction to ADK Memory Service
ADK memory services offer an interface for creating and searching for memories. The [`BaseMemoryService`](https://github.com/google/adk-python/blob/main/src/google/adk/memory/base_memory_service.py) defines the interface for managing this searchable, long-term knowledge store. Its primary responsibilities are:

* Ingesting Information (`add_session_to_memory`): Taking the contents of a (usually completed) Session and adding relevant information to the long-term knowledge store.
* Searching Information (`search_memory`): Allowing an agent (typically via a Tool) to query the knowledge store and retrieve relevant snippets or context based on a search query.

In this colab, we'll cover two options:

* `InMemoryMemoryService`: An in-memory service that lets you search your conversation history and retrieve the raw dialogue using basic keyword matching.
* `VertexAiMemoryBankService`: A remote service that generates memories from the "meaningful" information in user conversations.


You can also interact directly with Memory Bank by using the Agent Engine SDK.
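The two responsibilities described above can be sketched as a two-method contract. This is a simplified illustration and not ADK's code: `ToySession`, `ToyMemoryService`, and `RawTranscriptMemory` are hypothetical names, events are reduced to plain strings, and search is a naive substring match.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass, field


@dataclass
class ToySession:
    app_name: str
    user_id: str
    events: list = field(default_factory=list)  # plain strings in this sketch


class ToyMemoryService(ABC):
    @abstractmethod
    async def add_session_to_memory(self, session: ToySession) -> None:
        """Ingest a session into the long-term store."""

    @abstractmethod
    async def search_memory(self, *, app_name: str, user_id: str, query: str) -> list:
        """Return stored snippets relevant to the query."""


class RawTranscriptMemory(ToyMemoryService):
    """Keeps every event verbatim, keyed by (app_name, user_id)."""

    def __init__(self) -> None:
        self._events = {}

    async def add_session_to_memory(self, session: ToySession) -> None:
        key = (session.app_name, session.user_id)
        self._events.setdefault(key, []).extend(session.events)

    async def search_memory(self, *, app_name: str, user_id: str, query: str) -> list:
        needle = query.lower()
        stored = self._events.get((app_name, user_id), [])
        return [text for text in stored if needle in text.lower()]


async def demo() -> list:
    memory = RawTranscriptMemory()
    await memory.add_session_to_memory(
        ToySession("memory_example_app", "My User", ["I like the idea of a bike.", "Thanks!"])
    )
    return await memory.search_memory(
        app_name="memory_example_app", user_id="My User", query="bike"
    )
```

In a notebook cell, `await demo()` runs the sketch. The real services return richer objects than a list of strings, but the split between ingesting a session and searching the store is the same.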

### `InMemoryMemoryService`

[`InMemoryMemoryService`](https://github.com/google/adk-python/blob/main/src/google/adk/memory/in_memory_memory_service.py) persists **all** information in memory. The events are not condensed, and information is not extracted from them.

In [7]:
from google.adk.memory import InMemoryMemoryService

in_memory_service = InMemoryMemoryService()

### `VertexAiMemoryBankService`
Memory Bank will only persist information that's meaningful for future interactions. [`VertexAiMemoryBankService`](https://github.com/google/adk-python/blob/main/src/google/adk/memory/vertex_ai_memory_bank_service.py) provides a built-in integration between Memory Bank and ADK.

To get started with Memory Bank, you need to first create an Agent Engine. This only takes a few seconds.

If you don't have a Google Cloud project, you can use [Vertex in Express mode, which only requires an API key](https://cloud.google.com/vertex-ai/generative-ai/docs/agent-engine/memory-bank/set-up#authentication).

In [8]:
import vertexai
from google.adk.memory import VertexAiMemoryBankService

client = vertexai.Client(project=PROJECT, location=LOCATION)

agent_engine = client.agent_engines.create(
    config={
        "context_spec": {
            "memory_bank_config": {
                "generation_config": {
                    "model": f"projects/{PROJECT}/locations/{LOCATION}/publishers/google/models/gemini-2.5-flash"
                }
            }
        }
    }
)

memory_bank_service = VertexAiMemoryBankService(
    agent_engine_id=agent_engine.api_resource.name.split("/")[-1],
    project=PROJECT,
    location=LOCATION,
)

INFO:vertexai_genai.agentengines:View progress and logs at https://console.cloud.google.com/logs/query?project=zawadi-tunu-475816.
INFO:vertexai_genai.agentengines:Agent Engine created. To use it in another session:
INFO:vertexai_genai.agentengines:agent_engine=client.agent_engines.get('projects/507092439621/locations/us-central1/reasoningEngines/241140492157321216')


## Generating Memories

Memories will be generated from your conversation history stored in the `Session` object. The ADK `add_session_to_memory` method is responsible for triggering memory generation for your memory service, but the exact behavior depends on which service you use:

* `InMemoryMemoryService` persists all information, including the raw dialogue.

* `VertexAiMemoryBankService` only persists *meaningful* information. Not all conversations will result in generated memories. Additionally, the information will be condensed to individual, self-contained memories.

In this example, we create two sessions:
1. `chit_chat_session`, which is a conversation that contains events that may not be meaningful for future interactions.
2. `recommendations_session`, which is a conversation about looking for birthday presents for a family member. It includes implicit information that may be helpful for future conversations.

In [9]:
USER_ID = "My User"

chit_chat_session = await session_service.create_session(
    app_name=APP_NAME, user_id=USER_ID
)

call_agent(
    "what types of questions do you answer", chit_chat_session.id, user_id=USER_ID
)
# Refresh session object.
chit_chat_session = await session_service.get_session(
    app_name=APP_NAME, user_id=USER_ID, session_id=chit_chat_session.id
)

INFO:google_adk.google.adk.models.google_llm:Sending out request, model: gemini-2.5-flash, backend: GoogleLLMVariant.VERTEX_AI, stream: False
DEBUG:google_adk.google.adk.models.google_llm:
LLM Request:
-----------------------------------------------------------
System Instruction:
Answer the user's questions

You are an agent. Your internal name is "Generic_QA_Agent".
-----------------------------------------------------------
Contents:
{"parts":[{"text":"what types of questions do you answer"}],"role":"user"}
-----------------------------------------------------------
Functions:

-----------------------------------------------------------

INFO:google_adk.google.adk.models.google_llm:Response received from the model.
DEBUG:google_adk.google.adk.models.google_llm:
LLM Response:
-----------------------------------------------------------
Text:
As Generic_QA_Agent, I'm designed to answer a very wide variety of questions across numerous topics. My capabilities include, but are not limited

Agent Response:  As Generic_QA_Agent, I'm designed to answer a very wide variety of questions across numerous topics. My capabilities include, but are not limited to:

*   **Factual Information:** Questions about history, science, geography, current events, definitions, statistics, and more.
*   **Explanations:** Breaking down complex concepts, explaining how things work, or elaborating on a topic.
*   **How-to Guides:** Providing instructions or steps for various tasks.
*   **Summaries:** Condensing longer texts or topics into key points.
*   **Creative Writing/Brainstorming:** Generating ideas, writing short pieces of text, or helping with creative tasks.
*   **Comparisons:** Differentiating between similar items or concepts.
*   **Troubleshooting (conceptual):** Offering general advice or common solutions to problems (though I can't interact with systems directly).
*   **General Knowledge:** Anything that falls under the umbrella of widely available information.

Essentially, if it'

In [10]:
recommendations_session = await session_service.create_session(
    app_name=APP_NAME, user_id=USER_ID
)

call_agent(
    "I have a three year old niece. What recommendations do you have for birthday presents?",
    recommendations_session.id,
    user_id=USER_ID,
)
call_agent(
    "I like the idea of a bike. Are there any age appropriate options?",
    recommendations_session.id,
    user_id=USER_ID,
)

recommendations_session = await session_service.get_session(
    app_name=APP_NAME, user_id=USER_ID, session_id=recommendations_session.id
)

INFO:google_adk.google.adk.models.google_llm:Sending out request, model: gemini-2.5-flash, backend: GoogleLLMVariant.VERTEX_AI, stream: False
DEBUG:google_adk.google.adk.models.google_llm:
LLM Request:
-----------------------------------------------------------
System Instruction:
Answer the user's questions

You are an agent. Your internal name is "Generic_QA_Agent".
-----------------------------------------------------------
Contents:
{"parts":[{"text":"I have a three year old niece. What recommendations do you have for birthday presents?"}],"role":"user"}
-----------------------------------------------------------
Functions:

-----------------------------------------------------------

INFO:google_adk.google.adk.models.google_llm:Response received from the model.
DEBUG:google_adk.google.adk.models.google_llm:
LLM Response:
-----------------------------------------------------------
Text:
That's wonderful! Three-year-olds are at such a fun age, full of imagination and eager to explor

Agent Response:  That's wonderful! Three-year-olds are at such a fun age, full of imagination and eager to explore. Here are some birthday present recommendations, categorized to help you find the perfect gift for your niece:

**1. Imaginative & Pretend Play:**
*   **Play Kitchen or Doctor's Kit:** Kids this age love to imitate adults. A play kitchen with food and utensils, or a doctor's kit, can provide hours of creative role-playing.
*   **Dress-Up Clothes:** A simple cape, a princess dress, an animal costume, or even just a fun hat can spark endless adventures.
*   **Dollhouse or Animal Figures:** Simple dollhouses with chunky furniture or sets of realistic (or fantastical) animal figures encourage storytelling and world-building.
*   **Puppets:** Hand puppets are great for putting on shows, developing language skills, and expressing emotions.

**2. Creative & Artistic:**
*   **Washable Markers, Crayons & Large Paper:** Look for chunky crayons that are easy for small hands to grip. 

INFO:google_adk.google.adk.models.google_llm:Response received from the model.
DEBUG:google_adk.google.adk.models.google_llm:
LLM Response:
-----------------------------------------------------------
Text:
Yes, absolutely! A bike can be a fantastic gift for a three-year-old, encouraging outdoor play, developing motor skills, and building confidence. For this age group, the focus is usually on **learning fundamentals** and **safety**.

Here are the most age-appropriate options:

1.  **Balance Bikes (Highly Recommended)**
    *   **What it is:** A two-wheeled bike without pedals or training wheels. Kids propel themselves by pushing off the ground with their feet.
    *   **Why it's great for 3-year-olds:** This is often the *best* option for learning to ride a bike. It teaches children the most crucial skill first: **balance**. They learn to steer, lean into turns, and control their speed using their feet.
    *   **Benefits:**
        *   **Teaches Balance First:** This makes the transi

Agent Response:  Yes, absolutely! A bike can be a fantastic gift for a three-year-old, encouraging outdoor play, developing motor skills, and building confidence. For this age group, the focus is usually on **learning fundamentals** and **safety**.

Here are the most age-appropriate options:

1.  **Balance Bikes (Highly Recommended)**
    *   **What it is:** A two-wheeled bike without pedals or training wheels. Kids propel themselves by pushing off the ground with their feet.
    *   **Why it's great for 3-year-olds:** This is often the *best* option for learning to ride a bike. It teaches children the most crucial skill first: **balance**. They learn to steer, lean into turns, and control their speed using their feet.
    *   **Benefits:**
        *   **Teaches Balance First:** This makes the transition to a pedal bike much easier and faster later on.
        *   **Builds Confidence:** Kids feel in control because their feet are always on the ground.
        *   **Develops Coordinatio

### InMemoryMemoryService

First, let's see what `InMemoryMemoryService` stores for these conversations.

The memories include the raw dialogue of the conversation.

In [11]:
await in_memory_service.add_session_to_memory(chit_chat_session)
await in_memory_service.add_session_to_memory(recommendations_session)

In [12]:
# Peek at the memories.
in_memory_service._session_events

{'memory_example_app/My User': {'3acd1ca4-c69f-488c-a3d1-4b2a7a865d17': [Event(content=Content(
     parts=[
       Part(
         text='what types of questions do you answer'
       ),
     ],
     role='user'
   ), grounding_metadata=None, partial=None, turn_complete=None, finish_reason=None, error_code=None, error_message=None, interrupted=None, custom_metadata=None, usage_metadata=None, live_session_resumption_update=None, input_transcription=None, output_transcription=None, avg_logprobs=None, logprobs_result=None, cache_metadata=None, citation_metadata=None, invocation_id='e-74a1d305-d555-4de5-9364-16c46d9da517', author='user', actions=EventActions(skip_summarization=None, state_delta={}, artifact_delta={}, transfer_to_agent=None, escalate=None, requested_auth_configs={}, requested_tool_confirmations={}, compaction=None, end_of_agent=None, agent_state=None), long_running_tool_ids=None, branch=None, id='6a9c53c2-6d3d-423a-a2c5-280a5abb61b1', timestamp=1761065557.325974),
   Event(c

In [13]:
await in_memory_service.search_memory(app_name=APP_NAME, user_id=USER_ID, query="test")

SearchMemoryResponse(memories=[])

In [14]:
await in_memory_service.search_memory(
    app_name=APP_NAME,
    user_id=USER_ID,
    query="what should I get my mom for mother's day",
)

SearchMemoryResponse(memories=[MemoryEntry(content=Content(
  parts=[
    Part(
      text='what types of questions do you answer'
    ),
  ],
  role='user'
), author='user', timestamp='2025-10-21T16:52:37.325974'), MemoryEntry(content=Content(
  parts=[
    Part(
      text="""As Generic_QA_Agent, I'm designed to answer a very wide variety of questions across numerous topics. My capabilities include, but are not limited to:

*   **Factual Information:** Questions about history, science, geography, current events, definitions, statistics, and more.
*   **Explanations:** Breaking down complex concepts, explaining how things work, or elaborating on a topic.
*   **How-to Guides:** Providing instructions or steps for various tasks.
*   **Summaries:** Condensing longer texts or topics into key points.
*   **Creative Writing/Brainstorming:** Generating ideas, writing short pieces of text, or helping with creative tasks.
*   **Comparisons:** Differentiating between similar items or concepts.
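The two searches above are consistent with a matcher that returns every stored event sharing at least one word with the query: "test" shares no word with the dialogue, while the Mother's Day query shares "what" with the chit-chat question. A minimal sketch of that idea follows. It is an assumption about the behavior, not the service's actual code, and `shares_a_word` is a hypothetical helper.

```python
import re


def words(text: str) -> set:
    """Lowercased alphabetic words in the text."""
    return {w.lower() for w in re.findall(r"[A-Za-z]+", text)}


def shares_a_word(query: str, event_text: str) -> bool:
    """True when the query and the event have at least one word in common."""
    return bool(words(query) & words(event_text))


stored = "what types of questions do you answer"
print(shares_a_word("test", stored))  # False
print(shares_a_word("what should I get my mom for mother's day", stored))  # True
```

Because a single common word is enough for a match, the result above includes the unrelated chit-chat turn.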


### Memory Bank via `VertexAiMemoryBankService`

Now, let's use Memory Bank to generate and store memories. Unlike `InMemoryMemoryService`, it persists only content that's meaningful for future interactions. If there's no meaningful content, the `add_session_to_memory` call is a no-op and no memories are generated.

Memory generation runs in the background. `add_session_to_memory` is non-blocking, so it may return before the memories have actually been generated. If you want memory generation to block until it completes, use the Agent Engine SDK:

```
client.agent_engines.memories.generate(
    ...,
    config={
        # Default value is True.
        "wait_for_completion": True
    }
)
```

There are two key operations that Memory Bank performs under the hood during memory generation:

* **Extraction**: Extracts information about the user from their conversations with the agent. Only information that matches at least one of your instance's memory topics will be persisted.
* **Consolidation**: Identifies if existing memories with the same scope should be deleted or updated based on the extracted information. Memory Bank checks that new memories are not duplicative or contradictory before merging them with existing memories. If existing memories don't overlap with the new information, a new memory will be created.

#### ADK `add_session_to_memory`

In [15]:
await memory_bank_service.add_session_to_memory(chit_chat_session)
await memory_bank_service.add_session_to_memory(recommendations_session)

INFO:google_adk.google.adk.memory.vertex_ai_memory_bank_service:Generate memory response received.
DEBUG:google_adk.google.adk.memory.vertex_ai_memory_bank_service:Generate memory response: name='projects/507092439621/locations/us-central1/reasoningEngines/241140492157321216/operations/3716771325369384960' metadata={'@type': 'type.googleapis.com/google.cloud.aiplatform.v1beta1.GenerateMemoriesOperationMetadata', 'genericMetadata': {'createTime': '2025-10-21T17:49:07.739569Z', 'updateTime': '2025-10-21T17:49:07.739569Z'}} done=None error=None response=None
INFO:google_adk.google.adk.memory.vertex_ai_memory_bank_service:Generate memory response received.
DEBUG:google_adk.google.adk.memory.vertex_ai_memory_bank_service:Generate memory response: name='projects/507092439621/locations/us-central1/reasoningEngines/241140492157321216/operations/8310442945287290880' metadata={'@type': 'type.googleapis.com/google.cloud.aiplatform.v1beta1.GenerateMemoriesOperationMetadata', 'genericMetadata': {'c

If you run the next cell immediately after `add_session_to_memory`, there may be no memories printed. `add_session_to_memory` is a non-blocking function and memory generation can take a few seconds to run.
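Rather than re-running the search by hand, you can poll until memories appear. The helper below is a sketch: `wait_for_memories` is a hypothetical name, and the retry count and delay are arbitrary defaults. It takes any async callable that returns a list, so it does not depend on a particular service.

```python
import asyncio


async def wait_for_memories(fetch, *, attempts: int = 5, delay_s: float = 1.0) -> list:
    """Call `fetch` until it returns a non-empty list or attempts run out."""
    for attempt in range(attempts):
        memories = await fetch()
        if memories:
            return list(memories)
        if attempt < attempts - 1:
            # Give background memory generation time to finish.
            await asyncio.sleep(delay_s)
    return []


# Possible use in this notebook (not executed here):
#
# async def fetch():
#     response = await memory_bank_service.search_memory(
#         app_name=APP_NAME, user_id=USER_ID, query="birthday presents"
#     )
#     return response.memories
#
# memories = await wait_for_memories(fetch)
```

An empty result after all attempts does not necessarily mean something failed, since a conversation with no meaningful content produces no memories at all.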

In [16]:
await memory_bank_service.search_memory(
    app_name=APP_NAME,
    user_id=USER_ID,
    query="What should I get my mom for mother's day",
)

INFO:google_adk.google.adk.memory.vertex_ai_memory_bank_service:Search memory response received.
DEBUG:google_adk.google.adk.memory.vertex_ai_memory_bank_service:Retrieved memory: distance=0.9339746950956977 memory=Memory(
  create_time=datetime.datetime(2025, 10, 21, 17, 49, 10, 801749, tzinfo=TzInfo(UTC)),
  fact='I am looking for birthday present recommendations for my three-year-old niece.',
  name='projects/507092439621/locations/us-central1/reasoningEngines/241140492157321216/memories/7415365352709685248',
  scope={
    'app_name': 'memory_example_app',
    'user_id': 'My User'
  },
  update_time=datetime.datetime(2025, 10, 21, 17, 49, 10, 801749, tzinfo=TzInfo(UTC))
)
DEBUG:google_adk.google.adk.memory.vertex_ai_memory_bank_service:Retrieved memory: distance=0.9903799096779854 memory=Memory(
  create_time=datetime.datetime(2025, 10, 21, 17, 49, 10, 797140, tzinfo=TzInfo(UTC)),
  fact='I like the idea of a bike as a birthday present for my three-year-old niece.',
  name='projects

SearchMemoryResponse(memories=[MemoryEntry(content=Content(
  parts=[
    Part(
      text='I am looking for birthday present recommendations for my three-year-old niece.'
    ),
  ],
  role='user'
), author='user', timestamp='2025-10-21T17:49:10.801749+00:00'), MemoryEntry(content=Content(
  parts=[
    Part(
      text='I like the idea of a bike as a birthday present for my three-year-old niece.'
    ),
  ],
  role='user'
), author='user', timestamp='2025-10-21T17:49:10.797140+00:00'), MemoryEntry(content=Content(
  parts=[
    Part(
      text='I have a three-year-old niece.'
    ),
  ],
  role='user'
), author='user', timestamp='2025-10-21T17:49:10.796991+00:00')])

#### Agent Engine SDK `GenerateMemories`

To interact directly with Memory Bank, you can use the Agent Engine SDK. For example, you can use [`RetrieveMemories`](https://cloud.google.com/vertex-ai/generative-ai/docs/agent-engine/memory-bank/fetch-memories#scope-based) to retrieve your memories with or without similarity search.

Memories are isolated by their `scope`, which is an arbitrary dictionary. `VertexAiMemoryBankService` uses the scope keys `app_name` and `user_id`. `app_name` is set by your Runner, and `user_id` is set when creating your `Session`.
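To make scope isolation concrete, here is a toy in-process store. It assumes that retrieval requires an exact match on the whole scope dictionary, regardless of key order. `ToyScopedStore` is a hypothetical stand-in, not the Memory Bank client.

```python
from collections import defaultdict


def scope_key(scope: dict) -> frozenset:
    """Order-independent, hashable form of a scope dictionary."""
    return frozenset(scope.items())


class ToyScopedStore:
    def __init__(self) -> None:
        self._facts = defaultdict(list)

    def add(self, scope: dict, fact: str) -> None:
        self._facts[scope_key(scope)].append(fact)

    def retrieve(self, scope: dict) -> list:
        # Only facts stored under exactly this scope are visible.
        return list(self._facts.get(scope_key(scope), []))


store = ToyScopedStore()
store.add(
    {"app_name": "memory_example_app", "user_id": "My User"},
    "I have a three-year-old niece.",
)
same_user = store.retrieve({"user_id": "My User", "app_name": "memory_example_app"})
other_user = store.retrieve({"app_name": "memory_example_app", "user_id": "Someone Else"})
```

Here `same_user` contains the stored fact and `other_user` is empty, which is the property that keeps one user's memories out of another user's conversations.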

In [17]:
response = client.agent_engines.memories.retrieve(
    name=agent_engine.api_resource.name,
    scope={"app_name": APP_NAME,
           "user_id": USER_ID},
)
list(response)

[RetrieveMemoriesResponseRetrievedMemory(
   memory=Memory(
     create_time=datetime.datetime(2025, 10, 21, 17, 49, 10, 796991, tzinfo=TzInfo(UTC)),
     fact='I have a three-year-old niece.',
     name='projects/507092439621/locations/us-central1/reasoningEngines/241140492157321216/memories/5109522343495991296',
     scope={
       'app_name': 'memory_example_app',
       'user_id': 'My User'
     },
     update_time=datetime.datetime(2025, 10, 21, 17, 49, 10, 796991, tzinfo=TzInfo(UTC))
   )
 ),
 RetrieveMemoriesResponseRetrievedMemory(
   memory=Memory(
     create_time=datetime.datetime(2025, 10, 21, 17, 49, 10, 797140, tzinfo=TzInfo(UTC)),
     fact='I like the idea of a bike as a birthday present for my three-year-old niece.',
     name='projects/507092439621/locations/us-central1/reasoningEngines/241140492157321216/memories/2803679334282297344',
     scope={
       'app_name': 'memory_example_app',
       'user_id': 'My User'
     },
     update_time=datetime.datetime(2025, 10,

You can also use `RetrieveMemories` to retrieve memories using similarity search.

In [18]:
response = client.agent_engines.memories.retrieve(
    name=agent_engine.api_resource.name,
    scope={"app_name": APP_NAME, "user_id": USER_ID},
    similarity_search_params={
        "search_query": "what should I get my niece for her birthday?"
    },
)
list(response)

[RetrieveMemoriesResponseRetrievedMemory(
   distance=0.5668985003612532,
   memory=Memory(
     create_time=datetime.datetime(2025, 10, 21, 17, 49, 10, 801749, tzinfo=TzInfo(UTC)),
     fact='I am looking for birthday present recommendations for my three-year-old niece.',
     name='projects/507092439621/locations/us-central1/reasoningEngines/241140492157321216/memories/7415365352709685248',
     scope={
       'app_name': 'memory_example_app',
       'user_id': 'My User'
     },
     update_time=datetime.datetime(2025, 10, 21, 17, 49, 10, 801749, tzinfo=TzInfo(UTC))
   )
 ),
 RetrieveMemoriesResponseRetrievedMemory(
   distance=0.7347648884243799,
   memory=Memory(
     create_time=datetime.datetime(2025, 10, 21, 17, 49, 10, 796991, tzinfo=TzInfo(UTC)),
     fact='I have a three-year-old niece.',
     name='projects/507092439621/locations/us-central1/reasoningEngines/241140492157321216/memories/5109522343495991296',
     scope={
       'app_name': 'memory_example_app',
       'user_i
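Each similarity-search result above carries a `distance`, and the closer match is listed first (about 0.57 before 0.73). Assuming that smaller distances mean closer matches, a minimal sketch for keeping only sufficiently close results follows. `Retrieved` and `closest` are hypothetical stand-ins for the SDK's response objects, and the 0.6 cutoff is an arbitrary illustration, not a recommended value.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Retrieved:
    fact: str
    distance: float


def closest(results, max_distance: float) -> list:
    """Keep results within max_distance, nearest first."""
    kept = [r for r in results if r.distance <= max_distance]
    return sorted(kept, key=lambda r: r.distance)


# Facts and distances copied from the output above.
results = [
    Retrieved("I have a three-year-old niece.", 0.7347648884243799),
    Retrieved(
        "I am looking for birthday present recommendations for my three-year-old niece.",
        0.5668985003612532,
    ),
]
near = closest(results, max_distance=0.6)
```

With this cutoff, `near` holds only the birthday-present memory. A suitable threshold depends on your queries, so treat any fixed value as something to tune.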

Memory Bank uses a process called "consolidation" to avoid storing duplicative or contradictory information for the same scope. In the next cell, a new event about the bike updates an existing memory instead of creating a second one.

In [19]:
import pprint

operation = client.agent_engines.memories.generate(
    name=agent_engine.api_resource.name,
    scope={"app_name": APP_NAME, "user_id": USER_ID},
    direct_contents_source={
        "events": [
            {
                "content": {
                    "role": "user",
                    "parts": [
                        {
                            "text": "I like the idea of getting my niece a bike for her third birthday"
                        }
                    ],
                }
            }
        ]
    },
)
print("***OPERATION RESPONSE***")
pprint.pprint(operation)

print("\n***GENERATED MEMORIES***")
for generated_memory in operation.response.generated_memories:
    pprint.pprint(client.agent_engines.memories.get(name=generated_memory.memory.name))

***OPERATION RESPONSE***
AgentEngineGenerateMemoriesOperation(
  done=True,
  metadata={
    '@type': 'type.googleapis.com/google.cloud.aiplatform.v1beta1.GenerateMemoriesOperationMetadata',
    'genericMetadata': {
      'createTime': '2025-10-21T17:49:38.547396Z',
      'updateTime': '2025-10-21T17:49:43.251717Z'
    }
  },
  name='projects/507092439621/locations/us-central1/reasoningEngines/241140492157321216/operations/7842068584040759296',
  response=GenerateMemoriesResponse(
    generated_memories=[
      GenerateMemoriesResponseGeneratedMemory(
        action=<GenerateMemoriesResponseGeneratedMemoryAction.UPDATED: 'UPDATED'>,
        memory=Memory(
          name='projects/507092439621/locations/us-central1/reasoningEngines/241140492157321216/memories/2803679334282297344'
        )
      ),
    ]
  )
)

***GENERATED MEMORIES***
Memory(
  create_time=datetime.datetime(2025, 10, 21, 17, 49, 10, 797140, tzinfo=TzInfo(UTC)),
  fact='I like the idea of getting my niece a bike for her

What if your agent extracted memories that you want to combine with your existing memories? You can also use `GenerateMemories` to consolidate pre-extracted facts for the same scope (see the [documentation](https://cloud.google.com/vertex-ai/generative-ai/docs/agent-engine/memory-bank/generate-memories#consolidate-pre-extracted-memories)).

Memory Bank already contains memories on this topic. So, the new fact is consolidated with the relevant existing memories rather than simply added as a new, duplicative memory.

In [20]:
operation = client.agent_engines.memories.generate(
    name=agent_engine.api_resource.name,
    scope={"app_name": APP_NAME, "user_id": USER_ID},
    direct_memories_source={"direct_memories": [{"fact": "I got my niece a red bike"}]},
)

print("***OPERATION RESPONSE***")
pprint.pprint(operation)

print("\n***GENERATED MEMORIES***")
for generated_memory in operation.response.generated_memories:
    if generated_memory.action != "DELETED":
        pprint.pprint(
            client.agent_engines.memories.get(name=generated_memory.memory.name)
        )

***OPERATION RESPONSE***
AgentEngineGenerateMemoriesOperation(
  done=True,
  metadata={
    '@type': 'type.googleapis.com/google.cloud.aiplatform.v1beta1.GenerateMemoriesOperationMetadata',
    'genericMetadata': {
      'createTime': '2025-10-21T17:49:47.652179Z',
      'updateTime': '2025-10-21T17:49:56.564816Z'
    }
  },
  name='projects/507092439621/locations/us-central1/reasoningEngines/241140492157321216/operations/2897116193187954688',
  response=GenerateMemoriesResponse(
    generated_memories=[
      GenerateMemoriesResponseGeneratedMemory(
        action=<GenerateMemoriesResponseGeneratedMemoryAction.CREATED: 'CREATED'>,
        memory=Memory(
          name='projects/507092439621/locations/us-central1/reasoningEngines/241140492157321216/memories/7865725315446734848'
        )
      ),
      GenerateMemoriesResponseGeneratedMemory(
        action=<GenerateMemoriesResponseGeneratedMemoryAction.DELETED: 'DELETED'>,
        memory=Memory(
          name='projects/507092439621/
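The operation response lists one entry per affected memory, each tagged with an action (`CREATED`, `UPDATED`, or `DELETED` in the outputs above). A small helper can group memory names by action, which makes it easier to see what consolidation did. This is a sketch: `ToyMemory`, `ToyGeneratedMemory`, and `group_by_action` are hypothetical names, and the memory names below are placeholders.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class ToyMemory:
    name: str


@dataclass
class ToyGeneratedMemory:
    action: str
    memory: ToyMemory


def group_by_action(generated_memories) -> dict:
    """Map each action to the names of the memories it was applied to."""
    grouped = defaultdict(list)
    for item in generated_memories:
        # Accept either a plain string or an enum with a .value attribute.
        action = str(getattr(item.action, "value", item.action))
        grouped[action].append(item.memory.name)
    return dict(grouped)


summary = group_by_action(
    [
        ToyGeneratedMemory("CREATED", ToyMemory("memories/placeholder-1")),
        ToyGeneratedMemory("DELETED", ToyMemory("memories/placeholder-2")),
    ]
)
```

In the notebook, the same helper could be applied to `operation.response.generated_memories`, since each entry there exposes `action` and `memory.name`.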

#### Automate with Callbacks

To automate the triggering of memory generation, you can implement a callback to generate memories after every turn. See the [`Remember this: Agent state and memory with ADK` blog post](https://cloud.google.com/blog/topics/developers-practitioners/remember-this-agent-state-and-memory-with-adk) for more information.

In [21]:
from google import adk


async def auto_save_session_to_memory_callback(callback_context):
    # Use the invocation context to access the conversation history that should
    # be used as the data source for memory generation.
    await memory_bank_service.add_session_to_memory(
        callback_context._invocation_context.session
    )
    print("\n****Triggered memory generation****\n")


agent = LlmAgent(
    model=MODEL,
    name="Generic_QA_Agent",
    instruction="Answer the user's questions",
    tools=[adk.tools.preload_memory_tool.PreloadMemoryTool()],
    after_agent_callback=auto_save_session_to_memory_callback,
)

runner = Runner(
    agent=agent,
    app_name=APP_NAME,
    session_service=session_service,
    memory_service=memory_bank_service,
)
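The callback above lets any exception from `add_session_to_memory` propagate into the agent run. If you would rather keep the conversation going when memory generation fails, one option is to wrap the callback. This is a sketch: `make_safe_memory_callback` is a hypothetical helper, and it assumes that returning `None` from an `after_agent_callback` leaves the agent's reply unchanged.

```python
import logging

logger = logging.getLogger("memory_callback_sketch")


def make_safe_memory_callback(save_session):
    """Wrap an async save function so its failures are logged, not raised."""

    async def callback(callback_context):
        try:
            await save_session(callback_context)
        except Exception:
            # Memory generation is best-effort in this sketch.
            logger.exception("Memory generation failed; continuing without it.")
        return None

    return callback


# Possible use in this notebook (not executed here):
#
# agent = LlmAgent(
#     ...,
#     after_agent_callback=make_safe_memory_callback(
#         auto_save_session_to_memory_callback
#     ),
# )
```

Whether swallowing the error is appropriate depends on how much your agent relies on fresh memories, so consider surfacing repeated failures through monitoring.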

In [22]:
session = await session_service.create_session(app_name=APP_NAME, user_id=USER_ID)

call_agent(
    "What christmas present recommendations do you have for my niece?",
    session.id,
    user_id=USER_ID,
)

call_agent(
    "I think I'll get her a doll. What types are good for her age?",
    session.id,
    user_id=USER_ID,
)

INFO:google_adk.google.adk.memory.vertex_ai_memory_bank_service:Search memory response received.
DEBUG:google_adk.google.adk.memory.vertex_ai_memory_bank_service:Retrieved memory: distance=0.8091518668940462 memory=Memory(
  create_time=datetime.datetime(2025, 10, 21, 17, 49, 10, 796991, tzinfo=TzInfo(UTC)),
  fact='I have a three-year-old niece.',
  name='projects/507092439621/locations/us-central1/reasoningEngines/241140492157321216/memories/5109522343495991296',
  scope={
    'app_name': 'memory_example_app',
    'user_id': 'My User'
  },
  update_time=datetime.datetime(2025, 10, 21, 17, 49, 10, 796991, tzinfo=TzInfo(UTC))
)
DEBUG:google_adk.google.adk.memory.vertex_ai_memory_bank_service:Retrieved memory: distance=0.8478242206712423 memory=Memory(
  create_time=datetime.datetime(2025, 10, 21, 17, 49, 56, 485525, tzinfo=TzInfo(UTC)),
  fact='I got my niece a red bike.',
  name='projects/507092439621/locations/us-central1/reasoningEngines/241140492157321216/memories/78657253154467348

Agent Response:  Since your niece is three years old and already has a red bike, here are some Christmas present recommendations that she might enjoy:

1.  **Creative Art Supplies:** Washable crayons or markers, large paper, a coloring book with her favorite characters, or a Play-Doh set with various tools. These encourage creativity and fine motor skills.
2.  **Imaginative Play:**
    *   **Dress-up Clothes:** A princess gown, a superhero cape, or animal costumes can spark hours of imaginative play.
    *   **Play Kitchen Set:** A small play kitchen with toy food and utensils can be a big hit for role-playing.
    *   **Dollhouse or Animal Figures:** Simple, durable sets that encourage storytelling.
3.  **Building Toys:** Large building blocks like LEGO Duplo or magnetic tiles are excellent for developing spatial reasoning and creativity.
4.  **Picture Books:** A collection of engaging and colorful picture books. You could look for interactive books with flaps or textures.
5.  **Age-A

INFO:google_adk.google.adk.memory.vertex_ai_memory_bank_service:Generate memory response received.
DEBUG:google_adk.google.adk.memory.vertex_ai_memory_bank_service:Generate memory response: name='projects/507092439621/locations/us-central1/reasoningEngines/241140492157321216/operations/7076456647387774976' metadata={'@type': 'type.googleapis.com/google.cloud.aiplatform.v1beta1.GenerateMemoriesOperationMetadata', 'genericMetadata': {'createTime': '2025-10-21T17:50:13.982465Z', 'updateTime': '2025-10-21T17:50:13.982465Z'}} done=None error=None response=None



****Triggered memory generation****



INFO:google_adk.google.adk.memory.vertex_ai_memory_bank_service:Search memory response received.
DEBUG:google_adk.google.adk.memory.vertex_ai_memory_bank_service:Retrieved memory: distance=0.9376007979564559 memory=Memory(
  create_time=datetime.datetime(2025, 10, 21, 17, 49, 10, 796991, tzinfo=TzInfo(UTC)),
  fact='I have a three-year-old niece.',
  name='projects/507092439621/locations/us-central1/reasoningEngines/241140492157321216/memories/5109522343495991296',
  scope={
    'app_name': 'memory_example_app',
    'user_id': 'My User'
  },
  update_time=datetime.datetime(2025, 10, 21, 17, 49, 10, 796991, tzinfo=TzInfo(UTC))
)
DEBUG:google_adk.google.adk.memory.vertex_ai_memory_bank_service:Retrieved memory: distance=0.9480893672918641 memory=Memory(
  create_time=datetime.datetime(2025, 10, 21, 17, 49, 56, 485525, tzinfo=TzInfo(UTC)),
  fact='I got my niece a red bike.',
  name='projects/507092439621/locations/us-central1/reasoningEngines/241140492157321216/memories/78657253154467348

Agent Response:  That's a wonderful idea! Dolls are fantastic for fostering nurturing play, imagination, and social-emotional development in three-year-olds.

Here are some types of dolls that are generally great for her age:

1.  **Soft-Bodied Baby Dolls:** These are often a huge hit. They're perfect for cuddling, rocking, and mimicking parental care. Look for ones with soft bodies and gentle, un-stylable hair (or molded hair) that won't tangle easily. Many come with simple accessories like a bottle or a blanket.
    *   **Why they're good:** Encourage empathy, role-playing, and are easy for small hands to hold and hug.

2.  **Rag Dolls:** Classic and timeless, rag dolls are typically very soft, safe (no hard parts to worry about), and often have friendly, embroidered faces. They're durable and can withstand a lot of love and play.
    *   **Why they're good:** Extremely huggable, safe, and often machine washable.

3.  **Toddler-Sized Dolls (larger than baby dolls, smaller than Americ

INFO:google_adk.google.adk.memory.vertex_ai_memory_bank_service:Generate memory response received.
DEBUG:google_adk.google.adk.memory.vertex_ai_memory_bank_service:Generate memory response: name='projects/507092439621/locations/us-central1/reasoningEngines/241140492157321216/operations/361589602978365440' metadata={'@type': 'type.googleapis.com/google.cloud.aiplatform.v1beta1.GenerateMemoriesOperationMetadata', 'genericMetadata': {'createTime': '2025-10-21T17:50:22.572070Z', 'updateTime': '2025-10-21T17:50:22.572070Z'}} done=None error=None response=None



****Triggered memory generation****



In [23]:
response = client.agent_engines.memories.retrieve(
    name=agent_engine.api_resource.name,
    scope={"app_name": APP_NAME, "user_id": USER_ID},
    similarity_search_params={
        "search_query": "What was I thinking about getting my niece for Christmas?"
    },
)
list(response)

[RetrieveMemoriesResponseRetrievedMemory(
   distance=0.6968911070737007,
   memory=Memory(
     create_time=datetime.datetime(2025, 10, 21, 17, 50, 27, 834190, tzinfo=TzInfo(UTC)),
     fact='I will get my niece a doll for Christmas.',
     name='projects/507092439621/locations/us-central1/reasoningEngines/241140492157321216/memories/6712803810839887872',
     scope={
       'app_name': 'memory_example_app',
       'user_id': 'My User'
     },
     update_time=datetime.datetime(2025, 10, 21, 17, 50, 27, 834190, tzinfo=TzInfo(UTC))
   )
 ),
 RetrieveMemoriesResponseRetrievedMemory(
   distance=0.7943347759252403,
   memory=Memory(
     create_time=datetime.datetime(2025, 10, 21, 17, 49, 10, 796991, tzinfo=TzInfo(UTC)),
     fact='I have a three-year-old niece.',
     name='projects/507092439621/locations/us-central1/reasoningEngines/241140492157321216/memories/5109522343495991296',
     scope={
       'app_name': 'memory_example_app',
       'user_id': 'My User'
     },
     update_tim

The prior example used the entire session to generate memories. This becomes more costly as the session grows, because the full conversation history is submitted again after every turn.

Alternatively, you can use a callback to save only the current turn to Memory Bank. The downside is that memory generation loses the surrounding context of the conversation.
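The cost difference between the two strategies can be estimated with a small, library-free sketch. It assumes a hypothetical conversation in which every turn adds two events (one user message and one agent reply) and counts events rather than tokens. The function names are illustrative and are not part of ADK or the Agent Engine SDK.

```python
def events_sent_full_session(num_turns: int, events_per_turn: int = 2) -> int:
    """Total events submitted when the whole session is resent after every turn."""
    # After turn k the session holds k * events_per_turn events, and all of
    # them are submitted again for memory generation.
    return sum(turn * events_per_turn for turn in range(1, num_turns + 1))


def events_sent_last_user_turn(num_turns: int) -> int:
    """Total events submitted when only the latest user message is sent."""
    # Exactly one event (the user's message) is submitted per turn.
    return num_turns


for turns in (1, 5, 20):
    print(
        f"turns={turns} "
        f"full_session={events_sent_full_session(turns)} "
        f"last_user_turn={events_sent_last_user_turn(turns)}"
    )
```

Under these assumptions the full-session strategy grows quadratically with the number of turns, while the per-turn strategy grows linearly, at the price of less context per request.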

In [24]:
async def auto_save_user_turn_to_memory_callback(callback_context):
    # Use only the latest user message as the data source for memory
    # generation, rather than the full session history.
    last_turn = callback_context._invocation_context.user_content
    client.agent_engines.memories.generate(
        name=agent_engine.api_resource.name,
        scope={"app_name": APP_NAME, "user_id": USER_ID},
        direct_contents_source={"events": [{"content": last_turn}]},
        # Don't block the agent's turn while memories are generated.
        config={"wait_for_completion": False},
    )
    print("\n****Triggered memory generation****\n")


agent = LlmAgent(
    model=MODEL,
    name="Generic_QA_Agent",
    instruction="Answer the user's questions",
    tools=[adk.tools.preload_memory_tool.PreloadMemoryTool()],
    before_agent_callback=auto_save_user_turn_to_memory_callback,
)

runner = Runner(
    agent=agent,
    app_name=APP_NAME,
    session_service=session_service,
    memory_service=memory_bank_service,
)

In [25]:
call_agent(
    "I think I'll get her a doll. What types are good for her age?",
    session.id,
    user_id=USER_ID,
)


****Triggered memory generation****



INFO:google_adk.google.adk.memory.vertex_ai_memory_bank_service:Search memory response received.
DEBUG:google_adk.google.adk.memory.vertex_ai_memory_bank_service:Retrieved memory: distance=0.6383386611799132 memory=Memory(
  create_time=datetime.datetime(2025, 10, 21, 17, 50, 27, 834190, tzinfo=TzInfo(UTC)),
  fact='I will get my niece a doll for Christmas.',
  name='projects/507092439621/locations/us-central1/reasoningEngines/241140492157321216/memories/6712803810839887872',
  scope={
    'app_name': 'memory_example_app',
    'user_id': 'My User'
  },
  update_time=datetime.datetime(2025, 10, 21, 17, 50, 27, 834190, tzinfo=TzInfo(UTC))
)
DEBUG:google_adk.google.adk.memory.vertex_ai_memory_bank_service:Retrieved memory: distance=0.9222959753372013 memory=Memory(
  create_time=datetime.datetime(2025, 10, 21, 17, 50, 27, 834273, tzinfo=TzInfo(UTC)),
  fact='My niece already has a red bike.',
  name='projects/507092439621/locations/us-central1/reasoningEngines/241140492157321216/memories/

Agent Response:  That's a lovely choice! Dolls are wonderful for fostering imaginative play, nurturing skills, and emotional development in three-year-olds.

Here are the types of dolls that are generally excellent for her age:

1.  **Soft-Bodied Baby Dolls:** These are often a favorite because they're perfect for cuddling, rocking, and mimicking caretaking. Look for ones with soft bodies and easy-to-manage hair (either molded or soft, non-tangling fiber). Many come with simple accessories like a bottle or a blanket.
    *   **Why they're great:** They encourage empathy, role-playing, and are perfectly sized for small arms to hug.

2.  **Rag Dolls:** Timeless and sweet, rag dolls are typically very soft, safe (no hard parts), and usually have embroidered features. They are quite durable and can withstand lots of play.
    *   **Why they're great:** Extremely huggable, very safe, and often machine washable.

3.  **Toddler-Sized Dolls:** These dolls often look like slightly older childre

## Using Memory in your agent

Now, let's link our agent to Memory, so that memories can be fetched by your agent and used for inference. This allows your agent to remember information from prior sessions in a new, empty session.

You can use ADK's built-in tools to fetch memories and incorporate them in the prompt. When using the built-in tools, it's important to provide both a memory tool when creating your Agent and a memory service when defining your Runner.

### `adk.PreloadMemoryTool`
The `PreloadMemoryTool` always retrieves memories at the start of each turn and includes the memories in the system instruction.

Under the hood, it invokes `memory_service.search_memory` for the given user and app name. Because retrieval happens before the LLM is called, no tool call is logged for memory.
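As a rough illustration of what "including memories in the system instruction" means, the sketch below appends a `<PAST_CONVERSATIONS>` block to an instruction string. The wording of the block mirrors the system instruction logged later in this notebook. The `RetrievedMemory` class and `append_memories` function are hypothetical stand-ins, and this is not ADK's actual implementation.

```python
from dataclasses import dataclass


@dataclass
class RetrievedMemory:
    """Hypothetical stand-in for one memory returned by a memory service."""

    author: str
    text: str
    timestamp: str


def append_memories(system_instruction: str, memories: list[RetrievedMemory]) -> str:
    """Return the instruction with a PAST_CONVERSATIONS block appended, if any."""
    if not memories:
        # Nothing was retrieved, so the instruction is left unchanged.
        return system_instruction
    lines = []
    for memory in memories:
        lines.append(f"Time: {memory.timestamp}")
        lines.append(f"{memory.author}: {memory.text}")
    block = "\n".join(lines)
    return (
        f"{system_instruction}\n\n"
        "The following content is from your previous conversations with the user.\n"
        "They may be useful for answering the user's current query.\n"
        f"<PAST_CONVERSATIONS>\n{block}\n</PAST_CONVERSATIONS>"
    )


prompt = append_memories(
    "Answer the user's questions",
    [
        RetrievedMemory(
            author="user",
            text="I have a three-year-old niece.",
            timestamp="2025-10-21T17:50:27.830958+00:00",
        )
    ],
)
print(prompt)
```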

We're going to use a [callback](https://google.github.io/adk-docs/callbacks/) to print out the System Instructions. Since we use `before_model_callback`, the callback executes right before the model is called.

<img src="https://google.github.io/adk-docs/assets/callback_flow.png">

In [26]:
from google import adk


def log_system_instructions(callback_context, llm_request):
    """A callback to print the system instruction sent to the model."""
    print(
        f"\n*System Instruction*:\n{llm_request.config.system_instruction}\n*********\n"
    )


agent = LlmAgent(
    model=MODEL,
    name="Generic_QA_Agent",
    instruction="Answer the user's questions",
    tools=[adk.tools.preload_memory_tool.PreloadMemoryTool()],
    before_model_callback=log_system_instructions,
)

runner = Runner(
    agent=agent,
    app_name=APP_NAME,
    session_service=session_service,
    memory_service=memory_bank_service,
)

This conversation uses a new session, so it doesn't have access to the user's conversation history without using memory.

Since we're using `PreloadMemoryTool`, the retrieved memories will be appended to the System Instructions.

In [27]:
session = await session_service.create_session(app_name=APP_NAME, user_id=USER_ID)

call_agent(
    "Can you remind me what I got my niece for her christmas?",
    session.id,
    user_id=USER_ID,
)

INFO:google_adk.google.adk.memory.vertex_ai_memory_bank_service:Search memory response received.
DEBUG:google_adk.google.adk.memory.vertex_ai_memory_bank_service:Retrieved memory: distance=0.7431920531286359 memory=Memory(
  create_time=datetime.datetime(2025, 10, 21, 17, 50, 27, 834190, tzinfo=TzInfo(UTC)),
  fact='I will get my niece a doll for Christmas.',
  name='projects/507092439621/locations/us-central1/reasoningEngines/241140492157321216/memories/6712803810839887872',
  scope={
    'app_name': 'memory_example_app',
    'user_id': 'My User'
  },
  update_time=datetime.datetime(2025, 10, 21, 17, 50, 27, 834190, tzinfo=TzInfo(UTC))
)
DEBUG:google_adk.google.adk.memory.vertex_ai_memory_bank_service:Retrieved memory: distance=0.8400188529646397 memory=Memory(
  create_time=datetime.datetime(2025, 10, 21, 17, 50, 27, 834273, tzinfo=TzInfo(UTC)),
  fact='My niece already has a red bike.',
  name='projects/507092439621/locations/us-central1/reasoningEngines/241140492157321216/memories/


*System Instruction*:
Answer the user's questions

You are an agent. Your internal name is "Generic_QA_Agent".

The following content is from your previous conversations with the user.
They may be useful for answering the user's current query.
<PAST_CONVERSATIONS>
Time: 2025-10-21T17:50:27.834190+00:00
user: I will get my niece a doll for Christmas.
Time: 2025-10-21T17:50:27.834273+00:00
user: My niece already has a red bike.
Time: 2025-10-21T17:50:27.830958+00:00
user: I have a three-year-old niece.
</PAST_CONVERSATIONS>

*********



INFO:google_adk.google.adk.models.google_llm:Response received from the model.
DEBUG:google_adk.google.adk.models.google_llm:
LLM Response:
-----------------------------------------------------------
Text:
You mentioned that you *will get* your niece a doll for Christmas.
-----------------------------------------------------------
Function calls:

-----------------------------------------------------------
Raw response:
{"sdk_http_response":{"headers":{"Content-Type":"application/json; charset=UTF-8","Vary":"Origin, X-Origin, Referer","Content-Encoding":"gzip","Date":"Tue, 21 Oct 2025 17:51:45 GMT","Server":"scaffolding on HTTPServer2","X-XSS-Protection":"0","X-Frame-Options":"SAMEORIGIN","X-Content-Type-Options":"nosniff","Transfer-Encoding":"chunked"}},"candidates":[{"content":{"parts":[{"text":"You mentioned that you *will get* your niece a doll for Christmas."}],"role":"model"},"finish_reason":"STOP","avg_logprobs":-4.229559326171875}],"create_time":"2025-10-21T17:51:43.043243Z","m

Agent Response:  You mentioned that you *will get* your niece a doll for Christmas.


### `adk.LoadMemoryTool`

Now, let's use `LoadMemoryTool`. Unlike `PreloadMemoryTool`, this tool acts like a standard tool. Your agent needs to decide whether the tool should be invoked.

In [28]:
from google import adk


def log_tool_call(tool, args, tool_response, **kwargs):
    """A callback to print the tool, its arguments, and its response."""
    print(f"\n*Tool*:\n{tool}")
    print(f"\n*Tool call*:\n{args}")
    print(f"\n*Tool response*:\n{tool_response}\n*********\n")


agent = LlmAgent(
    model=MODEL,
    name="Generic_QA_Agent",
    instruction="Answer the user's questions",
    tools=[adk.tools.load_memory_tool.LoadMemoryTool()],
    before_model_callback=log_system_instructions,
    after_tool_callback=log_tool_call,
)

runner = Runner(
    agent=agent,
    app_name=APP_NAME,
    session_service=session_service,
    memory_service=memory_bank_service,
)

In [29]:
session = await session_service.create_session(app_name=APP_NAME, user_id=USER_ID)

call_agent(
    "Can you remind me what I got my niece for her christmas?",
    session.id,
    user_id=USER_ID,
)

INFO:google_adk.google.adk.models.google_llm:Sending out request, model: gemini-2.5-flash, backend: GoogleLLMVariant.VERTEX_AI, stream: False
DEBUG:google_adk.google.adk.models.google_llm:
LLM Request:
-----------------------------------------------------------
System Instruction:
Answer the user's questions

You are an agent. Your internal name is "Generic_QA_Agent".


You have memory. You can use it to answer questions. If any questions need
you to look up the memory, you should call load_memory function with a query.

-----------------------------------------------------------
Contents:
{"parts":[{"text":"Can you remind me what I got my niece for her christmas?"}],"role":"user"}
-----------------------------------------------------------
Functions:
load_memory: {'query': {'type': <Type.STRING: 'STRING'>}} 
-----------------------------------------------------------




*System Instruction*:
Answer the user's questions

You are an agent. Your internal name is "Generic_QA_Agent".


You have memory. You can use it to answer questions. If any questions need
you to look up the memory, you should call load_memory function with a query.

*********



INFO:google_adk.google.adk.models.google_llm:Response received from the model.
DEBUG:google_adk.google.adk.models.google_llm:
LLM Response:
-----------------------------------------------------------
Text:
None
-----------------------------------------------------------
Function calls:
name: load_memory, args: {'query': 'what I got my niece for her christmas'}
-----------------------------------------------------------
Raw response:
{"sdk_http_response":{"headers":{"Content-Type":"application/json; charset=UTF-8","Vary":"Origin, X-Origin, Referer","Content-Encoding":"gzip","Date":"Tue, 21 Oct 2025 17:51:56 GMT","Server":"scaffolding on HTTPServer2","X-XSS-Protection":"0","X-Frame-Options":"SAMEORIGIN","X-Content-Type-Options":"nosniff","Transfer-Encoding":"chunked"}},"candidates":[{"content":{"parts":[{"thought_signature":"CowDAePx_1760i0aAYf1UaQ_B-CxAM4Eh1ycvRY2l-VYGN1IQf2xV97nMiCa_uLbhVjT39HkuNN9L7Qj7em5Dwet5xUywmYBHYp7qf7ZQWTFjPBDCMBdakQJN-crTV15B-dfNq0HsMM96mFkd1XUwUa_gTUc1regD_M-D


*Tool*:
<google.adk.tools.load_memory_tool.LoadMemoryTool object at 0x7c24fe67d970>

*Tool call*:
{'query': 'what I got my niece for her christmas'}

*Tool response*:
memories=[MemoryEntry(content=Content(
  parts=[
    Part(
      text='I will get my niece a doll for Christmas.'
    ),
  ],
  role='user'
), author='user', timestamp='2025-10-21T17:50:27.834190+00:00'), MemoryEntry(content=Content(
  parts=[
    Part(
      text='I have a three-year-old niece.'
    ),
  ],
  role='user'
), author='user', timestamp='2025-10-21T17:50:27.830958+00:00'), MemoryEntry(content=Content(
  parts=[
    Part(
      text='My niece already has a red bike.'
    ),
  ],
  role='user'
), author='user', timestamp='2025-10-21T17:50:27.834273+00:00')]
*********


*System Instruction*:
Answer the user's questions

You are an agent. Your internal name is "Generic_QA_Agent".


You have memory. You can use it to answer questions. If any questions need
you to look up the memory, you should call load_memory fu

INFO:google_adk.google.adk.models.google_llm:Response received from the model.
DEBUG:google_adk.google.adk.models.google_llm:
LLM Response:
-----------------------------------------------------------
Text:
You were going to get your niece a doll for Christmas.
-----------------------------------------------------------
Function calls:

-----------------------------------------------------------
Raw response:
{"sdk_http_response":{"headers":{"Content-Type":"application/json; charset=UTF-8","Vary":"Origin, X-Origin, Referer","Content-Encoding":"gzip","Date":"Tue, 21 Oct 2025 17:51:58 GMT","Server":"scaffolding on HTTPServer2","X-XSS-Protection":"0","X-Frame-Options":"SAMEORIGIN","X-Content-Type-Options":"nosniff","Transfer-Encoding":"chunked"}},"candidates":[{"content":{"parts":[{"thought_signature":"CqIDAePx_15bLima_AQAvZPM_AUjSHikMTXg9nSkkW06P0JxNmG5co4qcCu2JShvYO49rXwHPHMl6cHxhDFcOvNIunTvKwVJUtH39wP9clvYl06HRoN3XpdTBMZr7UF3jPxWMRYPgZGnFCMuF_7-uHDVIf5nQ3xSYKdyz9g-CjQW6bkU8Y_pHkHQqZVAIV

Agent Response:  You were going to get your niece a doll for Christmas.


If the agent decides that Memory is not helpful for a given query, the Memory tool will not be invoked.

In [30]:
session = await session_service.create_session(app_name=APP_NAME, user_id=USER_ID)

call_agent("Hi!", session.id, user_id=USER_ID)

INFO:google_adk.google.adk.models.google_llm:Sending out request, model: gemini-2.5-flash, backend: GoogleLLMVariant.VERTEX_AI, stream: False
DEBUG:google_adk.google.adk.models.google_llm:
LLM Request:
-----------------------------------------------------------
System Instruction:
Answer the user's questions

You are an agent. Your internal name is "Generic_QA_Agent".


You have memory. You can use it to answer questions. If any questions need
you to look up the memory, you should call load_memory function with a query.

-----------------------------------------------------------
Contents:
{"parts":[{"text":"Hi!"}],"role":"user"}
-----------------------------------------------------------
Functions:
load_memory: {'query': {'type': <Type.STRING: 'STRING'>}} 
-----------------------------------------------------------




*System Instruction*:
Answer the user's questions

You are an agent. Your internal name is "Generic_QA_Agent".


You have memory. You can use it to answer questions. If any questions need
you to look up the memory, you should call load_memory function with a query.

*********



INFO:google_adk.google.adk.models.google_llm:Response received from the model.
DEBUG:google_adk.google.adk.models.google_llm:
LLM Response:
-----------------------------------------------------------
Text:
Hello! How can I help you today?
-----------------------------------------------------------
Function calls:

-----------------------------------------------------------
Raw response:
{"sdk_http_response":{"headers":{"Content-Type":"application/json; charset=UTF-8","Vary":"Origin, X-Origin, Referer","Content-Encoding":"gzip","Date":"Tue, 21 Oct 2025 17:52:04 GMT","Server":"scaffolding on HTTPServer2","X-XSS-Protection":"0","X-Frame-Options":"SAMEORIGIN","X-Content-Type-Options":"nosniff","Transfer-Encoding":"chunked"}},"candidates":[{"content":{"parts":[{"thought_signature":"CrcBAePx_14tg5w61n6ClSibyGORUIhAythlnwSm7PUREE69th9U9FBCF-bdraam4pTxf9CIJf2nohPnm5GWueC677q-U41wKfcjpCny-KjTVMmSvIS16kwOTwOnZ-tRsVkFhQs_2nkXQH_ZuiAbinjYU1CIbZsoBJKJ4Y_1Ie1i0TpPgO9ko2XTpb4xV0GpALe0M19M-Bpxpbvs4gDp

Agent Response:  Hello! How can I help you today?


### Custom callback to retrieve memories

If you want more control over how memories are retrieved, you can create your own custom callback.

For example, you can use the Agent Engine SDK within the callback to retrieve all memories for a scope without similarity search, which has lower latency than a similarity search.

You can also customize your `scope` rather than using the default "app_name" and "user_id" scope keys used by ADK's memory service.

In [31]:
def retrieve_memories_callback(callback_context, llm_request):
    user_id = callback_context._invocation_context.user_id

    response = client.agent_engines.memories.retrieve(
        name=agent_engine.api_resource.name,
        # Unlike the ADK Memory Service, this does not use App Name in the scope.
        scope={"user_id": user_id},
    )
    memories = [f"* {memory.memory.fact}" for memory in list(response)]
    if not memories:
        # No memories to add to System Instructions.
        return
    # Append formatted memories to the System Instructions
    llm_request.config.system_instruction += (
        "\nHere is information that you have about the user:\n"
    )
    llm_request.config.system_instruction += "\n".join(memories)


def log_tool_call(tool, args, tool_response, **kwargs):
    """A callback to print the tool, its arguments, and its response."""
    print(f"\n*Tool*:\n{tool}")
    print(f"\n*Tool call*:\n{args}")
    # Print the tool's response (not the callback function itself).
    print(f"\n*Tool response*:\n{tool_response}\n*********\n")


agent = LlmAgent(
    model=MODEL,
    name="Generic_QA_Agent",
    instruction="Answer the user's questions",
    before_model_callback=retrieve_memories_callback,
)

runner = Runner(
    agent=agent,
    app_name=APP_NAME,
    session_service=session_service,
    memory_service=memory_bank_service,
)

Since we're using a different scope than the ADK Memory Service, we don't have access to the prior memories. Memories are isolated by their scope dictionary.
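The isolation behavior can be pictured with a toy, in-memory store that keys facts by the entire scope dictionary. This is only a conceptual sketch: `ToyMemoryStore` is a hypothetical class, and the sketch assumes that a lookup matches only when the scope is exactly the same, regardless of key order.

```python
class ToyMemoryStore:
    """Illustrative store that isolates facts by exact scope match."""

    def __init__(self) -> None:
        self._facts: dict[frozenset, list[str]] = {}

    @staticmethod
    def _key(scope: dict[str, str]) -> frozenset:
        # The whole dictionary forms the key, so an extra or missing scope
        # key produces a different, isolated bucket. Key order is irrelevant.
        return frozenset(scope.items())

    def add(self, scope: dict[str, str], fact: str) -> None:
        self._facts.setdefault(self._key(scope), []).append(fact)

    def retrieve(self, scope: dict[str, str]) -> list[str]:
        return list(self._facts.get(self._key(scope), []))


store = ToyMemoryStore()
store.add(
    {"app_name": "memory_example_app", "user_id": "My User"},
    "I have a three-year-old niece.",
)

# Same user, but without "app_name": a different scope, so nothing is found.
print(store.retrieve({"user_id": "My User"}))
# Same scope with keys in a different order: the fact is found.
print(store.retrieve({"user_id": "My User", "app_name": "memory_example_app"}))
```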

In [32]:
session = await session_service.create_session(app_name=APP_NAME, user_id=USER_ID)

call_agent("Hi!", session.id, user_id=USER_ID)

INFO:google_adk.google.adk.models.google_llm:Sending out request, model: gemini-2.5-flash, backend: GoogleLLMVariant.VERTEX_AI, stream: False
DEBUG:google_adk.google.adk.models.google_llm:
LLM Request:
-----------------------------------------------------------
System Instruction:
Answer the user's questions

You are an agent. Your internal name is "Generic_QA_Agent".
-----------------------------------------------------------
Contents:
{"parts":[{"text":"Hi!"}],"role":"user"}
-----------------------------------------------------------
Functions:

-----------------------------------------------------------

INFO:google_adk.google.adk.models.google_llm:Response received from the model.
DEBUG:google_adk.google.adk.models.google_llm:
LLM Response:
-----------------------------------------------------------
Text:
Hi there! How can I help you today?
-----------------------------------------------------------
Function calls:

-----------------------------------------------------------
Raw re

Agent Response:  Hi there! How can I help you today?


Let's first generate memories for our new scope.

In [33]:
client.agent_engines.memories.generate(
    scope={"user_id": USER_ID},
    name=agent_engine.api_resource.name,
    direct_contents_source={
        "events": [
            {"content": {"role": "user", "parts": [{"text": "I have four nieces!"}]}}
        ]
    },
)

AgentEngineGenerateMemoriesOperation(
  done=True,
  metadata={
    '@type': 'type.googleapis.com/google.cloud.aiplatform.v1beta1.GenerateMemoriesOperationMetadata',
    'genericMetadata': {
      'createTime': '2025-10-21T17:52:17.058967Z',
      'updateTime': '2025-10-21T17:52:19.343624Z'
    }
  },
  name='projects/507092439621/locations/us-central1/reasoningEngines/241140492157321216/operations/2374698636412977152',
  response=GenerateMemoriesResponse(
    generated_memories=[
      GenerateMemoriesResponseGeneratedMemory(
        action=<GenerateMemoriesResponseGeneratedMemoryAction.CREATED: 'CREATED'>,
        memory=Memory(
          name='projects/507092439621/locations/us-central1/reasoningEngines/241140492157321216/memories/1470613844580630528'
        )
      ),
    ]
  )
)

In [34]:
call_agent("What information do you know about me?", session.id, user_id=USER_ID)

INFO:google_adk.google.adk.models.google_llm:Sending out request, model: gemini-2.5-flash, backend: GoogleLLMVariant.VERTEX_AI, stream: False
DEBUG:google_adk.google.adk.models.google_llm:
LLM Request:
-----------------------------------------------------------
System Instruction:
Answer the user's questions

You are an agent. Your internal name is "Generic_QA_Agent".
Here is information that you have about the user:
* I have four nieces.
-----------------------------------------------------------
Contents:
{"parts":[{"text":"Hi!"}],"role":"user"}
{"parts":[{"text":"Hi there! How can I help you today?"}],"role":"model"}
{"parts":[{"text":"What information do you know about me?"}],"role":"user"}
-----------------------------------------------------------
Functions:

-----------------------------------------------------------

INFO:google_adk.google.adk.models.google_llm:Response received from the model.
DEBUG:google_adk.google.adk.models.google_llm:
LLM Response:
-----------------------

Agent Response:  I know that you have four nieces!


## Customize your Memory Bank's behavior

So far, we've used Memory Bank with its default settings. Now, let's customize some configurations. The configuration is set when creating or updating your Agent Engine.

### Customizing Memory Extraction

By default, Memory Bank considers memories that fit the following topics to be meaningful:

* **Personal information (`USER_PERSONAL_INFO`)**: Significant personal information about the user, like names, relationships, hobbies, and important dates. For example, "I work at Google" or "My wedding anniversary is on December 31".
* **User preferences (`USER_PREFERENCES`)**: Stated or implied likes, dislikes, preferred styles, or patterns. For example, "I prefer the middle seat."
* **Key conversation events and task outcomes (`KEY_CONVERSATION_DETAILS`)**: Important milestones or conclusions within the dialogue. For example, "I booked plane tickets for a round trip between JFK and SFO. I leave on June 1, 2025 and return on June 7, 2025."
* **Explicit remember / forget instructions (`EXPLICIT_INSTRUCTIONS`)**: Information that the user explicitly asks the agent to remember or forget. For example, if the user says "Remember that I primarily use Python," Memory Bank generates a memory such as "I primarily use Python."

These are managed topics, meaning that Memory Bank maintains their definitions. To use a subset of these topics or to define custom topics, provide a `CustomizationConfig` that specifies what information Memory Bank should consider meaningful to persist.
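A small helper can assemble a customization dictionary for any subset of the four managed topics listed above. The dictionary shape mirrors the configuration used in this tutorial. The helper name `build_managed_topics_config` and its validation step are assumptions added for illustration and are not part of the SDK.

```python
# The four managed topic names listed above.
MANAGED_TOPICS = {
    "USER_PERSONAL_INFO",
    "USER_PREFERENCES",
    "KEY_CONVERSATION_DETAILS",
    "EXPLICIT_INSTRUCTIONS",
}


def build_managed_topics_config(scope_keys: list[str], topics: list[str]) -> dict:
    """Build a customization config dict restricted to the given managed topics."""
    unknown = [topic for topic in topics if topic not in MANAGED_TOPICS]
    if unknown:
        # Fail early on typos instead of sending an invalid config.
        raise ValueError(f"Unknown managed topics: {unknown}")
    return {
        "scope_keys": scope_keys,
        "memory_topics": [
            {"managed_memory_topic": {"managed_topic_enum": topic}}
            for topic in topics
        ],
    }


config = build_managed_topics_config(
    scope_keys=["user_id"],
    topics=["USER_PREFERENCES", "EXPLICIT_INSTRUCTIONS"],
)
print(config)
```

The resulting dictionary could then be passed in the `customization_configs` list in the same way as a hand-written configuration.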

#### Managed Topics

We're going to update our Memory Bank to only extract user preferences.

This will take effect for all requests to Memory Bank, regardless of whether you use ADK or the Agent Engine SDK.

In [35]:
user_preferences_config = {
    "scope_keys": ["user_id"],
    "memory_topics": [
        {"managed_memory_topic": {"managed_topic_enum": "USER_PREFERENCES"}}
    ],
}


client.agent_engines.update(
    name=agent_engine.api_resource.name,
    config={
        "context_spec": {
            "memory_bank_config": {
                "customization_configs": [user_preferences_config],
            }
        }
    },
)

INFO:vertexai_genai.agentengines:View progress and logs at https://console.cloud.google.com/logs/query?project=zawadi-tunu-475816.
INFO:vertexai_genai.agentengines:Agent Engine updated. To use it in another session:
INFO:vertexai_genai.agentengines:agent_engine=client.agent_engines.get('projects/507092439621/locations/us-central1/reasoningEngines/241140492157321216')


AgentEngine(api_resource.name='projects/507092439621/locations/us-central1/reasoningEngines/241140492157321216')

Memories that fit the `USER_PREFERENCES` category will be persisted.

In [36]:
client.agent_engines.memories.generate(
    scope={"user_id": USER_ID},
    name=agent_engine.api_resource.name,
    direct_contents_source={
        "events": [
            {
                "content": {
                    "role": "user",
                    "parts": [{"text": "I prefer the aisle seat"}],
                }
            }
        ]
    },
)

AgentEngineGenerateMemoriesOperation(
  done=True,
  metadata={
    '@type': 'type.googleapis.com/google.cloud.aiplatform.v1beta1.GenerateMemoriesOperationMetadata',
    'genericMetadata': {
      'createTime': '2025-10-21T17:52:47.966770Z',
      'updateTime': '2025-10-21T17:52:50.761164Z'
    }
  },
  name='projects/507092439621/locations/us-central1/reasoningEngines/241140492157321216/operations/9222421869829816320',
  response=GenerateMemoriesResponse(
    generated_memories=[
      GenerateMemoriesResponseGeneratedMemory(
        action=<GenerateMemoriesResponseGeneratedMemoryAction.CREATED: 'CREATED'>,
        memory=Memory(
          name='projects/507092439621/locations/us-central1/reasoningEngines/241140492157321216/memories/5713004693563637760'
        )
      ),
    ]
  )
)

Memories that don't fit `USER_PREFERENCES` won't be persisted.

In [40]:
client.agent_engines.memories.generate(
    scope={"user_id": USER_ID},
    name=agent_engine.api_resource.name,
    direct_contents_source={
        "events": [
            {
                "content": {
                    "role": "user",
                    "parts": [{"text": "I'm going to get my niece a red bike"}],
                }
            }
        ]
    },
)

AgentEngineGenerateMemoriesOperation(
  done=True,
  metadata={
    '@type': 'type.googleapis.com/google.cloud.aiplatform.v1beta1.GenerateMemoriesOperationMetadata',
    'genericMetadata': {
      'createTime': '2025-10-21T18:13:49.748610Z',
      'updateTime': '2025-10-21T18:13:51.325266Z'
    }
  },
  name='projects/507092439621/locations/us-central1/reasoningEngines/241140492157321216/operations/6290859987388334080'
)
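
The two operations above differ in shape: the first carries a `response` listing the generated memory, while the second has no `response` field. Below is a minimal sketch of a helper that reads either shape without raising. The `SimpleNamespace` objects are hypothetical stand-ins for the SDK's operation objects, and `generated_memory_names` is an illustrative name, not an SDK function.

```python
from types import SimpleNamespace


def generated_memory_names(operation):
    """Return the names of generated memories, or [] if nothing was persisted."""
    response = getattr(operation, "response", None)
    if response is None:
        return []
    memories = getattr(response, "generated_memories", None) or []
    return [item.memory.name for item in memories]


# Hypothetical stand-ins shaped like the two outputs above.
persisted = SimpleNamespace(
    response=SimpleNamespace(
        generated_memories=[
            SimpleNamespace(action="CREATED", memory=SimpleNamespace(name="memories/1"))
        ]
    )
)
rejected = SimpleNamespace(response=None)

print(generated_memory_names(persisted))
print(generated_memory_names(rejected))
```

With the real client, the same helper could be applied to the value returned by `client.agent_engines.memories.generate(...)`, assuming its attributes match the printed output.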

#### Custom Topics
Alternatively, you can define custom topics with your own labels, descriptions, and examples.

In [39]:
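# Content and Part build the example conversation events below.
from google.genai.types import Content, Part
from vertexai.types import MemoryBankCustomizationConfig as CustomizationConfig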
from vertexai.types import (
    MemoryBankCustomizationConfigGenerateMemoriesExample as GenerateMemoriesExample,
)
from vertexai.types import (
    MemoryBankCustomizationConfigGenerateMemoriesExampleConversationSource as ConversationSource,
)
from vertexai.types import (
    MemoryBankCustomizationConfigGenerateMemoriesExampleConversationSourceEvent as ConversationSourceEvent,
)
from vertexai.types import (
    MemoryBankCustomizationConfigGenerateMemoriesExampleGeneratedMemory as ExampleGeneratedMemory,
)
from vertexai.types import MemoryBankCustomizationConfigMemoryTopic as MemoryTopic
from vertexai.types import (
    MemoryBankCustomizationConfigMemoryTopicCustomMemoryTopic as CustomMemoryTopic,
)

memory_topic = MemoryTopic(
    custom_memory_topic=CustomMemoryTopic(
        label="business_feedback",
        description="""Specific user feedback about their experience at
the coffee shop. This includes opinions on drinks, food, pastries, ambiance,
staff friendliness, service speed, cleanliness, and any suggestions for
improvement.""",
    )
)


example = GenerateMemoriesExample(
    conversation_source=ConversationSource(
        events=[
            ConversationSourceEvent(
                content=Content(
                    role="model",
                    parts=[
                        Part(
                            text="Welcome back to The Daily Grind! We'd love to hear your feedback on your visit."
                        )
                    ],
                )
            ),
            ConversationSourceEvent(
                content=Content(
                    role="user",
                    parts=[
                        Part(
                            text="Hey. The drip coffee was a bit lukewarm today, which was a bummer. Also, the music was way too loud, I could barely hear my friend."
                        )
                    ],
                )
            ),
        ]
    ),
    generated_memories=[
        ExampleGeneratedMemory(
            fact="The user reported that the drip coffee was lukewarm."
        ),
        ExampleGeneratedMemory(
            fact="The user felt the music in the shop was too loud."
        ),
    ],
)

noop_example = GenerateMemoriesExample(
    conversation_source=ConversationSource(
        events=[
            ConversationSourceEvent(
                content=Content(
                    role="model",
                    parts=[
                        Part(
                            text="Welcome back to The Daily Grind! We'd love to hear your feedback on your visit."
                        )
                    ],
                )
            ),
            ConversationSourceEvent(
                content=Content(
                    role="user", parts=[Part(text="Thanks for the coffee!")]
                )
            ),
        ]
    ),
    generated_memories=[],
)

client.agent_engines.update(
    name=agent_engine.api_resource.name,
    config={
        "context_spec": {
            "memory_bank_config": {
                "customization_configs": [
                    CustomizationConfig(
                        memory_topics=[memory_topic],
                        generate_memories_examples=[example, noop_example],
                    )
                ],
            }
        }
    },
)

ModuleNotFoundError: No module named 'vertexai.types'
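
The error above means the cell stopped at its first `vertexai.types` import, so the `update` call at the end never ran. The output does not establish the cause. A small preflight sketch that can be run before the cell is shown below: it reports whether the module path is importable and which SDK version is installed. The helper names `try_import` and `installed_version` are illustrative, and the distribution name is taken from the `%pip install` line at the top of this notebook.

```python
import importlib
from importlib import metadata


def try_import(dotted_name):
    """Return the imported module, or None if it cannot be imported."""
    try:
        return importlib.import_module(dotted_name)
    except ImportError:  # ModuleNotFoundError is a subclass of ImportError
        return None


def installed_version(dist_name):
    """Return the installed version string of a distribution, or None."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


print("vertexai.types importable:", try_import("vertexai.types") is not None)
print("google-cloud-aiplatform version:", installed_version("google-cloud-aiplatform"))
```

If the module is reported as not importable, compare the printed version with the one installed at the top of the notebook. In Colab, a package upgraded after it was first imported only takes effect once the runtime is restarted.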

In [41]:
client.agent_engines.memories.generate(
    scope={"user_id": "123"},
    name=agent_engine.api_resource.name,
    direct_contents_source={
        "events": [
            {
                "content": {
                    "role": "user",
                    "parts": [{"text": "I prefer the aisle seat"}],
                }
            }
        ]
    },
)

AgentEngineGenerateMemoriesOperation(
  done=True,
  metadata={
    '@type': 'type.googleapis.com/google.cloud.aiplatform.v1beta1.GenerateMemoriesOperationMetadata',
    'genericMetadata': {
      'createTime': '2025-10-21T18:14:33.909493Z',
      'updateTime': '2025-10-21T18:14:36.175732Z'
    }
  },
  name='projects/507092439621/locations/us-central1/reasoningEngines/241140492157321216/operations/8200386229393424384',
  response=GenerateMemoriesResponse(
    generated_memories=[
      GenerateMemoriesResponseGeneratedMemory(
        action=<GenerateMemoriesResponseGeneratedMemoryAction.CREATED: 'CREATED'>,
        memory=Memory(
          name='projects/507092439621/locations/us-central1/reasoningEngines/241140492157321216/memories/5433781516666667008'
        )
      ),
    ]
  )
)

In [42]:
import vertexai

client.agent_engines.memories.generate(
    scope={"user_id": "123"},
    name=agent_engine.api_resource.name,
    direct_contents_source={
        "events": [
            {
                "content": {
                    "role": "user",
                    "parts": [{"text": "You should have almond milk"}],
                }
            }
        ]
    },
)

AgentEngineGenerateMemoriesOperation(
  done=True,
  metadata={
    '@type': 'type.googleapis.com/google.cloud.aiplatform.v1beta1.GenerateMemoriesOperationMetadata',
    'genericMetadata': {
      'createTime': '2025-10-21T18:14:40.858286Z',
      'updateTime': '2025-10-21T18:14:43.162246Z'
    }
  },
  name='projects/507092439621/locations/us-central1/reasoningEngines/241140492157321216/operations/8439077009644060672'
)

In [43]:
list(
    client.agent_engines.memories.retrieve(
        name=agent_engine.api_resource.name, scope={"user_id": "123"}
    )
)

[RetrieveMemoriesResponseRetrievedMemory(
   memory=Memory(
     create_time=datetime.datetime(2025, 10, 21, 18, 14, 35, 115215, tzinfo=TzInfo(UTC)),
     fact='I prefer the aisle seat.',
     name='projects/507092439621/locations/us-central1/reasoningEngines/241140492157321216/memories/5433781516666667008',
     scope={
       'user_id': '123'
     },
     update_time=datetime.datetime(2025, 10, 21, 18, 14, 35, 115215, tzinfo=TzInfo(UTC))
   )
 )]

In the run recorded here, the custom-topic update failed with the `ModuleNotFoundError` shown above, so the instance was still using the `USER_PREFERENCES` configuration: the aisle-seat preference was persisted and the almond-milk feedback was not. Once the `business_feedback` topic is applied, the expected outcome is the reverse.

### Dynamic TTL

You can also configure Memory Bank to automatically set a time to live (TTL) on memories that are generated, manually created, or updated.

#### Default TTL

Default TTL applies to all operations that create or update a Memory.

In [44]:
client.agent_engines.update(
    name=agent_engine.api_resource.name,
    config={
        "context_spec": {
            "memory_bank_config": {
                "ttl_config": {
                    # 30 days
                    "default_ttl": f"{60 * 60 * 24 * 30}s"
                }
            }
        }
    },
)

INFO:vertexai_genai.agentengines:View progress and logs at https://console.cloud.google.com/logs/query?project=zawadi-tunu-475816.
INFO:vertexai_genai.agentengines:Agent Engine updated. To use it in another session:
INFO:vertexai_genai.agentengines:agent_engine=client.agent_engines.get('projects/507092439621/locations/us-central1/reasoningEngines/241140492157321216')


AgentEngine(api_resource.name='projects/507092439621/locations/us-central1/reasoningEngines/241140492157321216')
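
The `default_ttl` value above is a duration written as a number of seconds followed by `s`. A small sketch of building that string from a `timedelta` instead of multiplying by hand is shown below. `ttl_string` is an illustrative helper, not part of the SDK.

```python
from datetime import timedelta


def ttl_string(ttl: timedelta) -> str:
    """Format a timedelta as whole seconds with an "s" suffix."""
    return f"{int(ttl.total_seconds())}s"


print(ttl_string(timedelta(days=30)))   # 2592000s
print(ttl_string(timedelta(days=365)))  # 31536000s
```

The result can be passed wherever the cells in this section use an f-string such as `f"{60 * 60 * 24 * 30}s"`.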

In [45]:
response = client.agent_engines.memories.generate(
    scope={"user_id": USER_ID},
    name=agent_engine.api_resource.name,
    direct_contents_source={
        "events": [
            {"content": {"role": "user", "parts": [{"text": "I have four nieces!"}]}}
        ]
    },
    config={"wait_for_completion": True},
)

client.agent_engines.memories.get(
    name=response.response.generated_memories[0].memory.name
)

Memory(
  create_time=datetime.datetime(2025, 10, 21, 17, 52, 18, 443509, tzinfo=TzInfo(UTC)),
  expire_time=datetime.datetime(2025, 11, 20, 18, 14, 56, 356299, tzinfo=TzInfo(UTC)),
  fact='I have four nieces.',
  name='projects/507092439621/locations/us-central1/reasoningEngines/241140492157321216/memories/1470613844580630528',
  scope={
    'user_id': 'My User'
  },
  update_time=datetime.datetime(2025, 10, 21, 18, 14, 56, 359695, tzinfo=TzInfo(UTC))
)
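
In the output above, `create_time` is earlier than `update_time`, so this memory already existed and was updated by the call. The expiry is counted from the update, not from the original creation. A quick check using the timestamps copied from that output:

```python
from datetime import datetime, timedelta, timezone

# Timestamps copied from the Memory printed above.
create_time = datetime(2025, 10, 21, 17, 52, 18, 443509, tzinfo=timezone.utc)
update_time = datetime(2025, 10, 21, 18, 14, 56, 359695, tzinfo=timezone.utc)
expire_time = datetime(2025, 11, 20, 18, 14, 56, 356299, tzinfo=timezone.utc)

lifetime_from_update = expire_time - update_time
lifetime_from_create = expire_time - create_time

# The two server-side timestamps are a few milliseconds apart,
# so compare against 30 days with a small tolerance.
tolerance = timedelta(seconds=1)
print("30 days from update:", abs(lifetime_from_update - timedelta(days=30)) < tolerance)
print("30 days from create:", abs(lifetime_from_create - timedelta(days=30)) < tolerance)
```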

#### Granular TTL

The TTL is calculated based on which operation created or updated the Memory. If no TTL is set for a given operation, that operation won't update the Memory's expiration time.

Parameters include:
* `create_ttl`: Applies to Memories created with `CreateMemory` operations.
* `generate_created_ttl`: Applies to Memories created with `GenerateMemories`.
* `generate_updated_ttl`: Applies to Memories updated with `GenerateMemories`.
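
The rule described above can be sketched as a small local model. This is not the service's implementation. It only illustrates the stated behavior: each operation looks up its own TTL, and an unset TTL leaves the expiration time unchanged. The operation labels and the `next_expire_time` function are hypothetical.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical operation labels mapped to the granular TTL parameter names.
TTL_FIELD_FOR_OPERATION = {
    "create": "create_ttl",
    "generate_created": "generate_created_ttl",
    "generate_updated": "generate_updated_ttl",
}


def next_expire_time(operation, now, current_expire_time, granular_ttl):
    """Return the expiration time after `operation` under the described rule."""
    ttl = granular_ttl.get(TTL_FIELD_FOR_OPERATION[operation])
    if ttl is None:
        # No TTL configured for this operation: leave the expiry untouched.
        return current_expire_time
    return now + ttl


config = {"generate_created_ttl": timedelta(days=365)}
now = datetime(2025, 10, 21, tzinfo=timezone.utc)
existing_expiry = datetime(2025, 11, 20, tzinfo=timezone.utc)

print(next_expire_time("generate_created", now, None, config))
print(next_expire_time("generate_updated", now, existing_expiry, config))
```

With only `generate_created_ttl` configured, a newly generated memory expires 365 days out, while an update through generation keeps whatever expiry the memory already had.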

In [46]:
client.agent_engines.update(
    name=agent_engine.api_resource.name,
    config={
        "context_spec": {
            "memory_bank_config": {
                "ttl_config": {
                    "granular_ttl_config": {
                        # 365 days
                        "generate_created_ttl": f"{24 * 60 * 60 * 365}s",
                    }
                }
            }
        }
    },
)

INFO:vertexai_genai.agentengines:View progress and logs at https://console.cloud.google.com/logs/query?project=zawadi-tunu-475816.
INFO:vertexai_genai.agentengines:Agent Engine updated. To use it in another session:
INFO:vertexai_genai.agentengines:agent_engine=client.agent_engines.get('projects/507092439621/locations/us-central1/reasoningEngines/241140492157321216')


AgentEngine(api_resource.name='projects/507092439621/locations/us-central1/reasoningEngines/241140492157321216')

In [47]:
response = client.agent_engines.memories.generate(
    scope={"user_id": USER_ID},
    name=agent_engine.api_resource.name,
    direct_contents_source={
        "events": [
            {
                "content": {
                    "role": "user",
                    "parts": [{"text": "Remember that I work at Zindua School!"}],
                }
            }
        ]
    },
)

client.agent_engines.memories.get(
    name=response.response.generated_memories[0].memory.name
)

Memory(
  create_time=datetime.datetime(2025, 10, 21, 18, 15, 6, 379921, tzinfo=TzInfo(UTC)),
  expire_time=datetime.datetime(2026, 10, 21, 18, 15, 6, 372786, tzinfo=TzInfo(UTC)),
  fact='I work at Zindua School.',
  name='projects/507092439621/locations/us-central1/reasoningEngines/241140492157321216/memories/7397350954200203264',
  scope={
    'user_id': 'My User'
  },
  update_time=datetime.datetime(2025, 10, 21, 18, 15, 6, 379921, tzinfo=TzInfo(UTC))
)

In [48]:
response = client.agent_engines.memories.generate(
    scope={"user_id": USER_ID},
    name=agent_engine.api_resource.name,
    direct_contents_source={
        "events": [
            {"content": {"role": "user", "parts": [{"text": "I work at Google!"}]}}
        ]
    },
)

client.agent_engines.memories.get(
    name=response.response.generated_memories[0].memory.name
)

ClientError: 404 NOT_FOUND. {'error': {'code': 404, 'message': 'Memory projects/507092439621/locations/us-central1/reasoningEngines/241140492157321216/memories/7397350954200203264 was not found.', 'status': 'NOT_FOUND'}}
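
The 404 above names the memory created in the previous cell (`.../memories/7397350954200203264`). This suggests that generating from "I work at Google!" removed the conflicting Zindua School memory rather than creating a new one, so the first entry in `generated_memories` pointed at a memory that no longer existed. A sketch of guarding the `get` call by checking each entry's `action` first is shown below. The stand-in objects and `names_safe_to_get` are hypothetical, and the `DELETED` action value is an assumption modeled on the `CREATED` value shown in earlier outputs.

```python
from types import SimpleNamespace


def names_safe_to_get(generated_memories):
    """Return names of generated memories that were not deleted."""
    names = []
    for item in generated_memories or []:
        # Accept either an enum-like object with `.value` or a plain string.
        action = getattr(item.action, "value", item.action)
        if action != "DELETED":  # assumed action value, see the note above
            names.append(item.memory.name)
    return names


# Hypothetical stand-ins for entries in response.generated_memories.
entries = [
    SimpleNamespace(action="DELETED", memory=SimpleNamespace(name="memories/old")),
    SimpleNamespace(action="CREATED", memory=SimpleNamespace(name="memories/new")),
]

print(names_safe_to_get(entries))
```

Only the returned names would then be passed to `client.agent_engines.memories.get(name=...)`.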

## Cleaning up

It's good practice to delete cloud resources you no longer need so that they don't incur unexpected costs. This final cell deletes the Agent Engine instance created in this tutorial; `force=True` also deletes its child resources, such as the stored memories.

In [49]:
delete_agent_engines = True

agent_engine_name = agent_engine.api_resource.name

if delete_agent_engines:
    # Delete agent engine
    client.agent_engines.delete(name=agent_engine_name, force=True)

INFO:vertexai_genai.agentengines:Deleting AgentEngine resource: projects/507092439621/locations/us-central1/reasoningEngines/241140492157321216
INFO:vertexai_genai.agentengines:Started AgentEngine delete operation: projects/507092439621/locations/us-central1/operations/7377353397491466240
