<a href="https://colab.research.google.com/github/perfectmalcolm/Zawadi-Tunu/blob/main/zawadi_tunu_with_memory_bank_on_adk.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Get started with Zawadi Tunu, built with Memory Bank on ADK

<table align="left">
  <td style="text-align: center">
    <a href="https://colab.research.google.com/github/mwanyumba7/Zawadi-Tunu/blob/main/zawadi_tunu_with_memory_bank_on_adk.ipynb">
      <img width="32px" src="https://www.gstatic.com/pantheon/images/bigquery/welcome_page/colab-logo.svg" alt="Google Colaboratory logo"><br> Open in Colab
    </a>
  </td>
</table>

## Overview
In this tutorial, we build `Zawadi Tunu`, an agent specialized in gift recommendations. It is designed to turn gift-giving from a chore into a personalized and meaningful experience. Built as a demonstration of the Agent Development Kit (ADK) for Gemini and the Vertex AI Memory Bank, the agent's core strength is its ability to retain and recall nuanced information over time.

Instead of generic suggestions, the agent uses its long-term memory to remember key details gleaned from previous conversations: a niece's age and past birthday presents (like the "red bike" in the demo), a spouse's stated preference for aisle seats, or a parent's hint about a desired hobby.

This tutorial demonstrates how to build ADK agents with Memory, including both `InMemoryMemoryService` and `VertexAiMemoryBankService`.

This tutorial will cover:
* Comparing `InMemoryMemoryService` vs. `VertexAiMemoryBankService`.
* Generating memories with ADK and Agent Engine Memory Bank.
* Retrieving memories with ADK and Agent Engine Memory Bank.
* Customizing your Memory Bank instance's behavior.

## Get started

### Install the Agent Development Kit (ADK) and other required packages


In [1]:
%pip install google-adk google-cloud-aiplatform --upgrade --quiet


### Authenticate your notebook environment (Colab only)

If you're running this notebook on Google Colab, run the cell below to authenticate your environment.

In [3]:
import sys

if "google.colab" in sys.modules:
    from google.colab import auth

    auth.authenticate_user()

### Set Google Cloud project information

To get started using Vertex AI, you must have an existing Google Cloud project and [enable the Vertex AI API](https://console.cloud.google.com/flows/enableapi?apiid=aiplatform.googleapis.com).

Learn more about [setting up a project and a development environment](https://cloud.google.com/vertex-ai/docs/start/cloud-environment).

In [4]:
# Use the environment variable if the user doesn't provide Project ID.
import os

import vertexai

# fmt: off
PROJECT = "devfest-2025-day-1"  # @param {type: "string", placeholder: "[your-project-id]", isTemplate: true}
# fmt: on
if not PROJECT or PROJECT == "[your-project-id]":
    PROJECT = os.environ.get("GOOGLE_CLOUD_PROJECT", "")

LOCATION = os.environ.get("GOOGLE_CLOUD_REGION", "us-central1")

client = vertexai.Client(project=PROJECT, location=LOCATION)

# Set environment variables for ADK.
os.environ["GOOGLE_GENAI_USE_VERTEXAI"] = "TRUE"
os.environ["GOOGLE_CLOUD_PROJECT"] = PROJECT
os.environ["GOOGLE_CLOUD_LOCATION"] = LOCATION

from google.cloud.aiplatform.utils import gcs_utils
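
The project lookup above falls back from the form field to an environment variable. A minimal sketch of the same idea with an explicit failure when neither source provides a value is shown here. `resolve_project` and its `placeholder` argument are hypothetical names introduced for illustration; they are not part of the Vertex AI SDK.

```python
import os


def resolve_project(explicit, env=None, placeholder="[your-project-id]"):
    """Return a usable project ID, or raise if none is configured."""
    env = os.environ if env is None else env
    # Prefer the value typed into the notebook form, unless it is the placeholder.
    if explicit and explicit != placeholder:
        return explicit
    project = env.get("GOOGLE_CLOUD_PROJECT")
    if not project:
        raise ValueError(
            "Set PROJECT or the GOOGLE_CLOUD_PROJECT environment variable."
        )
    return project


# An explicit mapping keeps this example independent of the real environment.
print(resolve_project("", env={"GOOGLE_CLOUD_PROJECT": "my-project"}))  # my-project
```

Failing early here avoids passing an empty or placeholder project ID to the client, where the resulting error would be harder to interpret.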


## (Optional) Set up logging
To surface ADK logs in your notebook, you may need to configure the logging level of ADK's logger.

In [5]:
import logging
import sys

# Set up a general handler for all logs to be printed in your Colab output.
# We'll set the overall level to INFO.
logging.basicConfig(
    level=logging.INFO,
    format="%(asctime)s - %(name)s - %(levelname)s - %(message)s",
    stream=sys.stdout,
)

# Get the specific logger used by ADK.
adk_logger = logging.getLogger("google_adk")

# Set its level to DEBUG to ensure you capture ALL messages from the SDK,
# even if the general level is higher (like INFO).
adk_logger.setLevel(logging.DEBUG)
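
The log lines later in this notebook come from loggers such as `google_adk.google.adk.models.google_llm`, which are children of `google_adk`. The sketch below uses only the standard `logging` module to show how a level set on a parent logger applies to its children while unrelated loggers keep the root level. The logger names `demo_sdk` and `some_other_library` are placeholders chosen for this example.

```python
import logging

# Root logger at INFO, mirroring the basicConfig call above.
logging.getLogger().setLevel(logging.INFO)

# "demo_sdk" is a placeholder standing in for "google_adk".
sdk_logger = logging.getLogger("demo_sdk")
sdk_logger.setLevel(logging.DEBUG)

# Child loggers without their own level inherit from the nearest configured parent.
child_logger = logging.getLogger("demo_sdk.models")
other_logger = logging.getLogger("some_other_library")

print(child_logger.isEnabledFor(logging.DEBUG))  # True
print(other_logger.isEnabledFor(logging.DEBUG))  # False
```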

## Define a simple agent

The ADK Runner defines what session service is used by the agent. The session service is responsible for saving the conversation history.

In this example, we're using the in-memory session service (`InMemorySessionService`). However, you can use any session service with Memory.

If you want to use Google's pre-built Memory tools, you can also provide a Memory service to the Runner. In this example, we start by invoking the memory services directly so that we can compare them, so `memory_service` is not set on the Runner yet.

In [6]:
from google.adk.agents import LlmAgent
from google.adk.runners import Runner
from google.adk.sessions import InMemorySessionService
from google.genai.types import Content, Part

APP_NAME = "memory_example_app"
MODEL = "gemini-2.5-flash"

agent = LlmAgent(
    model=MODEL,
    name="Generic_QA_Agent",
    instruction="Answer the user's questions",
)

session_service = InMemorySessionService()

runner = Runner(agent=agent, app_name=APP_NAME, session_service=session_service)


def call_agent(query, session_id, user_id):
    """Send `query` to the agent and print its final response."""
    content = Content(role="user", parts=[Part(text=query)])
    events = runner.run(user_id=user_id, session_id=session_id, new_message=content)

    # Only the final response event carries the answer shown to the user.
    for event in events:
        if event.is_final_response():
            final_response = event.content.parts[0].text
            print("Agent Response: ", final_response)
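
`call_agent` reads `event.content.parts[0].text` directly, which assumes the final event always has content with at least one text part. A more defensive variant could look like the sketch below. `final_response_text` and `fake_event` are hypothetical helpers; the example uses duck-typed stand-in objects rather than real ADK events so that it runs without the SDK.

```python
from types import SimpleNamespace


def final_response_text(events):
    """Join the text parts of the last final-response event, or return None."""
    text = None
    for event in events:
        if not event.is_final_response():
            continue
        content = getattr(event, "content", None)
        parts = getattr(content, "parts", None) or []
        pieces = [p.text for p in parts if getattr(p, "text", None)]
        if pieces:
            text = "".join(pieces)
    return text


def fake_event(final, texts):
    """Build a stand-in event; `texts=None` simulates an event without content."""
    content = None
    if texts is not None:
        content = SimpleNamespace(parts=[SimpleNamespace(text=t) for t in texts])
    return SimpleNamespace(is_final_response=lambda: final, content=content)


events = [
    fake_event(False, ["intermediate"]),
    fake_event(True, None),
    fake_event(True, ["Hello", " there"]),
]
print(final_response_text(events))  # Hello there
```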

## Introduction to ADK Memory Service
ADK memory services offer an interface for creating and searching for memories. The [`BaseMemoryService`](https://github.com/google/adk-python/blob/main/src/google/adk/memory/base_memory_service.py) defines the interface for managing this searchable, long-term knowledge store. Its primary responsibilities are:

* Ingesting Information (`add_session_to_memory`): Taking the contents of a (usually completed) Session and adding relevant information to the long-term knowledge store.
* Searching Information (`search_memory`): Allowing an agent (typically via a Tool) to query the knowledge store and retrieve relevant snippets or context based on a search query.

In this colab, we'll cover two options:

* `InMemoryMemoryService`: An in-memory service that lets you search your conversation history and retrieve the raw dialogue using basic keyword matching.
* `VertexAiMemoryBankService`: A remote service that generates memories from the "meaningful" information in user conversations.


You can also interact directly with Memory Bank by using the Agent Engine SDK.

### `InMemoryMemoryService`

[`InMemoryMemoryService`](https://github.com/google/adk-python/blob/main/src/google/adk/memory/in_memory_memory_service.py) persists **all** information in-memory. The events are not condensed, and information is not extracted from them.

In [7]:
from google.adk.memory import InMemoryMemoryService

in_memory_service = InMemoryMemoryService()
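
To make "raw dialogue plus basic keyword matching" concrete, here is a toy stand-in for the two-method interface described above. It is a sketch only: `ToyKeywordMemory` is a hypothetical class, its method signatures are simplified, and the any-shared-word rule is an assumption for illustration rather than a description of ADK's actual matching logic.

```python
import re
from collections import defaultdict


class ToyKeywordMemory:
    """Hypothetical store that keeps raw utterances and matches on shared words."""

    def __init__(self):
        # (app_name, user_id) -> list of raw texts
        self._store = defaultdict(list)

    def add_session_to_memory(self, app_name, user_id, texts):
        # Nothing is condensed or extracted; every utterance is kept as-is.
        self._store[(app_name, user_id)].extend(texts)

    def search_memory(self, app_name, user_id, query):
        query_words = set(re.findall(r"[a-z']+", query.lower()))
        hits = []
        for text in self._store.get((app_name, user_id), []):
            text_words = set(re.findall(r"[a-z']+", text.lower()))
            # A single shared word is enough to count as a match.
            if query_words & text_words:
                hits.append(text)
        return hits


memory = ToyKeywordMemory()
memory.add_session_to_memory(
    "app", "user", ["what types of questions do you answer"]
)
print(memory.search_memory("app", "user", "test"))  # []
print(memory.search_memory("app", "user", "what should I get my mom for mother's day"))
# ['what types of questions do you answer']
```

Under this rule an unrelated question can still match stored dialogue because it shares a common word such as "what", which is the main weakness of keyword matching over raw history.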

### `VertexAiMemoryBankService`
Memory Bank will only persist information that's meaningful for future interactions. [`VertexAiMemoryBankService`](https://github.com/google/adk-python/blob/main/src/google/adk/memory/vertex_ai_memory_bank_service.py) provides a built-in integration between Memory Bank and ADK.

To get started with Memory Bank, you need to first create an Agent Engine. This only takes a few seconds.

If you don't have a Google Cloud project, you can use [Vertex in Express mode, which only requires an API key](https://cloud.google.com/vertex-ai/generative-ai/docs/agent-engine/memory-bank/set-up#authentication).

In [8]:
import vertexai
from google.adk.memory import VertexAiMemoryBankService

client = vertexai.Client(project=PROJECT, location=LOCATION)

agent_engine = client.agent_engines.create(
    config={
        "context_spec": {
            "memory_bank_config": {
                "generation_config": {
                    "model": f"projects/{PROJECT}/locations/{LOCATION}/publishers/google/models/gemini-2.5-flash"
                }
            }
        }
    }
)

memory_bank_service = VertexAiMemoryBankService(
    agent_engine_id=agent_engine.api_resource.name.split("/")[-1],
    project=PROJECT,
    location=LOCATION,
)

Note: Creating an Agent Engine requires billing to be enabled on your Google Cloud project. If it is not, `client.agent_engines.create` fails with `403 PERMISSION_DENIED` and the reason `BILLING_DISABLED`. Enable billing for the project, wait a few minutes for the change to propagate, and retry.
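
The `agent_engine_id` passed to `VertexAiMemoryBankService` is the last path segment of the Agent Engine resource name, which has the form `projects/.../locations/.../reasoningEngines/<id>` in the logs shown later in this notebook. A small sketch of extracting it with a sanity check follows. `agent_engine_id_from_name` is a hypothetical helper and the resource name in the example is a placeholder.

```python
def agent_engine_id_from_name(resource_name):
    """Extract the trailing ID from a reasoningEngines resource name."""
    parts = resource_name.strip("/").split("/")
    if len(parts) < 2 or parts[-2] != "reasoningEngines":
        raise ValueError(f"Unexpected Agent Engine resource name: {resource_name!r}")
    return parts[-1]


# Placeholder values, not a real resource.
name = "projects/my-project/locations/us-central1/reasoningEngines/1234567890"
print(agent_engine_id_from_name(name))  # 1234567890
```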

## Generating Memories

Memories will be generated from your conversation history stored in the `Session` object. The ADK `add_session_to_memory` method is responsible for triggering memory generation for your memory service, but the exact behavior depends on which service you use:

* `InMemoryMemoryService` persists all information, including the raw dialogue.

* `VertexAiMemoryBankService` only persists *meaningful* information. Not all conversations will result in generated memories. Additionally, the information will be condensed to individual, self-contained memories.

In this example, we create two sessions:
1. `chit_chat_session`, which is a conversation that contains events that may not be meaningful for future interactions.
2. `recommendations_session`, which is a conversation about looking for birthday presents for a family member. It includes implicit information that may be helpful for future conversations.

In [None]:
USER_ID = "My User"

chit_chat_session = await session_service.create_session(
    app_name=APP_NAME, user_id=USER_ID
)

call_agent(
    "what types of questions do you answer", chit_chat_session.id, user_id=USER_ID
)
# Refresh session object.
chit_chat_session = await session_service.get_session(
    app_name=APP_NAME, user_id=USER_ID, session_id=chit_chat_session.id
)

INFO:google_adk.google.adk.models.google_llm:Sending out request, model: gemini-2.5-flash, backend: GoogleLLMVariant.VERTEX_AI, stream: False
DEBUG:google_adk.google.adk.models.google_llm:
LLM Request:
-----------------------------------------------------------
System Instruction:
Answer the user's questions

You are an agent. Your internal name is "Generic_QA_Agent".
-----------------------------------------------------------
Contents:
{"parts":[{"text":"what types of questions do you answer"}],"role":"user"}
-----------------------------------------------------------
Functions:

-----------------------------------------------------------

INFO:google_adk.google.adk.models.google_llm:Response received from the model.
DEBUG:google_adk.google.adk.models.google_llm:
LLM Response:
-----------------------------------------------------------
Text:
As a general-purpose AI, I can answer a wide variety of questions across many domains. Here are some common types of questions I can help with:



Agent Response:  As a general-purpose AI, I can answer a wide variety of questions across many domains. Here are some common types of questions I can help with:

*   **Factual Questions:** "What is the capital of France?" or "When was the internet invented?"
*   **Definitional Questions:** "What is quantum mechanics?" or "Define 'photosynthesis'."
*   **Explanatory Questions:** "How does a car engine work?" or "Why do we have different seasons?"
*   **Comparative Questions:** "What are the differences between a democracy and a republic?" or "Compare Python and Java."
*   **Procedural/Instructional Questions:** "How do I make a simple résumé?" or "What are the steps to bake a chocolate cake?"
*   **Problem-Solving Questions (conceptual):** "What are some strategies to improve time management?" or "Suggest ways to reduce plastic waste."
*   **Creative/Generative Questions:** "Write a short story about a talking cat." or "Generate some ideas for a new marketing campaign."
*   **Summarizat

In [None]:
recommendations_session = await session_service.create_session(
    app_name=APP_NAME, user_id=USER_ID
)

call_agent(
    "I have a three year old niece. What recommendations do you have for birthday presents?",
    recommendations_session.id,
    user_id=USER_ID,
)
call_agent(
    "I like the idea of a bike. Are there any age appropriate options?",
    recommendations_session.id,
    user_id=USER_ID,
)

recommendations_session = await session_service.get_session(
    app_name=APP_NAME, user_id=USER_ID, session_id=recommendations_session.id
)

INFO:google_adk.google.adk.models.google_llm:Sending out request, model: gemini-2.5-flash, backend: GoogleLLMVariant.VERTEX_AI, stream: False
DEBUG:google_adk.google.adk.models.google_llm:
LLM Request:
-----------------------------------------------------------
System Instruction:
Answer the user's questions

You are an agent. Your internal name is "Generic_QA_Agent".
-----------------------------------------------------------
Contents:
{"parts":[{"text":"I have a three year old niece. What recommendations do you have for birthday presents?"}],"role":"user"}
-----------------------------------------------------------
Functions:

-----------------------------------------------------------

INFO:google_adk.google.adk.models.google_llm:Response received from the model.
DEBUG:google_adk.google.adk.models.google_llm:
LLM Response:
-----------------------------------------------------------
Text:
That's a fun age to buy for! Three-year-olds are bursting with imagination, developing new skill

Agent Response:  That's a fun age to buy for! Three-year-olds are bursting with imagination, developing new skills rapidly, and becoming more independent. The best gifts encourage open-ended play, creativity, and learning.

Here are some recommendations for your three-year-old niece, categorized to help you choose:

**1. Creative & Art Supplies:**

*   **Play-Doh Set:** Great for fine motor skills, imagination, and sensory play. Look for sets with tools like cutters, rollers, or simple extruders.
*   **Jumbo Crayons, Washable Markers, or Finger Paints:** Large sizes are easier for small hands to grasp. Washable is key for parents! Pair with a large pad of paper or a tabletop easel.
*   **Water Wow! Books (Melissa & Doug):** These are fantastic for mess-free "painting" with water. The colors magically appear and then disappear as it dries, so they can be used again and again.
*   **Simple Craft Kits:** Look for kits with large pieces, like sticker scenes, foam art, or chunky bead string

INFO:google_adk.google.adk.models.google_llm:Response received from the model.
DEBUG:google_adk.google.adk.models.google_llm:
LLM Response:
-----------------------------------------------------------
Text:
That's a fantastic idea! Bikes are wonderful for gross motor development, outdoor play, and building confidence. For a three-year-old, there are definitely age-appropriate options, and the most recommended one for this age is often a **balance bike**.

Here's a breakdown of options suitable for a 3-year-old:

### 1. Balance Bikes (Highly Recommended)

*   **What they are:** Balance bikes are two-wheeled bikes without pedals. Children sit on the seat and propel themselves by pushing off the ground with their feet.
*   **Why they're great for 3-year-olds:**
    *   **Focus on Balance First:** This is their biggest advantage. Kids learn to balance and steer naturally without the added complexity of pedaling. This makes the transition to a pedal bike much easier later on, often skipping 

Agent Response:  That's a fantastic idea! Bikes are wonderful for gross motor development, outdoor play, and building confidence. For a three-year-old, there are definitely age-appropriate options, and the most recommended one for this age is often a **balance bike**.

Here's a breakdown of options suitable for a 3-year-old:

### 1. Balance Bikes (Highly Recommended)

*   **What they are:** Balance bikes are two-wheeled bikes without pedals. Children sit on the seat and propel themselves by pushing off the ground with their feet.
*   **Why they're great for 3-year-olds:**
    *   **Focus on Balance First:** This is their biggest advantage. Kids learn to balance and steer naturally without the added complexity of pedaling. This makes the transition to a pedal bike much easier later on, often skipping training wheels entirely.
    *   **Develops Confidence:** Children feel more in control because their feet are always close to the ground.
    *   **Gross Motor Skills:** Excellent for dev

### InMemoryMemoryService

First, let's see what is stored from the conversation when using `InMemoryMemoryService`.

The memories include the raw dialogue of the conversation.

In [None]:
await in_memory_service.add_session_to_memory(chit_chat_session)
await in_memory_service.add_session_to_memory(recommendations_session)

In [None]:
# Peek at the memories.
in_memory_service._session_events

{'memory_example_app/My User': {'a09fd30b-0e56-4f85-96b8-7f42f820e79d': [Event(content=Content(
     parts=[
       Part(
         text='what types of questions do you answer'
       ),
     ],
     role='user'
   ), grounding_metadata=None, partial=None, turn_complete=None, finish_reason=None, error_code=None, error_message=None, interrupted=None, custom_metadata=None, usage_metadata=None, live_session_resumption_update=None, input_transcription=None, output_transcription=None, avg_logprobs=None, logprobs_result=None, cache_metadata=None, citation_metadata=None, invocation_id='e-b86a23ce-6127-4824-8d50-e720e372b819', author='user', actions=EventActions(skip_summarization=None, state_delta={}, artifact_delta={}, transfer_to_agent=None, escalate=None, requested_auth_configs={}, requested_tool_confirmations={}, compaction=None, end_of_agent=None, agent_state=None), long_running_tool_ids=None, branch=None, id='57ea1b0f-d22c-49d1-9f7d-08c014a44e3c', timestamp=1761054429.426718),
   Event(c

In [None]:
await in_memory_service.search_memory(app_name=APP_NAME, user_id=USER_ID, query="test")

SearchMemoryResponse(memories=[])

In [None]:
await in_memory_service.search_memory(
    app_name=APP_NAME,
    user_id=USER_ID,
    query="what should I get my mom for mother's day",
)

SearchMemoryResponse(memories=[MemoryEntry(content=Content(
  parts=[
    Part(
      text='what types of questions do you answer'
    ),
  ],
  role='user'
), author='user', timestamp='2025-10-21T13:47:09.426718'), MemoryEntry(content=Content(
  parts=[
    Part(
      text="""As a general-purpose AI, I can answer a wide variety of questions across many domains. Here are some common types of questions I can help with:

*   **Factual Questions:** "What is the capital of France?" or "When was the internet invented?"
*   **Definitional Questions:** "What is quantum mechanics?" or "Define 'photosynthesis'."
*   **Explanatory Questions:** "How does a car engine work?" or "Why do we have different seasons?"
*   **Comparative Questions:** "What are the differences between a democracy and a republic?" or "Compare Python and Java."
*   **Procedural/Instructional Questions:** "How do I make a simple résumé?" or "What are the steps to bake a chocolate cake?"
*   **Problem-Solving Questions (conc

### Memory Bank via `VertexAiMemoryBankService`

Now, let's use Memory Bank to generate and store memories. Unlike `InMemoryMemoryService`, only content that is meaningful for future interactions will be persisted. If there's no meaningful content, the `add_session_to_memory` call is effectively a no-op and no memories are generated.

Memory generation happens in the background. It's a non-blocking operation, so `add_session_to_memory` may complete before memories are actually generated in the background. If you want memory generation to be a blocking operation, use the Agent Engine SDK:

```
client.agent_engines.memories.generate(
    ...,
    config={
        # Default value is True.
        "wait_for_completion": True
    }
)
```

There are two key operations that Memory Bank performs under the hood during memory generation:

* **Extraction**: Extracts information about the user from their conversations with the agent. Only information that matches at least one of your instance's memory topics will be persisted.
* **Consolidation**: Identifies if existing memories with the same scope should be deleted or updated based on the extracted information. Memory Bank checks that new memories are not duplicative or contradictory before merging them with existing memories. If existing memories don't overlap with the new information, a new memory will be created.
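
The consolidation decision can be pictured with a deliberately crude stand-in. Memory Bank relies on a model to compare and merge facts; the sketch below replaces that with word overlap (Jaccard similarity) purely to illustrate the "update an overlapping memory, otherwise create a new one" branch. The `consolidate` function, the `threshold` value, and the `"CREATED"` label are assumptions made for this example.

```python
import re


def _words(text):
    return set(re.findall(r"[a-z]+", text.lower()))


def consolidate(existing, new_fact, threshold=0.3):
    """Return (action, memories) after merging `new_fact` into `existing`."""
    new_words = _words(new_fact)
    best_index, best_score = None, 0.0
    for i, fact in enumerate(existing):
        old_words = _words(fact)
        union = new_words | old_words
        score = len(new_words & old_words) / len(union) if union else 0.0
        if score > best_score:
            best_index, best_score = i, score
    if best_index is not None and best_score >= threshold:
        updated = list(existing)
        # Crude merge: the newest wording simply replaces the old fact.
        updated[best_index] = new_fact
        return "UPDATED", updated
    return "CREATED", list(existing) + [new_fact]


memories = [
    "I have a three-year-old niece.",
    "I like the idea of giving my niece a bike.",
]
action, memories = consolidate(memories, "I got my niece a red bike")
print(action, len(memories))  # UPDATED 2

action, memories = consolidate(memories, "My mom enjoys gardening")
print(action, len(memories))  # CREATED 3
```

A real consolidation step would merge details from both facts instead of overwriting, as the `UPDATED` results later in this notebook show.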

#### ADK `add_session_to_memory`

In [None]:
await memory_bank_service.add_session_to_memory(chit_chat_session)
await memory_bank_service.add_session_to_memory(recommendations_session)

INFO:google_adk.google.adk.memory.vertex_ai_memory_bank_service:Generate memory response received.
DEBUG:google_adk.google.adk.memory.vertex_ai_memory_bank_service:Generate memory response: name='projects/437300234315/locations/us-central1/reasoningEngines/283924688617340928/operations/6122942571594383360' metadata={'@type': 'type.googleapis.com/google.cloud.aiplatform.v1beta1.GenerateMemoriesOperationMetadata', 'genericMetadata': {'createTime': '2025-10-21T14:00:46.134926Z', 'updateTime': '2025-10-21T14:00:46.134926Z'}} done=None error=None response=None
INFO:google_adk.google.adk.memory.vertex_ai_memory_bank_service:Generate memory response received.
DEBUG:google_adk.google.adk.memory.vertex_ai_memory_bank_service:Generate memory response: name='projects/437300234315/locations/us-central1/reasoningEngines/283924688617340928/operations/2811670945570226176' metadata={'@type': 'type.googleapis.com/google.cloud.aiplatform.v1beta1.GenerateMemoriesOperationMetadata', 'genericMetadata': {'c

If you run this immediately after `add_session_to_memory`, there may be no memories printed. `add_session_to_memory` is a non-blocking function and memory generation can take a few seconds to run.

In [None]:
await memory_bank_service.search_memory(
    app_name=APP_NAME,
    user_id=USER_ID,
    query="What should I get my mom for mother's day",
)

INFO:google_adk.google.adk.memory.vertex_ai_memory_bank_service:Search memory response received.
DEBUG:google_adk.google.adk.memory.vertex_ai_memory_bank_service:Retrieved memory: distance=1.0117469695290702 memory=Memory(
  create_time=datetime.datetime(2025, 10, 21, 13, 54, 55, 860546, tzinfo=TzInfo(UTC)),
  fact='I like the idea of giving my three-year-old niece a bike for her birthday, and I am looking for age-appropriate options.',
  name='projects/437300234315/locations/us-central1/reasoningEngines/283924688617340928/memories/4467337184130629632',
  scope={
    'app_name': 'memory_example_app',
    'user_id': 'My User'
  },
  update_time=datetime.datetime(2025, 10, 21, 14, 0, 56, 83753, tzinfo=TzInfo(UTC))
)
DEBUG:google_adk.google.adk.memory.vertex_ai_memory_bank_service:Retrieved memory: distance=1.0594453096266772 memory=Memory(
  create_time=datetime.datetime(2025, 10, 21, 13, 54, 55, 859611, tzinfo=TzInfo(UTC)),
  fact='I have a three-year-old niece.',
  name='projects/43730

SearchMemoryResponse(memories=[MemoryEntry(content=Content(
  parts=[
    Part(
      text='I like the idea of giving my three-year-old niece a bike for her birthday, and I am looking for age-appropriate options.'
    ),
  ],
  role='user'
), author='user', timestamp='2025-10-21T14:00:56.083753+00:00'), MemoryEntry(content=Content(
  parts=[
    Part(
      text='I have a three-year-old niece.'
    ),
  ],
  role='user'
), author='user', timestamp='2025-10-21T14:00:56.083625+00:00')])

#### Agent Engine SDK `GenerateMemories`

If you want to interact directly with Memory Bank, you can use the Agent Engine SDK. For example, you can use [`RetrieveMemories`](https://cloud.google.com/vertex-ai/generative-ai/docs/agent-engine/memory-bank/fetch-memories#scope-based) to retrieve memories with or without similarity search.

Memories are isolated by their `scope`, which is an arbitrary dictionary. `VertexAiMemoryBankService` uses the scope keys `app_name` and `user_id`. `app_name` is set by your Runner, and `user_id` is set when creating your `Session`.
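
Scope isolation can be pictured as a lookup keyed by the whole scope dictionary. The sketch below is a toy in-process model, assuming that retrieval requires the same set of key-value pairs regardless of key order. `ToyScopedStore` is a hypothetical class and does not call Memory Bank.

```python
class ToyScopedStore:
    """Hypothetical in-process store keyed by the exact scope dictionary."""

    def __init__(self):
        self._by_scope = {}

    @staticmethod
    def _key(scope):
        # Dicts are unhashable; a frozenset of items ignores key order.
        return frozenset(scope.items())

    def add(self, scope, fact):
        self._by_scope.setdefault(self._key(scope), []).append(fact)

    def retrieve(self, scope):
        return list(self._by_scope.get(self._key(scope), []))


store = ToyScopedStore()
store.add(
    {"app_name": "memory_example_app", "user_id": "My User"},
    "I have a three-year-old niece.",
)

# Same pairs in a different order: same scope.
print(store.retrieve({"user_id": "My User", "app_name": "memory_example_app"}))
# ['I have a three-year-old niece.']

# A partial scope is a different scope in this model.
print(store.retrieve({"user_id": "My User"}))  # []
```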

In [None]:
response = client.agent_engines.memories.retrieve(
    name=agent_engine.api_resource.name,
    scope={"app_name": APP_NAME,
           "user_id": USER_ID},
)
list(response)

[RetrieveMemoriesResponseRetrievedMemory(
   memory=Memory(
     create_time=datetime.datetime(2025, 10, 21, 13, 54, 55, 859611, tzinfo=TzInfo(UTC)),
     fact='I have a three-year-old niece.',
     name='projects/437300234315/locations/us-central1/reasoningEngines/283924688617340928/memories/6773180193344323584',
     scope={
       'app_name': 'memory_example_app',
       'user_id': 'My User'
     },
     update_time=datetime.datetime(2025, 10, 21, 14, 0, 56, 83625, tzinfo=TzInfo(UTC))
   )
 ),
 RetrieveMemoriesResponseRetrievedMemory(
   memory=Memory(
     create_time=datetime.datetime(2025, 10, 21, 13, 54, 55, 860546, tzinfo=TzInfo(UTC)),
     fact='I like the idea of giving my three-year-old niece a bike for her birthday, and I am looking for age-appropriate options.',
     name='projects/437300234315/locations/us-central1/reasoningEngines/283924688617340928/memories/4467337184130629632',
     scope={
       'app_name': 'memory_example_app',
       'user_id': 'My User'
     },
  

You can also use `RetrieveMemories` to retrieve memories using similarity search.
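
Similarity search returns a `distance` with each memory. The sketch below illustrates ranking by distance, assuming that a smaller distance means a closer match, which is consistent with the outputs in this notebook. It uses bag-of-words unit vectors as a stand-in for learned embeddings, so the numbers are not comparable to Memory Bank's; `distance` and `rank` are hypothetical helpers.

```python
import math
import re
from collections import Counter


def _unit_vector(text):
    counts = Counter(re.findall(r"[a-z]+", text.lower()))
    norm = math.sqrt(sum(c * c for c in counts.values())) or 1.0
    return {word: c / norm for word, c in counts.items()}


def distance(a, b):
    """Euclidean distance between bag-of-words unit vectors (0 means identical)."""
    va, vb = _unit_vector(a), _unit_vector(b)
    vocabulary = set(va) | set(vb)
    return math.sqrt(sum((va.get(w, 0.0) - vb.get(w, 0.0)) ** 2 for w in vocabulary))


def rank(query, facts, top_k=2):
    """Return up to `top_k` (distance, fact) pairs, nearest first."""
    return sorted((distance(query, fact), fact) for fact in facts)[:top_k]


facts = ["I have a three-year-old niece.", "My mom enjoys gardening."]
for dist, fact in rank("what should I get my niece", facts):
    print(round(dist, 3), fact)
```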

In [None]:
response = client.agent_engines.memories.retrieve(
    name=agent_engine.api_resource.name,
    scope={"app_name": APP_NAME, "user_id": USER_ID},
    similarity_search_params={
        "search_query": "what should I get my niece for her birthday?"
    },
)
list(response)

[RetrieveMemoriesResponseRetrievedMemory(
   distance=0.7347648884243799,
   memory=Memory(
     create_time=datetime.datetime(2025, 10, 21, 13, 54, 55, 859611, tzinfo=TzInfo(UTC)),
     fact='I have a three-year-old niece.',
     name='projects/437300234315/locations/us-central1/reasoningEngines/283924688617340928/memories/6773180193344323584',
     scope={
       'app_name': 'memory_example_app',
       'user_id': 'My User'
     },
     update_time=datetime.datetime(2025, 10, 21, 14, 0, 56, 83625, tzinfo=TzInfo(UTC))
   )
 ),
 RetrieveMemoriesResponseRetrievedMemory(
   distance=0.7844351878384009,
   memory=Memory(
     create_time=datetime.datetime(2025, 10, 21, 13, 54, 55, 860546, tzinfo=TzInfo(UTC)),
     fact='I like the idea of giving my three-year-old niece a bike for her birthday, and I am looking for age-appropriate options.',
     name='projects/437300234315/locations/us-central1/reasoningEngines/283924688617340928/memories/4467337184130629632',
     scope={
       'app_nam

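Memory Bank uses a process called "consolidation" to avoid storing duplicative or contradictory information for the same scope. In the next cell, the new event overlaps with an existing memory, so that memory is updated instead of a second one being created.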

In [None]:
import pprint

operation = client.agent_engines.memories.generate(
    name=agent_engine.api_resource.name,
    scope={"app_name": APP_NAME, "user_id": USER_ID},
    direct_contents_source={
        "events": [
            {
                "content": {
                    "role": "user",
                    "parts": [
                        {
                            "text": "I like the idea of getting my niece a bike for her third birthday"
                        }
                    ],
                }
            }
        ]
    },
)
print("***OPERATION RESPONSE***")
pprint.pprint(operation)

print("\n***GENERATED MEMORIES***")
for generated_memory in operation.response.generated_memories:
    pprint.pprint(client.agent_engines.memories.get(name=generated_memory.memory.name))

***OPERATION RESPONSE***
AgentEngineGenerateMemoriesOperation(
  done=True,
  metadata={
    '@type': 'type.googleapis.com/google.cloud.aiplatform.v1beta1.GenerateMemoriesOperationMetadata',
    'genericMetadata': {
      'createTime': '2025-10-21T14:13:57.177888Z',
      'updateTime': '2025-10-21T14:14:08.005726Z'
    }
  },
  name='projects/437300234315/locations/us-central1/reasoningEngines/283924688617340928/operations/4212149692194095104',
  response=GenerateMemoriesResponse(
    generated_memories=[
      GenerateMemoriesResponseGeneratedMemory(
        action=<GenerateMemoriesResponseGeneratedMemoryAction.UPDATED: 'UPDATED'>,
        memory=Memory(
          name='projects/437300234315/locations/us-central1/reasoningEngines/283924688617340928/memories/4467337184130629632'
        )
      ),
    ]
  )
)

***GENERATED MEMORIES***
Memory(
  create_time=datetime.datetime(2025, 10, 21, 13, 54, 55, 860546, tzinfo=TzInfo(UTC)),
  fact='I like the idea of getting my three-year-old niece

What if your agent has already extracted facts that you want to combine with your existing memories? You can also use `GenerateMemories` to consolidate pre-extracted facts for the same scope (see the [documentation](https://cloud.google.com/vertex-ai/generative-ai/docs/agent-engine/memory-bank/generate-memories#consolidate-pre-extracted-memories)).

Memory Bank already contains memories on this topic. So, the relevant existing memories will be updated rather than creating a new, duplicative memory.

In [None]:
operation = client.agent_engines.memories.generate(
    name=agent_engine.api_resource.name,
    scope={"app_name": APP_NAME, "user_id": USER_ID},
    direct_memories_source={"direct_memories": [{"fact": "I got my niece a red bike"}]},
)

print("***OPERATION RESPONSE***")
pprint.pprint(operation)

print("\n***GENERATED MEMORIES***")
for generated_memory in operation.response.generated_memories:
    if generated_memory.action != "DELETED":
        pprint.pprint(
            client.agent_engines.memories.get(name=generated_memory.memory.name)
        )

***OPERATION RESPONSE***
AgentEngineGenerateMemoriesOperation(
  done=True,
  metadata={
    '@type': 'type.googleapis.com/google.cloud.aiplatform.v1beta1.GenerateMemoriesOperationMetadata',
    'genericMetadata': {
      'createTime': '2025-10-21T14:15:51.979204Z',
      'updateTime': '2025-10-21T14:15:56.096074Z'
    }
  },
  name='projects/437300234315/locations/us-central1/reasoningEngines/283924688617340928/operations/7929308234635018240',
  response=GenerateMemoriesResponse(
    generated_memories=[
      GenerateMemoriesResponseGeneratedMemory(
        action=<GenerateMemoriesResponseGeneratedMemoryAction.UPDATED: 'UPDATED'>,
        memory=Memory(
          name='projects/437300234315/locations/us-central1/reasoningEngines/283924688617340928/memories/4467337184130629632'
        )
      ),
    ]
  )
)

***GENERATED MEMORIES***
Memory(
  create_time=datetime.datetime(2025, 10, 21, 13, 54, 55, 860546, tzinfo=TzInfo(UTC)),
  fact='I got my three-year-old niece a red bike for her t

#### Automate with Callbacks

To automate the triggering of memory generation, you can implement a callback to generate memories after every turn. See the [`Remember this: Agent state and memory with ADK` blog post](https://cloud.google.com/blog/topics/developers-practitioners/remember-this-agent-state-and-memory-with-adk) for more information.

In [None]:
from google import adk


async def auto_save_session_to_memory_callback(callback_context):
    # Use the invocation context to access the conversation history that should
    # be used as the data source for memory generation.
    await memory_bank_service.add_session_to_memory(
        callback_context._invocation_context.session
    )
    print("\n****Triggered memory generation****\n")


agent = LlmAgent(
    model=MODEL,
    name="Generic_QA_Agent",
    instruction="Answer the user's questions",
    tools=[adk.tools.preload_memory_tool.PreloadMemoryTool()],
    after_agent_callback=auto_save_session_to_memory_callback,
)

runner = Runner(
    agent=agent,
    app_name=APP_NAME,
    session_service=session_service,
    memory_service=memory_bank_service,
)

In [None]:
session = await session_service.create_session(app_name=APP_NAME, user_id=USER_ID)

call_agent(
    "What christmas present recommendations do you have for my niece?",
    session.id,
    user_id=USER_ID,
)

call_agent(
    "I think I'll get her a doll. What types are good for her age?",
    session.id,
    user_id=USER_ID,
)

INFO:google_adk.google.adk.memory.vertex_ai_memory_bank_service:Search memory response received.
DEBUG:google_adk.google.adk.memory.vertex_ai_memory_bank_service:Retrieved memory: distance=0.8091518668940462 memory=Memory(
  create_time=datetime.datetime(2025, 10, 21, 13, 54, 55, 859611, tzinfo=TzInfo(UTC)),
  fact='I have a three-year-old niece.',
  name='projects/437300234315/locations/us-central1/reasoningEngines/283924688617340928/memories/6773180193344323584',
  scope={
    'app_name': 'memory_example_app',
    'user_id': 'My User'
  },
  update_time=datetime.datetime(2025, 10, 21, 14, 0, 56, 83625, tzinfo=TzInfo(UTC))
)
DEBUG:google_adk.google.adk.memory.vertex_ai_memory_bank_service:Retrieved memory: distance=0.8936996259031423 memory=Memory(
  create_time=datetime.datetime(2025, 10, 21, 13, 54, 55, 860546, tzinfo=TzInfo(UTC)),
  fact='I got my three-year-old niece a red bike for her third birthday.',
  name='projects/437300234315/locations/us-central1/reasoningEngines/283924688

Agent Response:  Your three-year-old niece sounds like she has a great bike already! For Christmas, you might consider something different to add to her playtime. Here are some recommendations for a three-year-old:

1.  **Creative Play:**
    *   **Art Supplies:** Large crayons, washable finger paints, chunky paintbrushes, or a simple easel with paper.
    *   **Play-Doh Sets:** These are always a hit and encourage fine motor skills and imaginative creation.
    *   **Sticker Books:** Reusable sticker books or large activity sticker books can provide hours of fun.

2.  **Building & Construction:**
    *   **DUPLO or Large Building Blocks:** Perfect for little hands to build towers, houses, and anything they can imagine.
    *   **Magnetic Tiles:** These are fantastic for open-ended building and exploring shapes.

3.  **Imaginative & Pretend Play:**
    *   **Dress-Up Clothes:** A simple costume (like a fairy, princess, doctor, or animal) can spark endless stories.
    *   **Play Kitche

INFO:google_adk.google.adk.memory.vertex_ai_memory_bank_service:Generate memory response received.
DEBUG:google_adk.google.adk.memory.vertex_ai_memory_bank_service:Generate memory response: name='projects/437300234315/locations/us-central1/reasoningEngines/283924688617340928/operations/683016434195890176' metadata={'@type': 'type.googleapis.com/google.cloud.aiplatform.v1beta1.GenerateMemoriesOperationMetadata', 'genericMetadata': {'createTime': '2025-10-21T14:17:29.840373Z', 'updateTime': '2025-10-21T14:17:29.840373Z'}} done=None error=None response=None



****Triggered memory generation****



INFO:google_adk.google.adk.memory.vertex_ai_memory_bank_service:Search memory response received.
DEBUG:google_adk.google.adk.memory.vertex_ai_memory_bank_service:Retrieved memory: distance=0.9376007979564559 memory=Memory(
  create_time=datetime.datetime(2025, 10, 21, 13, 54, 55, 859611, tzinfo=TzInfo(UTC)),
  fact='I have a three-year-old niece.',
  name='projects/437300234315/locations/us-central1/reasoningEngines/283924688617340928/memories/6773180193344323584',
  scope={
    'app_name': 'memory_example_app',
    'user_id': 'My User'
  },
  update_time=datetime.datetime(2025, 10, 21, 14, 0, 56, 83625, tzinfo=TzInfo(UTC))
)
DEBUG:google_adk.google.adk.memory.vertex_ai_memory_bank_service:Retrieved memory: distance=0.9382428087527607 memory=Memory(
  create_time=datetime.datetime(2025, 10, 21, 13, 54, 55, 860546, tzinfo=TzInfo(UTC)),
  fact='I got my three-year-old niece a red bike for her third birthday.',
  name='projects/437300234315/locations/us-central1/reasoningEngines/283924688

Agent Response:  That's a lovely idea! Dolls are wonderful for encouraging imaginative play, nurturing, and developing social-emotional skills in three-year-olds.

For a three-year-old, you'll want something that is durable, easy for them to manage, and safe. Here are some types of dolls that are generally great for her age:

1.  **Soft-Bodied Baby Dolls:**
    *   **Why they're good:** These are often the first choice for this age. They are soft and cuddly, perfect for hugging and comforting. Many come with simple outfits that are easy for little hands to put on and take off (great for developing fine motor skills). They encourage nurturing play, like feeding, rocking, and putting to bed.
    *   **Look for:** Dolls with soft, washable bodies and simple, chunky clothing. Brands like Cabbage Patch Kids (the soft-bodied versions), Melissa & Doug, or smaller baby doll lines are popular.

2.  **Toddler Dolls (like some "My First" dolls):**
    *   **Why they're good:** These are often sli

INFO:google_adk.google.adk.memory.vertex_ai_memory_bank_service:Generate memory response received.
DEBUG:google_adk.google.adk.memory.vertex_ai_memory_bank_service:Generate memory response: name='projects/437300234315/locations/us-central1/reasoningEngines/283924688617340928/operations/4399612026683392000' metadata={'@type': 'type.googleapis.com/google.cloud.aiplatform.v1beta1.GenerateMemoriesOperationMetadata', 'genericMetadata': {'createTime': '2025-10-21T14:17:45.339368Z', 'updateTime': '2025-10-21T14:17:45.339368Z'}} done=None error=None response=None



****Triggered memory generation****
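
The callback above awaits `add_session_to_memory` directly. If you would rather not let a failed memory write interrupt the agent's reply, one option is to wrap the call and log the error instead. This is a sketch under assumptions: `make_safe_memory_callback` is a hypothetical helper, and the context object is a stand-in that only exposes `_invocation_context.session` like the callback context used in this tutorial.

```python
import asyncio
import logging
from types import SimpleNamespace

logger = logging.getLogger(__name__)


def make_safe_memory_callback(save_session):
    """Wrap an async `save_session(session)` so failures are logged, not raised."""

    async def callback(callback_context):
        session = callback_context._invocation_context.session
        try:
            await save_session(session)
        except Exception:
            logger.exception("Memory generation failed; continuing without it.")
        return None

    return callback


# Stand-ins used only to exercise the wrapper.
saved = []


async def working_save(session):
    saved.append(session)


async def failing_save(session):
    raise RuntimeError("simulated outage")


ctx = SimpleNamespace(_invocation_context=SimpleNamespace(session="session-1"))
# In a notebook, `await` the callbacks instead of calling asyncio.run.
asyncio.run(make_safe_memory_callback(failing_save)(ctx))
asyncio.run(make_safe_memory_callback(working_save)(ctx))
print(saved)
```

In the agent defined earlier, this could be wired in as `after_agent_callback=make_safe_memory_callback(memory_bank_service.add_session_to_memory)`.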



In [None]:
response = client.agent_engines.memories.retrieve(
    name=agent_engine.api_resource.name,
    scope={"app_name": APP_NAME, "user_id": USER_ID},
    similarity_search_params={
        "search_query": "What was I thinking about getting my niece for Christmas?"
    },
)
list(response)

[RetrieveMemoriesResponseRetrievedMemory(
   distance=0.7943347759252403,
   memory=Memory(
     create_time=datetime.datetime(2025, 10, 21, 13, 54, 55, 859611, tzinfo=TzInfo(UTC)),
     fact='I have a three-year-old niece.',
     name='projects/437300234315/locations/us-central1/reasoningEngines/283924688617340928/memories/6773180193344323584',
     scope={
       'app_name': 'memory_example_app',
       'user_id': 'My User'
     },
     update_time=datetime.datetime(2025, 10, 21, 14, 0, 56, 83625, tzinfo=TzInfo(UTC))
   )
 ),
 RetrieveMemoriesResponseRetrievedMemory(
   distance=0.7986168617004635,
   memory=Memory(
     create_time=datetime.datetime(2025, 10, 21, 14, 17, 48, 588965, tzinfo=TzInfo(UTC)),
     fact='I plan to get my three-year-old niece a doll for Christmas.',
     name='projects/437300234315/locations/us-central1/reasoningEngines/283924688617340928/memories/8674121448559738880',
     scope={
       'app_name': 'memory_example_app',
       'user_id': 'My User'
     },
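
Each retrieved memory carries a `distance`. Assuming a smaller distance means a closer match (the ordering of the results above suggests this), you could keep only the closest facts before using them. The helper name, the threshold, and the stand-in objects below are illustrative, not part of the SDK.

```python
from types import SimpleNamespace


def closest_facts(retrieved, max_distance=0.8, limit=3):
    """Return up to `limit` facts whose distance is at most `max_distance`."""
    close = [item for item in retrieved if item.distance <= max_distance]
    close.sort(key=lambda item: item.distance)
    return [item.memory.fact for item in close[:limit]]


# Stand-ins shaped like the retrieved memories above; the last entry is made up.
sample = [
    SimpleNamespace(
        distance=0.799,
        memory=SimpleNamespace(
            fact="I plan to get my three-year-old niece a doll for Christmas."
        ),
    ),
    SimpleNamespace(
        distance=0.794,
        memory=SimpleNamespace(fact="I have a three-year-old niece."),
    ),
    SimpleNamespace(distance=0.95, memory=SimpleNamespace(fact="An unrelated fact.")),
]
print(closest_facts(sample))
```

With the real API you would pass `list(response)` from the retrieve call instead of `sample`.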

The prior example used the entire Session to generate memories. However, this can be costly as the size of the session increases.

You can also use a callback to save only the current turn to Memory Bank. The downside is that Memory Bank no longer sees the surrounding context of the conversation.

In [None]:
async def auto_save_user_turn_to_memory_callback(callback_context):
    last_turn = callback_context._invocation_context.user_content
    client.agent_engines.memories.generate(
        name=agent_engine.api_resource.name,
        scope={"app_name": APP_NAME, "user_id": USER_ID},
        direct_contents_source={"events": [{"content": last_turn}]},
        config={"wait_for_completion": False},
    )
    print("\n****Triggered memory generation****\n")


agent = LlmAgent(
    model=MODEL,
    name="Generic_QA_Agent",
    instruction="Answer the user's questions",
    tools=[adk.tools.preload_memory_tool.PreloadMemoryTool()],
    before_agent_callback=auto_save_user_turn_to_memory_callback,
)

runner = Runner(
    agent=agent,
    app_name=APP_NAME,
    session_service=session_service,
    memory_service=memory_bank_service,
)

In [None]:
call_agent(
    "I think I'll get her a doll. What types are good for her age?",
    session.id,
    user_id=USER_ID,
)


****Triggered memory generation****



INFO:google_adk.google.adk.memory.vertex_ai_memory_bank_service:Search memory response received.
DEBUG:google_adk.google.adk.memory.vertex_ai_memory_bank_service:Retrieved memory: distance=0.6587006361125946 memory=Memory(
  create_time=datetime.datetime(2025, 10, 21, 14, 17, 48, 588965, tzinfo=TzInfo(UTC)),
  fact='I plan to get my three-year-old niece a doll for Christmas.',
  name='projects/437300234315/locations/us-central1/reasoningEngines/283924688617340928/memories/8674121448559738880',
  scope={
    'app_name': 'memory_example_app',
    'user_id': 'My User'
  },
  update_time=datetime.datetime(2025, 10, 21, 14, 17, 48, 588965, tzinfo=TzInfo(UTC))
)
DEBUG:google_adk.google.adk.memory.vertex_ai_memory_bank_service:Retrieved memory: distance=0.9376007979564559 memory=Memory(
  create_time=datetime.datetime(2025, 10, 21, 13, 54, 55, 859611, tzinfo=TzInfo(UTC)),
  fact='I have a three-year-old niece.',
  name='projects/437300234315/locations/us-central1/reasoningEngines/283924688617

Agent Response:  That's a wonderful choice! A doll is a classic gift that encourages nurturing, imaginative play, and social-emotional development in three-year-olds.

When choosing a doll for her age, here are the key types that are generally best and why:

1.  **Soft-Bodied Baby Dolls:**
    *   **Why they're great:** These are often the most popular for this age. They're soft, cuddly, and perfect for hugging. Many come with simple, easy-to-manage clothing that helps children practice dressing skills. They naturally inspire nurturing play like feeding, rocking, and comforting.
    *   **What to look for:** Washable materials are a plus! Look for dolls with friendly embroidered or painted faces rather than small plastic parts.

2.  **Toddler Dolls (e.g., "My First" dolls):**
    *   **Why they're great:** These dolls might be a bit larger or have slightly more articulation (movable limbs) than pure baby dolls but are still designed for younger children. They can be great for slightly 
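
The two callbacks in this section sit at opposite ends: the whole session versus a single user turn. A possible middle ground is to send only the last few turns. The sketch below builds a `direct_contents_source` payload in the same `{"events": [{"content": ...}]}` shape used in this notebook; `build_recent_events_source` and `max_events` are hypothetical names, and the model reply in the sample conversation is a placeholder.

```python
def build_recent_events_source(contents, max_events=4):
    """Build a `direct_contents_source` payload from the last `max_events` contents."""
    if max_events < 1:
        raise ValueError("max_events must be at least 1")
    recent = contents[-max_events:]
    return {"events": [{"content": content} for content in recent]}


conversation = [
    {
        "role": "user",
        "parts": [
            {"text": "What christmas present recommendations do you have for my niece?"}
        ],
    },
    # Placeholder reply, not the real agent output.
    {"role": "model", "parts": [{"text": "Here are some ideas..."}]},
    {
        "role": "user",
        "parts": [
            {"text": "I think I'll get her a doll. What types are good for her age?"}
        ],
    },
]
payload = build_recent_events_source(conversation, max_events=2)
print(len(payload["events"]))
```

The resulting dictionary could then be passed as `direct_contents_source=payload` to `client.agent_engines.memories.generate`.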

## Using Memory in your agent

Now, let's link the agent to Memory so that it can fetch memories and use them for inference. This lets the agent recall information from prior sessions in a new, empty session.

You can use ADK's built-in tools to fetch memories and incorporate them in the prompt. When using the built-in tools, it's important to provide both a memory tool when creating your Agent and a memory service when defining your Runner.

### `adk.PreloadMemoryTool`
The `PreloadMemoryTool` always retrieves memories at the start of each turn and includes the memories in the system instruction.

Under the hood, it invokes `memory_service.search_memory` for the given User and App Name. Retrieval happens before the LLM is called, so no tool call is logged for memory.

We're going to use a [callback](https://google.github.io/adk-docs/callbacks/) to print out the System Instructions. Since we use `before_model_callback`, the callback executes right before the model is called.

<img src="https://google.github.io/adk-docs/assets/callback_flow.png">

In [None]:
from google import adk


def log_system_instructions(callback_context, llm_request):
    """A callback to print the system instruction of the LLM request."""
    print(
        f"\n*System Instruction*:\n{llm_request.config.system_instruction}\n*********\n"
    )


agent = LlmAgent(
    model=MODEL,
    name="Generic_QA_Agent",
    instruction="Answer the user's questions",
    tools=[adk.tools.preload_memory_tool.PreloadMemoryTool()],
    before_model_callback=log_system_instructions,
)

runner = Runner(
    agent=agent,
    app_name=APP_NAME,
    session_service=session_service,
    memory_service=memory_bank_service,
)

This conversation uses a new session, so it doesn't have access to the user's conversation history without using memory.

Since we're using `PreloadMemoryTool`, the retrieved memories will be appended to the System Instructions.

In [None]:
session = await session_service.create_session(app_name=APP_NAME, user_id=USER_ID)

call_agent(
    "Can you remind me what I got my niece for her christmas?",
    session.id,
    user_id=USER_ID,
)

INFO:google_adk.google.adk.memory.vertex_ai_memory_bank_service:Search memory response received.
DEBUG:google_adk.google.adk.memory.vertex_ai_memory_bank_service:Retrieved memory: distance=0.8436347388201234 memory=Memory(
  create_time=datetime.datetime(2025, 10, 21, 13, 54, 55, 860546, tzinfo=TzInfo(UTC)),
  fact='I got my three-year-old niece a red bike for her third birthday.',
  name='projects/437300234315/locations/us-central1/reasoningEngines/283924688617340928/memories/4467337184130629632',
  scope={
    'app_name': 'memory_example_app',
    'user_id': 'My User'
  },
  update_time=datetime.datetime(2025, 10, 21, 14, 15, 56, 19585, tzinfo=TzInfo(UTC))
)
DEBUG:google_adk.google.adk.memory.vertex_ai_memory_bank_service:Retrieved memory: distance=0.8454955729521494 memory=Memory(
  create_time=datetime.datetime(2025, 10, 21, 14, 17, 48, 588965, tzinfo=TzInfo(UTC)),
  fact='I plan to get my three-year-old niece a doll for Christmas.',
  name='projects/437300234315/locations/us-centr


*System Instruction*:
Answer the user's questions

You are an agent. Your internal name is "Generic_QA_Agent".

The following content is from your previous conversations with the user.
They may be useful for answering the user's current query.
<PAST_CONVERSATIONS>
Time: 2025-10-21T14:15:56.019585+00:00
user: I got my three-year-old niece a red bike for her third birthday.
Time: 2025-10-21T14:17:48.588965+00:00
user: I plan to get my three-year-old niece a doll for Christmas.
Time: 2025-10-21T14:00:56.083625+00:00
user: I have a three-year-old niece.
</PAST_CONVERSATIONS>

*********



INFO:google_adk.google.adk.models.google_llm:Sending out request, model: gemini-2.5-flash, backend: GoogleLLMVariant.VERTEX_AI, stream: False
DEBUG:google_adk.google.adk.models.google_llm:
LLM Request:
-----------------------------------------------------------
System Instruction:
Answer the user's questions

You are an agent. Your internal name is "Generic_QA_Agent".

The following content is from your previous conversations with the user.
They may be useful for answering the user's current query.
<PAST_CONVERSATIONS>
Time: 2025-10-21T14:15:56.019585+00:00
user: I got my three-year-old niece a red bike for her third birthday.
Time: 2025-10-21T14:17:48.588965+00:00
user: I plan to get my three-year-old niece a doll for Christmas.
Time: 2025-10-21T14:00:56.083625+00:00
user: I have a three-year-old niece.
</PAST_CONVERSATIONS>

-----------------------------------------------------------
Contents:
{"parts":[{"text":"Can you remind me what I got my niece for her christmas?"}],"role":"user

Agent Response:  You planned to get your niece a doll for Christmas.
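
To make the logged system instruction easier to reason about, here is a rough re-creation of how retrieved memories could be rendered into a `<PAST_CONVERSATIONS>` block. It only mirrors the layout seen in the log above and is not ADK's actual implementation; both function names are hypothetical.

```python
def format_past_conversations(memories):
    """Render (timestamp, author, text) tuples in the layout seen in the log."""
    lines = [
        "The following content is from your previous conversations with the user.",
        "They may be useful for answering the user's current query.",
        "<PAST_CONVERSATIONS>",
    ]
    for timestamp, author, text in memories:
        lines.append(f"Time: {timestamp}")
        lines.append(f"{author}: {text}")
    lines.append("</PAST_CONVERSATIONS>")
    return "\n".join(lines)


def append_memories(instruction, memories):
    """Return the instruction unchanged when there is nothing to add."""
    if not memories:
        return instruction
    return f"{instruction}\n\n{format_past_conversations(memories)}"


memories = [
    (
        "2025-10-21T14:15:56.019585+00:00",
        "user",
        "I got my three-year-old niece a red bike for her third birthday.",
    ),
]
system_instruction = append_memories("Answer the user's questions", memories)
print(system_instruction)
```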


### `adk.LoadMemoryTool`

Now, let's use `LoadMemoryTool`. Unlike `PreloadMemoryTool`, this tool acts like a standard tool. Your agent needs to decide whether the tool should be invoked.

In [None]:
from google import adk


def log_tool_call(tool, args, tool_response, **kwargs):
    """A callback to print the tool, its arguments, and its response."""
    print(f"\n*Tool*:\n{tool}")
    print(f"\n*Tool call*:\n{args}")
    print(f"\n*Tool response*:\n{tool_response}\n*********\n")


agent = LlmAgent(
    model=MODEL,
    name="Generic_QA_Agent",
    instruction="Answer the user's questions",
    tools=[adk.tools.load_memory_tool.LoadMemoryTool()],
    before_model_callback=log_system_instructions,
    after_tool_callback=log_tool_call,
)

runner = Runner(
    agent=agent,
    app_name=APP_NAME,
    session_service=session_service,
    memory_service=memory_bank_service,
)

In [None]:
session = await session_service.create_session(app_name=APP_NAME, user_id=USER_ID)

call_agent(
    "Can you remind me what I got my niece for her christmas?",
    session.id,
    user_id=USER_ID,
)


*System Instruction*:
Answer the user's questions

You are an agent. Your internal name is "Generic_QA_Agent".


You have memory. You can use it to answer questions. If any questions need
you to look up the memory, you should call load_memory function with a query.

*********



INFO:google_adk.google.adk.models.google_llm:Sending out request, model: gemini-2.5-flash, backend: GoogleLLMVariant.VERTEX_AI, stream: False
DEBUG:google_adk.google.adk.models.google_llm:
LLM Request:
-----------------------------------------------------------
System Instruction:
Answer the user's questions

You are an agent. Your internal name is "Generic_QA_Agent".


You have memory. You can use it to answer questions. If any questions need
you to look up the memory, you should call load_memory function with a query.

-----------------------------------------------------------
Contents:
{"parts":[{"text":"Can you remind me what I got my niece for her christmas?"}],"role":"user"}
-----------------------------------------------------------
Functions:
load_memory: {'query': {'type': <Type.STRING: 'STRING'>}} 
-----------------------------------------------------------

INFO:google_adk.google.adk.models.google_llm:Response received from the model.
DEBUG:google_adk.google.adk.models.goog


*Tool*:
<google.adk.tools.load_memory_tool.LoadMemoryTool object at 0x7fda78a18f50>

*Tool call*:
{'query': 'what I got my niece for christmas'}

*Tool response*:
memories=[MemoryEntry(content=Content(
  parts=[
    Part(
      text='I have a three-year-old niece.'
    ),
  ],
  role='user'
), author='user', timestamp='2025-10-21T14:00:56.083625+00:00'), MemoryEntry(content=Content(
  parts=[
    Part(
      text='I plan to get my three-year-old niece a doll for Christmas.'
    ),
  ],
  role='user'
), author='user', timestamp='2025-10-21T14:17:48.588965+00:00'), MemoryEntry(content=Content(
  parts=[
    Part(
      text='I got my three-year-old niece a red bike for her third birthday.'
    ),
  ],
  role='user'
), author='user', timestamp='2025-10-21T14:15:56.019585+00:00')]
*********


*System Instruction*:
Answer the user's questions

You are an agent. Your internal name is "Generic_QA_Agent".


You have memory. You can use it to answer questions. If any questions need
you to look

INFO:google_adk.google.adk.models.google_llm:Response received from the model.
DEBUG:google_adk.google.adk.models.google_llm:
LLM Response:
-----------------------------------------------------------
Text:
I'm sorry, I couldn't find a record of what you got your niece for Christmas. I can see that you *planned* to get her a doll, but not what you actually got her. I also see you got her a red bike for her third birthday.
-----------------------------------------------------------
Function calls:

-----------------------------------------------------------
Raw response:
{"sdk_http_response":{"headers":{"Content-Type":"application/json; charset=UTF-8","Vary":"Origin, X-Origin, Referer","Content-Encoding":"gzip","Date":"Tue, 21 Oct 2025 14:29:57 GMT","Server":"scaffolding on HTTPServer2","X-XSS-Protection":"0","X-Frame-Options":"SAMEORIGIN","X-Content-Type-Options":"nosniff","Transfer-Encoding":"chunked"}},"candidates":[{"content":{"parts":[{"thought_signature":"CuIEAePx_16Jx-Ji1dPyddpYcg

Agent Response:  I'm sorry, I couldn't find a record of what you got your niece for Christmas. I can see that you *planned* to get her a doll, but not what you actually got her. I also see you got her a red bike for her third birthday.
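
The tool response above arrives in no particular time order, which makes it harder to tell what was planned from what actually happened. As an illustration, the entries could be flattened into chronological lines before logging them. The classes here are `SimpleNamespace` stand-ins that copy only the `content.parts[*].text`, `author`, and `timestamp` fields shown in the output, and `memory_entries_to_lines` is a hypothetical helper. Sorting the timestamp strings works in this sketch because they are ISO 8601 values with the same UTC offset.

```python
from types import SimpleNamespace


def memory_entries_to_lines(memories):
    """Return one '[timestamp] author: text' line per entry, oldest first."""
    lines = []
    for entry in sorted(memories, key=lambda e: e.timestamp):
        text = " ".join(
            part.text for part in entry.content.parts if getattr(part, "text", None)
        )
        lines.append(f"[{entry.timestamp}] {entry.author}: {text}")
    return lines


def make_entry(text, timestamp):
    """Build a stand-in shaped like the entries in the tool response."""
    return SimpleNamespace(
        content=SimpleNamespace(parts=[SimpleNamespace(text=text)], role="user"),
        author="user",
        timestamp=timestamp,
    )


entries = [
    make_entry(
        "I plan to get my three-year-old niece a doll for Christmas.",
        "2025-10-21T14:17:48.588965+00:00",
    ),
    make_entry("I have a three-year-old niece.", "2025-10-21T14:00:56.083625+00:00"),
]
for line in memory_entries_to_lines(entries):
    print(line)
```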


If the agent decides that Memory is not helpful for a given query, the Memory tool will not be invoked.

In [None]:
session = await session_service.create_session(app_name=APP_NAME, user_id=USER_ID)

call_agent("Hi!", session.id, user_id=USER_ID)


*System Instruction*:
Answer the user's questions

You are an agent. Your internal name is "Generic_QA_Agent".


You have memory. You can use it to answer questions. If any questions need
you to look up the memory, you should call load_memory function with a query.

*********



INFO:google_adk.google.adk.models.google_llm:Sending out request, model: gemini-2.5-flash, backend: GoogleLLMVariant.VERTEX_AI, stream: False
DEBUG:google_adk.google.adk.models.google_llm:
LLM Request:
-----------------------------------------------------------
System Instruction:
Answer the user's questions

You are an agent. Your internal name is "Generic_QA_Agent".


You have memory. You can use it to answer questions. If any questions need
you to look up the memory, you should call load_memory function with a query.

-----------------------------------------------------------
Contents:
{"parts":[{"text":"Hi!"}],"role":"user"}
-----------------------------------------------------------
Functions:
load_memory: {'query': {'type': <Type.STRING: 'STRING'>}} 
-----------------------------------------------------------

INFO:google_adk.google.adk.models.google_llm:Response received from the model.
DEBUG:google_adk.google.adk.models.google_llm:
LLM Response:
-------------------------------

Agent Response:  Hi there! How can I help you today?


### Custom callback to retrieve memories

If you want more control over how memories are retrieved, you can create your own custom callback.

For example, you can use the Agent Engine SDK within the callback to retrieve memories without similarity search, which has lower latency than retrieving with similarity search.

You can also customize your `scope` rather than using the default "app_name" and "user_id" scope keys used by ADK's memory service.

In [None]:
def retrieve_memories_callback(callback_context, llm_request):
    user_id = callback_context._invocation_context.user_id

    response = client.agent_engines.memories.retrieve(
        name=agent_engine.api_resource.name,
        # Unlike the ADK Memory Service, this does not use App Name in the scope.
        scope={"user_id": user_id},
    )
    memories = [f"* {memory.memory.fact}" for memory in list(response)]
    if not memories:
        # No memories to add to System Instructions.
        return
    # Append formatted memories to the System Instructions
    llm_request.config.system_instruction += (
        "\nHere is information that you have about the user:\n"
    )
    llm_request.config.system_instruction += "\n".join(memories)


def log_tool_call(tool, args, tool_response, **kwargs):
    """A callback to print the tool, its arguments, and its response."""
    print(f"\n*Tool*:\n{tool}")
    print(f"\n*Tool call*:\n{args}")
    # Print the tool's response, not the callback function itself.
    print(f"\n*Tool response*:\n{tool_response}\n*********\n")


agent = LlmAgent(
    model=MODEL,
    name="Generic_QA_Agent",
    instruction="Answer the user's questions",
    before_model_callback=retrieve_memories_callback,
)

runner = Runner(
    agent=agent,
    app_name=APP_NAME,
    session_service=session_service,
    memory_service=memory_bank_service,
)

Since we're using a different scope than the ADK Memory Service, we don't have access to the prior memories. Memories are isolated by their scope dictionary.

In [None]:
session = await session_service.create_session(app_name=APP_NAME, user_id=USER_ID)

call_agent("Hi!", session.id, user_id=USER_ID)

INFO:google_adk.google.adk.models.google_llm:Sending out request, model: gemini-2.5-flash, backend: GoogleLLMVariant.VERTEX_AI, stream: False
DEBUG:google_adk.google.adk.models.google_llm:
LLM Request:
-----------------------------------------------------------
System Instruction:
Answer the user's questions

You are an agent. Your internal name is "Generic_QA_Agent".
-----------------------------------------------------------
Contents:
{"parts":[{"text":"Hi!"}],"role":"user"}
-----------------------------------------------------------
Functions:

-----------------------------------------------------------

INFO:google_adk.google.adk.models.google_llm:Response received from the model.
DEBUG:google_adk.google.adk.models.google_llm:
LLM Response:
-----------------------------------------------------------
Text:
Hello there! How can I help you today?
-----------------------------------------------------------
Function calls:

-----------------------------------------------------------
Raw

Agent Response:  Hello there! How can I help you today?
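
The empty result above follows from how scopes work: a memory is only visible under the scope it was written with. The toy store below illustrates that idea with exact-match scope keys. It is a simplified model for intuition, not how Memory Bank is implemented, and `ScopedMemoryStore` is a hypothetical class.

```python
class ScopedMemoryStore:
    """Toy store: memories are visible only under the exact scope they were saved with."""

    def __init__(self):
        self._facts = {}

    @staticmethod
    def _key(scope):
        # Dicts are unhashable, so use a frozenset of items; key order does not matter.
        return frozenset(scope.items())

    def add(self, scope, fact):
        self._facts.setdefault(self._key(scope), []).append(fact)

    def retrieve(self, scope):
        return list(self._facts.get(self._key(scope), []))


store = ScopedMemoryStore()
store.add(
    {"app_name": "memory_example_app", "user_id": "My User"},
    "I have a three-year-old niece.",
)

# A scope with only `user_id` does not match the two-key scope used when saving.
print(store.retrieve({"user_id": "My User"}))  # -> []
# The same keys in a different order still match.
print(store.retrieve({"user_id": "My User", "app_name": "memory_example_app"}))
```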


Let's first generate memories for our new scope.

In [None]:
client.agent_engines.memories.generate(
    scope={"user_id": USER_ID},
    name=agent_engine.api_resource.name,
    direct_contents_source={
        "events": [
            {"content": {"role": "user", "parts": [{"text": "I have four nieces!"}]}}
        ]
    },
)

AgentEngineGenerateMemoriesOperation(
  done=True,
  metadata={
    '@type': 'type.googleapis.com/google.cloud.aiplatform.v1beta1.GenerateMemoriesOperationMetadata',
    'genericMetadata': {
      'createTime': '2025-10-21T14:36:07.761304Z',
      'updateTime': '2025-10-21T14:36:10.561373Z'
    }
  },
  name='projects/437300234315/locations/us-central1/reasoningEngines/283924688617340928/operations/2797386090502160384',
  response=GenerateMemoriesResponse(
    generated_memories=[
      GenerateMemoriesResponseGeneratedMemory(
        action=<GenerateMemoriesResponseGeneratedMemoryAction.CREATED: 'CREATED'>,
        memory=Memory(
          name='projects/437300234315/locations/us-central1/reasoningEngines/283924688617340928/memories/2531211556826382336'
        )
      ),
    ]
  )
)

In [None]:
call_agent("What information do you know about me?", session.id, user_id=USER_ID)

INFO:google_adk.google.adk.models.google_llm:Sending out request, model: gemini-2.5-flash, backend: GoogleLLMVariant.VERTEX_AI, stream: False
DEBUG:google_adk.google.adk.models.google_llm:
LLM Request:
-----------------------------------------------------------
System Instruction:
Answer the user's questions

You are an agent. Your internal name is "Generic_QA_Agent".
Here is information that you have about the user:
* I have four nieces.
-----------------------------------------------------------
Contents:
{"parts":[{"text":"Hi!"}],"role":"user"}
{"parts":[{"text":"Hello there! How can I help you today?"}],"role":"model"}
{"parts":[{"text":"What information do you know about me?"}],"role":"user"}
-----------------------------------------------------------
Functions:

-----------------------------------------------------------

INFO:google_adk.google.adk.models.google_llm:Response received from the model.
DEBUG:google_adk.google.adk.models.google_llm:
LLM Response:
--------------------

Agent Response:  I know that you have four nieces.


## Customize your Memory Bank's behavior

So far, we've used Memory Bank with its default settings. Now, let's customize some configurations. The configuration is set when creating or updating your Agent Engine.

### Customizing Memory Extraction

By default, Memory Bank considers memories that fit the following topics to be meaningful:

* **Personal information (`USER_PERSONAL_INFO`)**: Significant personal information about the user, like names, relationships, hobbies, and important dates. For example, "I work at Google" or "My wedding anniversary is on December 31".
* **User preferences (`USER_PREFERENCES`)**: Stated or implied likes, dislikes, preferred styles, or patterns. For example, "I prefer the middle seat."
* **Key conversation events and task outcomes (`KEY_CONVERSATION_DETAILS`)**: Important milestones or conclusions within the dialogue. For example, "I booked plane tickets for a round trip between JFK and SFO. I leave on June 1, 2025 and return on June 7, 2025."
* **Explicit remember / forget instructions (`EXPLICIT_INSTRUCTIONS`)**: Information that the user explicitly asks the agent to remember or forget. For example, if the user says "Remember that I primarily use Python," Memory Bank generates a memory such as "I primarily use Python."

These are managed topics, meaning Memory Bank maintains their definitions. To use a subset of these topics, or to define custom topics, provide a `CustomizationConfig` that specifies what information Memory Bank should consider meaningful to persist.

#### Managed Topics

We're going to update our Memory Bank to only extract user preferences.

This will take effect for all requests to Memory Bank, regardless of whether you use ADK or the Agent Engine SDK.

In [None]:
user_preferences_config = {
    "scope_keys": ["user_id"],
    "memory_topics": [
        {"managed_memory_topic": {"managed_topic_enum": "USER_PREFERENCES"}}
    ],
}


client.agent_engines.update(
    name=agent_engine.api_resource.name,
    config={
        "context_spec": {
            "memory_bank_config": {
                "customization_configs": [user_preferences_config],
            }
        }
    },
)

INFO:vertexai_genai.agentengines:View progress and logs at https://console.cloud.google.com/logs/query?project=unstacked-x-bwai.
INFO:vertexai_genai.agentengines:Agent Engine updated. To use it in another session:
INFO:vertexai_genai.agentengines:agent_engine=client.agent_engines.get('projects/437300234315/locations/us-central1/reasoningEngines/283924688617340928')


AgentEngine(api_resource.name='projects/437300234315/locations/us-central1/reasoningEngines/283924688617340928')

Memories that fit the `USER_PREFERENCES` category will be persisted.

In [None]:
client.agent_engines.memories.generate(
    scope={"user_id": USER_ID},
    name=agent_engine.api_resource.name,
    direct_contents_source={
        "events": [
            {
                "content": {
                    "role": "user",
                    "parts": [{"text": "I prefer the aisle seat"}],
                }
            }
        ]
    },
)

AgentEngineGenerateMemoriesOperation(
  done=True,
  metadata={
    '@type': 'type.googleapis.com/google.cloud.aiplatform.v1beta1.GenerateMemoriesOperationMetadata',
    'genericMetadata': {
      'createTime': '2025-10-21T14:38:23.614611Z',
      'updateTime': '2025-10-21T14:38:26.594311Z'
    }
  },
  name='projects/437300234315/locations/us-central1/reasoningEngines/283924688617340928/operations/3001173973640675328',
  response=GenerateMemoriesResponse(
    generated_memories=[
      GenerateMemoriesResponseGeneratedMemory(
        action=<GenerateMemoriesResponseGeneratedMemoryAction.CREATED: 'CREATED'>,
        memory=Memory(
          name='projects/437300234315/locations/us-central1/reasoningEngines/283924688617340928/memories/8818236636635594752'
        )
      ),
    ]
  )
)

Memories that don't fit `USER_PREFERENCES` won't be persisted. In the output below, the operation completes without a `response` listing any generated memories.

In [None]:
client.agent_engines.memories.generate(
    scope={"user_id": USER_ID},
    name=agent_engine.api_resource.name,
    direct_contents_source={
        "events": [
            {
                "content": {
                    "role": "user",
                    "parts": [{"text": "I'm going to get my niece a red bike"}],
                }
            }
        ]
    },
)

AgentEngineGenerateMemoriesOperation(
  done=True,
  metadata={
    '@type': 'type.googleapis.com/google.cloud.aiplatform.v1beta1.GenerateMemoriesOperationMetadata',
    'genericMetadata': {
      'createTime': '2025-10-21T14:38:33.401302Z',
      'updateTime': '2025-10-21T14:38:34.552294Z'
    }
  },
  name='projects/437300234315/locations/us-central1/reasoningEngines/283924688617340928/operations/3640685120727285760'
)
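
To tell the two outcomes apart in code rather than by reading the printed output, one option is to check whether the operation carries a `response`. The sketch below is an assumption-laden illustration: `generated_memory_names` is a hypothetical helper, and it relies only on the attribute names visible in the printed results (`response`, `generated_memories`, `memory.name`). The `SimpleNamespace` objects are stand-ins, not real SDK objects.

```python
from types import SimpleNamespace


def generated_memory_names(operation) -> list:
    """Return the names of memories listed in a generate operation result.

    Hypothetical helper. It returns an empty list when the operation has
    no `response`, which is what the output looks like when nothing was
    persisted.
    """
    response = getattr(operation, "response", None)
    if response is None:
        return []
    generated = getattr(response, "generated_memories", None) or []
    return [item.memory.name for item in generated]


# Stand-ins that mimic the two printed results above.
with_memory = SimpleNamespace(
    done=True,
    response=SimpleNamespace(
        generated_memories=[
            SimpleNamespace(action="CREATED", memory=SimpleNamespace(name="memories/1"))
        ]
    ),
)
without_memory = SimpleNamespace(done=True, response=None)

print(generated_memory_names(with_memory))     # ['memories/1']
print(generated_memory_names(without_memory))  # []
```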

#### Custom Topics
Alternatively, you can define custom topics, each with its own label and description, and provide few-shot examples that show which memories should (or should not) be generated from a conversation.

In [None]:
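# Content and Part are used to build the example conversations below.
from google.genai.types import Content, Part
from vertexai.types import MemoryBankCustomizationConfig as CustomizationConfig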
from vertexai.types import (
    MemoryBankCustomizationConfigGenerateMemoriesExample as GenerateMemoriesExample,
)
from vertexai.types import (
    MemoryBankCustomizationConfigGenerateMemoriesExampleConversationSource as ConversationSource,
)
from vertexai.types import (
    MemoryBankCustomizationConfigGenerateMemoriesExampleConversationSourceEvent as ConversationSourceEvent,
)
from vertexai.types import (
    MemoryBankCustomizationConfigGenerateMemoriesExampleGeneratedMemory as ExampleGeneratedMemory,
)
from vertexai.types import MemoryBankCustomizationConfigMemoryTopic as MemoryTopic
from vertexai.types import (
    MemoryBankCustomizationConfigMemoryTopicCustomMemoryTopic as CustomMemoryTopic,
)

# Custom topic describing the kind of information that should be persisted.
memory_topic = MemoryTopic(
    custom_memory_topic=CustomMemoryTopic(
        label="business_feedback",
        description="""Specific user feedback about their experience at
the coffee shop. This includes opinions on drinks, food, pastries, ambiance,
staff friendliness, service speed, cleanliness, and any suggestions for
improvement.""",
    )
)


# Few-shot example: feedback that should yield two memories.
example = GenerateMemoriesExample(
    conversation_source=ConversationSource(
        events=[
            ConversationSourceEvent(
                content=Content(
                    role="model",
                    parts=[
                        Part(
                            text="Welcome back to The Daily Grind! We'd love to hear your feedback on your visit."
                        )
                    ],
                )
            ),
            ConversationSourceEvent(
                content=Content(
                    role="user",
                    parts=[
                        Part(
                            text="Hey. The drip coffee was a bit lukewarm today, which was a bummer. Also, the music was way too loud, I could barely hear my friend."
                        )
                    ],
                )
            ),
        ]
    ),
    generated_memories=[
        ExampleGeneratedMemory(
            fact="The user reported that the drip coffee was lukewarm."
        ),
        ExampleGeneratedMemory(
            fact="The user felt the music in the shop was too loud."
        ),
    ],
)

# No-op example: small talk that should yield no memories.
noop_example = GenerateMemoriesExample(
    conversation_source=ConversationSource(
        events=[
            ConversationSourceEvent(
                content=Content(
                    role="model",
                    parts=[
                        Part(
                            text="Welcome back to The Daily Grind! We'd love to hear your feedback on your visit."
                        )
                    ],
                )
            ),
            ConversationSourceEvent(
                content=Content(
                    role="user", parts=[Part(text="Thanks for the coffee!")]
                )
            ),
        ]
    ),
    generated_memories=[],
)

# Apply the custom topic and the examples to the existing Agent Engine.
client.agent_engines.update(
    name=agent_engine.api_resource.name,
    config={
        "context_spec": {
            "memory_bank_config": {
                "customization_configs": [
                    CustomizationConfig(
                        memory_topics=[memory_topic],
                        generate_memories_examples=[example, noop_example],
                    )
                ],
            }
        }
    },
)

INFO:vertexai_genai.agentengines:View progress and logs at https://console.cloud.google.com/logs/query?project=unstacked-x-bwai.
INFO:vertexai_genai.agentengines:Agent Engine updated. To use it in another session:
INFO:vertexai_genai.agentengines:agent_engine=client.agent_engines.get('projects/437300234315/locations/us-central1/reasoningEngines/283924688617340928')


AgentEngine(api_resource.name='projects/437300234315/locations/us-central1/reasoningEngines/283924688617340928')

In [None]:
client.agent_engines.memories.generate(
    scope={"user_id": "123"},
    name=agent_engine.api_resource.name,
    direct_contents_source={
        "events": [
            {
                "content": {
                    "role": "user",
                    "parts": [{"text": "I prefer the aisle seat"}],
                }
            }
        ]
    },
)

AgentEngineGenerateMemoriesOperation(
  done=True,
  metadata={
    '@type': 'type.googleapis.com/google.cloud.aiplatform.v1beta1.GenerateMemoriesOperationMetadata',
    'genericMetadata': {
      'createTime': '2025-10-21T14:40:13.466775Z',
      'updateTime': '2025-10-21T14:40:14.572092Z'
    }
  },
  name='projects/437300234315/locations/us-central1/reasoningEngines/283924688617340928/operations/2414298647198957568'
)

In [None]:
client.agent_engines.memories.generate(
    scope={"user_id": "123"},
    name=agent_engine.api_resource.name,
    direct_contents_source={
        "events": [
            {
                "content": {
                    "role": "user",
                    "parts": [{"text": "You should have almond milk"}],
                }
            }
        ]
    },
)

AgentEngineGenerateMemoriesOperation(
  done=True,
  metadata={
    '@type': 'type.googleapis.com/google.cloud.aiplatform.v1beta1.GenerateMemoriesOperationMetadata',
    'genericMetadata': {
      'createTime': '2025-10-21T14:40:21.197068Z',
      'updateTime': '2025-10-21T14:40:22.505189Z'
    }
  },
  name='projects/437300234315/locations/us-central1/reasoningEngines/283924688617340928/operations/6017178349095354368',
  response=GenerateMemoriesResponse(
    generated_memories=[
      GenerateMemoriesResponseGeneratedMemory(
        action=<GenerateMemoriesResponseGeneratedMemoryAction.CREATED: 'CREATED'>,
        memory=Memory(
          name='projects/437300234315/locations/us-central1/reasoningEngines/283924688617340928/memories/6170120055741743104'
        )
      ),
    ]
  )
)

In [None]:
list(
    client.agent_engines.memories.retrieve(
        name=agent_engine.api_resource.name, scope={"user_id": "123"}
    )
)

[RetrieveMemoriesResponseRetrievedMemory(
   memory=Memory(
     create_time=datetime.datetime(2025, 10, 21, 14, 40, 22, 431937, tzinfo=TzInfo(UTC)),
     fact='I suggested that the business should have almond milk.',
     name='projects/437300234315/locations/us-central1/reasoningEngines/283924688617340928/memories/6170120055741743104',
     scope={
       'user_id': '123'
     },
     update_time=datetime.datetime(2025, 10, 21, 14, 40, 22, 431937, tzinfo=TzInfo(UTC))
   )
 )]

### Dynamic TTL

You can also configure Memory Bank to automatically set a time to live (TTL) on any memory that is generated, manually created, or updated. The TTL determines the memory's `expire_time`.

#### Default TTL

The default TTL applies to every operation that creates or updates a memory. The example below sets it to 30 days.

In [None]:
client.agent_engines.update(
    name=agent_engine.api_resource.name,
    config={
        "context_spec": {
            "memory_bank_config": {
                "ttl_config": {
                    # 30 days
                    "default_ttl": f"{60 * 60 * 24 * 30}s"
                }
            }
        }
    },
)

INFO:vertexai_genai.agentengines:View progress and logs at https://console.cloud.google.com/logs/query?project=unstacked-x-bwai.
INFO:vertexai_genai.agentengines:Agent Engine updated. To use it in another session:
INFO:vertexai_genai.agentengines:agent_engine=client.agent_engines.get('projects/437300234315/locations/us-central1/reasoningEngines/283924688617340928')


AgentEngine(api_resource.name='projects/437300234315/locations/us-central1/reasoningEngines/283924688617340928')
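
The `default_ttl` value above is a duration string: a whole number of seconds followed by `s`. Writing the arithmetic inline works, but a tiny formatter built on `datetime.timedelta` can make the intent easier to read. This is a minimal sketch; `ttl_string` is a hypothetical helper, not an SDK function.

```python
from datetime import timedelta


def ttl_string(duration: timedelta) -> str:
    """Format a timedelta as a whole-second duration string such as '2592000s'."""
    seconds = int(duration.total_seconds())
    if seconds <= 0:
        raise ValueError("TTL must be positive")
    return f"{seconds}s"


print(ttl_string(timedelta(days=30)))   # 2592000s
print(ttl_string(timedelta(days=365)))  # 31536000s
```

With this helper, the configuration value could be written as `"default_ttl": ttl_string(timedelta(days=30))`.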

In [None]:
response = client.agent_engines.memories.generate(
    scope={"user_id": USER_ID},
    name=agent_engine.api_resource.name,
    direct_contents_source={
        "events": [
            {"content": {"role": "user", "parts": [{"text": "I have four nieces!"}]}}
        ]
    },
    config={"wait_for_completion": True},
)

client.agent_engines.memories.get(
    name=response.response.generated_memories[0].memory.name
)

Memory(
  create_time=datetime.datetime(2025, 10, 21, 14, 36, 9, 601480, tzinfo=TzInfo(UTC)),
  expire_time=datetime.datetime(2025, 11, 20, 14, 42, 56, 299370, tzinfo=TzInfo(UTC)),
  fact='I have four nieces.',
  name='projects/437300234315/locations/us-central1/reasoningEngines/283924688617340928/memories/2531211556826382336',
  scope={
    'user_id': 'My User'
  },
  update_time=datetime.datetime(2025, 10, 21, 14, 42, 56, 302588, tzinfo=TzInfo(UTC))
)
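
As a sanity check, the timestamps in the output above can be compared against the configured 30-day default. Note that `create_time` is earlier than `update_time`, which suggests this call updated an existing memory rather than creating a new one; the sketch below therefore measures the lifetime from `update_time`. The values are copied from the printed `Memory`, and the one-second tolerance is an arbitrary choice to absorb the few milliseconds between the two recorded timestamps.

```python
from datetime import datetime, timedelta, timezone

# Timestamps copied from the Memory output above.
update_time = datetime(2025, 10, 21, 14, 42, 56, 302588, tzinfo=timezone.utc)
expire_time = datetime(2025, 11, 20, 14, 42, 56, 299370, tzinfo=timezone.utc)

lifetime = expire_time - update_time
print(lifetime)

# The lifetime should match the 30-day default TTL, within a small tolerance.
assert abs(lifetime - timedelta(days=30)) < timedelta(seconds=1)
```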

#### Granular TTL

With granular TTL, the expiration time depends on which operation created or updated the memory. If no TTL is set for a given operation, that operation leaves the memory's expiration time unchanged.

Parameters include:
* `create_ttl`: Applies to memories created with `CreateMemory`.
* `generate_created_ttl`: Applies to memories created with `GenerateMemories`.
* `generate_updated_ttl`: Applies to memories updated with `GenerateMemories`.
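
The rule described above can be modeled in a few lines of plain Python. This is an illustrative sketch of the documented behavior, not the service's implementation: `next_expire_time` is a hypothetical function, and the dictionary keys simply reuse the three parameter names from the list.

```python
from datetime import datetime, timedelta, timezone
from typing import Dict, Optional


def next_expire_time(
    operation: str,
    granular_ttl: Dict[str, timedelta],
    now: datetime,
    current_expire_time: Optional[datetime] = None,
) -> Optional[datetime]:
    """Illustrative model of the granular TTL rule.

    `operation` is one of the three parameter names listed above. If no TTL
    is configured for it, the existing expiration time is returned unchanged.
    """
    ttl = granular_ttl.get(operation)
    if ttl is None:
        return current_expire_time
    return now + ttl


# Only memories created by GenerateMemories get a TTL in this configuration.
config = {"generate_created_ttl": timedelta(days=365)}
now = datetime(2025, 10, 21, 14, 44, 34, tzinfo=timezone.utc)

created = next_expire_time("generate_created_ttl", config, now)
updated = next_expire_time("generate_updated_ttl", config, now)
print(created)  # one year after `now`
print(updated)  # None: no TTL configured, so nothing changes
```

Under this model, a configuration that sets only `generate_created_ttl` gives newly generated memories an expiration time, while memories that are merely updated keep whatever expiration they already had.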

In [None]:
client.agent_engines.update(
    name=agent_engine.api_resource.name,
    config={
        "context_spec": {
            "memory_bank_config": {
                "ttl_config": {
                    "granular_ttl_config": {
                        # 365 days
                        "generate_created_ttl": f"{24 * 60 * 60 * 365}s",
                    }
                }
            }
        }
    },
)

INFO:vertexai_genai.agentengines:View progress and logs at https://console.cloud.google.com/logs/query?project=unstacked-x-bwai.
INFO:vertexai_genai.agentengines:Agent Engine updated. To use it in another session:
INFO:vertexai_genai.agentengines:agent_engine=client.agent_engines.get('projects/437300234315/locations/us-central1/reasoningEngines/283924688617340928')


AgentEngine(api_resource.name='projects/437300234315/locations/us-central1/reasoningEngines/283924688617340928')

In [None]:
response = client.agent_engines.memories.generate(
    scope={"user_id": USER_ID},
    name=agent_engine.api_resource.name,
    direct_contents_source={
        "events": [
            {
                "content": {
                    "role": "user",
                    "parts": [{"text": "Remember that I work at Zindua School!"}],
                }
            }
        ]
    },
)

client.agent_engines.memories.get(
    name=response.response.generated_memories[0].memory.name
)

Memory(
  create_time=datetime.datetime(2025, 10, 21, 14, 44, 34, 692040, tzinfo=TzInfo(UTC)),
  expire_time=datetime.datetime(2026, 10, 21, 14, 44, 34, 685971, tzinfo=TzInfo(UTC)),
  fact='I work at Zindua School.',
  name='projects/437300234315/locations/us-central1/reasoningEngines/283924688617340928/memories/8151703891784761344',
  scope={
    'user_id': 'My User'
  },
  update_time=datetime.datetime(2025, 10, 21, 14, 44, 34, 692040, tzinfo=TzInfo(UTC))
)

In [None]:
response = client.agent_engines.memories.generate(
    scope={"user_id": USER_ID},
    name=agent_engine.api_resource.name,
    direct_contents_source={
        "events": [
            {"content": {"role": "user", "parts": [{"text": "I work at Google!"}]}}
        ]
    },
)

client.agent_engines.memories.get(
    name=response.response.generated_memories[0].memory.name
)

Memory(
  create_time=datetime.datetime(2025, 10, 21, 14, 45, 10, 120194, tzinfo=TzInfo(UTC)),
  expire_time=datetime.datetime(2026, 10, 21, 14, 45, 10, 114376, tzinfo=TzInfo(UTC)),
  fact='I work at Google.',
  name='projects/437300234315/locations/us-central1/reasoningEngines/283924688617340928/memories/6224163251270189056',
  scope={
    'user_id': 'My User'
  },
  update_time=datetime.datetime(2025, 10, 21, 14, 45, 10, 120194, tzinfo=TzInfo(UTC))
)

## Cleaning up

It's good practice to clean up cloud resources you no longer need so that you don't incur unexpected costs. This final cell deletes the Agent Engine instance created in this tutorial; `force=True` also deletes its child resources, such as sessions and memories.

In [None]:
# Set to False to keep the Agent Engine after finishing the tutorial.
delete_agent_engines = True

agent_engine_name = agent_engine.api_resource.name

if delete_agent_engines:
    # Delete agent engine
    client.agent_engines.delete(name=agent_engine_name, force=True)