In [None]:
# Copyright 2025 Google LLC
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     https://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

# Building Intelligent Agents: Google ADK Memory - Long-Term Knowledge (Part 2)

In this notebook, we'll give our agents long-term memory - a persistent, searchable knowledge store that transcends individual conversations.

#### Overview: From Conversations to Knowledge

In Part 1, we transformed stateless LLMs into conversational agents using **Sessions** - giving them the ability to remember within a single conversation. But there's a limitation: when you start a new session, all those valuable insights about user preferences, learned patterns, and important context vanish like morning mist.

**The Challenge:** Imagine a personal assistant who forgets everything about you every time you start a new conversation. They wouldn't remember that you prefer concise answers, that you're learning Python, or that you mentioned being colorblind last week. This is the gap between conversation-level memory (Sessions) and true personalization.

**The Solution:** In this notebook, we'll give our agents **long-term memory** - a persistent, searchable knowledge store that transcends individual conversations. Think of it as upgrading from sticky notes (Sessions) to a well-organized filing system (Memory) that your agent can reference across all interactions.

**What you'll learn:**
- The fundamental difference between Sessions and Memory in agent architecture
- How to implement persistent memory using ADK's Memory Services
- Strategies for extracting and storing valuable information from conversations
- Best practices for searching and utilizing stored memories
- The relationship between working memory (Session State) and long-term memory

**Time:** 10-20 minutes

By the end of this notebook, your agents will be able to learn from past interactions and provide truly personalized experiences that improve over time.

<table align="left">
  <td style="text-align: center">
    <a href="https://colab.research.google.com/github/msampathkumar/google-adk-sam/blob/main/Notebook2.ipynb">
      <img width="32px" src="https://www.gstatic.com/pantheon/images/bigquery/welcome_page/colab-logo.svg" alt="Google Colaboratory logo"><br> Open in Colab
    </a>
  </td>
  <td style="text-align: center">
    <a href="https://github.com/msampathkumar/google-adk-sam/blob/main/Notebook2.ipynb">
      <img width="32px" src="https://storage.googleapis.com/github-repo/generative-ai/logos/GitHub_Invertocat_Dark.svg" alt="GitHub logo"><br> View on GitHub
    </a>
  </td>
</table>

| Author(s) |
| --------- |
| [Sampath M](https://github.com/msampathm) |

## 1. Setup and Configuration

This section covers the initial setup required to run the notebook, including installing libraries and configuring the environment.

#### 1.1. Install Dependencies

Install necessary Python packages: google-adk

In [None]:
!pip install --upgrade --quiet google-adk==1.16

#### 1.2. Environment Configuration

- Set up Gemini API Key if using Google AI Studio
- Set up the Google Cloud Project ID and Location if using Vertex AI, and authenticate when running in Google Colab
- Import required libraries
- Define the agent_name, app_name, model and user_id

##### **Vertex AI Users**
If you are using **Vertex AI**, set the values of **PROJECT_ID** and **LOCATION** below and authenticate.

In [None]:
import os

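# Default to an empty string so the placeholder check below can trigger;
# str(None) would produce the truthy string "None".
PROJECT_ID = os.environ.get("GOOGLE_CLOUD_PROJECT", "")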

if not PROJECT_ID:
    PROJECT_ID = "[your-project-name]"  # @param {type:"string"}


LOCATION = "global" # @param {type:"string"}
GOOGLE_GENAI_USE_VERTEXAI = "1" # Use Vertex AI API

os.environ["GOOGLE_CLOUD_PROJECT"] = PROJECT_ID
os.environ["GOOGLE_CLOUD_LOCATION"] = LOCATION
os.environ["GOOGLE_GENAI_USE_VERTEXAI"] = GOOGLE_GENAI_USE_VERTEXAI # Use Vertex AI API

if PROJECT_ID and LOCATION and GOOGLE_GENAI_USE_VERTEXAI:
    print('✅ Environment variables are set!\n')
else:
    print('❌')

In [None]:
# User Authentication - only required for Google Colab Notebooks
import sys

if "google.colab" in sys.modules:

    # user auth
    from google.colab import auth
    auth.authenticate_user()

    # Colab secret keys
    from google.colab import userdata
    os.environ["GOOGLE_CLOUD_PROJECT"] = userdata.get('GOOGLE_CLOUD_PROJECT')

## 2. Understanding ADK Memory Services

### 2.1. The Architecture of Memory

In the previous notebook, we learned that Sessions are containers for conversations. Now, let's understand how Memory fits into the bigger picture:

```
📱 Application
  └── 👤 Users
       ├── 💬 Sessions (Conversations)
       │    ├── 📝 Events (User messages & Agent responses)
       │    └── 🧠 State (Working memory for current conversation)
       │
       └── 🗄️ Memory (Long-term knowledge across sessions)
            ├── 📚 Extracted insights from past sessions
            ├── 🔍 Searchable knowledge base
            └── 🎯 User preferences and patterns
```

ADK memory services implement the [`BaseMemoryService`](https://github.com/google/adk-python/blob/main/src/google/adk/memory/base_memory_service.py) interface, which provides methods for:
- **Storing memories**: Converting session events into searchable knowledge
- **Searching memories**: Finding relevant information based on queries
- **Managing lifecycle**: Handling memory persistence and retrieval
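
As a rough mental model of that interface, the sketch below defines a toy, stdlib-only stand-in that reuses the two method names `add_session_to_memory` and `search_memory`. The `ToySession` and `ToyMemoryService` classes, the `"app_name/user_id"` key, and the word-overlap matching are assumptions made for illustration and are not ADK code. ADK's own methods are coroutines; the toy is synchronous to keep it short.

```python
from dataclasses import dataclass, field


@dataclass
class ToySession:
    """Hypothetical stand-in for a session: identifiers plus message texts."""
    app_name: str
    user_id: str
    id: str
    texts: list[str] = field(default_factory=list)


class ToyMemoryService:
    """Toy store that mirrors the two method names of the memory interface."""

    def __init__(self) -> None:
        # Maps "app_name/user_id" -> {session_id: [message texts]}
        self._store: dict[str, dict[str, list[str]]] = {}

    def add_session_to_memory(self, session: ToySession) -> None:
        key = f"{session.app_name}/{session.user_id}"
        self._store.setdefault(key, {})[session.id] = list(session.texts)

    def search_memory(self, *, app_name: str, user_id: str, query: str) -> list[str]:
        # Return every stored text that shares at least one word with the query.
        query_words = set(query.lower().split())
        sessions = self._store.get(f"{app_name}/{user_id}", {})
        return [
            text
            for texts in sessions.values()
            for text in texts
            if query_words & set(text.lower().split())
        ]


service = ToyMemoryService()
service.add_session_to_memory(
    ToySession("MemoryExampleApp", "user-123", "s1", ["My favorite color is BlueGreen"])
)

hits = service.search_memory(app_name="MemoryExampleApp", user_id="user-123", query="favorite color")
other = service.search_memory(app_name="MemoryExampleApp", user_id="user-456", query="favorite color")
print(hits)   # the stored text for user-123
print(other)  # nothing stored for user-456
```

Keying the store by application and user is what keeps one user's memories out of another user's search results in this toy model.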

### Memory Architecture Visualization

```mermaid
flowchart TB
    subgraph App["🏢 Application"]
        subgraph User["👤 User (user-123)"]
            subgraph Sessions["💬 Sessions"]
                S1["Session 1<br/>📝 Events + 🧠 State"]
                S2["Session 2<br/>📝 Events + 🧠 State"]
                S3["Session N<br/>📝 Events + 🧠 State"]
            end

            subgraph Memory["🗄️ Long-term Memory"]
                M1["📚 Extracted Insights"]
                M2["🔍 Searchable Knowledge"]
                M3["🎯 User Preferences"]
            end
        end
    end

    S1 -.->|Transfer| Memory
    S2 -.->|Transfer| Memory
    S3 -.->|Transfer| Memory

    Memory -->|Query| Agent["🤖 Agent"]

    style Sessions fill:#e3f2fd,stroke:#1976d2,stroke-width:2px
    style Memory fill:#fff3e0,stroke:#f57c00,stroke-width:2px
    style Agent fill:#e8f5e9,stroke:#388e3c,stroke-width:2px
```


<img src="https://github.com/msampathkumar/google-adk-sam/blob/main/adk-memory-image-intro.png?raw=1" alt="drawing" width="600"/>

This diagram illustrates the relationship between Sessions (temporary, per-conversation) and Memory (persistent, across conversations). Sessions can be transferred to Memory, and agents can query Memory for relevant information.

### Imports & Helper functions

The helper function `run_session` prepares a session and runs the user's queries through the runner.

In [None]:
from google.adk.agents import LlmAgent
from google.adk.runners import Runner
from google.adk.sessions import BaseSessionService
from google.adk.sessions import InMemorySessionService
from google.adk.memory import InMemoryMemoryService
from google.adk.tools import load_memory
from google.genai import types



async def run_session(runner_instance: Runner, user_queries: list[str] | str | None = None, session_name: str = "default"):
    """
    Helper function that manages a complete conversation session, handling session
    creation/retrieval, query processing, and response streaming. It supports
    both single queries and multiple queries in sequence.

    Args:
        runner_instance (Runner): The ADK Runner instance that manages the
            conversation flow between user and agent.
        user_queries (list[str] | str | None): Either a single query string,
            a list of query strings to process sequentially, or None if no
            queries are provided.
        session_name (str): A unique identifier for the session. Defaults to
            "default". Used to resume previous conversations or start new ones.

    Returns:
        None: This function prints the conversation to stdout rather than
            returning values.

    Example:
        >>> await run_session(runner, "What is the capital of France?", "geography-session")
        >>> await run_session(runner, ["Hello!", "What's my name?"], "user-intro-session")

    Note:
        - If a session with the given name already exists, it will be resumed.
    """
    # Display the session identifier for tracking
    print(f"\n ### Session: {session_name}")

    # Attempt to create a new session or retrieve an existing one
    try:
        session = await session_service.create_session(app_name=APP_NAME, user_id=USER_ID, session_id=session_name)
    except Exception:
        session = await session_service.get_session(app_name=APP_NAME, user_id=USER_ID, session_id=session_name)

    # Process queries if provided
    if user_queries:
        # Convert single query to list for uniform processing
        if isinstance(user_queries, str):
            user_queries = [user_queries]

        # Process each query in the list sequentially
        for query in user_queries:
            # Display the user's query
            print(f"\nUser > {query}")

            # Convert the query string to the ADK Content format
            query = types.Content(role="user", parts=[types.Part(text=query)])

            # Stream the agent's response asynchronously
            async for event in runner_instance.run_async(user_id=USER_ID, session_id=session.id, new_message=query):
                # Check if the event contains valid content
                if event.content and event.content.parts:
                    # Filter out empty or "None" responses before printing
                    if event.content.parts[0].text != "None" and event.content.parts[0].text:
                        # Display the model's response with the model name prefix
                        print(f"{MODEL_NAME} > ", event.content.parts[0].text)
                        print("----")
    else:
        print("No queries!")

## 3. Building Our First Memory-Enabled Agent

### 3.1. Starting Simple: Agent Without Memory

Let's first create a basic agent setup. Notice that we're initializing both a `SessionService` (for conversation history) and a `MemoryService` (for long-term knowledge), but our agent doesn't yet know how to use the memory:

In [None]:
APP_NAME = "MemoryExampleApp"
USER_ID = "user-123"
MODEL_NAME = "gemini-2.5-flash-lite"


session_service = InMemorySessionService()
memory_service = InMemoryMemoryService()

user_agent = LlmAgent(
    model=MODEL_NAME,
    instruction="Answer the user's questions in simple words.",
    name=APP_NAME,
)

runner = Runner(
    agent=user_agent,
    app_name=APP_NAME,
    session_service=session_service,
    memory_service=memory_service,
)

### 3.2. Testing Session Memory (Within Conversation)

First, let's verify that our agent can remember information within a single session, just like we learned in Part 1. This establishes our baseline - the agent remembers because both queries are in the same conversation:

In [None]:
user_queries = [
    "My favorite color is BlueGreen. Can you write a Haiku",
    "What is my favorite color",
]

await run_session(runner, user_queries, "test-run-01")

## Example response: ##
# User > My favorite color is BlueGreen. Can you write a Haiku
# Model > A color so rare,
# Blue meets green, a lovely blend,
# Nature's soft embrace.
#
# User > What is my favorite color
# Model > Your favorite color is BlueGreen.

## 4. From Sessions to Memory: The Transfer Process


### 4.1. Understanding the Memory Creation Workflow

Now comes the crucial part - converting conversation history into searchable long-term memory. This is like taking notes from a meeting and filing them in a knowledge base for future reference.

The workflow looks like this:
1. **Conversation happens** → Events stored in Session
2. **Extract valuable information** → Identify what's worth remembering
3. **Store in Memory** → Make it searchable across sessions
4. **Future conversations** → Agent can access this knowledge

Let's walk through this process step by step.

### Session to Memory Transfer Flow

```mermaid
flowchart LR
    subgraph Session["💬 Active Session"]
        E1["User: My favorite color<br/>is BlueGreen"]
        E2["Agent: BlueGreen is<br/>a nice shade..."]
        E3["User: What is my<br/>favorite color?"]
        E4["Agent: Your favorite<br/>color is BlueGreen"]
    end

    subgraph Process["🔄 Transfer Process"]
        P1["1️⃣ Retrieve Session Events"]
        P2["2️⃣ Extract Valuable Info"]
        P3["3️⃣ Store in Memory Service"]
    end

    subgraph Memory["🗄️ Long-term Memory"]
        M1["Searchable:<br/>'favorite color' → 'BlueGreen'"]
        M2["User Preferences:<br/>Color: BlueGreen"]
    end

    E1 --> P1
    E2 --> P1
    E3 --> P1
    E4 --> P1

    P1 --> P2
    P2 --> P3
    P3 --> M1
    P3 --> M2

    style Session fill:#e3f2fd,stroke:#1976d2,stroke-width:2px
    style Process fill:#f3e5f5,stroke:#7b1fa2,stroke-width:2px
    style Memory fill:#fff3e0,stroke:#f57c00,stroke-width:2px
```


<img src="https://github.com/msampathkumar/google-adk-sam/blob/main/adk-memory-session-to-memory-transfer-process.png?raw=1" alt="drawing" width="600"/>


This flowchart visualizes how conversation events from a session are processed and transferred into searchable long-term memory.

### 4.2. Step 1: Retrieving Session Events

First, let's fetch the conversation we just had. Remember, at this point, the information only exists in the session - not in long-term memory:

In [None]:
session = await session_service.get_session(
    app_name=APP_NAME, user_id=USER_ID, session_id="test-run-01"
)

for each in session.events:
    print(f'{each.content.role} > {each.content.parts[0].text}')

### 4.3. Step 2: Checking Memory Status

Let's verify that our memory service is currently empty. `_session_events` is a private attribute of `InMemoryMemoryService` that holds the stored events; we read it here only for inspection:

In [None]:
# Check if memory service has any stored memories yet
memory_service._session_events

# Example response:
# {}  # Empty dictionary - no memories stored yet

### 4.4. Step 3: Transferring Session to Memory

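Now for the key step: we'll transfer our conversation history from session storage into the memory service. Note that `InMemoryMemoryService` keeps everything in process memory, so these memories last only as long as the notebook kernel; a persistent backend is needed for storage that survives restarts: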

In [None]:
# Transfer the entire conversation from session to long-term memory
# This makes the conversation searchable across future sessions
await memory_service.add_session_to_memory(session)

### 4.5. Step 4: Verifying Memory Storage

Let's verify that our conversation has been successfully stored in memory. Notice how the memory is organized by app, user, and session:

In [None]:
# Inspect what's now stored in the memory service
# Structure: "app_name/user_id" -> session_id -> list of events

for app_user, user_sessions in memory_service._session_events.items():
    for session_id, session_events in user_sessions.items():
        print('----------------------')
        print(f'App/User: {app_user}, Session: {session_id}')
        for each in session_events:
            print(f'Role: {each.content.role} > {each.content.parts[0].text}')

# Example response:
# ----------------------
# App/User: MemoryExampleApp/user-123, Session: test-run-01
# Role: user > My favorite color is BlueGreen. Can you write a Haiku
# Role: model > Sure, I can write a haiku about your favorite color, BlueGreen!
# ....

### 4.6. Step 5: Testing Memory Search

Now that we have memories stored, let's test the search functionality. This is what allows agents to find relevant information from past conversations:

In [None]:
# Search for memories related to "favorite color"
# The search looks through all stored conversations for this user

await memory_service.search_memory(
    app_name=APP_NAME,
    user_id=USER_ID,
    query="favorite color",
)
# Example response:
# SearchMemoryResponse(memories=[MemoryEntry(content=Content(
#   parts=[
#     Part(
#       text='My favorite color is BlueGreen. Can you write a Haiku'
#     ),
#   ],
#   role='user'
#  ....

Let's test searching for something that wasn't discussed - this should return empty results:

In [None]:
# Search for memories about "trip plan" - which we never discussed
# This demonstrates that search only returns relevant memories

await memory_service.search_memory(
    app_name=APP_NAME,
    user_id=USER_ID,
    query="trip plan",
)

# Example response
# SearchMemoryResponse(memories=[])  # Empty - no memories about trip plans

**Key Insight:** The `search_memory` function only returns memories that actually exist and match the query. This helps keep agent responses relevant and grounded in actual past conversations.
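
One way to see why the first query matched and the second did not is to model the search as simple word overlap. The sketch below is an assumption about how a basic in-memory store might match text, not a description of ADK's actual implementation; `words` and `matches` are hypothetical helpers.

```python
import re


def words(text: str) -> set[str]:
    """Return the lowercase alphabetic words found in the text."""
    return {w.lower() for w in re.findall(r"[A-Za-z]+", text)}


def matches(query: str, memory_text: str) -> bool:
    """True when the query and the stored text share at least one word."""
    return bool(words(query) & words(memory_text))


stored = "My favorite color is BlueGreen. Can you write a Haiku"

color_hit = matches("favorite color", stored)  # shares "favorite" and "color"
trip_hit = matches("trip plan", stored)        # shares no words
print(color_hit, trip_hit)
```

Under this model a query that overlaps only on a common word such as "is" would also match, which is one limitation of pure keyword matching.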

### 4.7. Step 6: Dynamic Memory Loading

Sessions can also be loaded into memory dynamically, while the conversation is still in progress, instead of through a manual transfer afterwards.
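
One way to do this is to save the session from a hook that runs after every agent turn. The sketch below models that idea in plain Python; `ToyMemory`, `run_turn`, and the `after_turn` hook are hypothetical names, and the agent reply is a placeholder string. In ADK the equivalent wiring would go through the agent's callback mechanism and `add_session_to_memory`.

```python
from typing import Callable


class ToyMemory:
    """Hypothetical in-process store: session_id -> list of message texts."""

    def __init__(self) -> None:
        self.sessions: dict[str, list[str]] = {}

    def add_session_to_memory(self, session_id: str, events: list[str]) -> None:
        # Overwrite with the latest snapshot so repeated saves do not duplicate events.
        self.sessions[session_id] = list(events)


def run_turn(
    session_id: str,
    events: list[str],
    user_text: str,
    after_turn: Callable[[str, list[str]], None],
) -> None:
    """Append one user/agent exchange, then fire the after-turn hook."""
    events.append(f"user: {user_text}")
    events.append(f"agent: (reply to '{user_text}')")
    after_turn(session_id, events)


memory = ToyMemory()
events: list[str] = []

# Each turn ends by saving the whole session, so memory is always up to date.
run_turn("test-run-01", events, "My favorite color is BlueGreen", memory.add_session_to_memory)
run_turn("test-run-01", events, "What is my favorite color", memory.add_session_to_memory)

saved = memory.sessions["test-run-01"]
print(len(saved))
```

Saving a full snapshot on every turn is simple but repeats work; whether that cost matters depends on the memory backend.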

## 5. Empowering Agents with Memory Tools

There are two main architectural patterns for retrieving or loading memories into an agent's context within the Agent Development Kit (ADK): **Proactive Retrieval (Static Loading)** and **Reactive Retrieval (Memory-as-a-Tool)**.

1. **Proactive Retrieval** ensures context is always available by automatically loading memories at the beginning of every conversation turn. This uses the `PreloadMemoryTool` built into ADK. Although this pattern guarantees the context is present, it can introduce **unnecessary latency for turns that do not require memory**.

2. **Reactive Retrieval**, often referred to as `Memory-as-a-Tool`, grants the agent the autonomy to decide when memory access is necessary. This is implemented using the `LoadMemoryTool`, which the agent invokes when its reasoning determines that past context is needed to answer a query. This approach is generally more efficient as the latency and cost of retrieval are incurred only when required.

Both patterns can also be implemented with custom code:
- Proactive Retrieval can be implemented via a custom callback that manually retrieves memories and appends them to the system instructions.
- Reactive Retrieval can be implemented via a custom tool in which the developer describes what type of information might be available, so the LLM can make a more informed decision about when to query.

Here is the simplified representation:

| Pattern | Description | Implementation Options |
|---|---|---|
| Proactive Retrieval (Static Loading) | Memory is loaded before the agent's main processing loop begins. It's **always on** and available. | 1. `PreloadMemoryTool`<br>2. Custom callback (runs before the agent call) |
| Reactive Retrieval (Memory-as-a-Tool) | Memory is treated as a separate tool that the **agent can choose** to call during its execution, when needed. | 1. `LoadMemoryTool`<br>2. Custom tool implementation |
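
The difference between the two patterns can be sketched without any ADK code. In the toy example below, `search` stands in for a memory lookup and counts how often it is called; `proactive_turn`, `reactive_turn`, and the `needs_memory` heuristic are hypothetical. In a real agent the LLM itself decides whether to call the memory tool.

```python
MEMORIES = ["My favorite color is BlueGreen"]
lookup_count = 0


def search(query: str) -> list[str]:
    """Toy memory lookup that records each call."""
    global lookup_count
    lookup_count += 1
    query_words = set(query.lower().split())
    return [m for m in MEMORIES if query_words & set(m.lower().split())]


def proactive_turn(query: str) -> str:
    # Always search first and attach whatever comes back to the prompt.
    context = search(query)
    return f"context={context} | question={query}"


def needs_memory(query: str) -> bool:
    # Stand-in for the model's own decision to call the memory tool.
    return "my" in query.lower().split()


def reactive_turn(query: str) -> str:
    # Search only when the (toy) decision step asks for it.
    context = search(query) if needs_memory(query) else []
    return f"context={context} | question={query}"


queries = ["What is my favorite color", "What is the capital of India?"]

lookup_count = 0
for text in queries:
    proactive_turn(text)
proactive_lookups = lookup_count

lookup_count = 0
for text in queries:
    reactive_turn(text)
reactive_lookups = lookup_count

print(proactive_lookups, reactive_lookups)
```

The proactive version pays for a lookup on every turn, while the reactive version skips it for the question that needs no personal context.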

### 5.1. The Problem: Agent Can't Access Memory Yet

So far, we've stored memories, but our agent doesn't know how to use them. It's like having a filing cabinet full of valuable information but no way to access it. Let's fix that by giving our agent the `load_memory` tool.

### 5.2. Upgrading the Agent with Memory Access

The `load_memory` tool allows agents to search through stored memories during conversations. This is the key to making agents truly personalized:

### Memory Access Flow Across Sessions



```mermaid
flowchart TB
    subgraph OldSession["💬 Previous Session (test-run-01)"]
        OE["User: My favorite color is BlueGreen<br/>Agent: Your favorite color is BlueGreen"]
    end

    subgraph MemoryStore["🗄️ Long-term Memory"]
        MS["Stored: favorite color = BlueGreen"]
    end

    subgraph NewSession["🆕 New Session (test-run-02)"]
        NQ["User: What is my favorite color?"]
    end

    subgraph Agent["🤖 Agent with Memory Tools"]
        A1["1️⃣ Detect need for memory"]
        A2["2️⃣ Use load_memory tool"]
        A3["3️⃣ Search: 'favorite color'"]
        A4["4️⃣ Retrieve: BlueGreen"]
        A5["5️⃣ Respond with answer"]
    end

    OE -.->|Previously transferred| MemoryStore
    NQ --> A1
    A1 --> A2
    A2 --> A3
    A3 --> MemoryStore
    MemoryStore -->|Returns result| A4
    A4 --> A5
    A5 -->|"Your favorite color is BlueGreen"| NewSession

    style OldSession fill:#e0e0e0,stroke:#757575,stroke-width:2px
    style MemoryStore fill:#fff3e0,stroke:#f57c00,stroke-width:2px
    style NewSession fill:#e3f2fd,stroke:#1976d2,stroke-width:2px
    style Agent fill:#e8f5e9,stroke:#388e3c,stroke-width:2px
```

This diagram shows how an agent in a completely new session can access memories from previous conversations using the memory tools, enabling true cross-session personalization.

In [None]:
from google.adk.tools import load_memory # Tool to query memory

user_agent = LlmAgent(
    model=MODEL_NAME,
    instruction="Answer the user's questions in simple words.",
    name=APP_NAME,
    tools=[load_memory] # Equip Agent with Tools to call memory
)

runner = Runner(
    agent=user_agent,
    app_name=APP_NAME,
    session_service=session_service,
    memory_service=memory_service,
)

### 5.3. Testing Memory Access Across Sessions

Now for the moment of truth - let's start a **completely new session** and see if our agent can remember information from the previous conversation:

In [None]:
# Start a NEW session - this is crucial!
# The agent has no conversation history from test-run-01
# But it DOES have access to memories via the load_memory tool

user_queries = [
    "What is my favorite color",
]

await run_session(runner, user_queries, "test-run-02") # Note: New session

# Expected behavior:
# The agent will use the load_memory tool to search for information
# about favorite color, find the memory from test-run-01,
# and correctly answer "BlueGreen"

### 5.4. Proactive Memory Loading (Optional)

In [None]:
from google.adk.tools.preload_memory_tool import PreloadMemoryTool # Tool that loads memory on every turn

user_agent = LlmAgent(
    model=MODEL_NAME,
    instruction="Answer the user's questions in simple words.",
    name=APP_NAME,
    tools=[PreloadMemoryTool()] # Equip the agent with proactive memory loading
)

runner = Runner(
    agent=user_agent,
    app_name=APP_NAME,
    session_service=session_service,
    memory_service=memory_service,
)

In [None]:
user_queries = [
    "What is my favorite color",
]

await run_session(runner, user_queries, "test-run-04") # Note: New session

As the output shows, the agent can recall the information with the proactive approach. Now let's ask a question that does not require memory at all.

In [None]:
user_queries = [
    "What is the capital of India?",
]

await run_session(runner, user_queries, "test-run-04") # Note: Same session as the previous cell

Expected logging messages from the above executions: `Warning: there are non-text parts in the response:..`

The `PreloadMemoryTool` loads memories on every turn, whether or not the query needs them. To load memories from active chat conversations, see https://google.github.io/adk-docs/sessions/memory/#using-memory-in-your-agent

## 6. What You've Built

🎉 **Congratulations!** You've successfully transformed a stateless LLM into an intelligent agent with persistent memory that spans conversations.

### Your Journey Recap:

1. **Understood the Challenge**: Recognized how Sessions provide only temporary memory within a single conversation
2. **Implemented Memory Storage**: Used `InMemoryMemoryService` to create a searchable knowledge base
3. **Transferred Knowledge**: Learned how to extract valuable information from sessions and store it as memories
4. **Enabled Memory Access**: Equipped your agent with the `load_memory` tool to access past conversations
5. **Achieved Persistence**: Created an agent that remembers user preferences across different sessions

### Key Takeaways:

- **Sessions vs Memory**: Sessions handle conversation flow; Memory provides long-term knowledge
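- **Memory is Searchable**: Unlike sessions, memories can be queried across conversations; `InMemoryMemoryService` uses basic keyword matching, while managed services such as Vertex AI Memory Bank offer semantic search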
- **Tools Enable Access**: The `load_memory` tool bridges the gap between stored memories and agent capabilities
- **User-Specific**: Memories are segregated by user, ensuring privacy and personalization

### What's Next?

In production environments, you'll want to:
- Use **Vertex AI Memory Bank** for scalable, persistent memory storage
- Implement memory curation strategies to extract the most valuable insights
- Add memory expiration policies for data governance
- Consider using artifacts for storing structured data and files

Your agents can now build relationships with users over time, learning preferences and providing increasingly personalized experiences. This is the foundation of truly intelligent conversational AI!

#### Read more
* [Google ADK Memory](https://google.github.io/adk-docs/sessions/memory/)
* [Google ADK - Vertex AI Memory Bank](https://github.com/GoogleCloudPlatform/generative-ai/blob/62efa4db92dd6aeff735e8f0f29bffa7c016eba4/gemini/agent-engine/memory/get_started_with_memory_bank_adk.ipynb)
* [Google ADK - Artifacts](https://google.github.io/adk-docs/artifacts/#what-are-artifacts)