In [None]:
!pip install --upgrade --quiet google-adk==1.16

# Building Intelligent Agents: Gemini, Google ADK, and Memory Management - Part 1

In this notebook, you'll learn how to transform stateless LLMs into intelligent, stateful agents that can maintain conversation context and working memory. We'll explore Sessions, Events, Session State, and the concept of Context Engineering.

**What you'll learn:**
- Why LLMs are inherently *stateless* and how to overcome this limitation
- Building *stateful* conversational agents with Sessions and Events
- Understanding the limitations of Session and Memory context
- Managing working memory with Session State
- Best practices for managing long conversations

**Time:** 20-25 minutes

In the next notebook, we'll extend these concepts to cover long-term Memory that persists across sessions.

## 1. Setup and Configuration

This section covers the initial setup required to run the notebook, including installing libraries and configuring the environment.

#### 1.1. Install Dependencies

Install necessary Python packages: google-adk


#### 1.2. Environment Configuration

- Set up Gemini API Key if using Google AI Studio
- Set up Google Cloud Project ID and Location if using Vertex AI and handle authentication for Google Colab environments
- Import required libraries
- Define the agent_name, app_name, model and user_id

##### **Vertex AI Users**
If you are using **Vertex AI**, set the values of **PROJECT_ID** and **LOCATION** below, then authenticate.


In [None]:
import os

PROJECT_ID = "[your-project-name]"  # @param {type:"string"}
LOCATION = "global" # @param {type:"string"}

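if not PROJECT_ID or PROJECT_ID == "[your-project-name]":
    # Fall back to the environment; "" avoids storing the literal string "None"
    PROJECT_ID = os.environ.get("GOOGLE_CLOUD_PROJECT", "")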

GOOGLE_GENAI_USE_VERTEXAI = "1" # Use Vertex AI API

os.environ["GOOGLE_CLOUD_PROJECT"] = PROJECT_ID
os.environ["GOOGLE_CLOUD_LOCATION"] = LOCATION
os.environ["GOOGLE_GENAI_USE_VERTEXAI"] = GOOGLE_GENAI_USE_VERTEXAI # Use Vertex AI API

In [None]:
# User Authentication - only required for Google Colab Notebooks
import sys

if "google.colab" in sys.modules:
    from google.colab import auth
    auth.authenticate_user()

##### **Google AI Studio Users**
If you are using **Google AI Studio**, store the API key in the secret manager and access it below.


In [None]:
# import os
# GEMINI_API_KEY = "<add-your-api-key-here>"  # @param {type:"string"}

# if not GEMINI_API_KEY:
#     GEMINI_API_KEY = os.environ.get("GEMINI_API_KEY", "")

# os.environ["GEMINI_API_KEY"] = GEMINI_API_KEY

## 2. Understanding the Challenge: Stateless LLMs

At their core, Large Language Models are **inherently stateless**. Think of them as having amnesia after every interaction - their awareness is confined to the information you provide in a single API call. This means an agent without proper context management will react to the current prompt without considering any previous history.

**Why does this matter?** Imagine trying to have a meaningful conversation with someone who forgets everything you've said after each sentence. That's the challenge we face with raw LLMs!

Let's see this limitation in action. We'll build a simple chat with [Google GenAI SDK](https://github.com/googleapis/python-genai) and demonstrate how it forgets our name between calls.



In [None]:
# Google Gen AI SDK Imports
from google import genai
from google.genai import types

In the example below, we introduce ourselves as `Sam` so that Gemini sees the name in this request.


In [None]:
# Initialize the Gemini client - notice we're using the raw SDK without any session management
client = genai.Client(http_options=types.HttpOptions(api_version="v1"))
model_name = "gemini-2.5-flash-lite"

# First query: We introduce ourselves as Sam
query = "Hi, I am Sam. Tell me why the sky is blue?"
response = client.models.generate_content(model=model_name, contents=query)

# Display the interaction
print(f'User> {query}')
print(f'Model> {response.text[:100]}..')

# Example response:
#   User> Hi, I am Sam. Tell me why the sky is blue?
#   Model> Hi Sam! That's a great question, and the reason the sky is blue has to do with how sunlight interact..

    User> Hi, I am Sam. Tell me why the sky is blue?
    Model> Hi Sam! That's a fantastic question, and it's all thanks to a phenomenon called **Rayleigh scatterin..

Notice how Gemini responded with "Hi Sam!" in the previous interaction? Let's see if it remembers our name in a new API call:


In [None]:
# Second query: Ask if the model remembers our name
# This is a completely new API call with no context from the previous interaction
query = "Hi, what is my name?"
response = client.models.generate_content(model=model_name, contents=query)

# Display the interaction
print(f'User> {query}')
print(f'Model> {response.text[:100]}...')

# Example response:
#   User> Hi, what is my name?
#   Model> I do not know your name. I am a large language model, trained by Google.

# 🔴 The model has completely forgotten that we introduced ourselves as Sam!

    User> Hi, what is my name?
    Model> I do not have access to your personal information, including your name. I am a large language model,...

As you can see, the agent has *no record* of our first message. To build intelligent agents that can remember, learn, and personalize interactions, we must construct the context for every turn of a conversation. This practice is known as **Context Engineering**.
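
To make the idea concrete before introducing ADK, here is a minimal sketch of doing this by hand: keep a list of past turns and send the whole list with every request. The names `history` and `add_turn` are illustrative helpers, not part of any SDK, and the model reply is a placeholder rather than real output. The dict layout mirrors the `role`/`parts` structure of the Gen AI SDK's content objects, which `generate_content` should accept as `contents`; the actual call is left commented out because it needs credentials.

```python
# A hand-rolled conversation history (illustrative helper, not an SDK API).
history: list[dict] = []


def add_turn(history: list[dict], role: str, text: str) -> list[dict]:
    """Append one turn in a role/parts layout and return the history."""
    history.append({"role": role, "parts": [{"text": text}]})
    return history


# Turn 1: the user introduces themselves; the reply here is a placeholder.
add_turn(history, "user", "Hi, I am Sam. Tell me why the sky is blue?")
add_turn(history, "model", "Hi Sam! It is due to Rayleigh scattering.")

# Turn 2: the new question is sent together with everything before it.
add_turn(history, "user", "Hi, what is my name?")

# Assumption: `client` and `model_name` are the objects created earlier.
# response = client.models.generate_content(model=model_name, contents=history)

print(len(history), "turns sent; first turn:", history[0]["parts"][0]["text"])
```

Because the first turn travels with the second request, the model can see the name again. The cost is that the payload grows with every turn, which is the problem the rest of this notebook addresses.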

### 2.1. Context Engineering

In the previous section, we saw that an LLM can only use information that is present in its context window. In this section, we will learn how to manage the context window effectively using Google ADK.

If you are familiar with [Prompt Engineering](https://en.wikipedia.org/wiki/Prompt_engineering), you can think of **Context Engineering** as an evolution of it. Let's understand the difference:

**Prompt Engineering (Traditional approach):**
- Focuses on crafting optimal, often static, system instructions
- Works using fixed templates
- Adapts poorly to new contexts or scenarios

**Context Engineering (Modern approach):**
- Focuses on understanding user preferences, making dynamic, data-driven (context-driven) decisions
- Addresses the entire payload, dynamically constructing a state-aware prompt
- Adapts to a wide range of scenarios because the context can improve over time

In Google ADK, two key components manage context: **Sessions** (conversation history) and **State** (working memory).
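
One part of "addressing the entire payload" is deciding how much history to send. The sketch below keeps only the most recent turns that fit a budget. It is a plain-Python illustration, not an ADK feature: `trim_history` is a hypothetical helper, and character count is used as a crude stand-in for token count.

```python
def trim_history(history: list[dict], max_chars: int) -> list[dict]:
    """Keep the most recent turns whose combined text fits in max_chars."""
    kept: list[dict] = []
    used = 0
    # Walk backwards so the newest turns are considered first.
    for turn in reversed(history):
        size = len(turn["text"])
        if used + size > max_chars:
            break
        kept.append(turn)
        used += size
    kept.reverse()  # restore chronological order
    return kept


history = [
    {"role": "user", "text": "Hi, I am Sam."},
    {"role": "model", "text": "Hello Sam!"},
    {"role": "user", "text": "What is my name?"},
]
recent = trim_history(history, max_chars=30)
print([turn["text"] for turn in recent])
```

With this small budget the oldest turn, the one where the user states their name, is dropped. That trade-off is why conversation history alone is not enough and a separate working memory is useful.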

### 2.2. Understanding Sessions, Events & Runner

Now let's explore how Google ADK implements Context Engineering through Sessions.

#### Key Concepts:

**📦 Session**

In Google ADK, a session is a foundational element of Context Engineering. It is a container for a single, continuous conversation: it holds the conversation history in chronological order and also records all tool interactions and responses.

A session is tied to a specific user and a specific agent. For instance, one user's session history is not shared with other users. Similarly, one agent's session history is not shared with other agents. This segregation keeps information separate and private, which in turn supports the agent's performance over time.

**📝 Events**:

While a session is a container for a conversation, events are its building blocks. Common event types include:

- **User Input**: A message from the user (text, audio, image, etc.)
- **Agent Response**: The agent's reply to the user
- **Tool Call**: The agent's decision to use an external tool or API
- **Tool Output**: The data returned from a tool call, which the agent uses to continue its reasoning

**🎯 ADK Components:**

An agentic application can have multiple users, and each user may have multiple sessions with the application. To manage these sessions and events, Google ADK offers a **`SessionService`** and a **`Runner`**.

1. **`SessionService`**: The storage layer
   - Manages creation, storage, and retrieval of session data
   - Different implementations for different needs (memory, database, cloud)

2. **`Runner`**: The orchestration layer
   - Manages the flow of information between user and agent
   - Automatically maintains conversation history
   - Handles the Context Engineering behind the scenes

Think of it like this:
- **Session** = A notebook 📓
- **Events** = Individual entries in the notebook 📝
- **SessionService** = The filing cabinet storing notebooks 🗄️
- **Runner** = The assistant managing the conversation 🤖
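
The notebook analogy can be sketched in plain Python. The classes below (`ToyEvent`, `ToySession`, `ToySessionService`) are hypothetical toys written for this illustration; they are not ADK classes and leave out most of what the real ones do. They only show the shape of the idea: a service stores sessions under a key made of application, user, and session ID, and each session holds an ordered list of events.

```python
from dataclasses import dataclass, field


@dataclass
class ToyEvent:
    author: str  # "user" or the agent's name
    text: str


@dataclass
class ToySession:
    app_name: str
    user_id: str
    session_id: str
    events: list[ToyEvent] = field(default_factory=list)


class ToySessionService:
    """The filing cabinet: stores sessions keyed by (app, user, session id)."""

    def __init__(self) -> None:
        self._sessions: dict[tuple[str, str, str], ToySession] = {}

    def get_or_create(self, app_name: str, user_id: str, session_id: str) -> ToySession:
        key = (app_name, user_id, session_id)
        if key not in self._sessions:
            self._sessions[key] = ToySession(app_name, user_id, session_id)
        return self._sessions[key]


service = ToySessionService()
sam = service.get_or_create("default", "sam", "s1")
sam.events.append(ToyEvent("user", "Hi, I am Sam!"))

# The same key returns the same notebook; a different user gets a fresh one.
same = service.get_or_create("default", "sam", "s1")
other = service.get_or_create("default", "alex", "s1")
print(len(same.events), len(other.events))
```

The second lookup sees the stored event, while the other user's session starts empty, which matches the isolation described above.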

### 2.3. Implementing a Session for Conversational History

Let's rebuild our agent, but this time, we'll use a `Runner` and a `SessionService` to make it stateful. Watch how the same conversation works when we properly manage context!


In [None]:
import warnings
import logging
from typing import Any, Iterator, Optional, List, Dict
import httpx

from google.adk.agents import Agent, LlmAgent
from google.adk.sessions import InMemorySessionService
from google.adk.sessions import DatabaseSessionService
from google.adk.runners import Runner
from google.genai import types

LLMs work by converting the user's input into tokens and generating their response as tokens. The response tokens are then converted into text, video, images, or audio, depending on the LLM's capabilities. Although the LLM service provider usually manages all of these steps, each response still takes time to produce.

```mermaid
block-beta
    A space B space C space D space E space F
    
    A["😊 User Input"] --> B("Input2Tokens")
    B --> C{"LLM Processing"}
    C --> D("Response Tokens")
    D --> E{"Tokens2Output"}
    E --> F["🌸 Model Response."]
```

To be efficient, Google ADK uses asynchronous calls for LLM requests and other operations. Let's create a helper function that makes it easy to run conversations in this notebook:


In [None]:
async def run_session(runner_instance: Runner, user_queries: list[str] | str | None = None, session_name: str = "default"):
    """
    Helper function that manages a complete conversation session, handling session
    creation/retrieval, query processing, and response streaming. It supports
    both single queries and multiple queries in sequence.

    Args:
        runner_instance (Runner): The ADK Runner instance that manages the
            conversation flow between user and agent.
        user_queries (list[str] | str | None): Either a single query string,
            a list of query strings to process sequentially, or None if no
            queries are provided.
        session_name (str): A unique identifier for the session. Defaults to
            "default". Used to resume previous conversations or start new ones.

    Returns:
        None: This function prints the conversation to stdout rather than
            returning values.

    Example:
        >>> await run_session(runner, "What is the capital of France?", "geography-session")
        >>> await run_session(runner, ["Hello!", "What's my name?"], "user-intro-session")

    Note:
        - If a session with the given name already exists, it will be resumed.
    """
    # Display the session identifier for tracking
    print(f"\n ### Session: {session_name}")

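    # Resume the session if it already exists; otherwise create a new one.
    # Looking it up first avoids relying on create_session raising for a
    # duplicate ID, which not every SessionService is guaranteed to do.
    session = await session_service.get_session(app_name=APP_NAME, user_id=USER_ID, session_id=session_name)
    if session is None:
        session = await session_service.create_session(app_name=APP_NAME, user_id=USER_ID, session_id=session_name)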

    # Process queries if provided
    if user_queries:
        # Convert single query to list for uniform processing
        if isinstance(user_queries, str):
            user_queries = [user_queries]

        # Process each query in the list sequentially
        for query in user_queries:
            # Display the user's query
            print(f"\nUser > {query}")

            # Convert the query string to the ADK Content format
            query = types.Content(role="user", parts=[types.Part(text=query)])

            # Stream the agent's response asynchronously
            async for event in runner_instance.run_async(user_id=USER_ID, session_id=session.id, new_message=query):
                # Check if the event contains valid content
                if event.content and event.content.parts:
                    # Filter out empty or "None" responses before printing
                    if event.content.parts[0].text != "None" and event.content.parts[0].text:
                        # Display the model's response with the model name prefix
                        print(f"{MODEL_NAME} > ", event.content.parts[0].text)
    else:
        print("No queries!")

### Implementing Our First Stateful Agent

Now let's put this into practice. ADK offers different types of sessions suitable for different needs. To begin, we'll start with `InMemorySessionService` for simplicity:


In [None]:
APP_NAME = "default"        # Application
USER_ID = "default"         # User
SESSION = "default"         # Session

MODEL_NAME = "gemini-2.5-flash-lite"


# Step 1: Create the LLM Agent
# This defines WHAT our agent is - its model and purpose
root_agent = Agent(
    model=MODEL_NAME,               # Using the efficient Gemini model defined above
    name="text_chat_bot",           # Internal name for logging/debugging
    description="A text chatbot",   # Description of the agent's purpose
)

# Step 2: Set up Session Management
# InMemorySessionService stores conversations in RAM (temporary)
session_service = InMemorySessionService()

# Step 3: Create the Runner
# The Runner orchestrates everything - it connects the agent with session management
runner = Runner(
    agent=root_agent,
    app_name=APP_NAME,
    session_service=session_service
)

print("✅ Stateful agent initialized!")
print(f"   - Application: {APP_NAME}")
print(f"   - User: {USER_ID}")
print(f"   - Using: {session_service.__class__.__name__}")

    ✅ Stateful agent initialized!
       - Application: default
       - User: default
       - Using: InMemorySessionService

### Testing Our Stateful Agent

Now let's see the magic of sessions in action! We'll run the same conversation that failed with the stateless approach:


In [None]:
# Run a conversation with two queries in the same session
# Notice: Both queries are part of the SAME session, so context is maintained
await run_session(runner, [
    "Hi, I am Sam! What is the capital of United States?",
    "Hello! What is my name?"  # This time, the agent should remember!
], "test-session-01")

# Expected output:
# ### Session: test-session-01
#
# User > Hi, I am Sam! What is the capital of United States?
# gemini-2.5-flash-lite > Hi Sam! The capital of the United States is Washington, D.C.
#
# User > Hello! What is my name?
# gemini-2.5-flash-lite > Your name is Sam! You told me at the beginning of our conversation.

     ### Session: test-session-01
    
    User > Hi, I am Sam! What is the capital of United States?
    gemini-2.5-flash-lite >  Hi Sam! The capital of the United States is Washington, D.C.
    
    User > Hello! What is my name?
    gemini-2.5-flash-lite >  Your name is Sam.


🎉 **Success!** The agent remembered your name because both queries were part of the same session. The Runner automatically maintained the conversation history.

But there's a catch: `InMemorySessionService` is temporary. Once the application stops, all conversation history is lost. While the kernel is still running, though, the history should remain available, so let's continue the same session:


In [None]:
# Continue the same session - the agent should still remember everything
await run_session(runner, [
    "What did I ask you about earlier?",
    "And remind me, what's my name?"
], "test-session-01")  # Note, we are using same session name

# The agent can only remember if the existing session was resumed, not recreated.
# Either way, if you restart the kernel, all this history will be gone...

     ### Session: test-session-01
    
    User > What did I ask you about earlier?
    gemini-2.5-flash-lite >  I do not have the ability to recall past conversations. Therefore, I cannot tell you what you asked me about earlier.
    
    User > And remind me, what's my name?
    gemini-2.5-flash-lite >  I do not have the ability to recall past conversations. Therefore, I cannot tell you your name.


## 3. Persistent Sessions with DatabaseSessionService

While `InMemorySessionService` is great for prototyping, real-world applications need conversations to survive restarts, crashes, and deployments. Let's level up to persistent storage!

### 3.1. Choosing the Right SessionService

ADK provides different SessionService implementations for different needs:

| Service | Use Case | Persistence | Best For |
|---------|----------|-------------|----------|
| **InMemorySessionService** | Development & Testing | ❌ Lost on restart | Quick prototypes |
| **DatabaseSessionService** | Self-managed apps | ✅ Survives restarts | Small to medium apps |
| **Agent Engine Sessions** | Production on GCP | ✅ Fully managed | Enterprise scale |

### 3.2. Implementing Persistent Sessions

Let's upgrade to `DatabaseSessionService` using SQLite. This gives us persistence without needing a separate database server:


In [None]:
# Clean up any existing database to start fresh
import os
if os.path.exists("my_agent_data.db"):
    os.remove("my_agent_data.db")
print("🗑️  Cleaned up old database files")

    🗑️  Cleaned up old database files



In [None]:
from google.adk.sessions import DatabaseSessionService

# Step 1: Create the same agent (in ADK, `Agent` is an alias for `LlmAgent`)
root_agent = LlmAgent(
    model="gemini-2.5-flash-lite",
    name="text_chat_bot",
    description="A text chatbot with persistent memory",
)

# Step 2: Switch to DatabaseSessionService
# SQLite database will be created automatically
db_url = "sqlite:///my_agent_data.db"  # Local SQLite file
session_service = DatabaseSessionService(db_url=db_url)

# Step 3: Create a new runner with persistent storage
runner = Runner(
    agent=root_agent,
    app_name=APP_NAME,
    session_service=session_service
)

print("✅ Upgraded to persistent sessions!")
print(f"   - Database: my_agent_data.db")
print(f"   - Sessions will survive restarts!")

    ✅ Upgraded to persistent sessions!
       - Database: my_agent_data.db
       - Sessions will survive restarts!

### Test Run 1: Verifying Persistence

In this first test run, we'll start a new conversation with the session ID `test-db-session-01`. We will first introduce our name as 'Sam' and then ask a question. In the second turn, we will ask the agent for our name.

Since we are using `DatabaseSessionService`, the agent should remember the name.

After the conversation, we'll inspect the `my_agent_data.db` SQLite database directly to see how the conversation `events` (the user queries and model responses) are stored.


In [None]:
await run_session(runner, [
    "Hi, I am Sam! what is the Capital of United States?",
    "Hello! what is my name?"
], "test-db-session-01")

     ### Session: test-db-session-01
    
    User > Hi, I am Sam! what is the Capital of United States?
    gemini-2.5-flash-lite >  Hello Sam! The capital of the United States is Washington, D.C.
    
    User > Hello! what is my name?
    gemini-2.5-flash-lite >  Your name is Sam.


### Test Run 2: Resuming a Conversation

Now, let's run the session again with the **same session ID** (`test-db-session-01`). This simulates resuming a previous conversation.

We will ask a new question and then ask for our name again. Because the session is loaded from the database, the agent should still remember that our name is 'Sam' from the first test run. This demonstrates the power of persistent sessions.


In [None]:
await run_session(runner, [
    "What is the Capital of India?",
    "Hello! what is my name?"
], "test-db-session-01")

     ### Session: test-db-session-01
    
    User > What is the Capital of India?
    gemini-2.5-flash-lite >  The capital of India is New Delhi.
    
    User > Hello! what is my name?
    gemini-2.5-flash-lite >  Your name is Sam.


### Observation: SQLite Session Events

Since we are using a SQLite database to store the sessions, let's take a quick peek at how the information is stored.


In [None]:
import sqlite3

def check_data_in_db():
    with sqlite3.connect("my_agent_data.db") as connection:
        cursor = connection.cursor()
        result = cursor.execute("select app_name, session_id, author, content from events")
        print([_[0] for _ in result.description])
        for each in result.fetchall():
            print(each)


check_data_in_db()

    ['app_name', 'session_id', 'author', 'content']
    ('default', 'test-db-session-01', 'user', '{"parts": [{"text": "Hi, I am Sam! what is the Capital of United States?"}], "role": "user"}')
    ('default', 'test-db-session-01', 'text_chat_bot', '{"parts": [{"text": "Hello Sam! The capital of the United States is Washington, D.C."}], "role": "model"}')
    ('default', 'test-db-session-01', 'user', '{"parts": [{"text": "Hello! what is my name?"}], "role": "user"}')
    ('default', 'test-db-session-01', 'text_chat_bot', '{"parts": [{"text": "Your name is Sam."}], "role": "model"}')
    ('default', 'test-db-session-01', 'user', '{"parts": [{"text": "What is the Capital of India?"}], "role": "user"}')
    ('default', 'test-db-session-01', 'text_chat_bot', '{"parts": [{"text": "The capital of India is New Delhi."}], "role": "model"}')
    ('default', 'test-db-session-01', 'user', '{"parts": [{"text": "Hello! what is my name?"}], "role": "user"}')
    ('default', 'test-db-session-01', 'text_chat_bot', '{"parts": [{"text": "Your name is Sam."}], "role": "model"}')
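
The `content` column holds JSON, so it can be decoded into a readable transcript. The sketch below is an illustration under assumptions: it builds a small in-memory stand-in table with only the four columns printed above (the real table has more), so that it runs without `my_agent_data.db`. The `transcript` function is a hypothetical helper, and ordering by `rowid` is an assumption that happens to work for this stand-in table.

```python
import json
import sqlite3

# Stand-in database with the four columns printed above (assumed layout).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE events (app_name TEXT, session_id TEXT, author TEXT, content TEXT)"
)
rows = [
    ("default", "test-db-session-01", "user",
     '{"parts": [{"text": "Hello! what is my name?"}], "role": "user"}'),
    ("default", "test-db-session-01", "text_chat_bot",
     '{"parts": [{"text": "Your name is Sam."}], "role": "model"}'),
]
conn.executemany("INSERT INTO events VALUES (?, ?, ?, ?)", rows)


def transcript(connection: sqlite3.Connection, session_id: str) -> list[str]:
    """Return 'author: text' lines for one session, in insertion order."""
    result = []
    cursor = connection.execute(
        "SELECT author, content FROM events WHERE session_id = ? ORDER BY rowid",
        (session_id,),
    )
    for author, content in cursor:
        parts = json.loads(content).get("parts", [])
        # Only text parts are shown; other part types are skipped.
        text = " ".join(p["text"] for p in parts if "text" in p)
        result.append(f"{author}: {text}")
    return result


lines = transcript(conn, "test-db-session-01")
conn.close()
for line in lines:
    print(line)
```

Filtering by `session_id` in the query also makes the isolation between sessions visible: a different ID returns no rows.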

### Session-to-Session Isolation

As mentioned earlier, a session is a private conversation between an agent and a user, i.e. two sessions do not share information. Let's call `run_session` with a different session name, `test-db-session-02`, to confirm this.


In [None]:
await run_session(runner, [
    "Hello! what is my name?"
], "test-db-session-02")   # Note, we are using new session name

     ### Session: test-db-session-02
    
    User > Hello! what is my name?
    gemini-2.5-flash-lite >  I do not have access to your personal information, including your name. I am a text-based AI and do not have the ability to know who you are.


While isolating information is good, it can become counterproductive when working with multiple (sub-)agents or when the same user returns in a new session. At the same time, sharing the entire conversation history is inefficient. To address this, Google ADK provides `Session.state` for short-term data (within the current conversation) and Memory for long-term data (from the user's past conversations).

## 4. Working with Session State

So far, we've focused on conversation history (Events). But sessions can also maintain **structured working memory** called Session State. This is like the agent's notepad or scratchpad during a conversation.

### 4.1. Understanding Session State

Within each Session (our conversation thread), the state attribute acts like the agent's dedicated scratchpad for that specific interaction. While `session.events` holds the full history, `session.state` is where the agent stores and updates **dynamic details** needed during the conversation.

### 4.2. Key Characteristics of State

- Structure: Conceptually, `session.state` is a dictionary (map) of key-value pairs.
- Mutability: It is managed automatically and updated as the conversation progresses.
- Persistence: Whether it persists depends on the `SessionService` that stores the session.

### 4.3. Session State (`output_key`)

In a multi-agent application, storing model responses in session state is crucial so that other agents can use them. In Google ADK, the `output_key` parameter does this: the agent's response is saved automatically in the session state (`session.state`) under that key.


In [None]:
from google.adk.agents import LlmAgent


root_agent = LlmAgent(
    model="gemini-2.5-flash-lite",
    name="text_chat_bot",
    description="A text chatbot",
    output_key="text_chat_bot_output_key"  # Note: Adding `output_key` 
)
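
Conceptually, `output_key` means "save this agent's final reply into state under this name" so that a later step can read it. The sketch below simulates that hand-off in plain Python without calling ADK or a model. `run_with_output_key` is a hypothetical stand-in, the reply text is hard-coded, and the `{key}` placeholder is filled with `str.format` purely for illustration.

```python
def run_with_output_key(state: dict, output_key: str, reply_text: str) -> dict:
    """Simulate an agent turn: store the final reply under output_key."""
    state[output_key] = reply_text
    return state


# Shared working memory for one session (a stand-in for session.state).
state: dict[str, str] = {}

# Agent 1 answers; its reply lands in state under its output_key.
run_with_output_key(
    state, "text_chat_bot_output_key", "The capital of India is New Delhi."
)

# Agent 2's instruction template pulls that value out of state.
instruction_template = "Summarize this answer in five words: {text_chat_bot_output_key}"
instruction = instruction_template.format(**state)
print(instruction)
```

The second agent never sees the first agent's full conversation, only the one value it needs, which keeps the shared context small.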

### 4.4. Best Practices for State Design Recap

* Minimalism: Store only essential, dynamic data.
* Serialization: Use basic, serializable types.
* Descriptive Keys & Prefixes: Use clear names and, where appropriate, the optional prefixes (`user:`, `app:`, `temp:`).
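
These guidelines can be checked mechanically before a value is written to state. The sketch below is one possible approach, not an ADK utility: `split_prefix` and `is_serializable` are hypothetical helpers, and "serializable" is taken here to mean "accepted by `json.dumps`", which is an assumption about what a storage backend needs.

```python
import json

KNOWN_PREFIXES = ("user:", "app:", "temp:")  # prefixes named in the list above


def split_prefix(key: str) -> tuple[str, str]:
    """Return (prefix, name); the prefix is '' for plain, unprefixed keys."""
    for prefix in KNOWN_PREFIXES:
        if key.startswith(prefix):
            return prefix, key[len(prefix):]
    return "", key


def is_serializable(value: object) -> bool:
    """True when the value survives json.dumps without a custom encoder."""
    try:
        json.dumps(value)
        return True
    except (TypeError, ValueError):
        return False


state = {
    "user:preferred_language": "en",
    "temp:last_tool_result": {"status": "ok"},
    "open_handle": object(),  # deliberately bad: not a basic type
}
for key, value in state.items():
    print(key, split_prefix(key), is_serializable(value))
```

Strings, numbers, booleans, lists, and dicts of those pass the check; arbitrary objects such as open connections or class instances do not and should stay out of state.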

## 5. Production Considerations

When moving an agent to a production environment, its session management system must evolve from a simple log to a robust, enterprise-grade service. This is not a complete list, but the key considerations fall into three critical areas.

1. **Security and Privacy:** Protecting sensitive information in sessions is **non-negotiable**. Use ACLs (Access Control Lists) where necessary.

2. **Data Integrity and Lifecycle Management:** Sessions need clear rules for storage and maintenance. Add data retention policies to manage past conversation history.

3. **Performance and Scalability:** An agentic application needs to be fast and reliable to provide a good user experience.
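
As an illustration of a data retention policy, the sketch below deletes sessions that have not been updated within a retention window. Everything in it is hypothetical: the `toy_sessions` table, its `updated_at` column, and the 30-day window are assumptions made for the example and do not describe the schema that `DatabaseSessionService` creates.

```python
import sqlite3
import time

RETENTION_SECONDS = 30 * 24 * 60 * 60  # assumed policy: keep 30 days


def purge_old_sessions(conn: sqlite3.Connection, now: float) -> int:
    """Delete sessions last updated before the retention cutoff."""
    cutoff = now - RETENTION_SECONDS
    cursor = conn.execute(
        "DELETE FROM toy_sessions WHERE updated_at < ?", (cutoff,)
    )
    conn.commit()
    return cursor.rowcount


# Hypothetical table used only for this demonstration.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE toy_sessions (session_id TEXT PRIMARY KEY, updated_at REAL)"
)
now = time.time()
conn.executemany(
    "INSERT INTO toy_sessions VALUES (?, ?)",
    [("fresh", now - 60), ("stale", now - 90 * 24 * 60 * 60)],
)

deleted = purge_old_sessions(conn, now)
remaining = [row[0] for row in conn.execute("SELECT session_id FROM toy_sessions")]
conn.close()
print(deleted, remaining)
```

In a real deployment a job like this would run on a schedule, and the retention window would come from your privacy and compliance requirements rather than a constant in code.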


## 6. What You've Built

🎉  Congratulations! You've mastered the fundamentals of building stateful AI agents:

- ✅ **Context Engineering** - You understand how to dynamically assemble context for LLMs
- ✅ **Sessions & Events** - You can maintain conversation history across multiple turns
- ✅ **Persistent Storage** - You know how to make conversations survive restarts
- ✅ **Session State** - You can track structured data during conversations
- ✅ **Production Considerations** - You're ready to handle real-world challenges

### To Recap - Your Journey So Far

1. Started with stateless LLMs that forget everything
2. Learned about Context Engineering vs Prompt Engineering
3. Implemented stateful agents with Sessions
4. Upgraded to persistent storage with databases
5. Explored working memory with `Session.state`
6. Reviewed a few production considerations

## Learn more

To learn more about Sessions and State in depth:

* https://google.github.io/adk-docs/sessions/session/
* https://google.github.io/adk-docs/sessions/state/
* https://medium.com/google-cloud/2-minute-adk-manage-context-efficiently-with-artifacts-6fcc6683d274
* https://medium.com/google-cloud/2-minute-adk-context-compaction-in-a-snap-470da15c30f4