# 8 - Memory Management
##  GADK - Long-Term Knowledge with MemoryService
- Long-term knowledge, managed by the MemoryService, is like a searchable archive or library that the agent can consult. It may contain information from many past chats or other sources.
- ADK offers a few ways to build this long-term knowledge store. InMemoryMemoryService is great for quick tests but loses its data on restart. For production, you'll likely use VertexAiRagMemoryService, which uses Google Cloud's Vertex AI RAG service for scalable, persistent semantic search.


In [None]:
# Example: Using InMemoryMemoryService
# This is suitable for local development and testing where data persistence
# across application restarts is not required. Memory content is lost when the app stops.
from google.adk.memory import InMemoryMemoryService

memory_service = InMemoryMemoryService()

In [None]:
# Example: Using VertexAiRagMemoryService
# This is suitable for scalable production on Google Cloud Platform, leveraging
# Vertex AI RAG (Retrieval Augmented Generation) for persistent, searchable memory.
# Requires: pip install google-adk[vertexai], GCP setup/authentication, and a Vertex AI RAG Corpus.
from google.adk.memory import VertexAiRagMemoryService

# The resource name of your Vertex AI RAG Corpus
RAG_CORPUS_RESOURCE_NAME = "projects/your-gcp-project-id/locations/us-central1/ragCorpora/your-corpus-id"  # Replace with your Corpus resource name

# Optional configuration for retrieval behavior
SIMILARITY_TOP_K = 5  # Number of top results to retrieve
VECTOR_DISTANCE_THRESHOLD = 0.7  # Only keep results whose vector distance is below this threshold

memory_service = VertexAiRagMemoryService(
    rag_corpus=RAG_CORPUS_RESOURCE_NAME,
    similarity_top_k=SIMILARITY_TOP_K,
    vector_distance_threshold=VECTOR_DISTANCE_THRESHOLD,
)
# When using this service, methods like add_session_to_memory and search_memory
# will interact with the specified Vertex AI RAG Corpus.
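
A malformed corpus resource name is an easy mistake to make when filling in the placeholder above. As a minimal sketch, the helper below assembles the name from its parts and checks it against the shape shown in the example. The function name build_rag_corpus_name and the pattern are hypothetical and not part of ADK; the check only guards against empty parts and stray slashes, and does not confirm that the corpus exists.

```python
import re

# Shape of the resource name used in the example above (an assumption based on that string).
_RAG_CORPUS_PATTERN = re.compile(
    r"^projects/[^/]+/locations/[^/]+/ragCorpora/[^/]+$"
)


def build_rag_corpus_name(project: str, location: str, corpus_id: str) -> str:
    """Hypothetical helper: assemble and sanity-check a RAG corpus resource name."""
    name = f"projects/{project}/locations/{location}/ragCorpora/{corpus_id}"
    if not _RAG_CORPUS_PATTERN.match(name):
        # An empty part or a part containing "/" breaks the expected shape.
        raise ValueError(f"Malformed RAG corpus resource name: {name!r}")
    return name


RAG_CORPUS_RESOURCE_NAME = build_rag_corpus_name(
    "your-gcp-project-id", "us-central1", "your-corpus-id"
)
```

The resulting string can be passed as rag_corpus exactly as in the cell above.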

Here's how it typically flows: 
- You have a chat interaction via a Session. 
- At some point, you add that session's relevant content to the long-term memory using memory_service.add_session_to_memory(session).
- Later, in a different chat, you might ask a question needing past context. 
- An agent with a memory-retrieval tool (like the built-in load_memory tool) uses it, providing a search query. 
- The tool calls memory_service.search_memory(...), which searches the store and returns relevant snippets. 
- The agent then gets these results back from the tool and uses them to formulate its final answer to you.
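
The add-then-search flow above can be sketched without ADK at all. The toy classes below are hypothetical stand-ins, not ADK APIs: ToySession holds the text of a finished chat, and ToyMemoryStore mimics the two memory calls using plain word overlap rather than semantic search.

```python
import re
from dataclasses import dataclass, field


def _words(text: str) -> set[str]:
    # Lowercase the text and split it into alphanumeric words.
    return set(re.findall(r"[a-z0-9]+", text.lower()))


@dataclass
class ToySession:
    """Hypothetical stand-in for a session: an ID plus the text of each turn."""
    session_id: str
    turns: list[str] = field(default_factory=list)


class ToyMemoryStore:
    """Hypothetical long-term store that matches on shared words, not embeddings."""

    def __init__(self) -> None:
        self._entries: list[tuple[str, str]] = []  # (session_id, text)

    def add_session_to_memory(self, session: ToySession) -> None:
        # Copy every turn of the finished session into the long-term store.
        for text in session.turns:
            self._entries.append((session.session_id, text))

    def search_memory(self, query: str) -> list[str]:
        # Return stored texts that share at least one word with the query.
        query_words = _words(query)
        return [text for _, text in self._entries if query_words & _words(text)]


store = ToyMemoryStore()
past_chat = ToySession("session_info", ["My favorite project is Project Alpha."])
store.add_session_to_memory(past_chat)          # end of the first chat
snippets = store.search_memory("favorite project")  # query issued in a later chat
print(snippets)  # ['My favorite project is Project Alpha.']
```

In ADK the agent's memory tool issues the query and the service decides what counts as relevant; the sketch only shows the order of operations.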

In [None]:
# ADK Conceptual Example: Adding and Searching Memory (InMemory)
# This example demonstrates the core ADK memory pattern:
# 1. Capturing information within one session.
# 2. Adding the content of that session to a Memory Service (simulated in-memory).
# 3. Querying the Memory Service from a *separate* session using a tool to retrieve the previously captured information.

import asyncio  # Needed to drive the async ADK calls (runner event stream, session and memory service methods)

# Assuming necessary ADK components are available for import.
# In a real project, you would install the ADK library and import these directly.
from google.adk.agents import LlmAgent  # Base class for agents powered by LLMs
from google.adk.sessions import (
    InMemorySessionService,
    Session,
)  # Service and object for managing conversation sessions
from google.adk.memory import (
    InMemoryMemoryService,
)  # In-memory implementation of the Memory Service
from google.adk.runners import (
    Runner,
)  # Orchestrates agent execution and session/memory interaction
from google.adk.tools import (
    load_memory,
)  # Built-in tool for querying the Memory Service
from google.genai.types import (
    Content,
    Part,
)  # Classes to structure conversational content

# --- Constants ---
# Define identifiers for the application, user, and the conceptual LLM model.
APP_NAME = "memory_example_app"  # Unique name for the application
USER_ID = "mem_user"  # Unique identifier for the user
MODEL = "gemini-2.0-flash"  # Conceptual name of the LLM model being used

# --- Agent Definitions ---
# Define the agents that will participate in the scenario.

# Agent 1: InfoCaptureAgent
# This agent is designed to simply acknowledge user input.
# Its primary purpose in this example is to generate conversation content
# that we can later add to the memory service.
info_capture_agent = LlmAgent(
    model=MODEL,  # Assign the conceptual model
    name="InfoCaptureAgent",  # Assign a name to the agent
    instruction="Acknowledge the user's statement.",  # Provide instructions for the agent's behavior
    # output_key="captured_info" # Optional: Could also save the agent's response to session state
)

# Agent 2: MemoryRecallAgent
# This agent is designed to answer user questions and is equipped with a tool
# to access long-term memory.
memory_recall_agent = LlmAgent(
    model=MODEL,  # Assign the conceptual model
    name="MemoryRecallAgent",  # Assign a name to the agent
    instruction="Answer the user's question. Use the 'load_memory' tool "
    "if the answer might be in past conversations.",  # Instruct the agent to use the tool
    tools=[load_memory],  # <-- Provide the agent with the built-in load_memory tool.
    # This allows the agent (via the LLM's function calling ability)
    # to decide to use this tool to search memory.
)

# --- Services and Runner ---
# Initialize the core ADK services and the Runner.

# Initialize the in-memory session service.
# This service manages the creation, retrieval, and updating of conversation sessions.
session_service = InMemorySessionService()  # Using InMemory for simplicity in this demo

# Initialize the in-memory memory service.
# This service manages the storage and retrieval of long-term knowledge.
# The load_memory tool will interact with this service.
memory_service = InMemoryMemoryService()  # Using InMemory for simplicity in this demo

# Initialize the Runner.
# The Runner acts as the orchestrator, taking user input, managing the session,
# running the appropriate agent, handling tool calls, and updating the session/memory.
runner = Runner(
    # Initially, set the runner to use the info_capture_agent for the first turn.
    agent=info_capture_agent,
    app_name=APP_NAME,  # Provide the application name
    session_service=session_service,  # Provide the session service
    memory_service=memory_service,  # <-- Provide the memory service to the Runner.
    # The Runner needs this to execute tools that interact with memory.
)

# --- Scenario ---
# This section simulates a two-turn conversation scenario to demonstrate memory usage.

# Turn 1: Capture some information in a session.
print("--- Turn 1: Capturing Information ---")
session1_id = "session_info"  # Define a unique ID for the first session

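# Create the first session using the session service.
# Assumes ADK 1.0 or later, where the session service methods are coroutines;
# on older releases, call create_session directly without asyncio.run.
session1 = asyncio.run(
    session_service.create_session(
        app_name=APP_NAME, user_id=USER_ID, session_id=session1_id
    )
)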

# Define the user's input for the first turn.
user_input1 = Content(
    parts=[Part(text="My favorite project is Project Alpha.")], role="user"
)

# Run the first agent (info_capture_agent) with the user input in session1.
# The runner handles the flow: appends user input, runs the agent, appends agent response.
print("Running InfoCaptureAgent...")


# runner.run_async is an asynchronous generator, so we iterate over it with 'async for'.
# (runner.run is the synchronous variant and cannot be used with 'async for'.)
async def run_turn1():
    final_response_text = "(No final response)"
    # Iterate through the events yielded by the runner during this turn.
    async for event in runner.run_async(
        user_id=USER_ID, session_id=session1_id, new_message=user_input1
    ):
        # In a real application, you would process and display these events to the user.
        # Here, we just capture the final response text for printing.
        if event.is_final_response() and event.content and event.content.parts:
            final_response_text = event.content.parts[0].text
    return final_response_text


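# Execute the asynchronous function for turn 1.
# Note: asyncio.run fails inside a notebook, where an event loop is already running;
# there, use `await run_turn1()` instead (the same applies to the other asyncio.run calls).
turn1_response = asyncio.run(run_turn1())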
print(f"Agent 1 Response: {turn1_response}")

# Get the completed session object after the first turn.
# This session object now contains the history of the first turn.
# Assumes ADK 1.0 or later, where get_session is a coroutine.
completed_session1 = asyncio.run(
    session_service.get_session(
        app_name=APP_NAME, user_id=USER_ID, session_id=session1_id
    )
)

# Add the content of this session to the Memory Service.
# This is the crucial step for long-term memory. The content of session1
# is now stored in the memory_service and can be searched later.
print("\n--- Adding Session 1 to Memory ---")


# add_session_to_memory is an asynchronous method of the Memory Service.
async def add_session():
    await memory_service.add_session_to_memory(completed_session1)


# Execute the asynchronous function to add the session to memory.
asyncio.run(add_session())
print("Session added to memory.")


# Turn 2: In a *new* (or same) session, ask a question requiring memory.
print("\n--- Turn 2: Recalling Information ---")
# Create a second session.
# Using a *new* session ID here ("session_recall") clearly demonstrates that
# the MemoryRecallAgent is retrieving information from a *past* session (session1),
# not just the current session's history.
session2_id = (
    "session_recall"  # Can be the same ID as session1_id if continuing conversation
)
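# create_session is a coroutine in ADK 1.0 or later, so it is driven with asyncio.run.
session2 = asyncio.run(
    session_service.create_session(
        app_name=APP_NAME, user_id=USER_ID, session_id=session2_id
    )
)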

# Switch the runner to use the memory_recall_agent for this turn.
# This agent has the 'load_memory' tool.
runner.agent = memory_recall_agent

# Define the user's input for the second turn - a question that requires recalling info from session1.
user_input2 = Content(parts=[Part(text="What is my favorite project?")], role="user")

# Run the memory_recall_agent with the user input in session2.
print("Running MemoryRecallAgent...")


# Iterate through events yielded by the runner for turn 2.
# runner.run_async is the asynchronous generator; runner.run is synchronous.
async def run_turn2():
    final_response_text_2 = "(No final response)"
    async for event in runner.run_async(
        user_id=USER_ID, session_id=session2_id, new_message=user_input2
    ):
        # Print information about the events to see the sequence of actions,
        # including the tool call and tool response.
        event_type = "Unknown"
        if event.content and event.content.parts and event.content.parts[0].text:
            event_type = "Text"
        elif event.get_function_calls():
            event_type = "FuncCall"  # Indicates the agent decided to call a tool
        elif event.get_function_responses():
            event_type = "FuncResp"  # Indicates the result from a tool call
        print(f"  Event: {event.author} - Type: {event_type}")
        if event.content:
            print(
                f"    Content: {event.content.parts[0].text if event.content.parts else 'Empty'}"
            )
        if event.get_function_calls():
            print(
                f"    Tool Call: {event.get_function_calls()}"
            )  # Show details of the tool call (e.g., tool name, arguments)
        if event.get_function_responses():
            print(
                f"    Tool Response: {event.get_function_responses()}"
            )  # Show the output received from the tool

        # Check for the agent's final response.
        if event.is_final_response() and event.content and event.content.parts:
            final_response_text_2 = event.content.parts[0].text
            print(f"Agent 2 Final Response: {final_response_text_2}")
            break  # Stop after the final response in this simulation

    return final_response_text_2


# Run the asynchronous function for turn 2.
turn2_response = asyncio.run(run_turn2())

# Expected Event Sequence for Turn 2 (as processed by the Runner):
# 1. User sends "What is my favorite project?" (User event yielded)
# 2. Agent (LLM) processes input and current context (session2 events, which is just the user's question).
# 3. Based on its instruction and the user's question, the Agent (LLM) decides to call the `load_memory` tool. (Tool call event yielded by Runner)
# 4. The Runner intercepts the tool call event and executes the `load_memory` tool.
# 5. The `load_memory` tool calls `memory_service.search_memory` with a query derived from the user's input (e.g., "favorite project").
# 6. The `InMemoryMemoryService` searches its stored content (which includes the content from session1).
# 7. The `InMemoryMemoryService` finds the relevant text ("My favorite project is Project Alpha.") from session1 and returns it to the tool.
# 8. The tool packages this retrieved text into a FunctionResponse event. (Tool response event yielded by Runner)
# 9. The Agent (LLM) receives the function response (the retrieved text) as part of the session history, processes this new context.
# 10. Based on the retrieved information, the Agent generates the final answer (e.g., "Your favorite project is Project Alpha."). (Agent final response event yielded by Runner)


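# Clean up the sessions (optional, but good practice in longer-running apps).
# delete_session takes keyword arguments and, in ADK 1.0 or later, is a coroutine.
# asyncio.run(session_service.delete_session(app_name=APP_NAME, user_id=USER_ID, session_id=session1_id))
# asyncio.run(session_service.delete_session(app_name=APP_NAME, user_id=USER_ID, session_id=session2_id))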