# Agentic Platform: Memory Gateway

This lab introduces the concept of a Memory Gateway. A Memory Gateway provides a standard API on top of the underlying memory implementation. The gateway makes it simple to swap out the underlying memory store, for example moving from PostgreSQL to a NoSQL database, without changing the agents that use the memory. As long as the new store satisfies the contract the gateway exposes, clients are unaffected.
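To make the idea of a contract concrete, here is a minimal sketch of the pattern. The names `MemoryStore`, `InMemoryStore`, and `MemoryGateway` are hypothetical and are not the platform's classes; the dictionary-backed store simply stands in for a real database.

```python
from typing import Dict, List, Protocol


class MemoryStore(Protocol):
    """Hypothetical contract that every backing store must satisfy."""

    def append_message(self, session_id: str, message: str) -> None: ...

    def get_messages(self, session_id: str) -> List[str]: ...


class InMemoryStore:
    """Dictionary-backed store, standing in for PostgreSQL or a NoSQL database."""

    def __init__(self) -> None:
        self._sessions: Dict[str, List[str]] = {}

    def append_message(self, session_id: str, message: str) -> None:
        self._sessions.setdefault(session_id, []).append(message)

    def get_messages(self, session_id: str) -> List[str]:
        # Return a copy so callers cannot mutate the stored history.
        return list(self._sessions.get(session_id, []))


class MemoryGateway:
    """Agents call the gateway; only the gateway knows which store is in use."""

    def __init__(self, store: MemoryStore) -> None:
        self._store = store

    def upsert_session_context(self, session_id: str, message: str) -> None:
        self._store.append_message(session_id, message)

    def get_session_context(self, session_id: str) -> List[str]:
        return self._store.get_messages(session_id)


gateway = MemoryGateway(InMemoryStore())
gateway.upsert_session_context("s1", "sample message")
print(gateway.get_session_context("s1"))
```

Any other class with the same two methods could be passed to `MemoryGateway` without changing the calling code, which is the property the gateway is meant to provide.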

There are many open-source frameworks that provide memory features for agents, including [Mem0](https://mem0.ai/). However, there is not a standard gateway implementation.

To get started, let's build a simple memory gateway that uses PostgreSQL as the underlying data store. In the platform we use Aurora RDS PostgreSQL, but here we will use a PostgreSQL database running in a Docker container.

## Memory Gateway components

Our memory gateway exposes four methods.

- `get-session-context` retrieves session-specific data.
- `upsert-session-context` adds new messages to a session history.
- `get-memories` retrieves memories for a session or user.
- `create-memory` creates a new memory for a user or session.

To put these methods into context, the memory gateway deals with two forms of memory. Session memory is a short-term log of the messages in a single conversation or session. General-purpose memory stores a piece of content, the time the memory was created, and optionally the embedding representation of that content.

In practice, short-term session memory lets you retrieve the flow of a single session or conversation. General-purpose memory lets you retrieve contextual information by session, user, or agent ID, or through a semantic embedding search.
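The two retrieval styles for general-purpose memory can be sketched as follows. This is an illustration only: `SketchMemory` and `search` are hypothetical names, and cosine similarity is an assumed ranking metric, since the lab does not state which metric the gateway uses.

```python
import math
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import List, Optional, Sequence


@dataclass
class SketchMemory:
    """Illustrative stand-in for a general-purpose memory entry."""
    content: str
    session_id: str
    user_id: str
    embedding: Optional[List[float]] = None
    created_at: datetime = field(default_factory=lambda: datetime.now(timezone.utc))


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def search(memories: List[SketchMemory],
           session_id: Optional[str] = None,
           query_embedding: Optional[Sequence[float]] = None,
           limit: int = 2) -> List[SketchMemory]:
    # First narrow by ID, the way a WHERE clause would.
    candidates = [m for m in memories
                  if session_id is None or m.session_id == session_id]
    if query_embedding is None:
        return candidates[:limit]
    # Only memories that carry an embedding can be ranked semantically.
    ranked = [m for m in candidates if m.embedding is not None]
    ranked.sort(key=lambda m: cosine_similarity(m.embedding, query_embedding),
                reverse=True)
    return ranked[:limit]


memories = [
    SketchMemory("likes football", "s1", "test_user", [1.0, 0.0]),
    SketchMemory("asked for a joke", "s1", "test_user", [0.0, 1.0]),
    SketchMemory("unrelated session", "s2", "other_user", [1.0, 0.0]),
]
top = search(memories, session_id="s1", query_embedding=[0.9, 0.1], limit=1)
print(top[0].content)
```

Filtering by ID before ranking keeps the similarity search scoped to one session, which mirrors passing both a session ID and an embedding in a single request.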

## Using local PostgreSQL

To experiment with PostgreSQL, we've provided a Docker Compose file in the main project. Start it with Docker Compose to get a local PostgreSQL instance. In the agent stack, PostgreSQL runs in a private subnet, which makes it harder to test against. You can port-forward through the jump box if you'd like to reach it directly, but for this lab we'll use the local Docker instance.

Once Docker Compose is up, create a .env file in this directory with these values:

```bash
export ENVIRONMENT=local
export PG_DATABASE=devdb
export PG_USER=dev
export PG_READ_ONLY_USER=dev
export PG_PASSWORD='dev'
export PG_READ_ONLY_PASSWORD='dev'
export PG_CONNECTION_URL=localhost
```
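Once the values are loaded, a quick check can confirm that none of them are missing before any database call is made. The helper below is a hypothetical convenience, not part of the platform; it only inspects a mapping of variable names to values.

```python
import os
from typing import Iterable, List, Mapping

# Variable names taken from the .env listing above.
REQUIRED_VARS = (
    "ENVIRONMENT", "PG_DATABASE", "PG_USER", "PG_READ_ONLY_USER",
    "PG_PASSWORD", "PG_READ_ONLY_PASSWORD", "PG_CONNECTION_URL",
)


def missing_env_vars(required: Iterable[str] = REQUIRED_VARS,
                     env: Mapping[str, str] = os.environ) -> List[str]:
    """Return the names of required variables that are unset or empty."""
    return [name for name in required if not env.get(name)]


# A deliberately incomplete sample, to show what the check reports.
sample_env = {"ENVIRONMENT": "local", "PG_DATABASE": "devdb", "PG_USER": "dev"}
print(missing_env_vars(env=sample_env))
```

Calling `missing_env_vars()` with no arguments checks the real process environment instead of the sample dictionary.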

In [None]:
from dotenv import load_dotenv

load_dotenv()

## Create and retrieve basic memory entries

Let's get started by creating and retrieving a few sample memory entries. Here are the definitions of a session and general-purpose memory entry.

In [None]:
from agentic_platform.core.models.memory_models import SessionContext, Memory

In [None]:
SessionContext??

In [None]:
Memory??

Now we'll use the agentic platform API to store a couple of sample entries.

In [None]:
from agentic_platform.core.models.memory_models import CreateMemoryRequest, UpsertSessionContextRequest
from agentic_platform.core.models.llm_models import Message
from agentic_platform.service.memory_gateway.api.create_memory_controller import CreateMemoryController
from agentic_platform.service.memory_gateway.api.upsert_session_controller import UpsertSessionContextController

In [None]:
session_entry = SessionContext()
session_entry.add_message(Message(role='user', text='sample message'))

In [None]:
from agentic_platform.core.models.memory_models import CreateMemoryResponse, UpsertSessionContextResponse
upsert_request : UpsertSessionContextRequest = UpsertSessionContextRequest(
    session_context=session_entry
)

response: UpsertSessionContextResponse = UpsertSessionContextController.upsert_session_context(upsert_request)
print(response.model_dump_json(indent=2))

In [None]:
import uuid
session_entry.add_message(Message(role='assistant', text='Could you elaborate?'))
session_entry.add_message(Message(role='user', text='Meant to ask you to tell me a joke'))
session_entry.add_message(Message(role='assistant', text='Knock knock!'))
session_entry.add_message(Message(role='user', text='My favorite sport is football, tell me a joke about that'))
session_entry.add_message(Message(role='assistant', text='Why do football players get paid so much? Overtime!'))
memory_entry = CreateMemoryRequest(session_id = session_entry.session_id,
                                   agent_id=str(uuid.uuid4()),
                                   user_id='test_user',
                                   session_context=session_entry)

In [None]:
response: CreateMemoryResponse = CreateMemoryController.create_memory(memory_entry)
print(response.model_dump_json(indent=2))

Now, let's retrieve our session and memory.

In [None]:
from agentic_platform.core.models.memory_models import GetMemoriesRequest, GetMemoriesResponse, GetSessionContextRequest, GetSessionContextResponse

get_memory_request : GetMemoriesRequest = GetMemoriesRequest(
    session_id = session_entry.session_id
)

In [None]:
get_session_request: GetSessionContextRequest = GetSessionContextRequest(
    session_id = session_entry.session_id
)

In [None]:
from agentic_platform.service.memory_gateway.api.get_memory_controller import GetMemoriesController
from agentic_platform.service.memory_gateway.api.get_session_controller import GetSessionContextController

response: GetSessionContextResponse = GetSessionContextController.get_session_context(get_session_request)
print(response.model_dump_json(indent=2))

In [None]:
response: GetMemoriesResponse = GetMemoriesController.get_memories(get_memory_request)
print(response.model_dump_json(indent=2))

We can also retrieve by embedding similarity.

In [None]:
get_memory_request : GetMemoriesRequest = GetMemoriesRequest(
    session_id = session_entry.session_id,
    embedding=response.memories[0].embedding
)

In [None]:
response: GetMemoriesResponse = GetMemoriesController.get_memories(get_memory_request)
print(response.model_dump_json(indent=2))

## Implement a basic gateway

Let's examine the code behind our memory gateway.

In [None]:
%pycat ../../../src/agentic_platform/service/memory_gateway/server.py

In order to avoid needing authentication, we'll reproduce this code here.

In [None]:
# Continue with regular imports.
from fastapi import FastAPI
from agentic_platform.core.models.memory_models import (
    GetSessionContextRequest,
    GetSessionContextResponse,
    UpsertSessionContextRequest,
    UpsertSessionContextResponse,
    GetMemoriesRequest,
    GetMemoriesResponse,
    CreateMemoryRequest,
    CreateMemoryResponse
)
from agentic_platform.service.memory_gateway.api.get_session_controller import GetSessionContextController
from agentic_platform.service.memory_gateway.api.upsert_session_controller import UpsertSessionContextController
from agentic_platform.service.memory_gateway.api.get_memory_controller import GetMemoriesController
from agentic_platform.service.memory_gateway.api.create_memory_controller import CreateMemoryController

app = FastAPI()

@app.post("/get-session-context")
async def get_session_context(request: GetSessionContextRequest) -> GetSessionContextResponse:
    """Get the session context for a given session id."""
    return GetSessionContextController.get_session_context(request)

@app.post("/upsert-session-context")
async def upsert_session_context(request: UpsertSessionContextRequest) -> UpsertSessionContextResponse:
    """Upsert the session context for a given session id."""
    return UpsertSessionContextController.upsert_session_context(request)

@app.post("/get-memories")
async def get_memories(request: GetMemoriesRequest) -> GetMemoriesResponse:
    """Get the memories for a given session id."""
    return GetMemoriesController.get_memories(request)

@app.post("/create-memory")
async def create_memory(request: CreateMemoryRequest) -> CreateMemoryResponse:
    """Create a memory for a given session id."""
    return CreateMemoryController.create_memory(request)

@app.get("/health")
async def health():
    """
    Health check endpoint for Kubernetes probes.
    """
    return {"status": "healthy"}



In [None]:
from fastapi.testclient import TestClient
test_client = TestClient(app)

Now we can use the local API to retrieve the same session and memories.

In [None]:
# Test the health endpoint
response = test_client.get("/health")
print(f"Status code: {response.status_code}")
print(f"Response: {response.json()}")

In [None]:
request = GetSessionContextRequest(session_id=session_entry.session_id)
response = test_client.post("/get-session-context", json=request.model_dump())
print(f"Status code: {response.status_code}")
print(f"Response: {response.json()}")

In [None]:
response = test_client.post("/get-memories", json=get_memory_request.model_dump())
print(f"Status code: {response.status_code}")
print(f"Response: {response.json()}")

## Use the memory gateway in the platform

As with the LLM gateway, we can also call the memory gateway deployed in the platform. We'll start again by looking up the DNS for the load balancer, and setting up an authentication token.
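Both lookups can be tried offline first. The sketch below uses made-up sample data shaped like the fields this lab reads; `find_dns_name` and `build_token_form` are hypothetical helpers, and the form fields follow the OAuth 2.0 client-credentials grant.

```python
from typing import Dict, List


def find_dns_name(load_balancers: List[Dict], marker: str = "k8s-platform") -> str:
    """Return the DNS name of the first load balancer whose name contains marker."""
    for lb in load_balancers:
        if marker in lb.get("LoadBalancerName", ""):
            return lb["DNSName"]
    # A named error is easier to diagnose than an IndexError on an empty list.
    raise LookupError(f"no load balancer with {marker!r} in its name")


def build_token_form(secret: Dict[str, str]) -> Dict[str, str]:
    """Build the form fields for a client-credentials token request."""
    return {
        "grant_type": "client_credentials",
        "client_id": secret["client_id"],
        "client_secret": secret["client_secret"],
        "scope": secret["scopes"],
    }


# Made-up sample data; real values come from AWS and Secrets Manager.
sample_lbs = [
    {"LoadBalancerName": "other-lb", "DNSName": "other.example.com"},
    {"LoadBalancerName": "k8s-platform-abc", "DNSName": "platform.example.com"},
]
sample_secret = {"client_id": "id", "client_secret": "secret", "scopes": "read write"}

print(find_dns_name(sample_lbs))
print(build_token_form(sample_secret)["grant_type"])
```

The cells that follow perform the same two steps against the deployed stack.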

In [None]:
import boto3
from typing import List, Dict

# Initialize the client
elbv2 = boto3.client('elbv2')

# List all load balancers
load_balancers: List[Dict] = elbv2.describe_load_balancers()['LoadBalancers']

# Get the DNS name of the load balancer whose name contains k8s-platform
dns_name: str = [lb['DNSName'] for lb in load_balancers if 'k8s-platform' in lb['LoadBalancerName']][0]
dns_name

In [None]:
# Get our Secret for Auth
import json
import boto3
from typing import Dict
# The secret name is your stack prefix followed by -m2m-credentials
secret_name: str = 'agent-base-rd-m2m-credentials'
secret = boto3.client('secretsmanager').get_secret_value(SecretId=secret_name)
secret_value: str = secret['SecretString']

# Parse the secret value
secret_value_dict: Dict = json.loads(secret_value)

In [None]:
import requests

def get_token():
    client_id = secret_value_dict.get('client_id')
    client_secret = secret_value_dict.get('client_secret')
    token_url = secret_value_dict.get('token_url')
    scopes = secret_value_dict.get('scopes')

    data={
        'grant_type': 'client_credentials',
        'client_id': client_id,
        'client_secret': client_secret,
        'scope': scopes
    }

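    response = requests.post(
        token_url,
        headers={'Content-Type': 'application/x-www-form-urlencoded'},
        data=data,
        timeout=5
    )
    # Fail with a clear HTTP error instead of a KeyError on a rejected request
    response.raise_for_status()

    token_data = response.json()
    # Extract the access token
    token = token_data['access_token']
    return token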

def construct_auth_header(token: str) -> str:
    return f'Bearer {token}'

m2m_token = get_token()
auth_header = construct_auth_header(m2m_token)

Let's test the health endpoint.

In [None]:
response = requests.get(
        f'http://{dns_name}/memory-gateway/health',
        timeout=5
    )
print(response.json())

Now we'll add helper methods to call the gateway through the load balancer.

In [None]:
def call_gateway_get_session(request: GetSessionContextRequest) -> GetSessionContextResponse:
    # Call the gateway
    response = requests.post(
        f'http://{dns_name}/memory-gateway/get-session-context',
        headers={'Authorization': auth_header},
        json=request.model_dump(),
        timeout=5
    )

    # Convert the response to our own type
    print(json.dumps(response.json()))
    return GetSessionContextResponse(**response.json())

def call_gateway_get_memories(request: GetMemoriesRequest) -> GetMemoriesResponse:
    # Call the gateway
    response = requests.post(
        f'http://{dns_name}/memory-gateway/get-memories',
        headers={'Authorization': auth_header},
        json=request.model_dump(),
        timeout=5
    )

    # Convert the response to our own type
    print(json.dumps(response.json()))
    return GetMemoriesResponse(**response.json())

def call_gateway_create_memory(request: CreateMemoryRequest) -> CreateMemoryResponse:
    # Call the gateway
    response = requests.post(
        f'http://{dns_name}/memory-gateway/create-memory',
        headers={'Authorization': auth_header},
        json=request.model_dump(),
        timeout=5
    )

    # Convert the response to our own type
    print(json.dumps(response.json()))
    return CreateMemoryResponse(**response.json())

def call_gateway_upsert_session(request: UpsertSessionContextRequest) -> UpsertSessionContextResponse:
    # Call the gateway
    response = requests.post(
        f'http://{dns_name}/memory-gateway/upsert-session-context',
        headers={'Authorization': auth_header},
        json=request.model_dump(),
        timeout=5
    )

    # Convert the response to our own type
    print(json.dumps(response.json()))
    return UpsertSessionContextResponse(**response.json())
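The four helpers above differ only in the path and the response type, so they could be folded into one generic function. The sketch below is one possible shape, not the platform's code: `call_gateway` is a hypothetical name, and the HTTP call is injected as a callable so the example runs without a network.

```python
from typing import Any, Callable, Dict, TypeVar

T = TypeVar("T")
# (url, headers, payload) -> decoded JSON body
PostJson = Callable[[str, Dict[str, str], Dict[str, Any]], Dict[str, Any]]


def call_gateway(post_json: PostJson, base_url: str, path: str,
                 auth_header: str, payload: Dict[str, Any],
                 response_type: Callable[..., T]) -> T:
    """POST payload to the gateway path and build response_type from the reply."""
    body = post_json(
        f"{base_url}/memory-gateway/{path}",
        {"Authorization": auth_header},
        payload,
    )
    return response_type(**body)


def fake_post(url: str, headers: Dict[str, str],
              payload: Dict[str, Any]) -> Dict[str, Any]:
    # Fake transport that echoes what it was asked to send.
    return {
        "url": url,
        "authorized": headers["Authorization"].startswith("Bearer "),
        "echo": payload,
    }


result = call_gateway(fake_post, "http://example.com", "get-memories",
                      "Bearer token", {"session_id": "s1"}, dict)
print(result["url"])
```

In the notebook, the transport could wrap `requests.post(url, headers=headers, json=payload, timeout=5).json()`, and `response_type` would be one of the response models such as `GetMemoriesResponse`.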

We'll test storing session context.

In [None]:
session_entry = SessionContext()
session_entry.add_message(Message(role='user', text='sample message'))
upsert_request : UpsertSessionContextRequest = UpsertSessionContextRequest(
    session_context=session_entry
)

In [None]:
response: UpsertSessionContextResponse = call_gateway_upsert_session(upsert_request)

In [None]:
print(response.model_dump_json(indent=2))