![Redis](https://redis.io/wp-content/uploads/2024/04/Logotype.svg?auto=webp&quality=85,75&width=120)

# Working Memory

## Introduction

This notebook demonstrates how to implement working memory, which is session-scoped data that persists across multiple turns of a conversation. Working memory stores conversation messages and task-related context, giving LLMs the knowledge they need to maintain coherent, context-aware conversations.

### Key Concepts

- **Working Memory**: Session-scoped storage for the current conversation's messages and task-specific context
- **Long-term Memory**: Cross-session knowledge (user preferences, important facts learned over time)
- **Session Scope**: Working memory is tied to a specific conversation session
- **Message History**: The sequence of user and assistant messages that form the conversation

### The Problem We're Solving

LLMs are stateless: they don't inherently remember previous messages in a conversation. Working memory addresses this by:
- Storing conversation messages so the LLM can reference earlier parts of the conversation
- Maintaining task-specific context (such as current goals or preferences mentioned in this session)
- Persisting this information across multiple turns of the conversation
- Providing a foundation for extracting important information to long-term storage

Because working memory stores messages, we can extract long-term data from it. When using the Agent Memory Server, extraction happens automatically in the background based on a configured strategy that controls what kind of information gets extracted.
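
To make the session-scoping idea concrete, here is a minimal in-process sketch. `SessionStore` and its methods are hypothetical names used only for illustration; they are not part of the Agent Memory Server or its client, which this notebook uses for the real thing.

```python
from collections import defaultdict


class SessionStore:
    """Toy, in-process stand-in for working memory (illustration only)."""

    def __init__(self):
        # Maps session_id -> ordered list of {"role", "content"} messages
        self._messages = defaultdict(list)

    def append(self, session_id, role, content):
        self._messages[session_id].append({"role": role, "content": content})

    def history(self, session_id):
        # Return a copy so callers cannot mutate stored state by accident
        return list(self._messages.get(session_id, []))


store = SessionStore()
store.append("session_001", "user", "I prefer online courses")
store.append("session_001", "assistant", "Noted: online courses.")
store.append("session_002", "user", "Hello")

print(len(store.history("session_001")))  # 2
print(len(store.history("session_002")))  # 1
```

Each session sees only its own history, which is the property the rest of this notebook relies on when it passes a `session_id` to the memory client.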

In [5]:
# Install the Redis Context Course package
import subprocess
import sys

# Install the package in development (editable) mode
package_path = "../../reference-agent"
result = subprocess.run(
    [sys.executable, "-m", "pip", "install", "-q", "-e", package_path],
    capture_output=True,
    text=True,
)
if result.returncode == 0:
    print("✅ Package installed successfully")
else:
    print(f"❌ Package installation failed: {result.stderr}")
    raise RuntimeError(f"Failed to install package: {result.stderr}")

✅ Package installed successfully


In [6]:
import os
from dotenv import load_dotenv

# Load environment variables from .env file
load_dotenv()

# Verify required environment variables are set
if not os.getenv("OPENAI_API_KEY"):
    raise ValueError(
        "OPENAI_API_KEY not found. Please create a .env file with your OpenAI API key. "
        "See SETUP.md for instructions."
    )

print("✅ Environment variables loaded")
print(f"   REDIS_URL: {os.getenv('REDIS_URL', 'redis://localhost:6379')}")
print(f"   AGENT_MEMORY_URL: {os.getenv('AGENT_MEMORY_URL', 'http://localhost:8000')}")
print(f"   OPENAI_API_KEY: {'✓ Set' if os.getenv('OPENAI_API_KEY') else '✗ Not set'}")

## 1. Working Memory Structure

Working memory contains the essential context for the current conversation:

- **Messages**: The conversation history (user and assistant messages)
- **Session ID**: Identifies this specific conversation
- **User ID**: Identifies the user across sessions
- **Task Data**: Optional task-specific context (current goals, temporary state)

This structure gives the LLM everything it needs to understand the current conversation context.
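
As a rough sketch of how these fields might turn into LLM input, the example below mirrors the list above with a plain dataclass. `WorkingMemorySketch` and `to_chat_messages` are hypothetical names, not the client library's model, and folding task data into the system prompt is just one possible design.

```python
from dataclasses import dataclass, field


@dataclass
class WorkingMemorySketch:
    """Hypothetical mirror of the fields listed above (not the client's model)."""
    session_id: str
    user_id: str
    messages: list = field(default_factory=list)  # [{"role": ..., "content": ...}]
    data: dict = field(default_factory=dict)      # optional task data


def to_chat_messages(memory, system_prompt):
    """Build the message list a chat-style LLM API would typically receive."""
    system = system_prompt
    if memory.data:
        # Fold task data into the system prompt so the model can see it
        details = "; ".join(f"{k}={v}" for k, v in sorted(memory.data.items()))
        system = f"{system_prompt}\nTask context: {details}"
    return [{"role": "system", "content": system}, *memory.messages]


memory = WorkingMemorySketch(
    session_id="session_001",
    user_id="demo_student_working_memory",
    messages=[{"role": "user", "content": "What courses do you recommend?"}],
    data={"goal": "machine learning"},
)
chat = to_chat_messages(memory, "You are a course advisor.")
print(chat[0]["content"])
```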

Let's import the memory client to work with working memory:

In [7]:
from redis_context_course import MemoryClient

print("✅ Memory server client imported successfully")

✅ Memory server client imported successfully


## 2. Storing and Retrieving Conversation Context

Let's see how working memory stores and retrieves conversation context:

In [8]:
import os
from agent_memory_client import MemoryClientConfig

# Initialize memory client for working memory
student_id = "demo_student_working_memory"
session_id = "session_001"
config = MemoryClientConfig(
    base_url=os.getenv("AGENT_MEMORY_URL", "http://localhost:8000"),
    default_namespace="redis_university"
)
memory_client = MemoryClient(config=config)

print("✅ Memory client initialized successfully")
print(f"📊 User ID: {student_id}")
print(f"📊 Session ID: {session_id}")
print("\nWorking memory will store conversation messages for this session.")

✅ Memory client initialized successfully
📊 User ID: demo_student_working_memory
📊 Session ID: session_001

Working memory will store conversation messages for this session.


In [9]:
# Simulate a conversation using working memory

print("💬 Simulating Conversation with Working Memory")
print("=" * 50)

# Create messages for the conversation
messages = [
    {"role": "user", "content": "I prefer online courses because I work part-time"},
    {"role": "assistant", "content": "I understand you prefer online courses due to your work schedule."},
    {"role": "user", "content": "My goal is to specialize in machine learning"},
    {"role": "assistant", "content": "Machine learning is an excellent specialization!"},
    {"role": "user", "content": "What courses do you recommend?"},
]

# Save to working memory
from agent_memory_client.models import WorkingMemory, MemoryMessage

# Convert messages to MemoryMessage format
memory_messages = [MemoryMessage(**msg) for msg in messages]

# Create WorkingMemory object
working_memory = WorkingMemory(
    session_id=session_id,
    user_id=student_id,
    messages=memory_messages,
    memories=[],
    data={}
)

await memory_client.put_working_memory(
    session_id=session_id,
    memory=working_memory,
    user_id=student_id,
    model_name="gpt-4o"
)

print("✅ Conversation saved to working memory")
print(f"📊 Messages: {len(messages)}")
print("\nThese messages are now available as context for the LLM.")
print("The LLM can reference earlier parts of the conversation.")

# Retrieve working memory
_, working_memory = await memory_client.get_or_create_working_memory(
    session_id=session_id,
    model_name="gpt-4o",
    user_id=student_id,
)

if working_memory:
    print(f"\n📋 Retrieved {len(working_memory.messages)} messages from working memory")
    print("This is the conversation context that would be provided to the LLM.")

💬 Simulating Conversation with Working Memory
15:01:47 httpx INFO   HTTP Request: PUT http://localhost:8000/v1/working-memory/session_001?user_id=demo_student_working_memory&model_name=gpt-4o "HTTP/1.1 500 Internal Server Error"


MemoryServerError: HTTP 500: dial tcp [::1]:8000: connect: connection refused
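
The output above shows the `PUT` request failing with a connection-refused error, which suggests the Agent Memory Server was not reachable during that run. A pre-flight check along the following lines can surface the problem before any memory calls are made. It uses only the Python standard library; the `server_is_healthy` name and the `/v1/health` path are assumptions, so adjust the path to whatever health endpoint your server version exposes.

```python
import http.client
import os
import urllib.error
import urllib.request


def server_is_healthy(base_url, path="/v1/health", timeout=2.0):
    """Return True only if GET base_url + path answers with a 2xx status.

    The default path is an assumption, not a documented guarantee.
    """
    url = base_url.rstrip("/") + path
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return 200 <= response.status < 300
    except (urllib.error.URLError, http.client.HTTPException, OSError):
        # Covers refused connections, timeouts, malformed responses, and
        # HTTP error statuses (urllib raises HTTPError for 4xx/5xx).
        return False


base_url = os.getenv("AGENT_MEMORY_URL", "http://localhost:8000")
print(f"{base_url} healthy: {server_is_healthy(base_url)}")
```

Checking over HTTP rather than with a bare TCP connection matters here: in the failed run, something answered on port 8000 with a 500 status, so a socket-level check alone could report the port as open.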


## 3. Automatic Extraction to Long-Term Memory

Because working memory stores messages, we can extract important long-term information from it. When using the Agent Memory Server, this extraction happens automatically in the background.

The extraction strategy controls what kind of information gets extracted:
- User preferences (e.g., "I prefer online courses")
- Goals (e.g., "I want to specialize in machine learning")
- Important facts (e.g., "I work part-time")
- Key decisions or outcomes from the conversation

This extracted information becomes long-term memory that persists across sessions.
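
To illustrate what an extraction strategy decides, here is a deliberately simple rule-based sketch. It is not how the Agent Memory Server extracts memories (that is configured on the server and not shown in this notebook); `RULES` and `extract_candidates` are hypothetical names, and the patterns are chosen only to match the sample conversation.

```python
import re

# Hypothetical rules: each maps a memory category to a pattern over user text
RULES = {
    "preference": re.compile(r"\bI prefer\b", re.IGNORECASE),
    "goal": re.compile(r"\bmy goal is\b|\bI want to\b", re.IGNORECASE),
}


def extract_candidates(messages, rules=RULES):
    """Return (category, text) pairs for user messages matching any rule."""
    found = []
    for message in messages:
        if message["role"] != "user":
            continue  # only the user's own statements are considered
        for category, pattern in rules.items():
            if pattern.search(message["content"]):
                found.append((category, message["content"]))
    return found


messages = [
    {"role": "user", "content": "I prefer online courses because I work part-time"},
    {"role": "assistant", "content": "I understand you prefer online courses due to your work schedule."},
    {"role": "user", "content": "My goal is to specialize in machine learning"},
    {"role": "user", "content": "What courses do you recommend?"},
]

for category, text in extract_candidates(messages):
    print(f"{category}: {text}")
```

The final question matches no rule and is left in working memory only, which is the general idea: a strategy keeps durable facts and skips transient turns.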

Let's check what information was automatically extracted from our working memory:

In [None]:
# Check what was extracted to long-term memory
import asyncio
from agent_memory_client import MemoryAPIClient as MemoryClient, MemoryClientConfig

# Ensure memory_client is defined (in case cells are run out of order)
if 'memory_client' not in globals():
    # Initialize memory client with proper config
    import os
    config = MemoryClientConfig(
        base_url=os.getenv("AGENT_MEMORY_URL", "http://localhost:8000"),
        default_namespace="redis_university"
    )
    memory_client = MemoryClient(config=config)

await asyncio.sleep(2)  # Give the extraction process time to complete

# Search for extracted memories
extracted_memories = await memory_client.search_long_term_memory(
    text="preferences goals",
    limit=10
)

print("🧠 Extracted to Long-term Memory")
print("=" * 50)

if extracted_memories.memories:
    for i, memory in enumerate(extracted_memories.memories, 1):
        print(f"{i}. {memory.text}")
        # topics may be unset on a memory record, so fall back to an empty list
        print(f"   Type: {memory.memory_type} | Topics: {', '.join(memory.topics or [])}")
        print()
else:
    print("No memories extracted yet (extraction may take a moment)")
    print("\nThe Agent Memory Server automatically extracts:")
    print("- User preferences (e.g., 'prefers online courses')")
    print("- Goals (e.g., 'wants to specialize in machine learning')")
    print("- Important facts (e.g., 'works part-time')")
    print("\nThis happens in the background based on the configured extraction strategy.")
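
Because extraction runs in the background, a fixed two-second sleep may or may not be long enough. One alternative, sketched below under the assumption that repeated searches are acceptable, is to poll a few times with a short delay. `poll_until` is a hypothetical helper, and `fake_search` stands in for the real search call so the example runs without a server.

```python
import asyncio


async def poll_until(fetch, is_ready, attempts=5, delay=1.0):
    """Call `fetch` up to `attempts` times, sleeping `delay` seconds between
    tries. Return the first result for which `is_ready` is true, or the
    last result if none qualified."""
    result = None
    for attempt in range(attempts):
        result = await fetch()
        if is_ready(result):
            return result
        if attempt < attempts - 1:
            await asyncio.sleep(delay)
    return result


async def _demo():
    calls = {"n": 0}

    async def fake_search():
        # Stand-in for a long-term memory search; empty for the first two calls
        calls["n"] += 1
        return ["prefers online courses"] if calls["n"] >= 3 else []

    found = await poll_until(fake_search, is_ready=bool, delay=0.01)
    return found, calls["n"]


demo_result = asyncio.run(_demo())
print(demo_result)
```

Inside a notebook, where an event loop is already running, call `await _demo()` instead of `asyncio.run(_demo())`. In the cell above, `fetch` could wrap the `search_long_term_memory` call and `is_ready` could check `result.memories`.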

## 4. Summary

In this notebook, you learned:

- ✅ **The Core Problem**: LLMs are stateless and need working memory to maintain conversation context
- ✅ **Working Memory Solution**: Stores messages and task-specific context for the current session
- ✅ **Message Storage**: Conversation history gives the LLM knowledge of what was said earlier
- ✅ **Automatic Extraction**: Important information is extracted to long-term memory in the background
- ✅ **Extraction Strategy**: Controls what kind of information gets extracted from working memory

**Key API Methods:**
```python
# Save working memory (stores messages for this session)
await memory_client.put_working_memory(
    session_id=session_id,
    memory=working_memory,
    user_id=student_id,
    model_name="gpt-4o",
)

# Retrieve working memory (gets conversation context)
_, working_memory = await memory_client.get_or_create_working_memory(
    session_id=session_id,
    model_name="gpt-4o",
    user_id=student_id,
)

# Search long-term memories (extracted from working memory)
results = await memory_client.search_long_term_memory(text="preferences goals", limit=10)
```

**The Key Insight:**
Working memory solves the fundamental problem of giving LLMs knowledge of the current conversation. Because it stores messages, we can also extract long-term data from it. The extraction strategy controls what gets extracted, and this happens automatically in the background when using the Agent Memory Server.

## Next Steps

See the next notebooks to learn about:
- Long-term memory and how it persists across sessions
- Memory tools that give LLMs explicit control over what gets remembered
- Integrating working and long-term memory in your applications