![Redis](https://redis.io/wp-content/uploads/2024/04/Logotype.svg?auto=webp&quality=85,75&width=120)

# 🔗 Combining Memory with Retrieved Context

**⏱️ Estimated Time:** 60-75 minutes

## 🎯 Learning Objectives

By the end of this notebook, you will:

1. **Build** a memory-enhanced RAG system that combines all four context types
2. **Demonstrate** the benefits of memory for natural conversations
3. **Convert** a simple RAG system into a LangGraph agent
4. **Prepare** for Section 4 (adding tools and advanced agent capabilities)

---

## 🔗 Bridge from Previous Notebooks

### **What You've Learned:**

**Section 1:** Four Context Types
- System Context (static instructions)
- User Context (profile, preferences)
- Conversation Context (enabled by working memory)
- Retrieved Context (RAG results)

**Section 2:** RAG Fundamentals
- Semantic search with vector embeddings
- Context assembly
- LLM generation

**Section 3 (Notebook 1):** Memory Fundamentals
- Working memory for conversation continuity
- Long-term memory for persistent knowledge
- Memory types (semantic, episodic, message)
- Memory lifecycle and persistence

### **What We'll Build:**

**Part 1:** Memory-Enhanced RAG
- Integrate working memory + long-term memory + RAG
- Show clear before/after comparisons
- Demonstrate benefits of memory systems

**Part 2:** LangGraph Agent (Separate Notebook)
- Convert memory-enhanced RAG to LangGraph agent
- Add state management and control flow
- Prepare for Section 4 (tools and advanced capabilities)

---

## 📊 The Complete Picture

### **Memory-Enhanced RAG Flow:**

```
User Query
    ↓
1. Load Working Memory (conversation history)
2. Search Long-term Memory (user preferences, facts)
3. RAG Search (relevant courses)
4. Assemble Context (System + User + Conversation + Retrieved)
5. Generate Response
6. Save Working Memory (updated conversation)
```
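
The six steps above can be sketched as a single async function. This is a minimal illustration only: `InMemoryStore`, `answer_turn`, and `echo_llm` are hypothetical stand-ins (not part of the notebook or the Agent Memory Server client), and keyword matching stands in for vector search.

```python
import asyncio
from dataclasses import dataclass, field


@dataclass
class InMemoryStore:
    """Hypothetical stand-in for the memory server and the course index."""
    sessions: dict = field(default_factory=dict)  # session_id -> list of (role, text)
    facts: dict = field(default_factory=dict)     # user_id -> list of remembered facts
    courses: list = field(default_factory=list)   # course titles


async def answer_turn(store, llm, user_id, session_id, query):
    history = store.sessions.setdefault(session_id, [])   # 1. load working memory
    memories = store.facts.get(user_id, [])               # 2. search long-term memory
    words = query.lower().split()
    retrieved = [c for c in store.courses
                 if any(w in c.lower() for w in words)]    # 3. RAG search (keyword stand-in)
    messages = [("system", "You are a course advisor.")]  # 4. assemble context
    messages += history
    messages.append(
        ("user", f"Memories: {memories}\nCourses: {retrieved}\nQuery: {query}")
    )
    reply = await llm(messages)                            # 5. generate response
    history += [("user", query), ("assistant", reply)]     # 6. save working memory
    return reply


async def echo_llm(messages):
    """Placeholder model: reports how many messages it received."""
    return f"saw {len(messages)} messages"


async def demo():
    store = InMemoryStore(courses=["Machine Learning", "Calculus I"])
    first = await answer_turn(store, echo_llm, "sarah.chen", "s1", "machine learning")
    second = await answer_turn(
        store, echo_llm, "sarah.chen", "s1", "prerequisites for the first one?"
    )
    return first, second


first_reply, second_reply = asyncio.run(demo())
```

The second turn receives two more messages than the first because step 6 saved the earlier exchange. Inside a notebook, where an event loop is already running, `await demo()` replaces the `asyncio.run(demo())` call.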

### **All Four Context Types Working Together:**

| Context Type | Source | Purpose |
|-------------|--------|---------|
| **System** | Static prompt | Role, instructions, guidelines |
| **User** | Profile + Long-term Memory | Personalization, preferences |
| **Conversation** | Working Memory | Reference resolution, continuity |
| **Retrieved** | RAG Search | Relevant courses, information |

**💡 Key Insight:** Memory transforms stateless RAG into stateful, personalized conversations.

---

## 📦 Setup and Environment

Let's set up our environment with the necessary dependencies and connections. We'll build on Section 2's RAG foundation and add memory capabilities.

### ⚠️ Prerequisites

**Before running this notebook, make sure you have:**

1. **Docker Desktop running** - Required for Redis and Agent Memory Server

2. **Environment variables** - Create a `.env` file in the project root directory:
   ```bash
   # From the project root, copy the example file (if it exists)
   cd ../..
   cp .env.example .env
   # Or create .env manually with:
   # OPENAI_API_KEY=your_actual_openai_api_key_here
   # REDIS_URL=redis://localhost:6379
   # AGENT_MEMORY_URL=http://localhost:8088
   ```

3. **Start services** - Make sure Redis and Agent Memory Server are running:
   ```bash
   # Start Redis and Agent Memory Server using docker-compose
   cd ../..
   docker-compose up -d
   ```

**Note:** Using docker-compose will:
- ✅ Start Redis on port 6379
- ✅ Start Agent Memory Server on port 8088
- ✅ Configure networking between services
- ✅ Persist data in Docker volumes

If the Memory Server is not available, the notebook will skip memory-related demos but will still run.
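
If Docker is not on the PATH, a plain TCP connection test is another way to confirm that something is listening on the two ports listed above. This is a sketch using only the standard library; `port_is_open` and `check_services` are hypothetical helper names, and an open port only shows that a process accepts connections, not that the service behind it is healthy.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def check_services(services: dict) -> dict:
    """Map each service name to whether its (host, port) accepts connections."""
    return {name: port_is_open(host, port) for name, (host, port) in services.items()}


if __name__ == "__main__":
    status = check_services({
        "Redis": ("localhost", 6379),
        "Agent Memory Server": ("localhost", 8088),
    })
    for name, ok in status.items():
        print(f"{name}: {'reachable' if ok else 'not reachable'}")
```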


---


### Automated Setup Check

Let's check that the required Docker containers are running.


In [1]:
# Check if services are running
import subprocess
import sys
from pathlib import Path

print("Checking if required services are running...\n")

# Check if Redis is running
try:
    result = subprocess.run(
        ["docker", "ps", "--filter", "name=redis", "--format", "{{.Names}}"],
        capture_output=True,
        text=True,
        timeout=5
    )
    if "redis" in result.stdout:
        print("✅ Redis is running")
    else:
        print("⚠️  Redis is not running. Start it with: docker-compose up -d")
except Exception as e:
    print(f"⚠️  Could not check Redis status: {e}")

# Check if Agent Memory Server is running
try:
    result = subprocess.run(
        ["docker", "ps", "--filter", "name=agent-memory", "--format", "{{.Names}}"],
        capture_output=True,
        text=True,
        timeout=5
    )
    if "agent-memory" in result.stdout or "memory" in result.stdout:
        print("✅ Agent Memory Server is running")
    else:
        print("⚠️  Agent Memory Server is not running. Start it with: docker-compose up -d")
except Exception as e:
    print(f"⚠️  Could not check Agent Memory Server status: {e}")

print("\nIf services are not running, start them with:")
print("  cd ../..")
print("  docker-compose up -d")

Checking if required services are running...

✅ Redis is running
✅ Agent Memory Server is running

If services are not running, start them with:
  cd ../..
  docker-compose up -d


---


### Install Dependencies

If you haven't already installed the reference-agent package, uncomment and run the following:


In [2]:
# Uncomment to install the project package
# %pip install -q -e ../..

# Uncomment to install agent-memory-client
# %pip install -q agent-memory-client

### Load Environment Variables

We'll load environment variables from the `.env` file in the `reference-agent` directory.

**Required variables:**
- `OPENAI_API_KEY` - Your OpenAI API key
- `REDIS_URL` - Redis connection URL (default: redis://localhost:6379)
- `AGENT_MEMORY_URL` - Agent Memory Server URL (default: http://localhost:8088)

If you haven't created the `.env` file yet, copy `.env.example` and add your OpenAI API key.


In [3]:
import os
from pathlib import Path

from dotenv import load_dotenv

# Load environment variables from project root directory
env_path = Path("../../.env")
load_dotenv(dotenv_path=env_path)

# Verify required environment variables
OPENAI_API_KEY = os.getenv("OPENAI_API_KEY")
REDIS_URL = os.getenv("REDIS_URL", "redis://localhost:6379")
AGENT_MEMORY_URL = os.getenv("AGENT_MEMORY_URL", "http://localhost:8088")

if not OPENAI_API_KEY:
    print(
        f"""❌ OPENAI_API_KEY not found!

    Please create a .env file at: {env_path.absolute()}

    With the following content:
    OPENAI_API_KEY=your_openai_api_key
    REDIS_URL=redis://localhost:6379
    AGENT_MEMORY_URL=http://localhost:8088
    """
    )
else:
    print("✅ Environment variables loaded")
    print(f"   REDIS_URL: {REDIS_URL}")
    print(f"   AGENT_MEMORY_URL: {AGENT_MEMORY_URL}")

✅ Environment variables loaded
   REDIS_URL: redis://localhost:6379
   AGENT_MEMORY_URL: http://localhost:8088


### Import Core Libraries

We'll import standard Python libraries and async support for our memory operations.


In [4]:
import asyncio
import sys
from datetime import datetime
from typing import Any, Dict, List, Optional

print("✅ Core libraries imported")

✅ Core libraries imported


### Import Section 2 Components

We're building on Section 2's RAG foundation, so we'll reuse the same components:
- `redis_config` - Redis connection and configuration
- `CourseManager` - Course search and management
- `StudentProfile` and other models - Data structures


In [5]:
from redis_context_course.course_manager import CourseManager
from redis_context_course.models import (
    Course,
    CourseFormat,
    DifficultyLevel,
    Semester,
    StudentProfile,
)

# Import Section 2 components
from redis_context_course.redis_config import redis_config

print("✅ Section 2 components imported")
print(f"   CourseManager: Available")
print(f"   Redis Config: Available")
print(f"   Models: Course, StudentProfile, etc.")

✅ Section 2 components imported
   CourseManager: Available
   Redis Config: Available
   Models: Course, StudentProfile, etc.


### Import LangChain Components

We'll use LangChain for LLM interaction and message handling.


In [6]:
from langchain_core.messages import AIMessage, HumanMessage, SystemMessage
from langchain_openai import ChatOpenAI

print("✅ LangChain components imported")
print(f"   ChatOpenAI: Available")
print(f"   Message types: HumanMessage, SystemMessage, AIMessage")

✅ LangChain components imported
   ChatOpenAI: Available
   Message types: HumanMessage, SystemMessage, AIMessage


### Import Agent Memory Server Client

The Agent Memory Server provides production-ready memory management. If it's not available, we'll note that and continue with limited functionality.


In [7]:
# Import Agent Memory Server client
try:
    from agent_memory_client import MemoryAPIClient, MemoryClientConfig
    from agent_memory_client.models import (
        ClientMemoryRecord,
        MemoryMessage,
        WorkingMemory,
    )

    MEMORY_SERVER_AVAILABLE = True
    print("✅ Agent Memory Server client available")
    print("   MemoryAPIClient: Ready")
    print("   Memory models: WorkingMemory, MemoryMessage, ClientMemoryRecord")
except ImportError:
    MEMORY_SERVER_AVAILABLE = False
    print("⚠️  Agent Memory Server not available")
    print("   Install with: pip install agent-memory-client")
    print("   Start server: See reference-agent/README.md")
    print("   Note: Some demos will be skipped")

✅ Agent Memory Server client available
   MemoryAPIClient: Ready
   Memory models: WorkingMemory, MemoryMessage, ClientMemoryRecord


### Environment Summary

Let's verify everything is set up correctly.


In [8]:
print("=" * 80)
print("🔧 ENVIRONMENT SETUP SUMMARY")
print("=" * 80)
print(f"\n✅ Core Libraries: Imported")
print(f"✅ Section 2 Components: Imported")
print(f"✅ LangChain: Imported")
print(
    f"{'✅' if MEMORY_SERVER_AVAILABLE else '⚠️ '} Agent Memory Server: {'Available' if MEMORY_SERVER_AVAILABLE else 'Not Available'}"
)
print(f"\n📋 Configuration:")
print(f"   OPENAI_API_KEY: {'✓ Set' if OPENAI_API_KEY else '✗ Not set'}")
print(f"   REDIS_URL: {REDIS_URL}")
print(f"   AGENT_MEMORY_URL: {AGENT_MEMORY_URL}")
print("=" * 80)

🔧 ENVIRONMENT SETUP SUMMARY

✅ Core Libraries: Imported
✅ Section 2 Components: Imported
✅ LangChain: Imported
✅ Agent Memory Server: Available

📋 Configuration:
   OPENAI_API_KEY: ✓ Set
   REDIS_URL: redis://localhost:6379
   AGENT_MEMORY_URL: http://localhost:8088


---

## 🔧 Initialize Components

Now let's initialize the components we'll use throughout this notebook.


### Initialize Course Manager

The `CourseManager` handles course search and retrieval, just like in Section 2.


In [9]:
# Initialize Course Manager
course_manager = CourseManager()

print("✅ Course Manager initialized")
print("   Ready to search and retrieve courses")

14:50:06 redisvl.index.index INFO   Index already exists, not overwriting.


✅ Course Manager initialized
   Ready to search and retrieve courses


### Initialize LLM

We'll use GPT-4o with temperature=0.0 for consistent, near-deterministic responses (identical inputs can still occasionally produce slightly different outputs).


In [10]:
# Initialize LLM
llm = ChatOpenAI(model="gpt-4o", temperature=0.0)

print("✅ LLM initialized")
print("   Model: gpt-4o")
print("   Temperature: 0.0 (deterministic)")

✅ LLM initialized
   Model: gpt-4o
   Temperature: 0.0 (deterministic)


### Initialize Memory Client

If the Agent Memory Server is available, we'll initialize the memory client. This client handles both working memory (conversation history) and long-term memory (persistent facts).


In [11]:
# Initialize Memory Client
if MEMORY_SERVER_AVAILABLE:
    config = MemoryClientConfig(
        base_url=AGENT_MEMORY_URL, default_namespace="redis_university"
    )
    memory_client = MemoryAPIClient(config=config)
    print("✅ Memory Client initialized")
    print(f"   Base URL: {config.base_url}")
    print(f"   Namespace: {config.default_namespace}")
    print("   Ready for working memory and long-term memory operations")
else:
    memory_client = None
    print("⚠️  Memory Server not available")
    print("   Running with limited functionality")
    print("   Some demos will be skipped")

✅ Memory Client initialized
   Base URL: http://localhost:8088
   Namespace: redis_university
   Ready for working memory and long-term memory operations


### Create Sample Student Profile

We'll create a sample student profile to use throughout our demos. This follows the same pattern from Section 2.


In [12]:
# Create sample student profile
sarah = StudentProfile(
    name="Sarah Chen",
    email="sarah.chen@university.edu",
    major="Computer Science",
    year=2,
    interests=["machine learning", "data science", "algorithms"],
    completed_courses=["Introduction to Programming", "Data Structures"],
    current_courses=["Linear Algebra"],
    preferred_format=CourseFormat.ONLINE,
    preferred_difficulty=DifficultyLevel.INTERMEDIATE,
)

print("✅ Student profile created")
print(f"   Name: {sarah.name}")
print(f"   Major: {sarah.major}")
print(f"   Year: {sarah.year}")
print(f"   Interests: {', '.join(sarah.interests)}")
print(f"   Completed: {', '.join(sarah.completed_courses)}")
print(f"   Preferred Format: {sarah.preferred_format.value}")

✅ Student profile created
   Name: Sarah Chen
   Major: Computer Science
   Year: 2
   Interests: machine learning, data science, algorithms
   Completed: Introduction to Programming, Data Structures
   Preferred Format: online


### 💡 Key Insight

We're reusing:
- ✅ **Same `CourseManager`** from Section 2
- ✅ **Same `StudentProfile`** model
- ✅ **Same Redis configuration**

We're adding:
- ✨ **Memory Client** for conversation history
- ✨ **Working Memory** for session context
- ✨ **Long-term Memory** for persistent knowledge

---

## 📚 Part 1: Memory-Enhanced RAG

### **Goal:** Build a simple, inline memory-enhanced RAG system that demonstrates the benefits of memory.

### **Approach:**
- Start with Section 2's stateless RAG
- Add working memory for conversation continuity
- Add long-term memory for personalization
- Show clear before/after comparisons

---

## 🚫 Before: Stateless RAG (Section 2 Approach)

Let's first recall how Section 2's stateless RAG worked, and see its limitations.


### Query 1: Initial query (works fine)


In [13]:
print("=" * 80)
print("🚫 STATELESS RAG DEMO")
print("=" * 80)

stateless_query_1 = "I'm interested in machine learning courses"
print(f"\n👤 User: {stateless_query_1}\n\n")

# Search courses
stateless_courses_1 = await course_manager.search_courses(stateless_query_1, limit=3)

# Assemble context (System + User + Retrieved only - NO conversation history)
stateless_system_prompt = """You are a Redis University course advisor.

CRITICAL RULES:
- ONLY discuss and recommend courses from the "Relevant Courses" list provided below
- Do NOT mention, suggest, or make up any courses that are not in the provided list
- If the available courses don't perfectly match the request, recommend the best options from what IS available"""

stateless_user_context = f"""Student: {sarah.name}
Major: {sarah.major}
Interests: {', '.join(sarah.interests)}
Completed: {', '.join(sarah.completed_courses)}
"""

stateless_retrieved_context = "Relevant Courses:\n"
for i, course in enumerate(stateless_courses_1, 1):
    stateless_retrieved_context += f"\n{i}. {course.title}"
    stateless_retrieved_context += f"\n   Description: {course.description}"
    stateless_retrieved_context += f"\n   Difficulty: {course.difficulty_level.value}"

# Generate response
stateless_messages_1 = [
    SystemMessage(content=stateless_system_prompt),
    HumanMessage(
        content=f"{stateless_user_context}\n\n{stateless_retrieved_context}\n\nQuery: {stateless_query_1}"
    ),
]

stateless_response_1 = llm.invoke(stateless_messages_1).content
print(f"\n🤖 Agent: {stateless_response_1}")

# ❌ No conversation history stored
# ❌ Next query won't remember this interaction

🚫 STATELESS RAG DEMO

👤 User: I'm interested in machine learning courses




14:50:08 httpx INFO   HTTP Request: POST https://api.openai.com/v1/embeddings "HTTP/1.1 200 OK"


14:50:10 httpx INFO   HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"



🤖 Agent: Based on your interest in machine learning and your background in computer science, I recommend the "Machine Learning" course from the list. This course will introduce you to machine learning algorithms and applications, including supervised and unsupervised learning and neural networks. It is an advanced course, so it will build on your existing knowledge from your completed courses in programming and data structures. Additionally, you might consider taking "Linear Algebra" as it is essential for understanding many concepts in data science and machine learning.


### Query 2: Follow-up with pronoun reference (fails)

Now let's try a follow-up that requires conversation history.


In [14]:
stateless_query_2 = "What are the prerequisites for the first one?"
print(f"👤 User: {stateless_query_2}")
print(f"   Note: 'the first one' refers to the first course from Query 1\n\n")

# Search courses (embeds the raw follow-up text, which never mentions machine learning - not helpful)
stateless_courses_2 = await course_manager.search_courses(stateless_query_2, limit=3)

# Assemble context (NO conversation history from Query 1)
stateless_retrieved_context_2 = "Relevant Courses:\n"
for i, course in enumerate(stateless_courses_2, 1):
    stateless_retrieved_context_2 += f"\n{i}. {course.title}"
    stateless_retrieved_context_2 += f"\n   Description: {course.description}"
    stateless_retrieved_context_2 += f"\n   Difficulty: {course.difficulty_level.value}"

# Generate response
stateless_messages_2 = [
    SystemMessage(content=stateless_system_prompt),
    HumanMessage(
        content=f"{stateless_user_context}\n\n{stateless_retrieved_context_2}\n\nQuery: {stateless_query_2}"
    ),
]

stateless_response_2 = llm.invoke(stateless_messages_2).content
print(f"\n🤖 Agent: {stateless_response_2}")
print("\n❌ Agent can't resolve 'the first one' - no conversation history!")

👤 User: What are the prerequisites for the first one?
   Note: 'the first one' refers to the first course from Query 1




14:50:10 httpx INFO   HTTP Request: POST https://api.openai.com/v1/embeddings "HTTP/1.1 200 OK"


14:50:13 httpx INFO   HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"



🤖 Agent: The course list provided only includes "Calculus I" courses, and they all have the same description and difficulty level. Typically, prerequisites for a Calculus I course might include a solid understanding of pre-calculus topics such as algebra and trigonometry. However, since the list does not specify prerequisites, I recommend checking with the course provider or institution for specific requirements. If Sarah is interested in machine learning, data science, and algorithms, a strong foundation in calculus can be beneficial, so taking Calculus I could be a good step forward.

❌ Agent can't resolve 'the first one' - no conversation history!




### 🎯 What Just Happened?

**Query 1:** "I'm interested in machine learning courses"
- ✅ Works fine - searches and returns ML courses

**Query 2:** "What are the prerequisites for **the first one**?"
- ❌ **Fails** - Agent doesn't know what "the first one" refers to
- ❌ Retrieval misses too - the follow-up text alone matches unrelated "Calculus I" courses
- ❌ No conversation history stored
- ❌ Each query is completely independent

**The Problem:** Natural conversation requires context from previous turns.

---

## ✅ After: Memory-Enhanced RAG

Now let's add memory to enable natural conversations.

### **Step 1: Load Working Memory**

Working memory stores conversation history for the current session.


In [15]:
# Set up session and student identifiers
session_id = "demo_session_001"
student_id = sarah.email.split("@")[0]

# Load working memory
if MEMORY_SERVER_AVAILABLE:
    _, working_memory = await memory_client.get_or_create_working_memory(
        session_id=session_id, user_id=student_id, model_name="gpt-4o"
    )

    print(f"✅ Loaded working memory for session: {session_id}")
    print(f"   Messages: {len(working_memory.messages)}")
else:
    print("⚠️  Memory Server not available")

14:50:13 httpx INFO   HTTP Request: GET http://localhost:8088/v1/working-memory/demo_session_001?user_id=sarah.chen&namespace=redis_university&model_name=gpt-4o "HTTP/1.1 200 OK"


✅ Loaded working memory for session: demo_session_001
   Messages: 8


### 🎯 What We Just Did

**Loaded Working Memory:**
- Created or retrieved conversation history for this session
- Session ID: `demo_session_001` (unique per conversation)
- User ID: `sarah.chen` (the part of the student email before `@`)

**Why This Matters:**
- Working memory persists across turns in the same session
- Enables reference resolution ("it", "that course", "the first one")
- Conversation context is maintained
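
The get-or-create pattern behind working memory can be illustrated without a server. The sketch below is an in-process toy: `SessionMemory` and `WorkingMemoryStore` are hypothetical names, not the Agent Memory Server API. It returns a tuple to mirror the shape the notebook unpacks with `_, working_memory = ...`; treating the first element as a "created" flag is an assumption.

```python
from dataclasses import dataclass, field


@dataclass
class SessionMemory:
    """Hypothetical stand-in for one session's working memory."""
    session_id: str
    user_id: str
    messages: list = field(default_factory=list)  # dicts with "role" and "content"


class WorkingMemoryStore:
    """Toy in-process store standing in for the memory server."""

    def __init__(self):
        self._sessions = {}

    def get_or_create(self, session_id: str, user_id: str):
        """Return (created, memory), creating an empty session on first use."""
        created = session_id not in self._sessions
        if created:
            self._sessions[session_id] = SessionMemory(session_id, user_id)
        return created, self._sessions[session_id]


def user_id_from_email(email: str) -> str:
    """Same derivation as the notebook: the part of the address before '@'."""
    return email.split("@")[0]


store = WorkingMemoryStore()
uid = user_id_from_email("sarah.chen@university.edu")

created, memory = store.get_or_create("demo_session_001", uid)
memory.messages.append(
    {"role": "user", "content": "I'm interested in machine learning courses"}
)

# A later turn in the same session sees the earlier message.
created_again, same_memory = store.get_or_create("demo_session_001", uid)
```

Because the session ID is reused, the second lookup returns the existing history rather than an empty one, which is what makes a phrase like "the first one" resolvable on the next turn.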

---

### **Step 2: Search Long-term Memory**

Long-term memory stores persistent facts and preferences across sessions.


In [16]:
# Search long-term memory
longterm_query = "What does the student prefer?"

if MEMORY_SERVER_AVAILABLE:
    from agent_memory_client.filters import UserId

    longterm_results = await memory_client.search_long_term_memory(
        text=longterm_query, user_id=UserId(eq=student_id), limit=5
    )

    longterm_memories = (
        [m.text for m in longterm_results.memories] if longterm_results.memories else []
    )

    print(f"🔍 Query: '{longterm_query}'")
    print(f"📚 Found {len(longterm_memories)} relevant memories:")
    for i, memory in enumerate(longterm_memories, 1):
        print(f"   {i}. {memory}")
else:
    longterm_memories = []
    print("⚠️  Memory Server not available")

14:50:13 httpx INFO   HTTP Request: POST http://localhost:8088/v1/long-term-memory/search?optimize_query=false "HTTP/1.1 200 OK"


🔍 Query: 'What does the student prefer?'
📚 Found 5 relevant memories:
   1. User is interested in machine learning courses.
   2. User asked about the availability of machine learning courses online, indicating an interest in flexible learning options.
   3. User has interests in data science and algorithms
   4. User is interested in machine learning courses.
   5. Data Structures and Algorithms (CS301) could be a good fit for User


### 🎯 What We Just Did

**Searched Long-term Memory:**
- Used semantic search to find relevant facts
- Query: "What does the student prefer?"
- Results: Memories about preferences, goals, academic info

**Why This Matters:**
- Long-term memory enables personalization
- Facts persist across sessions (days, weeks, months)
- Semantic search finds relevant memories without exact keyword matching
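
The sample output above lists the same memory twice (results 1 and 4). One option, sketched here as an assumption rather than something the notebook or the memory client does, is to drop repeated texts before they go into the prompt. `dedupe_memories` is a hypothetical helper.

```python
def dedupe_memories(memories):
    """Drop repeated memory texts, keeping first occurrences in ranked order."""
    seen = set()
    unique = []
    for text in memories:
        # Normalize case, spacing, and a trailing period before comparing.
        key = " ".join(text.lower().split()).rstrip(".")
        if key not in seen:
            seen.add(key)
            unique.append(text)
    return unique


results = [
    "User is interested in machine learning courses.",
    "User has interests in data science and algorithms",
    "User is interested in machine learning courses.",
]
unique_results = dedupe_memories(results)
```

This only catches near-identical strings; memories that say the same thing in different words would still both appear.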

---

### **Step 3: Assemble All Four Context Types**

Now let's combine everything: System + User + Conversation + Retrieved.


#### 3.1: System Context (static)


In [17]:
# 1. System Context (static)
context_system_prompt = """You are a Redis University course advisor.

Your role:
- Help students find and enroll in courses from our catalog
- Provide personalized recommendations based on available courses
- Answer questions about courses, prerequisites, schedules

CRITICAL RULES - READ CAREFULLY:
- You can ONLY recommend courses that appear in the "Relevant Courses" list below
- Do NOT suggest courses that are not in the "Relevant Courses" list
- Do NOT say things like "you might want to consider X course" if X is not in the list
- Do NOT mention courses from other platforms or external resources
- If the available courses don't perfectly match the request, recommend the best options from what IS in the list
- Use conversation history to resolve references ("it", "that course", "the first one")
- Use long-term memories to personalize your recommendations
- Be helpful, supportive, and encouraging while staying within the available courses"""

print("✅ System Context created")
print(f"   Length: {len(context_system_prompt)} chars")

✅ System Context created
   Length: 927 chars


#### 3.2: User Context (profile + long-term memories)


In [18]:
# 2. User Context (profile + long-term memories)
context_user_context = f"""Student Profile:
- Name: {sarah.name}
- Major: {sarah.major}
- Year: {sarah.year}
- Interests: {', '.join(sarah.interests)}
- Completed: {', '.join(sarah.completed_courses)}
- Current: {', '.join(sarah.current_courses)}
- Preferred Format: {sarah.preferred_format.value}
- Preferred Difficulty: {sarah.preferred_difficulty.value}"""

# Search long-term memory for this query
context_query = "machine learning courses"

if MEMORY_SERVER_AVAILABLE:
    from agent_memory_client.filters import UserId

    context_longterm_results = await memory_client.search_long_term_memory(
        text=context_query, user_id=UserId(eq=student_id), limit=5
    )
    context_longterm_memories = (
        [m.text for m in context_longterm_results.memories]
        if context_longterm_results.memories
        else []
    )

    if context_longterm_memories:
        context_user_context += f"\n\nLong-term Memories:\n" + "\n".join(
            [f"- {m}" for m in context_longterm_memories]
        )

print("✅ User Context created")
print(f"   Length: {len(context_user_context)} chars")

14:50:13 httpx INFO   HTTP Request: POST http://localhost:8088/v1/long-term-memory/search?optimize_query=false "HTTP/1.1 200 OK"


✅ User Context created
   Length: 682 chars


#### 3.3: Conversation Context (working memory)


In [19]:
# 3. Conversation Context (working memory)
if MEMORY_SERVER_AVAILABLE:
    _, context_working_memory = await memory_client.get_or_create_working_memory(
        session_id=session_id, user_id=student_id, model_name="gpt-4o"
    )

    context_conversation_messages = []
    for msg in context_working_memory.messages:
        if msg.role == "user":
            context_conversation_messages.append(HumanMessage(content=msg.content))
        elif msg.role == "assistant":
            context_conversation_messages.append(AIMessage(content=msg.content))

    print("✅ Conversation Context loaded")
    print(f"   Messages: {len(context_conversation_messages)}")
else:
    context_conversation_messages = []

14:50:13 httpx INFO   HTTP Request: GET http://localhost:8088/v1/working-memory/demo_session_001?user_id=sarah.chen&namespace=redis_university&model_name=gpt-4o "HTTP/1.1 200 OK"


✅ Conversation Context loaded
   Messages: 8


#### 3.4: Retrieved Context (RAG)


In [20]:
# 4. Retrieved Context (RAG)
context_courses = await course_manager.search_courses(context_query, limit=3)

context_retrieved_context = "Relevant Courses:\n"
for i, course in enumerate(context_courses, 1):
    context_retrieved_context += f"\n{i}. {course.title}"
    context_retrieved_context += f"\n   Description: {course.description}"
    context_retrieved_context += f"\n   Difficulty: {course.difficulty_level.value}"
    context_retrieved_context += f"\n   Format: {course.format.value}"
    if course.prerequisites:
        prereq_names = [p.course_title for p in course.prerequisites]
        context_retrieved_context += f"\n   Prerequisites: {', '.join(prereq_names)}"

print("✅ Retrieved Context created")
print(f"   Length: {len(context_retrieved_context)} chars")

14:50:13 httpx INFO   HTTP Request: POST https://api.openai.com/v1/embeddings "HTTP/1.1 200 OK"


✅ Retrieved Context created
   Length: 596 chars
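
The course-formatting loop appears in several cells of this notebook. A small helper could hold that logic in one place. The sketch below uses a hypothetical `CourseStub` dataclass with plain strings in place of the notebook's `Course` model and its enums, so field names here are illustrative assumptions.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class CourseStub:
    """Hypothetical stand-in for the Course model (plain strings, no enums)."""
    title: str
    description: str
    difficulty: str
    prerequisites: List[str] = field(default_factory=list)


def format_courses(courses: List[CourseStub]) -> str:
    """Render courses as the numbered 'Relevant Courses' block used in the prompts."""
    lines = ["Relevant Courses:"]
    for i, course in enumerate(courses, 1):
        lines.append("")  # blank line between entries
        lines.append(f"{i}. {course.title}")
        lines.append(f"   Description: {course.description}")
        lines.append(f"   Difficulty: {course.difficulty}")
        if course.prerequisites:
            lines.append(f"   Prerequisites: {', '.join(course.prerequisites)}")
    return "\n".join(lines)


sample_block = format_courses(
    [CourseStub("Machine Learning", "Intro to ML", "advanced", ["Linear Algebra"])]
)
```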


#### Summary: All Four Context Types


In [21]:
print("=" * 80)
print("📊 ASSEMBLED CONTEXT")
print("=" * 80)
print(f"\n1️⃣ System Context: {len(context_system_prompt)} chars")
print(f"2️⃣ User Context: {len(context_user_context)} chars")
print(f"3️⃣ Conversation Context: {len(context_conversation_messages)} messages")
print(f"4️⃣ Retrieved Context: {len(context_retrieved_context)} chars")

📊 ASSEMBLED CONTEXT

1️⃣ System Context: 927 chars
2️⃣ User Context: 682 chars
3️⃣ Conversation Context: 8 messages
4️⃣ Retrieved Context: 596 chars


### 🎯 What We Just Did

**Assembled All Four Context Types:**

1. **System Context** - Role, instructions, guidelines (static)
2. **User Context** - Profile + long-term memories (dynamic, user-specific)
3. **Conversation Context** - Working memory messages (dynamic, session-specific)
4. **Retrieved Context** - RAG search results (dynamic, query-specific)

**Why This Matters:**
- All four context types from Section 1 are now working together
- System knows WHO the user is (User Context)
- System knows WHAT was discussed (Conversation Context)
- System knows WHAT's relevant (Retrieved Context)
- System knows HOW to behave (System Context)
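
The ordering of the four context types in the final prompt can be sketched as one function. Plain role/content dictionaries stand in for the LangChain message classes here, and `build_messages` is a hypothetical name: system prompt first, then conversation history, then a final user message that bundles user context, retrieved context, and the query.

```python
from typing import Dict, List


def build_messages(
    system_prompt: str,
    history: List[Dict[str, str]],
    user_context: str,
    retrieved_context: str,
    query: str,
) -> List[Dict[str, str]]:
    """Order the four context types into a single chat message list."""
    messages = [{"role": "system", "content": system_prompt}]  # System Context
    for msg in history:                                        # Conversation Context
        if msg["role"] in ("user", "assistant"):               # skip any other roles
            messages.append({"role": msg["role"], "content": msg["content"]})
    # User Context + Retrieved Context + the current query share one message.
    final = f"{user_context}\n\n{retrieved_context}\n\nQuery: {query}"
    messages.append({"role": "user", "content": final})
    return messages


example = build_messages(
    "You are a Redis University course advisor.",
    [
        {"role": "user", "content": "I'm interested in machine learning courses"},
        {"role": "assistant", "content": "I recommend the Machine Learning course."},
        {"role": "tool", "content": "ignored"},
    ],
    "Student Profile:\n- Name: Sarah Chen",
    "Relevant Courses:\n\n1. Machine Learning",
    "What are the prerequisites for the first one?",
)
```

Placing history between the system prompt and the final message keeps earlier turns in chronological order, so the model reads the prior recommendation before the follow-up question.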

---

### **Step 4: Generate Response and Save Memory**

Now let's put it all together: generate a response and save the conversation.


#### 4.1: Set up the query


In [22]:
test_query = "I'm interested in machine learning courses"
print(f"👤 User: {test_query}")

👤 User: I'm interested in machine learning courses


#### 4.2: Assemble all context types

We'll reuse the context assembly logic from Step 3.


In [23]:
if MEMORY_SERVER_AVAILABLE:
    # Load working memory
    _, test_working_memory = await memory_client.get_or_create_working_memory(
        session_id=session_id, user_id=student_id, model_name="gpt-4o"
    )

    # Build conversation messages
    test_conversation_messages = []
    for msg in test_working_memory.messages:
        if msg.role == "user":
            test_conversation_messages.append(HumanMessage(content=msg.content))
        elif msg.role == "assistant":
            test_conversation_messages.append(AIMessage(content=msg.content))

    # Search for courses
    test_courses = await course_manager.search_courses(test_query, limit=3)

    # Build retrieved context
    test_retrieved_context = "Relevant Courses:\n"
    for i, course in enumerate(test_courses, 1):
        test_retrieved_context += f"\n{i}. {course.title}"
        test_retrieved_context += f"\n   Description: {course.description}"
        test_retrieved_context += f"\n   Difficulty: {course.difficulty_level.value}"
        if course.prerequisites:
            prereq_names = [p.course_title for p in course.prerequisites]
            test_retrieved_context += f"\n   Prerequisites: {', '.join(prereq_names)}"

    print("✅ Context assembled")

14:50:14 httpx INFO   HTTP Request: GET http://localhost:8088/v1/working-memory/demo_session_001?user_id=sarah.chen&namespace=redis_university&model_name=gpt-4o "HTTP/1.1 200 OK"


14:50:14 httpx INFO   HTTP Request: POST https://api.openai.com/v1/embeddings "HTTP/1.1 200 OK"


✅ Context assembled


#### 4.3: Build messages and generate response


In [24]:
if MEMORY_SERVER_AVAILABLE:
    # Build complete message list
    test_messages = [SystemMessage(content=context_system_prompt)]
    test_messages.extend(test_conversation_messages)  # Add conversation history
    test_messages.append(
        HumanMessage(
            content=f"{context_user_context}\n\n{test_retrieved_context}\n\nQuery: {test_query}"
        )
    )

    # Generate response using LLM
    test_response = llm.invoke(test_messages).content

    print(f"\n🤖 Agent: {test_response}")

14:50:16 httpx INFO   HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"



🤖 Agent: Hi Sarah! I see you're interested in machine learning courses. Currently, we have an advanced Machine Learning course available. It covers topics such as supervised and unsupervised learning, as well as neural networks. Given your background in computer science and your current study of Linear Algebra, you might find this course challenging but rewarding.

While it is at an advanced level, your strong foundation in programming and data structures, along with your ongoing study of Linear Algebra, could help you tackle the material. If you're ready to take on the challenge, this course could be a great way to deepen your understanding of machine learning.

Let me know if you have any questions or need more information about the course!


#### 4.4: Save to working memory


In [25]:
if MEMORY_SERVER_AVAILABLE:
    # Add messages to working memory
    test_working_memory.messages.extend(
        [
            MemoryMessage(role="user", content=test_query),
            MemoryMessage(role="assistant", content=test_response),
        ]
    )

    # Save to Memory Server
    await memory_client.put_working_memory(
        session_id=session_id,
        memory=test_working_memory,
        user_id=student_id,
        model_name="gpt-4o",
    )

    print("\n✅ Conversation saved to working memory")
    print(f"   Total messages: {len(test_working_memory.messages)}")

14:50:17 httpx INFO   HTTP Request: PUT http://localhost:8088/v1/working-memory/demo_session_001?user_id=sarah.chen&model_name=gpt-4o "HTTP/1.1 200 OK"



✅ Conversation saved to working memory
   Total messages: 10


#### Helper function for the demo

For the complete demo below, we'll use a helper function that combines all these steps.


In [26]:
# Helper function for demo (combines all steps above)


async def generate_and_save(
    user_query: str, student_profile: StudentProfile, session_id: str, top_k: int = 3
) -> str:
    """Generate response and save to working memory"""

    if not MEMORY_SERVER_AVAILABLE:
        return "⚠️ Memory Server not available"

    student_id = student_profile.email.split("@")[0]

    # Load working memory
    _, working_memory = await memory_client.get_or_create_working_memory(
        session_id=session_id, user_id=student_id, model_name="gpt-4o"
    )

    # Build conversation messages
    conversation_messages = []
    for msg in working_memory.messages:
        if msg.role == "user":
            conversation_messages.append(HumanMessage(content=msg.content))
        elif msg.role == "assistant":
            conversation_messages.append(AIMessage(content=msg.content))

    # Search courses
    courses = await course_manager.search_courses(user_query, limit=top_k)

    # Build retrieved context
    retrieved_context = "Relevant Courses:\n"
    for i, course in enumerate(courses, 1):
        retrieved_context += f"\n{i}. {course.title}"
        retrieved_context += f"\n   Description: {course.description}"
        retrieved_context += f"\n   Difficulty: {course.difficulty_level.value}"
        if course.prerequisites:
            prereq_names = [p.course_title for p in course.prerequisites]
            retrieved_context += f"\n   Prerequisites: {', '.join(prereq_names)}"

    # Build messages
    messages = [SystemMessage(content=context_system_prompt)]
    messages.extend(conversation_messages)
    messages.append(
        HumanMessage(
            content=f"{context_user_context}\n\n{retrieved_context}\n\nQuery: {user_query}"
        )
    )

    # Generate response
    response = llm.invoke(messages).content

    # Save to working memory
    working_memory.messages.extend(
        [
            MemoryMessage(role="user", content=user_query),
            MemoryMessage(role="assistant", content=response),
        ]
    )
    await memory_client.put_working_memory(
        session_id=session_id,
        memory=working_memory,
        user_id=student_id,
        model_name="gpt-4o",
    )

    return response


print("✅ Helper function created for demo")

✅ Helper function created for demo


### 🎯 What We Just Did

**Generated Response:**
- Assembled all four context types
- Built message list with conversation history
- Generated response using LLM
- **Saved updated conversation to working memory**

**Why This Matters:**
- Next query will have access to this conversation
- Reference resolution will work ("it", "that course")
- Conversation continuity is maintained
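
The ordering of the assembled prompt can be sketched without any LangChain or Memory Server dependency. This is a minimal illustration, assuming plain dictionaries in place of `SystemMessage`, `HumanMessage`, and `AIMessage`; the function name `build_messages` and the sample strings are hypothetical and not part of the notebook.

```python
from typing import Dict, List

Message = Dict[str, str]


def build_messages(
    system_prompt: str,
    history: List[Message],
    user_context: str,
    retrieved_context: str,
    query: str,
) -> List[Message]:
    """Assemble the prompt in the same order as the cells above."""
    # 1. System context always comes first.
    messages: List[Message] = [{"role": "system", "content": system_prompt}]
    # 2. Conversation context: prior turns loaded from working memory.
    messages.extend(history)
    # 3. User context, retrieved context, and the new query share the final user message.
    final_content = f"{user_context}\n\n{retrieved_context}\n\nQuery: {query}"
    messages.append({"role": "user", "content": final_content})
    return messages


# Hypothetical sample data, for illustration only.
history = [
    {"role": "user", "content": "I'm interested in machine learning courses"},
    {"role": "assistant", "content": "We offer an advanced Machine Learning course."},
]
messages = build_messages(
    system_prompt="You are a course advisor.",
    history=history,
    user_context="Student: Sarah Chen",
    retrieved_context="Relevant Courses:\n1. Machine Learning",
    query="What are the prerequisites for it?",
)
print([m["role"] for m in messages])  # ['system', 'user', 'assistant', 'user']
```

Because the history sits between the system prompt and the new query, the model reads earlier turns before it reaches a reference such as "it".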

---

## 🧪 Complete Demo: Memory-Enhanced RAG

Now let's test the complete system with a multi-turn conversation.

We'll break this down into three turns:
1. Initial query about machine learning courses
2. Follow-up asking about prerequisites (with pronoun reference)
3. Another follow-up checking if student meets prerequisites


### Turn 1: Initial Query

Let's start with a query about machine learning courses.


In [27]:
# Set up demo session
demo_session_id = "complete_demo_session"

print("=" * 80)
print("🧪 MEMORY-ENHANCED RAG DEMO")
print("=" * 80)
print(f"\n👤 Student: {sarah.name}")
print(f"📧 Session: {demo_session_id}")

print("\n" + "=" * 80)
print("📍 TURN 1: Initial Query")
print("=" * 80)

demo_query_1 = "I'm interested in machine learning courses"
print(f"\n👤 User: {demo_query_1}")

🧪 MEMORY-ENHANCED RAG DEMO

👤 Student: Sarah Chen
📧 Session: complete_demo_session

📍 TURN 1: Initial Query

👤 User: I'm interested in machine learning courses


#### Generate response and save to memory


In [28]:
demo_response_1 = await generate_and_save(demo_query_1, sarah, demo_session_id)

print(f"\n🤖 Agent: {demo_response_1}")
print("\n✅ Conversation saved to working memory")

14:50:17 httpx INFO   HTTP Request: GET http://localhost:8088/v1/working-memory/complete_demo_session?user_id=sarah.chen&namespace=redis_university&model_name=gpt-4o "HTTP/1.1 200 OK"


14:50:17 httpx INFO   HTTP Request: POST https://api.openai.com/v1/embeddings "HTTP/1.1 200 OK"


14:50:18 httpx INFO   HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"


14:50:18 httpx INFO   HTTP Request: PUT http://localhost:8088/v1/working-memory/complete_demo_session?user_id=sarah.chen&model_name=gpt-4o "HTTP/1.1 200 OK"



🤖 Agent: Hi Sarah! It's great to see your continued interest in machine learning. While we currently don't have an intermediate-level machine learning course available, I recommend considering the "Machine Learning" course we offer. It covers a range of topics including supervised and unsupervised learning, as well as neural networks. However, please note that this course is at an advanced difficulty level.

Since you're currently taking Linear Algebra, which is essential for understanding many machine learning concepts, you might find that it helps prepare you for the advanced course. If you feel ready to take on the challenge, this could be a great opportunity to deepen your knowledge in machine learning. Let me know if you have any questions or need further assistance!

✅ Conversation saved to working memory


### Turn 2: Follow-up with Pronoun Reference

Now let's ask about "the first one" - a reference that requires conversation history.


In [29]:
print("\n" + "=" * 80)
print("📍 TURN 2: Follow-up with Pronoun Reference")
print("=" * 80)

demo_query_2 = "What are the prerequisites for the first one?"
print(f"\n👤 User: {demo_query_2}")
print(f"   Note: 'the first one' refers to the first course mentioned in Turn 1")


📍 TURN 2: Follow-up with Pronoun Reference

👤 User: What are the prerequisites for the first one?
   Note: 'the first one' refers to the first course mentioned in Turn 1


#### Load conversation history and generate response

The system will load Turn 1 from working memory to resolve "the first one".


In [30]:
demo_response_2 = await generate_and_save(demo_query_2, sarah, demo_session_id)

print(f"\n🤖 Agent: {demo_response_2}")
print("\n✅ Agent resolved 'the first one' using conversation history!")

14:50:18 httpx INFO   HTTP Request: GET http://localhost:8088/v1/working-memory/complete_demo_session?user_id=sarah.chen&namespace=redis_university&model_name=gpt-4o "HTTP/1.1 200 OK"


14:50:18 httpx INFO   HTTP Request: POST https://api.openai.com/v1/embeddings "HTTP/1.1 200 OK"


14:50:20 httpx INFO   HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"


14:50:20 httpx INFO   HTTP Request: PUT http://localhost:8088/v1/working-memory/complete_demo_session?user_id=sarah.chen&model_name=gpt-4o "HTTP/1.1 200 OK"



🤖 Agent: The first "Calculus I" course listed does not have any specified prerequisites, making it accessible for students who have a foundational understanding of mathematics. Given your background in Computer Science and your current study of Linear Algebra, you should be well-prepared to enroll in this intermediate-level course. It will provide you with essential mathematical skills that are beneficial for your interests in machine learning and data science. If you have any more questions or need further assistance, feel free to ask!

✅ Agent resolved 'the first one' using conversation history!


### Turn 3: Another Follow-up

Let's ask if the student meets the prerequisites mentioned in Turn 2.


In [31]:
print("\n" + "=" * 80)
print("📍 TURN 3: Another Follow-up")
print("=" * 80)

demo_query_3 = "Do I meet those prerequisites?"
print(f"\n👤 User: {demo_query_3}")
print(f"   Note: 'those prerequisites' refers to prerequisites from Turn 2")


📍 TURN 3: Another Follow-up

👤 User: Do I meet those prerequisites?
   Note: 'those prerequisites' refers to prerequisites from Turn 2


#### Load full conversation history and check student profile

The system will:
1. Load Turns 1-2 from working memory
2. Resolve "those prerequisites"
3. Check student's completed courses from profile


In [32]:
demo_response_3 = await generate_and_save(demo_query_3, sarah, demo_session_id)

print(f"\n🤖 Agent: {demo_response_3}")
print("\n✅ Agent resolved 'those prerequisites' and checked student's transcript!")

print("\n" + "=" * 80)
print("✅ DEMO COMPLETE: Memory-enhanced RAG enables natural conversations!")
print("=" * 80)

14:50:20 httpx INFO   HTTP Request: GET http://localhost:8088/v1/working-memory/complete_demo_session?user_id=sarah.chen&namespace=redis_university&model_name=gpt-4o "HTTP/1.1 200 OK"


14:50:20 httpx INFO   HTTP Request: POST https://api.openai.com/v1/embeddings "HTTP/1.1 200 OK"


14:50:23 httpx INFO   HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"


14:50:23 httpx INFO   HTTP Request: PUT http://localhost:8088/v1/working-memory/complete_demo_session?user_id=sarah.chen&model_name=gpt-4o "HTTP/1.1 200 OK"



🤖 Agent: Based on the information provided, the first "Calculus I" course lists prerequisites as Prerequisite Course 3 and Prerequisite Course 6. Unfortunately, without more details on what those specific courses entail, it's difficult to determine if you meet them.

However, the second and third "Calculus I" courses do not have any specified prerequisites, making them accessible for students with a foundational understanding of mathematics. Given your background in Computer Science and your current study of Linear Algebra, you should be well-prepared to enroll in one of these intermediate-level Calculus I courses. This will provide you with essential mathematical skills that are beneficial for your interests in machine learning and data science.

If you have any more questions or need further assistance, feel free to ask!

✅ Agent resolved 'those prerequisites' and checked student's transcript!

✅ DEMO COMPLETE: Memory-enhanced RAG enables natural conversations!


### 🎯 What Just Happened?

**Turn 1:** "I'm interested in machine learning courses"
- System searches courses
- Finds ML-related courses
- Responds with recommendations
- **Saves conversation to working memory**

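**Turn 2:** "What are the prerequisites for **the first one**?"
- System loads working memory (Turn 1)
- Resolves "the first one" using the conversation history together with the newly retrieved course list
- Responds with prerequisites
- **Saves updated conversation**

**Turn 3:** "Do I meet **those prerequisites**?"
- System loads working memory (Turns 1-2)
- Resolves "those prerequisites" → prerequisites from Turn 2
- Checks student's completed courses (from profile)
- Responds with personalized answer
- **Saves updated conversation**

**Note on the recorded run:** The Turn 2 and Turn 3 answers shown above discuss "Calculus I" rather than the Machine Learning course from Turn 1, and the Turn 3 answer says it cannot determine whether the prerequisites are met. The references were resolved, but against the courses retrieved for each follow-up, because `generate_and_save` runs the course search on the raw query text.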

**💡 Key Insight:** Memory + RAG = **Natural, stateful, personalized conversations**
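
`generate_and_save` searches courses with the raw user query, so a follow-up such as "What are the prerequisites for the first one?" gives the vector search no topic words to match. One possible mitigation, sketched here as an assumption and not as part of the notebook, is to build the search query from the most recent turns plus the new message. The helper name `build_search_query` and the `max_turns` parameter are hypothetical.

```python
from typing import Dict, List


def build_search_query(
    user_query: str, history: List[Dict[str, str]], max_turns: int = 2
) -> str:
    """Prefix the new query with the most recent messages from the conversation."""
    # Guard against max_turns == 0, because history[-0:] would return the whole list.
    recent = history[-max_turns:] if max_turns > 0 else []
    parts = [msg["content"] for msg in recent]
    parts.append(user_query)
    return "\n".join(parts)


# Hypothetical history, shaped like Turn 1 of the demo.
history = [
    {"role": "user", "content": "I'm interested in machine learning courses"},
    {"role": "assistant", "content": 'I recommend the "Machine Learning" course.'},
]
search_query = build_search_query(
    "What are the prerequisites for the first one?", history
)
print(search_query)
```

In `generate_and_save`, a string like this could be passed to `course_manager.search_courses` while the original query still goes into the prompt. Plain concatenation is the simplest variant; asking the LLM to rewrite the follow-up into a standalone question is another option.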

---

## 📊 Before vs. After Comparison

Let's visualize the difference between stateless and memory-enhanced RAG.

### **Stateless RAG (Section 2):**

```
Query 1: "I'm interested in ML courses"
  → ✅ Works (searches and returns courses)

Query 2: "What are the prerequisites for the first one?"
  → ❌ Fails (no conversation history)
  → Agent: "Which course are you referring to?"
```

**Problems:**
- ❌ No conversation continuity
- ❌ Can't resolve references
- ❌ Each query is independent
- ❌ Poor user experience

### **Memory-Enhanced RAG (This Notebook):**

```
Query 1: "I'm interested in ML courses"
  → ✅ Works (searches and returns courses)
  → Saves to working memory

Query 2: "What are the prerequisites for the first one?"
  → ✅ Works (loads conversation history)
  → Resolves "the first one" → first course from Query 1
  → Responds with prerequisites
  → Saves updated conversation

Query 3: "Do I meet those prerequisites?"
  → ✅ Works (loads conversation history)
  → Resolves "those prerequisites" → prerequisites from Query 2
  → Checks student transcript
  → Responds with personalized answer
```

**Benefits:**
- ✅ Conversation continuity
- ✅ Reference resolution
- ✅ Personalization
- ✅ Natural user experience
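
The difference comes down to what the model can see on each turn. The toy below makes that visible. It is a sketch under stated assumptions: `session_store` is a hypothetical in-memory stand-in for working memory, and `visible_context` is an illustrative name, not a notebook function.

```python
from collections import defaultdict
from typing import DefaultDict, List, Tuple

Turn = Tuple[str, str]

# Hypothetical in-memory stand-in for working memory, keyed by session ID.
session_store: DefaultDict[str, List[Turn]] = defaultdict(list)


def visible_context(session_id: str, query: str, use_memory: bool) -> List[Turn]:
    """Return the turns the model would see for this query."""
    # Stateless mode never reads or writes the store.
    history = list(session_store[session_id]) if use_memory else []
    turn = ("user", query)
    if use_memory:
        session_store[session_id].append(turn)
    return history + [turn]


queries = [
    "I'm interested in ML courses",
    "What are the prerequisites for the first one?",
]
stateless = [visible_context("s1", q, use_memory=False) for q in queries]
stateful = [visible_context("s1", q, use_memory=True) for q in queries]
print(len(stateless[1]), len(stateful[1]))  # 1 2
```

On the second query the stateless path sees only the follow-up itself, while the stateful path also sees the first query, which is what makes "the first one" resolvable at all.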

---

## 🎓 Key Takeaways

### **1. Memory Transforms RAG**

**Without Memory (Section 2):**
- Stateless queries
- No conversation continuity
- Limited to 3 context types (System, User, Retrieved)

**With Memory (This Notebook):**
- Stateful conversations
- Reference resolution
- All 4 context types (System, User, Conversation, Retrieved)

### **2. Two Types of Memory Work Together**

**Working Memory:**
- Session-scoped conversation history
- Enables reference resolution
- Persists within the session (like ChatGPT conversations)

**Long-term Memory:**
- User-scoped persistent facts
- Enables personalization
- Persists indefinitely
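
The two scopes can be pictured as two lookups with different keys. The dictionaries below are hypothetical in-memory stand-ins used only to show the scoping; the notebook itself relies on the Agent Memory Server for both.

```python
from collections import defaultdict
from typing import DefaultDict, List

# Hypothetical stand-ins: one store keyed by session ID, one keyed by user ID.
working_memory: DefaultDict[str, List[str]] = defaultdict(list)
long_term_memory: DefaultDict[str, List[str]] = defaultdict(list)

working_memory["session_a"].append("user: I'm interested in ML courses")
long_term_memory["sarah.chen"].append("Prefers online courses")

# A new session for the same user starts with an empty conversation history...
new_session_history = working_memory["session_b"]
# ...but the user's long-term facts are still available.
known_facts = long_term_memory["sarah.chen"]
print(new_session_history, known_facts)  # [] ['Prefers online courses']
```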

### **3. Simple, Inline Approach**

**What We Built:**
- Small, focused functions
- Inline code (no large classes)
- Progressive learning
- Clear demonstrations

**Why This Matters:**
- Easy to understand
- Easy to modify
- Easy to extend
- Foundation for LangGraph agents (Part 2)

### **4. All Four Context Types**

- **System Context:** Role, instructions, guidelines
- **User Context:** Profile + long-term memories
- **Conversation Context:** Working memory
- **Retrieved Context:** RAG results

**Together:** Natural, stateful, personalized conversations

**💡 Research Insight (From Section 1):** Context Rot research demonstrates that context structure and organization affect LLM attention. Memory systems that selectively retrieve and organize context outperform systems that dump all available information. This validates our approach: quality over quantity, semantic similarity, and selective retrieval. ([Context Rot paper](https://research.trychroma.com/context-rot))

---

## 🚀 What's Next?

### **Part 2: Converting to LangGraph Agent (Separate Notebook)**

In the next notebook (`03_langgraph_agent_conversion.ipynb`), we'll:

1. **Convert** memory-enhanced RAG to LangGraph agent
2. **Add** state management and control flow
3. **Prepare** for Section 4 (tools and advanced capabilities)
4. **Build** a foundation for production-ready agents

**Why LangGraph?**
- Better state management
- More control over agent flow
- Easier to add tools (Section 4)
- Production-ready architecture

### **Section 4: Tools and Advanced Agents**

After completing Part 2, you'll be ready for Section 4, where you'll build an agent that decides for itself when to use memory tools instead of relying on memory operations hardcoded into the application flow.

---

## 🏋️ Practice Exercises

### **Exercise 1: Add Personalization**

Modify the system to use long-term memories for personalization:

1. Store student preferences in long-term memory
2. Search long-term memory in `assemble_context()`
3. Use memories to personalize recommendations

**Hint:** Use `memory_client.create_long_term_memory()` and `memory_client.search_long_term_memory()`

### **Exercise 2: Add Error Handling**

Add error handling for memory operations:

1. Handle case when Memory Server is unavailable
2. Fallback to stateless RAG
3. Log warnings appropriately

**Hint:** Check `MEMORY_SERVER_AVAILABLE` flag
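
As a starting point, the fallback can be isolated in one small wrapper. This is a sketch, not a full solution: `load_history_or_fallback` and `failing_loader` are hypothetical names, and the broad `except Exception` is a placeholder that should be narrowed to whatever errors your memory client actually raises.

```python
import asyncio
import logging
from typing import Awaitable, Callable, List

logger = logging.getLogger("memory_rag")


async def load_history_or_fallback(
    load_history: Callable[[], Awaitable[List[dict]]],
) -> List[dict]:
    """Return conversation history, or an empty list if memory is unreachable."""
    try:
        return await load_history()
    except Exception as exc:  # placeholder: narrow to the client's error types
        logger.warning("Memory unavailable, falling back to stateless RAG: %s", exc)
        return []


async def failing_loader() -> List[dict]:
    # Hypothetical loader that simulates a Memory Server outage.
    raise ConnectionError("memory server down")


history = asyncio.run(load_history_or_fallback(failing_loader))
print(history)  # []
```

An empty history is exactly the stateless RAG case, so the rest of the pipeline can run unchanged. Inside a notebook cell, use `await load_history_or_fallback(...)` directly instead of `asyncio.run`.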

### **Exercise 3: Add Conversation Summary**

Add a function to summarize the conversation:

1. Load working memory
2. Extract key points from conversation
3. Display summary to user

**Hint:** Use LLM to generate summary from conversation history
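
One way to begin is to turn the stored messages into a transcript and wrap it in a summarization instruction. The sketch below stops short of calling the LLM. The function name `build_summary_prompt`, the `max_points` parameter, and the prompt wording are all illustrative assumptions.

```python
from typing import Dict, List


def build_summary_prompt(messages: List[Dict[str, str]], max_points: int = 3) -> str:
    """Render the conversation as a transcript and prepend a summarization instruction."""
    transcript = "\n".join(
        f"{m['role'].title()}: {m['content']}" for m in messages
    )
    return (
        "Summarize the key points of this advising conversation "
        f"in at most {max_points} bullet points.\n\n{transcript}"
    )


# Hypothetical messages shaped like working memory entries.
sample_messages = [
    {"role": "user", "content": "I'm interested in machine learning courses"},
    {"role": "assistant", "content": "We offer an advanced Machine Learning course."},
]
prompt = build_summary_prompt(sample_messages)
print(prompt)
```

The resulting string can then be sent to the model, for example as the content of a single `HumanMessage`.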



### **Exercise 4: Compare Memory Extraction Strategies** 🆕

In Notebook 1, we learned about memory extraction strategies. Now let's see them in action!

**Goal:** Compare how discrete vs summary strategies extract different types of memories from the same conversation.

**Scenario:** A student has a long advising session discussing their academic goals, course preferences, and career aspirations.


#### **Understanding the Difference**

**Discrete Strategy (Default):**
- Extracts individual facts: "User's major is CS", "User interested in ML", "User wants to graduate Spring 2026"
- Each fact is independently searchable
- Good for: Most conversations, factual Q&A

**Summary Strategy:**
- Creates conversation summary: "User discussed academic planning, expressing interest in ML courses for Spring 2026 graduation..."
- Preserves conversational context
- Good for: Long sessions, meeting notes, comprehensive context

**Let's see the difference with real code!**


#### **Demo: Discrete Strategy (Current Default)**


In [33]:
if MEMORY_SERVER_AVAILABLE:
    import uuid

    from agent_memory_client.models import MemoryStrategyConfig
    from agent_memory_client.filters import UserId

    # Create a test session with discrete strategy (default)
    discrete_session_id = f"demo_discrete_{uuid.uuid4().hex[:8]}"
    discrete_student_id = f"student_discrete_{uuid.uuid4().hex[:8]}"

    print("🎯 Testing DISCRETE Strategy (Default)")
    print("=" * 80)
    print(f"Session ID: {discrete_session_id}")
    print(f"Student ID: {discrete_student_id}\n")

    # Simulate a long advising conversation
    advising_conversation = [
        {
            "role": "user",
            "content": "Hi! I'm a Computer Science major planning to graduate in Spring 2026. I'm really interested in machine learning and AI.",
        },
        {
            "role": "assistant",
            "content": "Great to meet you! I can help you plan your ML/AI coursework. What's your current experience level with machine learning?",
        },
        {
            "role": "user",
            "content": "I've taken intro to Python and data structures. I prefer online courses because I work part-time. I'm hoping to get an internship at a tech startup next summer.",
        },
        {
            "role": "assistant",
            "content": "Perfect! Based on your goals, I'd recommend starting with RU301 (Querying, Indexing, and Full-Text Search) and RU330 (Trading Engine). Both are available online.",
        },
        {
            "role": "user",
            "content": "That sounds good. I'm also interested in vector databases since they're used in AI applications. Do you have courses on that?",
        },
        {
            "role": "assistant",
            "content": "Absolutely! RU401 (Running Redis at Scale) covers vector search capabilities. It's a great fit for your AI interests.",
        },
    ]

    # Store conversation in working memory (discrete strategy is default)
    messages = [
        MemoryMessage(role=msg["role"], content=msg["content"])
        for msg in advising_conversation
    ]

    # Get or create working memory
    _, discrete_working_memory = await memory_client.get_or_create_working_memory(
        session_id=discrete_session_id, user_id=discrete_student_id, model_name="gpt-4o"
    )

    # Add messages to working memory
    discrete_working_memory.messages.extend(messages)

    # Save to Memory Server
    await memory_client.put_working_memory(
        session_id=discrete_session_id,
        memory=discrete_working_memory,
        user_id=discrete_student_id,
        model_name="gpt-4o",
    )

    print("✅ Conversation stored with DISCRETE strategy")
    print(f"   Messages: {len(messages)}")
    print("\n⏳ Waiting for automatic memory extraction...")

    # Wait a moment for background extraction
    import asyncio

    await asyncio.sleep(2)

    # Search for extracted memories
    discrete_results = await memory_client.search_long_term_memory(
        text="student preferences and goals",
        user_id=UserId(eq=discrete_student_id),
        limit=10,
    )

    discrete_memories = discrete_results.memories if discrete_results.memories else []

    print(f"\n📊 DISCRETE Strategy Results:")
    print(f"   Extracted {len(discrete_memories)} individual memories\n")

    if discrete_memories:
        for i, mem in enumerate(discrete_memories[:5], 1):
            print(f"   {i}. {mem.text[:100]}...")
    else:
        print("   ⏳ No memories extracted yet (background processing may take time)")
        print("   Note: In production, extraction happens asynchronously")
else:
    print("⚠️  Memory Server not available - skipping demo")

🎯 Testing DISCRETE Strategy (Default)
Session ID: demo_discrete_584f315e
Student ID: student_discrete_afd4621f

14:50:23 httpx INFO   HTTP Request: GET http://localhost:8088/v1/working-memory/demo_discrete_584f315e?user_id=student_discrete_afd4621f&namespace=redis_university&model_name=gpt-4o "HTTP/1.1 404 Not Found"


14:50:23 httpx INFO   HTTP Request: PUT http://localhost:8088/v1/working-memory/demo_discrete_584f315e?user_id=student_discrete_afd4621f&model_name=gpt-4o "HTTP/1.1 200 OK"


14:50:23 httpx INFO   HTTP Request: PUT http://localhost:8088/v1/working-memory/demo_discrete_584f315e?user_id=student_discrete_afd4621f&model_name=gpt-4o "HTTP/1.1 200 OK"


✅ Conversation stored with DISCRETE strategy
   Messages: 6

⏳ Waiting for automatic memory extraction...


14:50:25 httpx INFO   HTTP Request: POST http://localhost:8088/v1/long-term-memory/search?optimize_query=false "HTTP/1.1 200 OK"



📊 DISCRETE Strategy Results:
   Extracted 0 individual memories

   ⏳ No memories extracted yet (background processing may take time)
   Note: In production, extraction happens asynchronously
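
The recorded run found no memories after a fixed two-second wait. Since extraction runs in the background, polling with a growing delay is one alternative to a single `asyncio.sleep(2)`. The sketch below is generic: `poll_for_memories`, its parameters, and `fake_search` are hypothetical, and the fake search stands in for a real call to the Memory Server.

```python
import asyncio
from typing import Awaitable, Callable, List


async def poll_for_memories(
    search: Callable[[], Awaitable[List[str]]],
    attempts: int = 5,
    initial_delay: float = 0.5,
) -> List[str]:
    """Call `search` until it returns results, doubling the wait after each empty attempt."""
    delay = initial_delay
    for attempt in range(attempts):
        results = await search()
        if results:
            return results
        # Sleep between attempts, but not after the final one.
        if attempt < attempts - 1:
            await asyncio.sleep(delay)
            delay *= 2
    return []


# Hypothetical search that returns nothing twice, then one extracted memory.
calls = {"count": 0}


async def fake_search() -> List[str]:
    calls["count"] += 1
    return ["User prefers online courses"] if calls["count"] >= 3 else []


found = asyncio.run(poll_for_memories(fake_search, initial_delay=0.01))
print(found, calls["count"])  # ['User prefers online courses'] 3
```

In the notebook, `search` could be a small async function that calls `memory_client.search_long_term_memory(...)` and returns `results.memories or []`. Use `await poll_for_memories(...)` directly in a cell instead of `asyncio.run`.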


#### **Demo: Summary Strategy**

Now let's see how the SUMMARY strategy handles the same conversation differently.


In [34]:
if MEMORY_SERVER_AVAILABLE:
    # Create a test session with SUMMARY strategy
    summary_session_id = f"demo_summary_{uuid.uuid4().hex[:8]}"
    summary_student_id = f"student_summary_{uuid.uuid4().hex[:8]}"

    print("\n🎯 Testing SUMMARY Strategy")
    print("=" * 80)
    print(f"Session ID: {summary_session_id}")
    print(f"Student ID: {summary_student_id}\n")

    # Configure summary strategy
    summary_strategy = MemoryStrategyConfig(
        strategy="summary", config={"max_summary_length": 500}
    )

    # Store the SAME conversation with summary strategy
    messages = [
        MemoryMessage(role=msg["role"], content=msg["content"])
        for msg in advising_conversation
    ]

    # Get or create working memory
    _, summary_working_memory = await memory_client.get_or_create_working_memory(
        session_id=summary_session_id, user_id=summary_student_id, model_name="gpt-4o"
    )

    # Set the long-term memory strategy
    summary_working_memory.long_term_memory_strategy = summary_strategy  # ← Key difference!

    # Add messages to working memory
    summary_working_memory.messages.extend(messages)

    # Save to Memory Server
    await memory_client.put_working_memory(
        session_id=summary_session_id,
        memory=summary_working_memory,
        user_id=summary_student_id,
        model_name="gpt-4o",
    )

    print("✅ Conversation stored with SUMMARY strategy")
    print(f"   Messages: {len(messages)}")
    print(f"   Strategy: summary (max_summary_length=500)")
    print("\n⏳ Waiting for automatic memory extraction...")

    # Wait for background extraction
    await asyncio.sleep(2)

    # Search for extracted memories
    summary_results = await memory_client.search_long_term_memory(
        text="student preferences and goals",
        user_id=UserId(eq=summary_student_id),
        limit=10,
    )

    summary_memories = summary_results.memories if summary_results.memories else []

    print(f"\n📊 SUMMARY Strategy Results:")
    print(f"   Extracted {len(summary_memories)} conversation summaries\n")

    if summary_memories:
        for i, mem in enumerate(summary_memories[:3], 1):
            print(f"   {i}. {mem.text}\n")
    else:
        print("   ⏳ No summaries extracted yet (background processing may take time)")
        print("   Note: In production, extraction happens asynchronously")
else:
    print("⚠️  Memory Server not available - skipping demo")


🎯 Testing SUMMARY Strategy
Session ID: demo_summary_2aa10f68
Student ID: student_summary_9eac115c

14:50:25 httpx INFO   HTTP Request: GET http://localhost:8088/v1/working-memory/demo_summary_2aa10f68?user_id=student_summary_9eac115c&namespace=redis_university&model_name=gpt-4o "HTTP/1.1 404 Not Found"


14:50:25 httpx INFO   HTTP Request: PUT http://localhost:8088/v1/working-memory/demo_summary_2aa10f68?user_id=student_summary_9eac115c&model_name=gpt-4o "HTTP/1.1 200 OK"


14:50:25 httpx INFO   HTTP Request: PUT http://localhost:8088/v1/working-memory/demo_summary_2aa10f68?user_id=student_summary_9eac115c&model_name=gpt-4o "HTTP/1.1 200 OK"


✅ Conversation stored with SUMMARY strategy
   Messages: 6
   Strategy: summary (max_summary_length=500)

⏳ Waiting for automatic memory extraction...


14:50:28 httpx INFO   HTTP Request: POST http://localhost:8088/v1/long-term-memory/search?optimize_query=false "HTTP/1.1 200 OK"



📊 SUMMARY Strategy Results:
   Extracted 0 conversation summaries

   ⏳ No summaries extracted yet (background processing may take time)
   Note: In production, extraction happens asynchronously


#### **Comparison: When to Use Each Strategy**

**Use DISCRETE Strategy (Default) when:**
- ✅ You want individual, searchable facts
- ✅ Facts should be independently retrievable
- ✅ Building knowledge graphs or fact databases
- ✅ Most general-purpose agent interactions

**Example:** Course advisor agent (our use case)
- "User's major is Computer Science"
- "User interested in machine learning"
- "User prefers online courses"
- "User wants to graduate Spring 2026"

**Use SUMMARY Strategy when:**
- ✅ Long conversations need to be preserved as context
- ✅ Meeting notes or session summaries
- ✅ Comprehensive context matters more than individual facts
- ✅ Reducing storage while preserving meaning

**Example:** Academic advising session summary
- "Student discussed academic planning for Spring 2026 graduation, expressing strong interest in ML/AI courses. Prefers online format due to part-time work. Seeking tech startup internship. Recommended RU301, RU330, and RU401 based on AI career goals."

**Use PREFERENCES Strategy when:**
- ✅ Building user profiles
- ✅ Personalization is primary goal
- ✅ User onboarding flows

**Example:** User profile building
- "User prefers email over SMS notifications"
- "User works best in morning hours"
- "User prefers dark mode interfaces"


#### **Key Takeaway**

**For this course, we use Discrete Strategy (default)** because:
1. Course advising benefits from searchable individual facts
2. Students ask specific questions ("What are my prerequisites?")
3. Facts are independently useful ("User completed RU101")
4. Balances detail with storage efficiency

**In production**, you might use:
- **Discrete** for most interactions
- **Summary** for long consultation sessions
- **Preferences** during onboarding
- **Custom** for domain-specific needs (legal, medical, technical)
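
A mixed setup like this can be driven by a small lookup table. The sketch below is an assumption about how an application might organize the choice. The interaction labels (`general`, `long_consultation`, `onboarding`) and the helper `choose_strategy` are hypothetical, while the strategy names and the `max_summary_length` setting are the ones used in this notebook.

```python
from typing import Any, Dict, Optional

StrategySettings = Dict[str, Any]

# Interaction labels are hypothetical; None means "keep the default (discrete) strategy".
STRATEGY_BY_INTERACTION: Dict[str, Optional[StrategySettings]] = {
    "general": None,
    "long_consultation": {
        "strategy": "summary",
        "config": {"max_summary_length": 500},
    },
    "onboarding": {"strategy": "preferences", "config": {}},
}


def choose_strategy(interaction_type: str) -> Optional[StrategySettings]:
    """Return strategy settings for an interaction type; unknown types keep the default."""
    return STRATEGY_BY_INTERACTION.get(interaction_type)


print(choose_strategy("long_consultation"))
print(choose_strategy("general"))  # None
```

A non-`None` result has the same two fields that `MemoryStrategyConfig` receives in the demo cell, so it could be unpacked into that constructor before being assigned to `long_term_memory_strategy`.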


#### **Configuration Reference**

**Discrete Strategy (Default - No Config Needed):**
```python
# This is the default - no configuration required
_, working_memory = await memory_client.get_or_create_working_memory(
    session_id=session_id, user_id=user_id, model_name="gpt-4o"
)
working_memory.messages.extend(messages)
await memory_client.put_working_memory(
    session_id=session_id,
    memory=working_memory,
    user_id=user_id,
    model_name="gpt-4o"
)
```

**Summary Strategy:**
```python
from agent_memory_client.models import MemoryStrategyConfig

summary_strategy = MemoryStrategyConfig(
    strategy="summary",
    config={"max_summary_length": 500}
)

_, working_memory = await memory_client.get_or_create_working_memory(
    session_id=session_id, user_id=user_id, model_name="gpt-4o"
)
working_memory.long_term_memory_strategy = summary_strategy
working_memory.messages.extend(messages)
await memory_client.put_working_memory(
    session_id=session_id,
    memory=working_memory,
    user_id=user_id,
    model_name="gpt-4o"
)
```

**Preferences Strategy:**
```python
preferences_strategy = MemoryStrategyConfig(
    strategy="preferences",
    config={}
)

_, working_memory = await memory_client.get_or_create_working_memory(
    session_id=session_id, user_id=user_id, model_name="gpt-4o"
)
working_memory.long_term_memory_strategy = preferences_strategy
working_memory.messages.extend(messages)
await memory_client.put_working_memory(
    session_id=session_id,
    memory=working_memory,
    user_id=user_id,
    model_name="gpt-4o"
)
```


#### **📚 Learn More**

For complete documentation and advanced configuration:
- [Memory Extraction Strategies Documentation](https://redis.github.io/agent-memory-server/memory-extraction-strategies/)
- [Working Memory Configuration](https://redis.github.io/agent-memory-server/working-memory/)
- [Long-term Memory Best Practices](https://redis.github.io/agent-memory-server/long-term-memory/)

**Next:** In Section 4, we'll see how agents use these strategies in production workflows.



---

## 📝 Summary

### **What You Learned:**

1. ✅ **Built** memory-enhanced RAG system
2. ✅ **Integrated** all four context types
3. ✅ **Demonstrated** benefits of memory
4. ✅ **Prepared** for LangGraph conversion

### **Key Concepts:**

- **Working Memory** - Session-scoped conversation history
- **Long-term Memory** - User-scoped persistent facts
- **Context Assembly** - Combining all four context types
- **Reference Resolution** - Resolving pronouns and references
- **Stateful Conversations** - Natural, continuous dialogue

### **Next Steps:**

1. Complete practice exercises
2. Experiment with different queries
3. Move to Part 2 (LangGraph agent conversion)
4. Prepare for Section 4 (tools and advanced agents)

**🎉 Congratulations!** You've built a complete memory-enhanced RAG system!

---

## 🔗 Resources

- **Section 1:** Four Context Types
- **Section 2:** RAG Fundamentals
- **Section 3 (Notebook 1):** Memory Fundamentals
- **Section 3 (Notebook 3):** LangGraph Agent Conversion (Next)
- **Section 4:** Tools and Advanced Agents

**Agent Memory Server:**
- GitHub: `reference-agent/`
- Documentation: See README.md
- API Client: `agent-memory-client`

**LangChain:**
- Documentation: https://python.langchain.com/
- LangGraph: https://langchain-ai.github.io/langgraph/

---


**Redis University - Context Engineering Course**

---

## 📚 Additional Resources

- [Agent Memory Server Documentation](https://github.com/redis/agent-memory-server) - Production-ready memory management
- [Agent Memory Client](https://pypi.org/project/agent-memory-client/) - Python client for Agent Memory Server
- [RedisVL Documentation](https://redisvl.com/) - Redis Vector Library
- [Retrieval-Augmented Generation Paper](https://arxiv.org/abs/2005.11401) - Original RAG research
- [LangChain RAG Tutorial](https://python.langchain.com/docs/use_cases/question_answering/) - Building RAG systems
- [LangGraph Tutorials](https://langchain-ai.github.io/langgraph/tutorials/) - Building agents with LangGraph
- [Agent Architectures](https://python.langchain.com/docs/modules/agents/) - Different agent patterns
- [ReAct: Synergizing Reasoning and Acting](https://arxiv.org/abs/2210.03629) - Reasoning + acting in LLMs
- [Anthropic's Guide to Building Effective Agents](https://www.anthropic.com/research/building-effective-agents) - Agent design patterns


