![Redis](https://redis.io/wp-content/uploads/2024/04/Logotype.svg?auto=webp&quality=85,75&width=120)

# Enhancing Your RAG Agent with Memory Architecture

## Building on Your Context-Engineered RAG Agent

In Section 2, you built a sophisticated RAG agent with excellent context engineering. Now we'll enhance it with **advanced memory architecture** that provides:

- **🧠 Persistent Memory** - Remember conversations across sessions
- **📚 Long-term Learning** - Build knowledge about each student over time
- **🔄 Memory Consolidation** - Summarize and organize conversation history
- **⚡ Efficient Retrieval** - Quick access to relevant past interactions

### What You'll Build

Transform your `SimpleRAGAgent` into a `MemoryEnhancedRAGAgent` that:
- Remembers student preferences and learning patterns
- Maintains conversation continuity across sessions
- Consolidates memory to prevent context bloat
- Uses Redis for scalable memory persistence

### Learning Objectives

By the end of this notebook, you will:
1. **Understand** the grounding problem and how memory solves context engineering challenges
2. **Enhance** your RAG agent with sophisticated memory architecture
3. **Implement** Redis-based memory persistence for scalability
4. **Build** memory consolidation and summarization systems
5. **Create** cross-session conversation continuity
6. **Optimize** memory-aware context engineering for better responses

## Memory Architecture for RAG Systems

### The Memory Challenge in RAG Agents

Your current RAG agent has basic conversation history, but faces limitations:

**Current Limitations:**
- ❌ **Session-bound** - Forgets everything when restarted
- ❌ **Linear growth** - Context gets longer with each exchange
- ❌ **No consolidation** - Important insights get buried in history
- ❌ **No learning** - Doesn't build knowledge about student preferences

**Memory-Enhanced Benefits:**
- ✅ **Persistent memory** - Remembers across sessions and restarts
- ✅ **Intelligent consolidation** - Summarizes and organizes key insights
- ✅ **Student modeling** - Builds comprehensive understanding of each student
- ✅ **Efficient retrieval** - Finds relevant past context quickly
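To make the consolidation idea concrete, here is a minimal sketch that keeps the most recent exchanges verbatim and collapses older ones into a single summary line. The names `consolidate` and `naive_summary` are illustrative assumptions, not part of the reference agent, and the summarizer is a deliberate placeholder where a real system would likely call an LLM.

```python
from typing import Dict, List, Tuple

Exchange = Dict[str, str]  # {"user": ..., "assistant": ...}


def naive_summary(exchanges: List[Exchange]) -> str:
    """Placeholder summarizer (assumption): keep the start of each question.

    A production system would probably ask an LLM for the summary instead.
    """
    topics = [ex["user"][:60] for ex in exchanges]
    return "Earlier the student asked about: " + "; ".join(topics)


def consolidate(history: List[Exchange], keep_recent: int = 4) -> Tuple[str, List[Exchange]]:
    """Return (summary of older turns, recent turns kept verbatim)."""
    if keep_recent < 1:
        raise ValueError("keep_recent must be at least 1")
    if len(history) <= keep_recent:
        return "", history
    old, recent = history[:-keep_recent], history[-keep_recent:]
    return naive_summary(old), recent


history = [{"user": f"Question {i}", "assistant": f"Answer {i}"} for i in range(1, 8)]
summary, recent = consolidate(history, keep_recent=4)
print(summary)
print(len(recent))
```

With this shape, the context grows by one summary line instead of by every past exchange, which addresses the linear-growth limitation listed above.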

### Dual Memory Architecture

We'll implement a **dual memory system** inspired by human cognition:

```
WORKING MEMORY (Short-term)
├── Current conversation context
├── Recent exchanges (last 5-10)
├── Active task context
└── Immediate student state

LONG-TERM MEMORY (Persistent)
├── Student profile and preferences
├── Learning patterns and progress
├── Consolidated conversation summaries
└── Historical interaction insights
```
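The two tiers in the diagram can be sketched as plain data structures. This is an illustrative outline only: `WorkingMemory`, `LongTermMemory`, and `DualMemory` are hypothetical names, and the reference agent's actual storage lives in the Agent Memory Server rather than in Python objects.

```python
from collections import deque
from dataclasses import dataclass, field
from typing import Deque, Dict, List, Tuple


@dataclass
class WorkingMemory:
    """Short-term tier: a bounded window of recent exchanges."""
    max_exchanges: int = 10
    exchanges: Deque[Tuple[str, str]] = field(default_factory=deque)

    def add(self, user: str, assistant: str) -> None:
        self.exchanges.append((user, assistant))
        # Drop the oldest exchanges once the window is full
        while len(self.exchanges) > self.max_exchanges:
            self.exchanges.popleft()


@dataclass
class LongTermMemory:
    """Persistent tier: learned preferences and consolidated summaries."""
    preferences: Dict[str, str] = field(default_factory=dict)
    summaries: List[str] = field(default_factory=list)


@dataclass
class DualMemory:
    working: WorkingMemory = field(default_factory=WorkingMemory)
    long_term: LongTermMemory = field(default_factory=LongTermMemory)

    def build_context(self) -> str:
        """Combine both tiers into one prompt section."""
        lines = []
        if self.long_term.preferences:
            lines.append(f"Known preferences: {self.long_term.preferences}")
        for summary in self.long_term.summaries:
            lines.append(f"Earlier summary: {summary}")
        for user, assistant in self.working.exchanges:
            lines.append(f"User: {user}")
            lines.append(f"Assistant: {assistant}")
        return "\n".join(lines)


memory = DualMemory(working=WorkingMemory(max_exchanges=2))
memory.long_term.preferences["format"] = "online"
for turn in range(3):
    memory.working.add(f"question {turn}", f"answer {turn}")
print(memory.build_context())
```

Only the two most recent exchanges survive in working memory here, while the stored preference remains available regardless of how long the conversation runs.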

In [None]:
# Setup: Import the reference agent and enhance it with memory
import os
import sys
from typing import List, Dict, Any, Optional
from datetime import datetime
import asyncio
from dotenv import load_dotenv

# Load environment
load_dotenv()
sys.path.append('../../reference-agent')

# Import the reference agent components (already built for us!)
from redis_context_course.models import (
    Course, StudentProfile, DifficultyLevel, 
    CourseFormat, Semester, CourseRecommendation
)
from redis_context_course.course_manager import CourseManager
from redis_context_course.agent import ClassAgent  # The reference agent with memory!
from langchain_openai import ChatOpenAI
from langchain_core.messages import HumanMessage, SystemMessage

# Import memory client (already built!)
try:
    from agent_memory_client import MemoryAPIClient, MemoryClientConfig
    MEMORY_AVAILABLE = True
    print("✅ Agent Memory Server client available")
except ImportError:
    MEMORY_AVAILABLE = False
    print("⚠️  Agent Memory Server not available - will use simplified memory")

import tiktoken

# Initialize components
tokenizer = tiktoken.encoding_for_model("gpt-3.5-turbo")
def count_tokens(text: str) -> int:
    return len(tokenizer.encode(text))

print("🧠 Memory-Enhanced RAG Agent Setup Complete!")
print("📚 Reference agent components imported")
print("🔧 Ready to enhance your agent with sophisticated memory")

## Building the Memory-Enhanced RAG Agent

Let's enhance your `SimpleRAGAgent` from Section 2 with sophisticated memory architecture. We'll build on the same foundation but add persistent memory capabilities.

In [None]:
# Let's first understand what we're building on from Section 2
class SimpleRAGAgent:
    """Your RAG agent from Section 2 - foundation for memory enhancement"""
    
    def __init__(self, course_manager: CourseManager):
        self.course_manager = course_manager
        self.llm = ChatOpenAI(model='gpt-3.5-turbo', temperature=0.7)
        self.conversation_history = {}  # In-memory only - lost when restarted!
    
    async def search_courses(self, query: str, limit: int = 3) -> List[Course]:
        """Search for relevant courses using the course manager"""
        results = await self.course_manager.search_courses(query, limit=limit)
        return results
    
    def create_context(self, student: StudentProfile, query: str, courses: List[Course]) -> str:
        """Create context for the LLM - your excellent context engineering from Section 2"""
        
        # Student context
        student_context = f"""STUDENT PROFILE:
Name: {student.name}
Academic Status: {student.major}, Year {student.year}
Completed Courses: {', '.join(student.completed_courses) if student.completed_courses else 'None'}
Learning Interests: {', '.join(student.interests)}
Preferred Format: {student.preferred_format.value if student.preferred_format else 'Any'}"""
        
        # Courses context
        courses_context = "RELEVANT COURSES:\n"
        for i, course in enumerate(courses, 1):
            courses_context += f"{i}. {course.course_code}: {course.title}\n"
        
        # Basic conversation history (limited and session-bound)
        history_context = ""
        if student.email in self.conversation_history:
            history = self.conversation_history[student.email]
            if history:
                history_context = "\nRECENT CONVERSATION:\n"
                for msg in history[-2:]:  # Only last 2 messages
                    history_context += f"User: {msg['user']}\nAssistant: {msg['assistant']}\n"
        
        return student_context + "\n\n" + courses_context + history_context
    
    async def chat(self, student: StudentProfile, query: str) -> str:
        """Chat with the student using RAG"""
        relevant_courses = await self.search_courses(query, limit=3)
        context = self.create_context(student, query, relevant_courses)
        
        system_message = SystemMessage(content="""You are a helpful academic advisor for Redis University. 
Use the provided context to give personalized course recommendations.
Be specific and explain why courses are suitable for the student.""")
        
        human_message = HumanMessage(content=f"Context: {context}\n\nStudent Question: {query}")
        # Use the async API so the LLM call does not block the event loop
        response = await self.llm.ainvoke([system_message, human_message])
        
        # Store in basic memory (session-bound)
        if student.email not in self.conversation_history:
            self.conversation_history[student.email] = []
        
        self.conversation_history[student.email].append({
            "user": query,
            "assistant": response.content
        })
        
        return response.content

print("📝 SimpleRAGAgent defined (Section 2 foundation)")

## The Reference Agent: Memory-Enhanced RAG

Great news! The `redis_context_course` reference agent already has sophisticated memory architecture built-in. Let's explore what it provides and how it solves the grounding problem.

### Built-in Memory Architecture

The reference agent includes:

1. **🧠 Working Memory** - Session-scoped conversation context
2. **📚 Long-term Memory** - Cross-session knowledge and preferences
3. **🔄 Automatic Memory Extraction** - Intelligent fact extraction from conversations
4. **🔍 Semantic Memory Search** - Vector-based memory retrieval
5. **🛠️ Memory Tools** - LLM can control its own memory

Let's see how this solves the context engineering challenges we identified!
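To build intuition for semantic memory search, the sketch below ranks stored memories by cosine similarity to a query. It is a toy stand-in, assuming a bag-of-words count vector in place of a real embedding model and a Python list in place of a vector index; `embed`, `cosine`, and `search_memories` are hypothetical names and do not reflect the reference agent's API.

```python
import math
from collections import Counter
from typing import List, Tuple


def embed(text: str) -> Counter:
    """Toy 'embedding' (assumption): a bag-of-words count vector."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[token] * b[token] for token in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def search_memories(query: str, memories: List[str], top_k: int = 2) -> List[Tuple[float, str]]:
    """Return the top_k memories with their similarity scores, best first."""
    query_vec = embed(query)
    scored = [(cosine(query_vec, embed(m)), m) for m in memories]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_k]


memories = [
    "student prefers online courses",
    "student completed RU101",
    "student is interested in machine learning projects",
]
results = search_memories("which machine learning courses are online", memories, top_k=2)
for score, text in results:
    print(f"{score:.2f}  {text}")
```

A real embedding model would also match paraphrases that share no words with the query, which word counts cannot do.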

In [None]:
# Let's explore the reference agent's memory capabilities
async def demonstrate_reference_agent_memory():
    """Demonstrate the built-in memory capabilities of the reference agent"""
    
    if not MEMORY_AVAILABLE:
        print("⚠️  Agent Memory Server not available")
        print("📝 This demo shows what the reference agent can do with full memory setup")
        print("\n🔧 To run with full memory:")
        print("   1. Install Agent Memory Server: pip install agent-memory-server")
        print("   2. Start the server: agent-memory-server")
        print("   3. Set AGENT_MEMORY_URL environment variable")
        return
    
    print("🧠 Reference Agent Memory Capabilities:")
    print()
    
    # Create a student ID for memory
    student_id = "sarah_chen_demo"
    
    try:
        # Initialize the reference agent with memory
        agent = ClassAgent(student_id=student_id)
        print(f"✅ ClassAgent initialized with memory for student: {student_id}")
        
        # The agent automatically handles:
        print("\n🔧 Built-in Memory Features:")
        print("   • Working Memory: Session-scoped conversation context")
        print("   • Long-term Memory: Cross-session knowledge persistence")
        print("   • Automatic Extraction: Important facts saved automatically")
        print("   • Semantic Search: Vector-based memory retrieval")
        print("   • Memory Tools: LLM can search and store memories")
        
        return agent
        
    except Exception as e:
        print(f"⚠️  Could not initialize reference agent: {e}")
        print("📝 This is expected if Agent Memory Server is not running")
        return None

# Demonstrate the reference agent
reference_agent = await demonstrate_reference_agent_memory()

## Building Your Own Memory-Enhanced Agent

While the reference agent has sophisticated memory, let's build a simplified version you can understand and extend. This will teach you the core concepts of memory-enhanced context engineering.
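One of those core concepts, reference resolution, is easy to try in isolation first. The sketch below is a simplified illustration under the assumption that the current topic is already known; `resolve_pronouns` is a hypothetical helper name. It matches whole words only, so the "it" inside "with" or "prerequisites" is not mistaken for a pronoun.

```python
import re

# Whole-word match for the pronouns we want to resolve
PRONOUNS = re.compile(r"\b(it|that|this)\b", re.IGNORECASE)


def resolve_pronouns(query: str, current_topic: str) -> str:
    """Append the current topic when the query contains a standalone pronoun."""
    if current_topic and PRONOUNS.search(query):
        return f"{query} (referring to {current_topic})"
    return query


resolved = resolve_pronouns("What are the prerequisites for it?", "RU202")
unchanged = resolve_pronouns("Do you have courses with labs?", "RU202")
print(resolved)
print(unchanged)
```

The first query gains the topic annotation, while the second passes through untouched because it contains no standalone pronoun.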

In [None]:
# Simple memory-enhanced agent that you can understand and build
class MemoryEnhancedRAGAgent(SimpleRAGAgent):
    """Enhanced RAG agent with simple but effective memory"""
    
    def __init__(self, course_manager: CourseManager):
        super().__init__(course_manager)
        # Simple memory storage (in production, use Redis or database)
        self.conversation_memory = {}  # Stores full conversation history
        self.student_preferences = {}  # Stores learned preferences
        self.conversation_topics = {}  # Tracks current conversation topics
    
    def store_conversation_topic(self, student_email: str, topic: str):
        """Remember what we're currently discussing"""
        self.conversation_topics[student_email] = topic
    
    def get_conversation_topic(self, student_email: str) -> str:
        """Get current conversation topic for reference resolution"""
        return self.conversation_topics.get(student_email, "")
    
    def store_preference(self, student_email: str, preference_type: str, preference_value: str):
        """Store student preferences for personalization"""
        if student_email not in self.student_preferences:
            self.student_preferences[student_email] = {}
        self.student_preferences[student_email][preference_type] = preference_value
    
    def get_preferences(self, student_email: str) -> Dict[str, str]:
        """Get stored student preferences"""
        return self.student_preferences.get(student_email, {})
    
    def resolve_references(self, query: str, student_email: str) -> str:
        """Resolve pronouns and references in the query"""
        current_topic = self.get_conversation_topic(student_email)
        preferences = self.get_preferences(student_email)
        
        # Simple reference resolution
        resolved_query = query
        
        # Resolve pronouns (match whole words so "with" does not count as "it")
        words = {w.strip('.,!?;:\'"()') for w in query.lower().split()}
        if current_topic and words & {'it', 'that', 'this'}:
            resolved_query = f"{query} (referring to {current_topic})"
        
        # Resolve preference references (case-insensitive match)
        phrase = 'my preferred format'
        idx = resolved_query.lower().find(phrase)
        if idx != -1 and 'format' in preferences:
            resolved_query = resolved_query[:idx] + preferences['format'] + resolved_query[idx + len(phrase):]
        
        return resolved_query
    
    def create_memory_enhanced_context(self, student: StudentProfile, query: str, courses: List[Course]) -> str:
        """Enhanced context engineering with memory insights"""
        
        # Get memory insights
        preferences = self.get_preferences(student.email)
        current_topic = self.get_conversation_topic(student.email)
        
        # Enhanced student context with memory
        student_context = f"""STUDENT PROFILE:
Name: {student.name}
Academic Status: {student.major}, Year {student.year}
Completed Courses: {', '.join(student.completed_courses) if student.completed_courses else 'None'}
Learning Interests: {', '.join(student.interests)}
Preferred Format: {student.preferred_format.value if student.preferred_format else 'Any'}"""
        
        # Add memory insights
        if preferences:
            student_context += f"\nLearned Preferences: {preferences}"
        
        if current_topic:
            student_context += f"\nCurrent Discussion Topic: {current_topic}"
        
        # Courses context
        courses_context = "RELEVANT COURSES:\n"
        for i, course in enumerate(courses, 1):
            courses_context += f"{i}. {course.course_code}: {course.title}\n"
        
        # Enhanced conversation history (more than SimpleRAGAgent)
        history_context = ""
        if student.email in self.conversation_history:
            history = self.conversation_history[student.email]
            if history:
                history_context = "\nRECENT CONVERSATION:\n"
                for msg in history[-4:]:  # Last 4 messages (vs 2 in SimpleRAGAgent)
                    history_context += f"User: {msg['user']}\nAssistant: {msg['assistant']}\n"
        
        return student_context + "\n\n" + courses_context + history_context
    
    async def chat_with_memory(self, student: StudentProfile, query: str) -> str:
        """Enhanced chat with memory and reference resolution"""
        
        # Step 1: Resolve references in the query
        resolved_query = self.resolve_references(query, student.email)
        
        # Step 2: Search for courses using resolved query
        relevant_courses = await self.search_courses(resolved_query, limit=3)
        
        # Step 3: Create memory-enhanced context
        context = self.create_memory_enhanced_context(student, resolved_query, relevant_courses)
        
        # Step 4: Get LLM response
        system_message = SystemMessage(content="""You are a helpful academic advisor for Redis University. 
Use the provided context about the student and relevant courses to give personalized advice.
Pay attention to the student's learned preferences and current discussion topic.
Be specific about course recommendations and explain why they're suitable for the student.""")
        
        human_message = HumanMessage(content=f"Context: {context}\n\nStudent Question: {resolved_query}")
        # Use the async API so the LLM call does not block the event loop
        response = await self.llm.ainvoke([system_message, human_message])
        
        # Step 5: Store conversation and extract insights
        self._store_conversation_and_insights(student, query, response.content)
        
        return response.content
    
    def _store_conversation_and_insights(self, student: StudentProfile, query: str, response: str):
        """Store conversation and extract simple insights"""
        
        # Store conversation (same as SimpleRAGAgent)
        if student.email not in self.conversation_history:
            self.conversation_history[student.email] = []
        
        self.conversation_history[student.email].append({
            "user": query,
            "assistant": response
        })
        
        # Extract conversation topic for reference resolution
        query_lower = query.lower()
        response_lower = response.lower()
        
        # Extract course mentions as current topic
        import re
        course_mentions = re.findall(r'ru\d+|cs\d+|ds\d+', query_lower + ' ' + response_lower)
        if course_mentions:
            self.store_conversation_topic(student.email, course_mentions[0].upper())
        
        # Extract preferences (a single query may state more than one)
        if 'prefer' in query_lower:
            if 'online' in query_lower:
                self.store_preference(student.email, 'format', 'online')
            if 'hands-on' in query_lower or 'practical' in query_lower:
                self.store_preference(student.email, 'learning_style', 'hands-on')

print("🧠 MemoryEnhancedRAGAgent created!")
print("New capabilities:")
print("• Reference resolution (it, that, this)")
print("• Preference learning and storage")
print("• Conversation topic tracking")
print("• Enhanced conversation history")

## Testing Your Memory-Enhanced RAG Agent

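Let's test the memory-enhanced agent over a multi-turn conversation. We'll demonstrate:

1. **Reference resolution** - Agent resolves "it", "that", and "this" against the current topic
2. **Preference learning** - Agent picks up stated preferences such as hands-on learning
3. **Topic tracking** - Agent remembers which course is under discussion
4. **Enhanced context** - Better responses using memory insights

Because this simplified agent keeps its memory in Python dictionaries, everything is lost on restart. Cross-session persistence and consolidation come from the reference agent's memory server.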

In [None]:
# Initialize the memory-enhanced RAG agent
import asyncio

async def test_memory_enhanced_agent():
    # Initialize components
    course_manager = CourseManager()
    # The simplified agent takes only a course manager (no Redis client)
    memory_agent = MemoryEnhancedRAGAgent(course_manager)
    
    # Create a test student
    sarah = StudentProfile(
        name='Sarah Chen',
        email='sarah.chen@university.edu',
        major='Computer Science',
        year=3,
        completed_courses=['RU101'],
        current_courses=[],
        interests=['machine learning', 'data science', 'python', 'AI'],
        preferred_format=CourseFormat.ONLINE,
        preferred_difficulty=DifficultyLevel.INTERMEDIATE,
        max_credits_per_semester=15
    )
    
    # Simulate a conversation sequence
    conversation_sequence = [
        "Hi! I'm interested in learning machine learning. What courses do you recommend?",
        "I prefer hands-on learning with practical projects. Do these courses have labs?",
        "What are the prerequisites for the advanced ML course?",
        "I'm also interested in data science. How does that relate to ML?",
        "Can you remind me what we discussed about machine learning courses?"
    ]
    
    # Test conversation with memory
    for i, query in enumerate(conversation_sequence, 1):
        print(f"\n--- Conversation Turn {i} ---")
        print(f"👤 Student: {query}")
        
        response = await memory_agent.chat_with_memory(sarah, query)
        print(f"🤖 Agent: {response[:150]}..." if len(response) > 150 else f"🤖 Agent: {response}")
        
        # Show what the agent has learned after each exchange
        preferences = memory_agent.get_preferences(sarah.email)
        topic = memory_agent.get_conversation_topic(sarah.email)
        if preferences or topic:
            print(f"💭 Memory: preferences={preferences}, topic={topic or 'none'}")
    
    return memory_agent, sarah

# Run the test
memory_agent, sarah = await test_memory_enhanced_agent()

## Memory Analysis: Before vs After

Let's analyze how memory enhancement improves our RAG agent's performance.

In [None]:
# Analyze memory capabilities
async def analyze_memory_benefits():
    # Show conversation history kept by the agent
    history = memory_agent.conversation_history.get(sarah.email, [])
    print(f"📚 Stored Conversations: {len(history)} exchanges")
    
    # Show learned preferences
    preferences = memory_agent.get_preferences(sarah.email)
    print(f"💡 Learned Preferences: {len(preferences)} stored")
    
    for preference_type, value in preferences.items():
        print(f"   • {preference_type}: {value}")
    
    # Show the tracked conversation topic
    topic = memory_agent.get_conversation_topic(sarah.email)
    print("\n🧠 Current Discussion Topic:")
    print(f"   {topic or 'None tracked yet'}")
    
    # Compare context sizes
    print(f"\n📊 Context Engineering Comparison:")
    
    # Simple RAG context
    simple_agent = SimpleRAGAgent(memory_agent.course_manager)
    courses = await simple_agent.search_courses('machine learning', limit=3)
    simple_context = simple_agent.create_context(sarah, 'What ML courses do you recommend?', courses)
    
    # Memory-enhanced context
    enhanced_context = memory_agent.create_memory_enhanced_context(sarah, 'What ML courses do you recommend?', courses)
    
    print(f"   Simple RAG Context: {count_tokens(simple_context)} tokens")
    print(f"   Memory-Enhanced Context: {count_tokens(enhanced_context)} tokens")
    print(f"   Memory Overhead: {count_tokens(enhanced_context) - count_tokens(simple_context)} tokens")

# Run the analysis
await analyze_memory_benefits()

## Key Benefits of Memory Enhancement

### ✨ Context Quality Improvements

- **✅ Conversation continuity** - Keeps the last four exchanges in context
- **✅ Reference resolution** - Resolves "it", "that", and "this" against the tracked topic
- **✅ Preference learning** - Stores stated preferences and adds them to the student context
- **✅ Measurable overhead** - The token comparison above shows what the extra memory lines cost

### 🚀 Performance Benefits

Backing the agent with the reference agent's memory server, or with Redis in place of the in-memory dictionaries, adds:

- **Persistent memory** across sessions and restarts
- **Intelligent consolidation** prevents context bloat
- **Efficient retrieval** of relevant past interactions
- **Scalable architecture** using Redis for memory persistence
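The in-memory dictionaries in `MemoryEnhancedRAGAgent` disappear when the process restarts. The sketch below shows one possible way to move preferences and history into Redis hashes and lists. `PersistentStudentMemory` and the `student:{email}:...` key layout are assumptions made for illustration, not the reference agent's schema. So that the example runs without a server, it uses a small stand-in (`InMemoryRedisStub`) that mimics the four redis-py commands involved (`hset`, `hgetall`, `rpush`, `lrange`).

```python
import json
from typing import Any, Dict, List


class InMemoryRedisStub:
    """Stand-in exposing the small subset of redis-py commands used below.

    Swap in `redis.Redis(decode_responses=True)` to persist to a real server.
    """

    def __init__(self) -> None:
        self._hashes: Dict[str, Dict[str, str]] = {}
        self._lists: Dict[str, List[str]] = {}

    def hset(self, name: str, key: str, value: str) -> int:
        bucket = self._hashes.setdefault(name, {})
        is_new = key not in bucket
        bucket[key] = value
        return int(is_new)

    def hgetall(self, name: str) -> Dict[str, str]:
        return dict(self._hashes.get(name, {}))

    def rpush(self, name: str, *values: str) -> int:
        items = self._lists.setdefault(name, [])
        items.extend(values)
        return len(items)

    def lrange(self, name: str, start: int, end: int) -> List[str]:
        items = self._lists.get(name, [])
        # Redis treats `end` as inclusive and -1 as "last element"
        stop = None if end == -1 else end + 1
        return items[start:stop]


class PersistentStudentMemory:
    """Hypothetical helper: stores preferences and history under per-student keys."""

    def __init__(self, client: Any) -> None:
        self.client = client

    def store_preference(self, email: str, kind: str, value: str) -> None:
        self.client.hset(f"student:{email}:preferences", kind, value)

    def get_preferences(self, email: str) -> Dict[str, str]:
        return self.client.hgetall(f"student:{email}:preferences")

    def append_exchange(self, email: str, user: str, assistant: str) -> None:
        payload = json.dumps({"user": user, "assistant": assistant})
        self.client.rpush(f"student:{email}:history", payload)

    def recent_exchanges(self, email: str, count: int = 4) -> List[Dict[str, str]]:
        raw = self.client.lrange(f"student:{email}:history", -count, -1)
        return [json.loads(item) for item in raw]


client = InMemoryRedisStub()
email = "sarah.chen@university.edu"

first_session = PersistentStudentMemory(client)
first_session.store_preference(email, "format", "online")
first_session.append_exchange(email, "Any ML courses?", "RU101 is a good start.")

# A new object sharing the same backing store sees the earlier data,
# which is what a restart against a real Redis server would look like.
second_session = PersistentStudentMemory(client)
print(second_session.get_preferences(email))
print(second_session.recent_exchanges(email))
```

Keeping one hash and one list per student means a restart needs only the student's email to recover both preferences and recent history.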

### 🎯 Next Steps

In **Section 4**, we'll enhance this memory-enabled agent with:
- **Multi-tool capabilities** for specialized academic advisor functions
- **Semantic tool selection** for intelligent routing
- **Memory-aware tool coordination** for complex queries

Your memory-enhanced RAG agent is now ready for the next level of sophistication!