# Agent Memory & Conversation Management

## Persistent Conversations and Knowledge Integration

**Module Duration:** 15 minutes | **Focus:** Memory patterns and knowledge retrieval

---

### Learning Objectives

Master RAG and memory patterns for production conversational AI:

- **RAG Architecture:** Retrieval-Augmented Generation with vector search
- **FAISS Integration:** Production-grade similarity search and indexing
- **Conversation Persistence:** SQLite-based session and turn management
- **Context Management:** Token-aware window management and compression
- **Memory-Enhanced Agents:** Combining knowledge retrieval with conversation history

**What You'll Build:**
- Vector similarity search system with FAISS
- Persistent conversation storage with SQLite
- Memory-enhanced agent with knowledge retrieval
- Production-ready memory management patterns

Together, these patterns (vector retrieval, durable session storage, and context assembly) make up the memory layer used in enterprise conversational AI systems.

In [None]:
# Vector Embeddings & Similarity Foundation
import numpy as np
from sentence_transformers import SentenceTransformer
import asyncio
from datetime import datetime
from typing import List, Tuple, Dict, Any
from dataclasses import dataclass
import logging

# Configure logging
logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)

print("🧠 AGENT MEMORY & CONVERSATION MANAGEMENT")
print("=" * 45)
print(f"Session: {datetime.now().strftime('%Y-%m-%d %H:%M:%S')}")
print("Focus: Vector embeddings and similarity search")
print()

# Initialize embedding model
embedding_model = SentenceTransformer('all-MiniLM-L6-v2')

def create_embedding(text: str) -> np.ndarray:
    """Create normalized embedding for text"""
    embedding = embedding_model.encode([text])[0]
    # Normalize for cosine similarity
    return embedding / np.linalg.norm(embedding)

def calculate_similarity(embedding1: np.ndarray, embedding2: np.ndarray) -> float:
    """Calculate cosine similarity between embeddings"""
    return float(np.dot(embedding1, embedding2))

# Test embedding functionality
print("🔍 Testing Vector Embeddings:")
test_texts = [
    "I love programming in Python",
    "Python is a great programming language", 
    "The weather is sunny today",
    "Machine learning uses algorithms to learn patterns"
]

embeddings = []
for text in test_texts:
    embedding = create_embedding(text)
    embeddings.append((text, embedding))
    print(f"   ✅ Created embedding for: '{text}' (dimension: {len(embedding)})")

# Test similarity
print("\n📊 Similarity Analysis:")
for i in range(len(embeddings)):
    for j in range(i+1, len(embeddings)):
        text1, emb1 = embeddings[i]
        text2, emb2 = embeddings[j]
        similarity = calculate_similarity(emb1, emb2)
        print(f"   '{text1[:30]}...' vs '{text2[:30]}...': {similarity:.3f}")

print("\n✅ Vector embedding system working:")
print("   Sentence transformer model loaded")
print(f"   Embedding dimension: {len(embeddings[0][1])}")
print("   Cosine similarity calculation ready")

INFO:sentence_transformers.SentenceTransformer:Use pytorch device_name: mps
INFO:sentence_transformers.SentenceTransformer:Load pretrained SentenceTransformer: all-MiniLM-L6-v2


🧠 AGENT MEMORY & CONVERSATION MANAGEMENT
Session: 2025-06-16 12:19:43
Focus: Vector embeddings and similarity search

🔍 Testing Vector Embeddings:
   ✅ Created embedding for: 'I love programming in Python' (dimension: 384)
   ✅ Created embedding for: 'Python is a great programming language' (dimension: 384)
   ✅ Created embedding for: 'The weather is sunny today' (dimension: 384)
   ✅ Created embedding for: 'Machine learning uses algorithms to learn patterns' (dimension: 384)

📊 Similarity Analysis:
   'I love programming in Python...' vs 'Python is a great programming ...': 0.868
   'I love programming in Python...' vs 'The weather is sunny today...': 0.029
   'I love programming in Python...' vs 'Machine learning uses algorith...': 0.227
   'Python is a great programming ...' vs 'The weather is sunny today...': 0.036
   'Python is a great programming ...' vs 'Machine learning uses algorith...': 0.167
   'The weather is sunny today...' vs 'Machine learning uses algorith...': 0.019

✅ Vector embedding system working:
   Sentence transformer model loaded
   Embedding dimension: 384
   Cosine similarity calculation ready


### FAISS Integration for Fast Similarity Search

FAISS (Facebook AI Similarity Search) enables efficient similarity search across large collections of vectors:

**Key Benefits:**
- **Speed:** Optimized for fast nearest neighbor search
- **Scalability:** Handles millions of vectors efficiently
- **Flexibility:** Multiple index types for different use cases
- **Memory Efficiency:** Optimized storage and retrieval
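
To make the inner-product idea concrete, here is a minimal NumPy-only sketch of the exact (brute-force) search that a flat inner-product index performs. When vectors are scaled to unit length, the inner product equals cosine similarity. The helper names `normalize` and `brute_force_search` and the toy 3-dimensional vectors are illustrative assumptions, not part of FAISS.

```python
import numpy as np

def normalize(vectors: np.ndarray) -> np.ndarray:
    """Scale each row to unit length so inner product equals cosine similarity."""
    norms = np.linalg.norm(vectors, axis=1, keepdims=True)
    return vectors / norms

def brute_force_search(index_vectors: np.ndarray, query: np.ndarray, top_k: int):
    """Return (scores, indices) of the top_k rows by inner product, best first."""
    scores = index_vectors @ query
    order = np.argsort(-scores)[:top_k]
    return scores[order], order

# Toy "embeddings" (hypothetical values, for illustration only)
stored = normalize(np.array([
    [1.0, 0.0, 0.0],
    [0.6, 0.8, 0.0],
    [0.0, 0.0, 1.0],
], dtype=np.float32))
query = normalize(np.array([[1.0, 0.05, 0.0]], dtype=np.float32))[0]

scores, indices = brute_force_search(stored, query, top_k=2)
print(indices.tolist())  # row 0 ranks first, then row 1
```

Brute-force search compares the query against every stored vector, which is fine for a few thousand entries. FAISS offers approximate index types for larger collections, at the cost of some recall.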

In [None]:
# FAISS Integration for Vector Search
import faiss
import uuid
import json



@dataclass
class MemoryEntry:
    """Structured memory entry with metadata"""
    entry_id: str
    content: str
    embedding: np.ndarray
    metadata: Dict[str, Any]
    created_at: str

class VectorMemorySystem:
    """Production vector memory with FAISS"""
    
    def __init__(self, dimension: int = 384):
        self.dimension = dimension
        self.index = faiss.IndexFlatIP(dimension)  # Inner product for cosine similarity
        self.entries = []
        self.entry_map = {}  # Maps FAISS index to entry_id
        
    def add_memory(self, content: str, metadata: Dict[str, Any] = None) -> str:
        """Add content to vector memory"""
        if metadata is None:
            metadata = {}
            
        entry_id = str(uuid.uuid4())
        embedding = create_embedding(content)
        
        # Create memory entry
        entry = MemoryEntry(
            entry_id=entry_id,
            content=content,
            embedding=embedding,
            metadata=metadata,
            created_at=datetime.now().isoformat()
        )
        
        # Add to FAISS index
        faiss_index = len(self.entries)
        self.index.add(embedding.reshape(1, -1))
        self.entries.append(entry)
        self.entry_map[faiss_index] = entry_id
        
        logger.info(f"Added memory: {entry_id[:8]} - '{content[:50]}...'")
        return entry_id
    
    def search_memory(self, query: str, top_k: int = 5, threshold: float = 0.3) -> List[Tuple[MemoryEntry, float]]:
        """Search memory using vector similarity"""
        if len(self.entries) == 0:
            return []
        
        query_embedding = create_embedding(query)
        scores, indices = self.index.search(query_embedding.reshape(1, -1), min(top_k, len(self.entries)))
        
        results = []
        for score, idx in zip(scores[0], indices[0]):
            if score >= threshold:
                entry = self.entries[idx]
                results.append((entry, float(score)))
        
        logger.info(f"Memory search for '{query[:30]}...' found {len(results)} matches")
        return results
    
    def get_stats(self) -> Dict[str, Any]:
        """Get memory statistics"""
        return {
            "total_entries": len(self.entries),
            "index_size": self.index.ntotal,
            "dimension": self.dimension
        }

# Initialize vector memory
vector_memory = VectorMemorySystem()

# Test with knowledge entries
print("\n🔧 Testing FAISS Vector Memory:")
knowledge_base = [
    ("Python is a programming language known for simplicity and readability", {"topic": "programming"}),
    ("Machine learning enables computers to learn from data automatically", {"topic": "ai"}),
    ("FAISS provides efficient similarity search for large vector collections", {"topic": "search"}),
    ("Vector databases store high-dimensional embeddings for semantic search", {"topic": "databases"}),
    ("Natural language processing helps computers understand human language", {"topic": "nlp"})
]

for content, metadata in knowledge_base:
    entry_id = vector_memory.add_memory(content, metadata)

# Test search functionality
print("\n🔍 Testing Vector Search:")
test_queries = [
    "programming languages",
    "artificial intelligence and learning", 
    "search algorithms"
]

for query in test_queries:
    print(f"\n   Query: '{query}'")
    results = vector_memory.search_memory(query, top_k=2)
    for entry, score in results:
        print(f"   ✅ Match (score: {score:.3f}): {entry.content}")

print("\n✅ FAISS vector memory system ready:")
stats = vector_memory.get_stats()
print(f"   Total entries: {stats['total_entries']}")
print(f"   Vector dimension: {stats['dimension']}")
print(f"   FAISS index size: {stats['index_size']}")


🔧 Testing FAISS Vector Memory:
INFO:__main__:Added memory: 24d502ea - 'Python is a programming language known for simplic...'
INFO:__main__:Added memory: e1490352 - 'Machine learning enables computers to learn from d...'
INFO:__main__:Added memory: 717c6814 - 'FAISS provides efficient similarity search for lar...'
INFO:__main__:Added memory: 21a32a20 - 'Vector databases store high-dimensional embeddings...'
INFO:__main__:Added memory: f6e2d78b - 'Natural language processing helps computers unders...'


🔍 Testing Vector Search:

   Query: 'programming languages'
INFO:__main__:Memory search for 'programming languages...' found 2 matches
   ✅ Match (score: 0.579): Python is a programming language known for simplicity and readability
   ✅ Match (score: 0.401): Natural language processing helps computers understand human language

   Query: 'artificial intelligence and learning'
INFO:__main__:Memory search for 'artificial intelligence and le...' found 2 matches
   ✅ Match (score: 0.522): Machine learning enables computers to learn from data automatically
   ✅ Match (score: 0.359): Natural language processing helps computers understand human language

   Query: 'search algorithms'
INFO:__main__:Memory search for 'search algorithms...' found 2 matches
   ✅ Match (score: 0.457): FAISS provides efficient similarity search for large vector collections
   ✅ Match (score: 0.401): Vector databases store high-dimensional embeddings for semantic search

✅ FAISS vector memory system ready:
   Total entries: 5
   Vector dimension: 384
   FAISS index size: 5


### Conversation Persistence with SQLite

For production memory systems, we need persistent storage that survives agent restarts:

**Storage Requirements:**
- **Session Management:** Track individual conversation sessions
- **Turn Storage:** Store each user-agent interaction
- **Metadata Tracking:** Context and knowledge retrieval logging
- **Query Performance:** Fast retrieval of conversation history
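
The schema in the next cell declares a foreign key and retrieves history by `session_id` ordered by `timestamp`. As a hedged sketch of how those two requirements could be supported, the example below uses an in-memory database with a trimmed-down version of the tables. It turns on foreign-key enforcement, which SQLite leaves off by default for each connection, and adds a composite index. The index name `idx_turns_session_time` is an assumption for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # in-memory database, for illustration only
conn.execute("PRAGMA foreign_keys = ON")  # enforcement is off by default per connection

conn.execute("""
    CREATE TABLE sessions (
        session_id TEXT PRIMARY KEY,
        user_id TEXT NOT NULL
    )
""")
conn.execute("""
    CREATE TABLE turns (
        turn_id TEXT PRIMARY KEY,
        session_id TEXT NOT NULL,
        timestamp TEXT NOT NULL,
        user_message TEXT NOT NULL,
        FOREIGN KEY (session_id) REFERENCES sessions (session_id)
    )
""")
# Hypothetical index name; supports "WHERE session_id = ? ORDER BY timestamp"
conn.execute("CREATE INDEX idx_turns_session_time ON turns (session_id, timestamp)")

conn.execute("INSERT INTO sessions VALUES ('s1', 'demo_user')")
conn.execute("INSERT INTO turns VALUES ('t1', 's1', '2025-01-01T00:00:00', 'hello')")

# A turn that points at a missing session is rejected once enforcement is on
try:
    conn.execute("INSERT INTO turns VALUES ('t2', 'missing', '2025-01-01T00:00:01', 'hi')")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

# Ask SQLite how it would run the history query
plan = conn.execute(
    "EXPLAIN QUERY PLAN SELECT turn_id FROM turns "
    "WHERE session_id = ? ORDER BY timestamp DESC LIMIT 5", ("s1",)
).fetchall()
uses_index = any("idx_turns_session_time" in row[-1] for row in plan)
print(orphan_rejected, uses_index)
```

Without the index, SQLite would scan and sort the whole `turns` table for every history lookup, which gets slower as conversations accumulate.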

In [None]:
# Conversation Persistence Manager
import sqlite3
from pathlib import Path

@dataclass
class ConversationTurn:
    """Individual conversation turn with metadata"""
    turn_id: str
    session_id: str
    timestamp: str
    user_message: str
    agent_response: str
    knowledge_used: List[str]

class ConversationManager:
    """Manages conversation persistence with SQLite"""
    
    def __init__(self, db_path: str = "agent_memory.db"):
        self.db_path = db_path
        self.init_database()
        
    def init_database(self):
        """Initialize SQLite database"""
        conn = sqlite3.connect(self.db_path)
        cursor = conn.cursor()
        
        # Sessions table
        cursor.execute('''
            CREATE TABLE IF NOT EXISTS sessions (
                session_id TEXT PRIMARY KEY,
                user_id TEXT NOT NULL,
                started_at TEXT NOT NULL,
                last_active TEXT NOT NULL,
                total_turns INTEGER DEFAULT 0
            )
        ''')
        
        # Conversation turns table
        cursor.execute('''
            CREATE TABLE IF NOT EXISTS turns (
                turn_id TEXT PRIMARY KEY,
                session_id TEXT NOT NULL,
                timestamp TEXT NOT NULL,
                user_message TEXT NOT NULL,
                agent_response TEXT NOT NULL,
                knowledge_used TEXT,
                FOREIGN KEY (session_id) REFERENCES sessions (session_id)
            )
        ''')
        
        conn.commit()
        conn.close()
        logger.info(f"Database initialized: {self.db_path}")
    
    def create_session(self, user_id: str) -> str:
        """Create new conversation session"""
        session_id = str(uuid.uuid4())
        timestamp = datetime.now().isoformat()
        
        conn = sqlite3.connect(self.db_path)
        cursor = conn.cursor()
        
        cursor.execute('''
            INSERT INTO sessions (session_id, user_id, started_at, last_active, total_turns)
            VALUES (?, ?, ?, ?, ?)
        ''', (session_id, user_id, timestamp, timestamp, 0))
        
        conn.commit()
        conn.close()
        
        logger.info(f"Created session: {session_id[:8]} for user {user_id}")
        return session_id
    
    def save_turn(self, session_id: str, user_message: str, agent_response: str, 
                  knowledge_used: List[str] = None) -> str:
        """Save conversation turn"""
        turn_id = str(uuid.uuid4())
        timestamp = datetime.now().isoformat()
        
        if knowledge_used is None:
            knowledge_used = []
        
        conn = sqlite3.connect(self.db_path)
        cursor = conn.cursor()
        
        # Save turn
        cursor.execute('''
            INSERT INTO turns (turn_id, session_id, timestamp, user_message, agent_response, knowledge_used)
            VALUES (?, ?, ?, ?, ?, ?)
        ''', (turn_id, session_id, timestamp, user_message, agent_response, json.dumps(knowledge_used)))
        
        # Update session
        cursor.execute('''
            UPDATE sessions SET last_active = ?, total_turns = total_turns + 1
            WHERE session_id = ?
        ''', (timestamp, session_id))
        
        conn.commit()
        conn.close()
        
        logger.info(f"Saved turn: {turn_id[:8]} for session {session_id[:8]}")
        return turn_id
    
    def get_conversation_history(self, session_id: str, last_n: int = 5) -> List[ConversationTurn]:
        """Get recent conversation history"""
        conn = sqlite3.connect(self.db_path)
        cursor = conn.cursor()
        
        cursor.execute('''
            SELECT turn_id, session_id, timestamp, user_message, agent_response, knowledge_used
            FROM turns WHERE session_id = ?
            ORDER BY timestamp DESC LIMIT ?
        ''', (session_id, last_n))
        
        turns = []
        for row in cursor.fetchall():
            turn = ConversationTurn(
                turn_id=row[0],
                session_id=row[1], 
                timestamp=row[2],
                user_message=row[3],
                agent_response=row[4],
                knowledge_used=json.loads(row[5]) if row[5] else []
            )
            turns.append(turn)
        
        conn.close()
        turns.reverse()  # Chronological order
        
        logger.info(f"Retrieved {len(turns)} turns for session {session_id[:8]}")
        return turns

# Initialize conversation manager
conversation_manager = ConversationManager()

# Test conversation persistence
print("\n💾 Testing Conversation Persistence:")
test_session = conversation_manager.create_session("demo_user")

# Save some test conversations
test_conversations = [
    ("Hello, I'm learning about AI", "Hi! I'd be happy to help you learn about AI. What specifically interests you?"),
    ("What is machine learning?", "Machine learning is a subset of AI that enables computers to learn from data without explicit programming."),
    ("How does it relate to neural networks?", "Neural networks are one approach to machine learning, inspired by how biological brains process information.")
]

for user_msg, agent_resp in test_conversations:
    turn_id = conversation_manager.save_turn(test_session, user_msg, agent_resp, ["ai_knowledge"])

# Test retrieval
print("\n📚 Testing History Retrieval:")
history = conversation_manager.get_conversation_history(test_session, last_n=3)
for i, turn in enumerate(history, 1):
    print(f"   Turn {i}:")
    print(f"     User: {turn.user_message}")
    print(f"     Agent: {turn.agent_response}")

print("\n✅ Conversation persistence ready:")
print(f"   SQLite database: {conversation_manager.db_path}")
print("   Session tracking with turn history")
print("   Knowledge usage logging")

INFO:__main__:Database initialized: agent_memory.db
INFO:__main__:Created session: f677dcc9 for user demo_user
INFO:__main__:Saved turn: 7dde1afc for session f677dcc9
INFO:__main__:Saved turn: 252f8ca2 for session f677dcc9
INFO:__main__:Saved turn: 3091fef5 for session f677dcc9
INFO:__main__:Retrieved 3 turns for session f677dcc9



💾 Testing Conversation Persistence:

📚 Testing History Retrieval:
   Turn 1:
     User: Hello, I'm learning about AI
     Agent: Hi! I'd be happy to help you learn about AI. What specifically interests you?
   Turn 2:
     User: What is machine learning?
     Agent: Machine learning is a subset of AI that enables computers to learn from data without explicit programming.
   Turn 3:
     User: How does it relate to neural networks?
     Agent: Neural networks are one approach to machine learning, inspired by how biological brains process information.

✅ Conversation persistence ready:
   SQLite database: agent_memory.db
   Session tracking with turn history
   Knowledge usage logging


### Memory-Enhanced Agent Integration

Now we combine vector memory and conversation persistence with agents built on Google's Agent Development Kit (ADK):

**Integration Features:**
- **Context Building:** Combine conversation history with relevant knowledge
- **Smart Retrieval:** Search knowledge based on current conversation
- **Response Enhancement:** Use retrieved context to improve responses
- **Memory Tracking:** Log what knowledge and context influenced each response
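
Context building has to fit within the model's context window. The sketch below shows one possible way to assemble the prompt under a token budget, dropping the oldest turns first. It is not the notebook's implementation: `estimate_tokens` uses a rough 4-characters-per-token heuristic (an assumption; a real tokenizer would be more accurate), and `build_context` and `max_tokens` are hypothetical names.

```python
from typing import List, Tuple

def estimate_tokens(text: str) -> int:
    """Crude token estimate: roughly 4 characters per token (an assumption)."""
    return max(1, len(text) // 4)

def build_context(knowledge: List[str], history: List[Tuple[str, str]],
                  user_message: str, max_tokens: int = 200) -> str:
    """Assemble a prompt, dropping the oldest turns first when over budget."""
    tail = f"\nCurrent User Message: {user_message}"
    budget = max_tokens - estimate_tokens(tail)

    knowledge_lines: List[str] = []
    if knowledge:
        knowledge_lines = ["Relevant Knowledge:"] + [f"- {k}" for k in knowledge]
        budget -= estimate_tokens("\n".join(knowledge_lines))

    # Walk history from newest to oldest, keeping turns while they still fit
    kept: List[str] = []
    for user_msg, agent_msg in reversed(history):
        turn_text = f"User: {user_msg}\nAssistant: {agent_msg}"
        cost = estimate_tokens(turn_text)
        if cost > budget:
            break
        kept.append(turn_text)
        budget -= cost
    kept.reverse()  # restore chronological order

    parts = list(knowledge_lines)
    if kept:
        parts += ["\nRecent Conversation:"] + kept
    return "\n".join(parts + [tail])

# Usage: the long, old turn is dropped; the recent one is kept
history = [
    ("old question " * 50, "old answer"),
    ("What is FAISS?", "A similarity search library."),
]
context = build_context(
    ["FAISS provides efficient similarity search"],
    history,
    "How fast is it?",
    max_tokens=60,
)
print(context)
```

The budget here is approximate: section headers are not counted, and the knowledge block is always included. A stricter version might also trim or rank knowledge snippets.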

In [12]:
# Memory-Enhanced Agent with ADK (Fixed Session Management)
from google.adk.agents import Agent
from google.adk.runners import Runner
from google.adk.sessions import InMemorySessionService
from google.adk.models.lite_llm import LiteLlm
from google.genai import types

class MemoryAgent:
    """Agent with vector memory and conversation persistence"""
    
    def __init__(self):
        self.vector_memory = vector_memory
        self.conversation_manager = conversation_manager
        self.current_session = None
        self.adk_session_id = None  # Separate ADK session ID
        
    async def setup(self, user_id: str = "demo_user"):
        """Initialize memory-enhanced agent"""
        # Create conversation session
        self.current_session = self.conversation_manager.create_session(user_id)
        
        # Create separate ADK session ID
        self.adk_session_id = f"adk_{self.current_session}"
        
        # Setup ADK agent
        model = LiteLlm(model="ollama_chat/llama3.2:latest")
        
        self.agent = Agent(
            name="MemoryAgent",
            model=model,
            instruction="""You are an intelligent agent with persistent memory capabilities.

You have access to:
- A knowledge base with relevant information
- Previous conversation history with this user
- Context from past interactions

When responding:
1. Use relevant knowledge from your memory when helpful
2. Reference previous conversations naturally
3. Build on past context to provide personalized responses
4. Be conversational and remember what you've discussed

Always be helpful and maintain conversation continuity."""
        )
        
        self.session_service = InMemorySessionService()
        self.runner = Runner(
            agent=self.agent,
            app_name="memory_agent", 
            session_service=self.session_service
        )
        
        # Create ADK session with the correct ID
        await self.session_service.create_session(
            app_name="memory_agent",
            user_id=user_id,
            session_id=self.adk_session_id
        )
        
        logger.info(f"Memory agent ready - Conversation: {self.current_session[:8]}, ADK: {self.adk_session_id[:8]}")
    
    async def chat(self, user_message: str) -> str:
        """Chat with memory-enhanced responses"""
        # Search relevant knowledge
        knowledge_results = self.vector_memory.search_memory(user_message, top_k=3, threshold=0.2)
        relevant_knowledge = [entry.content for entry, score in knowledge_results]
        
        # Get conversation history
        history = self.conversation_manager.get_conversation_history(self.current_session, last_n=3)
        
        # Build context
        context_parts = []
        
        if relevant_knowledge:
            context_parts.append("Relevant Knowledge:")
            for knowledge in relevant_knowledge:
                context_parts.append(f"- {knowledge}")
        
        if history:
            context_parts.append("\nRecent Conversation:")
            for turn in history:
                context_parts.append(f"User: {turn.user_message}")
                context_parts.append(f"Assistant: {turn.agent_response}")
        
        context_parts.append(f"\nCurrent User Message: {user_message}")
        
        enhanced_message = "\n".join(context_parts)
        
        # Send to agent using the correct ADK session ID
        message = types.Content(role="user", parts=[types.Part(text=enhanced_message)])
        
        response = ""
        async for event in self.runner.run_async(
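            user_id="memory_user",  # NOTE: differs from the user_id passed to setup(); corrected when chat() is redefined in the demonstration cell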
            session_id=self.adk_session_id,  # Use ADK session ID here
            new_message=message
        ):
            if event.is_final_response():
                response = event.content.parts[0].text
                break
        
        # Save conversation using conversation session ID
        knowledge_used = [f"{entry.content[:50]}..." for entry, _ in knowledge_results]
        self.conversation_manager.save_turn(self.current_session, user_message, response, knowledge_used)
        
        return response

# Initialize memory agent
memory_agent = MemoryAgent()
await memory_agent.setup("demo_user")

print("✅ Memory-enhanced agent ready:")
print("   Vector knowledge retrieval integrated")
print("   Conversation history tracking")
print("   Context-aware response generation")
print("   Session management fixed")

INFO:__main__:Created session: ebe381a2 for user demo_user
INFO:__main__:Memory agent ready - Conversation: ebe381a2, ADK: adk_ebe3


✅ Memory-enhanced agent ready:
   Vector knowledge retrieval integrated
   Conversation history tracking
   Context-aware response generation
   Session management fixed


### Memory System Demonstration

Let's test the complete memory system with realistic conversations:

**Test Scenarios:**
- **Knowledge Integration:** Agent uses vector search to find relevant information
- **Conversation Continuity:** Agent remembers previous discussion points
- **Context Building:** Combines knowledge and history for better responses
- **Memory Tracking:** Logs what information influenced each response
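
Memory tracking can be checked after the fact by reading the `knowledge_used` column back from the `turns` table, where it is stored as a JSON-encoded list. The sketch below uses an in-memory database as a stand-in for `agent_memory.db`. The `knowledge_audit` helper is a hypothetical name, and the sample row is made-up illustration data.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")  # stand-in for agent_memory.db
conn.execute("""
    CREATE TABLE turns (
        turn_id TEXT PRIMARY KEY,
        session_id TEXT NOT NULL,
        timestamp TEXT NOT NULL,
        user_message TEXT NOT NULL,
        agent_response TEXT NOT NULL,
        knowledge_used TEXT
    )
""")
# Sample row (illustrative data only)
conn.execute(
    "INSERT INTO turns VALUES (?, ?, ?, ?, ?, ?)",
    ("t1", "s1", "2025-01-01T00:00:00", "What is FAISS?", "A search library.",
     json.dumps(["FAISS provides efficient similarity search..."])),
)

def knowledge_audit(conn: sqlite3.Connection, session_id: str):
    """Return (user_message, knowledge snippets) per turn, oldest first."""
    rows = conn.execute(
        "SELECT user_message, knowledge_used FROM turns "
        "WHERE session_id = ? ORDER BY timestamp", (session_id,)
    ).fetchall()
    # knowledge_used is stored as JSON text; decode it back into a list
    return [(msg, json.loads(raw) if raw else []) for msg, raw in rows]

audit = knowledge_audit(conn, "s1")
for message, snippets in audit:
    print(message, "->", snippets)
```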

In [None]:
# Complete Memory System Demonstration

# Redefine chat() so that run_async() receives the same user_id that setup() registered ("demo_user")
async def chat(self, user_message: str) -> str:
    """Chat with memory-enhanced responses"""
    # Search relevant knowledge
    knowledge_results = self.vector_memory.search_memory(user_message, top_k=3, threshold=0.2)
    relevant_knowledge = [entry.content for entry, score in knowledge_results]
    
    # Get conversation history
    history = self.conversation_manager.get_conversation_history(self.current_session, last_n=3)
    
    # Build context
    context_parts = []
    
    if relevant_knowledge:
        context_parts.append("Relevant Knowledge:")
        for knowledge in relevant_knowledge:
            context_parts.append(f"- {knowledge}")
    
    if history:
        context_parts.append("\nRecent Conversation:")
        for turn in history:
            context_parts.append(f"User: {turn.user_message}")
            context_parts.append(f"Assistant: {turn.agent_response}")
    
    context_parts.append(f"\nCurrent User Message: {user_message}")
    
    enhanced_message = "\n".join(context_parts)
    
    # Send to agent using the correct user_id
    message = types.Content(role="user", parts=[types.Part(text=enhanced_message)])
    
    response = ""
    async for event in self.runner.run_async(
        user_id="demo_user",  # Fixed: use same user_id as setup
        session_id=self.adk_session_id,
        new_message=message
    ):
        if event.is_final_response():
            response = event.content.parts[0].text
            break
    
    # Save conversation
    knowledge_used = [f"{entry.content[:50]}..." for entry, _ in knowledge_results]
    self.conversation_manager.save_turn(self.current_session, user_message, response, knowledge_used)
    
    return response

# Rebind the method on the class so the existing memory_agent instance picks up the new version
MemoryAgent.chat = chat

async def demonstrate_memory_system():
    """Test memory-enhanced conversations"""
    
    print("🧪 MEMORY SYSTEM DEMONSTRATION")
    print("=" * 35)
    
    conversations = [
        "Hi, I'm interested in learning about Python programming",
        "What makes Python good for machine learning?", 
        "Can you explain how vector search works?",
        "Earlier you mentioned Python - what did you say about its advantages?",
        "How do vector databases help with the machine learning concepts we discussed?"
    ]
    
    print("\n💬 Memory-Enhanced Conversations:")
    
    for i, user_message in enumerate(conversations, 1):
        print(f"\n--- Turn {i} ---")
        print(f"👤 User: {user_message}")
        
        response = await memory_agent.chat(user_message)
        print(f"🤖 Agent: {response}")
        
        await asyncio.sleep(0.5)

# Run demonstration
await demonstrate_memory_system()

print("\n📊 MEMORY SYSTEM ANALYSIS")
print("=" * 30)

# Memory statistics
memory_stats = vector_memory.get_stats()
print("\n🧠 Vector Memory:")
print(f"   Knowledge Entries: {memory_stats['total_entries']}")
print(f"   Vector Dimension: {memory_stats['dimension']}")
print(f"   FAISS Index Size: {memory_stats['index_size']}")

# Test knowledge retrieval
print("\n🔍 Knowledge Retrieval Test:")
test_query = "programming and artificial intelligence"
results = vector_memory.search_memory(test_query, top_k=3)
print(f"   Query: '{test_query}'")
for entry, score in results:
    print(f"   ✅ {score:.3f}: {entry.content[:60]}...")

# Conversation history
history = conversation_manager.get_conversation_history(memory_agent.current_session)
print("\n💬 Conversation History:")
print(f"   Total turns in session: {len(history)}")
print(f"   Session ID: {memory_agent.current_session[:8]}")

print("\n✅ MEMORY SYSTEM DEMONSTRATION COMPLETE:")
print("   ✅ Vector Knowledge Base: Semantic search with FAISS")
print("   ✅ Persistent Conversations: SQLite storage with history")
print("   ✅ Context Integration: Knowledge + history in responses")
print("   ✅ Memory Tracking: What knowledge influenced each response")
print("   ✅ Production Ready: Scalable architecture with enterprise patterns")

🧪 MEMORY SYSTEM DEMONSTRATION

💬 Memory-Enhanced Conversations:

--- Turn 1 ---
👤 User: Hi, I'm interested in learning about Python programming


INFO:__main__:Memory search for 'Hi, I'm interested in learning...' found 2 matches
INFO:__main__:Retrieved 0 turns for session ebe381a2
12:39:38 - LiteLLM:INFO: utils.py:3119 - LiteLLM completion() model= llama3.2:latest; provider = ollama_chat
INFO:httpx:HTTP Request: POST http://localhost:11434/api/show "HTTP/1.1 200 OK"
INFO:httpx:HTTP Request: POST http://localhost:11434/api/chat "HTTP/1.1 200 OK"
12:40:07 - LiteLLM:INFO: cost_calculator.py:655 - selected model name for cost calculation: ollama_chat/llama3.2:latest
INFO:httpx:HTTP Request: POST http://localhost:11434/api/show "HTTP/1.1 200 OK"
INFO:__main__:Saved turn: b03306f6 for session ebe381a2
  PydanticSerializationUnexpectedValue(Expected 9 fields but got 5: Expected `Message` - serialized value may not be as expected [input_value=Message(cont

🤖 Agent: Hello! Welcome to our conversation about Python programming. It's great that you're interested in learning more about it.

Python is indeed an excellent choice for beginners and experienced programmers alike due to its simplicity, readability, and versatility. One of the key features of Python is its syntax, which is designed to be easy to understand and write. This makes it a fantastic language for rapid prototyping and development.

I'd like to know more about your goals with learning Python. Are you looking to get started with programming in general, or do you have a specific project or area of interest (e.g., data science, web development) that you'd like to explore?

Also, since we're just getting started, I'll mention that natural language processing can play a significant role in automating tasks, such as text analysis and machine learning. But for now, let's focus on the basics of Python programming.

What do you think? Would you like to start with some basic tutorials

INFO:__main__:Memory search for 'What makes Python good for mac...' found 3 matches
INFO:__main__:Retrieved 1 turns for session ebe381a2
12:40:09 - LiteLLM:INFO: utils.py:3119 - LiteLLM completion() model= llama3.2:latest; provider = ollama_chat
INFO:httpx:HTTP Request: POST http://localhost:11434/api/show "HTTP/1.1 200 OK"
INFO:httpx:HTTP Request: POST http://localhost:11434/api/chat "HTTP/1.1 200 OK"
12:40:42 - LiteLLM:INFO: cost_calculator.py:655 - selected model name for cost calculation: ollama_chat/llama3.2:latest
INFO:httpx:HTTP Request: POST http://localhost:11434/api/show "HTTP/1.1 200 OK"
INFO:__main__:Saved turn: 28853cc4 for session ebe381a2
  PydanticSerializationUnexpectedValue(Expected 9 fields but got 5: Expected `Message` - serialized value may not be as expected [input_value=Message(cont

🤖 Agent: A natural follow-up question! You're interested in exploring how Python can be used for machine learning, which is an exciting field that's all about enabling computers to learn from data automatically.

Python is an excellent choice for machine learning because of its simplicity and readability, making it easy for developers to write and implement machine learning algorithms. Additionally, Python has a vast array of libraries and frameworks that make machine learning more accessible, such as scikit-learn, TensorFlow, and Keras.

One of the key reasons Python is well-suited for machine learning is its ability to handle large datasets efficiently. With libraries like Pandas and NumPy, you can easily manipulate and analyze data in Python, which makes it a great choice for tasks like data preprocessing, feature engineering, and model training.

Another advantage of using Python for machine learning is its flexibility. You can use it to develop both supervised and unsupervised lea

Batches:   0%|          | 0/1 [00:00<?, ?it/s]

INFO:__main__:Memory search for 'Can you explain how vector sea...' found 3 matches
INFO:__main__:Retrieved 2 turns for session ebe381a2
INFO:LiteLLM:
LiteLLM completion() model= llama3.2:latest; provider = ollama_chat
INFO:httpx:HTTP Request: POST http://localhost:11434/api/show "HTTP/1.1 200 OK"
INFO:httpx:HTTP Request: POST http://localhost:11434/api/chat "HTTP/1.1 200 OK"
INFO:LiteLLM:selected model name for cost calculation: ollama_chat/llama3.2:latest
INFO:httpx:HTTP Request: POST http://localhost:11434/api/show "HTTP/1.1 200 OK"
INFO:__main__:Saved turn: ad033332 for session ebe381a2
  PydanticSerializationUnexpectedValue(Expected 9 fields but got 5: Expected `Message` - serialized value may not be as expected [input_value=Message(cont

🤖 Agent: Vector search! That's a fantastic topic, especially for those interested in data science and natural language processing. Vector search is a technique used to find similar items within a large dataset by representing each item as a dense vector.

In the context of natural language processing (NLP), vector search is often applied to text documents or entities in a database. These vectors are usually high-dimensional, meaning they have many features that describe the document or entity, such as word frequencies, sentiment scores, and semantic roles.

The goal of vector search is to find similar items in the dataset by computing the cosine similarity between all pairs of vectors. This allows us to identify documents or entities that share similar characteristics, such as topics, themes, or even emotional tone.

One popular library for efficient vector search is FAISS (Facebook AI Similarity Search). FAISS provides a fast and scalable way to compute similarities between vectors, m

INFO:__main__:Memory search for 'Earlier you mentioned Python -...' found 2 matches
INFO:__main__:Retrieved 3 turns for session ebe381a2
INFO:LiteLLM:
LiteLLM completion() model= llama3.2:latest; provider = ollama_chat
INFO:httpx:HTTP Request: POST http://localhost:11434/api/show "HTTP/1.1 200 OK"
INFO:httpx:HTTP Request: POST http://localhost:11434/api/chat "HTTP/1.1 200 OK"
INFO:LiteLLM:selected model name for cost calculation: ollama_chat/llama3.2:latest
INFO:httpx:HTTP Request: POST http://localhost:11434/api/show "HTTP/1.1 200 OK"
INFO:__main__:Saved turn: aa6bde0a for session ebe381a2
  PydanticSerializationUnexpectedValue(Expected 9 fields but got 5: Expected `Message` - serialized value may not be as expected [input_value=Message(cont

🤖 Agent: You're referring back to our earlier conversation about the advantages of using Python for machine learning and other tasks. I mentioned that Python is an excellent choice due to its simplicity, readability, and versatility.

One of the key features of Python is its syntax, which is designed to be easy to understand and write. This makes it a fantastic language for rapid prototyping and development. Additionally, Python has a vast array of libraries and frameworks that make many tasks more accessible, such as scikit-learn, TensorFlow, and Keras.

I also mentioned that Python's simplicity and readability allow developers to focus on writing code rather than wrestling with complex syntax or semantics. This makes it an ideal choice for beginners and experienced programmers alike, regardless of their background or experience level.

Moreover, Python's versatility is another significant advantage. It can be used for a wide range of applications, from web development and data scienc

INFO:__main__:Memory search for 'How do vector databases help w...' found 3 matches
INFO:__main__:Retrieved 3 turns for session ebe381a2
INFO:LiteLLM:
LiteLLM completion() model= llama3.2:latest; provider = ollama_chat
INFO:httpx:HTTP Request: POST http://localhost:11434/api/show "HTTP/1.1 200 OK"
INFO:httpx:HTTP Request: POST http://localhost:11434/api/chat "HTTP/1.1 200 OK"
INFO:LiteLLM:selected model name for cost calculation: ollama_chat/llama3.2:latest
INFO:httpx:HTTP Request: POST http://localhost:11434/api/show "HTTP/1.1 200 OK"
INFO:__main__:Saved turn: 2eda09c1 for session ebe381a2
  PydanticSerializationUnexpectedValue(Expected 9 fields but got 5: Expected `Message` - serialized value may not be as expected [input_value=Message(cont

🤖 Agent: Vector databases play a crucial role in supporting machine learning concepts, particularly those related to natural language processing (NLP) and semantic search. By storing high-dimensional embeddings for text documents or entities in a database, vector databases enable efficient similarity searches that are essential for many machine learning applications.

In the context of NLP, vector databases can be used to store and query dense vectors representing words, phrases, or entire texts. This allows developers to perform tasks like language modeling, sentiment analysis, and topic modeling more efficiently. Vector databases also support techniques like word embeddings, which map words to vectors that capture their semantic meaning.

For example, in a search engine application, vector databases can be used to store the dense vectors of web pages. When a user submits a query, the database can quickly retrieve the most relevant documents by computing the similarity between the que

INFO:__main__:Memory search for 'programming and artificial int...' found 3 matches
INFO:__main__:Retrieved 5 turns for session ebe381a2


   Query: 'programming and artificial intelligence'
   ✅ 0.468: Python is a programming language known for simplicity and re...
   ✅ 0.431: Machine learning enables computers to learn from data automa...
   ✅ 0.425: Natural language processing helps computers understand human...

💬 Conversation History:
   Total turns in session: 5
   Session ID: ebe381a2

✅ MEMORY SYSTEM DEMONSTRATION COMPLETE:
   ✅ Vector Knowledge Base: Semantic search with FAISS
   ✅ Persistent Conversations: SQLite storage with history
   ✅ Context Integration: Knowledge + history in responses
   ✅ Memory Tracking: What knowledge influenced each response
   ✅ Production Ready: Scalable architecture with enterprise patterns


---

## 🎉 Agent Memory & Conversation Management Mastery Complete!

**You've built a working memory stack for conversational agents: vector retrieval, persistent conversation history, and context assembly.**

### 🏆 **What You've Accomplished:**

**✅ Vector Memory System:**
- **FAISS Integration:** Production-grade similarity search with optimized indexing
- **Semantic Embeddings:** Sentence transformers for meaningful vector representations
- **Knowledge Storage:** Structured memory with metadata and fast retrieval
- **Similarity Search:** Context-aware knowledge retrieval with configurable thresholds
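
The threshold-based retrieval summarized above can be sketched in a few lines. This is a minimal illustration, not the notebook's implementation: NumPy stands in for FAISS so the example runs without extra dependencies, the 3-dimensional vectors are toy placeholders for real sentence embeddings, and the names `normalize` and `search` are hypothetical.

```python
import numpy as np

def normalize(vectors: np.ndarray) -> np.ndarray:
    """Scale each row to unit length so a dot product equals cosine similarity."""
    norms = np.linalg.norm(vectors, axis=1, keepdims=True)
    return vectors / norms

def search(query: np.ndarray, index: np.ndarray, k: int = 3, threshold: float = 0.4):
    """Return (row, score) pairs for the top-k rows scoring at or above threshold."""
    scores = index @ (query / np.linalg.norm(query))
    top = np.argsort(scores)[::-1][:k]  # indices of the k highest scores
    return [(int(i), float(scores[i])) for i in top if scores[i] >= threshold]

# Toy "embeddings" (assumption): real vectors would come from the embedding model.
index = normalize(np.array([
    [1.0, 0.0, 0.0],
    [0.8, 0.6, 0.0],
    [0.0, 0.0, 1.0],
]))
matches = search(np.array([1.0, 0.0, 0.0]), index, k=3, threshold=0.5)
print(matches)  # rows 0 and 1 pass the threshold; row 2 scores 0 and is dropped
```

Raising `threshold` trades recall for precision: fewer, more relevant matches reach the prompt. The same normalize-then-inner-product idea should carry over to a FAISS inner-product index.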

**✅ Conversation Persistence:**
- **SQLite Backend:** Persistent storage surviving agent restarts
- **Session Management:** Multi-user conversation isolation and tracking
- **Turn Storage:** Complete conversation history with metadata
- **Query Performance:** Efficient retrieval of conversation context
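
As a rough sketch of the persistence pattern above, the following uses only the standard-library `sqlite3` module. The table layout, the index name, and the helpers `save_turn` and `recent_turns` are assumptions made for illustration; the notebook's actual schema may differ.

```python
import sqlite3
import uuid

# ":memory:" keeps the sketch self-contained; a file path would survive restarts.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE turns (
        turn_id    TEXT PRIMARY KEY,
        session_id TEXT NOT NULL,
        role       TEXT NOT NULL,
        content    TEXT NOT NULL,
        created_at TEXT DEFAULT CURRENT_TIMESTAMP
    )
""")
# Index on session_id so per-session lookups avoid a full table scan.
conn.execute("CREATE INDEX idx_turns_session ON turns (session_id)")

def save_turn(session_id: str, role: str, content: str) -> str:
    """Insert one turn and return its short id."""
    turn_id = uuid.uuid4().hex[:8]
    conn.execute(
        "INSERT INTO turns (turn_id, session_id, role, content) VALUES (?, ?, ?, ?)",
        (turn_id, session_id, role, content),
    )
    conn.commit()
    return turn_id

def recent_turns(session_id: str, limit: int = 5):
    """Return the last `limit` turns of a session in chronological order."""
    rows = conn.execute(
        "SELECT role, content FROM turns WHERE session_id = ? "
        "ORDER BY rowid DESC LIMIT ?",
        (session_id, limit),
    ).fetchall()
    return rows[::-1]

save_turn("ebe381a2", "user", "What is Python?")
save_turn("ebe381a2", "assistant", "A programming language.")
save_turn("other", "user", "Unrelated session")
print(recent_turns("ebe381a2"))
```

Filtering every query by `session_id` is what keeps conversations isolated from each other, and the `LIMIT` clause bounds how much history is loaded per request.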

**✅ Memory-Enhanced Agents:**
- **Context Integration:** Seamless combination of knowledge and conversation history
- **Smart Retrieval:** Query-based knowledge search for relevant information
- **Response Enhancement:** Context-aware generation using retrieved memory
- **Memory Tracking:** Comprehensive logging of knowledge usage in responses
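
One way to picture context integration is a function that merges retrieved knowledge and recent turns into a single prompt. The sketch below is a hedged illustration: `build_prompt`, its section labels, and the `max_history` cutoff are hypothetical, and the sample strings are placeholders rather than the notebook's data.

```python
from typing import List, Tuple

def build_prompt(
    question: str,
    knowledge: List[Tuple[str, float]],   # (text, similarity score) pairs
    history: List[Tuple[str, str]],       # (role, content) pairs, oldest first
    max_history: int = 3,
) -> str:
    """Assemble knowledge matches, recent turns, and the new question."""
    lines = ["Relevant knowledge:"]
    for text, score in knowledge:
        lines.append(f"- ({score:.3f}) {text}")
    lines.append("Conversation so far:")
    # Keep only the newest turns so the prompt stays within a bounded size.
    for role, content in history[-max_history:]:
        lines.append(f"{role}: {content}")
    lines.append(f"user: {question}")
    return "\n".join(lines)

prompt = build_prompt(
    question="How does vector search work?",
    knowledge=[("Placeholder fact about vector search.", 0.47)],
    history=[
        ("user", "first message"),
        ("assistant", "first reply"),
        ("user", "second message"),
        ("assistant", "second reply"),
    ],
)
print(prompt)
```

Including the similarity score next to each fact also gives a simple record of which knowledge entries influenced a response.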

### 🚀 **Production Applications:**

The same patterns appear in enterprise conversational AI systems such as:
- **Customer Service:** Agents remember customer history and preferences
- **Knowledge Management:** RAG systems with persistent conversation context
- **Virtual Assistants:** Personalized responses based on interaction history
- **Educational AI:** Adaptive learning with student progress tracking

---

**🎖️ Achievement Unlocked: Memory Management Expert**

*You've implemented production-ready memory patterns that enable truly intelligent, context-aware conversational agents.*