# Building RAG as Agent with DSPy

This notebook demonstrates how to build an agent that combines Retrieval-Augmented Generation (RAG) with multi-step reasoning, memory, and simple self-adaptation using DSPy.

## What You'll Learn:
- Building agent-based RAG systems with DSPy
- Combining retrieval with multi-step reasoning
- Implementing tool-calling and external service integration
- Creating adaptive and self-improving RAG agents
- Managing memory and context across interactions

Based on the DSPy tutorial: [Building RAG as Agent](https://dspy.ai/tutorials/agents/)

## Setup and Imports

In [None]:
import os
import sys
sys.path.append('../../')

import dspy
from utils import setup_default_lm, print_step, print_result, print_error
from utils.datasets import get_sample_rag_documents
from dotenv import load_dotenv
import json
from collections import defaultdict  # Used by SelfImprovingRAGAgent
from typing import List, Dict, Any, Optional
from dataclasses import dataclass
import numpy as np
import random
from datetime import datetime

# Load environment variables
load_dotenv('../../.env')

## Language Model Configuration

In [None]:
print_step("Setting up Language Model", "Configuring DSPy for RAG Agent development")

try:
    lm = setup_default_lm(provider="openai", model="gpt-4o-mini", max_tokens=2000)
    dspy.configure(lm=lm)
    print_result("Language model configured successfully!", "Status")
except Exception as e:
    print_error(f"Failed to configure language model: {e}")

## Data Structures for RAG Agent

In [None]:
@dataclass
class AgentMemory:
    """Memory structure for the RAG agent."""
    conversation_history: List[Dict[str, str]]
    retrieved_context: List[str]
    reasoning_traces: List[str]
    action_history: List[Dict[str, Any]]
    learned_facts: List[str]

@dataclass
class AgentAction:
    """Represents an action the agent can take."""
    action_type: str  # 'retrieve', 'reason', 'synthesize', 'clarify'
    parameters: Dict[str, Any]
    confidence: float
    rationale: str

@dataclass
class RetrievalResult:
    """Enhanced retrieval result with metadata."""
    content: str
    relevance_score: float
    source: str
    timestamp: str
    confidence: float

print_result("Agent data structures defined successfully!")
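
The `AgentMemory` dataclass above has no defaults, so every caller must pass five empty lists. A minimal sketch of an alternative, assuming you are free to change the definition: `dataclasses.field(default_factory=list)` gives each instance its own fresh lists. The class name `AgentMemoryWithDefaults` is hypothetical and used only for illustration.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class AgentMemoryWithDefaults:
    """Hypothetical variant of AgentMemory whose fields default to empty lists."""
    conversation_history: List[Dict[str, str]] = field(default_factory=list)
    retrieved_context: List[str] = field(default_factory=list)
    reasoning_traces: List[str] = field(default_factory=list)
    action_history: List[Dict[str, Any]] = field(default_factory=list)
    learned_facts: List[str] = field(default_factory=list)


# Each instance gets its own lists, so state is never shared between agents.
memory_a = AgentMemoryWithDefaults()
memory_b = AgentMemoryWithDefaults()
memory_a.learned_facts.append("DSPy modules are composable")
print(memory_a.learned_facts)
print(memory_b.learned_facts)
```

A plain `= []` default would be rejected by `dataclass` precisely because it would share one list across instances; `default_factory` is the standard way around that.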

## Document Retrieval System

In [None]:
class IntelligentRetriever:
    """Advanced retrieval system with semantic understanding."""
    
    def __init__(self, documents: List[str]):
        self.documents = documents
        self.document_metadata = {}
        self._initialize_metadata()
    
    def _initialize_metadata(self):
        """Initialize document metadata for better retrieval."""
        for i, doc in enumerate(self.documents):
            self.document_metadata[i] = {
                'length': len(doc),
                'keywords': self._extract_keywords(doc),
                'topic': self._infer_topic(doc),
                'complexity': self._estimate_complexity(doc)
            }
    
    def _extract_keywords(self, text: str) -> List[str]:
        """Simple keyword extraction."""
        # In a real implementation, you'd use proper NLP libraries
        words = text.lower().split()
        return [word for word in words if len(word) > 4][:10]
    
    def _infer_topic(self, text: str) -> str:
        """Infer document topic from simple keyword cues."""
        text_lower = text.lower()
        # Match 'ai' as a whole word; a substring test would also hit 'said', 'main', etc.
        words = {word.strip('.,;:!?()"\'') for word in text_lower.split()}
        if 'machine learning' in text_lower or 'ai' in words:
            return 'technology'
        elif 'science' in text_lower or 'research' in text_lower:
            return 'science'
        elif 'business' in text_lower or 'market' in text_lower:
            return 'business'
        else:
            return 'general'
    
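    def _estimate_complexity(self, text: str) -> float:
        """Estimate text complexity from average word length (0.0 to 1.0)."""
        words = text.split()
        if not words:
            return 0.0  # Avoid np.mean on an empty list, which returns NaN
        avg_word_length = np.mean([len(word) for word in words])
        return float(min(avg_word_length / 10.0, 1.0))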
    
    def semantic_search(self, query: str, top_k: int = 3) -> List[RetrievalResult]:
        """Perform semantic search with enhanced scoring."""
        results = []
        query_lower = query.lower()
        
        for i, doc in enumerate(self.documents):
            # Simple similarity scoring (in practice, use embeddings)
            doc_lower = doc.lower()
            
            # Keyword overlap
            query_words = set(query_lower.split())
            doc_words = set(doc_lower.split())
            overlap = len(query_words.intersection(doc_words))
            
            # Basic relevance score
            relevance = overlap / len(query_words) if query_words else 0
            
            # Boost score when a query word is one of the document's extracted keywords
            metadata = self.document_metadata[i]
            if query_words.intersection(metadata['keywords']):
                relevance *= 1.2
            
            if relevance > 0:
                results.append(RetrievalResult(
                    content=doc,
                    relevance_score=relevance,
                    source=f"doc_{i}",
                    timestamp=datetime.now().isoformat(),
                    confidence=min(relevance * 0.8, 1.0)
                ))
        
        # Sort by relevance and return top_k
        results.sort(key=lambda x: x.relevance_score, reverse=True)
        return results[:top_k]
    
    def adaptive_retrieve(self, query: str, context: List[str], memory: AgentMemory) -> List[RetrievalResult]:
        """Adaptive retrieval that considers context and memory."""
        # Expand query based on conversation history
        expanded_query = query
        if memory.conversation_history:
            recent_context = " ".join([msg['content'] for msg in memory.conversation_history[-3:]])
            expanded_query = f"{query} {recent_context}"
        
        # Get initial results
        results = self.semantic_search(expanded_query, top_k=5)
        
        # Filter out already retrieved content
        seen_content = set(memory.retrieved_context)
        filtered_results = [r for r in results if r.content not in seen_content]
        
        return filtered_results[:3]

# Initialize retriever with sample documents
documents = get_sample_rag_documents()
retriever = IntelligentRetriever(documents)

print_result(f"Initialized retriever with {len(documents)} documents")
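
`semantic_search` scores a document by the fraction of query words that also appear in the document. Because it splits on whitespace only, a query token such as `learning?` will not match `learning` in a document. The sketch below isolates that scoring step and adds a simple normalization. `tokenize` and `overlap_score` are hypothetical helper names, not part of the notebook's `utils` package or of DSPy.

```python
import string
from typing import Set


def tokenize(text: str) -> Set[str]:
    """Lowercase, split on whitespace, and strip surrounding punctuation."""
    tokens = (word.strip(string.punctuation) for word in text.lower().split())
    return {token for token in tokens if token}


def overlap_score(query: str, document: str) -> float:
    """Fraction of distinct query tokens that also occur in the document."""
    query_tokens = tokenize(query)
    if not query_tokens:
        return 0.0
    return len(query_tokens & tokenize(document)) / len(query_tokens)


doc = "Machine learning is a subset of AI."
print(overlap_score("What is machine learning?", doc))
print(overlap_score("quantum cryptography", doc))
```

This is still lexical matching. As the code comment in the retriever notes, an embedding-based similarity would be the usual choice in practice.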

## RAG Agent Signatures

In [None]:
class QueryAnalysis(dspy.Signature):
    """Analyze user query to determine optimal agent strategy."""
    
    query = dspy.InputField(desc="User's question or request")
    conversation_context = dspy.InputField(desc="Previous conversation context")
    
    query_type = dspy.OutputField(desc="Type of query: factual, analytical, creative, or procedural")
    complexity_level = dspy.OutputField(desc="Complexity level: simple, moderate, or complex")
    required_actions = dspy.OutputField(desc="List of actions needed to answer the query")
    retrieval_strategy = dspy.OutputField(desc="Best retrieval strategy for this query")

class ActionPlanning(dspy.Signature):
    """Plan the sequence of actions to answer a query."""
    
    query = dspy.InputField(desc="User's question")
    query_analysis = dspy.InputField(desc="Analysis of the query")
    available_context = dspy.InputField(desc="Currently available context and information")
    
    action_plan = dspy.OutputField(desc="Step-by-step plan to answer the query")
    priority_actions = dspy.OutputField(desc="Most important actions to take first")
    fallback_strategy = dspy.OutputField(desc="Alternative approach if primary plan fails")

class ContextualRetrieval(dspy.Signature):
    """Perform contextual document retrieval."""
    
    query = dspy.InputField(desc="Search query")
    context = dspy.InputField(desc="Current conversation context")
    retrieval_strategy = dspy.InputField(desc="Retrieval strategy to use")
    
    search_queries = dspy.OutputField(desc="Optimized search queries for retrieval")
    relevance_criteria = dspy.OutputField(desc="Criteria for evaluating document relevance")
    expected_answer_type = dspy.OutputField(desc="Type of information expected in the answer")

class InformationSynthesis(dspy.Signature):
    """Synthesize information from multiple sources into a coherent response."""
    
    query = dspy.InputField(desc="Original user query")
    retrieved_docs = dspy.InputField(desc="Retrieved documents and information")
    conversation_memory = dspy.InputField(desc="Relevant conversation history")
    
    synthesized_answer = dspy.OutputField(desc="Comprehensive answer synthesized from sources")
    confidence_level = dspy.OutputField(desc="Confidence level in the answer")
    information_gaps = dspy.OutputField(desc="Identified gaps in available information")
    follow_up_suggestions = dspy.OutputField(desc="Suggested follow-up questions or actions")

class MemoryManagement(dspy.Signature):
    """Manage agent memory and learning."""
    
    current_interaction = dspy.InputField(desc="Current query and response")
    existing_memory = dspy.InputField(desc="Current agent memory state")
    interaction_outcome = dspy.InputField(desc="Success/failure of the interaction")
    
    memory_updates = dspy.OutputField(desc="Updates to make to agent memory")
    learned_patterns = dspy.OutputField(desc="New patterns or facts learned")
    memory_consolidation = dspy.OutputField(desc="How to consolidate and organize memory")

print_result("RAG Agent signatures defined successfully!")
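
The output fields above carry no type annotations, and in current DSPy releases such fields default to plain strings. `required_actions`, for instance, arrives as text even though its description asks for a list. A minimal sketch of turning such text into a Python list, assuming the model answers with a numbered, bulleted, or comma-separated list. `parse_action_list` is a hypothetical helper, and the sample strings are invented for illustration.

```python
import re
from typing import List


def parse_action_list(raw: str) -> List[str]:
    """Split a free-text list into items, dropping bullets and numbering."""
    # Prefer line-based lists; fall back to commas for single-line answers.
    parts = raw.splitlines() if "\n" in raw.strip() else raw.split(",")
    items = []
    for part in parts:
        # Remove leading markers such as "1.", "2)", "-", or "*".
        cleaned = re.sub(r"^\s*(?:\d+[.)]|[-*])\s*", "", part).strip()
        if cleaned:
            items.append(cleaned)
    return items


print(parse_action_list("1. retrieve\n2. reason\n3. synthesize"))
print(parse_action_list("retrieve, synthesize"))
```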

## Advanced RAG Agent Implementation

In [None]:
class AdvancedRAGAgent(dspy.Module):
    """Advanced RAG agent with multi-step reasoning and memory."""
    
    def __init__(self, retriever: IntelligentRetriever):
        super().__init__()
        self.retriever = retriever
        
        # Initialize DSPy modules
        self.query_analyzer = dspy.ChainOfThought(QueryAnalysis)
        self.action_planner = dspy.ChainOfThought(ActionPlanning)
        self.contextual_retriever = dspy.ChainOfThought(ContextualRetrieval)
        self.information_synthesizer = dspy.ChainOfThought(InformationSynthesis)
        self.memory_manager = dspy.ChainOfThought(MemoryManagement)
        
        # Agent memory
        self.memory = AgentMemory(
            conversation_history=[],
            retrieved_context=[],
            reasoning_traces=[],
            action_history=[],
            learned_facts=[]
        )
        
        # Performance tracking
        self.interaction_count = 0
        self.success_rate = 0.0
    
    def forward(self, query: str) -> dspy.Prediction:
        """Main agent reasoning loop."""
        
        self.interaction_count += 1
        
        # Step 1: Analyze the query
        conversation_context = self._get_conversation_context()
        query_analysis = self.query_analyzer(
            query=query,
            conversation_context=conversation_context
        )
        
        # Step 2: Plan actions
        available_context = self._get_available_context()
        action_plan = self.action_planner(
            query=query,
            query_analysis=f"Type: {query_analysis.query_type}, Complexity: {query_analysis.complexity_level}",
            available_context=available_context
        )
        
        # Step 3: Execute retrieval strategy
        retrieval_result = self._execute_retrieval(query, query_analysis, action_plan)
        
        # Step 4: Synthesize information
        synthesis = self.information_synthesizer(
            query=query,
            retrieved_docs=retrieval_result['formatted_docs'],
            conversation_memory=conversation_context
        )
        
        # Step 5: Update memory
        self._update_memory(query, synthesis, query_analysis)
        
        # Step 6: Evaluate and learn
        self._evaluate_interaction(query, synthesis)
        
        return dspy.Prediction(
            answer=synthesis.synthesized_answer,
            confidence=synthesis.confidence_level,
            query_type=query_analysis.query_type,
            action_plan=action_plan.action_plan,
            retrieved_sources=retrieval_result['sources'],
            information_gaps=synthesis.information_gaps,
            follow_up_suggestions=synthesis.follow_up_suggestions,
            reasoning_trace=self._get_reasoning_trace(query_analysis, action_plan, synthesis)
        )
    
    def _get_conversation_context(self) -> str:
        """Get recent conversation context."""
        if not self.memory.conversation_history:
            return "No previous conversation context."
        
        recent_history = self.memory.conversation_history[-3:]
        context = "\n".join([
            f"{msg['role']}: {msg['content']}"
            for msg in recent_history
        ])
        return context
    
    def _get_available_context(self) -> str:
        """Get currently available context information."""
        context_parts = []
        
        if self.memory.retrieved_context:
            context_parts.append(f"Retrieved context: {len(self.memory.retrieved_context)} documents")
        
        if self.memory.learned_facts:
            context_parts.append(f"Learned facts: {len(self.memory.learned_facts)} items")
        
        if self.memory.reasoning_traces:
            context_parts.append(f"Previous reasoning: {len(self.memory.reasoning_traces)} traces")
        
        return "; ".join(context_parts) if context_parts else "No available context."
    
    def _execute_retrieval(self, query: str, query_analysis, action_plan) -> Dict[str, Any]:
        """Execute the retrieval strategy."""
        
        # Get retrieval parameters
        retrieval_params = self.contextual_retriever(
            query=query,
            context=self._get_conversation_context(),
            retrieval_strategy=query_analysis.retrieval_strategy
        )
        
        # Perform adaptive retrieval
        retrieved_docs = self.retriever.adaptive_retrieve(query, [], self.memory)
        
        # Format documents for synthesis
        formatted_docs = "\n\n".join([
            f"Source {i+1} (relevance: {doc.relevance_score:.2f}):\n{doc.content}"
            for i, doc in enumerate(retrieved_docs)
        ])
        
        # Update memory with retrieved context
        for doc in retrieved_docs:
            if doc.content not in self.memory.retrieved_context:
                self.memory.retrieved_context.append(doc.content)
        
        return {
            'formatted_docs': formatted_docs,
            'sources': [doc.source for doc in retrieved_docs],
            'retrieval_params': retrieval_params
        }
    
    def _update_memory(self, query: str, synthesis, query_analysis):
        """Update agent memory with new interaction."""
        
        # Add to conversation history
        self.memory.conversation_history.append({
            'role': 'user',
            'content': query,
            'timestamp': datetime.now().isoformat()
        })
        
        self.memory.conversation_history.append({
            'role': 'assistant',
            'content': synthesis.synthesized_answer,
            'timestamp': datetime.now().isoformat()
        })
        
        # Add reasoning trace
        reasoning_trace = f"Query: {query} | Type: {query_analysis.query_type} | Answer: {synthesis.synthesized_answer[:100]}..."
        self.memory.reasoning_traces.append(reasoning_trace)
        
        # Use memory manager for advanced updates
        # Parse confidence defensively: the LM may return "85%", "0.85", or free text
        try:
            confidence_value = float(synthesis.confidence_level.strip().rstrip('%'))
        except (AttributeError, ValueError):
            confidence_value = 0.0
        memory_update = self.memory_manager(
            current_interaction=f"Q: {query}\nA: {synthesis.synthesized_answer}",
            existing_memory=f"{len(self.memory.learned_facts)} learned facts stored",
            interaction_outcome="success" if confidence_value > 70 else "partial"
        )
        
        # Apply memory updates
        if hasattr(memory_update, 'learned_patterns') and memory_update.learned_patterns:
            self.memory.learned_facts.append(memory_update.learned_patterns)
    
    def _evaluate_interaction(self, query: str, synthesis):
        """Evaluate and learn from the interaction."""
        
        # Simple confidence-based evaluation
        try:
            confidence_value = float(synthesis.confidence_level.split('%')[0] if '%' in synthesis.confidence_level else synthesis.confidence_level)
            success = confidence_value > 70
        except (AttributeError, TypeError, ValueError):  # Confidence was not numeric
            success = len(synthesis.synthesized_answer) > 50  # Fallback heuristic
        
        # Update success rate
        current_success = 1.0 if success else 0.0
        self.success_rate = (self.success_rate * (self.interaction_count - 1) + current_success) / self.interaction_count
        
        # Log action
        self.memory.action_history.append({
            'query': query,
            'success': success,
            'confidence': synthesis.confidence_level,
            'timestamp': datetime.now().isoformat()
        })
    
    def _get_reasoning_trace(self, query_analysis, action_plan, synthesis) -> str:
        """Generate a reasoning trace for transparency."""
        
        trace_parts = [
            f"1. Query Analysis: {query_analysis.query_type} query with {query_analysis.complexity_level} complexity",
            f"2. Action Planning: {action_plan.action_plan}",
            f"3. Information Synthesis: {synthesis.confidence_level} confidence",
            f"4. Identified Gaps: {synthesis.information_gaps}"
        ]
        
        return "\n".join(trace_parts)
    
    def get_agent_status(self) -> Dict[str, Any]:
        """Get current agent status and performance metrics."""
        
        return {
            'interaction_count': self.interaction_count,
            'success_rate': self.success_rate,
            'memory_size': {
                'conversation_history': len(self.memory.conversation_history),
                'retrieved_context': len(self.memory.retrieved_context),
                'reasoning_traces': len(self.memory.reasoning_traces),
                'learned_facts': len(self.memory.learned_facts)
            },
            'last_interactions': self.memory.action_history[-5:] if self.memory.action_history else []
        }

# Initialize the RAG agent
rag_agent = AdvancedRAGAgent(retriever)

print_result("Advanced RAG Agent initialized successfully!")
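
`_update_memory` and `_evaluate_interaction` both convert `confidence_level` to a number inline, but the language model is free to answer `85%`, `0.85`, or `High`. A small sketch of a shared parser that normalizes these to a 0-1 score. `parse_confidence`, the `VERBAL_LEVELS` table, and the default value are assumptions made for illustration; they are not part of DSPy.

```python
import re
from typing import Optional

# Assumed mapping for verbal answers; adjust to what your model actually emits.
VERBAL_LEVELS = {"low": 0.3, "medium": 0.6, "moderate": 0.6, "high": 0.9}


def parse_confidence(raw: Optional[str], default: float = 0.5) -> float:
    """Normalize a free-text confidence into the range [0, 1]."""
    if not raw:
        return default
    text = str(raw).strip().lower()
    match = re.search(r"\d+(?:\.\d+)?", text)
    if match:
        value = float(match.group())
        # Treat values above 1 (or followed by '%') as percentages.
        if value > 1.0 or "%" in text:
            value /= 100.0
        return max(0.0, min(value, 1.0))
    for label, score in VERBAL_LEVELS.items():
        if label in text:
            return score
    return default


for sample in ["85%", "0.85", "High", "about 70 percent", "unsure"]:
    print(sample, "->", parse_confidence(sample))
```

With a single 0-1 scale, the success check becomes `parse_confidence(...) > 0.7` instead of mixing percent and fraction thresholds.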

## Testing the RAG Agent

In [None]:
print_step("Testing RAG Agent", "Demonstrating agent capabilities with various queries")

# Test queries of different types
test_queries = [
    "What is machine learning and how does it work?",
    "Can you explain the business applications of AI?",
    "How do neural networks process information?",
    "What are the ethical considerations in AI development?",
    "Follow up: Can you give me specific examples of ethical AI frameworks?"
]

for i, query in enumerate(test_queries, 1):
    print(f"\n{'='*60}")
    print(f"Query {i}: {query}")
    print('='*60)
    
    try:
        result = rag_agent(query)
        
        print_result(f"Answer: {result.answer}", "Response")
        print_result(f"Confidence: {result.confidence}", "Confidence")
        print_result(f"Query Type: {result.query_type}", "Analysis")
        print_result(f"Sources: {', '.join(result.retrieved_sources)}", "Sources")
        
        if result.information_gaps:
            print_result(f"Information Gaps: {result.information_gaps}", "Gaps")
        
        if result.follow_up_suggestions:
            print_result(f"Follow-up Suggestions: {result.follow_up_suggestions}", "Suggestions")
        
        print("\nReasoning Trace:")
        print(result.reasoning_trace)
        
    except Exception as e:
        print_error(f"Error processing query {i}: {e}")

print("\n" + "="*80)
print("AGENT PERFORMANCE SUMMARY")
print("="*80)

# Get agent status
status = rag_agent.get_agent_status()

print_result(f"Total Interactions: {status['interaction_count']}", "Statistics")
print_result(f"Success Rate: {status['success_rate']:.2%}", "Performance")
print_result(f"Memory Usage: {status['memory_size']}", "Memory")

if status['last_interactions']:
    print("\nRecent Interactions:")
    for interaction in status['last_interactions']:
        print(f"  - {interaction['query'][:50]}... | Success: {interaction['success']} | Confidence: {interaction['confidence']}")
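
The success rate reported above is maintained incrementally in `_evaluate_interaction`: the previous mean is re-weighted by the number of earlier interactions and the new outcome is folded in. A minimal sketch of that update in isolation, compared against the batch mean. `update_running_mean` is a hypothetical name, and the outcome list is made up.

```python
from typing import List


def update_running_mean(previous_mean: float, count: int, new_value: float) -> float:
    """Fold the count-th observation into a mean computed over count - 1 values."""
    return (previous_mean * (count - 1) + new_value) / count


outcomes: List[float] = [1.0, 0.0, 1.0, 1.0]  # 1.0 = success, 0.0 = failure
rate = 0.0
for n, outcome in enumerate(outcomes, start=1):
    rate = update_running_mean(rate, n, outcome)

print(rate)                           # incremental result
print(sum(outcomes) / len(outcomes))  # batch mean for comparison
```

The incremental form avoids storing every outcome, which is why the agent only needs `interaction_count` and `success_rate`.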

## Agent Self-Improvement and Learning

In [None]:
class SelfImprovingRAGAgent(AdvancedRAGAgent):
    """RAG agent with self-improvement capabilities."""
    
    def __init__(self, retriever: IntelligentRetriever):
        super().__init__(retriever)
        
        # Self-improvement tracking
        self.performance_history = []
        self.strategy_effectiveness = defaultdict(list)
        
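        # Adaptive parameters (adjusted in _adapt_parameters; the retriever and
        # evaluation code above do not read them yet, so they are tracked only)
        self.confidence_threshold = 0.7
        self.retrieval_top_k = 3
        self.memory_consolidation_interval = 10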
    
    def forward(self, query: str) -> dspy.Prediction:
        """Enhanced forward with self-improvement."""
        
        # Call parent implementation
        result = super().forward(query)
        
        # Track performance for self-improvement
        self._track_performance(query, result)
        
        # Periodically consolidate memory and adapt
        if self.interaction_count % self.memory_consolidation_interval == 0:
            self._consolidate_memory()
            self._adapt_parameters()
        
        return result
    
    def _track_performance(self, query: str, result):
        """Track performance metrics for learning."""
        
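        # Extract confidence score, normalized to the 0-1 range used by _adapt_parameters
        try:
            raw = result.confidence
            confidence_score = float(raw.split('%')[0] if '%' in raw else raw)
            if confidence_score > 1.0:
                confidence_score /= 100.0  # Treat values such as "85%" or "85" as percentages
        except (AttributeError, TypeError, ValueError):
            confidence_score = 0.5  # Default when the confidence is not numeric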
        
        # Performance record
        performance_record = {
            'query_type': result.query_type,
            'confidence': confidence_score,
            'answer_length': len(result.answer),
            'sources_used': len(result.retrieved_sources),
            'timestamp': datetime.now().isoformat()
        }
        
        self.performance_history.append(performance_record)
        
        # Track strategy effectiveness
        self.strategy_effectiveness[result.query_type].append(confidence_score)
    
    def _consolidate_memory(self):
        """Consolidate and optimize memory usage."""
        
        print_step("Memory Consolidation", "Optimizing agent memory")
        
        # Consolidate conversation history (keep last 20 interactions)
        if len(self.memory.conversation_history) > 40:  # 20 Q&A pairs
            self.memory.conversation_history = self.memory.conversation_history[-40:]
        
        # Consolidate retrieved context: drop duplicates while preserving retrieval order
        # (a set would lose the order, making "last 50" meaningless)
        unique_context = list(dict.fromkeys(self.memory.retrieved_context))
        self.memory.retrieved_context = unique_context[-50:]  # Keep the 50 most recent unique documents
        
        # Extract key learnings from reasoning traces
        if len(self.memory.reasoning_traces) > 20:
            # Keep successful patterns and recent traces
            self.memory.reasoning_traces = self.memory.reasoning_traces[-20:]
        
        print_result("Memory consolidation completed")
    
    def _adapt_parameters(self):
        """Adapt agent parameters based on performance."""
        
        if len(self.performance_history) < 5:
            return
        
        print_step("Parameter Adaptation", "Optimizing agent parameters")
        
        # Analyze recent performance
        recent_performance = self.performance_history[-10:]
        avg_confidence = np.mean([p['confidence'] for p in recent_performance])
        avg_sources = np.mean([p['sources_used'] for p in recent_performance])
        
        # Adapt confidence threshold
        if avg_confidence < 0.6:
            self.confidence_threshold = max(0.5, self.confidence_threshold - 0.05)
            print_result(f"Lowered confidence threshold to {self.confidence_threshold}")
        elif avg_confidence > 0.8:
            self.confidence_threshold = min(0.9, self.confidence_threshold + 0.05)
            print_result(f"Raised confidence threshold to {self.confidence_threshold}")
        
        # Adapt retrieval parameters
        if avg_sources < 2:
            self.retrieval_top_k = min(5, self.retrieval_top_k + 1)
            print_result(f"Increased retrieval top_k to {self.retrieval_top_k}")
        elif avg_sources > 4:
            self.retrieval_top_k = max(2, self.retrieval_top_k - 1)
            print_result(f"Decreased retrieval top_k to {self.retrieval_top_k}")
    
    def get_learning_insights(self) -> Dict[str, Any]:
        """Get insights about agent learning and performance."""
        
        insights = {
            'performance_trends': self._analyze_performance_trends(),
            'strategy_effectiveness': self._analyze_strategy_effectiveness(),
            'adaptation_history': self._get_adaptation_history(),
            'current_parameters': {
                'confidence_threshold': self.confidence_threshold,
                'retrieval_top_k': self.retrieval_top_k,
                'memory_consolidation_interval': self.memory_consolidation_interval
            }
        }
        
        return insights
    
    def _analyze_performance_trends(self) -> Dict[str, Any]:
        """Analyze performance trends over time."""
        
        if len(self.performance_history) < 5:
            return {'status': 'insufficient_data'}
        
        # Calculate trends
        confidences = [p['confidence'] for p in self.performance_history]
        recent_avg = np.mean(confidences[-5:])
        overall_avg = np.mean(confidences)
        
        return {
            'overall_average_confidence': overall_avg,
            'recent_average_confidence': recent_avg,
            'improvement_trend': 'improving' if recent_avg > overall_avg else 'declining',
            'total_interactions': len(self.performance_history)
        }
    
    def _analyze_strategy_effectiveness(self) -> Dict[str, float]:
        """Analyze effectiveness of different strategies."""
        
        effectiveness = {}
        for strategy, scores in self.strategy_effectiveness.items():
            if scores:
                effectiveness[strategy] = np.mean(scores)
        
        return effectiveness
    
    def _get_adaptation_history(self) -> List[str]:
        """Get history of parameter adaptations."""
        
        # This would be more sophisticated in a real implementation
        return [
            f"Confidence threshold: {self.confidence_threshold}",
            f"Retrieval top_k: {self.retrieval_top_k}",
            f"Memory consolidation interval: {self.memory_consolidation_interval}"
        ]

# Initialize self-improving agent
self_improving_agent = SelfImprovingRAGAgent(retriever)

print_result("Self-improving RAG Agent initialized successfully!")
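
Memory consolidation comes down to two operations: dropping duplicates and capping the list at the most recent entries. The sketch below shows both on plain strings, assuming that "most recent" means insertion order. It uses `dict.fromkeys`, which preserves insertion order in Python 3.7+, whereas a `set` guarantees no order. `consolidate` is a hypothetical helper name.

```python
from typing import List


def consolidate(items: List[str], max_items: int) -> List[str]:
    """Remove duplicates (keeping first occurrences) and keep the newest max_items."""
    unique = list(dict.fromkeys(items))
    return unique[-max_items:] if max_items > 0 else []


retrieved = ["doc_a", "doc_b", "doc_a", "doc_c", "doc_b", "doc_d"]
print(consolidate(retrieved, max_items=3))
```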

## Testing Self-Improvement Capabilities

In [None]:
print_step("Testing Self-Improvement", "Running queries to demonstrate learning")

# Extended test with varied queries to trigger learning
learning_queries = [
    "What are the fundamentals of machine learning?",
    "How do businesses implement AI solutions?",
    "Can you explain deep learning architectures?",
    "What are the challenges in AI ethics?",
    "How does natural language processing work?",
    "What is the future of artificial intelligence?",
    "Can you compare different ML algorithms?",
    "How do recommendation systems function?",
    "What role does data play in AI success?",
    "How can AI be applied in healthcare?",
    "What are the limitations of current AI?",
    "How do we ensure AI fairness and transparency?"
]

# Process queries and track learning
for i, query in enumerate(learning_queries, 1):
    print(f"\nQuery {i}: {query}")
    
    try:
        result = self_improving_agent(query)
        print(f"Response confidence: {result.confidence}")
        
        # Show consolidation messages when they occur
        if i % 10 == 0:
            print("[Agent performed memory consolidation and parameter adaptation]")
        
    except Exception as e:
        print_error(f"Error in query {i}: {e}")

print("\n" + "="*80)
print("LEARNING INSIGHTS")
print("="*80)

# Get learning insights
insights = self_improving_agent.get_learning_insights()

print_step("Performance Analysis")
trends = insights['performance_trends']
if 'overall_average_confidence' in trends:
    print_result(f"Overall Average Confidence: {trends['overall_average_confidence']:.2%}")
    print_result(f"Recent Average Confidence: {trends['recent_average_confidence']:.2%}")
    print_result(f"Trend: {trends['improvement_trend']}")

print_step("Strategy Effectiveness")
for strategy, effectiveness in insights['strategy_effectiveness'].items():
    print_result(f"{strategy}: {effectiveness:.2%}")

print_step("Current Parameters")
for param, value in insights['current_parameters'].items():
    print_result(f"{param}: {value}")

# Compare with initial agent
print("\n" + "="*80)
print("AGENT COMPARISON")
print("="*80)

initial_status = rag_agent.get_agent_status()
improved_status = self_improving_agent.get_agent_status()

print_step("Performance Comparison")
# The two agents answered different query sets, so this is a rough side-by-side, not a controlled benchmark
print_result(f"Initial Agent Success Rate: {initial_status['success_rate']:.2%}")
print_result(f"Self-Improving Agent Success Rate: {improved_status['success_rate']:.2%}")
print_result(f"Difference: {(improved_status['success_rate'] - initial_status['success_rate']):.2%}")

## Advanced Agent Features Demo

In [None]:
print_step("Advanced Features Demo", "Demonstrating specialized agent capabilities")

# 1. Multi-turn conversation with context
print("\n1. Multi-turn Conversation:")
print("-" * 40)

conversation_queries = [
    "What is machine learning?",
    "Can you give me specific examples?",
    "How does this compare to traditional programming?",
    "What are the main challenges in implementing ML?"
]

for i, query in enumerate(conversation_queries, 1):
    print(f"\nTurn {i}: {query}")
    result = self_improving_agent(query)
    print(f"Agent: {result.answer[:150]}...")
    if result.follow_up_suggestions:
        print(f"Suggestions: {result.follow_up_suggestions}")

# 2. Complex analytical query
print("\n\n2. Complex Analytical Query:")
print("-" * 40)

complex_query = "Analyze the relationship between artificial intelligence, machine learning, and deep learning. How do they differ and how do they work together in modern applications?"
result = self_improving_agent(complex_query)

print_result(f"Query Type: {result.query_type}")
print_result(f"Answer: {result.answer}")
print_result(f"Confidence: {result.confidence}")
print_result(f"Sources Used: {len(result.retrieved_sources)}")

# 3. Information gap identification
print("\n\n3. Information Gap Identification:")
print("-" * 40)

gap_query = "What are the latest developments in quantum machine learning and their implications for cryptography?"
result = self_improving_agent(gap_query)

print_result(f"Answer: {result.answer}")
if result.information_gaps:
    print_result(f"Identified Gaps: {result.information_gaps}")
if result.follow_up_suggestions:
    print_result(f"Suggestions: {result.follow_up_suggestions}")

# 4. Agent introspection
print("\n\n4. Agent Self-Analysis:")
print("-" * 40)

final_status = self_improving_agent.get_agent_status()
final_insights = self_improving_agent.get_learning_insights()

print_step("Final Agent Statistics")
print_result(f"Total Interactions: {final_status['interaction_count']}")
print_result(f"Final Success Rate: {final_status['success_rate']:.2%}")
print_result(f"Memory Components: {final_status['memory_size']}")

if 'performance_trends' in final_insights and 'improvement_trend' in final_insights['performance_trends']:
    print_result(f"Learning Trend: {final_insights['performance_trends']['improvement_trend']}")

print("\n🎉 RAG Agent demonstration completed successfully!")

## Conclusion

This notebook demonstrated how to build sophisticated RAG agents using DSPy:

### Key Features Implemented:

1. **Intelligent Retrieval**: Context-aware document retrieval with relevance scoring
2. **Multi-step Reasoning**: Query analysis, action planning, and information synthesis
3. **Memory Management**: Conversation history, retrieved context, and learned facts
4. **Self-Improvement**: Adaptive parameters and continuous learning
5. **Agent Transparency**: Detailed reasoning traces and confidence scoring
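
The adaptive-parameter idea in item 4 amounts to a clamped step update: nudge a parameter when recent performance leaves a target band, and never let it leave a fixed range. A minimal sketch mirroring the constants used in `_adapt_parameters`. The standalone function `adapt_threshold` is hypothetical, and confidences are assumed to be on a 0-1 scale.

```python
def adapt_threshold(threshold: float, avg_confidence: float,
                    step: float = 0.05, low: float = 0.5, high: float = 0.9) -> float:
    """Nudge the threshold toward recent performance, clamped to [low, high]."""
    if avg_confidence < 0.6:
        return max(low, threshold - step)
    if avg_confidence > 0.8:
        return min(high, threshold + step)
    return threshold  # Inside the target band: leave unchanged


print(adapt_threshold(0.7, avg_confidence=0.55))  # lowered by one step
print(adapt_threshold(0.7, avg_confidence=0.85))  # raised by one step
print(adapt_threshold(0.5, avg_confidence=0.30))  # already at the lower bound
```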

### DSPy Features Utilized:

- **Signatures**: Structured task definitions for each agent component
- **Chain of Thought**: Step-by-step reasoning for complex tasks
- **Modular Design**: Composable agent architecture
- **Memory Integration**: Persistent state across interactions

### Advanced Capabilities:

- **Context-Aware Retrieval**: Considers conversation history and previous interactions
- **Adaptive Learning**: Adjusts parameters based on performance feedback
- **Multi-turn Conversations**: Maintains context across dialog turns
- **Gap Identification**: Recognizes limitations in available information
- **Performance Tracking**: Monitors and improves agent effectiveness

### Real-world Applications:

- **Customer Support**: Intelligent help desk agents with memory
- **Research Assistance**: Academic and technical research support
- **Knowledge Management**: Organizational knowledge bases and Q&A
- **Educational Tutoring**: Adaptive learning and personalized instruction
- **Business Intelligence**: Data-driven insights and recommendations

This RAG agent architecture shows how DSPy signatures and modules can be composed into an agent that tracks its own performance, adapts simple parameters over time, and exposes its reasoning trace for inspection.