# Agentic RAG Basics

**Build Your First Knowledge-Powered AI Agent**

---

Welcome to the world of **Retrieval-Augmented Generation (RAG)**! This notebook teaches you how to build AI agents that can access and reason over custom knowledge bases. By the end of this 10-minute tutorial, you'll have a working RAG system that enhances your agents with external knowledge.

### 🎯 What You'll Learn

In this tutorial, you will:
- Understand what RAG is and why it's powerful
- Build a local vector store using FAISS
- Create embeddings with sentence transformers
- Implement RAG tools for Strands agents
- Build a knowledge-enabled agent
- Test your agent with real queries

### 🤔 What is RAG?

**Retrieval-Augmented Generation** combines:
- **Retrieval**: Finding relevant information from a knowledge base
- **Generation**: Using that information to generate accurate responses

Think of it as giving your agent a searchable library of information!
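
To make the two stages concrete, here is a minimal sketch that replaces real embeddings with naive word overlap and stops at building the prompt an LLM would receive. The helper names `tokenize`, `retrieve`, and `build_prompt`, as well as the sample documents, are illustrative assumptions and are not part of Strands or of the code later in this notebook.

```python
import re
from typing import List, Set


def tokenize(text: str) -> Set[str]:
    """Lowercase a string and split it into a set of words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(query: str, docs: List[str], k: int = 1) -> List[str]:
    """Retrieval: rank documents by how many words they share with the query."""
    query_words = tokenize(query)
    ranked = sorted(docs, key=lambda d: len(query_words & tokenize(d)), reverse=True)
    return ranked[:k]


def build_prompt(query: str, context: List[str]) -> str:
    """Generation input: place the retrieved text in front of the question."""
    joined = "\n".join(f"- {c}" for c in context)
    return f"Answer using only this context:\n{joined}\n\nQuestion: {query}"


# Hypothetical two-document knowledge base
docs = [
    "FAISS is a library for vector similarity search.",
    "Strands is a framework for building AI agents.",
]

top = retrieve("What is Strands?", docs)
prompt = build_prompt("What is Strands?", top)
print(prompt)
```

The rest of this notebook follows the same shape, with sentence embeddings doing the ranking and a Bedrock-hosted model consuming the retrieved context.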

## 📦 Step 1: Installing Required Packages

### Overview
We'll install the necessary packages for RAG functionality. These include vector storage, embedding models, and the Strands framework.

### 📚 What We're Installing
- **sentence-transformers**: Creates embeddings from text
- **faiss-cpu**: Facebook's vector similarity search library
- **numpy**: For numerical operations
- **strands-agents**: Our agent framework

In [None]:
# Install required packages
%pip install sentence-transformers faiss-cpu numpy strands-agents -q

# For document generation (optional)
%pip install reportlab python-docx -q

print("✅ Installation commands finished. Check the output above for errors.")
print("   If there are none, you're ready to build your RAG system! 🚀")

## 🔑 Step 2: Setting Up AWS Bedrock

### Hybrid Approach
We'll use:
- **Local storage** for our knowledge base (no cloud costs!)
- **AWS Bedrock** for the powerful Claude LLM

### 💡 Why This Approach?
- Your documents stay private on your machine
- Only pay for LLM usage, not storage
- Fast local search with powerful cloud reasoning

In [None]:
import boto3
from strands import Agent
from strands.models import BedrockModel
import os

# Configure the AWS session and create a Bedrock model instance
# Note: Make sure you have AWS credentials configured
try:
    # Creating the session inside the try block also catches profile errors
    session = boto3.Session(
        profile_name='default'  # Use your AWS profile name
    )
    bedrock_model = BedrockModel(
        model_id="us.anthropic.claude-3-7-sonnet-20250219-v1:0",
        boto_session=session
    )
    print("✅ AWS Bedrock model configured!")
    print("   Model: Claude 3.7 Sonnet")
except Exception as e:
    print(f"❌ Error configuring Bedrock: {e}")
    print("   Please check your AWS credentials and Bedrock access")

## 📚 Step 3: Creating Sample Documents

### Generate Knowledge Base Content
Let's create some documents about the Strands framework for our RAG system to use. This ensures everyone has the same content to work with.

In [None]:
# Import our document generator
import sys
sys.path.append('../src')

try:
    from rag_document_generator import create_basic_documents
    
    # Create the documents
    print("📝 Creating sample documents...")
    created_files = create_basic_documents()
    
    print(f"\n✅ Created {len(created_files)} documents:")
    for file in created_files:
        print(f"   - {file}")
        
except ImportError:
    print("⚠️  Could not import document generator. Creating documents manually...")
    
    # Manual fallback
    import os
    os.makedirs('../rag_docs', exist_ok=True)
    
    # Create a simple document
    with open('../rag_docs/strands_intro.txt', 'w', encoding='utf-8') as f:
        f.write("""Strands is a powerful framework for building AI agents.
It supports multiple models including AWS Bedrock, OpenAI, and local models.
Agents can use tools to extend their capabilities.
The framework is designed to be simple yet flexible.""")
    
    print("✅ Created fallback document")

## 🧠 Step 4: Building the Local Vector Store

### The RAG Engine
Now we'll create the core RAG functionality:
1. **Embeddings**: Convert text to numerical vectors
2. **Vector Store**: Store and search these vectors
3. **Retrieval**: Find relevant documents for queries

### 🔍 How It Works
- Text → Embeddings → Vector Store
- Query → Embedding → Similarity Search → Results
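
The similarity search step can be sketched without any libraries. The example below uses hand-made 3-dimensional vectors (hypothetical values, standing in for real 384-dimensional embeddings) and compares the query against every stored vector, which is the same exhaustive strategy a flat index such as `faiss.IndexFlatL2` applies in optimized native code. The function names `squared_l2` and `flat_search` are illustrative assumptions.

```python
from typing import List, Tuple


def squared_l2(a: List[float], b: List[float]) -> float:
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def flat_search(query: List[float], vectors: List[List[float]], k: int) -> List[Tuple[int, float]]:
    """Compare the query with every stored vector; return the k closest as (index, distance)."""
    scored = [(i, squared_l2(query, v)) for i, v in enumerate(vectors)]
    scored.sort(key=lambda pair: pair[1])  # smallest distance first
    return scored[:k]


# Hand-made "embeddings" (hypothetical values for illustration)
vectors = [
    [1.0, 0.0, 0.0],  # document 0
    [0.0, 1.0, 0.0],  # document 1
    [0.9, 0.1, 0.0],  # document 2
]
query = [1.0, 0.0, 0.0]

hits = flat_search(query, vectors, k=2)
print(hits)
```

Documents 0 and 2 point in nearly the same direction as the query, so they are returned ahead of document 1. Smaller distances mean closer matches.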

In [None]:
import numpy as np
from sentence_transformers import SentenceTransformer
import faiss
from typing import List, Dict, Any
import pickle

class LocalRAGStore:
    """A simple local RAG store using FAISS and sentence transformers."""
    
    def __init__(self, model_name: str = 'all-MiniLM-L6-v2'):
        """Initialize the RAG store with an embedding model."""
        print(f"🚀 Loading embedding model: {model_name}...")
        self.embedder = SentenceTransformer(model_name)
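        # Ask the model for its output size instead of hard-coding it (384 for all-MiniLM-L6-v2)
        self.embedding_dim = self.embedder.get_sentence_embedding_dimension()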
        
        # Storage
        self.documents = []
        self.metadata = []
        self.index = faiss.IndexFlatL2(self.embedding_dim)
        
        print("✅ RAG store initialized!")
    
    def add_documents(self, texts: List[str], metadata: List[Dict] = None):
        """Add documents to the store."""
        if not texts:
            return
        
        # Store documents
        self.documents.extend(texts)
        
        # Store metadata
        if metadata:
            self.metadata.extend(metadata)
        else:
            self.metadata.extend([{} for _ in texts])
        
        # Create embeddings
        print(f"📊 Creating embeddings for {len(texts)} documents...")
        embeddings = self.embedder.encode(texts, show_progress_bar=True)
        
        # Add to FAISS index
        self.index.add(embeddings.astype('float32'))
        
        print(f"✅ Added {len(texts)} documents to the knowledge base")
        print(f"   Total documents: {len(self.documents)}")
    
    def search(self, query: str, k: int = 3) -> List[Dict[str, Any]]:
        """Search for relevant documents."""
        # Embed the query
        query_embedding = self.embedder.encode([query]).astype('float32')
        
        # Search in FAISS
        distances, indices = self.index.search(query_embedding, k)
        
        # Format results
        results = []
        for i, (idx, distance) in enumerate(zip(indices[0], distances[0])):
            # FAISS pads the result with -1 when fewer than k vectors are indexed
            if 0 <= idx < len(self.documents):
                results.append({
                    'content': self.documents[idx],
                    'metadata': self.metadata[idx],
                    'score': float(distance),  # L2 distance: lower means more similar
                    'rank': i + 1
                })
        
        return results

# Create our RAG store
print("🏗️ Creating local RAG store...")
rag_store = LocalRAGStore()
print("\n💡 The embedding model (about 90MB) is downloaded the first time it is loaded")

## 📖 Step 5: Loading Documents into the Knowledge Base

### Populating Our RAG System
Now we'll load the documents we created earlier into our vector store. This creates a searchable knowledge base.

In [None]:
import os
from pathlib import Path

# Load documents from the rag_docs directory
rag_docs_path = Path('../rag_docs')
documents_to_load = []
metadata_list = []

print("📁 Loading documents from rag_docs directory...")

# Read all text and markdown files
for pattern, doc_type in (('*.txt', 'text'), ('*.md', 'markdown')):
    for file_path in sorted(rag_docs_path.glob(pattern)):
        content = file_path.read_text(encoding='utf-8')
        documents_to_load.append(content)
        metadata_list.append({
            'source': file_path.name,
            'type': doc_type
        })
        print(f"   📄 Loaded: {file_path.name}")

# Add documents to RAG store
if documents_to_load:
    print(f"\n🔄 Adding {len(documents_to_load)} documents to the knowledge base...")
    rag_store.add_documents(documents_to_load, metadata_list)
else:
    print("\n⚠️  No documents found. Creating sample content...")
    
    # Add some sample content
    sample_docs = [
        "Strands is a framework for building AI agents. It supports tools and multiple models.",
        "RAG stands for Retrieval-Augmented Generation. It helps agents access external knowledge.",
        "AWS Bedrock provides access to Claude and other foundation models."
    ]
    
    sample_metadata = [
        {"source": "sample_1", "type": "intro"},
        {"source": "sample_2", "type": "rag"},
        {"source": "sample_3", "type": "bedrock"}
    ]
    
    rag_store.add_documents(sample_docs, sample_metadata)

## 🔧 Step 6: Creating RAG Tools

### Tools = Agent Superpowers
We'll create tools that allow our agent to:
1. Search the knowledge base
2. Add new information

### 🎯 Key Concepts
- Tools should have type hints so the agent knows what arguments each tool expects
- Clear docstrings help the agent understand when to use each tool
- Tools should handle errors gracefully
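
As a minimal sketch of these three points, the function below is written as a plain Python function so it runs without Strands installed; in this notebook you would additionally decorate it with `@tool`. The `GLOSSARY` data and the `lookup_term` name are hypothetical and exist only for this illustration.

```python
from typing import Dict

# Hypothetical in-memory data standing in for a real knowledge base
GLOSSARY: Dict[str, str] = {
    "rag": "Retrieval-Augmented Generation",
    "faiss": "A library for vector similarity search",
}


def lookup_term(term: str) -> str:
    """Look up the definition of a technical term.

    Args:
        term: The term to define, for example "RAG"

    Returns:
        The definition, or a message explaining why none was found
    """
    # Validate input and return a readable message instead of raising
    if not term or not term.strip():
        return "Error: please provide a non-empty term."

    definition = GLOSSARY.get(term.strip().lower())
    if definition is None:
        return f"No definition found for '{term}'."
    return f"{term}: {definition}"


print(lookup_term("RAG"))
print(lookup_term("  "))
print(lookup_term("LSTM"))
```

Returning a message rather than raising lets the agent read the failure and decide what to do next, for example by rephrasing its query.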

In [None]:
from strands import tool

@tool
def search_knowledge_base(query: str, num_results: int = 3) -> str:
    """Search the knowledge base for relevant information.
    
    Args:
        query: The search query
        num_results: Number of results to return (default: 3)
    
    Returns:
        Formatted search results with sources
    """
    # Perform the search; report failures to the agent instead of raising
    try:
        results = rag_store.search(query, k=max(1, num_results))
    except Exception as e:
        return f"Error searching the knowledge base: {e}"
    
    if not results:
        return "No relevant information found in the knowledge base."
    
    # Format the results
    formatted_results = []
    for result in results:
        source = result['metadata'].get('source', 'Unknown')
        content = result['content']
        if len(content) > 200:
            content = content[:200] + "..."
        
        # The score is an L2 distance, so smaller values are better matches
        formatted_results.append(
            f"[Source: {source}]\n"
            f"Content: {content}\n"
            f"L2 Distance (lower is more relevant): {result['score']:.2f}"
        )
    
    return "\n\n".join(formatted_results)

@tool
def add_to_knowledge_base(content: str, source: str = "user_added") -> str:
    """Add new information to the knowledge base.
    
    Args:
        content: The information to add
        source: The source of the information (default: "user_added")
    
    Returns:
        Confirmation message
    """
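    # Reject empty input rather than indexing a blank document
    if not content.strip():
        return "Nothing was added: the provided content is empty."
    
    # Add the document
    rag_store.add_documents(
        [content],
        [{'source': source, 'type': 'user_added'}]
    )
    
    return f"Successfully added new information to the knowledge base from source: {source}"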

print("🔧 RAG tools created successfully!")
print("   - search_knowledge_base: Find relevant information")
print("   - add_to_knowledge_base: Add new knowledge")

## 🤖 Step 7: Creating Your RAG-Enabled Agent

### Bringing It All Together
Now we'll create an agent that can:
- Access the knowledge base
- Answer questions using retrieved information
- Learn new information when provided

### 📋 Agent Instructions
The system prompt is crucial: it tells the agent to search the knowledge base before answering.

In [None]:
# Create a RAG-enabled agent
rag_agent = Agent(
    model=bedrock_model,
    system_prompt="""You are a helpful AI assistant with access to a knowledge base about the Strands framework and AI agents.
    
    IMPORTANT INSTRUCTIONS:
    1. Always search the knowledge base FIRST before answering questions
    2. If you find relevant information, use it in your response and cite the source
    3. If no relevant information is found, acknowledge this and provide your best answer
    4. When users provide new information, add it to the knowledge base
    5. Be helpful, accurate, and concise in your responses
    
    Remember: Your knowledge base is your primary source of truth!""",
    tools=[search_knowledge_base, add_to_knowledge_base]
)

print("🎉 RAG-enabled agent created successfully!")
print("   Model: Claude 3.7 Sonnet (via Bedrock)")
print("   Knowledge Base: Local FAISS vector store")
print("   Tools: search_knowledge_base, add_to_knowledge_base")

## 💬 Step 8: Testing Your RAG Agent

### Real-World Queries
Let's test our agent with various questions to see how it uses the knowledge base. Watch how it searches for information before responding!

In [None]:
# Test 1: Ask about Strands
print("🔍 Test 1: Asking about Strands")
print("="*50)
response = rag_agent("What is the Strands framework?")
print(f"🤖 Agent: {response}")
print("\n" + "="*50 + "\n")

In [None]:
# Test 2: Ask about Python best practices
print("🔍 Test 2: Asking about Python best practices")
print("="*50)
response = rag_agent("What are some Python best practices for AI development?")
print(f"🤖 Agent: {response}")
print("\n" + "="*50 + "\n")

In [None]:
# Test 3: Add new information
print("📝 Test 3: Adding new information to the knowledge base")
print("="*50)
response = rag_agent(
    "I want to add this information: RAG systems can significantly improve agent accuracy "
    "by providing access to up-to-date and domain-specific information. They're especially "
    "useful for technical documentation and customer support."
)
print(f"🤖 Agent: {response}")
print("\n" + "="*50 + "\n")

In [None]:
# Test 4: Query the newly added information
print("🔍 Test 4: Querying the newly added information")
print("="*50)
response = rag_agent("What are RAG systems especially useful for?")
print(f"🤖 Agent: {response}")
print("\n" + "="*50 + "\n")

## 🔍 Step 9: Understanding RAG Performance

### Behind the Scenes
Let's explore how our RAG system works and examine its performance characteristics.
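
One detail helps when reading the scores: `faiss.IndexFlatL2` reports squared Euclidean distances, so lower values mean closer matches. If the embeddings are unit-length (an assumption you should verify for your embedding model), a squared distance d corresponds to a cosine similarity of 1 - d/2. The sketch below checks that relationship on two small vectors; the helper names `l2_to_cosine` and `normalize` are hypothetical.

```python
import math
from typing import List


def l2_to_cosine(squared_distance: float) -> float:
    """Convert a squared L2 distance between unit vectors to cosine similarity."""
    return 1.0 - squared_distance / 2.0


def normalize(v: List[float]) -> List[float]:
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


a = normalize([3.0, 4.0])
b = normalize([4.0, 3.0])

# Compute both quantities directly and compare them
squared_distance = sum((x - y) ** 2 for x, y in zip(a, b))
cosine_direct = sum(x * y for x, y in zip(a, b))

print(round(l2_to_cosine(squared_distance), 4), round(cosine_direct, 4))
```

A conversion like this can make scores easier to interpret, since cosine similarity is bounded and higher means more similar.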

In [None]:
# Analyze the knowledge base
print("📊 Knowledge Base Statistics:")
print(f"   Total documents: {len(rag_store.documents)}")
print(f"   Embedding dimension: {rag_store.embedding_dim}")
print(f"   Index size: {rag_store.index.ntotal}")

# Test search performance
import time

test_queries = [
    "What is Strands?",
    "How do I use tools?",
    "AWS Bedrock setup",
    "Python best practices"
]

print("\n⏱️  Search Performance Test:")
for query in test_queries:
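    # perf_counter is a monotonic, high-resolution clock suited to short timings.
    # The measured time covers both embedding the query and the FAISS lookup.
    start_time = time.perf_counter()
    results = rag_store.search(query, k=3)
    search_time = time.perf_counter() - start_time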
    
    print(f"   Query: '{query}'")
    print(f"   Time: {search_time*1000:.2f}ms")
    print(f"   Top result: {results[0]['metadata']['source'] if results else 'No results'}")
    print()

## 🎓 Step 10: Advanced Tips and Best Practices

### Making the Most of RAG
Here are key tips for building production-ready RAG systems.
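
Chunking is the tip that most often needs code, so here is a minimal sketch of a word-based splitter with overlap. The function name `chunk_text` and its default sizes are assumptions chosen for illustration; production systems often split on sentences or tokens instead.

```python
from typing import List


def chunk_text(text: str, chunk_size: int = 50, overlap: int = 10) -> List[str]:
    """Split text into chunks of at most chunk_size words, with consecutive chunks sharing overlap words."""
    if chunk_size <= 0 or not 0 <= overlap < chunk_size:
        raise ValueError("chunk_size must be positive and overlap must be in [0, chunk_size)")

    words = text.split()
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # this chunk already reaches the end of the text
    return chunks


sample = " ".join(f"word{i}" for i in range(12))
pieces = chunk_text(sample, chunk_size=5, overlap=2)
for piece in pieces:
    print(piece)
```

Each chunk could then be passed to `rag_store.add_documents` with metadata that records the source file, so that search results point back to the original document.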

In [None]:
print("🎓 RAG BEST PRACTICES")
print("=" * 60)

best_practices = {
    "📚 Document Preparation": [
        "Chunk large documents into smaller pieces",
        "Include metadata (source, date, category)",
        "Keep chunks self-contained and meaningful",
        "Remove duplicate information"
    ],
    "🔍 Search Optimization": [
        "Use appropriate embedding models for your domain",
        "Experiment with different k values for retrieval",
        "Consider hybrid search (vector + keyword)",
        "Implement re-ranking for better results"
    ],
    "🤖 Agent Design": [
        "Always instruct agents to search first",
        "Have agents cite their sources",
        "Handle 'no results' cases gracefully",
        "Allow agents to request more information"
    ],
    "⚡ Performance": [
        "Use smaller models for simple domains",
        "Cache frequent queries",
        "Batch document additions",
        "Consider GPU acceleration for embeddings"
    ]
}

for category, tips in best_practices.items():
    print(f"\n{category}")
    for tip in tips:
        print(f"   • {tip}")

# Saving and loading the knowledge base
print("\n\n💾 SAVING YOUR KNOWLEDGE BASE")
print("=" * 60)
print("""
# Save the knowledge base
import pickle

def save_rag_store(rag_store, filepath='rag_store.pkl'):
    with open(filepath, 'wb') as f:
        pickle.dump({
            'documents': rag_store.documents,
            'metadata': rag_store.metadata,
            'index': faiss.serialize_index(rag_store.index)
        }, f)

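# Load the knowledge base (only unpickle files you created yourself)
def load_rag_store(filepath='rag_store.pkl'):
    with open(filepath, 'rb') as f:
        data = pickle.load(f)
    # Reconstruct the store; this reloads the embedding model
    store = LocalRAGStore()
    store.documents = data['documents']
    store.metadata = data['metadata']
    store.index = faiss.deserialize_index(data['index'])
    return store
""")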

## 🎉 Congratulations!

### 🏆 What You've Accomplished

In just 10 minutes, you've:
- ✅ Built a complete RAG system from scratch
- ✅ Created a local vector store with FAISS
- ✅ Implemented embedding-based search
- ✅ Built RAG tools for Strands agents
- ✅ Created a knowledge-enabled AI agent
- ✅ Tested real-world queries

### 🚀 What's Next?

Now that you've mastered the basics of RAG, you're ready to explore:
1. **Advanced Document Processing** - Handle PDFs, Word docs, and more
2. **Persistent Storage** - Use ChromaDB for long-term knowledge
3. **Production Deployment** - Scale your RAG system
4. **Multi-Agent RAG** - Share knowledge between agents

### 💡 Key Takeaways

1. **RAG = Retrieval + Generation**: Combine search with AI generation
2. **Local Storage**: Keep your data private and costs low
3. **Powerful Cloud LLMs**: Use Bedrock for advanced reasoning
4. **Tools Enable RAG**: search_knowledge_base and add_to_knowledge_base

### 📚 Resources

- [Strands Documentation](https://strandsagents.com/0.1.x/)
- [FAISS Documentation](https://github.com/facebookresearch/faiss)
- [Sentence Transformers](https://www.sbert.net/)

### 🌟 Challenge Yourself

Try enhancing your RAG system by:
- Adding more documents to the knowledge base
- Experimenting with different embedding models
- Implementing document chunking for large files
- Creating specialized agents for different domains

Happy building with RAG! 🚀🤖✨