# 🚀 Getting Started with RAG (Retrieval-Augmented Generation)

This notebook walks you through building your first RAG system with the GenerativeAI Starter Kit.

## What is RAG?

RAG combines the power of:
- **Retrieval**: Finding relevant information from a knowledge base
- **Generation**: Using an LLM to generate responses based on retrieved context

Because the model answers from documents retrieved at query time, it can draw on information newer than its training data and point to the sources behind each response.
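
To make the two steps concrete, here is a minimal sketch in plain Python. It is not how the Starter Kit works internally: the word-overlap scoring, the prompt wording, and the helper names `tokenize`, `retrieve`, and `build_prompt` are all assumptions chosen for illustration. A real system would use embeddings for retrieval and send the prompt to an LLM.

```python
import re


def tokenize(text: str) -> set[str]:
    """Lowercase the text and return its set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(query: str, knowledge_base: list[str], k: int = 2) -> list[str]:
    """Return the k passages sharing the most words with the query (Jaccard overlap)."""
    query_words = tokenize(query)

    def overlap(passage: str) -> float:
        passage_words = tokenize(passage)
        union = query_words | passage_words
        return len(query_words & passage_words) / len(union) if union else 0.0

    return sorted(knowledge_base, key=overlap, reverse=True)[:k]


def build_prompt(query: str, passages: list[str]) -> str:
    """Assemble the text an LLM would receive: numbered context, then the question."""
    context = "\n".join(f"[{i}] {p}" for i, p in enumerate(passages, 1))
    return f"Answer using only the context below.\n\nContext:\n{context}\n\nQuestion: {query}"


# Toy knowledge base (hypothetical content for illustration only)
knowledge_base = [
    "Machine learning builds models from training data.",
    "Deep learning uses neural networks with multiple layers.",
    "RAG combines retrieval with text generation.",
]
question = "What does machine learning build from data?"
top_passages = retrieve(question, knowledge_base, k=1)
prompt = build_prompt(question, top_passages)
print(prompt)
```

The generation step is left out on purpose: whatever LLM you use would simply receive `prompt` as its input.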

## What You'll Learn

1. Set up a vector database for document storage
2. Add documents to your knowledge base
3. Perform semantic search and retrieval
4. Generate AI responses with source citations
5. Build a complete question-answering system

## 📋 Prerequisites

Before running this notebook, make sure you have:

1. **Installed the GenerativeAI Starter Kit**
   ```bash
   python scripts/setup/install.py
   ```

2. **Set up your API keys in a `.env` file**
   ```
   OPENAI_API_KEY=your_api_key_here
   ```

3. **Activated your virtual environment**
   ```bash
   source venv/bin/activate  # Linux/Mac
   # or
   venv\Scripts\activate     # Windows
   ```

In [None]:
# Import required libraries
import sys
import os
from pathlib import Path

# Add src to Python path
sys.path.append(str(Path('..') / 'src'))

# Load environment variables
from dotenv import load_dotenv
load_dotenv('../.env')

# Import our RAG components
from rag.pipeline import RAGPipeline
from rag.vector_store import VectorStoreManager
from rag.retriever import DocumentRetriever
from rag.generator import ResponseGenerator, OpenAILLM

print("✅ All imports successful!")
print(f"OpenAI API Key loaded: {'Yes' if os.getenv('OPENAI_API_KEY') else 'No'}")

## 🏗️ Step 1: Initialize the RAG Pipeline

We'll create a RAG pipeline that uses:
- **ChromaDB** for vector storage (runs locally, no additional setup needed)
- **OpenAI** for text generation
- **Sentence Transformers** for document embeddings

In [None]:
# Initialize the RAG pipeline
pipeline = RAGPipeline(
    vector_store_type="chroma",           # Local vector database
    llm_provider="openai",               # OpenAI for generation
    collection_name="getting_started",   # Name for our document collection
    top_k=3,                            # Retrieve top 3 most relevant documents
    similarity_threshold=0.5            # Minimum similarity score
)

print("🎉 RAG Pipeline initialized successfully!")
print(f"Vector Store: {pipeline.vector_store.store_type}")
print(f"LLM Model: {pipeline.llm.model}")
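
The `top_k` and `similarity_threshold` arguments in the cell above control how many results reach the LLM. The sketch below shows one plausible way the two interact; whether the kit filters before or after truncating, and whether the threshold is inclusive, are assumptions here, and `select_results` is a hypothetical helper.

```python
def select_results(scored_docs, top_k=3, similarity_threshold=0.5):
    """Keep docs scoring at or above the threshold, then return the best top_k."""
    kept = [(doc, score) for doc, score in scored_docs if score >= similarity_threshold]
    kept.sort(key=lambda pair: pair[1], reverse=True)
    return kept[:top_k]


# Made-up (document, similarity) pairs for illustration
scored = [("doc_a", 0.82), ("doc_b", 0.47), ("doc_c", 0.65), ("doc_d", 0.91), ("doc_e", 0.58)]
selected = select_results(scored, top_k=3, similarity_threshold=0.5)
print(selected)
```

Raising the threshold trades recall for precision: fewer, more relevant chunks reach the prompt, and a query with no strong match may return nothing at all.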

## 📚 Step 2: Add Documents to Knowledge Base

Let's add some sample documents about artificial intelligence topics. In a real application, these could be:
- Company documentation
- Research papers
- Product manuals
- FAQ documents
- Any text-based knowledge

In [None]:
# Sample documents about AI and machine learning
documents = [
    """
    Artificial Intelligence (AI) Overview:
    
    Artificial Intelligence is the simulation of human intelligence in machines that are programmed 
    to think and learn like humans. AI systems can perform tasks that typically require human 
    intelligence, such as visual perception, speech recognition, decision-making, and language 
    translation. The field includes machine learning, deep learning, natural language processing, 
    computer vision, and robotics.
    """,
    
    """
    Machine Learning Fundamentals:
    
    Machine Learning (ML) is a subset of AI that enables computers to learn and improve from 
    experience without being explicitly programmed. ML algorithms build mathematical models 
    based on training data to make predictions or decisions. There are three main types:
    
    1. Supervised Learning: Uses labeled training data
    2. Unsupervised Learning: Finds patterns in unlabeled data
    3. Reinforcement Learning: Learns through interaction with an environment
    """,
    
    """
    Deep Learning and Neural Networks:
    
    Deep Learning is a specialized subset of machine learning that uses artificial neural networks 
    with multiple layers to model complex patterns in data. These networks are inspired by the 
    human brain and consist of interconnected nodes (neurons) organized in layers. Deep learning 
    has achieved breakthrough results in image recognition, natural language processing, speech 
    recognition, and game playing.
    """,
    
    """
    Natural Language Processing (NLP):
    
    NLP is a branch of AI that focuses on enabling computers to understand, interpret, and 
    generate human language. Key NLP tasks include:
    
    - Text classification and sentiment analysis
    - Machine translation between languages
    - Question answering and chatbots
    - Text summarization and generation
    - Named entity recognition
    
    Modern NLP relies heavily on transformer models like BERT, GPT, and T5.
    """,
    
    """
    RAG (Retrieval-Augmented Generation):
    
    RAG is a technique that combines information retrieval with text generation to create more 
    accurate and up-to-date AI responses. The process works in two steps:
    
    1. Retrieval: Search for relevant documents from a knowledge base using semantic similarity
    2. Generation: Use retrieved context to generate informed responses with an LLM
    
    RAG enables AI systems to access external knowledge beyond their training data, making 
    responses more factual and reducing hallucinations.
    """
]

# Metadata for each document (optional but useful for tracking sources)
metadata = [
    {"source": "AI_Overview.pdf", "topic": "artificial_intelligence", "author": "AI Research Team"},
    {"source": "ML_Guide.pdf", "topic": "machine_learning", "author": "Data Science Team"},
    {"source": "DL_Handbook.pdf", "topic": "deep_learning", "author": "Neural Networks Lab"},
    {"source": "NLP_Tutorial.pdf", "topic": "natural_language_processing", "author": "Language AI Team"},
    {"source": "RAG_Paper.pdf", "topic": "retrieval_augmented_generation", "author": "AI Research Institute"}
]

print(f"📄 Preparing to add {len(documents)} documents to the knowledge base...")

In [None]:
# Add documents to the pipeline
result = pipeline.add_documents(documents, metadata)

print("✅ Documents added successfully!")
print(f"📊 Documents added: {result['documents_added']}")
print(f"🧩 Chunks created: {result['chunks_created']}")
print(f"⏱️  Processing time: {result['processing_time']:.2f} seconds")
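
The `chunks_created` count suggests the pipeline splits each document into smaller pieces before embedding it. The kit's actual chunking strategy is not shown in this notebook, so the following is only a sketch of one common approach: fixed-size character windows with overlap. The function name and default sizes are assumptions.

```python
def chunk_text(text: str, chunk_size: int = 200, overlap: int = 50) -> list[str]:
    """Split text into character windows of chunk_size that overlap by `overlap`."""
    if chunk_size <= 0 or not 0 <= overlap < chunk_size:
        raise ValueError("need chunk_size > 0 and 0 <= overlap < chunk_size")
    text = " ".join(text.split())  # collapse newlines and indentation
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this window already reached the end of the text
    return chunks


# Hypothetical sample text, repeated to make it long enough to split
sample = "Machine Learning (ML) is a subset of AI. " * 10
chunks = chunk_text(sample, chunk_size=120, overlap=20)
print(len(chunks), [len(c) for c in chunks])
```

The overlap means a sentence cut at a chunk boundary still appears whole in at least one neighbouring chunk, which tends to help retrieval.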

## 🔍 Step 3: Test Document Retrieval

Before generating responses, let's test the retrieval system to see if it can find relevant documents.

In [None]:
# Test retrieval with a sample query
test_query = "What is machine learning?"

print(f"🔍 Testing retrieval for query: '{test_query}'")
print("=" * 50)

# Get retrieved documents
retrieved_docs = pipeline.retriever.retrieve(test_query, k=3)

print(f"📋 Found {len(retrieved_docs)} relevant documents:")
print()

for i, doc in enumerate(retrieved_docs, 1):
    relevance_score = doc['relevance_score']
    source = doc['metadata'].get('source', 'Unknown')
    snippet = doc['document'][:200] + "..." if len(doc['document']) > 200 else doc['document']
    
    print(f"📄 Document {i}:")
    print(f"   📊 Relevance Score: {relevance_score:.3f}")
    print(f"   📁 Source: {source}")
    print(f"   📝 Snippet: {snippet.strip()}")
    print()
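
A relevance score like the one printed above is typically a similarity between the query embedding and each document embedding. How the kit computes its score is not shown here; cosine similarity is a common choice, and this sketch uses made-up three-dimensional vectors in place of real embeddings.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Made-up "embeddings"; real models emit hundreds of dimensions.
doc_vectors = {
    "ML_Guide.pdf": [0.9, 0.1, 0.2],
    "NLP_Tutorial.pdf": [0.2, 0.8, 0.3],
    "DL_Handbook.pdf": [0.7, 0.2, 0.6],
}
query_vector = [1.0, 0.0, 0.1]

# Rank documents from most to least similar to the query
ranked = sorted(
    ((name, cosine_similarity(query_vector, vec)) for name, vec in doc_vectors.items()),
    key=lambda pair: pair[1],
    reverse=True,
)
for name, score in ranked:
    print(f"{name}: {score:.3f}")
```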


## 🤖 Step 4: Generate AI Responses

Now let's use the complete RAG pipeline to generate informed responses with source citations.
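
Source citations come from mapping each retrieved chunk back to its metadata. The sketch below assumes retrieved documents have the shape used in Step 3 (`document`, `metadata`, `relevance_score`) and that citation entries carry `id`, `name`, and `relevance_score`. The `[1]`-style id and the `build_sources` helper are assumptions, not the kit's confirmed behavior.

```python
def build_sources(retrieved_docs: list[dict]) -> list[dict]:
    """Turn retrieved documents into numbered citation entries."""
    sources = []
    for number, doc in enumerate(retrieved_docs, 1):
        sources.append({
            "id": f"[{number}]",  # assumed citation label format
            "name": doc.get("metadata", {}).get("source", "Unknown"),
            "relevance_score": doc["relevance_score"],
        })
    return sources


# Made-up retrieval results for illustration
retrieved = [
    {"document": "Machine Learning (ML) is a subset of AI...",
     "metadata": {"source": "ML_Guide.pdf"}, "relevance_score": 0.81},
    {"document": "Deep Learning is a specialized subset...",
     "metadata": {}, "relevance_score": 0.64},
]
sources = build_sources(retrieved)
for source in sources:
    print(f"   {source['id']} {source['name']} (Relevance: {source['relevance_score']:.3f})")
```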

In [None]:
# Test queries for our AI system
test_queries = [
    "What is machine learning and how does it work?",
    "What's the difference between AI and deep learning?",
    "How does RAG improve AI responses?",
    "What are the main applications of NLP?"
]

print("🤖 Testing RAG Pipeline with Multiple Queries")
print("=" * 60)

for i, query in enumerate(test_queries, 1):
    print(f"\n❓ Query {i}: {query}")
    print("-" * 50)
    
    # Generate response with sources
    response = pipeline.query(
        query,
        include_sources=True,    # Include source citations
        temperature=0.7,        # Creativity level (0.0 = focused, 1.0 = creative)
        max_tokens=300          # Maximum response length
    )
    
    print("🎯 **Answer:**")
    print(response['answer'])
    print()
    
    # Show sources
    if response.get('sources'):
        print("📚 **Sources:**")
        for source in response['sources']:
            print(f"   {source['id']} {source['name']} (Relevance: {source['relevance_score']:.3f})")
    
    print(f"⏱️  **Response Time:** {response['total_time']:.2f}s")
    print("=" * 60)
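
The `temperature` argument used above is easier to reason about with numbers. In the standard formulation, a language model divides its token scores (logits) by the temperature before applying softmax, so lower values concentrate probability on the top candidate. This sketch uses made-up logits; it illustrates the general technique and says nothing about how any particular provider implements sampling.

```python
import math


def softmax_with_temperature(logits: list[float], temperature: float) -> list[float]:
    """Convert logits to probabilities after scaling them by 1/temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive in this sketch")
    scaled = [value / temperature for value in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(value - peak) for value in scaled]
    total = sum(exps)
    return [value / total for value in exps]


logits = [2.0, 1.0, 0.5]  # made-up scores for three candidate tokens
for t in (0.2, 0.7, 1.0):
    probs = softmax_with_temperature(logits, t)
    print(t, [round(p, 3) for p in probs])
```

For factual question answering over retrieved context, a lower temperature is often a reasonable starting point, since the goal is to stay close to the sources.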

## 💬 Step 5: Interactive Conversation Mode

Let's test conversation mode, in which the pipeline carries earlier exchanges into each new query so that follow-up questions can refer back to them.
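
One simple way to give a model this kind of memory is to prepend the most recent turns to each new prompt. The sketch below shows the idea; the history record shape, the `max_turns` limit, and the `build_conversation_prompt` helper are assumptions for illustration, not the kit's implementation.

```python
def build_conversation_prompt(history, new_question, max_turns=3):
    """Prepend the most recent turns so follow-ups like "it" can be resolved."""
    # history[-0:] would return everything, so handle max_turns == 0 explicitly
    recent = history[-max_turns:] if max_turns > 0 else []
    lines = []
    for turn in recent:
        lines.append(f"User: {turn['question']}")
        lines.append(f"Assistant: {turn['answer']}")
    lines.append(f"User: {new_question}")
    return "\n".join(lines)


# Hypothetical history with one completed turn
history = [
    {"question": "What is deep learning?",
     "answer": "A subset of machine learning that uses multi-layer neural networks."},
]
prompt = build_conversation_prompt(
    history, "How is it different from traditional machine learning?"
)
print(prompt)
```

Capping the number of turns keeps the prompt within the model's context window at the cost of forgetting older exchanges.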

In [None]:
# Reset conversation history
pipeline.reset_conversation()

print("💬 Conversation Mode Example")
print("=" * 40)

# Simulate a conversation
conversation = [
    "What is deep learning?",
    "How is it different from traditional machine learning?",
    "Can you give me some practical applications?"
]

for turn, query in enumerate(conversation, 1):
    print(f"\n👤 **Turn {turn} - User:** {query}")
    
    response = pipeline.query(
        query,
        conversation_mode=True,  # Enable conversation context
        include_sources=False,  # Skip sources for cleaner conversation
        temperature=0.7,
        max_tokens=200
    )
    
    print(f"🤖 **AI Assistant:** {response['answer']}")

history = pipeline.conversation_history
# Average over recorded turns; fall back to 0 for an empty history to avoid dividing by zero
avg_time = sum(h.get('response_time', 0) for h in history) / len(history) if history else 0

print("\n📊 **Conversation Summary:**")
print(f"Total turns: {len(history)}")
print(f"Average response time: {avg_time:.2f}s")

## 📊 Step 6: Pipeline Analytics

Let's examine the performance and usage statistics of our RAG system.

In [None]:
# Get analytics
analytics = pipeline.get_analytics()

print("📊 RAG Pipeline Analytics")
print("=" * 30)

for key, value in analytics.items():
    if isinstance(value, float):
        print(f"{key.replace('_', ' ').title()}: {value:.3f}")
    else:
        print(f"{key.replace('_', ' ').title()}: {value}")

print("\n🎯 Performance Insights:")
if analytics.get('average_total_time', 0) < 2.0:
    print("✅ Excellent response times!")
elif analytics.get('average_total_time', 0) < 5.0:
    print("✅ Good response times")
else:
    print("⚠️  Consider optimizing for faster responses")

if analytics.get('average_response_length', 0) > 100:
    print("✅ Generating comprehensive responses")
else:
    print("ℹ️  Responses are concise")
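
Metrics such as `average_total_time` and `average_response_length` are aggregates over per-query records. How the kit collects them is not shown here; the sketch below assumes each record stores an `answer` string and a `total_time` in seconds, and that response length is measured in characters. The `summarize` helper and the `total_queries` key are hypothetical.

```python
from statistics import mean


def summarize(query_log: list[dict]) -> dict:
    """Aggregate per-query records into summary metrics."""
    if not query_log:
        return {"total_queries": 0, "average_total_time": 0.0, "average_response_length": 0.0}
    return {
        "total_queries": len(query_log),
        "average_total_time": mean(r["total_time"] for r in query_log),
        "average_response_length": mean(len(r["answer"]) for r in query_log),
    }


# Made-up query records for illustration
log = [
    {"answer": "Machine learning is a subset of AI.", "total_time": 1.2},
    {"answer": "RAG retrieves context before generating.", "total_time": 1.8},
]
stats = summarize(log)
print(stats)
```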

## 🔧 Step 7: Advanced Features

This step covers three advanced features of the RAG system: query explanation, metadata filtering, and batch processing.

In [None]:
# 1. Query Explanation - understand how the system processes queries
query = "How does neural network training work?"
explanation = pipeline.explain_query(query)

print(f"🔍 Query Analysis for: '{query}'")
print("=" * 50)
print(f"Documents found: {explanation['retrieval_analysis']['total_results']}")
print(f"Average relevance: {explanation['retrieval_analysis']['summary']['average_score']:.3f}")
print(f"Best match score: {explanation['retrieval_analysis']['summary']['highest_score']:.3f}")

if explanation.get('suggestions'):
    print("\n💡 Suggestions for better results:")
    for suggestion in explanation['suggestions']:
        print(f"   • {suggestion}")

In [None]:
# 2. Metadata Filtering - search within specific document types
print("\n🏷️  Metadata Filtering Example")
print("=" * 40)

# Search only in documents from "AI Research Team"
filtered_results = pipeline.retriever.retrieve_with_metadata_filter(
    query="What is artificial intelligence?",
    metadata_filter={"author": "AI Research Team"},
    k=2
)

print(f"Found {len(filtered_results)} documents from 'AI Research Team':")
for doc in filtered_results:
    print(f"   📄 {doc['metadata']['source']} (Score: {doc['relevance_score']:.3f})")
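
Conceptually, a metadata filter is a set of key-value conditions that a document must satisfy before similarity ranking is applied. This sketch assumes exact-match semantics on every key; the kit or its vector store may support richer operators, and `matches_filter` is a hypothetical helper.

```python
def matches_filter(metadata: dict, metadata_filter: dict) -> bool:
    """True when every filter key is present in metadata with an equal value."""
    return all(
        key in metadata and metadata[key] == value
        for key, value in metadata_filter.items()
    )


# A subset of the metadata defined in Step 2
catalog = [
    {"source": "AI_Overview.pdf", "author": "AI Research Team"},
    {"source": "ML_Guide.pdf", "author": "Data Science Team"},
    {"source": "RAG_Paper.pdf", "author": "AI Research Institute"},
]
hits = [m["source"] for m in catalog if matches_filter(m, {"author": "AI Research Team"})]
print(hits)
```

With exact matching, "AI Research Institute" does not satisfy a filter for "AI Research Team", so consistent metadata values matter when you add documents.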

In [None]:
# 3. Batch Processing - process multiple queries efficiently
print("\n⚡ Batch Processing Example")
print("=" * 30)

batch_queries = [
    "What is supervised learning?",
    "What is unsupervised learning?",
    "What is reinforcement learning?"
]

batch_results = pipeline.batch_query(batch_queries, include_sources=False)

for result in batch_results:
    print(f"\n❓ {result['question']}")
    print(f"🤖 {result['answer'][:150]}...")
    print(f"⏱️  {result['total_time']:.2f}s")
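
Whether `batch_query` runs queries one after another or concurrently is not shown in this notebook. If you need concurrency for I/O-bound calls, one option is a thread pool, sketched below. `fake_query` is a stand-in for a single pipeline call and `run_batch` is a hypothetical helper; neither is part of the kit.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fake_query(question: str) -> dict:
    """Stand-in for one pipeline call; sleeps briefly to mimic network latency."""
    started = time.perf_counter()
    time.sleep(0.05)
    return {
        "question": question,
        "answer": f"(answer to: {question})",
        "total_time": time.perf_counter() - started,
    }


def run_batch(questions: list[str], max_workers: int = 4) -> list[dict]:
    """Run queries concurrently; pool.map returns results in input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(fake_query, questions))


questions = [
    "What is supervised learning?",
    "What is unsupervised learning?",
    "What is reinforcement learning?",
]
results = run_batch(questions)
for result in results:
    print(result["question"], f"{result['total_time']:.2f}s")
```

Before parallelizing real calls, check your provider's rate limits and confirm that the pipeline object is safe to share across threads.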

## 💾 Step 8: Save and Load Pipeline State

You can save the pipeline state, including the query and conversation history, to a JSON file for later use.

In [None]:
# Save pipeline state
state_file = "../data/pipeline_state.json"
pipeline.save_state(state_file)

print(f"💾 Pipeline state saved to: {state_file}")
print(f"📊 Saved {len(pipeline.query_history)} queries and {len(pipeline.conversation_history)} conversation turns")
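
Restoring state is the mirror image of saving it. The kit's own loading method is not shown in this notebook, so here is a generic round trip using only the standard library. The state layout and the `save_state` and `load_state` functions are assumptions for illustration.

```python
import json
import tempfile
from pathlib import Path


def save_state(state: dict, path: Path) -> None:
    """Write state as JSON, creating the parent directory if it is missing."""
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(state, indent=2), encoding="utf-8")


def load_state(path: Path) -> dict:
    """Read a JSON state file back into a dictionary."""
    return json.loads(path.read_text(encoding="utf-8"))


# Hypothetical state layout
state = {
    "query_history": [{"question": "What is RAG?", "total_time": 1.4}],
    "conversation_history": [],
}
with tempfile.TemporaryDirectory() as tmp:
    state_path = Path(tmp) / "data" / "pipeline_state.json"
    save_state(state, state_path)
    restored = load_state(state_path)

print(restored == state)
```

Creating the parent directory up front avoids a `FileNotFoundError` when a folder such as `../data` does not exist yet.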

## 🎉 Congratulations!

You've successfully built and tested a complete RAG system! Here's what you accomplished:

✅ **Created a vector database** for document storage  
✅ **Added documents** with metadata to your knowledge base  
✅ **Performed semantic search** to find relevant information  
✅ **Generated AI responses** with source citations  
✅ **Tested conversation mode** with context memory  
✅ **Analyzed performance** with built-in analytics  
✅ **Explored advanced features** like filtering and batch processing  

## 🚀 Next Steps

Now that you understand the basics, you can:

1. **Add your own documents** - Replace sample data with your domain-specific content
2. **Try different vector databases** - Experiment with FAISS, Pinecone, or Weaviate
3. **Customize the LLM** - Use different models or providers
4. **Explore multimodal RAG** - Add images and audio to your knowledge base
5. **Build a web interface** - Create a user-friendly chat interface
6. **Deploy to production** - Scale your system for real-world use

## 📚 Additional Resources

- **[Advanced RAG Techniques](./02_Advanced_RAG_Techniques.ipynb)** - Learn about query expansion, reranking, and more
- **[Multimodal AI](./03_Multimodal_AI.ipynb)** - Process images, audio, and text together
- **[Fine-tuning Models](./04_Model_Fine_Tuning.ipynb)** - Customize models for your specific use case
- **[Production Deployment](./05_Production_Deployment.ipynb)** - Deploy your RAG system at scale

Happy building! 🔥