# Gravix Layer Cookbook: Simple RAG System
Welcome to the GravixLayer Cookbook! This notebook provides a hands-on, step-by-step guide to building a Retrieval-Augmented Generation (RAG) system using GravixLayer's APIs and vector database.

**What you'll learn:**
- How to set up GravixLayer for semantic search and LLM-powered generation
- How to ingest documents and create a knowledge base
- How to retrieve relevant context and generate answers using RAG
- How to extend, filter, and evaluate your RAG pipeline

**Who is this for?**
- Developers, data scientists, and AI enthusiasts looking to build practical RAG systems
- Anyone exploring GravixLayer's SDK and APIs for real-world applications

## Architecture Overview
1. **Document Ingestion**: Convert text documents to vectors using an embedding model
2. **Vector Storage**: Store the embeddings in the GravixLayer vector database
3. **Retrieval**: Search for relevant documents using semantic similarity
4. **Generation**: Pass the retrieved context to an LLM to generate responses
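
The four stages above can be sketched end to end without any network calls. This is a toy illustration only: `embed`, `overlap_score`, `store`, `retrieve`, and `build_prompt` are hypothetical names, word counts stand in for a real embedding model, a dict stands in for the vector database, and the final step only builds the prompt that an LLM would receive.

```python
import re
from collections import Counter
from typing import Dict, List, Tuple


def embed(text: str) -> Counter:
    """Toy stand-in for an embedding model: lowercase word counts."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def overlap_score(a: Counter, b: Counter) -> int:
    """Dot product of two word-count vectors (a crude similarity score)."""
    return sum(a[token] * b[token] for token in a.keys() & b.keys())


# Stages 1-2: ingest documents and store their vectors
# (a plain dict stands in for the vector database).
docs: Dict[str, str] = {
    "doc_rag": "RAG combines information retrieval with text generation",
    "doc_python": "Python is a programming language known for readability",
}
store: Dict[str, Counter] = {doc_id: embed(text) for doc_id, text in docs.items()}


# Stage 3: retrieve the top_k most similar documents for a query.
def retrieve(query: str, top_k: int = 1) -> List[Tuple[str, int]]:
    query_vector = embed(query)
    scored = [(doc_id, overlap_score(query_vector, vec)) for doc_id, vec in store.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:top_k]


# Stage 4: build the prompt an LLM would receive (no model is called here).
def build_prompt(query: str) -> str:
    context = "\n".join(docs[doc_id] for doc_id, _ in retrieve(query))
    return f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"


prompt = build_prompt("What is RAG retrieval?")
print(prompt)
```

The rest of the notebook replaces each toy piece with a GravixLayer call: the embedding and storage happen server-side during upsert, and retrieval is a text search against the index.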

## Setup and Installation

In [None]:
# Install required packages
!pip install gravixlayer requests python-dotenv

In [1]:
import os
import json
import requests
from typing import List, Dict, Any
from gravixlayer import GravixLayer
from dotenv import load_dotenv

# Load environment variables
load_dotenv()

# Set up API key (make sure to export GRAVIXLAYER_API_KEY in your environment)
API_KEY = os.getenv('GRAVIXLAYER_API_KEY')
if not API_KEY:
    raise ValueError("Please set GRAVIXLAYER_API_KEY environment variable")

# Initialize GravixLayer client
client = GravixLayer()
print("✅ GravixLayer client initialized successfully")

✅ GravixLayer client initialized successfully


## RAG System Configuration

In [2]:
# Configuration
CONFIG = {
    'embedding_model': 'baai/bge-large-en-v1.5',  # 1024 dimensions
    'llm_model': 'meta-llama/llama-3.1-8b-instruct',
    'vector_dimension': 1024,
    'similarity_metric': 'cosine',
    'index_name': 'rag-knowledge-base',
    'top_k_results': 3,
    'base_url': 'https://api.gravixlayer.com/v1/vectors'
}

print("📋 Configuration:")
for key, value in CONFIG.items():
    print(f"  {key}: {value}")

📋 Configuration:
  embedding_model: baai/bge-large-en-v1.5
  llm_model: meta-llama/llama-3.1-8b-instruct
  vector_dimension: 1024
  similarity_metric: cosine
  index_name: rag-knowledge-base
  top_k_results: 3
  base_url: https://api.gravixlayer.com/v1/vectors
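
For readers unfamiliar with the `cosine` metric configured above, here is a minimal sketch of the standard cosine similarity formula in plain Python. It is an illustration of the textbook definition; how GravixLayer computes or normalizes scores internally is not shown in this notebook, and `cosine_similarity` is a hypothetical helper name.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # convention chosen here for zero vectors
    return dot / (norm_a * norm_b)


# Vectors pointing the same way score close to 1 regardless of length.
same_direction = cosine_similarity([1.0, 2.0], [2.0, 4.0])
# Perpendicular vectors score 0.
orthogonal = cosine_similarity([1.0, 0.0], [0.0, 1.0])
print(round(same_direction, 6), orthogonal)
```

Because the score depends only on direction, two vectors must share the same dimension to be compared, which is why `vector_dimension` has to match the embedding model's output size (1024 in this configuration).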


## Vector Database Setup

In [None]:
# VectorDatabase class for managing GravixLayer vector DB operations
class VectorDatabase:
    def __init__(self, api_key: str, base_url: str):
        """Initialize the VectorDatabase client with API key and base URL."""
        self.api_key = api_key
        self.base_url = base_url
        self.headers = {
            'Authorization': f'Bearer {api_key}',
            'Content-Type': 'application/json'
        }
        self.index_id = None
    
    def create_index(self, name: str, dimension: int, metric: str) -> str:
        """Create a new vector index with the given name, dimension, and metric.
        Returns the index ID if successful, else None."""
        url = f"{self.base_url}/indexes"
        payload = {
            "name": name,
            "dimension": dimension,
            "metric": metric,
            "vector_type": "dense"
        }
        
        response = requests.post(url, headers=self.headers, json=payload)
        
        if response.status_code == 201:
            result = response.json()
            self.index_id = result['id']
            print(f"✅ Index created successfully: {result['id']}")
            return result['id']
        else:
            print(f"❌ Error creating index: {response.text}")
            return None
    
    def list_indexes(self) -> List[Dict]:
        """List all vector indexes available in the database.
        Returns a list of index metadata dictionaries."""
        url = f"{self.base_url}/indexes"
        response = requests.get(url, headers=self.headers)
        
        if response.status_code == 200:
            return response.json()['indexes']
        return []
    
    def find_or_create_index(self, name: str, dimension: int, metric: str) -> str:
        """Find an existing index by name, or create a new one if not found.
        Returns the index ID."""
        indexes = self.list_indexes()
        
        # Check if index already exists
        for index in indexes:
            if index['name'] == name:
                self.index_id = index['id']
                print(f"📁 Using existing index: {index['id']}")
                return index['id']
        
        # Create new index if not found
        return self.create_index(name, dimension, metric)
    
    def upsert_text_vectors(self, texts_with_metadata: List[Dict]) -> bool:
        """Upsert (insert/update) text vectors into the database with automatic embedding.
        Returns True if successful, else False."""
        if not self.index_id:
            print("❌ No index selected")
            return False
        
        url = f"{self.base_url}/{self.index_id}/text/upsert"
        payload = {"vectors": texts_with_metadata}
        
        response = requests.post(url, headers=self.headers, json=payload)
        
        if response.status_code == 200:
            result = response.json()
            print(f"✅ Upserted {result['upserted_count']} vectors successfully")
            return True
        else:
            print(f"❌ Error upserting vectors: {response.text}")
            return False
    
    def search_text(self, query: str, model: str, top_k: int = 3, filter_dict: Dict = None) -> List[Dict]:
        """Search for similar vectors using a text query and optional filters.
        Returns a list of matching vector hits."""
        if not self.index_id:
            print("❌ No index selected")
            return []
        
        url = f"{self.base_url}/{self.index_id}/search/text"
        payload = {
            "query": query,
            "model": model,
            "top_k": top_k,
            "include_metadata": True,
            "include_values": False
        }
        
        if filter_dict:
            payload["filter"] = filter_dict
        
        response = requests.post(url, headers=self.headers, json=payload)
        
        if response.status_code == 200:
            result = response.json()
            print(f"🔍 Found {len(result['hits'])} relevant documents (took {result['query_time_ms']}ms)")
            return result['hits']
        else:
            print(f"❌ Error searching: {response.text}")
            return []

# Initialize vector database
vector_db = VectorDatabase(API_KEY, CONFIG['base_url'])
print("🔧 Vector database client initialized")

🔧 Vector database client initialized
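
The methods in `VectorDatabase` repeat the same post-then-check-status pattern and call `requests.post` without a timeout. One possible refactor, sketched below, pulls that pattern into a helper that sets a timeout and raises on unexpected status codes. `post_json` and `VectorAPIError` are hypothetical names, and the 30-second default is an arbitrary assumption. The HTTP function is passed in so the helper can be exercised without network access.

```python
from typing import Any, Callable, Dict


class VectorAPIError(RuntimeError):
    """Raised when the vector API returns an unexpected status code."""


def post_json(
    post: Callable[..., Any],
    url: str,
    headers: Dict[str, str],
    payload: Dict[str, Any],
    expected_status: int = 200,
    timeout: float = 30.0,
) -> Dict[str, Any]:
    """POST a JSON payload and return the decoded body, or raise VectorAPIError.

    `post` is any callable with the same keyword interface as requests.post.
    """
    response = post(url, headers=headers, json=payload, timeout=timeout)
    if response.status_code != expected_status:
        raise VectorAPIError(f"{url} returned {response.status_code}: {response.text}")
    return response.json()
```

Inside the class, a call such as `post_json(requests.post, url, self.headers, payload)` could replace the manual status check in `upsert_text_vectors` and `search_text`, and `expected_status=201` would cover `create_index`. Raising an exception instead of returning `None` or `[]` makes failures harder to overlook, at the cost of requiring callers to handle it.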


## Create Vector Index

In [5]:
# Create or find vector index
index_id = vector_db.find_or_create_index(
    name=CONFIG['index_name'],
    dimension=CONFIG['vector_dimension'],
    metric=CONFIG['similarity_metric']
)

if index_id:
    print(f"📊 Vector index ready: {index_id}")
else:
    print("❌ Failed to create/find vector index")

✅ Index created successfully: 28702d39-b691-4c91-bbe3-6e8ab10857e1
📊 Vector index ready: 28702d39-b691-4c91-bbe3-6e8ab10857e1


## Sample Knowledge Base

In [6]:
# Sample documents for the knowledge base
knowledge_base = [
    {
        "id": "doc_ai_overview",
        "text": "Artificial Intelligence (AI) is a branch of computer science that aims to create intelligent machines capable of performing tasks that typically require human intelligence. AI systems can learn, reason, perceive, and make decisions. Machine learning, deep learning, and neural networks are key components of modern AI.",
        "model": CONFIG['embedding_model'],
        "metadata": {
            "category": "AI",
            "topic": "overview",
            "difficulty": "beginner"
        }
    },
    {
        "id": "doc_machine_learning",
        "text": "Machine Learning (ML) is a subset of AI that enables computers to learn and improve from experience without being explicitly programmed. ML algorithms build mathematical models based on training data to make predictions or decisions. Common types include supervised learning, unsupervised learning, and reinforcement learning.",
        "model": CONFIG['embedding_model'],
        "metadata": {
            "category": "AI",
            "topic": "machine_learning",
            "difficulty": "intermediate"
        }
    },
    {
        "id": "doc_deep_learning",
        "text": "Deep Learning is a specialized subset of machine learning that uses artificial neural networks with multiple layers (hence 'deep') to model and understand complex patterns in data. It's particularly effective for tasks like image recognition, natural language processing, and speech recognition. Popular frameworks include TensorFlow and PyTorch.",
        "model": CONFIG['embedding_model'],
        "metadata": {
            "category": "AI",
            "topic": "deep_learning",
            "difficulty": "advanced"
        }
    },
    {
        "id": "doc_nlp",
        "text": "Natural Language Processing (NLP) is a field of AI that focuses on enabling computers to understand, interpret, and generate human language. NLP combines computational linguistics with machine learning and deep learning. Applications include chatbots, translation, sentiment analysis, and text summarization.",
        "model": CONFIG['embedding_model'],
        "metadata": {
            "category": "AI",
            "topic": "nlp",
            "difficulty": "intermediate"
        }
    },
    {
        "id": "doc_rag",
        "text": "Retrieval-Augmented Generation (RAG) is an AI technique that combines information retrieval with text generation. RAG systems first retrieve relevant documents from a knowledge base, then use this context to generate more accurate and informed responses. This approach helps reduce hallucinations and provides more factual answers.",
        "model": CONFIG['embedding_model'],
        "metadata": {
            "category": "AI",
            "topic": "rag",
            "difficulty": "advanced"
        }
    },
    {
        "id": "doc_python",
        "text": "Python is a high-level, interpreted programming language known for its simplicity and readability. It's widely used in AI and machine learning due to its extensive libraries like NumPy, Pandas, Scikit-learn, TensorFlow, and PyTorch. Python's syntax makes it ideal for rapid prototyping and development.",
        "model": CONFIG['embedding_model'],
        "metadata": {
            "category": "Programming",
            "topic": "python",
            "difficulty": "beginner"
        }
    }
]

print(f"📚 Prepared {len(knowledge_base)} documents for ingestion")
for doc in knowledge_base:
    print(f"  - {doc['id']}: {doc['metadata']['topic']}")

📚 Prepared 6 documents for ingestion
  - doc_ai_overview: overview
  - doc_machine_learning: machine_learning
  - doc_deep_learning: deep_learning
  - doc_nlp: nlp
  - doc_rag: rag
  - doc_python: python
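
Before sending documents like these to the API, it can help to check them locally. The sketch below assumes the four keys used in this notebook (`id`, `text`, `model`, `metadata`) are all required. That is an assumption drawn from the sample data, not a documented schema, and `validate_documents` is a hypothetical helper.

```python
from typing import Any, Dict, List

# Keys every sample document in this notebook carries (assumed to be required).
REQUIRED_KEYS = ("id", "text", "model", "metadata")


def validate_documents(docs: List[Dict[str, Any]]) -> List[str]:
    """Return a list of human-readable problems; an empty list means no issues found."""
    problems: List[str] = []
    seen_ids = set()
    for position, doc in enumerate(docs):
        missing = [key for key in REQUIRED_KEYS if key not in doc]
        if missing:
            problems.append(f"document {position}: missing keys {missing}")
        doc_id = doc.get("id")
        if doc_id is not None:
            if doc_id in seen_ids:
                problems.append(f"document {position}: duplicate id {doc_id!r}")
            seen_ids.add(doc_id)
        if "text" in doc and not str(doc["text"]).strip():
            problems.append(f"document {position}: empty text")
    return problems


sample = [
    {"id": "a", "text": "hello", "model": "m", "metadata": {}},
    {"id": "a", "text": "  ", "model": "m", "metadata": {}},
]
print(validate_documents(sample))
```

Duplicate ids matter because an upsert with a repeated id would be expected to overwrite the earlier vector rather than add a second one.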


## Ingest Documents into Vector Database

In [7]:
# Ingest documents into vector database
print("📥 Ingesting documents into vector database...")
success = vector_db.upsert_text_vectors(knowledge_base)

if success:
    print("✅ All documents successfully ingested into vector database")
else:
    print("❌ Failed to ingest documents")

📥 Ingesting documents into vector database...
✅ Upserted 6 vectors successfully
✅ All documents successfully ingested into vector database
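
Six documents fit comfortably in one request, but a larger knowledge base would likely need to be split across several upserts. The sketch below shows a generic batching helper. `batched` is a hypothetical name, and any particular batch size is an assumption, since this notebook does not state a request-size limit for the API.

```python
from typing import Iterator, List, Sequence, TypeVar

T = TypeVar("T")


def batched(items: Sequence[T], batch_size: int) -> Iterator[List[T]]:
    """Yield consecutive slices of `items`, each with at most `batch_size` elements."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(items), batch_size):
        yield list(items[start:start + batch_size])


batches = list(batched(["d1", "d2", "d3", "d4", "d5"], batch_size=2))
print(batches)
```

With the objects defined earlier, ingestion could then loop over `batched(knowledge_base, batch_size)` and call `vector_db.upsert_text_vectors(batch)` for each batch, stopping or retrying when a call returns `False`.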


## RAG System Implementation

In [None]:
# SimpleRAG class for RAG pipeline using GravixLayer
class SimpleRAG:
    def __init__(self, vector_db: VectorDatabase, llm_client: GravixLayer, config: Dict):
        """Initialize the SimpleRAG system with vector DB, LLM client, and config."""
        self.vector_db = vector_db
        self.llm_client = llm_client
        self.config = config
    
    def retrieve_context(self, query: str, top_k: int = None) -> str:
        """Retrieve relevant context from the vector database for a given query.
        Returns a formatted string of context sources."""
        if top_k is None:
            top_k = self.config['top_k_results']
        
        # Search for relevant documents
        hits = self.vector_db.search_text(
            query=query,
            model=self.config['embedding_model'],
            top_k=top_k
        )
        
        # Extract and format context
        context_parts = []
        for i, hit in enumerate(hits, 1):
            metadata = hit.get('metadata', {})
            topic = metadata.get('topic', 'unknown')
            category = metadata.get('category', 'unknown')
            
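            # Look up the original text in the module-level knowledge_base list
            # (search results carry metadata but not the text); hits whose id is
            # not in that list are skipped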
            doc_text = None
            for doc in knowledge_base:
                if doc['id'] == hit['id']:
                    doc_text = doc['text']
                    break
            
            if doc_text:
                context_parts.append(
                    f"Source {i} (Score: {hit['score']:.3f}, Topic: {topic}, Category: {category}):\n{doc_text}"
                )
        
        return "\n\n".join(context_parts)
    
    def generate_response(self, query: str, context: str) -> str:
        """Generate a response using the LLM, based on the retrieved context and user query.
        Returns the generated answer as a string."""
        prompt = f"""You are a helpful AI assistant. Use the provided context to answer the user's question accurately and comprehensively.

Context:
{context}

User Question: {query}

Instructions:
1. Base your answer primarily on the provided context
2. If the context doesn't contain enough information, say so clearly
3. Be concise but thorough
4. Cite which sources you're using when relevant

Answer:"""
        
        try:
            response = self.llm_client.chat.completions.create(
                model=self.config['llm_model'],
                messages=[
                    {"role": "user", "content": prompt}
                ],
                max_tokens=500,
                temperature=0.7
            )
            
            return response.choices[0].message.content
        
        except Exception as e:
            return f"Error generating response: {str(e)}"
    
    def query(self, question: str, show_context: bool = True) -> Dict[str, Any]:
        """Main method to run a RAG query: retrieves context and generates a response.
        Returns a dictionary with the query, context, and response."""
        print(f"🤔 Query: {question}")
        print("─" * 80)
        
        # Step 1: Retrieve relevant context
        print("🔍 Retrieving relevant context...")
        context = self.retrieve_context(question)
        
        if show_context:
            print("\n📄 Retrieved Context:")
            print(context)
            print("\n" + "─" * 80)
        
        # Step 2: Generate response
        print("🤖 Generating response...")
        response = self.generate_response(question, context)
        
        print("\n💡 Response:")
        print(response)
        print("\n" + "═" * 80)
        
        return {
            "query": question,
            "context": context,
            "response": response
        }

# Initialize RAG system
rag_system = SimpleRAG(vector_db, client, CONFIG)
print("🚀 RAG system initialized and ready!")

🚀 RAG system initialized and ready!
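
`retrieve_context` scans the global `knowledge_base` list once per hit to recover each document's text. A possible alternative, sketched below, takes an id-to-text dict as an argument so the formatting step has no hidden dependency and each lookup is a single dict access. `format_context` is a hypothetical helper; the hit shape (`id`, `score`, `metadata`) follows what `search_text` returns in this notebook. Unlike the original, it numbers sources consecutively even when a hit is skipped.

```python
from typing import Any, Dict, List


def format_context(hits: List[Dict[str, Any]], text_by_id: Dict[str, str]) -> str:
    """Format search hits into the 'Source N (...)' context block used in the prompt."""
    parts: List[str] = []
    for hit in hits:
        text = text_by_id.get(hit["id"])
        if text is None:
            continue  # no local text for this id, so it cannot be used as context
        metadata = hit.get("metadata", {})
        parts.append(
            f"Source {len(parts) + 1} (Score: {hit['score']:.3f}, "
            f"Topic: {metadata.get('topic', 'unknown')}, "
            f"Category: {metadata.get('category', 'unknown')}):\n{text}"
        )
    return "\n\n".join(parts)


example_hits = [
    {"id": "doc_rag", "score": 0.797, "metadata": {"topic": "rag", "category": "AI"}},
    {"id": "doc_missing", "score": 0.5},
]
example_texts = {"doc_rag": "RAG combines retrieval with generation."}
context = format_context(example_hits, example_texts)
print(context)
```

The lookup dict could be built once from the notebook's data with `{doc['id']: doc['text'] for doc in knowledge_base}` and refreshed whenever a document is added.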


## Test the RAG System

In [9]:
# Test query 1: General AI question
result1 = rag_system.query("What is artificial intelligence and how does it work?")

🤔 Query: What is artificial intelligence and how does it work?
────────────────────────────────────────────────────────────────────────────────
🔍 Retrieving relevant context...
🔍 Found 3 relevant documents (took 54ms)

📄 Retrieved Context:
Source 1 (Score: 0.826, Topic: overview, Category: AI):
Artificial Intelligence (AI) is a branch of computer science that aims to create intelligent machines capable of performing tasks that typically require human intelligence. AI systems can learn, reason, perceive, and make decisions. Machine learning, deep learning, and neural networks are key components of modern AI.

Source 2 (Score: 0.660, Topic: machine_learning, Category: AI):
Machine Learning (ML) is a subset of AI that enables computers to learn and improve from experience without being explicitly programmed. ML algorithms build mathematical models based on training data to make predictions or decisions. Common types include supervised learning, unsupervised learning, and reinforcement lea

In [10]:
# Test query 2: Specific topic
result2 = rag_system.query("Explain the difference between machine learning and deep learning")

🤔 Query: Explain the difference between machine learning and deep learning
────────────────────────────────────────────────────────────────────────────────
🔍 Retrieving relevant context...
🔍 Found 3 relevant documents (took 61ms)

📄 Retrieved Context:
Source 1 (Score: 0.793, Topic: deep_learning, Category: AI):
Deep Learning is a specialized subset of machine learning that uses artificial neural networks with multiple layers (hence 'deep') to model and understand complex patterns in data. It's particularly effective for tasks like image recognition, natural language processing, and speech recognition. Popular frameworks include TensorFlow and PyTorch.

Source 2 (Score: 0.725, Topic: machine_learning, Category: AI):
Machine Learning (ML) is a subset of AI that enables computers to learn and improve from experience without being explicitly programmed. ML algorithms build mathematical models based on training data to make predictions or decisions. Common types include supervised learning,

In [11]:
# Test query 3: About RAG itself
result3 = rag_system.query("What is RAG and how does it help with AI applications?")

🤔 Query: What is RAG and how does it help with AI applications?
────────────────────────────────────────────────────────────────────────────────
🔍 Retrieving relevant context...
🔍 Found 3 relevant documents (took 63ms)

📄 Retrieved Context:
Source 1 (Score: 0.797, Topic: rag, Category: AI):
Retrieval-Augmented Generation (RAG) is an AI technique that combines information retrieval with text generation. RAG systems first retrieve relevant documents from a knowledge base, then use this context to generate more accurate and informed responses. This approach helps reduce hallucinations and provides more factual answers.

Source 2 (Score: 0.644, Topic: overview, Category: AI):
Artificial Intelligence (AI) is a branch of computer science that aims to create intelligent machines capable of performing tasks that typically require human intelligence. AI systems can learn, reason, perceive, and make decisions. Machine learning, deep learning, and neural networks are key components of modern AI.


In [12]:
# Test query 4: Programming related
result4 = rag_system.query("Why is Python popular for machine learning?")

🤔 Query: Why is Python popular for machine learning?
────────────────────────────────────────────────────────────────────────────────
🔍 Retrieving relevant context...
🔍 Found 3 relevant documents (took 53ms)

📄 Retrieved Context:
Source 1 (Score: 0.812, Topic: python, Category: Programming):
Python is a high-level, interpreted programming language known for its simplicity and readability. It's widely used in AI and machine learning due to its extensive libraries like NumPy, Pandas, Scikit-learn, TensorFlow, and PyTorch. Python's syntax makes it ideal for rapid prototyping and development.

Source 2 (Score: 0.715, Topic: machine_learning, Category: AI):
Machine Learning (ML) is a subset of AI that enables computers to learn and improve from experience without being explicitly programmed. ML algorithms build mathematical models based on training data to make predictions or decisions. Common types include supervised learning, unsupervised learning, and reinforcement learning.

Source 3 (S

## Advanced RAG Features

In [None]:
# Function to search with metadata filtering in the vector DB
def search_with_filter(query: str, category: str = None, difficulty: str = None):
    """Search for documents using a query and optional metadata filters (category, difficulty).
    Prints results and returns the list of hits."""
    filter_dict = {}
    if category:
        filter_dict['category'] = category
    if difficulty:
        filter_dict['difficulty'] = difficulty
    
    print(f"🔍 Searching: '{query}'")
    if filter_dict:
        print(f"📋 Filters: {filter_dict}")
    
    hits = vector_db.search_text(
        query=query,
        model=CONFIG['embedding_model'],
        top_k=3,
        filter_dict=filter_dict if filter_dict else None
    )
    
    print("\n📊 Results:")
    for hit in hits:
        metadata = hit.get('metadata', {})
        print(f"  - {hit['id']}: Score {hit['score']:.3f} | {metadata}")
    
    return hits

# Test filtering by category
print("Testing filtered search - AI category only:")
ai_results = search_with_filter("learning algorithms", category="AI")

print("\n" + "─" * 60)
print("Testing filtered search - Beginner difficulty only:")
beginner_results = search_with_filter("programming", difficulty="beginner")

Testing filtered search - AI category only:
🔍 Searching: 'learning algorithms'
📋 Filters: {'category': 'AI'}
🔍 Found 3 relevant documents (took 47ms)

📊 Results:
  - doc_machine_learning: Score 0.713 | {'category': 'AI', 'difficulty': 'intermediate', 'topic': 'machine_learning'}
  - doc_deep_learning: Score 0.628 | {'category': 'AI', 'difficulty': 'advanced', 'topic': 'deep_learning'}
  - doc_ai_overview: Score 0.580 | {'category': 'AI', 'difficulty': 'beginner', 'topic': 'overview'}

────────────────────────────────────────────────────────────
Testing filtered search - Beginner difficulty only:
🔍 Searching: 'programming'
📋 Filters: {'difficulty': 'beginner'}
🔍 Found 2 relevant documents (took 46ms)

📊 Results:
  - doc_python: Score 0.644 | {'category': 'Programming', 'difficulty': 'beginner', 'topic': 'python'}
  - doc_ai_overview: Score 0.599 | {'category': 'AI', 'difficulty': 'beginner', 'topic': 'overview'}
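
To make the filter semantics concrete, here is a client-side sketch of exact-match metadata filtering over the same metadata values used in this notebook. It assumes that every key in the filter must match (logical AND). The single-key results above are consistent with exact matching, but how the server combines multiple keys is an assumption here, and `matches_filter` is a hypothetical helper.

```python
from typing import Any, Dict

# Metadata copied from the sample knowledge base (subset).
sample_metadata: Dict[str, Dict[str, str]] = {
    "doc_ai_overview": {"category": "AI", "topic": "overview", "difficulty": "beginner"},
    "doc_machine_learning": {"category": "AI", "topic": "machine_learning", "difficulty": "intermediate"},
    "doc_python": {"category": "Programming", "topic": "python", "difficulty": "beginner"},
}


def matches_filter(metadata: Dict[str, Any], filter_dict: Dict[str, Any]) -> bool:
    """True when every key in filter_dict is present in metadata with an equal value."""
    return all(metadata.get(key) == value for key, value in filter_dict.items())


beginner_ids = [
    doc_id for doc_id, meta in sample_metadata.items()
    if matches_filter(meta, {"difficulty": "beginner"})
]
ai_beginner_ids = [
    doc_id for doc_id, meta in sample_metadata.items()
    if matches_filter(meta, {"category": "AI", "difficulty": "beginner"})
]
print(beginner_ids, ai_beginner_ids)
```

Under this reading, calling `search_with_filter` with both `category` and `difficulty` would narrow the candidate set before similarity ranking, so fewer than `top_k` hits can come back, as in the beginner-only search above.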


## Interactive RAG Chat

In [None]:
# Interactive chat interface for the RAG system
def interactive_rag_chat():
    """Start an interactive chat loop for asking questions to the RAG system.
    Type 'quit', 'exit', or 'bye' to end the chat."""
    print("🤖 Welcome to the RAG Chat System!")
    print("Ask questions about AI, machine learning, or programming.")
    print("Type 'quit', 'exit', or 'bye' to end the chat.\n")
    
    while True:
        try:
            user_input = input("🙋 Your question: ").strip()
            
            if user_input.lower() in ['quit', 'exit', 'bye']:
                print("👋 Goodbye!")
                break
            
            if not user_input:
                print("Please enter a question.")
                continue
            
            # Process the query and print the response
            result = rag_system.query(user_input, show_context=False)
            
        except KeyboardInterrupt:
            print("\n👋 Chat interrupted. Goodbye!")
            break
        except Exception as e:
            print(f"❌ Error: {e}")

# Uncomment the line below to start interactive chat
# interactive_rag_chat()
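
The chat loop above treats every question independently. A minimal way to carry earlier turns forward is sketched below: a bounded buffer of question/answer pairs that is folded into the next question. `ConversationMemory` and the three-turn default are hypothetical choices, not part of the GravixLayer SDK.

```python
from collections import deque
from typing import Deque, Tuple


class ConversationMemory:
    """Keep the most recent question/answer pairs and prepend them to new questions."""

    def __init__(self, max_turns: int = 3):
        # deque with maxlen drops the oldest turn automatically when full
        self.turns: Deque[Tuple[str, str]] = deque(maxlen=max_turns)

    def add(self, question: str, answer: str) -> None:
        self.turns.append((question, answer))

    def contextualize(self, question: str) -> str:
        """Return the question unchanged if there is no history, else prefix the history."""
        if not self.turns:
            return question
        history = "\n".join(f"User: {q}\nAssistant: {a}" for q, a in self.turns)
        return f"Previous conversation:\n{history}\n\nCurrent question: {question}"


memory = ConversationMemory(max_turns=2)
memory.add("What is RAG?", "A retrieval plus generation technique.")
print(memory.contextualize("How does it reduce hallucinations?"))
```

One design caveat: if the contextualized string is passed straight to `rag_system.query`, the history is also embedded for retrieval, which may pull the search toward earlier topics. Retrieving with the bare question and adding history only to the generation prompt is an alternative worth testing.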

## System Statistics and Management

In [None]:
# Function to display system statistics and configuration
def show_system_stats():
    """Display statistics about the RAG system, indexes, and configuration."""
    print("📊 RAG System Statistics")
    print("═" * 40)
    
    # List all indexes
    indexes = vector_db.list_indexes()
    print(f"📁 Total Indexes: {len(indexes)}")
    
    for index in indexes:
        print(f"\n🔍 Index: {index['name']}")
        print(f"  - ID: {index['id']}")
        print(f"  - Dimension: {index['dimension']}")
        print(f"  - Metric: {index['metric']}")
        print(f"  - Status: {index['status']}")
        print(f"  - Created: {index['created_at']}")
    
    print(f"\n📚 Knowledge Base: {len(knowledge_base)} documents")
    print(f"🤖 LLM Model: {CONFIG['llm_model']}")
    print(f"🔤 Embedding Model: {CONFIG['embedding_model']}")
    print(f"📏 Vector Dimension: {CONFIG['vector_dimension']}")
    print(f"📐 Similarity Metric: {CONFIG['similarity_metric']}")

show_system_stats()

📊 RAG System Statistics
════════════════════════════════════════
📁 Total Indexes: 2

🔍 Index: rag-knowledge-base
  - ID: 28702d39-b691-4c91-bbe3-6e8ab10857e1
  - Dimension: 1024
  - Metric: cosine
  - Status: ready
  - Created: 2025-09-19T08:36:47.90609Z

🔍 Index: product-embeddings
  - ID: bf93acdc-dc02-4514-8a6f-292e85fc3450
  - Dimension: 1024
  - Metric: cosine
  - Status: ready
  - Created: 2025-09-18T13:02:58.152099Z

📚 Knowledge Base: 6 documents
🤖 LLM Model: meta-llama/llama-3.1-8b-instruct
🔤 Embedding Model: baai/bge-large-en-v1.5
📏 Vector Dimension: 1024
📐 Similarity Metric: cosine


## Adding New Documents

In [None]:
# Function to add a new document to the knowledge base and vector DB
def add_document(doc_id: str, text: str, category: str, topic: str, difficulty: str = "intermediate"):
    """Add a new document to the knowledge base and upsert it into the vector database.
    Returns True if successful, else False."""
    new_doc = {
        "id": doc_id,
        "text": text,
        "model": CONFIG['embedding_model'],
        "metadata": {
            "category": category,
            "topic": topic,
            "difficulty": difficulty
        }
    }
    
    # Add to vector database
    success = vector_db.upsert_text_vectors([new_doc])
    
    if success:
        # Add to local knowledge base for reference
        knowledge_base.append(new_doc)
        print(f"✅ Document '{doc_id}' added successfully")
        return True
    else:
        print(f"❌ Failed to add document '{doc_id}'")
        return False

# Example: Add a new document about transformers
add_document(
    doc_id="doc_transformers",
    text="Transformers are a type of neural network architecture that has revolutionized natural language processing. Introduced in the 'Attention is All You Need' paper, transformers use self-attention mechanisms to process sequences of data. They form the basis of large language models like GPT, BERT, and T5.",
    category="AI",
    topic="transformers",
    difficulty="advanced"
)

# Test the new document
print("\nTesting with new document:")
result_new = rag_system.query("What are transformers in machine learning?")

✅ Upserted 1 vectors successfully
✅ Document 'doc_transformers' added successfully

Testing with new document:
🤔 Query: What are transformers in machine learning?
────────────────────────────────────────────────────────────────────────────────
🔍 Retrieving relevant context...
🔍 Found 3 relevant documents (took 51ms)

📄 Retrieved Context:
Source 1 (Score: 0.828, Topic: transformers, Category: AI):
Transformers are a type of neural network architecture that has revolutionized natural language processing. Introduced in the 'Attention is All You Need' paper, transformers use self-attention mechanisms to process sequences of data. They form the basis of large language models like GPT, BERT, and T5.

Source 2 (Score: 0.709, Topic: machine_learning, Category: AI):
Machine Learning (ML) is a subset of AI that enables computers to learn and improve from experience without being explicitly programmed. ML algorithms build mathematical models based on training data to make predictions or decisions
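
The documents added so far are single paragraphs. Longer texts would typically be split into overlapping chunks before being passed to `add_document`. The sketch below uses a simple word-window strategy. `chunk_text` is a hypothetical helper and the sizes are illustrative assumptions. Production pipelines often split on sentences or tokens instead.

```python
from typing import List


def chunk_text(text: str, chunk_size: int = 50, overlap: int = 10) -> List[str]:
    """Split text into windows of `chunk_size` words, each sharing `overlap` words with the previous one."""
    if chunk_size < 1:
        raise ValueError("chunk_size must be at least 1")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_size")
    words = text.split()
    step = chunk_size - overlap
    chunks: List[str] = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # the last window already reached the end of the text
    return chunks


demo_chunks = chunk_text("w0 w1 w2 w3 w4 w5 w6 w7 w8 w9", chunk_size=4, overlap=1)
print(demo_chunks)
```

Each chunk could then be stored under its own id, for example by appending a chunk index to the document id, with the same metadata as the parent document.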

## Performance Evaluation

In [None]:
import time

# Function to evaluate RAG system performance on test queries
def evaluate_rag_performance(test_queries: List[str]):
    """Evaluate the RAG system's performance on a list of test queries.
    Prints response time and length for each query, and returns summary statistics."""
    print("🔬 Evaluating RAG System Performance")
    print("═" * 50)
    
    total_time = 0
    results = []
    
    for i, query in enumerate(test_queries, 1):
        print(f"\nTest {i}/{len(test_queries)}: {query}")
        
        start_time = time.perf_counter()  # monotonic clock, suited to measuring elapsed time
        result = rag_system.query(query, show_context=False)
        end_time = time.perf_counter()
        
        query_time = end_time - start_time
        total_time += query_time
        
        results.append({
            'query': query,
            'time': query_time,
            'response_length': len(result['response'])
        })
        
        print(f"⏱️  Response time: {query_time:.2f}s")
        print(f"📝 Response length: {len(result['response'])} characters")
    
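    # Summary statistics (guarded so an empty query list does not divide by zero)
    num_queries = len(results)
    avg_time = total_time / num_queries if num_queries else 0.0
    avg_length = sum(r['response_length'] for r in results) / num_queries if num_queries else 0.0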
    
    print("\n📊 Performance Summary:")
    print(f"  - Total queries: {len(test_queries)}")
    print(f"  - Total time: {total_time:.2f}s")
    print(f"  - Average time per query: {avg_time:.2f}s")
    print(f"  - Average response length: {avg_length:.0f} characters")
    
    return results

# Test queries for evaluation
test_queries = [
    "What is the difference between AI and machine learning?",
    "How do neural networks work?",
    "What are the applications of natural language processing?",
    "Why is Python good for data science?",
    "Explain transformers in deep learning"
]

# Run performance evaluation
performance_results = evaluate_rag_performance(test_queries)

🔬 Evaluating RAG System Performance
══════════════════════════════════════════════════

Test 1/5: What is the difference between AI and machine learning?
🤔 Query: What is the difference between AI and machine learning?
────────────────────────────────────────────────────────────────────────────────
🔍 Retrieving relevant context...
🔍 Found 3 relevant documents (took 53ms)
🤖 Generating response...

💡 Response:
Based on the provided context, I'd be happy to explain the difference between AI and machine learning.

AI (Artificial Intelligence) is a broader field that aims to create intelligent machines capable of performing tasks that typically require human intelligence [Source 2]. It encompasses various subfields, including machine learning. In other words, all machine learning falls under the umbrella of AI, but not all AI involves machine learning.

Machine Learning (ML), on the other hand, is a subset of AI that enables computers to learn and improve from experience without being expli
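
The evaluation above measures latency and response length, which says nothing about whether the right documents were retrieved. A simple complementary metric is hit rate at k: the fraction of queries whose expected document appears among the top k results. The sketch below is a minimal version. `hit_rate_at_k` is a hypothetical helper, and the labels and result lists are illustrative, hand-written data rather than measured output.

```python
from typing import Dict, List


def hit_rate_at_k(retrieved: Dict[str, List[str]], expected: Dict[str, str], k: int = 3) -> float:
    """Fraction of labelled queries whose expected document id is in the top k retrieved ids."""
    if not expected:
        return 0.0
    hits = sum(
        1 for query, doc_id in expected.items()
        if doc_id in retrieved.get(query, [])[:k]
    )
    return hits / len(expected)


# Illustrative data: ranked document ids per query, plus one hand-labelled target per query.
retrieved_ids = {
    "Why is Python popular for machine learning?": ["doc_python", "doc_machine_learning"],
    "What are the applications of natural language processing?": ["doc_deep_learning", "doc_ai_overview"],
}
expected_ids = {
    "Why is Python popular for machine learning?": "doc_python",
    "What are the applications of natural language processing?": "doc_nlp",
}
score = hit_rate_at_k(retrieved_ids, expected_ids, k=3)
print(score)
```

In this notebook, the ranked ids per query could be collected from `vector_db.search_text(...)` by reading `hit['id']` from each returned hit.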

## Cleanup and Resource Management

In [None]:
# Function to review vector database indexes before cleanup (dry run only)
def cleanup_resources(confirm: bool = False):
    """List the indexes that a cleanup would remove.
    With confirm=False (the default), print a warning and return.
    With confirm=True, list every index; the delete call itself stays commented out."""
    if not confirm:
        print("⚠️  This will delete all indexes and vectors. Set confirm=True to proceed.")
        return
    
    print("🧹 Cleaning up resources...")
    
    # list_indexes() returns every index on the account, not only the one
    # created in this notebook, so review the list before deleting anything
    indexes = vector_db.list_indexes()
    for index in indexes:
        print(f"🗑️  Would delete index: {index['name']} ({index['id']})")
        # To actually delete, first add a delete_index method to VectorDatabase
        # (the class above does not define one), then uncomment:
        # vector_db.delete_index(index['id'])
    
    print("✅ Dry run complete: no indexes were deleted")

# Default call: prints the warning only; nothing is listed or deleted
cleanup_resources(confirm=False)

⚠️  This will delete all indexes and vectors. Set confirm=True to proceed.
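
Since the index list can include indexes unrelated to this notebook (the statistics output earlier shows a second index, `product-embeddings`), a cleanup would be safer if it were scoped to explicitly named indexes. The sketch below shows that selection step only. `select_indexes` is a hypothetical helper, and it performs no deletion.

```python
from typing import Any, Dict, Iterable, List


def select_indexes(indexes: List[Dict[str, Any]], names: Iterable[str]) -> List[Dict[str, Any]]:
    """Return only the index records whose 'name' is in the given set of names."""
    wanted = set(names)
    return [index for index in indexes if index.get("name") in wanted]


# Records shaped like the entries printed by show_system_stats (ids shortened for illustration).
all_indexes = [
    {"name": "rag-knowledge-base", "id": "index-1"},
    {"name": "product-embeddings", "id": "index-2"},
]
to_delete = select_indexes(all_indexes, ["rag-knowledge-base"])
print([index["name"] for index in to_delete])
```

Passing `[CONFIG['index_name']]` as the names would limit any cleanup to the index this notebook created.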


## Conclusion

**Congratulations!** You have successfully built a complete RAG system using GravixLayer!

### What you've accomplished:

1. **Set up the environment** with GravixLayer APIs
2. **Created a vector database** for semantic search
3. **Ingested documents** with automatic text embedding
4. **Implemented retrieval** using semantic similarity search
5. **Built generation** using the LLM with retrieved context
6. **Created a complete RAG pipeline** that combines both components
7. **Added advanced features** like filtering and performance evaluation

### Key Features of this RAG System:

- **Semantic Search**: Uses embedding models to find contextually relevant documents
- **Flexible Filtering**: Supports metadata-based filtering for precise retrieval
- **Scalable Architecture**: Built on GravixLayer's vector database infrastructure
- **Easy Extension**: Simple methods to add new documents to the knowledge base
- **Performance Monitoring**: Built-in evaluation and timing capabilities

### Next Steps:

1. **Expand the knowledge base** with your own documents
2. **Experiment with different embedding models** and dimensions
3. **Tune the retrieval parameters** (top_k, similarity thresholds)
4. **Add more sophisticated filtering** based on your use case
5. **Implement conversation memory** for multi-turn interactions
6. **Add document preprocessing** for better text chunking
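
As a starting point for tuning similarity thresholds, the sketch below drops hits whose score falls under a cutoff before they are used as context. `filter_by_score` is a hypothetical helper, and the 0.6 cutoff is an illustrative assumption. A suitable value depends on the embedding model and data and should be chosen by inspecting real scores.

```python
from typing import Any, Dict, List


def filter_by_score(hits: List[Dict[str, Any]], min_score: float) -> List[Dict[str, Any]]:
    """Keep only hits whose 'score' is at least min_score, preserving their order."""
    return [hit for hit in hits if hit.get("score", 0.0) >= min_score]


# Scores taken from the filtered-search output earlier in this notebook.
example_hits = [
    {"id": "doc_machine_learning", "score": 0.713},
    {"id": "doc_deep_learning", "score": 0.628},
    {"id": "doc_ai_overview", "score": 0.580},
]
kept = filter_by_score(example_hits, min_score=0.6)
print([hit["id"] for hit in kept])
```

A threshold trades recall for precision: a higher cutoff keeps weakly related documents out of the prompt, but may leave the LLM with no context at all for unusual questions.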

### Resources:

- [Gravix Layer Documentation](https://docs.gravixlayer.com)
- [Vector Database Best Practices](https://docs.gravixlayer.com/docs/vector)

Happy building! 