# Basic RAG with ChromaDB and Gemini 3 Pro

> **Created by [Build Fast with AI](https://www.buildfastwithai.com)**

This notebook demonstrates how to build a Retrieval-Augmented Generation (RAG) system that uses ChromaDB for vector storage and Gemini 3 Pro for answer generation.

## What you'll learn:
- What RAG is and why it matters
- Setting up ChromaDB for vector storage
- Creating embeddings with Google's embedding models
- Building a simple document Q&A system
- Best practices for RAG implementations

In [None]:
!pip install -q google-generativeai chromadb

In [None]:
import google.generativeai as genai
import chromadb
from chromadb.utils import embedding_functions
import os
from IPython.display import Markdown, display

In [None]:
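# Configure API key: read the Colab secret when available, otherwise fall back to the environment
try:
    from google.colab import userdata
    GOOGLE_API_KEY = userdata.get('GOOGLE_API_KEY')
except Exception:  # not running in Colab, or the secret is not set
    GOOGLE_API_KEY = os.environ.get('GOOGLE_API_KEY', 'your-api-key-here')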

genai.configure(api_key=GOOGLE_API_KEY)

## 1. Understanding RAG

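**Retrieval-Augmented Generation (RAG)** combines two steps:
- **Retrieval**: Finding the passages in a knowledge base that are most relevant to the question
- **Generation**: Using an LLM to write an answer grounded in the retrieved passages

Benefits:
- Reduces hallucinations by grounding answers in retrieved text
- Makes source attribution possible, since each answer can be traced to the passages used
- Brings in domain-specific or up-to-date knowledge the model was not trained on
- Avoids fine-tuning: updating the knowledge base is enough to change what the model can answer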
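
To make the two stages concrete, here is a minimal, dependency-free sketch. It is not how this notebook performs retrieval (the later sections use Google embeddings and ChromaDB); word overlap stands in for embedding similarity, and `tokenize`, `toy_retrieve`, `toy_prompt`, and the sample `kb` are hypothetical names introduced only for illustration.

```python
import re

def tokenize(text):
    """Lowercase the text and return its set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def toy_retrieve(question, docs, k=1):
    """Retrieval stage: rank docs by how many words they share with the question."""
    q_words = tokenize(question)
    ranked = sorted(docs, key=lambda d: len(q_words & tokenize(d)), reverse=True)
    return ranked[:k]

def toy_prompt(question, context_docs):
    """Generation stage input: the retrieved text is placed ahead of the question."""
    context = "\n".join(f"- {d}" for d in context_docs)
    return f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"

# Hypothetical two-document knowledge base
kb = [
    "Supervised learning trains a model on labeled data.",
    "Transformers use self-attention to process sequences.",
]
question = "What data does supervised learning use?"
top_docs = toy_retrieve(question, kb, k=1)
prompt = toy_prompt(question, top_docs)
print(prompt)
```

In a real system the prompt would then be sent to the LLM; the generation call is left out here so the sketch runs without an API key.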

## 2. Sample Documents

Let's create a knowledge base about AI and machine learning.

In [None]:
documents = [
    {
        "id": "doc1",
        "text": "Machine Learning is a subset of artificial intelligence that enables systems to learn and improve from experience without being explicitly programmed. It focuses on developing computer programs that can access data and use it to learn for themselves.",
        "metadata": {"category": "basics", "topic": "machine learning"}
    },
    {
        "id": "doc2",
        "text": "Deep Learning is a subset of machine learning that uses neural networks with multiple layers. These deep neural networks are capable of learning and making intelligent decisions on their own. Deep learning has been particularly successful in image recognition, natural language processing, and speech recognition.",
        "metadata": {"category": "advanced", "topic": "deep learning"}
    },
    {
        "id": "doc3",
        "text": "Natural Language Processing (NLP) is a branch of AI that helps computers understand, interpret, and manipulate human language. NLP draws from many disciplines, including computer science and computational linguistics, to bridge the gap between human communication and computer understanding.",
        "metadata": {"category": "basics", "topic": "nlp"}
    },
    {
        "id": "doc4",
        "text": "Transformers are a type of neural network architecture that have revolutionized NLP. Introduced in 2017, they use self-attention mechanisms to process sequential data. Models like BERT, GPT, and T5 are all based on the transformer architecture.",
        "metadata": {"category": "advanced", "topic": "transformers"}
    },
    {
        "id": "doc5",
        "text": "Supervised Learning is a type of machine learning where the model is trained on labeled data. The algorithm learns from the training dataset and makes predictions on unseen data. Common applications include classification and regression tasks.",
        "metadata": {"category": "basics", "topic": "supervised learning"}
    },
    {
        "id": "doc6",
        "text": "Unsupervised Learning involves training a model on unlabeled data. The system tries to learn patterns and structure from the data without explicit guidance. Clustering and dimensionality reduction are common unsupervised learning techniques.",
        "metadata": {"category": "basics", "topic": "unsupervised learning"}
    },
    {
        "id": "doc7",
        "text": "Reinforcement Learning is a type of machine learning where an agent learns to make decisions by taking actions in an environment to maximize cumulative reward. It has been successfully applied to game playing, robotics, and autonomous systems.",
        "metadata": {"category": "advanced", "topic": "reinforcement learning"}
    },
    {
        "id": "doc8",
        "text": "Large Language Models (LLMs) are AI models trained on vast amounts of text data. They can generate human-like text, answer questions, write code, and perform various language tasks. Examples include GPT-4, PaLM, and Gemini.",
        "metadata": {"category": "advanced", "topic": "llm"}
    }
]

print(f"Created knowledge base with {len(documents)} documents")

## 3. Setting Up ChromaDB

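ChromaDB is an open-source vector database: it stores each document alongside its embedding and metadata, and returns the stored entries whose embeddings are closest to a query vector.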

In [None]:
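# Initialize an in-memory ChromaDB client (contents are lost when the kernel restarts)
chroma_client = chromadb.Client()

# Get or create the collection so that re-running this cell does not raise an error
collection = chroma_client.get_or_create_collection(
    name="ai_knowledge_base",
    metadata={"description": "AI and ML concepts"}
)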

print(f"Collection created: {collection.name}")
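
Conceptually, a collection maps each ID to a vector, its text, and its metadata, and answers a query by returning the nearest stored vectors. The sketch below illustrates that idea with a brute-force scan. It is a teaching aid rather than ChromaDB's implementation (which relies on an approximate nearest-neighbour index), and `TinyVectorStore` and its methods are hypothetical names.

```python
class TinyVectorStore:
    """Brute-force, in-memory stand-in for a vector collection (illustration only)."""

    def __init__(self):
        self._rows = {}  # id -> (vector, text, metadata)

    def add(self, ids, embeddings, documents, metadatas):
        """Store parallel lists of ids, vectors, texts, and metadata dicts."""
        for id_, vec, text, meta in zip(ids, embeddings, documents, metadatas):
            self._rows[id_] = (list(vec), text, meta)

    def count(self):
        """Number of stored entries."""
        return len(self._rows)

    def query(self, query_embedding, n_results=3):
        """Return (id, squared L2 distance) pairs, closest first."""
        scored = []
        for id_, (vec, _text, _meta) in self._rows.items():
            dist = sum((a - b) ** 2 for a, b in zip(query_embedding, vec))
            scored.append((id_, dist))
        scored.sort(key=lambda pair: pair[1])
        return scored[:n_results]

store = TinyVectorStore()
store.add(
    ids=["a", "b", "c"],
    embeddings=[[1.0, 0.0], [0.0, 1.0], [0.9, 0.1]],
    documents=["doc a", "doc b", "doc c"],
    metadatas=[{"topic": "x"}, {"topic": "y"}, {"topic": "x"}],
)
nearest = store.query([1.0, 0.0], n_results=2)
print(nearest)
```

A scan like this costs time proportional to the number of stored vectors, which is why dedicated vector databases build an index instead.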

## 4. Creating Embeddings with Google's Embedding Model

In [None]:
def get_embedding(text):
    """Get embedding for text using Google's embedding model."""
    result = genai.embed_content(
        model="models/embedding-001",
        content=text,
        task_type="retrieval_document"
    )
    return result['embedding']

# Test the embedding function
test_embedding = get_embedding("This is a test sentence.")
print(f"Embedding dimension: {len(test_embedding)}")
print(f"First 5 values: {test_embedding[:5]}")
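
An embedding is only useful relative to other embeddings: texts with similar meaning should map to vectors that point in similar directions. Below is a minimal sketch of one common comparison, cosine similarity, in plain Python. The three short vectors are made-up stand-ins for real embeddings, which have far more dimensions, and `cosine_similarity` is a hypothetical helper rather than part of any library used here.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (1.0 = same direction)."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)

# Made-up 3-dimensional vectors standing in for real embeddings
cat = [0.9, 0.1, 0.0]
kitten = [0.8, 0.2, 0.1]
invoice = [0.0, 0.1, 0.9]

print(f"cat vs kitten:  {cosine_similarity(cat, kitten):.3f}")
print(f"cat vs invoice: {cosine_similarity(cat, invoice):.3f}")
```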

## 5. Adding Documents to ChromaDB

In [None]:
# Prepare data for ChromaDB
ids = [doc["id"] for doc in documents]
texts = [doc["text"] for doc in documents]
metadatas = [doc["metadata"] for doc in documents]

# Generate embeddings for all documents
print("Generating embeddings...")
embeddings = [get_embedding(text) for text in texts]
print(f"Generated {len(embeddings)} embeddings")

# Add to collection
collection.add(
    ids=ids,
    embeddings=embeddings,
    documents=texts,
    metadatas=metadatas
)

print(f"\nAdded {collection.count()} documents to the collection")

## 6. Semantic Search

Now we can search for relevant documents using natural language queries.

In [None]:
def search_documents(query, n_results=3):
    """Search for relevant documents using semantic search."""
    # Get embedding for query
    query_embedding_result = genai.embed_content(
        model="models/embedding-001",
        content=query,
        task_type="retrieval_query"
    )
    query_embedding = query_embedding_result['embedding']
    
    # Search in ChromaDB
    results = collection.query(
        query_embeddings=[query_embedding],
        n_results=n_results
    )
    
    return results

# Test search
query = "What are neural networks?"
print(f"Query: {query}\n")

results = search_documents(query)

print(f"Top {len(results['documents'][0])} relevant documents:\n")
for i, (doc, metadata, distance) in enumerate(zip(
    results['documents'][0],
    results['metadatas'][0],
    results['distances'][0]
), 1):
    # Chroma returns distances, so lower values mean closer matches
    print(f"{i}. [Distance: {distance:.3f}] {metadata['topic']}")
    print(f"   {doc[:100]}...\n")
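
Because the query returns distances, it can help to see how a distance relates to a similarity score. This sketch assumes the collection reports squared L2 distance (ChromaDB's documented default, but worth verifying for your version) and that the embeddings are unit-length. Under those assumptions, cosine similarity equals `1 - d / 2`. The helpers `normalize`, `squared_l2`, and `squared_l2_to_cosine` are hypothetical.

```python
import math

def normalize(vec):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]

def squared_l2(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def squared_l2_to_cosine(distance):
    """Cosine similarity implied by a squared L2 distance between unit vectors."""
    return 1.0 - distance / 2.0

a = normalize([3.0, 4.0])
b = normalize([4.0, 3.0])

d = squared_l2(a, b)
cos_from_distance = squared_l2_to_cosine(d)
cos_direct = sum(x * y for x, y in zip(a, b))

print(f"squared L2 distance: {d:.4f}")
print(f"cosine via distance: {cos_from_distance:.4f}")
print(f"cosine via dot:      {cos_direct:.4f}")
```

The identity follows from expanding the squared distance between unit vectors: ||a - b||² = 2 - 2(a · b). It does not hold if the embeddings are not normalized.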

## 7. Building the RAG Pipeline

In [None]:
def rag_query(question, n_results=3):
    """
    Answer a question using RAG:
    1. Retrieve relevant documents
    2. Generate answer using Gemini with context
    """
    # Step 1: Retrieve relevant documents
    results = search_documents(question, n_results=n_results)
    
    # Step 2: Prepare context from retrieved documents
    context = "\n\n".join([
        f"Document {i+1}: {doc}"
        for i, doc in enumerate(results['documents'][0])
    ])
    
    # Step 3: Create prompt with context
    prompt = f"""
Based on the following context, answer the question. If the answer cannot be found in the context, say so.

Context:
{context}

Question: {question}

Answer:
"""
    
    # Step 4: Generate answer using Gemini
    model = genai.GenerativeModel('gemini-3-pro')
    response = model.generate_content(prompt)
    
    return {
        "answer": response.text,
        "sources": results['metadatas'][0],
        "source_texts": results['documents'][0]
    }

# Test RAG system
question = "What is the difference between supervised and unsupervised learning?"
print(f"Question: {question}\n")

result = rag_query(question)

print("Answer:")
display(Markdown(result['answer']))

print("\n\nSources used:")
for i, source in enumerate(result['sources'], 1):
    print(f"{i}. Topic: {source['topic']} (Category: {source['category']})")
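
The prompt assembly inside `rag_query` can be factored into a small helper that is testable without calling the model. The sketch below is one possible variant, not something the notebook requires: `build_rag_prompt` and `max_chars` are hypothetical names, and the character budget is a crude assumption standing in for a real token count.

```python
def build_rag_prompt(question, docs, max_chars=2000):
    """Assemble a grounded prompt from retrieved docs, respecting a size budget."""
    if not docs:
        # Nothing was retrieved, so say so explicitly instead of sending an empty context
        context = "(no documents retrieved)"
    else:
        parts, used = [], 0
        for i, doc in enumerate(docs, 1):
            entry = f"Document {i}: {doc}"
            if parts and used + len(entry) > max_chars:
                break  # keep the highest-ranked documents, drop the rest
            parts.append(entry)
            used += len(entry)
        context = "\n\n".join(parts)

    return (
        "Based on the following context, answer the question. "
        "If the answer cannot be found in the context, say so.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\n\n"
        "Answer:"
    )

docs = [
    "Supervised learning uses labeled data.",
    "Unsupervised learning uses unlabeled data.",
]
prompt = build_rag_prompt("How do they differ?", docs)
print(prompt)
```

The first document is always kept even if it exceeds the budget, so the model never receives a question with the best match silently dropped.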

## 8. Interactive RAG Q&A System

In [None]:
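class RAGSystem:
    """Thin wrapper that prints answers from rag_query() together with their sources."""

    def __init__(self, collection):
        # Note: ask() delegates to rag_query(), which reads the module-level
        # `collection`; the reference is kept here for convenience only.
        self.collection = collection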
    
    def ask(self, question, n_results=3, show_sources=True):
        """Ask a question and get an answer with sources."""
        result = rag_query(question, n_results=n_results)
        
        print(f"\n{'='*80}")
        print(f"Question: {question}")
        print(f"{'='*80}\n")
        
        display(Markdown(result['answer']))
        
        if show_sources:
            print("\n\nSources:")
            for i, (source, text) in enumerate(zip(
                result['sources'],
                result['source_texts']
            ), 1):
                print(f"\n{i}. [{source['category'].upper()}] {source['topic']}")
                print(f"   {text[:150]}...")

# Create RAG system
rag_system = RAGSystem(collection)

# Test with multiple questions
questions = [
    "What are transformers in AI?",
    "Explain reinforcement learning",
    "What is NLP used for?"
]

for question in questions:
    rag_system.ask(question)

## 9. Advanced: Filtering and Metadata Search

In [None]:
def rag_query_with_filter(question, category_filter=None, n_results=3):
    """RAG query with metadata filtering."""
    # Get query embedding
    query_embedding_result = genai.embed_content(
        model="models/embedding-001",
        content=question,
        task_type="retrieval_query"
    )
    query_embedding = query_embedding_result['embedding']
    
    # Build where clause for filtering
    where_clause = {"category": category_filter} if category_filter else None
    
    # Search with filter
    results = collection.query(
        query_embeddings=[query_embedding],
        n_results=n_results,
        where=where_clause
    )
    
    # Generate answer
    context = "\n\n".join([
        f"Document {i+1}: {doc}"
        for i, doc in enumerate(results['documents'][0])
    ])
    
    prompt = f"""
Based on the following context, answer the question.

Context:
{context}

Question: {question}

Answer:
"""
    
    model = genai.GenerativeModel('gemini-3-pro')
    response = model.generate_content(prompt)
    
    return response.text, results['metadatas'][0]

# Test with filtering
print("Searching only in 'basics' category:\n")
answer, sources = rag_query_with_filter(
    "Explain machine learning",
    category_filter="basics"
)

display(Markdown(answer))
print("\nSources:", [s['topic'] for s in sources])
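
The `where` argument above filters on a single field. ChromaDB's filter syntax expects multiple conditions to be combined under an operator such as `$and`, and the helper below sketches how such a clause could be assembled. `build_where` is a hypothetical helper, and operator support (including `$in`) should be checked against the ChromaDB version in use.

```python
def build_where(**filters):
    """Build a Chroma-style `where` dict from keyword filters, skipping None values.

    A list or tuple value becomes an `$in` condition; any other value is matched exactly.
    """
    conditions = []
    for field, value in filters.items():
        if value is None:
            continue
        if isinstance(value, (list, tuple)):
            conditions.append({field: {"$in": list(value)}})
        else:
            conditions.append({field: value})

    if not conditions:
        return None              # no filtering
    if len(conditions) == 1:
        return conditions[0]     # a single condition needs no operator
    return {"$and": conditions}  # several conditions must all hold

print(build_where())
print(build_where(category="basics"))
print(build_where(category="basics", topic=["nlp", "machine learning"]))
```

The returned value is intended to be passed as `where=` to `collection.query`, in the same position as `where_clause` in the function above.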

## 10. Best Practices for RAG

### Key Considerations:

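1. **Chunk Size**: Split long documents into chunks before embedding (200-500 tokens is a common starting point), so each chunk covers one idea
2. **Embedding Quality**: Use an embedding model suited to your domain and language when one is available
3. **Retrieval Strategy**: Experiment with the number of results, metadata filters, and alternative retrieval methods
4. **Context Window**: Balance context length against relevance; more retrieved text is not always better
5. **Source Attribution**: Always return the sources alongside the answer for transparency
6. **Evaluation**: Regularly evaluate both retrieval relevance and answer quality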

In [None]:
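# Example: Document chunking
def chunk_text(text, chunk_size=200, overlap=50):
    """Split text into overlapping chunks, measured in whitespace-separated words."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")

    words = text.split()
    chunks = []

    # Advance by (chunk_size - overlap) words so consecutive chunks share `overlap` words
    for i in range(0, len(words), chunk_size - overlap):
        chunk = ' '.join(words[i:i + chunk_size])
        chunks.append(chunk)

        # Stop once the current chunk reaches the end of the text
        if i + chunk_size >= len(words):
            break

    return chunks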

# Test chunking
long_text = " ".join(["This is a test sentence."] * 100)
chunks = chunk_text(long_text, chunk_size=50, overlap=10)
print(f"Created {len(chunks)} chunks from text with {len(long_text.split())} words")
print(f"First chunk: {chunks[0][:100]}...")
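
For the evaluation point above, one simple retrieval metric is hit rate at k: the fraction of test questions for which the expected document appears in the top k results. The sketch below uses a stub retriever so it runs without an API key. `hit_rate_at_k`, `stub_retrieve`, and the test cases are hypothetical; in this notebook the retriever would presumably wrap `search_documents` and return the IDs from its results.

```python
def hit_rate_at_k(test_cases, retrieve, k=3):
    """Fraction of (question, expected_id) pairs whose expected_id is in the top-k ids."""
    if not test_cases:
        raise ValueError("test_cases must not be empty")
    hits = 0
    for question, expected_id in test_cases:
        top_ids = retrieve(question)[:k]
        if expected_id in top_ids:
            hits += 1
    return hits / len(test_cases)

def stub_retrieve(question):
    """Stand-in retriever returning ranked document ids (canned, for illustration)."""
    canned = {
        "What are transformers?": ["doc4", "doc2", "doc8"],
        "What is NLP?": ["doc3", "doc4", "doc8"],
        "What is clustering?": ["doc5", "doc1", "doc6"],
    }
    return canned.get(question, [])

test_cases = [
    ("What are transformers?", "doc4"),
    ("What is NLP?", "doc3"),
    ("What is clustering?", "doc6"),
]

print(f"hit rate@1: {hit_rate_at_k(test_cases, stub_retrieve, k=1):.2f}")
print(f"hit rate@3: {hit_rate_at_k(test_cases, stub_retrieve, k=3):.2f}")
```

Hit rate only measures retrieval; answer quality still needs a separate check, for example by comparing generated answers against reference answers.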

## Next Steps

- Explore advanced RAG with LangChain
- Build agentic RAG systems
- Create production-ready RAG applications

---

## Learn More

Build production-ready RAG systems with the **[Gen AI Crash Course](https://www.buildfastwithai.com/genai-course)** by Build Fast with AI!
