# Enhanced RAG System with Embeddings and FAISS

This notebook demonstrates an improved retrieval-augmented generation (RAG) system that uses sentence embeddings and a FAISS index for similarity search, with Gemini generating the final answer.

## 1. Install Dependencies

In [1]:
# Install required dependencies
%pip install -r requirements.txt

Note: you may need to restart the kernel to use updated packages.


## 2. Import Libraries and Setup

In [2]:
import re
import os
import numpy as np
import faiss
from sentence_transformers import SentenceTransformer
from google import genai

# Initialize embedding model
embedding_model = SentenceTransformer('all-MiniLM-L6-v2')

def chunk_text(text, chunk_size=600, overlap=100):
    """Split text into overlapping chunks using sliding window"""
    return [text[i:i+chunk_size] for i in range(0, len(text), chunk_size - overlap)]

def create_faiss_index(embeddings):
    """Create FAISS index for efficient similarity search"""
    dimension = embeddings.shape[1]
    index = faiss.IndexFlatL2(dimension)
    index.add(embeddings.astype(np.float32))
    return index

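
The sliding window in `chunk_text` advances by `chunk_size - overlap` characters per step, so consecutive chunks share `overlap` characters. A minimal sketch on a short string makes the boundaries visible. `chunk_text_checked` is a hypothetical variant, not part of the notebook, that also rejects an `overlap` that is not smaller than `chunk_size`: a zero step makes `range` raise `ValueError`, and a negative step silently yields no chunks.

```python
def chunk_text_checked(text, chunk_size=600, overlap=100):
    """Sliding-window chunking with a guard on the step size (hypothetical helper)."""
    if chunk_size <= 0 or not 0 <= overlap < chunk_size:
        raise ValueError("need chunk_size > 0 and 0 <= overlap < chunk_size")
    step = chunk_size - overlap  # how far the window moves each time
    return [text[i:i + chunk_size] for i in range(0, len(text), step)]

sample = "abcdefghijklmnopqrstuvwxyz"
demo_chunks = chunk_text_checked(sample, chunk_size=10, overlap=3)

# Window starts are 0, 7, 14, 21 because the step is 10 - 3 = 7.
for start, chunk in zip(range(0, len(sample), 7), demo_chunks):
    print(start, chunk)
# 0 abcdefghij
# 7 hijklmnopq
# 14 opqrstuvwx
# 21 vwxyz
```

Two properties of character-based windows show up here: the final chunk can be much shorter than `chunk_size`, and a boundary can fall in the middle of a word.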


## 3. Prepare Document and Generate Embeddings

In [21]:
# Sample document (same as previous)
document = """Once upon a time, in a whimsical land called Veggieville, there lived a curious rabbit named Bunny. Bunny was no ordinary rabbit—she had a knack for finding strange and magical objects. One sunny morning, while hopping through the Enchanted Forest, she stumbled upon a peculiar hat lying under a giant carrot-shaped tree. The hat was no ordinary hat—it was the Magic Hat of 2000 Lines, a legendary artifact said to grant its wearer the power to weave stories, spells, and songs with just a thought.
    Bunny picked up the hat and examined it closely. It was a tall, floppy hat with shimmering silver threads that seemed to dance in the sunlight. As she placed it on her head, a voice boomed from nowhere and everywhere at once.
    "Ah, a new wearer! Welcome, Bunny. I am the Hat, and I am bound to serve you. But beware—my magic is not infinite. I can only create 2000 lines of magic before my power fades. Use them wisely!"
    Bunny's ears perked up. "2000 lines? That's a lot! What can I do with them?"
    The Hat chuckled. "Anything you can imagine! Poems, riddles, spells, even entire stories. But remember, once the lines are used up, my magic is gone forever."
    Excited, Bunny decided to test the Hat's powers. She thought of her best friend, Carrot, a cheerful orange vegetable with a knack for getting into trouble. "Hat, can you tell me a story about Carrot?"
    The Hat glowed, and a story began to unfold:
    Once, in the heart of Veggieville, there lived a carrot named Carrot who loved to explore. One day, he stumbled upon a salty cave guarded by a grumpy old grain named Salt. Salt was the keeper of the Cave of Crystals, a place filled with shimmering treasures. But Salt was lonely and bitter, and he refused to let anyone enter.
    Carrot, being the curious soul he was, decided to befriend Salt. He brought him a basket of fresh vegetables and sang him a song so sweet that even Salt's hard exterior began to melt. Slowly, Salt opened up and shared his treasures with Carrot, and the two became the unlikeliest of friends.
    Bunny clapped her paws. "That was amazing! But... how many lines did that use?"
    The Hat sighed. "That was 15 lines. You have 1985 left."
    Bunny gasped. "Oh no! I need to be more careful. I don’t want to waste your magic."
    """

# Preprocess text
clean_text = re.sub(r'\s+', ' ', document).strip()

# Create chunks
chunks = chunk_text(clean_text)
print(f"Document chunks: {len(chunks)}")
print(f"First chunk example: {chunks[0:1]}")

# Generate embeddings
chunk_embeddings = embedding_model.encode(chunks)
print(f"Embedding dimensions: {chunk_embeddings.shape}")

# Create FAISS index
index = create_faiss_index(chunk_embeddings)

Document chunks: 5
First chunk example: ['Once upon a time, in a whimsical land called Veggieville, there lived a curious rabbit named Bunny. Bunny was no ordinary rabbit—she had a knack for finding strange and magical objects. One sunny morning, while hopping through the Enchanted Forest, she stumbled upon a peculiar hat lying under a giant carrot-shaped tree. The hat was no ordinary hat—it was the Magic Hat of 2000 Lines, a legendary artifact said to grant its wearer the power to weave stories, spells, and songs with just a thought. Bunny picked up the hat and examined it closely. It was a tall, floppy hat with shimmering silver thr']
Embedding dimensions: (5, 384)
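
`IndexFlatL2` ranks neighbors by squared Euclidean distance rather than cosine similarity. When all vectors have unit length the two orderings agree, because the squared distance equals `2 - 2 * cosine`. The sketch below checks that identity with NumPy, using random vectors as stand-ins for real embeddings (an assumption made so the example runs without downloading the model). Whether a given model's output is already unit length depends on the model; `SentenceTransformer.encode` accepts `normalize_embeddings=True` if normalization needs to be forced.

```python
import numpy as np

rng = np.random.default_rng(0)
vectors = rng.normal(size=(5, 384)).astype(np.float32)  # stand-in for chunk embeddings
query_vec = rng.normal(size=384).astype(np.float32)     # stand-in for a query embedding

# Scale every vector to unit length.
vectors = vectors / np.linalg.norm(vectors, axis=1, keepdims=True)
query_vec = query_vec / np.linalg.norm(query_vec)

squared_l2 = np.sum((vectors - query_vec) ** 2, axis=1)
cosine = vectors @ query_vec

# For unit vectors: ||a - b||^2 = 2 - 2 * cos(a, b)
print(np.max(np.abs(squared_l2 - (2 - 2 * cosine))))
```

The printed value is the largest deviation from the identity, which should be on the order of float32 rounding error.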


## 4. Query Processing and Retrieval

In [22]:
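def semantic_retrieval(query, index, chunks, top_k=3):
    """Retrieve the chunks closest to the query in embedding space"""
    # Encode the query with the same model used for the chunks
    query_embedding = embedding_model.encode([query])

    # Never request more neighbors than there are chunks; FAISS pads missing results with -1
    top_k = min(top_k, len(chunks))

    # Search the FAISS index; results are ordered by increasing L2 distance
    _, indices = index.search(query_embedding.astype(np.float32), top_k)

    # Return chunks from most to least similar
    return [chunks[i] for i in indices[0]]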
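
On an `IndexFlatL2`, `index.search` compares the query against every stored vector and returns the `top_k` smallest squared L2 distances along with their row indices. The NumPy sketch below performs the same exhaustive computation on toy 2-D vectors to show what comes back. `brute_force_search` is a hypothetical name used for illustration, not a FAISS function.

```python
import numpy as np

def brute_force_search(query_vec, matrix, top_k=3):
    """Return (squared distances, row indices) of the top_k nearest rows, closest first."""
    top_k = min(top_k, len(matrix))  # never ask for more rows than exist
    squared = np.sum((matrix - query_vec) ** 2, axis=1)
    order = np.argsort(squared)[:top_k]
    return squared[order], order

toy_matrix = np.array(
    [[0.0, 0.0], [1.0, 0.0], [0.0, 3.0], [2.0, 2.0]], dtype=np.float32
)
toy_query = np.array([0.9, 0.1], dtype=np.float32)

dists, idx = brute_force_search(toy_query, toy_matrix, top_k=2)
print(idx.tolist())  # [1, 0]: row 1 is closest, then row 0
```

Clamping `top_k` matters with FAISS as well: when fewer than `top_k` vectors are available, the returned index array is padded with `-1`, which Python would interpret as "last element" if used to index a list.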

## 5. Enhanced RAG Workflow

In [23]:
# Initialize Gemini client
from dotenv import load_dotenv

load_dotenv()
client = genai.Client(api_key=os.getenv('GEMINI_API_KEY'))

# Sample query
query = "What powers does the Magic Hat possess and what are its limitations?"

# Retrieve relevant context
context_chunks = semantic_retrieval(query, index, chunks)
context = "\n".join(context_chunks)

# Generate response
response = client.models.generate_content(
    model="gemini-2.0-flash",
    contents=f"""Answer the question based on the following context:
    {context}
    
    Question: {query}
    Answer:"""
)

# Display results
print("Retrieved Context:")
for chunk in context_chunks:
    print(f"- {chunk[:100]}...")

print("\nGenerated Answer:")
print(response.text)

Retrieved Context:
- unny picked up the hat and examined it closely. It was a tall, floppy hat with shimmering silver thr...
- Once upon a time, in a whimsical land called Veggieville, there lived a curious rabbit named Bunny. ...
- Hat chuckled. "Anything you can imagine! Poems, riddles, spells, even entire stories. But remember, ...

Generated Answer:
The Magic Hat possesses the power to weave stories, spells, and songs with just a thought. Its limitations are that it can only create 2000 lines of magic before its power fades and is gone forever.
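
The prompt in this cell is assembled inline. One possible refactoring, sketched under the assumption that a stricter grounding instruction is wanted, is a small helper that numbers the retrieved chunks and asks the model to say when the context does not contain the answer. `build_prompt` and its wording are hypothetical, not part of the notebook.

```python
def build_prompt(query, context_chunks):
    """Assemble a grounded prompt from retrieved chunks (hypothetical helper)."""
    numbered = "\n".join(
        f"[{n}] {chunk.strip()}" for n, chunk in enumerate(context_chunks, start=1)
    )
    return (
        "Answer the question using only the context below. "
        "If the context does not contain the answer, say so.\n\n"
        f"Context:\n{numbered}\n\n"
        f"Question: {query}\n"
        "Answer:"
    )

example_prompt = build_prompt(
    "What limits the Hat's magic?",
    [
        "I can only create 2000 lines of magic before my power fades.",
        "Once the lines are used up, my magic is gone forever.",
    ],
)
print(example_prompt)
```

The returned string could be passed as the `contents` argument of `client.models.generate_content` in place of the inline f-string.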



## Key Enhancements

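1. **Semantic Embeddings**: Uses the `all-MiniLM-L6-v2` model to map each chunk to a 384-dimensional dense vector.
2. **FAISS Index**: Stores the vectors in an `IndexFlatL2`, which performs exact nearest-neighbor search by L2 distance.
3. **Contextual Understanding**: Embedding similarity can match a query to relevant chunks even when they share few exact keywords.
4. **Scalability**: The same pipeline extends to larger document collections; because a flat index compares the query against every stored vector, very large collections may call for one of FAISS's approximate index types.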

To further improve:
- Experiment with different embedding models
- Add metadata filtering
- Implement hybrid search (dense + sparse)
- Use more sophisticated chunking strategies
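
Hybrid search needs a way to merge a dense ranking (for example, the FAISS result) with a sparse one (for example, keyword or BM25 scores). One common option is reciprocal rank fusion. The sketch below assumes each retriever returns chunk indices ordered best first; the function name, the sample rankings, and the constant `k=60` are illustrative choices, not part of the notebook.

```python
from collections import defaultdict

def reciprocal_rank_fusion(rankings, k=60):
    """Merge several best-first rankings into one list of chunk ids."""
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, chunk_id in enumerate(ranking, start=1):
            scores[chunk_id] += 1.0 / (k + rank)  # higher ranks contribute more
    # Highest fused score first; ties broken by chunk id for determinism.
    return sorted(scores, key=lambda cid: (-scores[cid], cid))

dense_ranking = [1, 0, 3]   # hypothetical output of the embedding search
sparse_ranking = [3, 1, 4]  # hypothetical output of a keyword search

fused = reciprocal_rank_fusion([dense_ranking, sparse_ranking])
print(fused)  # [1, 3, 0, 4]
```

Chunks that appear near the top of both lists (here 1 and 3) rise above chunks that only one retriever found, without requiring the two scoring scales to be comparable.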