# Day 6 coding: FAISS + RAG pipeline 
We are building a mini RAG retrieval example with:

client-specific documents

embeddings of those documents

FAISS as the vector store

a query that we embed and search with

top-k results retrieved



# Step 1: Install Libraries

Why?

sentence-transformers: converts text into embeddings (dense vectors).

faiss-cpu: the CPU build of FAISS (Facebook AI Similarity Search), a library for fast similarity search over dense vectors. It is often used as the vector store in RAG pipelines.


In [4]:
!pip install sentence-transformers faiss-cpu




# Step 2: Import libraries

In [5]:
from sentence_transformers import SentenceTransformer

In [6]:
import faiss

In [7]:
import numpy as np 

# Why?

SentenceTransformer: pre-trained embedding models

faiss: indexing and searching vectors

numpy: array operations (FAISS takes its input as float32 np.ndarray)

# Step 3: Define your documents

In [8]:
documents = [
    "The capital of France is Paris.",
    "The Eiffel Tower is in Paris.",
    "Berlin is the capital of Germany.",
    "The Brandenburg Gate is in Berlin.",
    "Tokyo is the capital of Japan."
]


# Why?

This is our client knowledge base: 5 simple factual sentences.

In a real RAG pipeline, these would come from PDFs, scraped web pages, or a database.

# Step 4: Generate embeddings for the documents

In [9]:
model = SentenceTransformer("all-MiniLM-L6-v2")

In [11]:
doc_embeddings = model.encode(documents)

# Why?

We load a small, fast model: all-MiniLM-L6-v2

.encode() converts each document string into a dense vector (384 dimensions for this model).

The result is a NumPy array of shape (5, 384): one row per document.



# Step 5: Initialize the FAISS index

In [15]:
dimension = doc_embeddings.shape[1]
index = faiss.IndexFlatL2(dimension) # L2 = Euclidean

# Why?

dimension must match the embedding size (384 here).

IndexFlatL2 → flat (brute-force) index using L2 distance (Euclidean).
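
To make "flat (brute-force)" concrete, here is a minimal NumPy sketch of the exhaustive search that a flat L2 index performs conceptually: compare the query against every stored vector and keep the k smallest distances. The function name `brute_force_l2_search` and the toy 2-D vectors are made up for illustration. FAISS's real implementation is optimized and is not shown here.

```python
import numpy as np


def brute_force_l2_search(vectors, queries, k):
    """Exhaustive k-nearest-neighbour search using squared L2 distance.

    Hypothetical helper for illustration only; not part of FAISS.
    Returns (distances, indices), each of shape (n_queries, k).
    """
    vectors = np.asarray(vectors, dtype="float32")
    queries = np.asarray(queries, dtype="float32")
    # Pairwise differences: shape (n_queries, n_vectors, dim)
    diff = queries[:, None, :] - vectors[None, :, :]
    sq_dists = (diff ** 2).sum(axis=2)
    # Sort each row by distance and keep the k closest vectors
    indices = np.argsort(sq_dists, axis=1, kind="stable")[:, :k]
    distances = np.take_along_axis(sq_dists, indices, axis=1)
    return distances, indices


# Toy 2-D vectors standing in for real embeddings (made-up values)
toy_vectors = [[0.0, 0.0], [1.0, 0.0], [0.0, 3.0], [5.0, 5.0]]
toy_query = [[0.9, 0.1]]

toy_distances, toy_indices = brute_force_l2_search(toy_vectors, toy_query, k=2)
print(toy_indices)    # positions of the two nearest vectors
print(toy_distances)  # their squared L2 distances, smallest first
```

Because every stored vector is compared with every query, the cost grows linearly with the number of documents. That is fine for 5 sentences and is the reason approximate index types exist for larger collections.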

# Step 6: Add embeddings to FAISS

In [16]:
index.add(np.asarray(doc_embeddings, dtype="float32"))  # FAISS expects float32 vectors

# Why?

This stores the vectors in the FAISS index, ready for similarity search.

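The index itself holds only vectors. In a more advanced system, metadata (source, title, page) lives in a separate structure that you look up by index position, or by custom IDs if you wrap the index in faiss.IndexIDMap.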
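
Since a plain FAISS index stores vectors but not the text or metadata behind them, one common pattern is to keep a separate list aligned with insertion order, so that position i in the index corresponds to entry i in the list. The sketch below assumes that pattern. The helper name `lookup_metadata` and the `source` and `page` fields (and their values) are hypothetical.

```python
# Metadata kept alongside the index, in the same order the vectors were added.
# Field names and values are hypothetical placeholders.
doc_metadata = [
    {"text": "The capital of France is Paris.", "source": "geo_notes.txt", "page": 1},
    {"text": "The Eiffel Tower is in Paris.", "source": "geo_notes.txt", "page": 1},
    {"text": "Berlin is the capital of Germany.", "source": "geo_notes.txt", "page": 2},
]


def lookup_metadata(result_indices, metadata):
    """Map index positions from a search result back to metadata records.

    Positions outside the list (for example negative values) are skipped.
    """
    return [metadata[i] for i in result_indices if 0 <= i < len(metadata)]


# Example: positions as they might come back from a top-2 search
hits = lookup_metadata([1, 0], doc_metadata)
for hit in hits:
    print(hit["source"], "p.", hit["page"], "->", hit["text"])
```

This only works while the list and the index stay in sync, so documents must be appended to both in the same order.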




# Step 7: Query Embedding

In [19]:
query = "Where is eiffel Tower?"
query_embedding = model.encode([query])


# Why?

We convert the query string into an embedding — in the same vector space as the documents.

# Step 8: Search

In [None]:
k = 2  # top-2
distances, indices = index.search(np.array(query_embedding), k)

# Why?

index.search() takes the query vector and finds the k most similar document vectors.

Returns:

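distances: array of shape (1, k) holding squared L2 distances (lower = more similar)

indices: array of shape (1, k) holding the positions of the matching documents, in the order they were added to the index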
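
One detail worth checking numerically: `IndexFlatL2` reports squared Euclidean distances, and for unit-length vectors the squared distance is tied to cosine similarity by `||a - b||^2 = 2 - 2 * cos(a, b)`. The NumPy sketch below verifies this identity on random vectors. Whether your embeddings are unit-length depends on the model and on whether you normalize them, so treat that as an assumption to verify for your own setup.

```python
import numpy as np

# Two random vectors, scaled to unit length (stand-ins for normalized embeddings)
rng = np.random.default_rng(0)
a = rng.normal(size=8)
b = rng.normal(size=8)
a /= np.linalg.norm(a)
b /= np.linalg.norm(b)

squared_l2 = float(np.sum((a - b) ** 2))
cosine = float(a @ b)

# For unit vectors both expressions agree up to floating-point error
print(squared_l2, 2 - 2 * cosine)
```

Under that assumption, ranking by smallest L2 distance and ranking by largest cosine similarity return the same order.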



# Step 9: Print results

In [21]:
for idx in indices[0]:
    print(documents[idx])
    

The Eiffel Tower is in Paris.
The capital of France is Paris.


# Why?

Maps the indices back to the original document text and prints.
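
A slightly fuller version of this mapping step pairs each hit with its distance and skips the index value -1, which FAISS uses as a placeholder when fewer than k neighbours are available. The helper name `format_hits` is hypothetical, and the sample values are invented to exercise the function. They are not output from the search above.

```python
def format_hits(docs, distances_row, indices_row):
    """Return (document, distance) pairs for one query, skipping -1 placeholders."""
    hits = []
    for dist, idx in zip(distances_row, indices_row):
        if idx == -1:
            continue  # no neighbour was found for this slot
        hits.append((docs[int(idx)], float(dist)))
    return hits


# Invented sample data: three documents, a top-3 result with one empty slot
sample_docs = ["doc A", "doc B", "doc C"]
sample_distances = [0.25, 1.5, float("inf")]
sample_indices = [2, 0, -1]

for text, dist in format_hits(sample_docs, sample_distances, sample_indices):
    print(f"{dist:.2f}  {text}")
```

In the notebook, the equivalent call would be `format_hits(documents, distances[0], indices[0])`, since both result arrays have one row per query.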

# Summary:

Embedded the documents → stored them in FAISS

Embedded the query → searched for the top-k nearest vectors in FAISS

Retrieved the most relevant documents

# Interview-ready takeaway:
In a RAG pipeline, we use FAISS or a similar vector store to hold embeddings of the client knowledge base, retrieve the top-k documents most similar to the query embedding, and pass them to the LLM as context for generating the final response.
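
The last hop in that sentence, passing retrieved documents to the LLM, usually means placing them into the prompt as context. The sketch below shows one way to assemble such a prompt. The function name `build_prompt` and the template wording are assumptions, and no particular LLM API is implied.

```python
def build_prompt(question, retrieved_docs):
    """Assemble a prompt that grounds the question in retrieved context."""
    context = "\n".join(f"- {doc}" for doc in retrieved_docs)
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\n"
        "Answer:"
    )


# The two documents retrieved in Step 9, reused here as context
retrieved = [
    "The Eiffel Tower is in Paris.",
    "The capital of France is Paris.",
]
prompt = build_prompt("Where is eiffel Tower?", retrieved)
print(prompt)
```

The resulting string would then be sent to whichever LLM the pipeline uses. That call is left out because it depends on the provider.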