# Agentic RAG


## Sparse vs Dense Retrieval

<img src="sparse_vs_dense_retrieval.png" alt="Sparse vs Dense Retrieval" width="800">
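
For contrast with the dense pipeline in this notebook, here is a minimal sketch of sparse (term-matching) retrieval. It is a simplified TF-IDF scorer written for illustration, not a production ranker such as BM25; the helper names `tokenize` and `tfidf_scores` are hypothetical.

```python
import math
import re
from collections import Counter

passages = [
    "The capital of France is Paris.",
    "The heart pumps blood throughout the body.",
    "Transformers are neural network architectures designed for sequence modeling.",
]

def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())

def tfidf_scores(query, docs):
    """Score each document by the summed TF-IDF weight of the query terms it contains."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(docs)
    # Document frequency: in how many documents does each term occur?
    df = Counter(term for toks in tokenized for term in set(toks))
    scores = []
    for toks in tokenized:
        tf = Counter(toks)
        score = 0.0
        for term in tokenize(query):
            if term in tf:
                # The +1 keeps terms that appear in every document from scoring zero.
                idf = math.log(n_docs / df[term]) + 1.0
                score += tf[term] * idf
        scores.append(score)
    return scores

sparse_scores = tfidf_scores("capital of France?", passages)
best = max(range(len(passages)), key=lambda i: sparse_scores[i])
print(passages[best])  # The capital of France is Paris.
```

Only passages that share literal terms with the query receive a non-zero score here. A paraphrased query with no overlapping words would score zero everywhere, which is the gap dense retrieval aims to close.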


## Dense Retrieval using DPR + FAISS
This notebook demonstrates how to encode passages and queries using DPR and perform semantic search using FAISS.

<img src="dpr_matching.gif" alt="DPR" width="800">

<img src="DPR.png" alt="DPR" width="800">

In [2]:
# Install required libraries (skip this cell if they are already installed)
!pip install sentence-transformers faiss-cpu



In [3]:
from sentence_transformers import SentenceTransformer
import faiss
import numpy as np

## Load DPR Question Encoder
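We load a pretrained Dense Passage Retrieval (DPR) question encoder from Hugging Face via `sentence-transformers`. DPR pairs this model with a separate context encoder for passages; this notebook uses the question encoder for both queries and passages to keep the example short.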

In [5]:
model = SentenceTransformer("facebook-dpr-question_encoder-single-nq-base")

## Define Knowledge Base (Passages)
These are example sentences to search over. A real knowledge base may contain thousands of documents or more.

In [7]:
passages = [
    "The capital of France is Paris.",
    "The heart pumps blood throughout the body.",
    "Transformers are neural network architectures designed for sequence modeling."
]

## Encode Passages
Convert textual passages into vector embeddings for similarity search.

In [8]:
passage_embeddings = model.encode(passages, convert_to_numpy=True, normalize_embeddings=True)

In [18]:
len(passage_embeddings[0])

768

In [9]:
passage_embeddings.shape

(3, 768)
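
Passing `normalize_embeddings=True` scales every embedding to unit L2 length. The sketch below reproduces that operation in NumPy on made-up 2-D vectors (illustrative values, not real DPR output), so the effect is easy to check by hand.

```python
import numpy as np

# Toy vectors standing in for embeddings (illustrative values only).
raw = np.array([[3.0, 4.0], [1.0, 0.0], [0.5, 0.5]], dtype=np.float32)

# Divide each row by its L2 norm so that every vector has length 1.
norms = np.linalg.norm(raw, axis=1, keepdims=True)
unit = raw / norms

print(unit[0])                       # [0.6 0.8]
print(np.linalg.norm(unit, axis=1))  # every norm is 1.0 up to float rounding
```

Once all vectors have unit length, the inner product of two vectors equals their cosine similarity, which is why an inner-product index can be used for cosine search.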

## Build FAISS Index
We create a flat FAISS index that scores by inner product and compares the query against every stored vector. Because the embeddings were L2-normalized, the inner product equals cosine similarity.

In [10]:
dim = passage_embeddings.shape[1]
index = faiss.IndexFlatIP(dim)
index.add(passage_embeddings)

In [11]:
dim

768
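
`IndexFlatIP` performs an exact, exhaustive search: it computes the inner product between the query and every stored vector and keeps the highest scores. The NumPy sketch below mirrors that behaviour on toy unit-length vectors; `flat_ip_search` is a hypothetical helper written for illustration, not part of FAISS.

```python
import numpy as np

def flat_ip_search(index_vectors, queries, k):
    """Exhaustive inner-product search returning (scores, ids) of the top-k rows per query."""
    scores = queries @ index_vectors.T        # shape (n_queries, n_vectors)
    ids = np.argsort(-scores, axis=1)[:, :k]  # highest score first
    top = np.take_along_axis(scores, ids, axis=1)
    return top, ids

# Toy unit-length vectors (illustrative values, not DPR embeddings).
vecs = np.array([[1.0, 0.0], [0.0, 1.0], [0.6, 0.8]], dtype=np.float32)
q = np.array([[0.8, 0.6]], dtype=np.float32)

top_scores, top_ids = flat_ip_search(vecs, q, k=2)
print(top_ids)     # [[2 0]]
print(top_scores)  # approximately [[0.96 0.8]]
```

The cost of this approach grows linearly with the number of stored vectors, which is acceptable for a handful of passages. For large collections, FAISS also provides approximate index types that trade some accuracy for speed.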

## Encode Query
Convert the user query into the same embedding space as the passages.

In [22]:
query = "capital of France?"
query_embedding = model.encode([query], convert_to_numpy=True, normalize_embeddings=True)

In [23]:
len(query_embedding[0])

768

In [24]:
query_embedding

array([[ 7.12551409e-03, -2.83812098e-02,  6.57091569e-03,
         1.19922059e-02,  1.92441717e-02,  2.37422273e-03,
         ...
(output truncated; the array holds 768 values)

## Search Top-k Similar Passages
Retrieve the most relevant passages for the query.

In [25]:
k = 2
D, I = index.search(query_embedding, k)

In [26]:
D, I

(array([[0.9066088, 0.6765148]], dtype=float32), array([[0, 1]]))

In [24]:
D[0][0]

np.float32(0.9066088)

## Display Results
Print the top-k passages along with their similarity scores.

In [27]:
print("Query:", query)
for rank, idx in enumerate(I[0]):
    print(f"Rank {rank+1}: {passages[idx]} (score={D[0][rank]:.4f})")

Query: capital of France?
Rank 1: The capital of France is Paris. (score=0.9066)
Rank 2: The heart pumps blood throughout the body. (score=0.6765)
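
Note that rank 2 is an unrelated passage that still scores 0.6765: with a fixed `k`, the search always returns `k` neighbours, whether or not they are relevant. One common mitigation is a minimum-score cutoff. The sketch below applies one to the `D` and `I` values printed above; `collect_hits` and the threshold of 0.8 are hypothetical choices for illustration, and a sensible cutoff would need tuning on real data.

```python
import numpy as np

passages = [
    "The capital of France is Paris.",
    "The heart pumps blood throughout the body.",
    "Transformers are neural network architectures designed for sequence modeling.",
]

# Values copied from the search output shown earlier in this notebook.
D = np.array([[0.9066088, 0.6765148]], dtype=np.float32)
I = np.array([[0, 1]])

def collect_hits(scores, ids, docs, min_score=0.0):
    """Pair each returned id with its passage, dropping hits below min_score."""
    hits = []
    for score, idx in zip(scores, ids):
        if idx < 0:  # FAISS pads with -1 when fewer than k results exist
            continue
        if score >= min_score:
            hits.append({"passage": docs[idx], "score": float(score)})
    return hits

hits = collect_hits(D[0], I[0], passages, min_score=0.8)
for hit in hits:
    print(f"{hit['passage']} (score={hit['score']:.4f})")
# The capital of France is Paris. (score=0.9066)
```

Returning a list of dictionaries keeps each passage next to its score, which is convenient when the hits are passed on to a downstream generation step.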
