# Hybrid RAG Demo: FAISS + BM25


This notebook demonstrates a **Hybrid RAG** pipeline that uses:
- **BM25** for lexical matching (exact word matches)
- **FAISS** for semantic similarity using Sentence Transformers
- **Score fusion** (an `alpha`-weighted sum of the two scores) to combine both methods for better retrieval quality


In [1]:

!pip install rank_bm25 faiss-cpu sentence-transformers



In [None]:

from rank_bm25 import BM25Okapi
from sentence_transformers import SentenceTransformer
import faiss
import numpy as np


In [3]:

corpus = [
    "I want to lose weight and eat better.",
    "Sometimes I eat fast food because it's convenient.",
    "I’ve tried dieting before but gave up quickly.",
    "I know I should exercise more but I feel too tired.",
    "My family habits make it hard to change.",
    "I want to be healthier for my kids.",
    "Obesity is affecting my self-confidence.",
    "Walking is one thing I enjoy doing.",
    "I feel judged when I talk about my weight.",
    "I want to feel good about my body."
]


In [4]:

tokenized_corpus = [doc.lower().split() for doc in corpus]
bm25 = BM25Okapi(tokenized_corpus)
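Splitting on whitespace leaves punctuation attached to tokens, so `better.` in the corpus would not match `better` in a query. The following is a minimal sketch of a punctuation-aware tokenizer, assuming a simple regex is adequate for short English sentences like these. `tokenize` is a hypothetical helper, not part of `rank_bm25`.

```python
import re

def tokenize(text):
    """Lowercase the text and keep runs of letters, digits, and apostrophes."""
    # Map the curly apostrophe to a straight one so "I’ve" and "I've" match.
    text = text.lower().replace("\u2019", "'")
    return re.findall(r"[a-z0-9']+", text)

sentence = "I want to lose weight and eat better."
print(sentence.lower().split()[-1])  # whitespace split keeps the period
print(tokenize(sentence)[-1])        # regex tokenizer drops it
```

To have any effect, the same function would need to be applied to both sides, for example `BM25Okapi([tokenize(doc) for doc in corpus])` and `bm25.get_scores(tokenize(query))`.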


In [5]:

model = SentenceTransformer("all-MiniLM-L6-v2")
doc_embeddings = model.encode(corpus, convert_to_numpy=True)

index = faiss.IndexFlatL2(doc_embeddings.shape[1])
index.add(doc_embeddings)


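`IndexFlatL2` reports squared Euclidean distances, where smaller means more similar. If the embeddings are unit-length (an assumption here that should be checked for the model in use), the squared distance `d2` and the cosine similarity are related by `cos = 1 - d2 / 2`. The sketch below checks that identity in plain Python with hand-picked vectors rather than real embeddings; all function names are illustrative.

```python
import math

def normalize(v):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def squared_l2(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def cosine_unit(a, b):
    """Cosine similarity; equals the dot product only for unit vectors."""
    return sum(x * y for x, y in zip(a, b))

a = normalize([1.0, 2.0, 3.0])
b = normalize([2.0, 0.5, 1.0])

d2 = squared_l2(a, b)
# For unit vectors, |a - b|^2 = 2 - 2 * (a . b), so the two values agree.
print(math.isclose(cosine_unit(a, b), 1 - d2 / 2))
```

Under that assumption, `1 - d2 / 2` is a similarity in [-1, 1], which is easier to combine with other scores than a raw distance.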

In [6]:

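def hybrid_retrieve(query, k=3, alpha=0.5):
    """Return the top-k documents by a weighted sum of BM25 and FAISS scores."""
    # Semantic search: D holds squared L2 distances, I the matching corpus indices.
    query_embedding = model.encode([query], convert_to_numpy=True)
    D, I = index.search(query_embedding, k)

    # Lexical search: score every document, then sort indices by descending score.
    bm25_scores = bm25.get_scores(query.lower().split())
    bm25_ranks = np.argsort(bm25_scores)[::-1]

    bm25_top = bm25_ranks[:k]
    faiss_top = I[0]

    # Fuse raw scores (not ranks): alpha weights the BM25 score and
    # (1 - alpha) weights 1 - squared L2 distance. Neither score is normalized,
    # so the two terms are on different scales.
    score_dict = {}
    for idx in bm25_top:
        score_dict[idx] = score_dict.get(idx, 0) + alpha * bm25_scores[idx]
    for i, idx in enumerate(faiss_top):
        score_dict[idx] = score_dict.get(idx, 0) + (1 - alpha) * (1 - D[0][i])

    ranked = sorted(score_dict.items(), key=lambda x: x[1], reverse=True)
    return [(corpus[idx], score) for idx, score in ranked[:k]]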
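`hybrid_retrieve` sums weighted raw scores. A rank-based alternative is reciprocal rank fusion (RRF), which uses only each document's position in each result list and so avoids mixing score scales. The following is a minimal sketch, assuming the commonly used constant `k = 60`. `rrf_fuse` and the example index lists are hypothetical and not taken from the notebook run.

```python
def rrf_fuse(rankings, k=60, top_n=3):
    """Fuse several ranked lists of document ids with reciprocal rank fusion."""
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            # Each list contributes 1 / (k + rank) for every document it contains.
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first; ties are broken by document id for determinism.
    ordered = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return ordered[:top_n]

bm25_ranking = [1, 0, 9]   # made-up corpus indices, best first
faiss_ranking = [1, 2, 0]  # made-up corpus indices, best first

for doc_id, score in rrf_fuse([bm25_ranking, faiss_ranking]):
    print(doc_id, round(score, 5))
```

Inside `hybrid_retrieve`, the two inputs would be `bm25_top.tolist()` and `faiss_top.tolist()`. Documents that appear in both lists accumulate two contributions and rise to the top.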


In [7]:

query = "I want to stop eating junk food"
top_docs = hybrid_retrieve(query)
for i, (doc, score) in enumerate(top_docs, 1):
    print(f"{i}. {doc} (score={score:.4f})")


1. Sometimes I eat fast food because it's convenient. (score=1.1300)
2. I want to lose weight and eat better. (score=0.8675)
3. I want to feel good about my body. (score=0.7720)
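The top score above exceeds 1 because raw BM25 scores are not confined to [0, 1], while the FAISS term `1 - D` never exceeds 1. One common remedy is to min-max scale each score list to [0, 1] before applying `alpha`. The following is a minimal sketch with made-up scores; `min_max` and `fuse` are hypothetical helpers, not the notebook's method.

```python
def min_max(scores):
    """Scale a dict of scores to [0, 1]; a constant dict maps to all zeros."""
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {doc_id: 0.0 for doc_id in scores}
    return {doc_id: (s - lo) / (hi - lo) for doc_id, s in scores.items()}

def fuse(lexical, semantic, alpha=0.5):
    """Weighted sum of two normalized score dicts; missing documents count as 0."""
    lex, sem = min_max(lexical), min_max(semantic)
    doc_ids = set(lex) | set(sem)
    fused = {d: alpha * lex.get(d, 0.0) + (1 - alpha) * sem.get(d, 0.0) for d in doc_ids}
    return sorted(fused.items(), key=lambda item: (-item[1], item[0]))

bm25_scores = {0: 1.2, 1: 2.4, 9: 0.6}      # made-up values
similarities = {0: 0.40, 1: 0.70, 2: 0.55}  # made-up values, higher is better

for doc_id, score in fuse(bm25_scores, similarities):
    print(doc_id, round(score, 3))
```

With both lists on the same scale, `alpha` behaves as a true mixing weight: `alpha=1.0` reduces to lexical ranking and `alpha=0.0` to semantic ranking.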



This notebook shows how to:
- Combine **BM25** lexical scores with **FAISS** semantic similarity
- Fuse the two result lists with a weighted score sum for hybrid retrieval
- Adapt the demo to your own RAG pipeline by replacing `corpus` with real data
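In a full RAG pipeline, the retrieved passages are usually passed to a language model as context. The following sketch shows one way to format them into a prompt. `build_prompt` and its wording are hypothetical, and the model call itself is left out.

```python
def build_prompt(question, retrieved, max_docs=3):
    """Format retrieved (text, score) pairs into a prompt string for an LLM."""
    lines = ["Answer the question using only the context below.", "", "Context:"]
    for n, (text, _score) in enumerate(retrieved[:max_docs], start=1):
        lines.append(f"[{n}] {text}")
    lines += ["", f"Question: {question}", "Answer:"]
    return "\n".join(lines)

# Pairs shaped like the return value of hybrid_retrieve;
# the texts and scores are copied from the run above.
retrieved = [
    ("Sometimes I eat fast food because it's convenient.", 1.1300),
    ("I want to lose weight and eat better.", 0.8675),
]
print(build_prompt("I want to stop eating junk food", retrieved))
```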
