#FAISS

#What is FAISS?
FAISS (Facebook AI Similarity Search) is a library for fast nearest-neighbor search over dense vectors (embeddings). You use it to find the most similar items quickly, which is central to RAG, semantic search, deduplication, and recommendations.

#Why FAISS?

Speed & scale: Handles millions to billions of vectors.

Versatile: Many index types (exact/approximate), CPU/GPU.

Works with any embedding model: FAISS only sees float vectors, so SBERT, OpenAI, Gemini, or any other model that outputs fixed-dimension embeddings will do.

#Key Concepts

>Vector/Embedding: Numeric representation of text/image/etc.

>Metric: How similarity is measured:

>>L2 (Euclidean distance)

>>Inner Product (dot product). With unit-normalized vectors, the inner product equals cosine similarity (up to floating-point rounding).

>Index types (starter set):

>>IndexFlatL2 / IndexFlatIP: Exact search (simple & accurate; slower at big scale).

>>IVF* (Inverted File): Approximate; must be trained on sample vectors before you add data; trades a little accuracy for a large speedup.

>>PQ/IVFPQ: Product Quantization for compression + speed (advanced).
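
The relationship between the two metrics can be checked numerically. The following is a minimal NumPy sketch, assuming two random 384-dimensional vectors as stand-ins for real embeddings (the seed and dimension are arbitrary choices, not part of the tutorial). It shows that after unit normalization the inner product matches cosine similarity, and that squared L2 distance is then just `2 - 2 * cosine`.

```python
import numpy as np

# Two random vectors standing in for embeddings (illustrative data only)
rng = np.random.default_rng(0)
a = rng.normal(size=384).astype("float32")
b = rng.normal(size=384).astype("float32")

# Cosine similarity computed directly from the raw vectors
cosine = float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# Unit-normalize, then take the plain inner product
a_unit = a / np.linalg.norm(a)
b_unit = b / np.linalg.norm(b)
inner = float(a_unit @ b_unit)

# Squared L2 distance between the unit vectors
l2_sq = float(np.sum((a_unit - b_unit) ** 2))

# Both checks compare within a small floating-point tolerance
print("inner product matches cosine:", np.isclose(cosine, inner, atol=1e-5))
print("l2^2 matches 2 - 2*cosine:   ", np.isclose(l2_sq, 2 - 2 * inner, atol=1e-5))
```

Because of this identity, ranking unit vectors by largest inner product and by smallest L2 distance produces the same order.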

#Usual Workflow
```
1. Create embeddings for your corpus (float32 vectors of same dimension).

2. Choose similarity (cosine/inner product or L2).

3. Build index (Flat for small; IVF/HNSW/PQ for big).

4. Add vectors (and store a mapping from id → metadata/document).

5. Search: embed the query → get top-k ids + scores → map back to documents.
```
#Simple Practical: Text Semantic Search with FAISS (Exact, Cosine)

We’ll:

>Use Sentence-Transformers to create text embeddings.

>Normalize vectors (so inner product = cosine).

>Build a Flat index with FAISS.

>Run a few searches.

💡 You don’t need any API keys for this practical.

#1) Install deps

If you have a CUDA GPU and want GPU speed later: pip install faiss-gpu (optional).



In [1]:
!pip install faiss-cpu sentence-transformers numpy


Collecting faiss-cpu
  Downloading faiss_cpu-1.13.1-cp310-abi3-manylinux_2_27_x86_64.manylinux_2_28_x86_64.whl.metadata (7.6 kB)
Downloading faiss_cpu-1.13.1-cp310-abi3-manylinux_2_27_x86_64.manylinux_2_28_x86_64.whl (23.7 MB)
Installing collected packages: faiss-cpu
Successfully installed faiss-cpu-1.13.1


#2) Build a tiny corpus & embed

In [2]:
import numpy as np
from sentence_transformers import SentenceTransformer

# --- Sample corpus (you can replace with your own docs) ---
docs = [
    "FAISS is a library for efficient similarity search on dense vectors.",
    "Vector databases store embeddings to enable semantic search.",
    "Cosine similarity compares the angle between two vectors.",
    "FAISS supports CPU and GPU and multiple index types like Flat and IVF.",
    "Retrieval Augmented Generation (RAG) uses a retriever before a language model.",
    "Euclidean distance computes straight-line distance in vector space.",
    "Product Quantization compresses vectors for faster and cheaper search.",
    "Sentence-Transformers can create text embeddings easily."
]

# --- Load a small, fast embedding model ---
model = SentenceTransformer("sentence-transformers/all-MiniLM-L6-v2")

# Create embeddings; normalize=True makes each vector unit length
embeddings = model.encode(docs, normalize_embeddings=True, convert_to_numpy=True)
embeddings = embeddings.astype("float32")  # faiss expects float32
embeddings.shape  # (num_docs, dim)



(8, 384)

#3) Build a FAISS index (Inner Product for cosine)

In [3]:
import faiss

dim = embeddings.shape[1]
# Use Inner Product index; with unit vectors this is cosine similarity
index = faiss.IndexFlatIP(dim)

# Add vectors to the index
index.add(embeddings)

# Keep a mapping from FAISS ids -> document text (and any metadata you want)
id2doc = {i: doc for i, doc in enumerate(docs)}

print("Index total vectors:", index.ntotal)


Index total vectors: 8
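
To make concrete what an exact inner-product index computes, here is a rough NumPy-only sketch of the same search: score every stored vector against the query and keep the top k. The random vectors, the seed, and the helper name `brute_force_ip` are assumptions for illustration; FAISS performs this computation far more efficiently, but the result for a Flat index should agree.

```python
import numpy as np

rng = np.random.default_rng(42)

# Stand-in for real embeddings: 8 random unit vectors of dimension 384
xb = rng.normal(size=(8, 384)).astype("float32")
xb /= np.linalg.norm(xb, axis=1, keepdims=True)

# Query: a slightly perturbed copy of stored vector 5, re-normalized
xq = xb[5] + 0.01 * rng.normal(size=384).astype("float32")
xq /= np.linalg.norm(xq)

def brute_force_ip(xb: np.ndarray, xq: np.ndarray, k: int):
    """Hypothetical helper: exact top-k search by inner product."""
    scores = xb @ xq                 # one inner product per stored vector
    top = np.argsort(-scores)[:k]    # ids of the k highest scores
    return scores[top], top

scores, ids = brute_force_ip(xb, xq, k=3)
print("top ids:", ids)
print("scores: ", scores)
```

The first id returned is 5, the vector the query was derived from, and its score is close to 1 because both vectors are unit length.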


#4) Search function

In [4]:
def search(query: str, k: int = 3):
    # Embed + normalize query
    q_emb = model.encode([query], normalize_embeddings=True, convert_to_numpy=True).astype("float32")
    # Search top-k
    # D = scores (inner product ~ cosine), I = indices
    D, I = index.search(q_emb, k)
    results = []
    for score, idx in zip(D[0], I[0]):
        if idx == -1:  # FAISS pads with -1 when fewer than k results exist
            continue
        results.append({
            "score": float(score),
            "doc_id": int(idx),
            "text": id2doc[int(idx)]
        })
    return results


#5) Try a few queries

In [5]:
for q in [
    "How do I do fast similarity search on embeddings?",
    "What is cosine similarity?",
    "Explain vector compression for search",
    "What is RAG?"
]:
    print(f"\nQuery: {q}")
    for r in search(q, k=3):
        print(f"  • score={r['score']:.3f} | {r['text']}")
# You'll see the most semantically relevant lines surface with the highest scores.


Query: How do I do fast similarity search on embeddings?
  • score=0.699 | FAISS is a library for efficient similarity search on dense vectors.
  • score=0.592 | Vector databases store embeddings to enable semantic search.
  • score=0.453 | Product Quantization compresses vectors for faster and cheaper search.

Query: What is cosine similarity?
  • score=0.826 | Cosine similarity compares the angle between two vectors.
  • score=0.397 | FAISS is a library for efficient similarity search on dense vectors.
  • score=0.338 | Euclidean distance computes straight-line distance in vector space.

Query: Explain vector compression for search
  • score=0.677 | Product Quantization compresses vectors for faster and cheaper search.
  • score=0.540 | Vector databases store embeddings to enable semantic search.
  • score=0.502 | FAISS is a library for efficient similarity search on dense vectors.

Query: What is RAG?
  • score=0.399 | Retrieval Augmented Generation (RAG) uses a retriever before a 

#(Optional) Save & Load the index

In [6]:
# Save index to disk
faiss.write_index(index, "faiss_flat_ip.index")

# Later / elsewhere
index_loaded = faiss.read_index("faiss_flat_ip.index")


#Common Pitfalls & Tips
```
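Cosine vs L2: For cosine similarity, normalize vectors to unit length and use IndexFlatIP. IndexFlatL2 on unnormalized vectors ranks by Euclidean distance, which is not cosine. On unit vectors the two give the same ranking, since squared L2 distance = 2 − 2·cosine.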

Types/dims: Ensure all vectors are float32 and same dimension.

Metadata: FAISS stores only vectors and integer ids; you must keep your own id→doc mapping.

Reproducibility: Fix model/version; embeddings change if the model changes.

Scaling: Start with Flat; move to IVF/HNSW/PQ when data grows.
```