# 🔍 FAISS Vector Search Tutorial

A comprehensive guide to building high-performance local vector search using Facebook AI Similarity Search (FAISS) and sentence transformers. This tutorial demonstrates how to create efficient similarity searches without relying on cloud services.

## 📋 Prerequisites

Before starting, ensure you have:
- Python 3.7+ installed
- Required packages: `faiss-cpu` (or `faiss-gpu`), `sentence-transformers`, `numpy`

```bash
pip install faiss-cpu sentence-transformers numpy
```

> **💡 Note:** Use `faiss-gpu` instead of `faiss-cpu` if you have a CUDA-compatible GPU and want faster indexing and search.


## 🚀 Implementation

### 1. Import Required Libraries

Import FAISS for vector indexing, NumPy for array operations, and sentence-transformers for generating embeddings.

```python
import faiss
import numpy as np
from sentence_transformers import SentenceTransformer
```

> **📚 Library Overview:**
> - **FAISS**: Facebook's library for efficient similarity search
> - **NumPy**: Essential for numerical computations and array handling
> - **SentenceTransformers**: Converts text to meaningful vector embeddings

### 2. Sample Documents Collection

Define the same diverse set of documents we used in the Pinecone tutorial for consistency and comparison purposes.

```python
# Using the same documents from Part 1
documents = [
    "The quick brown fox jumps over the lazy dog",
    "Artificial intelligence is transforming technology",
    "Python is a popular programming language",
    "Machine learning models require large datasets",
    "Vector databases enable fast similarity search",
    "Natural language processing analyzes text data",
    "Deep learning uses neural networks",
    "Data science combines statistics and programming",
    "Cloud computing provides scalable infrastructure",
    "Software development involves writing code"
]
```

> **🔄 Consistency**: Using identical documents allows for direct performance comparison between FAISS and Pinecone approaches.

### 3. Model Initialization and Embedding Generation

Load the sentence transformer model and convert all documents into high-dimensional vector representations.


```python
# Load model and generate embeddings
model = SentenceTransformer('all-MiniLM-L6-v2')
embeddings = model.encode(documents)
```

> **🤖 Model Benefits:**
> - **Lightweight**: Fast inference with good quality
> - **Versatile**: Works well for various text similarity tasks
> - **Consistent**: Same model as Pinecone tutorial for fair comparison

### 4. NumPy Array Conversion and Preprocessing

Convert the embeddings to a `float32` NumPy array, the format FAISS operates on. `model.encode` already returns a NumPy array by default, so the cast below mainly guards against input that arrives in another dtype.

```python
# Convert to a float32 NumPy array (FAISS requirement)
embeddings_np = np.array(embeddings).astype('float32')
print(f"Embeddings shape: {embeddings_np.shape}")
```

```
Embeddings shape: (10, 384)
```

> **⚡ Performance Note:** FAISS indexes work on `float32` vectors, which take half the memory of NumPy's default `float64`.
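
The conversion step can be wrapped in a small helper. This is a minimal sketch: `to_faiss_array` is a hypothetical name, not a FAISS function, and the nested list stands in for real model output.

```python
import numpy as np

def to_faiss_array(vectors):
    """Return `vectors` as a 2-D, C-contiguous float32 array (hypothetical helper)."""
    arr = np.ascontiguousarray(vectors, dtype=np.float32)
    if arr.ndim == 1:
        # A single vector becomes a batch of one row
        arr = arr.reshape(1, -1)
    if arr.ndim != 2:
        raise ValueError(f"expected a 2-D array, got shape {arr.shape}")
    return arr

# Stand-in for model output: float64 values in a plain Python list
raw = [[0.1, 0.2, 0.3], [0.4, 0.5, 0.6]]
prepared = to_faiss_array(raw)
print(prepared.dtype, prepared.shape, prepared.flags["C_CONTIGUOUS"])
```

Passing a single 1-D vector returns a `(1, d)` array, which matches the batch shape that `index.search` expects for queries.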

### 5. FAISS Index Creation and Configuration

Create a FAISS index that scores by inner product, which equals cosine similarity once the vectors are normalized to unit length.

```python
# Create FAISS index
dimension = embeddings_np.shape[1]
index = faiss.IndexFlatIP(dimension)  # Inner product (cosine similarity on normalized vectors)
```


> **🔧 Index Types:**
> - **IndexFlatIP**: Exact search using inner product
> - **IndexFlatL2**: Exact search using L2 (Euclidean) distance
> - **IndexIVFFlat**: Approximate search for large datasets
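
For unit-length vectors the two exact index types rank results identically, because the squared L2 distance equals `2 - 2 * (a . b)`. The sketch below checks this with plain NumPy on random vectors; it is a stand-in for the scoring, not a call into FAISS.

```python
import numpy as np

rng = np.random.default_rng(0)
vectors = rng.normal(size=(5, 8))
vectors /= np.linalg.norm(vectors, axis=1, keepdims=True)  # unit length
query = vectors[0]

inner = vectors @ query                         # inner-product score (IndexFlatIP style)
sq_l2 = np.sum((vectors - query) ** 2, axis=1)  # squared L2 distance (IndexFlatL2 style)

# For unit vectors: ||a - b||^2 = 2 - 2 * (a . b)
identity_holds = np.allclose(sq_l2, 2 - 2 * inner)
# Highest inner product first is the same order as smallest distance first
same_ranking = np.array_equal(np.argsort(-inner), np.argsort(sq_l2))
print(identity_holds, same_ranking)
```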

### 6. Vector Normalization for Cosine Similarity

Normalize vectors to unit length so that inner product becomes equivalent to cosine similarity.

```python
# Normalize vectors for cosine similarity (modifies embeddings_np in place)
faiss.normalize_L2(embeddings_np)
```

> **📐 Mathematical Note:** After L2 normalization, inner product = cosine similarity, enabling consistent results with Pinecone.
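
The equivalence can be checked by hand on two small vectors. In this sketch, `l2_normalize` is a NumPy stand-in written for illustration; unlike `faiss.normalize_L2`, it returns a new array instead of modifying its input.

```python
import numpy as np

def l2_normalize(x):
    """Return a row-normalized copy of x (illustrative stand-in)."""
    return x / np.linalg.norm(x, axis=1, keepdims=True)

a = np.array([[3.0, 4.0]])
b = np.array([[4.0, 3.0]])

# Cosine similarity from its definition: (a . b) / (|a| * |b|)
cosine = (a @ b.T).item() / (np.linalg.norm(a) * np.linalg.norm(b))
# Inner product of the normalized vectors
inner_after_norm = (l2_normalize(a) @ l2_normalize(b).T).item()

print(cosine, inner_after_norm)
```

Both values come out to 24/25 = 0.96, up to floating-point rounding.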

### 7. Add Vectors to Index

Insert all normalized embeddings into the FAISS index for fast retrieval.

```python
# Add vectors to index
index.add(embeddings_np)
print(f"Added {index.ntotal} vectors to FAISS index")
```

```
Added 10 vectors to FAISS index
```


> **💾 Storage**: FAISS stores vectors in memory for ultra-fast access during search operations.

### 8. Query Processing and Search Execution

Convert the search query to an embedding and perform similarity search to find the most relevant documents.

```python
# Search query
query = "What is AI and machine learning?"
query_embedding = model.encode([query])
query_np = np.array(query_embedding).astype('float32')
faiss.normalize_L2(query_np)

# Search
k = 3  # number of results
scores, indices = index.search(query_np, k)

print(f"\nFAISS Query: {query}")
print(f"\nTop {k} results:")
for i, (score, idx) in enumerate(zip(scores[0], indices[0])):
    print(f"{i+1}. Score: {score:.3f}")
    print(f"   Document: {documents[idx]}")
    print()
```

```
FAISS Query: What is AI and machine learning?

Top 3 results:
1. Score: 0.531
   Document: Artificial intelligence is transforming technology

2. Score: 0.438
   Document: Deep learning uses neural networks

3. Score: 0.365
   Document: Data science combines statistics and programming
```


> **🎯 Search Process:**
> 1. Encode query using same model
> 2. Normalize query vector
> 3. Find k nearest neighbors
> 4. Return similarity scores and document indices
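
Steps 2 to 4 can be mirrored in plain NumPy to show what an exact inner-product search computes. This is a brute-force sketch for illustration, not FAISS's implementation; `brute_force_search` is a hypothetical name and the 2-D vectors stand in for real embeddings.

```python
import numpy as np

def brute_force_search(index_vectors, query_vectors, k):
    """Exact top-k inner-product search; expects unit-normalized rows."""
    scores = query_vectors @ index_vectors.T      # shape (n_queries, n_vectors)
    top = np.argsort(-scores, axis=1)[:, :k]      # indices of the best k per query
    top_scores = np.take_along_axis(scores, top, axis=1)
    return top_scores, top

# Toy 2-D "embeddings" standing in for model.encode output
docs = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
docs /= np.linalg.norm(docs, axis=1, keepdims=True)
query = np.array([[0.9, 0.1]])
query /= np.linalg.norm(query, axis=1, keepdims=True)

scores, indices = brute_force_search(docs, query, k=2)
print(indices[0], scores[0])
```

The query points mostly along the first axis, so document 0 ranks first and the diagonal document 2 ranks second.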

---

## 🎯 Expected Output

Running the cells above in order prints the following. Exact scores can differ slightly across model and library versions.

```
Embeddings shape: (10, 384)
Added 10 vectors to FAISS index

FAISS Query: What is AI and machine learning?

Top 3 results:
1. Score: 0.531
   Document: Artificial intelligence is transforming technology

2. Score: 0.438
   Document: Deep learning uses neural networks

3. Score: 0.365
   Document: Data science combines statistics and programming
```

---

## ⚡ Performance Comparison: FAISS vs Pinecone

| Aspect | FAISS | Pinecone |
|--------|-------|----------|
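| **Speed** | In-process search, no network round trip | Adds network latency to every request |
| **Cost** | Free and open source (you supply the hardware) | Pay-per-use |
| **Scalability** | Limited by the host's RAM (or GPU memory) | Scales as a managed service |
| **Setup** | Simple pip install | Account + API key |
| **Persistence** | Manual (`faiss.write_index` / `faiss.read_index`) | Automatic |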
| **Deployment** | Local/self-hosted | Managed service |

---

## 🔧 Advanced FAISS Features

### 1. Saving and Loading Index

Persist your FAISS index to disk for reuse:


```python
# Save index to disk
faiss.write_index(index, "documents.index")

# Load index from disk
loaded_index = faiss.read_index("documents.index")
```
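
The saved index holds vectors only, so the document texts have to be persisted separately and kept in the same order. A minimal sketch using a JSON sidecar file follows; the file name `documents.json` is an assumption for illustration, and a temporary directory is used so the example cleans up after itself.

```python
import json
import tempfile
from pathlib import Path

documents = [
    "Artificial intelligence is transforming technology",
    "Python is a popular programming language",
]

with tempfile.TemporaryDirectory() as tmp:
    # Hypothetical sidecar file; in practice store it next to the index file
    docs_path = Path(tmp) / "documents.json"
    docs_path.write_text(json.dumps(documents, ensure_ascii=False), encoding="utf-8")

    # Later, after loading the index: row i of the index maps to restored[i]
    restored = json.loads(docs_path.read_text(encoding="utf-8"))

print(restored == documents)
```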

### 2. Approximate Search for Large Datasets

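For millions of vectors, use approximate search. An IVF index clusters the vectors into `nlist` groups during training and visits only `nprobe` of them per query.

The number of clusters cannot exceed the number of training vectors. With `nlist = 100` on the 10-document sample, training fails with `RuntimeError: ... Number of training points (10) should be at least as large as number of clusters (100)`. The example below uses `nlist = 2` so that it runs on the sample data; FAISS may still warn that there are too few training points. It also passes `faiss.METRIC_INNER_PRODUCT`, because `IndexIVFFlat` defaults to L2 distance.

```python
# Create approximate index (tiny nlist so it trains on the 10 sample vectors)
nlist = 2  # number of clusters; use a much larger value on real datasets
quantizer = faiss.IndexFlatIP(dimension)
index_ivf = faiss.IndexIVFFlat(quantizer, dimension, nlist, faiss.METRIC_INNER_PRODUCT)

# Train the index (required for IVF)
index_ivf.train(embeddings_np)
index_ivf.add(embeddings_np)

# Set search parameters
index_ivf.nprobe = 2  # number of clusters to search (at most nlist)
```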
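
A quick sanity check on `nlist` before training can catch the failure with `nlist = 100` shown above. The sketch below is a hypothetical helper, not part of FAISS; the threshold of 39 points per cluster is assumed to match the default at which FAISS's k-means starts warning about too little training data.

```python
def check_nlist(n_train, nlist, min_points_per_centroid=39):
    """Classify an nlist choice (hypothetical helper, not part of FAISS).

    min_points_per_centroid=39 is an assumed FAISS clustering default.
    """
    if nlist > n_train:
        return "error"    # training raises: fewer training points than clusters
    if n_train < nlist * min_points_per_centroid:
        return "warning"  # training runs, but clusters are poorly estimated
    return "ok"

print(check_nlist(10, 100))        # 10 sample vectors, 100 clusters
print(check_nlist(10, 2))          # trains, but on very little data
print(check_nlist(100_000, 1000))  # enough points per cluster
```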

### 3. GPU Acceleration

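Move the index to a GPU for faster search. This requires a GPU build of FAISS; with `faiss-cpu` installed, calling `faiss.StandardGpuResources()` fails with `AttributeError: module 'faiss' has no attribute 'StandardGpuResources'`, so the example checks for GPU support first.

```python
# Move index to GPU (requires faiss-gpu)
if hasattr(faiss, "StandardGpuResources"):
    res = faiss.StandardGpuResources()
    gpu_index = faiss.index_cpu_to_gpu(res, 0, index)  # 0 = GPU device id
else:
    print("This FAISS build has no GPU support; keeping the CPU index")
```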

### 4. Batch Search

Search multiple queries simultaneously:

```python
# Multiple queries
queries = ["AI and ML", "Python programming", "Cloud services"]
query_embeddings = model.encode(queries)
query_np = np.array(query_embeddings).astype('float32')
faiss.normalize_L2(query_np)

# Batch search: scores and indices each have shape (len(queries), k)
scores, indices = index.search(query_np, k=3)
```
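
A batch search returns one row of scores and one row of indices per query. The sketch below pairs each query with its hits; `format_batch_results` is a hypothetical helper, and the arrays are hand-written stand-ins for what `index.search` returns. It skips entries with index `-1`, which FAISS uses as padding when fewer than `k` results are available.

```python
import numpy as np

def format_batch_results(queries, documents, scores, indices):
    """Pair each query with its (document, score) hits, skipping -1 padding."""
    results = {}
    for query, row_scores, row_ids in zip(queries, scores, indices):
        results[query] = [
            (documents[i], float(s))
            for s, i in zip(row_scores, row_ids)
            if i != -1
        ]
    return results

documents = ["doc A", "doc B", "doc C"]
queries = ["first query", "second query"]
# Hand-written stand-ins for the arrays returned by index.search
scores = np.array([[0.9, 0.5], [0.7, 0.0]], dtype="float32")
indices = np.array([[2, 0], [1, -1]])

results = format_batch_results(queries, documents, scores, indices)
for query, hits in results.items():
    print(query, hits)
```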

## 🛠️ Troubleshooting

| Issue | Solution |
|-------|----------|
| Import Error | Install correct FAISS version: `pip install faiss-cpu` |
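| Wrong Dimensions | Ensure the query embedding has the same dimension as the index (`index.d`) and comes from the same model |
| Poor Results | Check that both the indexed vectors and the query vectors are L2-normalized |
| Memory Issues | Use a compressed index such as `IndexIVFPQ` for large datasets |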
| Slow Performance | Consider GPU acceleration or index optimization |

---

## 📊 Index Types Comparison

| Index Type | Use Case | Speed | Memory | Accuracy |
|------------|----------|-------|---------|----------|
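| `IndexFlatIP` | Small datasets (<1M), inner product | Exhaustive scan, slows linearly with size | High (full vectors) | Exact |
| `IndexFlatL2` | Small datasets, Euclidean distance | Exhaustive scan, slows linearly with size | High (full vectors) | Exact |
| `IndexIVFFlat` | Medium datasets (1M-10M) | Faster than flat, depends on `nprobe` | High (full vectors plus cluster lists) | Approximate, improves with higher `nprobe` |
| `IndexIVFPQ` | Large datasets (>10M) | Fast | Low (compressed codes) | Approximate, reduced by compression |
| `IndexHNSWFlat` | Real-time search | Very fast | High (full vectors plus graph links) | Approximate, tunable via `efSearch` |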
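
The "Memory" column for flat indexes follows from simple arithmetic: each vector takes `dimension * 4` bytes in `float32`. The sketch below estimates the raw vector storage only; `flat_index_bytes` is a hypothetical helper, and any index overhead is ignored.

```python
def flat_index_bytes(n_vectors, dimension, bytes_per_value=4):
    """Approximate RAM for the raw float32 vectors in a flat index."""
    return n_vectors * dimension * bytes_per_value

tutorial_bytes = flat_index_bytes(10, 384)          # the 10 sample documents
million_bytes = flat_index_bytes(1_000_000, 384)    # one million 384-dim vectors

print(tutorial_bytes, "bytes")
print(million_bytes / 1e9, "GB")
```

At 384 dimensions, the sample index needs 15,360 bytes, while one million vectors need roughly 1.5 GB.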


## 📚 Additional Resources

- [FAISS Documentation](https://github.com/facebookresearch/faiss)
- [FAISS Tutorial](https://github.com/facebookresearch/faiss/wiki/Getting-started)
- [Performance Benchmarks](https://github.com/facebookresearch/faiss/wiki/Benchmarks)
- [Index Selection Guide](https://github.com/facebookresearch/faiss/wiki/Guidelines-to-choose-an-index)

---

## 🎉 Next Steps

1. **Scale Testing**: Benchmark with larger document collections
2. **Index Optimization**: Experiment with different FAISS index types
3. **Production Deploy**: Implement as a REST API service
4. **Hybrid Search**: Combine with traditional keyword search
5. **Real-time Updates**: Handle dynamic document additions/deletions

---

## 💡 Pro Tips

- **Memory Management**: Use `IndexIVFPQ` for large datasets to reduce memory usage
- **Search Speed**: Tune the `nprobe` parameter of IVF indexes; higher values search more clusters, which improves accuracy at the cost of speed
- **Batch Processing**: Process multiple queries together for better throughput
- **Monitoring**: Track search latency and memory usage in production
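
For the monitoring tip, search latency can be sampled with the standard library alone. This is a minimal sketch: `measure_latency` is a hypothetical helper, and the summed range is a stand-in workload where a real setup would pass something like `lambda: index.search(query_np, k)`.

```python
import statistics
import time

def measure_latency(search_fn, n_runs=20):
    """Call search_fn repeatedly and summarize wall-clock latency in milliseconds."""
    timings_ms = []
    for _ in range(n_runs):
        start = time.perf_counter()
        search_fn()
        timings_ms.append((time.perf_counter() - start) * 1000)
    return {
        "median_ms": statistics.median(timings_ms),
        "max_ms": max(timings_ms),
    }

# Stand-in workload in place of a real index.search call
stats = measure_latency(lambda: sum(range(10_000)))
print(stats)
```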

Happy local vector searching! 🚀