# Vector Database Implementation: ChromaDB & FAISS

This notebook is a hands-on walkthrough of vector databases using **ChromaDB** and **FAISS**. It covers installation, creating collections, generating embeddings, inserting vectors, similarity search, filtering, and persistence.

## Table of Contents
1. [Introduction to Vector Databases](#introduction)
2. [ANN Algorithms Overview](#ann-algorithms)
3. [ChromaDB Implementation](#chromadb)
4. [FAISS Implementation](#faiss)
5. [Comparison & Best Practices](#comparison)

## 1. Introduction to Vector Databases <a id="introduction"></a>

Vector databases are specialized databases designed to store, index, and query high-dimensional vectors (embeddings). They are essential for:

- **Semantic search**: Finding similar documents based on meaning
- **Recommendation systems**: Finding similar items/users
- **RAG (Retrieval Augmented Generation)**: Providing context to LLMs
- **Image/Audio similarity**: Finding similar media files

### Key Concepts

| Concept | Description |
|---------|-------------|
| **Embedding** | A dense vector representation of data (text, image, etc.) |
| **Dimension** | The number of elements in an embedding vector (e.g., 384, 768, 1536) |
| **Distance Metric** | How similarity is measured (cosine, L2/Euclidean, inner product) |
| **ANN (Approximate Nearest Neighbor)** | Algorithms that trade some accuracy for speed |
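
To make the distance metrics in the table concrete, here is a minimal pure-Python sketch. The helper names (`inner_product`, `l2_distance`, `cosine_similarity`) are illustrative and not part of ChromaDB or FAISS; both libraries compute these metrics internally with optimized code.

```python
import math

def inner_product(a, b):
    """Sum of element-wise products; larger means more similar."""
    return sum(x * y for x, y in zip(a, b))

def l2_distance(a, b):
    """Euclidean distance; smaller means more similar."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def cosine_similarity(a, b):
    """Inner product divided by both vector lengths; ignores magnitude."""
    norm_a = math.sqrt(inner_product(a, a))
    norm_b = math.sqrt(inner_product(b, b))
    return inner_product(a, b) / (norm_a * norm_b)

a = [1.0, 0.0]
b = [2.0, 0.0]  # same direction as a, twice the length
c = [0.0, 1.0]  # orthogonal to a

print(cosine_similarity(a, b))  # direction matches, so cosine ignores the length gap
print(l2_distance(a, b))        # L2 still sees the length gap
print(inner_product(a, c))      # orthogonal vectors
```

Cosine treats `a` and `b` as identical because only direction matters, while L2 distance penalizes the difference in length. This is why the choice of metric should match how the embedding model was trained.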

## 2. ANN Algorithms Overview <a id="ann-algorithms"></a>

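Exact nearest neighbor search compares the query against every stored vector, so each query costs O(n·d) for n vectors of dimension d. That becomes impractical for large datasets. ANN algorithms return near-optimal results while examining only a small fraction of the data.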
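
As a baseline, exact search can be sketched in a few lines of plain Python. The function name `exact_knn` and the toy data are assumptions made for illustration; the point is that every stored vector is scored for every query.

```python
import heapq

def exact_knn(query, vectors, k):
    """Return the k closest (squared_distance, index) pairs by scanning all vectors."""
    def sq_l2(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    # One distance computation per stored vector: linear in the dataset size
    scored = ((sq_l2(query, v), i) for i, v in enumerate(vectors))
    return heapq.nsmallest(k, scored)

vectors = [[0.0, 0.0], [1.0, 1.0], [5.0, 5.0], [0.9, 1.2]]
for dist, idx in exact_knn([1.0, 1.0], vectors, k=2):
    print(idx, round(dist, 4))
```

A flat index behaves like this scan, just with vectorized arithmetic. ANN indices avoid the full scan by visiting only a subset of candidates.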

### 2.1 HNSW (Hierarchical Navigable Small World)

HNSW builds a multi-layer graph structure where:
- **Upper layers**: Sparse connections for fast long-range navigation
- **Lower layers**: Dense connections for precise local search

```
Layer 2:  [A]---------------[D]
           |                 |
Layer 1:  [A]---------[C]---[D]---------[F]
           |           |     |           |
Layer 0:  [A]---[B]---[C]---[D]---[E]---[F]---[G]
```

**Key Parameters:**
- `M`: Maximum number of connections per node (higher = better recall, more memory)
- `ef_construction`: Size of dynamic candidate list during construction
- `ef_search`: Size of dynamic candidate list during search

**Characteristics:**
- ✅ Excellent query performance (roughly O(log n) in practice)
- ✅ High recall (accuracy)
- ❌ Higher memory usage
- ❌ Slower index construction
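
The layered navigation can be sketched as a greedy descent. This is a simplified illustration, not the full algorithm: real HNSW keeps a candidate list of size `ef_search` on the bottom layer, whereas this sketch tracks a single best node. The graph, the 1-D vectors, and the function names are all made up for the example.

```python
def sq_l2(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def greedy_search(graph, vectors, query, entry):
    """Move to the neighbor closest to the query until no neighbor improves."""
    current = entry
    current_dist = sq_l2(vectors[current], query)
    while True:
        best = min(graph[current], key=lambda n: sq_l2(vectors[n], query))
        best_dist = sq_l2(vectors[best], query)
        if best_dist >= current_dist:
            return current  # local minimum on this layer
        current, current_dist = best, best_dist

def layered_search(layers, vectors, query, entry):
    """Descend from the sparsest layer to layer 0, reusing each result as the next entry."""
    for graph in layers:  # ordered from top (sparse) to bottom (dense)
        entry = greedy_search(graph, vectors, query, entry)
    return entry

# Toy data: seven points on a line
vectors = {name: [float(i)] for i, name in enumerate("ABCDEFG")}
layers = [
    {"A": ["D"], "D": ["A"]},                                        # layer 2
    {"A": ["C"], "C": ["A", "D"], "D": ["C", "F"], "F": ["D"]},      # layer 1
    {"A": ["B"], "B": ["A", "C"], "C": ["B", "D"], "D": ["C", "E"],
     "E": ["D", "F"], "F": ["E", "G"], "G": ["F"]},                  # layer 0
]

print(layered_search(layers, vectors, query=[4.2], entry="A"))  # E
```

The top layer jumps from `A` to `D` in one hop, the middle layer refines to `F`, and the bottom layer settles on `E`. The long-range links are what keep the number of hops small as the dataset grows.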

### 2.2 IVF (Inverted File Index)

IVF partitions the vector space into clusters (Voronoi cells) using k-means:

```
          Centroid 1          Centroid 2          Centroid 3
              *                   *                   *
           /  |  \             /  |  \             /  |  \
          v1  v2  v3          v4  v5  v6          v7  v8  v9
```

**Key Parameters:**
- `nlist`: Number of clusters/cells
- `nprobe`: Number of clusters to search (higher = more accurate, slower)

**Characteristics:**
- ✅ Lower memory footprint
- ✅ Fast index construction
- ✅ Can be combined with Product Quantization (IVF-PQ)
- ❌ Lower recall than HNSW at same speed
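
The effect of `nprobe` can be shown with a small pure-Python sketch. The centroids here are hand-picked for illustration (a real IVF index learns them with k-means), and `build_ivf` and `ivf_search` are hypothetical helper names, not FAISS functions.

```python
def sq_l2(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def build_ivf(vectors, centroids):
    """Assign every vector index to the inverted list of its nearest centroid."""
    lists = {c: [] for c in range(len(centroids))}
    for i, v in enumerate(vectors):
        nearest = min(range(len(centroids)), key=lambda c: sq_l2(v, centroids[c]))
        lists[nearest].append(i)
    return lists

def ivf_search(query, vectors, centroids, lists, k, nprobe):
    """Scan only the nprobe cells whose centroids are closest to the query."""
    by_closeness = sorted(range(len(centroids)), key=lambda c: sq_l2(query, centroids[c]))
    candidates = [i for c in by_closeness[:nprobe] for i in lists[c]]
    return sorted(candidates, key=lambda i: sq_l2(query, vectors[i]))[:k]

# Assumed, hand-picked centroids and data
centroids = [[0.0, 0.0], [10.0, 0.0]]
vectors = [[1.0, 0.0], [4.0, 0.0], [6.0, 0.0], [9.0, 0.0]]
lists = build_ivf(vectors, centroids)

query = [5.2, 0.0]
print(ivf_search(query, vectors, centroids, lists, k=2, nprobe=1))  # [2, 3]
print(ivf_search(query, vectors, centroids, lists, k=2, nprobe=2))  # [2, 1]
```

With `nprobe=1` the search misses vector 1, which is the true second-nearest neighbor but sits just across the cell boundary. Raising `nprobe` recovers it at the cost of scanning more vectors.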

### 2.3 Algorithm Comparison

| Algorithm | Build Time | Query Time | Memory | Recall | Best For |
|-----------|------------|------------|--------|--------|----------|
| HNSW | Slow | Fast | High | High | Real-time search, high accuracy |
| IVF | Fast | Medium | Low | Medium | Large datasets, memory constrained |
| IVF-PQ | Fast | Fast | Very Low | Lower | Billion-scale datasets |

## 3. ChromaDB Implementation <a id="chromadb"></a>

ChromaDB is an open-source embedding database designed for AI applications. It's simple to use and supports persistence out of the box.

In [None]:
# Installation
# !pip install chromadb sentence-transformers

In [None]:
import chromadb
from chromadb.utils import embedding_functions

# Create an in-memory client
client = chromadb.Client()

# Or create a persistent client (data survives restarts)
# client = chromadb.PersistentClient(path="./chroma_db")

print(f"ChromaDB version: {chromadb.__version__}")
print(f"Client type: {type(client).__name__}")

### 3.1 Creating Collections

In [None]:
# Create a collection with default embedding function
# ChromaDB uses all-MiniLM-L6-v2 by default (384 dimensions)
collection = client.create_collection(
    name="documents",
    metadata={"hnsw:space": "cosine"}  # Distance metric: "cosine", "l2" (the default), or "ip"
)

print(f"Collection created: {collection.name}")
print(f"Collection metadata: {collection.metadata}")

In [None]:
# Using a custom embedding function
sentence_transformer_ef = embedding_functions.SentenceTransformerEmbeddingFunction(
    model_name="all-MiniLM-L6-v2"  # 384 dimensions, fast and efficient
)

# Create collection with custom embedding function
custom_collection = client.get_or_create_collection(
    name="custom_embeddings",
    embedding_function=sentence_transformer_ef,
    metadata={
        "hnsw:space": "cosine",
        "hnsw:M": 16,  # Number of connections per node
        "hnsw:construction_ef": 100  # Construction-time search width
    }
)

print(f"Custom collection: {custom_collection.name}")

### 3.2 Inserting Documents with Embeddings

In [None]:
# Sample documents about programming languages
documents = [
    "Python is a high-level programming language known for its simplicity and readability.",
    "JavaScript is the language of the web, used for frontend and backend development.",
    "Rust provides memory safety without garbage collection, ideal for systems programming.",
    "Go is designed for simplicity and efficiency, great for cloud and network services.",
    "TypeScript adds static typing to JavaScript, improving code quality and maintainability.",
    "Java is a robust, platform-independent language widely used in enterprise applications.",
    "C++ offers high performance and is used in game development and system software.",
    "Kotlin is a modern language for Android development, fully interoperable with Java."
]

# Metadata for filtering
metadatas = [
    {"type": "interpreted", "paradigm": "multi", "year": 1991},
    {"type": "interpreted", "paradigm": "multi", "year": 1995},
    {"type": "compiled", "paradigm": "multi", "year": 2010},
    {"type": "compiled", "paradigm": "procedural", "year": 2009},
    {"type": "transpiled", "paradigm": "multi", "year": 2012},
    {"type": "compiled", "paradigm": "oop", "year": 1995},
    {"type": "compiled", "paradigm": "multi", "year": 1983},
    {"type": "compiled", "paradigm": "multi", "year": 2011}
]

# Unique IDs for each document
ids = [f"doc_{i}" for i in range(len(documents))]

# Add documents to collection (embeddings generated automatically)
collection.add(
    documents=documents,
    metadatas=metadatas,
    ids=ids
)

print(f"Added {collection.count()} documents to the collection")

In [None]:
# Adding documents with pre-computed embeddings
from sentence_transformers import SentenceTransformer

# Load embedding model
model = SentenceTransformer('all-MiniLM-L6-v2')

# Additional documents
new_docs = [
    "SQL is essential for database querying and data manipulation.",
    "HTML and CSS are the building blocks of web pages."
]

# Generate embeddings manually
embeddings = model.encode(new_docs).tolist()

# Add with pre-computed embeddings
collection.add(
    embeddings=embeddings,
    documents=new_docs,
    metadatas=[
        {"type": "query", "paradigm": "declarative", "year": 1974},
        {"type": "markup", "paradigm": "declarative", "year": 1993}
    ],
    ids=["doc_8", "doc_9"]
)

print(f"Total documents: {collection.count()}")
print(f"Embedding dimension: {len(embeddings[0])}")

### 3.3 Similarity Search

In [None]:
# Basic similarity search
# With the "cosine" space, the reported distance is 1 - cosine similarity (lower = closer)
results = collection.query(
    query_texts=["What programming language is good for web development?"],
    n_results=3
)

print("Query: What programming language is good for web development?\n")
print("Top 3 Results:")
for i, (doc, distance, metadata) in enumerate(zip(
    results['documents'][0], 
    results['distances'][0],
    results['metadatas'][0]
)):
    print(f"{i+1}. [Distance: {distance:.4f}] {doc}")
    print(f"   Metadata: {metadata}\n")

In [None]:
# Search with pre-computed query embedding
query_text = "systems programming with memory safety"
query_embedding = model.encode([query_text]).tolist()

results = collection.query(
    query_embeddings=query_embedding,
    n_results=3,
    include=["documents", "distances", "metadatas", "embeddings"]
)

print(f"Query: {query_text}\n")
for i, (doc, dist) in enumerate(zip(results['documents'][0], results['distances'][0])):
    print(f"{i+1}. [Distance: {dist:.4f}] {doc}")

### 3.4 Filtering with Metadata

In [None]:
# Filter by single condition
results = collection.query(
    query_texts=["modern programming language"],
    n_results=5,
    where={"type": "compiled"}  # Only compiled languages
)

print("Compiled languages matching 'modern programming language':")
for doc, meta in zip(results['documents'][0], results['metadatas'][0]):
    print(f"  - {doc[:60]}... ({meta})")

In [None]:
# Complex filtering with operators
results = collection.query(
    query_texts=["popular programming language"],
    n_results=5,
    where={
        "$and": [
            {"year": {"$gte": 2000}},  # Year >= 2000
            {"paradigm": {"$ne": "declarative"}}  # Not declarative
        ]
    }
)

print("Modern languages (year >= 2000, not declarative):")
for doc, meta in zip(results['documents'][0], results['metadatas'][0]):
    print(f"  - {meta['year']}: {doc[:50]}...")

In [None]:
# Filter by document content
results = collection.query(
    query_texts=["programming"],
    n_results=5,
    where_document={"$contains": "web"}  # Document must contain "web"
)

print("Documents containing 'web':")
for doc in results['documents'][0]:
    print(f"  - {doc}")

### 3.5 Updating and Deleting

In [None]:
# Update a document
collection.update(
    ids=["doc_0"],
    documents=["Python is a versatile, high-level language popular in AI, web development, and data science."],
    metadatas=[{"type": "interpreted", "paradigm": "multi", "year": 1991, "updated": True}]
)

# Verify update
result = collection.get(ids=["doc_0"])
print("Updated document:")
print(f"  Text: {result['documents'][0]}")
print(f"  Metadata: {result['metadatas'][0]}")

In [None]:
# Delete documents
print(f"Documents before delete: {collection.count()}")

# Delete by ID
collection.delete(ids=["doc_9"])

# Delete by filter (doc_9 was the only "markup" document, so this matches
# nothing here; it is shown for the syntax)
collection.delete(where={"type": "markup"})

print(f"Documents after delete: {collection.count()}")

### 3.6 Persistence

In [None]:
import os
import shutil

# Clean up any existing database
persist_path = "./chroma_persistent_db"
if os.path.exists(persist_path):
    shutil.rmtree(persist_path)

# Create persistent client
persistent_client = chromadb.PersistentClient(path=persist_path)

# Create and populate a collection
persistent_collection = persistent_client.create_collection("persistent_docs")
persistent_collection.add(
    documents=["This data will persist across sessions"],
    ids=["persistent_1"]
)

print(f"Data saved to: {persist_path}")
print(f"Directory contents: {os.listdir(persist_path)}")

In [None]:
# Simulate restarting - create new client with same path
del persistent_client  # Drop our reference to the first client

# Reconnect to persisted data
restored_client = chromadb.PersistentClient(path=persist_path)
restored_collection = restored_client.get_collection("persistent_docs")

# Verify data persisted
result = restored_collection.get(ids=["persistent_1"])
print(f"Restored document: {result['documents'][0]}")

# Cleanup
shutil.rmtree(persist_path)

## 4. FAISS Implementation <a id="faiss"></a>

FAISS (Facebook AI Similarity Search) is a library for efficient similarity search and clustering of dense vectors. It's highly optimized and supports GPU acceleration.

In [None]:
# Installation
# !pip install faiss-cpu  # CPU version
# !pip install faiss-gpu  # GPU version (requires CUDA)

In [None]:
import faiss
import numpy as np
from sentence_transformers import SentenceTransformer

print(f"FAISS version: {faiss.__version__}")
print(f"Number of available threads: {faiss.omp_get_max_threads()}")

### 4.1 Creating Indices

FAISS offers multiple index types for different use cases:

In [None]:
# Sample data
documents = [
    "Machine learning is a subset of artificial intelligence.",
    "Deep learning uses neural networks with many layers.",
    "Natural language processing enables computers to understand text.",
    "Computer vision allows machines to interpret visual data.",
    "Reinforcement learning trains agents through rewards and penalties.",
    "Transfer learning reuses models trained on different tasks.",
    "Generative AI creates new content like text and images.",
    "Supervised learning uses labeled data for training."
]

# Generate embeddings
model = SentenceTransformer('all-MiniLM-L6-v2')
embeddings = model.encode(documents)

# Convert to float32 (FAISS requirement)
embeddings = np.array(embeddings).astype('float32')

print(f"Embeddings shape: {embeddings.shape}")
print(f"Dimension: {embeddings.shape[1]}")

In [None]:
# Flat Index (exact search, brute force)
dimension = embeddings.shape[1]

# L2 (Euclidean) distance
index_flat_l2 = faiss.IndexFlatL2(dimension)

# Inner Product (for cosine similarity, normalize vectors first)
index_flat_ip = faiss.IndexFlatIP(dimension)

print(f"Index type: {type(index_flat_l2).__name__}")
print(f"Is trained: {index_flat_l2.is_trained}")
print(f"Total vectors: {index_flat_l2.ntotal}")

### 4.2 Inserting Vectors

In [None]:
# Add vectors to the index
index_flat_l2.add(embeddings)

print(f"Vectors added: {index_flat_l2.ntotal}")

In [None]:
# For cosine similarity, normalize embeddings and use Inner Product
normalized_embeddings = embeddings.copy()
faiss.normalize_L2(normalized_embeddings)  # In-place normalization

index_flat_ip.add(normalized_embeddings)
print(f"Normalized vectors added: {index_flat_ip.ntotal}")

### 4.3 Similarity Search

In [None]:
# Basic search with L2 distance
query = "How do neural networks learn?"
query_embedding = model.encode([query]).astype('float32')

k = 3  # Number of nearest neighbors
distances, indices = index_flat_l2.search(query_embedding, k)

print(f"Query: {query}\n")
print("Top 3 results (L2 distance):")
for i, (idx, dist) in enumerate(zip(indices[0], distances[0])):
    print(f"{i+1}. [Index: {idx}, Distance: {dist:.4f}] {documents[idx]}")

In [None]:
# Cosine similarity search (using normalized vectors)
query_normalized = query_embedding.copy()
faiss.normalize_L2(query_normalized)

similarities, indices = index_flat_ip.search(query_normalized, k)

print(f"Query: {query}\n")
print("Top 3 results (Cosine similarity):")
for i, (idx, sim) in enumerate(zip(indices[0], similarities[0])):
    print(f"{i+1}. [Index: {idx}, Similarity: {sim:.4f}] {documents[idx]}")

### 4.4 ANN Indices (HNSW and IVF)

In [None]:
# HNSW Index
M = 16  # Max number of graph neighbors per node
index_hnsw = faiss.IndexHNSWFlat(dimension, M)

# Build-time and search-time parameters (set efConstruction before adding vectors)
index_hnsw.hnsw.efConstruction = 40  # Higher = more accurate, slower build
index_hnsw.hnsw.efSearch = 16  # Higher = more accurate, slower search

# Add vectors
index_hnsw.add(embeddings)

# Search
distances, indices = index_hnsw.search(query_embedding, k)

print("HNSW Index Results:")
for i, (idx, dist) in enumerate(zip(indices[0], distances[0])):
    print(f"{i+1}. [Distance: {dist:.4f}] {documents[idx]}")

In [None]:
# IVF Index (requires training)
nlist = 4  # Number of clusters (rule of thumb: around sqrt(n); kept tiny for this 8-vector demo)

# Create quantizer (for coarse search)
quantizer = faiss.IndexFlatL2(dimension)

# Create IVF index
index_ivf = faiss.IndexIVFFlat(quantizer, dimension, nlist)

# Train on data (learn cluster centroids)
print(f"Before training - is_trained: {index_ivf.is_trained}")
index_ivf.train(embeddings)
print(f"After training - is_trained: {index_ivf.is_trained}")

# Add vectors
index_ivf.add(embeddings)

# Set search parameter
index_ivf.nprobe = 2  # Number of clusters to search

# Search
distances, indices = index_ivf.search(query_embedding, k)

print("\nIVF Index Results:")
for i, (idx, dist) in enumerate(zip(indices[0], distances[0])):
    if idx < 0:  # FAISS pads with -1 when the probed cells hold fewer than k vectors
        continue
    print(f"{i+1}. [Distance: {dist:.4f}] {documents[idx]}")

In [None]:
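# IVF with Product Quantization (for memory efficiency)
m = 8  # Number of sub-vectors (dimension must be divisible by m; 384 / 8 = 48)

# Each sub-quantizer learns 2**nbits centroids, so PQ training needs at least
# 2**nbits vectors. nbits=8 (256 centroids) is typical on real data, but this
# 8-vector demo can only support a very small codebook.
nbits = 2  # Bits per sub-vector code (demo value; typically 8)

# Use a fresh coarse quantizer rather than sharing the one already used by index_ivf
quantizer_pq = faiss.IndexFlatL2(dimension)
index_ivfpq = faiss.IndexIVFPQ(quantizer_pq, dimension, nlist, m, nbits)

# Train (learns coarse centroids and PQ codebooks), then add
index_ivfpq.train(embeddings)
index_ivfpq.add(embeddings)
index_ivfpq.nprobe = 2

# Search (distances are approximate because vectors are stored compressed)
distances, indices = index_ivfpq.search(query_embedding, k)

print("IVF-PQ Index Results:")
for i, (idx, dist) in enumerate(zip(indices[0], distances[0])):
    if idx < 0:  # -1 marks an empty result slot
        continue
    print(f"{i+1}. [Distance: {dist:.4f}] {documents[idx]}")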

### 4.5 Index Factory (Convenient Index Creation)

In [None]:
# FAISS Index Factory - create complex indices with a simple string
# Format: "preprocessing,coarse_quantizer,fine_quantizer"

# Examples:
index_strings = {
    "Flat": "Flat",  # Exact search
    "IVF100,Flat": "IVF100,Flat",  # IVF with 100 cells
    "IVF100,PQ8": "IVF100,PQ8",  # IVF + Product Quantization
    "HNSW32": "HNSW32",  # HNSW with M=32
    "HNSW32,Flat": "HNSW32,Flat",  # HNSW graph (M=32) over uncompressed (Flat) vector storage
}

# Create using factory
index = faiss.index_factory(dimension, "IVF4,Flat")
index.train(embeddings)
index.add(embeddings)

print(f"Index created: {index}")
print(f"Total vectors: {index.ntotal}")

### 4.6 ID Mapping (Storing Custom IDs)

In [None]:
# FAISS returns indices 0 to n-1 by default
# Use IndexIDMap to store custom IDs

# Create base index
base_index = faiss.IndexFlatL2(dimension)

# Wrap with ID mapping
index_with_ids = faiss.IndexIDMap(base_index)

# Custom IDs (e.g., database primary keys)
custom_ids = np.array([1001, 1002, 1003, 1004, 1005, 1006, 1007, 1008], dtype=np.int64)

# Add with custom IDs
index_with_ids.add_with_ids(embeddings, custom_ids)

# Search returns custom IDs
distances, indices = index_with_ids.search(query_embedding, k)

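# Build the ID -> position lookup once instead of scanning the array per result
id_to_position = {int(cid): pos for pos, cid in enumerate(custom_ids)}

print("Results with custom IDs:")
for i, (idx, dist) in enumerate(zip(indices[0], distances[0])):
    doc_index = id_to_position[int(idx)]  # Map back to document
    print(f"{i+1}. [ID: {idx}, Distance: {dist:.4f}] {documents[doc_index]}")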

### 4.7 Persistence (Saving and Loading)

In [None]:
import os

# Save index to disk
index_path = "./faiss_index.bin"
faiss.write_index(index_flat_l2, index_path)

print(f"Index saved to: {index_path}")
print(f"File size: {os.path.getsize(index_path) / 1024:.2f} KB")

In [None]:
# Load index from disk
loaded_index = faiss.read_index(index_path)

print(f"Loaded index: {type(loaded_index).__name__}")
print(f"Total vectors: {loaded_index.ntotal}")

# Verify search works
distances, indices = loaded_index.search(query_embedding, k)
print(f"\nSearch result indices: {indices[0]}")

# Cleanup
os.remove(index_path)

### 4.8 Filtering in FAISS

FAISS doesn't have built-in metadata filtering like ChromaDB. Here are common approaches:

In [None]:
# Approach 1: Pre-filtering with IDSelector
# Create metadata store
metadata = {
    0: {"category": "ml", "year": 2020},
    1: {"category": "dl", "year": 2021},
    2: {"category": "nlp", "year": 2019},
    3: {"category": "cv", "year": 2022},
    4: {"category": "rl", "year": 2020},
    5: {"category": "ml", "year": 2018},
    6: {"category": "genai", "year": 2023},
    7: {"category": "ml", "year": 2015}
}

# Filter IDs based on metadata
def get_filtered_ids(metadata, filter_fn):
    """Return IDs that match the filter function."""
    return np.array([k for k, v in metadata.items() if filter_fn(v)], dtype=np.int64)

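# Get IDs for documents from 2020 onward
valid_ids = get_filtered_ids(metadata, lambda x: x['year'] >= 2020)
print(f"IDs with year >= 2020: {valid_ids}")

# Restrict the search to those IDs (search-time selectors need FAISS 1.7.3 or later)
selector = faiss.IDSelectorBatch(valid_ids)
params = faiss.SearchParameters(sel=selector)
distances, indices = index_flat_l2.search(query_embedding, k, params=params)
print(f"Pre-filtered result IDs: {indices[0]}")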

In [None]:
# Approach 2: Post-filtering (over-fetch and filter)
def search_with_filter(index, query_vec, k, metadata, filter_fn, fetch_k=None):
    """
    Search with post-filtering.
    
    Args:
        index: FAISS index
        query_vec: Query embedding
        k: Number of results to return
        metadata: Metadata dictionary
        filter_fn: Function to filter metadata
        fetch_k: Number of candidates to fetch (default: k * 4)
    """
    if fetch_k is None:
        fetch_k = min(k * 4, index.ntotal)
    
    # Fetch more candidates than needed
    distances, indices = index.search(query_vec, fetch_k)
    
    # Filter results
    results = []
    for idx, dist in zip(indices[0], distances[0]):
        if idx in metadata and filter_fn(metadata[idx]):
            results.append((idx, dist))
            if len(results) == k:
                break
    
    return results

# Search for ML category only
results = search_with_filter(
    index_flat_l2, 
    query_embedding, 
    k=2, 
    metadata=metadata,
    filter_fn=lambda x: x['category'] == 'ml'
)

print("Filtered results (category='ml'):")
for idx, dist in results:
    print(f"  [ID: {idx}, Distance: {dist:.4f}] {documents[idx]}")

In [None]:
# Approach 3: Separate indices per category (for large-scale filtering)
def create_category_indices(embeddings, metadata):
    """Create separate FAISS index for each category."""
    category_indices = {}
    category_id_maps = {}
    
    # Group by category
    for idx, meta in metadata.items():
        cat = meta['category']
        if cat not in category_indices:
            category_indices[cat] = []
            category_id_maps[cat] = []
        category_indices[cat].append(embeddings[idx])
        category_id_maps[cat].append(idx)
    
    # Create indices
    faiss_indices = {}
    for cat, vectors in category_indices.items():
        vectors = np.array(vectors).astype('float32')
        index = faiss.IndexFlatL2(vectors.shape[1])  # Take the dimension from the data, not a global
        index.add(vectors)
        faiss_indices[cat] = {
            'index': index,
            'id_map': category_id_maps[cat]
        }
    
    return faiss_indices

# Create category-specific indices
cat_indices = create_category_indices(embeddings, metadata)
print(f"Categories: {list(cat_indices.keys())}")

# Search only in 'ml' category
ml_index = cat_indices['ml']['index']
ml_id_map = cat_indices['ml']['id_map']

distances, indices = ml_index.search(query_embedding, k=2)

print("\nSearch in 'ml' category:")
for idx, dist in zip(indices[0], distances[0]):
    original_id = ml_id_map[idx]
    print(f"  [Original ID: {original_id}] {documents[original_id]}")

## 5. Comparison & Best Practices <a id="comparison"></a>

### Feature Comparison

| Feature | ChromaDB | FAISS |
|---------|----------|-------|
| **Ease of Use** | ⭐⭐⭐⭐⭐ Simple API | ⭐⭐⭐ Lower-level |
| **Built-in Embeddings** | ✅ Yes | ❌ No |
| **Metadata Filtering** | ✅ Native | ❌ Manual |
| **Persistence** | ✅ Built-in | ✅ Manual save/load |
| **GPU Support** | ❌ No | ✅ Yes |
| **Scalability** | Medium (millions) | High (billions) |
| **Memory Efficiency** | Medium | High (with PQ) |
| **Index Types** | HNSW only | Many options |
| **Best For** | RAG, prototyping | Production, large-scale |

### Best Practices

#### Embedding Generation
1. **Normalize embeddings** for cosine similarity
2. **Batch processing** for efficiency
3. **Use appropriate models** for your domain (e.g., `bge-large` for retrieval)
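
Why normalization matters can be checked numerically. The sketch below uses plain Python and made-up 2-D vectors to show that, once vectors have unit length, the inner product equals cosine similarity and squared L2 distance equals `2 - 2 * cosine`, so all three metrics produce the same ranking.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def normalize(v):
    """Scale a vector to unit L2 norm."""
    norm = math.sqrt(dot(v, v))
    return [x / norm for x in v]

a, b = [3.0, 4.0], [4.0, 3.0]
cosine = dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))

a_hat, b_hat = normalize(a), normalize(b)
ip_normalized = dot(a_hat, b_hat)
sq_l2_normalized = sum((x - y) ** 2 for x, y in zip(a_hat, b_hat))

print(round(cosine, 6), round(ip_normalized, 6))                 # both values agree
print(round(sq_l2_normalized, 6), round(2 - 2 * cosine, 6))      # both values agree
```

This is the reason an inner-product index over normalized vectors behaves as a cosine-similarity index.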

#### Index Selection
1. **< 10K vectors**: Use Flat index (exact search)
2. **10K - 1M vectors**: Use HNSW or IVF
3. **> 1M vectors**: Use IVF-PQ or HNSW with careful tuning
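
These thresholds can be captured in a small helper. This is only a sketch of the rule of thumb above: `suggest_index` is a hypothetical function, it picks one option per size band, and the `nlist` heuristic (four times the square root of the dataset size) is an assumption rather than something the thresholds prescribe. The returned strings follow the FAISS index factory format.

```python
import math

def suggest_index(n_vectors):
    """Map a dataset size to an index factory string (rule of thumb only)."""
    if n_vectors < 10_000:
        return "Flat"      # exact search is cheap at this size
    if n_vectors <= 1_000_000:
        return "HNSW32"    # graph index; IVF is the alternative
    # Assumed heuristic: nlist grows with the square root of the dataset size
    nlist = 4 * math.isqrt(n_vectors)
    return f"IVF{nlist},PQ8"

for n in (5_000, 200_000, 4_000_000):
    print(n, suggest_index(n))
```

Treat the output as a starting point and validate recall and latency on your own data before committing to an index type.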

#### Performance Tuning
```python
# HNSW tuning
index_hnsw.hnsw.efSearch = 64  # Increase for better recall

# IVF tuning
index_ivf.nprobe = 10  # Increase for better recall (default: 1)
```

#### Memory Management
1. Use **float16** if precision allows
2. Use **Product Quantization** for large datasets
3. **Shard indices** across machines for very large datasets
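
A rough back-of-the-envelope calculation shows why these options matter. The sketch below counts only the bytes needed for the stored vectors or PQ codes; it ignores per-vector IDs, graph links, and codebooks, so real indices use somewhat more. The function names and the dataset size are assumptions for illustration.

```python
import math

def flat_bytes(n, d, bytes_per_component=4):
    """Raw storage for n vectors of dimension d (4 bytes = float32, 2 = float16)."""
    return n * d * bytes_per_component

def pq_code_bytes(n, m, nbits=8):
    """Storage for PQ codes: m sub-vector codes of nbits each per vector."""
    return n * math.ceil(m * nbits / 8)

n, d = 1_000_000, 384
print(f"float32: {flat_bytes(n, d) / 1e6:.0f} MB")
print(f"float16: {flat_bytes(n, d, 2) / 1e6:.0f} MB")
print(f"PQ (m=8, nbits=8): {pq_code_bytes(n, m=8) / 1e6:.0f} MB")
```

For one million 384-dimensional vectors, product quantization with `m=8` shrinks the vector payload by roughly two orders of magnitude compared with float32, which is the trade-off behind its lower recall.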

In [None]:
# Performance comparison example
import time

def benchmark_search(index, query, n_queries=100, k=10):
    """Benchmark search performance (average milliseconds per query)."""
    start = time.perf_counter()  # Monotonic, high-resolution timer
    for _ in range(n_queries):
        index.search(query, k)
    elapsed = time.perf_counter() - start
    return elapsed / n_queries * 1000  # ms per query

# Create different index types
indices = {
    'Flat (Exact)': faiss.IndexFlatL2(dimension),
    'HNSW (M=16)': faiss.IndexHNSWFlat(dimension, 16),
    'IVF (nlist=4)': faiss.index_factory(dimension, "IVF4,Flat")
}

# Train and add vectors
for name, index in indices.items():
    if hasattr(index, 'train') and not index.is_trained:
        index.train(embeddings)
    index.add(embeddings)

# Benchmark
print("Search Performance (ms per query):")
print("-" * 40)
for name, index in indices.items():
    ms = benchmark_search(index, query_embedding, n_queries=100)
    print(f"{name:20s}: {ms:.4f} ms")

### Summary

**Choose ChromaDB when:**
- Building RAG applications quickly
- Need built-in embedding generation
- Require metadata filtering
- Dataset is < 1M vectors

**Choose FAISS when:**
- Need maximum performance
- Working with very large datasets (billions)
- Need GPU acceleration
- Want fine-grained control over indexing

**Hybrid Approach:**
A common pattern is to use FAISS as the underlying index and wrap it in a layer that adds ChromaDB-like features such as metadata filtering and persistence.