#### **What is a Vector Store in RAG?**
After converting documents or queries into numerical vectors (embeddings), we need to:
1. Store those vectors efficiently.
2. Search for the most similar vectors quickly during retrieval.

This is what a Vector Store (a.k.a. Vector Database) does.

#### **Vector Store = (Embeddings + Metadata + Index)**
A vector store contains:

| Component    | Description                                                                 |
| ------------ | --------------------------------------------------------------------------- |
| **Vectors**  | Dense arrays representing documents (from embedding model)                  |
| **Metadata** | Optional info like title, source, date, etc.                                |
| **Index**    | A data structure for fast similarity search (e.g., HNSW or IVF, as implemented by libraries such as FAISS and Annoy) |

#### **How Do Vector Stores Work?**
1. Indexing
- After embedding, all document vectors are indexed, typically with an approximate nearest neighbor (ANN) algorithm.
- Popular ANN algorithms include HNSW, IVF, and product quantization; libraries that implement them include FAISS, Annoy, and ScaNN.

2. Similarity Search
- At query time, the question is embedded into a vector with the same embedding model used for the documents.
- The store finds the top-K stored vectors most similar to that query vector.
- Similarity is computed using cosine similarity, dot product, or Euclidean distance.
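
The two steps above can be sketched with a tiny in-memory store. This is only an illustration under stated assumptions: `ToyVectorStore` is a hypothetical class, not part of LangChain or FAISS, the three-dimensional vectors are made up, and the search is an exact brute-force scan rather than an ANN index.

```python
import math
from dataclasses import dataclass, field


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


@dataclass
class ToyVectorStore:
    """Hypothetical store: vectors + metadata, searched by brute force."""
    vectors: list = field(default_factory=list)
    metadata: list = field(default_factory=list)

    def add(self, vector, meta):
        # "Indexing" here is just appending; a real store would update an ANN index.
        self.vectors.append(vector)
        self.metadata.append(meta)

    def similarity_search(self, query, k=2):
        # Score every stored vector against the query (exact, O(n) scan).
        scored = [
            (cosine_similarity(query, vec), meta)
            for vec, meta in zip(self.vectors, self.metadata)
        ]
        # Highest similarity first, then keep the top k.
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return scored[:k]


store = ToyVectorStore()
store.add([1.0, 0.0, 0.0], {"title": "doc A"})
store.add([0.9, 0.1, 0.0], {"title": "doc B"})
store.add([0.0, 0.0, 1.0], {"title": "doc C"})

top = store.similarity_search([1.0, 0.05, 0.0], k=2)
print([meta["title"] for _, meta in top])
```

A real vector store replaces the linear scan with an index so that query time does not grow linearly with the number of documents.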

#### **Common Vector Stores in LangChain**

| Vector Store     | Type        | Speed       | Offline? | Notes                               |
| ---------------- | ----------- | ----------- | -------- | ----------------------------------- |
| **FAISS**        | Local index | ⚡ Fast      | ✅ Yes    | Most used for learning & local apps |
| **ChromaDB**     | Local + API | Medium      | ✅ Yes    | Lightweight DB with persistence     |
| **Weaviate**     | Server DB   | Fast        | ❌/✅      | Graph-like queries, strong metadata |
| **Pinecone**     | Cloud       | ⚡ Very fast | ❌        | Paid but powerful, scalable         |
| **Qdrant**       | Cloud/local | Fast        | ✅ Yes    | Great with filters and metadata     |
| **Milvus**       | Cloud/local | Fast        | ✅ Yes    | Used for large-scale prod search    |
| **Redis Vector** | Hybrid DB   | Fast        | ✅ Yes    | Combines key-value with vectors     |


In [1]:
# Recent LangChain releases ship these integrations in separate packages
from langchain_community.vectorstores import FAISS
from langchain_huggingface import HuggingFaceEmbeddings
from langchain_text_splitters import CharacterTextSplitter
from langchain_community.document_loaders import TextLoader

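# Load and split
loader = TextLoader("quantum.txt")
docs = loader.load()
# CharacterTextSplitter splits only on its separator ("\n\n" by default), so a
# paragraph longer than chunk_size is kept whole and logged as an oversized chunk
splitter = CharacterTextSplitter(chunk_size=300, chunk_overlap=50)
splits = splitter.split_documents(docs)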

# Embed
embeddings = HuggingFaceEmbeddings(model_name="sentence-transformers/all-MiniLM-L6-v2")

# Create vector store
vectorstore = FAISS.from_documents(splits, embeddings)

# Save to disk
vectorstore.save_local("my_faiss_index")

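# Later: load and query
# The saved docstore is pickled, so loading needs an explicit opt-in;
# only load index files you created or trust
loaded_vs = FAISS.load_local(
    "my_faiss_index",
    embeddings,
    allow_dangerous_deserialization=True
)

results = loaded_vs.similarity_search("Tell me about Greek letters", k=3)

for doc in results:
    print(doc.page_content)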


Created a chunk of size 468, which is longer than the specified 300
Created a chunk of size 387, which is longer than the specified 300
Created a chunk of size 406, which is longer than the specified 300
Created a chunk of size 331, which is longer than the specified 300
Created a chunk of size 348, which is longer than the specified 300
Created a chunk of size 307, which is longer than the specified 300
Created a chunk of size 411, which is longer than the specified 300
Created a chunk of size 303, which is longer than the specified 300
Created a chunk of size 397, which is longer than the specified 300


Wave-Particle Duality
One of the most profound revelations of quantum mechanics is that particles, such as electrons and photons, exhibit both wave-like and particle-like properties. This dual nature was highlighted in the famous double-slit experiment, where particles produce an interference pattern typical of waves when not observed, but behave like discrete particles when measured.
Quantum Tunneling
Quantum tunneling is the phenomenon where particles pass through energy barriers higher than their kinetic energy. This effect is critical in nuclear fusion in stars and the operation of tunnel diodes and scanning tunneling microscopes.
Quantum Mechanics: An Overview
Quantum mechanics is a fundamental branch of physics that describes nature at the smallest scales of energy levels of atoms and subatomic particles. Unlike classical mechanics, where objects have definite positions and velocities, quantum mechanics introduces a probabilistic framework. Particles are described by wave functio

#### **Nearest Neighbor Search**:
Given a query vector, nearest neighbor (NN) search finds the k vectors in a dataset that are most similar (nearest) to the query, according to a similarity measure.

**Similarity Measures**:
1. Cosine Similarity: Measures the cosine of the angle between vectors, ignoring their magnitudes.
2. Euclidean Distance: Measures straight-line distance in vector space; a smaller distance means more similar.
3. Dot Product / Inner Product: Equivalent to cosine similarity when vectors are normalized to unit length (frequent in transformer embeddings).
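
As a quick check of how the three measures relate, here is a minimal sketch in plain Python. The two example vectors are arbitrary, and the helper names are chosen for this illustration only.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def euclidean_distance(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def cosine_similarity(a, b):
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))


def normalize(a):
    """Scale a vector to unit length."""
    norm = math.sqrt(dot(a, a))
    return [x / norm for x in a]


a = [3.0, 4.0]
b = [4.0, 3.0]

print("dot product:       ", dot(a, b))
print("cosine similarity: ", round(cosine_similarity(a, b), 4))
print("euclidean distance:", round(euclidean_distance(a, b), 4))

# After normalization, the dot product and cosine similarity coincide.
a_unit, b_unit = normalize(a), normalize(b)
print("dot of unit vectors:", round(dot(a_unit, b_unit), 4))
```

For unit vectors the squared Euclidean distance equals `2 - 2 * cosine`, so all three measures produce the same ranking once embeddings are normalized.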

#### **Approximate Nearest Neighbor (ANN) Search**:
ANN algorithms trade a small amount of accuracy for much faster search, which matters most in high-dimensional spaces (e.g., 768 dimensions for BERT-base embeddings or 1536 for OpenAI's text-embedding-ada-002).

##### **LSH (Locality Sensitive Hashing)**:
Hash vectors into buckets such that similar vectors fall into the same bucket with high probability.

**How**:
- Use hash functions that preserve similarity.
- Only compare the query with vectors in the same bucket.

**Pros**: Theoretically grounded, simple.

**Cons**: Struggles with very high-dimensional vectors, precision trade-offs.
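
One common similarity-preserving hash for cosine similarity is the random-hyperplane scheme, sketched below. This is an illustrative choice rather than the only LSH family; the function names, the seed, and the number of planes are all assumptions made for the example.

```python
import random
from collections import defaultdict


def make_hyperplanes(num_planes, dim, seed=0):
    """Draw random hyperplane normals; each one contributes one hash bit."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(num_planes)]


def hash_vector(vector, hyperplanes):
    """Bit i records which side of hyperplane i the vector lies on."""
    bits = []
    for plane in hyperplanes:
        projection = sum(v * p for v, p in zip(vector, plane))
        bits.append("1" if projection >= 0 else "0")
    return "".join(bits)


def build_buckets(vectors, hyperplanes):
    """Group vector indices by their hash code."""
    buckets = defaultdict(list)
    for idx, vec in enumerate(vectors):
        buckets[hash_vector(vec, hyperplanes)].append(idx)
    return buckets


def candidates(query, buckets, hyperplanes):
    """Return only the indices that share the query's bucket."""
    return buckets.get(hash_vector(query, hyperplanes), [])


vectors = [[1.0, 0.2, 0.0], [0.9, 0.3, 0.1], [-1.0, 0.1, -0.4]]
planes = make_hyperplanes(num_planes=4, dim=3)
buckets = build_buckets(vectors, planes)
print(candidates([1.0, 0.2, 0.0], buckets, planes))
```

Vectors separated by a small angle are unlikely to be split by a random hyperplane, so they tend to share a code. Practical systems use several hash tables to reduce the chance of missing a true neighbor.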

##### **Tree-Based Methods**:
- **KD-Tree**: Splits the space hierarchically along axis-aligned hyperplanes. Works well in low dimensions (roughly fewer than 30) but becomes inefficient in high dimensions due to the curse of dimensionality.
- **Ball Tree / Metric Tree**: Partitions the space using hyperspheres instead of hyperplanes. Better for some distance metrics.
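
To make the axis-aligned splitting concrete, here is a minimal KD-tree sketch with an exact nearest-neighbor query. It is a teaching example with hypothetical names (`KDNode`, `build_kdtree`, `nearest`) and 2-D sample points, not a tuned implementation.

```python
import math


class KDNode:
    """One node of the tree: a point plus the axis it splits on."""

    def __init__(self, point, axis, left=None, right=None):
        self.point = point
        self.axis = axis
        self.left = left
        self.right = right


def build_kdtree(points, depth=0):
    if not points:
        return None
    axis = depth % len(points[0])      # cycle through the axes level by level
    ordered = sorted(points, key=lambda p: p[axis])
    mid = len(ordered) // 2            # the median point becomes the split
    return KDNode(
        ordered[mid],
        axis,
        build_kdtree(ordered[:mid], depth + 1),
        build_kdtree(ordered[mid + 1:], depth + 1),
    )


def nearest(node, query, best=None):
    """Return (distance, point) for the stored point closest to query."""
    if node is None:
        return best
    dist = math.dist(node.point, query)
    if best is None or dist < best[0]:
        best = (dist, node.point)
    diff = query[node.axis] - node.point[node.axis]
    near, far = (node.left, node.right) if diff < 0 else (node.right, node.left)
    best = nearest(near, query, best)
    # Visit the far side only if the splitting plane is closer than the best hit.
    if abs(diff) < best[0]:
        best = nearest(far, query, best)
    return best


points = [(2, 3), (5, 4), (9, 6), (4, 7), (8, 1), (7, 2)]
tree = build_kdtree(points)
distance, point = nearest(tree, (9, 2))
print(point, round(distance, 3))
```

The pruning test on `abs(diff)` is what saves work in low dimensions. In high dimensions the splitting plane is almost always closer than the best hit, so nearly every branch gets visited.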

##### **Graph-Based Methods**:
- **HNSW (Hierarchical Navigable Small World graphs)**: Currently the most popular ANN algorithm in vector DBs. Builds a multi-layer navigable small-world graph. At query time, the search starts from an entry point in the sparse top layer and greedily “navigates” to closer neighbors, descending one layer at a time. Build time is high, but search is very fast and accurate. Used in FAISS, Vespa, Weaviate, and Qdrant.

- **NSW (Navigable Small World)**: Earlier version of HNSW with a single-layer graph.
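
The greedy navigation at the heart of these graph methods can be sketched on a single layer, as in NSW. The graph construction below is a deliberate simplification: it links each point to its `m` nearest neighbors by brute force, whereas NSW and HNSW build the graph incrementally. All names are hypothetical.

```python
import math


def build_knn_graph(points, m=2):
    """Link each point to its m nearest neighbors, with edges in both directions."""
    graph = {}
    for i, p in enumerate(points):
        others = sorted(
            (j for j in range(len(points)) if j != i),
            key=lambda j: math.dist(p, points[j]),
        )
        graph[i] = set(others[:m])
    for i in list(graph):
        for j in list(graph[i]):
            graph[j].add(i)
    return graph


def greedy_search(graph, points, query, entry=0):
    """Hop to whichever neighbor is closest to the query until none improves."""
    current = entry
    current_dist = math.dist(points[current], query)
    while True:
        best, best_dist = current, current_dist
        for neighbor in graph[current]:
            d = math.dist(points[neighbor], query)
            if d < best_dist:
                best, best_dist = neighbor, d
        if best == current:
            return current, current_dist   # local minimum reached
        current, current_dist = best, best_dist


points = [(0, 0), (1, 0), (2, 0), (3, 0), (4, 0), (5, 0)]
graph = build_knn_graph(points, m=2)
index, distance = greedy_search(graph, points, query=(4.2, 0), entry=0)
print(index, round(distance, 2))
```

Greedy search can stop at a local minimum that is not the true nearest neighbor. HNSW reduces that risk by using long-range links in its upper layers and by tracking a list of candidates instead of a single node.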

##### **Quantization and Partitioning Methods**:
- **Product Quantization (PQ) and IVF (Inverted File Index), used in FAISS**: IVF clusters the dataset and searches only a few relevant clusters. PQ compresses vectors using quantization, which reduces memory usage. The IVF+PQ combination is used in FAISS to balance speed, accuracy, and memory.

- **ScaNN (Scalable Nearest Neighbors)**: Developed by Google. Uses asymmetric hashing, tree partitioning, and quantization. The open-source release primarily targets CPUs with SIMD support rather than GPUs. Performs extremely well on large-scale datasets.
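
The IVF idea (cluster first, then search only a few clusters) can be sketched in plain Python. This is not how FAISS implements it: the k-means below uses a naive initialization, the function names are hypothetical, and `nprobe` is borrowed from FAISS terminology for the number of clusters to scan.

```python
import math


def nearest_centroid_ids(vector, centroids, n=1):
    """Indices of the n centroids closest to the vector."""
    order = sorted(range(len(centroids)), key=lambda c: math.dist(vector, centroids[c]))
    return order[:n]


def kmeans(vectors, k, iterations=5):
    """A few rounds of Lloyd's algorithm, seeded with the first k vectors."""
    centroids = [list(v) for v in vectors[:k]]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for v in vectors:
            clusters[nearest_centroid_ids(v, centroids)[0]].append(v)
        for c, members in enumerate(clusters):
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return centroids


def build_ivf(vectors, centroids):
    """Inverted lists: centroid id -> indices of the vectors assigned to it."""
    lists = {c: [] for c in range(len(centroids))}
    for idx, v in enumerate(vectors):
        lists[nearest_centroid_ids(v, centroids)[0]].append(idx)
    return lists


def ivf_search(query, vectors, centroids, lists, k=1, nprobe=1):
    """Scan only the nprobe clusters whose centroids are closest to the query."""
    probe = nearest_centroid_ids(query, centroids, nprobe)
    found = [i for c in probe for i in lists[c]]
    found.sort(key=lambda i: math.dist(query, vectors[i]))
    return found[:k]


vectors = [(0, 0), (10, 10), (0, 1), (1, 0), (10, 11), (11, 10)]
centroids = kmeans(vectors, k=2)
lists = build_ivf(vectors, centroids)
print(ivf_search((9, 9), vectors, centroids, lists, k=1, nprobe=1))
```

Raising `nprobe` scans more clusters, which improves recall at the cost of speed. Setting it to the number of clusters makes the search exact.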


##### **Similarity Searches in Vector Databases**
When a query comes in:
- It's converted to an embedding vector (via the same embedding model used for the documents, such as a sentence encoder).
- That vector is passed to the vector DB.
- The vector DB uses ANN to search for similar vectors.
- The corresponding documents/chunks are retrieved and fed into the generation phase (e.g., as context in the LLM prompt).
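
The last step, handing retrieved chunks to the generator, often amounts to assembling a prompt. The sketch below assumes a hypothetical `build_prompt` helper and an invented prompt template; the `retrieved` list stands in for whatever the vector store returned.

```python
def build_prompt(question, retrieved_chunks):
    """Join retrieved chunks into a numbered context block ahead of the question."""
    context = "\n\n".join(
        f"[{i}] {chunk}" for i, chunk in enumerate(retrieved_chunks, start=1)
    )
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\n"
        "Answer:"
    )


# Stand-in for the chunks a vector store would return for this query.
retrieved = [
    "Quantum tunneling is the phenomenon where particles pass through energy barriers.",
    "Particles exhibit both wave-like and particle-like properties.",
]
prompt = build_prompt("What is quantum tunneling?", retrieved)
print(prompt)
```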

**Popular Vector DBs**:
1. FAISS: Meta’s (formerly Facebook’s) open-source library for vector search (CPU and GPU optimized); a library rather than a full database.
2. Pinecone: Fully managed, cloud-native vector DB.
3. Weaviate: Feature-rich with built-in modules (text2vec, hybrid search).
4. Qdrant: Focuses on high performance and filtering.
5. Milvus: High-throughput vector database with scalability.
6. Chroma: Simpler, Pythonic vector store (used in many RAG demos).
