```{contents}
```
## Similarity Search 

### What Similarity Search Is

**Similarity search** is the process of finding items whose **vector embeddings are closest** to a query embedding in a vector space.

> Instead of matching exact words, similarity search matches **meaning**.

It is the **core retrieval mechanism** behind RAG, semantic search, and recommendation systems.

---

### Why Similarity Search Is Needed

Keyword search fails when:

* Synonyms are used
* Meaning is paraphrased
* Context matters more than exact words

Similarity search enables:

* Semantic retrieval
* Context-aware search
* Robust recall across phrasing variations

---

### Where Similarity Search Fits in RAG

```
User Query
   ↓
Query Embedding
   ↓
Similarity Search (Vector Store)
   ↓
Top-K Relevant Chunks
   ↓
LLM (Answer Generation)
```

Similarity search happens at **query time**; the document chunks are embedded and indexed beforehand.

---

### Core Idea (Geometric View)

* Each text → vector in high-dimensional space
* Similar meanings → vectors are close
* Search = find nearest vectors

```
distance(query_vector, doc_vector)
```

Smaller distance → higher similarity.
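As a rough illustration of "search = find nearest vectors", the sketch below picks the stored vector closest to a query. The 2-D vectors and their labels are made up for readability; real embeddings have hundreds or thousands of dimensions.

```python
import math

def euclidean(a, b):
    """Straight-line (L2) distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

# Toy 2-D "embeddings" (hypothetical values, chosen by hand for illustration).
doc_vectors = {
    "cats are pets": [0.9, 0.1],
    "dogs are pets": [0.8, 0.2],
    "stocks fell today": [0.1, 0.9],
}
query_vector = [0.9, 0.05]

# The nearest document is the one with the smallest distance to the query.
nearest = min(doc_vectors, key=lambda text: euclidean(query_vector, doc_vectors[text]))
print(nearest)
```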

---

### Similarity Metrics (Critical)

### Cosine Similarity (Most Common)

Measures the **angle** between vectors.

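* Scale-invariant: only direction matters, not vector length
* The usual default for text embeddings
* Ranges from -1 to 1; higher means more similar

```text
similarity = cos(θ) = (a · b) / (‖a‖ · ‖b‖)
```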
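A minimal pure-Python sketch of cosine similarity, assuming plain lists of floats as vectors. It shows the scale-invariance property: a vector and a scaled copy of it score (approximately) 1, while perpendicular vectors score 0.

```python
import math

def cosine_similarity(a, b):
    """cos(theta) = (a . b) / (|a| * |b|) for two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

a = [1.0, 2.0, 3.0]
b = [2.0, 4.0, 6.0]   # same direction as a, twice the length
c = [3.0, -1.5, 0.0]  # perpendicular to a: 3 - 3 + 0 = 0

same_direction = cosine_similarity(a, b)  # approximately 1.0
perpendicular = cosine_similarity(a, c)   # 0.0
```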

---

### Dot Product

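* Cheaper to compute than cosine similarity (no normalization step)
* Equals cosine similarity when vectors are normalized to unit length; otherwise vector magnitude also affects the score
* Used by some embedding models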
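The relationship between dot product and normalization can be checked with a small sketch. The helper names `dot` and `normalize` are illustrative, not part of any library.

```python
import math

def dot(a, b):
    """Sum of element-wise products."""
    return sum(x * y for x, y in zip(a, b))

def normalize(v):
    """Scale a non-zero vector to unit length."""
    length = math.sqrt(dot(v, v))
    return [x / length for x in v]

a = [3.0, 4.0]
b = [4.0, 3.0]

raw_dot = dot(a, b)                         # depends on the vectors' lengths
unit_dot = dot(normalize(a), normalize(b))  # equals the cosine of the angle
```

Here `raw_dot` is 24.0, while `unit_dot` is about 0.96, which is the cosine similarity of `a` and `b`.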

---

### Euclidean (L2) Distance

* Measures straight-line distance between vectors, so vector magnitude matters
* Lower means more similar
* More common in vision than text
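For unit-length vectors, L2 distance and cosine similarity are tied together by the identity ‖u − v‖² = 2 − 2·cos(θ), so both produce the same ranking. The sketch below checks this numerically on two hand-picked unit vectors.

```python
import math

def l2_distance(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

# Two unit-length vectors (0.6^2 + 0.8^2 = 1).
u = [0.6, 0.8]
v = [0.8, 0.6]

cos_uv = sum(x * y for x, y in zip(u, v))  # dot product of unit vectors = cosine
squared_distance = l2_distance(u, v) ** 2

# For unit vectors: squared L2 distance == 2 - 2 * cosine similarity.
identity_holds = math.isclose(squared_distance, 2 - 2 * cos_uv)
```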

---

### Similarity Search in LangChain

LangChain exposes similarity search through **vector stores**.

### Basic Demonstration



```python
from langchain_community.vectorstores import FAISS
from langchain_openai import OpenAIEmbeddings

# Requires the OPENAI_API_KEY environment variable to be set.
embeddings = OpenAIEmbeddings()

texts = [
    "LangChain is a framework for LLM applications",
    "RAG combines retrieval and generation",
    "Vector stores enable semantic search"
]

# Embed every text and build an in-memory FAISS index over the vectors.
vectorstore = FAISS.from_texts(texts, embeddings)
```




---

### Performing Similarity Search



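```python
# Embed the query and return the 2 closest stored documents.
docs = vectorstore.similarity_search(
    "How does retrieval work?",
    k=2
)

for doc in docs:
    print(doc.page_content)
```

```text
RAG combines retrieval and generation
Vector stores enable semantic search
```

`similarity_search` returns the **top-k most similar chunks** as `Document` objects, ordered from most to least similar.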

---

### What Happens Internally

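1. Query text → embedding vector
2. The index selects candidates: an exact (flat) index considers every stored vector, while an ANN index narrows the search to a likely subset
3. Distances between the query vector and the candidates are computed
4. Top-k closest vectors are returned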
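The steps above can be sketched as an exact (brute-force) search in pure Python. This is only an illustration of the idea, not how FAISS is implemented; the chunk labels and vectors are made up, and an ANN index would skip most of the distance computations.

```python
import heapq
import math

def top_k(query_vec, stored, k):
    """Exact search: score every stored vector, keep the k closest.

    `stored` is a list of (text, vector) pairs.
    """
    def distance_to_query(item):
        _, vec = item
        return math.dist(query_vec, vec)

    closest = heapq.nsmallest(k, stored, key=distance_to_query)
    return [text for text, _ in closest]

# Hypothetical stored chunks with hand-picked 2-D vectors.
stored = [
    ("chunk A", [0.0, 1.0]),
    ("chunk B", [1.0, 0.0]),
    ("chunk C", [0.7, 0.7]),
    ("chunk D", [-1.0, 0.0]),
]

results = top_k([0.9, 0.1], stored, k=2)
print(results)
```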

---

### Similarity Search vs Retriever

| Aspect         | `similarity_search`  | Retriever                         |
| -------------- | -------------------- | --------------------------------- |
| Level          | Low-level            | High-level                        |
| Output         | Documents            | Documents                         |
| Filtering      | Passed on every call | Configured once (`search_kwargs`) |
| Production use | Rare                 | Preferred                         |

**Best practice:** use retrievers in production.

---

### Similarity Search with Scores



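```python
# Same search, but each result comes back as a (Document, score) pair.
docs_with_scores = vectorstore.similarity_search_with_score(
    "What is LangChain?",
    k=3
)

for doc, score in docs_with_scores:
    print(score, doc.page_content)
```

```text
0.21958779 LangChain is a framework for LLM applications
0.5789856 Vector stores enable semantic search
0.59600055 RAG combines retrieval and generation
```

With the default FAISS index used here, the score is an L2-based distance, so a **lower score means a better match**. Stores that return a similarity instead (for example cosine) reverse this: higher is better.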

Used for:

* Thresholding
* Debugging
* Reranking
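Thresholding can be as simple as dropping results whose distance exceeds a cutoff. The sketch below reuses the three scores printed above; the cutoff value `MAX_DISTANCE` is a hypothetical choice and would need tuning per embedding model and dataset.

```python
# (text, distance) pairs shaped like the scored output above; lower = closer.
scored = [
    ("LangChain is a framework for LLM applications", 0.21958779),
    ("Vector stores enable semantic search", 0.5789856),
    ("RAG combines retrieval and generation", 0.59600055),
]

MAX_DISTANCE = 0.5  # hypothetical cutoff, not a recommended value

# Keep only results that are close enough to the query.
kept = [text for text, distance in scored if distance <= MAX_DISTANCE]
print(kept)
```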

---

### Similarity Search + Metadata Filtering



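```python
# `filter` matches against each document's metadata. This only returns results if
# documents were indexed with a "source" field (e.g. via the `metadatas` argument
# of `FAISS.from_texts`); the three demo texts above carry no metadata.
docs = vectorstore.similarity_search(
    "ticket escalation",
    k=5,
    filter={"source": "jira"}
)
```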


Production-critical for:

* Multi-tenancy
* Access control
* Domain separation
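Conceptually, a metadata filter restricts which records are eligible before the closest ones are returned. The following is a simplified sketch with made-up records; real vector stores differ in whether they apply the filter before or after the vector search.

```python
import math

# Hypothetical records: text, a metadata field, and a toy 2-D vector.
records = [
    {"text": "ticket A", "source": "jira", "vector": [0.9, 0.1]},
    {"text": "wiki page", "source": "confluence", "vector": [0.95, 0.05]},
    {"text": "ticket B", "source": "jira", "vector": [0.1, 0.9]},
]

def filtered_search(query_vec, records, k, metadata_filter):
    """Keep records whose metadata matches every filter key, then rank by distance."""
    candidates = [
        r for r in records
        if all(r.get(key) == value for key, value in metadata_filter.items())
    ]
    candidates.sort(key=lambda r: math.dist(query_vec, r["vector"]))
    return [r["text"] for r in candidates[:k]]

hits = filtered_search([1.0, 0.0], records, k=5, metadata_filter={"source": "jira"})
print(hits)
```

Note that "wiki page" is the closest vector to the query but is excluded because its `source` does not match.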

---

### Similarity Search vs Keyword Search

| Feature          | Similarity Search | Keyword Search |
| ---------------- | ----------------- | -------------- |
| Semantic meaning | ✅                 | ❌              |
| Synonyms         | ✅                 | ❌              |
| Exact match      | ❌                 | ✅              |
| Explainability   | ❌                 | ✅              |

---

### Hybrid Search (Production Standard)

Many production systems combine both approaches:

```
Keyword Search + Similarity Search → Merge → Rerank
```

Benefits:

* High recall
* High precision
* Robust to edge cases
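One common way to implement the "Merge" step is Reciprocal Rank Fusion (RRF), which combines ranked lists using only each item's position. This is one option among several, not necessarily what a given system uses; the document IDs below are placeholders, and the constant `c=60` is a conventional default rather than a tuned value.

```python
def reciprocal_rank_fusion(rankings, c=60):
    """Merge several ranked lists of IDs; each appearance adds 1 / (c + rank)."""
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (c + rank)
    # Highest fused score first.
    return sorted(scores, key=scores.get, reverse=True)

# Placeholder result lists from the two search paths.
keyword_hits = ["doc_b", "doc_a", "doc_d"]
vector_hits = ["doc_a", "doc_c", "doc_b"]

merged = reciprocal_rank_fusion([keyword_hits, vector_hits])
print(merged)
```

Documents found by both searches (`doc_a`, `doc_b`) rise to the top; the merged list can then be passed to a reranker.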

---

### Similarity Search vs Reranking

| Component         | Purpose                  |
| ----------------- | ------------------------ |
| Similarity search | Fast candidate retrieval |
| Reranker          | Precise ordering         |

Similarity search is **coarse but fast**.

---

### Choosing `k` (Top-K)

Guidelines:

* Typical: `k = 3–10`
* Too small → misses context
* Too large → noise + latency

Production pattern:

* Retrieve top-k = 10
* Rerank to top-3
* Send to LLM
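The retrieve-then-rerank pattern can be sketched as follows. The `word_overlap` scorer is a deliberately crude placeholder standing in for a real reranking model, and the fourth candidate text is invented for the example.

```python
def retrieve_then_rerank(query, candidates, rerank_score, final_k=3):
    """Reorder retrieved texts by a reranker score and keep the best `final_k`."""
    ranked = sorted(candidates, key=lambda text: rerank_score(query, text), reverse=True)
    return ranked[:final_k]

def word_overlap(query, text):
    """Placeholder scorer: number of shared lowercase words."""
    return len(set(query.lower().split()) & set(text.lower().split()))

# Pretend these came back from similarity search (top-k, in retrieval order).
candidates = [
    "Vector stores enable semantic search",
    "RAG combines retrieval and generation",
    "LangChain is a framework for LLM applications",
    "Retrieval quality depends on chunking",
]

top = retrieve_then_rerank("how does retrieval work", candidates, word_overlap, final_k=2)
print(top)
```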

---

### Common Mistakes

#### No chunk overlap

❌ Content split across a chunk boundary is hard to retrieve
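A minimal sketch of overlapping chunks, assuming word-based splitting for simplicity (real splitters usually work on characters or tokens). The function name and parameters are illustrative.

```python
def chunk_with_overlap(words, chunk_size, overlap):
    """Split a list of words into chunks that share `overlap` words with the previous chunk."""
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be >= 0 and smaller than chunk_size")
    step = chunk_size - overlap
    chunks = []
    start = 0
    while start < len(words):
        chunks.append(words[start:start + chunk_size])
        if start + chunk_size >= len(words):
            break  # the last chunk already reaches the end of the text
        start += step
    return chunks

words = "a b c d e f g h i j".split()
chunks = chunk_with_overlap(words, chunk_size=4, overlap=1)
print(chunks)
```

Each chunk repeats the last word of the previous one, so a sentence that straddles a boundary still appears intact in at least one chunk when the overlap is large enough.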

#### Large chunks

❌ One embedding has to cover many topics, so it becomes too coarse

#### High k values

❌ Slower + noisy context

#### Mixing embedding models

❌ Vectors from different models live in different spaces, so similarity scores between them are meaningless

---

### Best Practices

* Use the same embedding model for indexing and querying
* Split text into reasonably sized, overlapping chunks before embedding
* Use metadata filters
* Tune k per use case
* Add reranking for precision

---

### Interview-Ready Summary

> “Similarity search finds semantically related content by comparing embedding vectors using distance metrics like cosine similarity. In LangChain, it is implemented via vector stores and is the core retrieval mechanism for RAG systems.”

---

### Rule of Thumb

* **Similarity search = meaning-based retrieval**
* **Vector store = engine**
* **Retriever = production interface**
* **Reranker = precision booster**
