```{contents}
```

## Embedding Model Drift


**Embedding Model Drift** happens when the embedding model changes over time, causing embeddings generated **at different times** to no longer be consistent with one another.

This breaks retrieval systems because:

```
query_embedding (new model)
≠
stored_document_embeddings (old model)
```

As a result, semantic search and RAG degrade even though none of your own code or data changed.

---

### Why Embedding Drift Happens

#### Drift occurs when:

1. **Model provider updates the embedding model**

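   * a hosted model such as OpenAI's `text-embedding-ada-002` is updated or replaced by the provider
   * a Hugging Face model repository (for example an SBERT checkpoint) receives new commits and is loaded without a pinned revision
   * a model family such as Jina or E5 ships a release with modified projection layers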

2. **Tokenizers change**

   * vocabulary updated
   * merging rules changed

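3. **Quantization differences (FP32 → FP16 → INT8)**

   * the embedding distribution shifts, usually slightly

4. **Hardware differences**

   * GPU vs CPU inference
   * mixed precision
   * these typically cause small numerical differences rather than a new embedding space, but they can still reorder near-tied results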

5. **Different embedding version used accidentally**

   * Index created with model_v1
   * Queries encoded with model_v2

---

### Why Drift is a Critical Problem in Generative AI

RAG depends on **consistent embedding spaces**: queries and documents must be encoded by the same model version for similarity scores to mean anything.

If embeddings drift:

* relevant documents no longer match queries
* retrieval quality collapses
* hallucinations increase
* irrelevant documents appear in LLM responses
* semantic search becomes random

This leads to:

> “Why did retrieval suddenly stop working although nothing changed?”

This is a classic **silent failure**.

---

### Example of Embedding Drift (Intuition)

Suppose you indexed documents using:

```
text-embedding-3-small (Version A)
```

Months later, suppose the provider silently changes the weights served under that model name.

Now your query uses:

```
text-embedding-3-small (Version B)
```

Version A and Version B then produce different vectors for the same text (the values below are illustrative):

| Text              | Old Embedding    | New Embedding       |
| ----------------- | ---------------- | ------------------- |
| "LLMs in finance" | [0.1, -0.2, 0.3] | [0.88, -0.11, 0.42] |

They are **not in the same space**, so cosine similarity becomes meaningless.
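
The effect can be reproduced with a small simulation. This is a minimal sketch, not a real embedding model: `model_a` and `model_b` are hypothetical random linear maps standing in for two model versions, and the documents and queries are synthetic vectors.

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumption: each "model version" is a random linear map from a shared
# 32-d latent "meaning" vector to a 16-d embedding.
latent_dim, embed_dim, n_docs = 32, 16, 200
model_a = rng.normal(size=(latent_dim, embed_dim))
model_b = rng.normal(size=(latent_dim, embed_dim))


def embed(latents, model):
    """Project latent vectors through a model and L2-normalize the rows."""
    vecs = latents @ model
    return vecs / np.linalg.norm(vecs, axis=1, keepdims=True)


docs = rng.normal(size=(n_docs, latent_dim))
# Query i is a slightly noisy copy of document i, so document i is the right answer.
queries = docs + 0.1 * rng.normal(size=docs.shape)

index_a = embed(docs, model_a)  # index built once with version A


def top1_accuracy(query_vecs, index):
    """Fraction of queries whose nearest document (by cosine) is the matching one."""
    best = (query_vecs @ index.T).argmax(axis=1)
    return float((best == np.arange(len(index))).mean())


same_space = top1_accuracy(embed(queries, model_a), index_a)
cross_space = top1_accuracy(embed(queries, model_b), index_a)
print(f"queries encoded with version A: {same_space:.2f}")
print(f"queries encoded with version B: {cross_space:.2f}")
```

With matching versions almost every query finds its document. With mixed versions accuracy falls to roughly chance level, and nothing in the code raises an error.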

---

### Effects of Embedding Drift

#### 1. **Retrieval Quality Drops**

* relevant docs no longer match
* RAG quality collapses
* more hallucinations

#### 2. **Queries Return Wrong Results**

User asks:

```
"What is GDPR?"
```

Vector DB returns:

```
unrelated chunk about JSON formatting
```

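#### 3. **Embedding-based chunking and ranking break**

Embeddings cluster differently under the new model, so anything tuned to the old geometry (semantic chunk boundaries, similarity thresholds, ranking) becomes unreliable.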

#### 4. **Hard to detect**

* no errors in logs
* no failures in the vector DB
* but the system behaves incorrectly

---

### How to Detect Embedding Drift

#### **1. Sudden drop in RAG accuracy**

The LLM starts hallucinating because the retriever returns irrelevant chunks.

#### **2. Embedding statistics change**

The mean and variance of embedding components (or of vector norms) shift between batches.

#### **3. Cosine similarity distributions shift**

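Re-embed a fixed set of texts and compare the new vectors with the stored ones. The same text should score a cosine similarity close to 1.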

#### **4. Cluster drift**

Stored vectors cluster differently over time.
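
Checks 2 and 3 can be automated with a small "canary" set: a fixed list of texts whose embeddings were stored at indexing time and are recomputed on a schedule. The sketch below assumes NumPy arrays as input. The function name `drift_report` and the `0.99` threshold are illustrative choices, not a standard.

```python
import numpy as np


def drift_report(reference, current, min_cosine=0.99):
    """Compare stored canary embeddings with freshly computed ones.

    reference, current: arrays of shape (n_texts, dim); row i embeds the same text.
    min_cosine is an assumed threshold and should be tuned per model.
    """
    reference = np.asarray(reference, dtype=float)
    current = np.asarray(current, dtype=float)
    if reference.shape != current.shape:
        # A changed dimension means the model changed.
        return {"drifted": True, "reason": "shape mismatch"}

    ref_unit = reference / np.linalg.norm(reference, axis=1, keepdims=True)
    cur_unit = current / np.linalg.norm(current, axis=1, keepdims=True)
    paired_cosine = np.sum(ref_unit * cur_unit, axis=1)  # same text, old vs new

    drifted = bool(paired_cosine.min() < min_cosine)
    return {
        "drifted": drifted,
        "reason": "paired cosine below threshold" if drifted else "ok",
        "min_cosine": float(paired_cosine.min()),
        "max_mean_shift": float(np.abs(reference.mean(axis=0) - current.mean(axis=0)).max()),
        "std_ratio": float(current.std() / reference.std()),
    }


# Synthetic demonstration data (not real embeddings).
rng = np.random.default_rng(1)
stored = rng.normal(size=(50, 8))
stable = stored + 1e-4 * rng.normal(size=stored.shape)  # tiny numerical noise
changed = rng.normal(size=(50, 8))                      # stands in for a different model

print(drift_report(stored, stable)["reason"])
print(drift_report(stored, changed)["reason"])
```

Running a check like this before serving traffic turns a silent failure into an explicit alert.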

---

### How to Prevent Embedding Drift

#### ✔ **1. Embed everything with the same model version**

Always store:

* embedding model name
* version
* embedding dimension
* tokenizer version

Example metadata:

```
{
  "model_name": "text-embedding-3-large",
  "model_sha": "f814ae1",
  "dim": 3072
}
```

#### ✔ **2. Disable automatic updates**

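Pin the version wherever the provider or library supports it (for example, Hugging Face `from_pretrained` accepts a `revision` argument). The identifier below is illustrative; the exact pinning syntax depends on the provider: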

```
model="text-embedding-3-large@v0.1"
```

#### ✔ **3. Re-embed documents whenever model changes**

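This is mandatory. Vectors from the old and the new model cannot share an index, so the whole corpus has to be re-embedded, not just newly added documents.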

#### ✔ **4. Store embedding metadata per vector**

Vector databases such as Pinecone and Qdrant let you attach metadata to each vector (Qdrant calls it a payload):

```
{ id: "...", vector: [...], meta: {"embedding_model": "v1"} }
```
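
Stored metadata only helps if something checks it. The following sketch shows one way to refuse a query when the query encoder does not match the index. `EmbeddingSpec`, `EmbeddingMismatchError` and `ensure_compatible` are hypothetical names introduced for illustration and are not part of any vector database client.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EmbeddingSpec:
    """Identity of the model that produced a set of vectors."""
    model_name: str
    model_sha: str
    dim: int


class EmbeddingMismatchError(ValueError):
    """Raised when query and index embeddings come from different models."""


def ensure_compatible(index_spec, query_spec):
    """Fail loudly instead of returning meaningless similarity scores."""
    if index_spec != query_spec:
        raise EmbeddingMismatchError(
            f"index built with {index_spec}, query encoded with {query_spec}"
        )


index_spec = EmbeddingSpec("text-embedding-3-large", "f814ae1", 3072)

# Same model identity: the check passes silently.
ensure_compatible(index_spec, EmbeddingSpec("text-embedding-3-large", "f814ae1", 3072))

# Different fingerprint (placeholder value): the query is rejected.
try:
    ensure_compatible(index_spec, EmbeddingSpec("text-embedding-3-large", "0000000", 3072))
except EmbeddingMismatchError as err:
    print("refusing to query:", err)
```

Comparing the full spec, not just the model name, catches the case where the name stays the same but the weights change.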

#### ✔ **5. Use Embedding Registries**

Some companies maintain:

* embedding registry
* version tags
* migration logs

---

### How Companies Handle Drift

#### Enterprise RAG systems typically include:

* **embedding versioning**
* **index versioning**
* **re-embedding pipelines**
* **A/B alignment tests**
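
Index versioning and re-embedding can be combined: build a complete new index under the new version, then switch traffic to it in one step. The in-memory sketch below illustrates the idea. `fake_embed` is a hypothetical stand-in for a real model call (it only matches identical texts, with no semantics), and `VersionedIndex` is an illustrative class, not a vector database API.

```python
import hashlib

import numpy as np


def fake_embed(texts, version, dim=8):
    """Hypothetical embedder: a deterministic unit vector per (version, text)."""
    vectors = []
    for text in texts:
        digest = hashlib.sha256(f"{version}:{text}".encode()).digest()
        rng = np.random.default_rng(int.from_bytes(digest[:8], "big"))
        vec = rng.normal(size=dim)
        vectors.append(vec / np.linalg.norm(vec))
    return vectors


class VersionedIndex:
    """Keeps one index per embedding version and serves exactly one of them."""

    def __init__(self, embed_fn):
        self.embed_fn = embed_fn
        self.indexes = {}         # version -> {doc_id: vector}
        self.live_version = None

    def build(self, version, docs):
        """Re-embed every document into a separate index for this version."""
        ids = list(docs)
        vectors = self.embed_fn([docs[i] for i in ids], version)
        self.indexes[version] = dict(zip(ids, vectors))

    def promote(self, version):
        """Switch queries to a version, but only if its index exists."""
        if version not in self.indexes:
            raise KeyError(f"no index built for {version!r}")
        self.live_version = version

    def search(self, query):
        """Encode the query with the live version and return the best doc id."""
        index = self.indexes[self.live_version]
        query_vec = self.embed_fn([query], self.live_version)[0]
        return max(index, key=lambda doc_id: float(query_vec @ index[doc_id]))


docs = {"gdpr": "What is GDPR?", "json": "JSON formatting tips"}
store = VersionedIndex(fake_embed)
store.build("v1", docs)
store.promote("v1")
before = store.search("What is GDPR?")

store.build("v2", docs)   # full re-embed into a new index; v1 keeps serving
store.promote("v2")       # switch only after the new index is complete
after = store.search("What is GDPR?")
print(before, after)
```

Because the query is always encoded with the live index's version, old and new vectors are never compared with each other.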

---

### Simple Mental Model

Think of embeddings as coordinates on a map.

If the map changes:

* your stored coordinates point to wrong locations
* navigation breaks

That is embedding drift.

---

### Final Summary

| Concept              | Explanation                                                          |
| -------------------- | -------------------------------------------------------------------- |
| **Embedding Drift**  | Embeddings change over time → stored vectors no longer match queries |
| **Why it breaks AI** | RAG retrieval fails, hallucinations increase                         |
| **Causes**           | model updates, tokenizer changes, quantization, hardware differences |
| **Prevention**       | versioning, re-embedding, model pinning                              |
| **Impact**           | silent failure of semantic search/vector retrieval                   |

