Why is cosine similarity failing in modern embedding systems (RAG, LLMs, search)? #45793
ahsanshaokat
started this conversation in
Ideas & Feature requests
Replies: 1 comment
- "Don't think this really works, because the pretrained model already handles the difference."
Cosine similarity has been the default similarity metric for almost 20 years.
And yes — it worked beautifully back when embeddings were:
• small (300 dimensions)
• from clean text
• from one domain
• short and consistent in length
But 2025 embeddings are totally different.
They are:
• multi-domain
• noisy
• multi-scale (10 words → 400 words)
• multi-modal
• uneven in magnitude
• generated by different models
Cosine similarity was never designed for this world.
❌ The Core Problem
Cosine similarity ignores magnitude completely.
It throws away information about:
• whether a chunk is long or short
• whether a vector is confident or noisy
• whether a document is rich or empty
Cosine only cares about the direction of vectors.
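This property is easy to demonstrate with toy vectors in plain NumPy (no particular embedding model assumed): scaling a vector by any positive constant changes its magnitude but not its cosine similarity.

```python
import numpy as np

def cosine(a, b):
    # Cosine similarity: dot product divided by the product of L2 norms,
    # i.e. the dot product of the two direction (unit) vectors.
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

v = np.array([0.3, 1.2, -0.5])

# All three pairs share a direction, so cosine cannot tell them apart,
# no matter how different their magnitudes are.
print(cosine(v, v))        # → 1.0 (up to float rounding)
print(cosine(v, 10 * v))   # → 1.0 as well
print(cosine(v, 0.1 * v))  # → 1.0 as well
```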
This is why cosine behaves badly in Retrieval-Augmented Generation (RAG):
Example
Query: “What is the meaning of balance in the Quran?”
Chunk A (long, meaningful paragraph):
“The Quran emphasizes balance (Mizan) as a universal moral principle…”
Chunk B (short, noisy):
“Mizan = balance.”
Cosine says: “Both point the same way → same similarity!”
So the system often picks the noisy chunk and ignores the good one.
This leads to hallucination, unstable retrieval, and wrong context selection.
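The failure mode above can be sketched with synthetic vectors (these are hand-picked stand-ins, not outputs of any real embedding model): both chunks point in almost the same direction as the query, so their cosine scores are nearly identical even though their magnitudes differ by a factor of eight.

```python
import numpy as np

def cosine(a, b):
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Synthetic stand-ins for embeddings (NOT real model outputs): the query
# and both chunks point in nearly the same direction, but the long
# paragraph carries far more magnitude than the two-word gloss.
query   = np.array([1.00, 0.50])
chunk_a = 16.0 * np.array([0.98, 0.52])  # long, meaningful paragraph
chunk_b =  2.0 * np.array([1.00, 0.49])  # short, noisy "Mizan = balance."

print(cosine(query, chunk_a))
print(cosine(query, chunk_b))
# Both scores come out above 0.999, so a top-1 retriever can pick
# either chunk; magnitude played no role in the ranking.
```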
❌ Multi-Scale Collapse
Real embedding magnitudes vary widely:
• short chunk → 3.1
• long paragraph → 16.4
• OCR text → 0.8
• technical explanation → 22.7
Cosine erases this information.
The result:
• short noisy text wins over long meaningful text
• RAG quality drops
• retrieval becomes unstable
• cross-domain systems fail
This is the hidden crisis of similarity in modern AI.
✔ The Solution: The Mizan Balance Function
Instead of asking:
“Do these vectors point in the same direction?”
Mizan asks:
“Are these vectors balanced relative to their scale?”
It measures:
• direction
• proportional magnitude
• relative confidence
• balance
Mizan fixes cosine's biggest blind spot.
Short noisy vectors no longer outrank long informative vectors.
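The post does not spell out Mizan's actual formula, so the following is only a hypothetical sketch of a magnitude-aware similarity in this spirit: cosine similarity damped by a balance factor that equals 1 when the two norms match and shrinks as they diverge. It will not reproduce the exact numbers quoted elsewhere in this post.

```python
import numpy as np

def cosine(a, b):
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def balanced_similarity(a, b):
    """Hypothetical magnitude-aware similarity (NOT the official Mizan
    formula, which the post does not define): cosine similarity times a
    magnitude-balance factor 2*|a|*|b| / (|a|^2 + |b|^2), which is 1 when
    the norms are equal and tends to 0 as they diverge."""
    na, nb = np.linalg.norm(a), np.linalg.norm(b)
    balance = 2 * na * nb / (na**2 + nb**2)
    return cosine(a, b) * balance

a = np.array([10.0, 0.0])
b = np.array([10.0, 0.0])
c = np.array([2.0, 0.0])

print(balanced_similarity(a, b))  # → 1.0: same direction, same magnitude
print(balanced_similarity(a, c))  # ≈ 0.38: same direction, unbalanced magnitude
```

Under this formulation, two vectors must agree in both direction and scale to score highly, which is the behavior the post attributes to Mizan.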
✔ Real Example
Suppose three vectors point in nearly the same direction, with magnitudes:
• ‖A‖ = 10
• ‖B‖ = 10
• ‖C‖ = 2
Cosine:
cos(A, B) = 0.98
cos(A, C) = 0.97 → almost identical
Mizan:
M(A, B) ≈ 0.97
M(A, C) ≈ 0.61 → correctly penalized
This is exactly what RAG systems need.
✔ When to switch to Mizan
Use Mizan if your system contains:
✔ variable-length text
✔ OCR / noisy data
✔ multi-domain mixed corpora
✔ multi-model embeddings
✔ lengthy documents
✔ paragraph + sentence mixtures
✔ hallucination issues in RAG
Cosine is fine only for academic datasets and clean single-domain text.
✔ Final takeaway
Cosine was the right tool for 2015.
It is the wrong tool for 2025.
Mizan restores:
• scale awareness
• proportional balance
• retrieval stability
• semantic fairness
Cosine measures direction.
Mizan measures meaning.
This shift is essential for next-generation AI search and retrieval.