Add MNEMON_EMBED_DIMENSIONS for Matryoshka dimension truncation #5

Merged: Grivn merged 1 commit into mnemon-dev:master from achinth-b:achinth/matryoshka-dimensions on May 3, 2026


@achinth-b
Contributor

Closes #4

Support Matryoshka Representation Learning by passing the dimensions parameter to Ollama's /api/embed endpoint. When MNEMON_EMBED_DIMENSIONS is set (e.g., 256), Ollama truncates and re-normalizes the embedding vector, giving faster similarity search with minimal quality loss on MRL-trained models like nomic-embed-text.

Backward compatible: when unset, behavior is identical to before.

What

  • Added dims field to the Client struct (0 = use native dimensions)
  • NewClient() reads the MNEMON_EMBED_DIMENSIONS env var (parsed as int, ignores invalid/negative values)
  • Added Dimensions field to embedRequest with json:"dimensions,omitempty" so it's only included when set
  • Embed() conditionally populates req.Dimensions when dims > 0
  • Updated CHANGELOG.md under [Unreleased]
  • Updated docs/USAGE.md Configuration table with the new env var

Why

Issue #4 describes the problem.

Matryoshka-trained models encode the most important semantic signal in the first N dimensions. Using 256 dims instead of 768 retains roughly 95% of retrieval quality while needing one third of the storage and making cosine similarity faster. This matters most for large knowledge bases, where similarity search dominates recall latency.

Verified locally

Tested against Ollama nomic-embed-text on macOS:

# Without dimensions → 768-dim vector
curl -s http://localhost:11434/api/embed \
  -d '{"model":"nomic-embed-text","input":"test query"}' 
# → Vector length: 768

# With dimensions: 256 → 256-dim vector
curl -s http://localhost:11434/api/embed \
  -d '{"model":"nomic-embed-text","input":"test query","dimensions":256}'
# → Vector length: 256

Warning

Re-indexing caveat
If users change dimensions on an existing store, old embeddings (768-dim) and new embeddings (256-dim) will have incompatible vector sizes. For this PR, the env var only affects newly generated embeddings. A follow-up could add a mnemon embed --reindex command.

@Grivn
Member

Grivn commented May 3, 2026

LGTM. Thanks for the clean, focused change.

Grivn merged commit 29e0cc2 into mnemon-dev:master on May 3, 2026
Development

Successfully merging this pull request may close these issues.

Support Matryoshka dimension truncation for Ollama embeddings
