feat(embed): configurable embedding dim + ollama timeout via env by doctatortot · Pull Request #5 · ourmem/omem

doctatortot · 2026-05-17T09:58:42Z

Summary

Makes two values in the openai-compat embedder configurable instead of hardcoded:

OMEM_EMBED_DIM (default 1024) — drives OpenAICompatEmbedder::dims / EmbedService::dimensions()
OMEM_EMBED_TIMEOUT_SECS (default 10) — drives the reqwest::Client timeout for the embed call

Both default to the existing hardcoded values when unset or zero, so behavior for any existing deployment is unchanged.
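The fallback rule above can be sketched with the standard library alone. This is a minimal illustration, not the code from this PR: `parse_or_default`, `env_or_default`, `embed_dim`, and `embed_timeout` are hypothetical helper names, and treating unparsable input the same as unset is an assumption the description does not state.

```rust
use std::env;
use std::time::Duration;

// Defaults quoted in the PR description.
const DEFAULT_EMBED_DIM: u64 = 1024;
const DEFAULT_EMBED_TIMEOUT_SECS: u64 = 10;

/// Returns `default` when the raw value is unset or zero.
/// Assumption: unparsable input also falls back to `default`.
fn parse_or_default(raw: Option<&str>, default: u64) -> u64 {
    match raw.and_then(|s| s.trim().parse::<u64>().ok()) {
        Some(0) | None => default,
        Some(n) => n,
    }
}

/// Hypothetical helper: read `key` from the process environment.
fn env_or_default(key: &str, default: u64) -> u64 {
    parse_or_default(env::var(key).ok().as_deref(), default)
}

/// Embedding dimension, driven by OMEM_EMBED_DIM.
fn embed_dim() -> usize {
    env_or_default("OMEM_EMBED_DIM", DEFAULT_EMBED_DIM) as usize
}

/// Embed-call timeout, driven by OMEM_EMBED_TIMEOUT_SECS.
fn embed_timeout() -> Duration {
    Duration::from_secs(env_or_default(
        "OMEM_EMBED_TIMEOUT_SECS",
        DEFAULT_EMBED_TIMEOUT_SECS,
    ))
}
```

Keeping the string parsing separate from the environment lookup makes the zero and unset cases easy to test without mutating process state.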

Motivation

Two problems running omem-server with non-default models or on slower hardware:

Hardcoded dims: 1024 at OpenAICompatEmbedder::new() means any model whose native dim != 1024 produces vectors the Lance store can't store. Affects nomic-embed-text (768), bge-small-en-v1.5 (384), all-MiniLM-L6-v2 (384), and others — all of these otherwise fail at first write with an Arrow length-mismatch panic.
Hardcoded 10s reqwest timeout kills inference on CPU-only / older hardware. A single embed call against mxbai-embed-large (334M-param BERT) on a 2013 Xeon E5-2697 v2 routinely takes 11-15s under back-to-back load. omem-server returns 500 to the caller; ollama keeps computing and discards the eventual result. Lots of wasted CPU and no writes land.

Validated end-to-end

On a 2013 Xeon E5-2697 v2 (no AVX2/AVX-512), running ollama with a custom all-minilm-512 model (384-dim, num_ctx=512):

Embed call latency: 11-15s (mxbai-embed-large @ 1024) → 200-500ms (all-minilm @ 384)
Memory migration of 185 chunks: previously failed at a 41-67% rate (timeouts), now completes with zero failures

The migration script that exercised this simply issues one POST /v1/memories per chunk, back-to-back with 6s spacing.

Relationship to the schema-flexible dim PR

That PR (still open) makes LanceStore's vector column accept any dim from the embedder. This PR lets the openai-compat embedder actually report a non-1024 dim. Together they unlock smaller/faster embedding models on the openai-compat path. They can land in either order; behavior is back-compat in both directions.
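How the two changes fit together can be sketched as a store that takes its vector width from whatever the embedder reports. Everything here is illustrative: `ReportsDim`, `FixedDimEmbedder`, and `SketchStore` are hypothetical stand-ins, not the actual `EmbedService` or `LanceStore` types, and the length check only approximates the fixed-width constraint described above.

```rust
/// Hypothetical stand-in for the embedder side: it only reports its dimension.
trait ReportsDim {
    fn dimensions(&self) -> usize;
}

/// An embedder whose dimension comes from configuration rather than a constant.
struct FixedDimEmbedder {
    dims: usize,
}

impl ReportsDim for FixedDimEmbedder {
    fn dimensions(&self) -> usize {
        self.dims
    }
}

/// Hypothetical store with a fixed-width vector column.
struct SketchStore {
    dim: usize,
    rows: Vec<Vec<f32>>,
}

impl SketchStore {
    /// Derive the column width from the embedder instead of hardcoding 1024.
    fn for_embedder(embedder: &dyn ReportsDim) -> Self {
        Self { dim: embedder.dimensions(), rows: Vec::new() }
    }

    /// Reject vectors whose length does not match the column width.
    fn insert(&mut self, vector: Vec<f32>) -> Result<(), String> {
        if vector.len() != self.dim {
            return Err(format!(
                "vector length {} does not match store dim {}",
                vector.len(),
                self.dim
            ));
        }
        self.rows.push(vector);
        Ok(())
    }
}
```

With a 384-dim embedder, a store built this way accepts 384-element vectors and rejects 1024-element ones, which is the mismatch the two PRs are meant to remove.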

Back-compat

OmemConfig::default() keeps the existing 1024-dim / 10s-timeout values.
Existing tests pass unchanged.
Zero values for either env var fall back to the defaults (guards against misconfig producing zero-length vectors or zero-timeout reqwest behavior).

Tests

Three new tests on OpenAICompatEmbedder::new:

embed_dim_from_config_overrides_default — confirms dims() returns the configured value
embed_dim_zero_falls_back_to_1024 — guards misconfig
embed_timeout_zero_falls_back_to_default — guards misconfig
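The shape of these tests might look like the sketch below. It uses hypothetical `SketchConfig` and `SketchEmbedder` types in place of `OmemConfig` and `OpenAICompatEmbedder`, whose real fields and constructor signature are not shown in this PR description. Only the test names and the 1024 / 10s defaults come from the text above.

```rust
use std::time::Duration;

/// Hypothetical stand-in for the two new config fields.
#[derive(Clone, Debug)]
struct SketchConfig {
    embed_dim: usize,
    embed_timeout_secs: u64,
}

impl Default for SketchConfig {
    fn default() -> Self {
        // Defaults quoted in the PR description.
        Self { embed_dim: 1024, embed_timeout_secs: 10 }
    }
}

/// Hypothetical stand-in for the embedder.
struct SketchEmbedder {
    dims: usize,
    timeout: Duration,
}

impl SketchEmbedder {
    /// Zero values fall back to the defaults.
    fn new(cfg: &SketchConfig) -> Self {
        let d = SketchConfig::default();
        let dims = if cfg.embed_dim == 0 { d.embed_dim } else { cfg.embed_dim };
        let secs = if cfg.embed_timeout_secs == 0 {
            d.embed_timeout_secs
        } else {
            cfg.embed_timeout_secs
        };
        Self { dims, timeout: Duration::from_secs(secs) }
    }

    fn dims(&self) -> usize {
        self.dims
    }

    fn timeout(&self) -> Duration {
        self.timeout
    }
}

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn embed_dim_from_config_overrides_default() {
        let cfg = SketchConfig { embed_dim: 384, ..SketchConfig::default() };
        assert_eq!(SketchEmbedder::new(&cfg).dims(), 384);
    }

    #[test]
    fn embed_dim_zero_falls_back_to_1024() {
        let cfg = SketchConfig { embed_dim: 0, ..SketchConfig::default() };
        assert_eq!(SketchEmbedder::new(&cfg).dims(), 1024);
    }

    #[test]
    fn embed_timeout_zero_falls_back_to_default() {
        let cfg = SketchConfig { embed_timeout_secs: 0, ..SketchConfig::default() };
        assert_eq!(SketchEmbedder::new(&cfg).timeout(), Duration::from_secs(10));
    }
}
```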

Checklist

Existing openai_compat tests still pass
New behavior covered
Back-compat preserved (OmemConfig::default() unchanged, zero values fall back)
No new dependencies

The openai-compatible embedder hardcodes `dims: 1024` at construction and uses a hardcoded `Duration::from_secs(10)` reqwest timeout.

Two problems this surfaces in practice:

1. Any embed model whose native dim != 1024 produces vectors the Lance store can't hold. `nomic-embed-text` (768), `bge-small` (384), `all-MiniLM-L6-v2` (384) all break currently.
2. On CPU-only / older hardware, a single embed call against a large model (e.g. mxbai-embed-large, 334M params) routinely takes 11-15s, blowing the 10s timeout. omem-server returns 500; ollama keeps computing and discards the result. Useless work, no successful writes.

Add two `OmemConfig` fields populated from env vars:

- `OMEM_EMBED_DIM` (default 1024): passed into `OpenAICompatEmbedder::dims`, returned by `EmbedService::dimensions()`
- `OMEM_EMBED_TIMEOUT_SECS` (default 10): used to build the `reqwest::Client` timeout

Both fall back to the existing defaults when unset or zero, so behavior for existing deployments is unchanged.

Validated end-to-end on a 2013-vintage Xeon E5-2697 v2 host running ollama with `all-minilm-512` (384-dim, num_ctx=512). Embed calls dropped from 11-15s on mxbai-embed-large (1024) to 200-500ms, allowing a 185-chunk memory migration to complete with zero failures where the previous configuration failed at 41-67%.

Three new openai_compat tests cover:

- `embed_dim_from_config_overrides_default`: dims() returns the configured value
- `embed_dim_zero_falls_back_to_1024`: guards against misconfig producing zero-length vectors
- `embed_timeout_zero_falls_back_to_default`: guards against zero-timeout reqwest behavior

This complements PR (schema-flexible vector dim): that PR makes the Lance store accept any dim; this one lets the active embedder report it. Together they make non-1024 embedding models actually usable on the openai-compat path.

doctatortot · 2026-05-17T09:59:54Z

Companion to #4 (schema-flexible LanceStore vector dim). #4 makes the store accept any dim from the embedder; this PR lets the embedder produce one. Each is back-compat with existing defaults so they can land in either order.

yhyyz · 2026-05-18T07:27:35Z

Merged and deployed! Thanks @doctatortot 🙏

This pairs perfectly with #4 — together they unlock end-to-end support for smaller/faster embedding models on the openai-compatible path. The zero-value fallback guards are a nice defensive touch, and the timeout fix is a real pain-point solver for CPU-only / Ollama setups.

Both PRs are now live on api.ourmem.ai. Great work!

doctatortot mentioned this pull request May 17, 2026

feat(store): make Lance vector dimension configurable, derive from embedder #4 (Merged)

yhyyz merged commit 4677997 into ourmem:main May 18, 2026

yhyyz merged 1 commit into ourmem:main from doctatortot:upstream-pr/configurable-embed-dim-and-timeout
