feat(store): make Lance vector dimension configurable, derive from embedder by doctatortot · Pull Request #4 · ourmem/omem

doctatortot · 2026-05-16T23:35:56Z

Summary

Lifts the hardcoded const VECTOR_DIM: i32 = 1024; in src/store/lancedb.rs so omem-server can be deployed with any embedding model whose dimension is reported by EmbedService::dimensions(). The LanceStore struct carries the dim as a field; schema() and memory_to_batch() become methods that read self.vector_dim. main.rs builds the embedder first, then sizes the store accordingly.
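
The constructor split and startup ordering described above can be sketched roughly as follows. This is a minimal, std-only illustration, not the code in the PR: `Store`, `Embedder`, `FixedDimEmbedder`, and `build_store` are hypothetical stand-ins for `LanceStore`, `EmbedService`, and the wiring in `main.rs`, whose real signatures may differ.

```rust
/// Back-compat default, matching the 1024 described in this PR.
pub const DEFAULT_VECTOR_DIM: i32 = 1024;

/// Hypothetical stand-in for the embedder interface (assumption, not the real trait).
pub trait Embedder {
    fn dimensions(&self) -> usize;
}

/// Toy embedder that only reports a fixed dimension.
pub struct FixedDimEmbedder(pub usize);

impl Embedder for FixedDimEmbedder {
    fn dimensions(&self) -> usize {
        self.0
    }
}

/// Simplified stand-in for the store: it carries the dimension as a field.
#[derive(Debug, Clone, PartialEq)]
pub struct Store {
    pub uri: String,
    pub vector_dim: i32,
}

impl Store {
    /// Thin wrapper that keeps the old 1024-dim behaviour for existing callers.
    pub fn new(uri: &str) -> Self {
        Self::with_dim(uri, DEFAULT_VECTOR_DIM)
    }

    /// Explicit constructor for any other dimension.
    pub fn with_dim(uri: &str, vector_dim: i32) -> Self {
        Self { uri: uri.to_string(), vector_dim }
    }
}

/// Mirrors the startup ordering: the embedder exists first, the store is sized from it.
pub fn build_store(uri: &str, embed: &dyn Embedder) -> Result<Store, String> {
    let dim = embed.dimensions();
    if dim > i32::MAX as usize {
        return Err(format!("embedding dimension {} does not fit in i32", dim));
    }
    Ok(Store::with_dim(uri, dim as i32))
}
```

Under these assumptions, `Store::new("...")` still yields a 1024-dim store, while `build_store("...", &FixedDimEmbedder(384))` yields a 384-dim one.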

Motivation

Running omem-server on CPU-only / non-AVX-512 hardware. mxbai-embed-large (1024-dim, 334M-param BERT) takes hundreds of ms per embed on older Xeons (e.g. E5-2697 v2 vintage). Smaller models — bge-small-en-v1.5, all-MiniLM-L6-v2 (both 384-dim) — are several times faster per call but currently unusable because the store schema is locked to 1024.

The current failure mode is also ugly: any model whose dim != 1024 makes the server panic deep in Arrow on first read/write with "Length of the child array (768) must be the multiple of the value length (1024)". This PR catches the mismatch up front with a clear OmemError.
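
To make the arithmetic behind that panic concrete, here is a simplified model of a fixed-size-list length check. It is an illustration only and not Arrow's actual implementation; the function name `check_fixed_size_list` is hypothetical.

```rust
/// Simplified model of a fixed-size-list invariant: the flat child buffer must
/// hold a whole number of `value_len`-sized rows. Returns the row count.
/// Illustration only; this is not Arrow's code.
pub fn check_fixed_size_list(child_len: usize, value_len: usize) -> Result<usize, String> {
    if value_len == 0 || child_len % value_len != 0 {
        return Err(format!(
            "child length {} is not a multiple of value length {}",
            child_len, value_len
        ));
    }
    Ok(child_len / value_len)
}
```

A single 768-float embedding written into a 1024-wide column gives `768 % 1024 == 768`, so the check fails; one 1024-float embedding passes and counts as one row.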

Back-compat

- LanceStore::new(uri) and StoreManager::new(uri) still default to 1024-dim, so existing callers (tests, connectors, downstream code) keep working with zero changes.
- DEFAULT_VECTOR_DIM is exposed as a public const for tests / mocks.
- The Bedrock embedder is unchanged (still hardcoded to 1024 since that's Titan's native dim).
- No DB migration needed for existing 1024-dim deployments.

Validation at startup

init_table() now compares the persisted table's vector-column dim against the configured value:

- Match: proceed normally.
- Mismatch: return OmemError::Storage("vector dim mismatch: table has X but embedder produces Y; wipe the table or switch back to the matching embedding model") so the operator sees the problem immediately instead of a stacktrace in Arrow internals.

memory_to_batch() similarly rejects writes whose vector length doesn't match self.vector_dim, catching dimension drift before it corrupts data.
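
The two checks can be sketched as plain functions. This is a hedged, std-only approximation: `StoreError`, `check_table_dim`, and `check_vector_len` are hypothetical names standing in for `OmemError` and the logic inside `init_table()` and `memory_to_batch()`. The first error string follows the message quoted above; the second is an assumption, since the PR description does not quote it.

```rust
/// Hypothetical stand-in for OmemError; only the Storage variant is modelled.
#[derive(Debug, PartialEq)]
pub enum StoreError {
    Storage(String),
}

/// Startup check: the persisted table's vector width must equal the configured one.
pub fn check_table_dim(persisted: i32, configured: i32) -> Result<(), StoreError> {
    if persisted == configured {
        return Ok(());
    }
    Err(StoreError::Storage(format!(
        "vector dim mismatch: table has {} but embedder produces {}; \
         wipe the table or switch back to the matching embedding model",
        persisted, configured
    )))
}

/// Write-time check: reject any vector whose length differs from the configured dim.
/// The error wording here is an assumption, not taken from the PR.
pub fn check_vector_len(vector: &[f32], configured: i32) -> Result<(), StoreError> {
    if configured >= 0 && vector.len() == configured as usize {
        return Ok(());
    }
    Err(StoreError::Storage(format!(
        "vector length {} does not match configured dim {}",
        vector.len(),
        configured
    )))
}
```

Returning an error from both paths, rather than letting the Arrow layer panic, means the operator sees the mismatch at startup or at the first bad write.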

Tests

Three new tests on LanceStore cover the new surface:

- test_with_dim_stores_correct_dimension: construct at 384, round-trip a 384-dim vector
- test_with_dim_rejects_wrong_length_vector: feeding a 768-dim vec into a 384-dim store errors cleanly
- test_init_table_rejects_dim_mismatch_on_reopen: reopening an existing table with a different configured dim errors loudly

All existing store::lancedb::tests::* continue to pass (they go through the default 1024-dim path).

How verified

Built and tested via Forgejo Actions on my own infra (act_runner v6.3.1, stable rustc). The three new tests pass; existing store tests pass.

Note for maintainers: I noticed a few pre-existing test failures on upstream main unrelated to this PR — rustls 0.23 needs CryptoProvider::install_default() before any test that uses reqwest (20 No provider set panics in connectors::github::tests, embed::openai_compat::tests, llm::openai_compat::tests, retrieve::reranker::tests); plus 5 api::tests space-membership tests asserting 200 but getting 404. Happy to file those as separate issues if they're news.

Checklist

Existing tests pass (in the modules touched by this change)
New behavior covered by tests
Back-compat preserved for LanceStore::new() callers
No new dependencies

…bedder

Hardcoded `const VECTOR_DIM: i32 = 1024` in `src/store/lancedb.rs` forced every deployment to use a 1024-dim embedding model regardless of what the configured embedder actually produced. Swapping to `nomic-embed-text` (768) or any of the 384-dim sentence-transformers panics deep in Arrow on the first read/write with:

Length of the child array (768) must be the multiple of the value length (1024) and the array length (1).

Lift the dimension to a `LanceStore` field, derived from the active `EmbedService::dimensions()` at startup. Existing behavior is preserved when callers use `LanceStore::new(uri)`, which defaults to 1024.

What changes

- `LanceStore::with_dim(uri, vector_dim)`: new explicit constructor
- `LanceStore::new(uri)` retained as a thin wrapper that defaults to `DEFAULT_VECTOR_DIM` (1024) for back-compat with existing callers
- `schema()` and `memory_to_batch()` become methods (`&self`) reading `self.vector_dim` instead of the const
- `StoreManager::with_vector_dim(uri, dim)` plumbs the dim from caller through to each per-tenant LanceStore
- `main.rs` builds the embedder first, then sizes the store from `embed.dimensions()`
- `init_table()` validates the persisted table's vector-column dim against the configured value and returns a clear `OmemError::Storage` instead of letting Arrow panic on first read
- `memory_to_batch()` rejects writes whose vector length doesn't match `self.vector_dim` (catches dimension drift before it corrupts data)

Three new tests on `LanceStore`:

- `test_with_dim_stores_correct_dimension`: construct at 384, round-trip a 384-dim vector
- `test_with_dim_rejects_wrong_length_vector`: feeding a 768-dim vec into a 384-dim store errors cleanly
- `test_init_table_rejects_dim_mismatch_on_reopen`: reopening an existing table with a different configured dim errors loudly

Motivation

Running omem-server on CPU-constrained / non-AVX-512 hardware. `mxbai-embed-large` (1024-dim, BERT-large, 334M params) takes hundreds of ms per embed on older Xeons. Smaller models like `bge-small-en-v1.5` and `all-MiniLM-L6-v2` (both 384-dim) are several times faster but currently unusable because the store schema is locked to 1024.

`DEFAULT_VECTOR_DIM` is exposed as a public const so existing tests + the noop embedder keep their 1024-dim wiring without any change.

doctatortot · 2026-05-17T09:59:54Z

Follow-up PR #5 (feat(embed): configurable embedding dim + ollama timeout via env) makes the openai-compat embedder actually report a non-1024 dim, so this PR's schema-flexible store can be used end-to-end on the openai-compat path. They can land in either order — both are back-compat with the existing defaults.

yhyyz · 2026-05-18T07:27:31Z

Thanks for the excellent contribution @doctatortot! 🎉

The code quality here is impressive — clean design, proper back-compat via LanceStore::new() defaulting to 1024, clear error messages on dim mismatch, and solid test coverage. The motivation is spot-on: smaller embedding models (bge-small, all-MiniLM) are much faster on CPU-only hardware, and the previous hard-panic on dim mismatch was a terrible DX.

This has been merged and deployed to production (api.ourmem.ai). Looking forward to more contributions!

doctatortot mentioned this pull request May 17, 2026

feat(embed): configurable embedding dim + ollama timeout via env #5

Merged

4 tasks

yhyyz merged commit bbbbaee into ourmem:main May 18, 2026

doctatortot mentioned this pull request May 18, 2026

test(api): install rustls CryptoProvider in setup_app #6

Closed

yhyyz merged 1 commit into ourmem:main from doctatortot:upstream-pr/dynamic-vector-dim
