feat(store): make Lance vector dimension configurable, derive from embedder #4

Merged: yhyyz merged 1 commit into ourmem:main from doctatortot:upstream-pr/dynamic-vector-dim on May 18, 2026

@doctatortot
Contributor

Summary

Lifts the hardcoded const VECTOR_DIM: i32 = 1024; in src/store/lancedb.rs so omem-server can be deployed with any embedding model whose dimension is reported by EmbedService::dimensions(). The LanceStore struct carries the dim as a field; schema() and memory_to_batch() become methods that read self.vector_dim. main.rs builds the embedder first, then sizes the store accordingly.

Motivation

The goal is to run omem-server on CPU-only / non-AVX-512 hardware. mxbai-embed-large (1024-dim, 334M-param BERT) takes hundreds of ms per embed on older Xeons (e.g. E5-2697 v2 vintage). Smaller models such as bge-small-en-v1.5 and all-MiniLM-L6-v2 (both 384-dim) are several times faster per call but are currently unusable because the store schema is locked to 1024.

The current behavior also fails badly: any model whose dim != 1024 makes the server panic deep in Arrow on first read/write with "Length of the child array (768) must be the multiple of the value length (1024)". This PR catches the mismatch up front with a clear OmemError.
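
For readers unfamiliar with the Arrow message, the following is a simplified model of the invariant it describes, not Arrow's actual code: a fixed-size-list column with `rows` entries of width `value_len` needs a flat child buffer of exactly `rows * value_len` elements. The function name and error text are hypothetical.

```rust
/// Simplified model of a fixed-size-list layout check (an illustration,
/// not Arrow's implementation): `rows` vectors of width `value_len` need
/// a flat child buffer holding exactly `rows * value_len` elements.
fn check_fixed_size_list(child_len: usize, value_len: usize, rows: usize) -> Result<(), String> {
    if value_len == 0 || child_len != value_len * rows {
        return Err(format!(
            "child length {} does not equal value length {} x rows {}",
            child_len, value_len, rows
        ));
    }
    Ok(())
}
```

With a schema fixed at 1024, a single 768-dim vector gives `child_len = 768`, `value_len = 1024`, `rows = 1`, so the check fails; the same vector passes once the schema width is 768.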

Back-compat

  • LanceStore::new(uri) and StoreManager::new(uri) still default to 1024-dim — existing callers (tests, connectors, downstream code) keep working with zero changes.
  • DEFAULT_VECTOR_DIM is exposed as a public const for tests / mocks.
  • The Bedrock embedder is unchanged (still hardcoded to 1024 since that's Titan's native dim).
  • No DB migration needed for existing 1024-dim deployments.
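
A minimal sketch of the back-compat constructor pattern described above. The struct here is a stand-in with only two fields, and the synchronous, infallible signatures are an assumption; the real constructors in src/store/lancedb.rs may be async and return a Result.

```rust
/// Default dimension kept for back-compat (value taken from the PR text).
pub const DEFAULT_VECTOR_DIM: i32 = 1024;

/// Stand-in for the real store, reduced to the fields this sketch needs.
pub struct LanceStore {
    uri: String,
    vector_dim: i32,
}

impl LanceStore {
    /// Explicit constructor: the caller chooses the vector dimension.
    pub fn with_dim(uri: &str, vector_dim: i32) -> Self {
        LanceStore { uri: uri.to_string(), vector_dim }
    }

    /// Thin wrapper so existing callers keep the 1024-dim behavior.
    pub fn new(uri: &str) -> Self {
        Self::with_dim(uri, DEFAULT_VECTOR_DIM)
    }

    /// Accessor added for this sketch only.
    pub fn vector_dim(&self) -> i32 {
        self.vector_dim
    }

    /// Accessor added for this sketch only.
    pub fn uri(&self) -> &str {
        &self.uri
    }
}
```

Keeping `new` as a one-line delegate means there is a single code path that sets the dimension, so the default cannot drift from the explicit constructor.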

Validation at startup

init_table() now compares the persisted table's vector-column dim against the configured value:

  • Match: proceed normally
  • Mismatch: return OmemError::Storage("vector dim mismatch: table has X but embedder produces Y; wipe the table or switch back to the matching embedding model") so the operator sees the problem immediately instead of a stacktrace in Arrow internals.

memory_to_batch() similarly rejects writes whose vector length doesn't match self.vector_dim, catching dimension drift before it corrupts data.
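
The two checks can be sketched as free functions. This is an illustration under assumptions: the real logic lives inside init_table() and memory_to_batch(), the OmemError enum below is reduced to one variant, the helper names are hypothetical, and the wording of the write-path error is invented for the example (only the startup message is quoted from this PR).

```rust
/// Reduced stand-in for the project's error type.
#[derive(Debug, PartialEq)]
pub enum OmemError {
    Storage(String),
}

/// Startup check: compare the persisted table's vector width with the
/// configured one. Message text follows the PR description.
pub fn check_table_dim(table_dim: i32, configured_dim: i32) -> Result<(), OmemError> {
    if table_dim != configured_dim {
        return Err(OmemError::Storage(format!(
            "vector dim mismatch: table has {} but embedder produces {}; \
             wipe the table or switch back to the matching embedding model",
            table_dim, configured_dim
        )));
    }
    Ok(())
}

/// Write-path check: reject a vector whose length differs from the
/// configured dimension. The message wording here is hypothetical.
pub fn check_vector_len(vector: &[f32], configured_dim: i32) -> Result<(), OmemError> {
    if configured_dim < 0 || vector.len() != configured_dim as usize {
        return Err(OmemError::Storage(format!(
            "vector length {} does not match configured dim {}",
            vector.len(),
            configured_dim
        )));
    }
    Ok(())
}
```

Both checks run before any Arrow array is built, which is what turns the panic into an ordinary error value the caller can report.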

Tests

Three new tests on LanceStore cover the new surface:

  • test_with_dim_stores_correct_dimension — construct at 384, round-trip a 384-dim vector
  • test_with_dim_rejects_wrong_length_vector — feeding a 768-dim vec into a 384-dim store errors cleanly
  • test_init_table_rejects_dim_mismatch_on_reopen — reopening an existing table with a different configured dim errors loudly

All existing store::lancedb::tests::* continue to pass (they go through the default 1024-dim path).

How verified

Built and tested via Forgejo Actions on my own infra (act_runner v6.3.1, stable rustc). The three new tests pass; existing store tests pass.

Note for maintainers: I noticed a few pre-existing test failures on upstream main that are unrelated to this PR. First, rustls 0.23 needs CryptoProvider::install_default() before any test that uses reqwest; without it there are 20 "No provider set" panics across connectors::github::tests, embed::openai_compat::tests, llm::openai_compat::tests, and retrieve::reranker::tests. Second, 5 api::tests space-membership tests assert 200 but get 404. Happy to file those as separate issues if they're news to you.

Checklist

  • Existing tests pass (in the modules touched by this change)
  • New behavior covered by tests
  • Back-compat preserved for LanceStore::new() callers
  • No new dependencies

…bedder

Hardcoded `const VECTOR_DIM: i32 = 1024` in `src/store/lancedb.rs`
forced every deployment to use a 1024-dim embedding model regardless
of what the configured embedder actually produced. Swapping to
`nomic-embed-text` (768) or any of the 384-dim sentence-transformers
panics deep in Arrow on the first read/write with:

    Length of the child array (768) must be the multiple of
    the value length (1024) and the array length (1).

Lift the dimension to a `LanceStore` field, derived from the active
`EmbedService::dimensions()` at startup. Existing behavior preserved
when callers use `LanceStore::new(uri)` — defaults to 1024.

What changes
- `LanceStore::with_dim(uri, vector_dim)` — new explicit constructor
- `LanceStore::new(uri)` retained as a thin wrapper that defaults to
  `DEFAULT_VECTOR_DIM` (1024) for back-compat with existing callers
- `schema()` and `memory_to_batch()` become methods (`&self`) reading
  `self.vector_dim` instead of the const
- `StoreManager::with_vector_dim(uri, dim)` plumbs the dim from
  caller through to each per-tenant LanceStore
- `main.rs` builds the embedder first, then sizes the store from
  `embed.dimensions()`
- `init_table()` validates the persisted table's vector-column dim
  against the configured value and returns a clear `OmemError::Storage`
  instead of letting Arrow panic on first read
- `memory_to_batch()` rejects writes whose vector length doesn't match
  `self.vector_dim` (catches dimension drift before it corrupts data)
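
The startup ordering in the list above can be sketched as follows. Only the names `EmbedService::dimensions()` and `StoreManager::with_vector_dim(uri, dim)` come from this commit; the trait's return type, the struct fields, `SmallEmbedder`, and `build_store` are assumptions made for illustration.

```rust
/// Assumed shape of the embedder trait; the commit only confirms that a
/// `dimensions()` method exists.
pub trait EmbedService {
    fn dimensions(&self) -> usize;
}

/// Hypothetical embedder standing in for a 384-dim model.
pub struct SmallEmbedder;

impl EmbedService for SmallEmbedder {
    fn dimensions(&self) -> usize {
        384
    }
}

/// Stand-in for the real manager, reduced to what the sketch needs.
pub struct StoreManager {
    uri: String,
    vector_dim: i32,
}

impl StoreManager {
    pub fn with_vector_dim(uri: &str, vector_dim: i32) -> Self {
        StoreManager { uri: uri.to_string(), vector_dim }
    }

    /// Accessor added for this sketch only.
    pub fn vector_dim(&self) -> i32 {
        self.vector_dim
    }

    /// Accessor added for this sketch only.
    pub fn uri(&self) -> &str {
        &self.uri
    }
}

/// Hypothetical helper: build the embedder first, then size the store
/// from whatever dimension it reports.
pub fn build_store(embed: &dyn EmbedService, uri: &str) -> Result<StoreManager, String> {
    let dim = embed.dimensions();
    // Guard the usize -> i32 conversion instead of truncating silently.
    if dim == 0 || dim > i32::MAX as usize {
        return Err(format!("unsupported embedding dimension: {}", dim));
    }
    Ok(StoreManager::with_vector_dim(uri, dim as i32))
}
```

Deriving the dimension from the embedder, rather than configuring it separately, leaves one source of truth for the vector width.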

Three new tests on `LanceStore`:
- `test_with_dim_stores_correct_dimension` — construct at 384,
  round-trip a 384-dim vector
- `test_with_dim_rejects_wrong_length_vector` — feeding a 768-dim
  vec into a 384-dim store errors cleanly
- `test_init_table_rejects_dim_mismatch_on_reopen` — reopening an
  existing table with a different configured dim errors loudly

Motivation
Running omem-server on CPU-constrained / non-AVX-512 hardware.
`mxbai-embed-large` (1024-dim, BERT-large, 334M params) takes
hundreds of ms per embed on older Xeons. Smaller models like
`bge-small-en-v1.5` and `all-MiniLM-L6-v2` (both 384-dim) are
several times faster but currently unusable because the store
schema is locked to 1024.

`DEFAULT_VECTOR_DIM` is exposed as a public const so existing
tests + the noop embedder keep their 1024-dim wiring without any
change.
@doctatortot
Contributor Author

Follow-up PR #5 (feat(embed): configurable embedding dim + ollama timeout via env) makes the openai-compat embedder actually report a non-1024 dim, so this PR's schema-flexible store can be used end-to-end on the openai-compat path. They can land in either order — both are back-compat with the existing defaults.

@yhyyz
Contributor

yhyyz commented May 18, 2026

Thanks for the excellent contribution @doctatortot! 🎉

The code quality here is impressive — clean design, proper back-compat via LanceStore::new() defaulting to 1024, clear error messages on dim mismatch, and solid test coverage. The motivation is spot-on: smaller embedding models (bge-small, all-MiniLM) are much faster on CPU-only hardware, and the previous hard-panic on dim mismatch was a terrible DX.

This has been merged and deployed to production (api.ourmem.ai). Looking forward to more contributions!


2 participants