feat(store): make Lance vector dimension configurable, derive from embedder#4
…bedder
Hardcoded `const VECTOR_DIM: i32 = 1024` in `src/store/lancedb.rs`
forced every deployment to use a 1024-dim embedding model regardless
of what the configured embedder actually produced. Swapping to
`nomic-embed-text` (768) or any of the 384-dim sentence-transformers
panics deep in Arrow on the first read/write with:
Length of the child array (768) must be the multiple of
the value length (1024) and the array length (1).
Lift the dimension to a `LanceStore` field, derived from the active
`EmbedService::dimensions()` at startup. Existing behavior preserved
when callers use `LanceStore::new(uri)` — defaults to 1024.
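The Arrow error quoted above reads as an arithmetic check on the fixed-size-list column: the flat child buffer must hold exactly `value length × array length` floats. A minimal sketch of that check follows. It is an assumption drawn from the wording of the message, not a copy of Arrow's code, and `child_len_is_valid` is a hypothetical helper name.

```rust
/// Returns true when a flat child buffer of `child_len` elements can back
/// a fixed-size-list column of `array_len` rows, each `value_len` wide.
/// (Hypothetical helper; mirrors the invariant the error message describes.)
pub fn child_len_is_valid(child_len: usize, value_len: usize, array_len: usize) -> bool {
    // checked_mul avoids overflow on absurd sizes; None never equals Some(_).
    value_len.checked_mul(array_len) == Some(child_len)
}
```

With one 768-dim embedding and a schema fixed at 1024, `child_len_is_valid(768, 1024, 1)` is false, which matches the numbers in the panic message. Sizing the schema from the embedder (768) makes the same call succeed.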
What changes
- `LanceStore::with_dim(uri, vector_dim)` — new explicit constructor
- `LanceStore::new(uri)` retained as a thin wrapper that defaults to
`DEFAULT_VECTOR_DIM` (1024) for back-compat with existing callers
- `schema()` and `memory_to_batch()` become methods (`&self`) reading
`self.vector_dim` instead of the const
- `StoreManager::with_vector_dim(uri, dim)` plumbs the dim from
caller through to each per-tenant LanceStore
- `main.rs` builds the embedder first, then sizes the store from
`embed.dimensions()`
- `init_table()` validates the persisted table's vector-column dim
against the configured value and returns a clear `OmemError::Storage`
instead of letting Arrow panic on first read
- `memory_to_batch()` rejects writes whose vector length doesn't match
`self.vector_dim` (catches dimension drift before it corrupts data)
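The constructor and write-guard changes listed above can be sketched without any LanceDB or Arrow dependency. This is a simplified stand-in, not the code in `src/store/lancedb.rs`: `StoreError`, `check_vector`, `FixedDimEmbedder`, `store_for`, and the `i32` return type of `dimensions()` are all assumptions made for illustration.

```rust
/// Default kept for back-compat (value taken from the commit message).
pub const DEFAULT_VECTOR_DIM: i32 = 1024;

/// Hypothetical error type standing in for `OmemError::Storage`.
#[derive(Debug, PartialEq)]
pub enum StoreError {
    Storage(String),
}

/// Hypothetical, trimmed-down embedder trait; the real signature may differ.
pub trait EmbedService {
    fn dimensions(&self) -> i32;
}

/// Stand-in embedder that only reports a fixed dimension.
pub struct FixedDimEmbedder(pub i32);

impl EmbedService for FixedDimEmbedder {
    fn dimensions(&self) -> i32 {
        self.0
    }
}

/// Simplified store: only the fields this sketch needs.
#[derive(Debug)]
pub struct LanceStore {
    uri: String,
    vector_dim: i32,
}

impl LanceStore {
    /// Back-compat constructor: thin wrapper that defaults to 1024.
    pub fn new(uri: &str) -> Self {
        Self::with_dim(uri, DEFAULT_VECTOR_DIM)
    }

    /// Explicit constructor taking the dimension to use.
    pub fn with_dim(uri: &str, vector_dim: i32) -> Self {
        Self { uri: uri.to_string(), vector_dim }
    }

    pub fn uri(&self) -> &str {
        &self.uri
    }

    pub fn vector_dim(&self) -> i32 {
        self.vector_dim
    }

    /// Write-path guard: reject vectors whose length differs from the
    /// store's dimension before they reach the columnar layer.
    pub fn check_vector(&self, vector: &[f32]) -> Result<(), StoreError> {
        if vector.len() as i64 != i64::from(self.vector_dim) {
            return Err(StoreError::Storage(format!(
                "vector length {} does not match store dimension {}",
                vector.len(),
                self.vector_dim
            )));
        }
        Ok(())
    }
}

/// Startup wiring: build the embedder first, then size the store from it.
pub fn store_for(embed: &dyn EmbedService, uri: &str) -> LanceStore {
    LanceStore::with_dim(uri, embed.dimensions())
}
```

In this sketch, `store_for(&FixedDimEmbedder(384), uri)` yields a 384-dim store that accepts a 384-element vector and returns an error for a 768-element one, while `LanceStore::new(uri)` still reports 1024.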
Three new tests on `LanceStore`:
- `test_with_dim_stores_correct_dimension` — construct at 384,
round-trip a 384-dim vector
- `test_with_dim_rejects_wrong_length_vector` — feeding a 768-dim
vec into a 384-dim store errors cleanly
- `test_init_table_rejects_dim_mismatch_on_reopen` — reopening an
existing table with a different configured dim errors loudly
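The reopen case in the third test can be illustrated with an in-memory stand-in for persisted tables. `FakeCatalog` and its layout are hypothetical. Only the error wording is taken from the PR description, and the local `OmemError` enum is a one-variant placeholder for the real type.

```rust
use std::collections::HashMap;

/// One-variant placeholder for the project's error type.
#[derive(Debug, PartialEq)]
pub enum OmemError {
    Storage(String),
}

/// Hypothetical in-memory catalog: table name -> persisted vector dimension.
#[derive(Default)]
pub struct FakeCatalog {
    tables: HashMap<String, i32>,
}

impl FakeCatalog {
    /// Creates the table on first use; on reopen, compares the persisted
    /// dimension with the configured one and fails early on a mismatch.
    pub fn init_table(&mut self, name: &str, configured_dim: i32) -> Result<(), OmemError> {
        // `.copied()` ends the immutable borrow before the insert below.
        match self.tables.get(name).copied() {
            None => {
                self.tables.insert(name.to_string(), configured_dim);
                Ok(())
            }
            Some(persisted) if persisted == configured_dim => Ok(()),
            Some(persisted) => Err(OmemError::Storage(format!(
                "vector dim mismatch: table has {} but embedder produces {}; \
                 wipe the table or switch back to the matching embedding model",
                persisted, configured_dim
            ))),
        }
    }
}
```

Creating a table at 1024 and reopening it at 1024 succeeds. Reopening the same table with 384 configured returns the `Storage` error instead of deferring the failure to the first read or write.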
Motivation
Running omem-server on CPU-constrained / non-AVX-512 hardware.
`mxbai-embed-large` (1024-dim, BERT-large, 334M params) takes
hundreds of ms per embed on older Xeons. Smaller models like
`bge-small-en-v1.5` and `all-MiniLM-L6-v2` (both 384-dim) are
several times faster but currently unusable because the store
schema is locked to 1024.
`DEFAULT_VECTOR_DIM` is exposed as a public const so existing
tests + the noop embedder keep their 1024-dim wiring without any
change.
Follow-up PR #5 (
Thanks for the excellent contribution @doctatortot! 🎉 The code quality here is impressive: clean design, proper back-compat via This has been merged and deployed to production (api.ourmem.ai). Looking forward to more contributions!
Summary
Lifts the hardcoded `const VECTOR_DIM: i32 = 1024;` in `src/store/lancedb.rs` so omem-server can be deployed with any embedding model whose dimension is reported by `EmbedService::dimensions()`. The `LanceStore` struct carries the dim as a field; `schema()` and `memory_to_batch()` become methods that read `self.vector_dim`. `main.rs` builds the embedder first, then sizes the store accordingly.
Motivation
Running omem-server on CPU-only / non-AVX-512 hardware. `mxbai-embed-large` (1024-dim, 334M-param BERT) takes hundreds of ms per embed on older Xeons (e.g. E5-2697 v2 vintage). Smaller models such as `bge-small-en-v1.5` and `all-MiniLM-L6-v2` (both 384-dim) are several times faster per call but currently unusable because the store schema is locked to 1024.
The current behavior also fails ugly: any model whose dim != 1024 makes the server panic deep in Arrow on first read/write with `Length of the child array (768) must be the multiple of the value length (1024)`. This PR catches the mismatch up front with a clear `OmemError`.
Back-compat
`LanceStore::new(uri)` and `StoreManager::new(uri)` still default to 1024-dim, so existing callers (tests, connectors, downstream code) keep working with zero changes. `DEFAULT_VECTOR_DIM` is exposed as a public const for tests / mocks.
Validation at startup
`init_table()` now compares the persisted table's vector-column dim against the configured value. On a mismatch it returns `OmemError::Storage("vector dim mismatch: table has X but embedder produces Y; wipe the table or switch back to the matching embedding model")` so the operator sees the problem immediately instead of a stacktrace in Arrow internals.
`memory_to_batch()` similarly rejects writes whose vector length doesn't match `self.vector_dim`, catching dimension drift before it corrupts data.
Tests
Three new tests on `LanceStore` cover the new surface:
- `test_with_dim_stores_correct_dimension`: construct at 384, round-trip a 384-dim vector
- `test_with_dim_rejects_wrong_length_vector`: feeding a 768-dim vec into a 384-dim store errors cleanly
- `test_init_table_rejects_dim_mismatch_on_reopen`: reopening an existing table with a different configured dim errors loudly
All existing `store::lancedb::tests::*` continue to pass (they go through the default 1024-dim path).
How verified
Built and tested via Forgejo Actions on my own infra (act_runner v6.3.1, stable rustc). The three new tests pass; existing store tests pass.
Note for maintainers: I noticed a few pre-existing test failures on upstream main unrelated to this PR. rustls 0.23 needs `CryptoProvider::install_default()` before any test that uses reqwest (20 `No provider set` panics in `connectors::github::tests`, `embed::openai_compat::tests`, `llm::openai_compat::tests`, `retrieve::reranker::tests`), and 5 `api::tests` space-membership tests assert 200 but get 404. Happy to file those as separate issues if they're news.
Checklist
`LanceStore::new()` callers