Feature Proposal: Add Valkey as a Vector Database Option (using Valkey GLIDE) #1296
MatthiasHowellYopp started this conversation in Ideas
Replies: 1 comment 1 reply
@MatthiasHowellYopp - apologies for the delay in actioning this. We have a large backlog that we are working through. Please give us a couple of days, and we will revert. Really appreciate your contribution and support of llmware!
Summary
I'd like to propose adding Valkey as a supported vector database in llmware, using the official Valkey GLIDE client library.
Why Valkey?
- FT.CREATE / FT.SEARCH commands with HNSW and FLAT indexing, supporting vector similarity, full-text search, tag matching, numeric range filters, and hybrid queries combining all of these in a single request.
- FT.AGGREGATE with GROUPBY, REDUCE, SORTBY, and computed fields, enabling analytics without round-tripping data to the application layer.
Why Valkey GLIDE as the Client?
- ft module commands (ft.create, ft.search, ft.dropindex) as part of its Python API, so no raw command passthrough is needed.
- GlideClient (sync, since v2.1) and an async client, giving flexibility for llmware's synchronous embedding workflow.
- GlideClient / GlideClusterClient.
- pip install valkey-glide (current version 2.3.1, supports Python 3.9–3.13)
Implementation Approach
Based on my review of the existing codebase, adding Valkey would involve:
1. New class EmbeddingValkey in llmware/embeddings.py
Implementing the three required methods following the same pattern as EmbeddingMilvus, EmbeddingRedis, etc.:
- create_new_embedding(doc_ids=None, batch_size=500): iterate text blocks, generate embeddings, store as hashes with binary vector blobs, indexed by FT.CREATE
- search_index(query_embedding_vector, sample_count=10): KNN query via ft.search with *=>[KNN k @vector $query_vec] syntax
- delete_index(): drop the index via ft.dropindex and clean up text index flags
2. New ValkeyConfig in llmware/configs.py
Following the same pattern as RedisConfig, MilvusConfig, etc.:
3. Register in VectorDBRegistry
4. Dependency
valkey-glide: optional, lazy-imported following the existing pattern used by all other vector DB drivers (no hard requirement at install time).
Sketch of the Vector Search Flow (GLIDE Python)
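As a rough, client-agnostic illustration of the flow described above (binary vector blobs in hashes, an HNSW index, and a KNN query), the sketch below only builds the raw command tokens and the vector blob. It does not call GLIDE itself, and it is not the proposed implementation. The helper names (pack_vector, build_create_args, build_search_args), the field name "vector", and the index and prefix values are hypothetical. The little-endian float32 encoding is an assumption about the expected blob layout.

```python
import struct
from typing import List, Sequence, Union

# Hypothetical names chosen to match the "@vector $query_vec" query syntax.
VECTOR_FIELD = "vector"
QUERY_PARAM = "query_vec"


def pack_vector(vec: Sequence[float]) -> bytes:
    """Serialize a vector as little-endian float32 bytes (assumed blob layout)."""
    return struct.pack(f"<{len(vec)}f", *vec)


def unpack_vector(blob: bytes) -> List[float]:
    """Inverse of pack_vector; each float32 occupies 4 bytes."""
    return list(struct.unpack(f"<{len(blob) // 4}f", blob))


def build_create_args(index: str, prefix: str, dim: int,
                      metric: str = "COSINE") -> List[str]:
    """Tokens for an FT.CREATE call defining one HNSW vector field on hashes."""
    return [
        "FT.CREATE", index, "ON", "HASH", "PREFIX", "1", prefix,
        "SCHEMA", VECTOR_FIELD, "VECTOR", "HNSW",
        "6",  # number of attribute tokens that follow
        "TYPE", "FLOAT32", "DIM", str(dim), "DISTANCE_METRIC", metric,
    ]


def build_knn_query(k: int) -> str:
    """KNN query string in the '*=>[KNN k @vector $query_vec]' form."""
    return f"*=>[KNN {k} @{VECTOR_FIELD} ${QUERY_PARAM}]"


def build_search_args(index: str, query_vec: Sequence[float],
                      k: int = 10) -> List[Union[str, bytes]]:
    """Tokens for an FT.SEARCH call; the query vector is passed as a parameter."""
    return [
        "FT.SEARCH", index, build_knn_query(k),
        "PARAMS", "2", QUERY_PARAM, pack_vector(query_vec),
    ]


if __name__ == "__main__":
    print(build_create_args("llmware_idx", "llmware:", dim=384))
    print(build_knn_query(10))  # *=>[KNN 10 @vector $query_vec]
```

In an actual EmbeddingValkey class, the same pieces would map onto the client's ft.create and ft.search calls. The token lists here are only meant to make the index definition and query shape concrete.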
Key Differences from the Existing Redis Implementation
| EmbeddingRedis (redis-py) | EmbeddingValkey (GLIDE) |
| --- | --- |
| redis (Python) | valkey-glide (Rust + Python bindings) |
| redis.commands.search method chaining | ft.create(), ft.search() |
| | GlideClusterClient |
The API surface is different enough from redis-py that this would be a standalone class rather than a shared base with EmbeddingRedis.
Questions for Maintainers
- Standalone class vs. shared base: Are you open to this being a fully independent EmbeddingValkey class? Given the API differences with GLIDE, sharing a base with EmbeddingRedis would add complexity without much benefit.
- Sync client: The GLIDE client supports both sync and async modes. Should the implementation use the sync GlideClient to match the existing synchronous pattern in other embedding classes?
- Cluster mode: Should cluster mode be configurable via ValkeyConfig from the start, or is standalone sufficient for an initial implementation?
- Version requirements: This implementation targets Valkey Search ≥ 1.2.0 (which adds full-text, tag, numeric, and hybrid query support alongside vector similarity). This requires Valkey ≥ 9.1. For users on managed services, AWS ElastiCache 9.0+ for Valkey and the valkey/valkey-bundle Docker image include the search module. Is there any concern with this being a relatively recent minimum version, or is that acceptable given this is a new integration with no backward compatibility to maintain?
I'm Happy to Implement This
If the maintainers are open to this addition, I'm willing to submit a PR with the full implementation, following the fork-and-pull workflow documented in CONTRIBUTING.md.