feat: add VADD_BATCH command for bulk vector inserts #161
Merged
add VADD_BATCH command that accepts multiple vectors in a single command to reduce per-vector round-trip overhead for bulk inserts.

RESP3 syntax: `VADD_BATCH key DIM n elem1 f32... elem2 f32... [opts]`

the DIM keyword is required so the parser knows where each vector ends and the next element name begins. max batch size is 10,000.
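a minimal sketch of how DIM-based framing could work, not the actual parser in this PR. `BatchEntry`, `parse_vadd_batch`, `is_option` and `MAX_BATCH` are hypothetical names; the option keywords are taken from the syntax line in the summary, and the sketch assumes options always trail the entries (so an element literally named `M` or `EF` would be misread here, which a real parser would have to handle).

```rust
/// One parsed batch entry: element name plus its vector (hypothetical type).
#[derive(Debug, PartialEq)]
pub struct BatchEntry {
    pub element: String,
    pub vector: Vec<f32>,
}

/// Cap taken from the stated max batch size of 10,000.
pub const MAX_BATCH: usize = 10_000;

/// Option keywords assumed to terminate the entry list.
const OPTION_KEYWORDS: [&str; 4] = ["METRIC", "QUANT", "M", "EF"];

fn is_option(tok: &str) -> bool {
    OPTION_KEYWORDS.iter().any(|kw| kw.eq_ignore_ascii_case(tok))
}

/// Parses the tokens that follow `VADD_BATCH key`:
/// `DIM n elem1 f32... elem2 f32... [opts]`.
/// Returns the entries plus the unparsed trailing option tokens.
pub fn parse_vadd_batch<'a>(
    args: &'a [&'a str],
) -> Result<(Vec<BatchEntry>, &'a [&'a str]), String> {
    if args.len() < 2 || !args[0].eq_ignore_ascii_case("DIM") {
        return Err("expected DIM <n>".to_string());
    }
    let dim = args[1]
        .parse::<usize>()
        .map_err(|_| "DIM must be a positive integer".to_string())?;
    if dim == 0 {
        return Err("DIM must be a positive integer".to_string());
    }

    let mut entries = Vec::new();
    let mut pos = 2;
    while pos < args.len() {
        // Stop at the first option keyword; everything after it is options.
        if is_option(args[pos]) {
            break;
        }
        if entries.len() == MAX_BATCH {
            return Err(format!("batch exceeds {MAX_BATCH} entries"));
        }
        // Because DIM is known, each entry is exactly 1 name + dim floats.
        let end = pos + 1 + dim;
        if end > args.len() {
            return Err(format!("element {} has fewer than {dim} components", args[pos]));
        }
        let mut vector = Vec::with_capacity(dim);
        for tok in &args[pos + 1..end] {
            let v = tok
                .parse::<f32>()
                .map_err(|_| format!("invalid float: {tok}"))?;
            vector.push(v);
        }
        entries.push(BatchEntry { element: args[pos].to_string(), vector });
        pos = end;
    }
    if entries.is_empty() {
        return Err("batch contains no entries".to_string());
    }
    Ok((entries, &args[pos..]))
}
```

without DIM, the parser could not tell whether a token like `2.0` is another component or the next element's name; with it, each entry has a fixed width of `1 + n` tokens.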
- add VAddBatchResult struct and vadd_batch() method to keyspace with upfront NaN/inf validation and single memory check for the entire batch
- add ShardRequest::VAddBatch and ShardResponse::VAddBatchResult
- refactor to_aof_record → to_aof_records (returns Vec<AofRecord>) so VADD_BATCH can expand each applied vector into its own AofRecord::VAdd (no new AOF format needed)
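the upfront validation and single memory check could look roughly like the sketch below. it is an illustration only: the `HashMap` store, the `budget_bytes` parameter and the size estimate are assumptions, and the real keyspace method returns a VAddBatchResult rather than a bare count.

```rust
use std::collections::HashMap;

/// Rejects the whole batch if any vector has the wrong length or a
/// non-finite component, before anything is written.
pub fn validate_batch(entries: &[(String, Vec<f32>)], dim: usize) -> Result<(), String> {
    for (name, vector) in entries {
        if vector.len() != dim {
            return Err(format!("element {name}: expected {dim} components, got {}", vector.len()));
        }
        if let Some(i) = vector.iter().position(|v| !v.is_finite()) {
            return Err(format!("element {name}: component {i} is NaN or infinite"));
        }
    }
    Ok(())
}

/// Hypothetical stand-in for the keyspace method: validates first, checks
/// the memory budget once for the entire batch, then inserts.
/// Returns the number of newly added elements.
pub fn vadd_batch(
    store: &mut HashMap<String, Vec<f32>>,
    entries: Vec<(String, Vec<f32>)>,
    dim: usize,
    budget_bytes: usize,
) -> Result<usize, String> {
    validate_batch(&entries, dim)?;

    // One rough size estimate for the whole batch (assumed formula).
    let needed: usize = entries
        .iter()
        .map(|(name, v)| name.len() + v.len() * std::mem::size_of::<f32>())
        .sum();
    if needed > budget_bytes {
        return Err("batch would exceed memory budget".to_string());
    }

    let mut added = 0;
    for (name, vector) in entries {
        // `insert` returns None when the element did not exist before.
        if store.insert(name, vector).is_none() {
            added += 1;
        }
    }
    Ok(added)
}
```

validating before the first insert means a bad vector late in the batch cannot leave the earlier ones half-applied.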
wire up VADD_BATCH through both sharded and concurrent mode code paths in connection.rs. response returns integer count of newly added elements, matching VADD's pattern.
- add VAddBatchEntry and VAddBatchRequest proto messages
- add VAddBatch RPC returning IntResponse (count of added elements)
- add to PipelineRequest oneof (field 72)
- implement v_add_batch handler in grpc.rs with validation
- regenerate go and python proto stubs
- add vadd_batch() method to python gRPC client
- update bench-vector.py to send batches via VADD_BATCH (RESP) and vadd_batch (gRPC) instead of individual VADD calls
- update bench-memory.sh vector helper to use VADD_BATCH
- bump command count to 107 in README and bench README
when system python has the base deps but grpc mode forces a venv, the venv was missing numpy/redis. now installs all required deps alongside ember-py.
the protoc codegen produces `from ember.v1 import` but the package layout needs `from ember.proto.ember.v1 import`. the Makefile has a sed fixup for this but manual regen skipped it.
benchmarked on GCP c2-standard-8. VADD_BATCH improves insert throughput: RESP 963 → 1,483 vec/s (+54%), gRPC 1,009 → 2,374 vec/s (+135%). query throughput unchanged as expected.
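the percentages follow from the raw throughput numbers; a small sketch to recompute them (`pct_gain` is a hypothetical helper, not part of the benchmark scripts):

```rust
/// Relative improvement of `after` over `before`, in percent.
pub fn pct_gain(before: f64, after: f64) -> f64 {
    (after - before) / before * 100.0
}
```

for example, `pct_gain(963.0, 1483.0)` is about 54.0 and `pct_gain(1009.0, 2374.0)` is about 135.3, matching the reported +54% and +135%.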
kacy added a commit that referenced this pull request on Feb 19, 2026
* feat: add VADD_BATCH command parsing to protocol layer
* feat: add VADD_BATCH to core engine and refactor AOF recording
* feat: add VADD_BATCH dispatch to connection handler
* feat: add VADD_BATCH gRPC RPC and regenerate client stubs
* feat: update python client and benchmarks to use VADD_BATCH
* fix: install base deps into venv when grpc benchmarks are requested
* fix: correct import path in generated python grpc stubs
* docs: update vector benchmark results with VADD_BATCH numbers
summary
- RESP3: `VADD_BATCH key DIM n elem1 f32... elem2 f32... [METRIC|QUANT|M|EF]`
- gRPC: `VAddBatch(VAddBatchRequest) returns (IntResponse)` with packed float entries

what was tested

- `to_aof_records_for_vadd_batch`: 3 applied vectors produce 3 `AofRecord::VAdd` records
- `cargo test -p ember-protocol`: 330 tests pass
- `cargo test -p emberkv-core --features vector`: 351 tests pass (1 pre-existing failure in `memory::tests::entry_overhead_not_too_small`, unrelated)
- `cargo build -p ember-server --features jemalloc,vector,grpc`: clean build

design considerations

- `to_aof_record` → `to_aof_records`: the return type changes from `Option<AofRecord>` to `Vec<AofRecord>` so VADD_BATCH can expand each applied vector into an individual VAdd record. all existing arms are mechanical `Some(x)` → `vec![x]` / `None` → `vec![]` changes
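a simplified sketch of the shape of that refactor. only `AofRecord::VAdd` and `to_aof_records` are names from this PR; the `Applied` enum, its variants and all field names are made up for illustration and do not mirror the real types.

```rust
/// Simplified AOF record; the real enum has many more variants.
#[derive(Debug, Clone, PartialEq)]
pub enum AofRecord {
    Set { key: String, value: String },
    VAdd { key: String, element: String, vector: Vec<f32> },
}

/// Hypothetical description of an operation that was just applied.
pub enum Applied {
    Set { key: String, value: String },
    /// Read-only operation: nothing to log.
    Get,
    /// Only the vectors that were actually applied are listed here.
    VAddBatch { key: String, applied: Vec<(String, Vec<f32>)> },
}

/// Returning a Vec lets one operation map to zero, one, or many records.
pub fn to_aof_records(op: &Applied) -> Vec<AofRecord> {
    match op {
        // Previously `Some(x)`: now a one-element vec.
        Applied::Set { key, value } => vec![AofRecord::Set {
            key: key.clone(),
            value: value.clone(),
        }],
        // Previously `None`: now an empty vec.
        Applied::Get => vec![],
        // New case: one VAdd record per applied vector, so replay
        // needs no batch-specific AOF format.
        Applied::VAddBatch { key, applied } => applied
            .iter()
            .map(|(element, vector)| AofRecord::VAdd {
                key: key.clone(),
                element: element.clone(),
                vector: vector.clone(),
            })
            .collect(),
    }
}
```

replaying the log then goes through the existing single-vector VAdd path, at the cost of writing one record per vector instead of one per batch.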