feat(drive): expand count-index group-by carrier shapes (G1a/G1b/G8a-c)#3652
feat(drive): expand count-index group-by carrier shapes (G1a/G1b/G8a-c)#3652QuantumExplorer wants to merge 1 commit into
Conversation
New count-query fixtures and the rs-drive infrastructure that proves them: - G1a — In on byBrand with one absent value (proof grows by ~255 B absence subproof; verifier omits absent branches from Entries). - G1b — high-fanout In on byBrand (|IN| = B = 100); same shape as G1 scaled up. - G8a — bounded carrier + bounded ACOR (range on both axes) with descending walk; same carrier shape as G8 with different op variants on both range commitments. - G8b/G8c — rejection shapes for the two-range carrier with group_by = [brand, color] and group_by = [] respectively (carrier is opened only for GroupByRange + single-field group_by). Touches the dispatcher, range-aggregate carrier proof executor, path-query builder, verify side, and contract-insert wiring for the new count-tree variants. Book chapter updated to match. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
📖 Book Preview built successfully. Download the preview from the workflow artifacts. Updated at 2026-05-15T13:37:43.834Z |
📝 WalkthroughWalkthroughThis PR migrates the document count request API from CBOR-encoded raw values to structured typed clauses, adds directional traversal support for descending order-by scenarios, and expands benchmark coverage for group-by proof cases. The dispatcher validates and canonicalizes where clauses, merging same-field range pairs; proof generation now respects iteration direction via a left_to_right parameter threaded through the carrier aggregate pipeline. ChangesDocument Count Request API Modernization
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes Possibly related issues
Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
Review GateCommit:
|
There was a problem hiding this comment.
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
packages/rs-drive/src/query/mod.rs (1)
784-801:⚠️ Potential issue | 🟠 Major | ⚡ Quick winStop silently dropping malformed
orderByclauses.Line [789] currently uses
filter_mapwith.ok(), so invalidorderByentries are ignored instead of rejected. This can change query semantics silently instead of returning a syntax error.💡 Suggested fix
- let order_by_clauses: Vec<OrderClause> = order_by - .map(|id_cbor| { - if let Value::Array(clauses) = id_cbor { - clauses - .iter() - .filter_map(|order_clause| { - if let Value::Array(clauses_components) = order_clause { - OrderClause::from_components(clauses_components).ok() - } else { - None - } - }) - .collect() - } else { - Vec::new() - } - }) - .unwrap_or_default(); + let order_by_clauses: Vec<OrderClause> = match order_by { + None => Vec::new(), + Some(Value::Array(clauses)) => clauses + .iter() + .map(|order_clause| match order_clause { + Value::Array(clauses_components) => { + OrderClause::from_components(clauses_components).map_err(Error::from) + } + _ => Err(Error::Query(QuerySyntaxError::InvalidOrderByProperties( + "order clauses must be [field, \"asc\"|\"desc\"] arrays", + ))), + }) + .collect::<Result<Vec<_>, Error>>()?, + Some(_) => { + return Err(Error::Query(QuerySyntaxError::InvalidOrderByProperties( + "order clauses must be an array", + ))); + } + };🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@packages/rs-drive/src/query/mod.rs` around lines 784 - 801, The code currently builds order_by_clauses by using filter_map(...).ok(), which silently drops malformed orderBy entries; instead, change the logic around OrderClause::from_components to validate every clause and return an error when any clause fails parsing. Replace the filter_map + collect with mapping to Result<OrderClause, _> (using OrderClause::from_components) and collect into a Result<Vec<OrderClause>, _>, then propagate or map the error so order_by_clauses is only produced on success; reference OrderClause::from_components, order_by_clauses, and the matching on Value::Array to locate the change.
🧹 Nitpick comments (2)
packages/rs-drive/src/query/drive_document_count_query/execute_range_count.rs (1)
450-460: ⚡ Quick winDocument
left_to_rightas a proof-shaping argument.This new parameter affects the serialized
PathQueryused by prover/verifier parity, but the method docs don’t describe its behavior yet. Please add a short arg note and ordering semantics to keep this API self-describing.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@packages/rs-drive/src/query/drive_document_count_query/execute_range_count.rs` around lines 450 - 460, Add a short doc comment to execute_carrier_aggregate_count_with_proof explaining that the left_to_right boolean is a proof-shaping argument that controls the ordering of entries in the serialized PathQuery (affecting prover/verifier parity), specify which ordering corresponds to true vs false (e.g., left-to-right vs right-to-left), and note that this must match callers of carrier_aggregate_count_path_query to ensure consistent proof serialization and verification across prover and verifier.packages/rs-drive/src/verify/document_count/verify_carrier_aggregate_count_proof/mod.rs (1)
10-13: ⚡ Quick winUpdate verifier docs for directional traversal.
The API now takes
left_to_right, but the doc text still describes lex-ascending output unconditionally. Please align the contract text with direction-dependent ordering.Also applies to: 32-51
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@packages/rs-drive/src/verify/document_count/verify_carrier_aggregate_count_proof/mod.rs` around lines 10 - 13, The doc comment for the verifier function that returns (root_hash, per_key_counts) is out of sync with the updated API: it still claims lex-ascending output unconditionally but the function now accepts a left_to_right parameter. Update the function/module docs (the comment above the carrier AggregateCountOnRange verifier in mod.rs, and the similar block referenced at lines 32–51) to state that per_key_counts ordering depends on left_to_right (e.g., left_to_right == true => serialized lexicographic ascending order, left_to_right == false => serialized lexicographic descending order), and mention the left_to_right parameter name in the contract so callers know the ordering behavior.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@packages/rs-drive/benches/document_count_worst_case.rs`:
- Around line 1056-1077: The matrix row for G8a isn't actually exercising
descending walks because raw_order_by isn't threaded through the matrix runner:
update report_group_by_matrix to accept and propagate raw_order_by alongside
raw_where, ensure the MatrixCase struct/instances (the new G8a entry) include
the appropriate raw_order_by value (representing
descending/left_to_right=false), and modify drive_count_outcome (the code that
builds the request) to use the passed raw_order_by instead of Value::Null when
constructing the request so the order-sensitive path (left_to_right=false) is
truly tested.
---
Outside diff comments:
In `@packages/rs-drive/src/query/mod.rs`:
- Around line 784-801: The code currently builds order_by_clauses by using
filter_map(...).ok(), which silently drops malformed orderBy entries; instead,
change the logic around OrderClause::from_components to validate every clause
and return an error when any clause fails parsing. Replace the filter_map +
collect with mapping to Result<OrderClause, _> (using
OrderClause::from_components) and collect into a Result<Vec<OrderClause>, _>,
then propagate or map the error so order_by_clauses is only produced on success;
reference OrderClause::from_components, order_by_clauses, and the matching on
Value::Array to locate the change.
---
Nitpick comments:
In
`@packages/rs-drive/src/query/drive_document_count_query/execute_range_count.rs`:
- Around line 450-460: Add a short doc comment to
execute_carrier_aggregate_count_with_proof explaining that the left_to_right
boolean is a proof-shaping argument that controls the ordering of entries in the
serialized PathQuery (affecting prover/verifier parity), specify which ordering
corresponds to true vs false (e.g., left-to-right vs right-to-left), and note
that this must match callers of carrier_aggregate_count_path_query to ensure
consistent proof serialization and verification across prover and verifier.
In
`@packages/rs-drive/src/verify/document_count/verify_carrier_aggregate_count_proof/mod.rs`:
- Around line 10-13: The doc comment for the verifier function that returns
(root_hash, per_key_counts) is out of sync with the updated API: it still claims
lex-ascending output unconditionally but the function now accepts a
left_to_right parameter. Update the function/module docs (the comment above the
carrier AggregateCountOnRange verifier in mod.rs, and the similar block
referenced at lines 32–51) to state that per_key_counts ordering depends on
left_to_right (e.g., left_to_right == true => serialized lexicographic ascending
order, left_to_right == false => serialized lexicographic descending order), and
mention the left_to_right parameter name in the contract so callers know the
ordering behavior.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: b77c2ed8-5e9e-4952-95ee-0d33ab0e2f6d
📒 Files selected for processing (11)
book/src/drive/count-index-group-by-examples.mdpackages/rs-drive/benches/document_count_worst_case.rspackages/rs-drive/src/drive/contract/insert/insert_contract/v0/mod.rspackages/rs-drive/src/query/drive_document_count_query/drive_dispatcher.rspackages/rs-drive/src/query/drive_document_count_query/execute_range_count.rspackages/rs-drive/src/query/drive_document_count_query/executors/range_aggregate_carrier_proof.rspackages/rs-drive/src/query/drive_document_count_query/path_query.rspackages/rs-drive/src/query/drive_document_count_query/tests.rspackages/rs-drive/src/query/mod.rspackages/rs-drive/src/verify/document_count/verify_carrier_aggregate_count_proof/mod.rspackages/rs-drive/src/verify/document_count/verify_carrier_aggregate_count_proof/v0/mod.rs
| // G8a: bounded carrier + bounded ACOR with descending walk. | ||
| // Same `RangeAggregateCarrierProof` mode as G8 but the | ||
| // dispatcher merges two-sided ranges into `between*` clauses | ||
| // via `merge_same_field_range_pairs` and threads | ||
| // `left_to_right = false` through the carrier path query. | ||
| MatrixCase { | ||
| label: "[brand] / where=brand BETWEEN AND color BETWEEN (left_to_right=false)", | ||
| platform_allowed: | ||
| "yes (RangeAggregateCarrierProof — bounded-range carrier with descending walk)", | ||
| raw_where: Value::Array(vec![ | ||
| clause("brand", ">", Value::Text(brand_label(BRAND_COUNT / 2))), | ||
| clause( | ||
| "brand", | ||
| "<", | ||
| Value::Text(brand_label(BRAND_COUNT * 65 / 100)), | ||
| ), | ||
| clause("color", ">", Value::Text(color_label(200))), | ||
| clause("color", "<", Value::Text(color_label(400))), | ||
| ]), | ||
| mode: CountMode::GroupByRange, | ||
| limit: None, | ||
| }, |
There was a problem hiding this comment.
Thread raw_order_by through the matrix runner.
This new G8a matrix row is labeled as the descending carrier walk, but report_group_by_matrix still only carries raw_where, and drive_count_outcome() builds the request with Value::Null for order_by. So this case is actually exercising the default direction, not left_to_right=false, and the matrix output is misleading for the new order-sensitive path.
Suggested wiring
struct MatrixCase {
label: &'static str,
platform_allowed: &'static str,
raw_where: Value,
+ raw_order_by: Value,
mode: CountMode,
limit: Option<u32>,
}
fn drive_count_outcome(
fixture: &CountBenchFixture,
raw_where: Value,
+ raw_order_by: Value,
mode: CountMode,
limit: Option<u32>,
prove: bool,
platform_version: &PlatformVersion,
) -> String {
- let request = count_request(fixture, raw_where, Value::Null, mode, limit, prove);
+ let request = count_request(fixture, raw_where, raw_order_by, mode, limit, prove);
// ...
}
MatrixCase {
label: "[brand] / where=brand BETWEEN AND color BETWEEN (left_to_right=false)",
platform_allowed:
"yes (RangeAggregateCarrierProof — bounded-range carrier with descending walk)",
raw_where: Value::Array(vec![/* ... */]),
+ raw_order_by: order_by_brand_desc.clone(),
mode: CountMode::GroupByRange,
limit: None,
},
let noproof_result = drive_count_outcome(
fixture,
case.raw_where.clone(),
+ case.raw_order_by.clone(),
case.mode,
case.limit,
false,
platform_version,
);
let prove_result = drive_count_outcome(
fixture,
case.raw_where.clone(),
+ case.raw_order_by.clone(),
case.mode,
case.limit,
true,
platform_version,
);🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@packages/rs-drive/benches/document_count_worst_case.rs` around lines 1056 -
1077, The matrix row for G8a isn't actually exercising descending walks because
raw_order_by isn't threaded through the matrix runner: update
report_group_by_matrix to accept and propagate raw_order_by alongside raw_where,
ensure the MatrixCase struct/instances (the new G8a entry) include the
appropriate raw_order_by value (representing descending/left_to_right=false),
and modify drive_count_outcome (the code that builds the request) to use the
passed raw_order_by instead of Value::Null when constructing the request so the
order-sensitive path (left_to_right=false) is truly tested.
Issue being fixed or feature implemented
Extends the count-index group-by surface introduced alongside #3633 with new fixture coverage for edge cases of the carrier-aggregate-on-range (ACOR) and
In-fanout paths, plus the rs-drive infrastructure those fixtures exercise.What was done?
New bench fixtures + matching rs-drive support + book chapter:
InonbyBrandwith one absent value. Proof grows by ~255 B for the absence subproof; verifier omits the absent branch fromEntries.InonbyBrand(|IN| = B = 100). Same shape as G1, scaled up; reveals every byBrand entry.group_by = [brand, color]andgroup_by = []. Carrier is opened only forGroupByRange + single-field group_by.Touched code:
packages/rs-drive/src/query/drive_document_count_query/drive_dispatcher.rs— mode-detection from clauses + per-mode dispatch for the new variants.packages/rs-drive/src/query/drive_document_count_query/{execute_range_count,path_query,tests}.rsandexecutors/range_aggregate_carrier_proof.rs— executor side.packages/rs-drive/src/query/mod.rs— query plumbing.packages/rs-drive/src/verify/document_count/verify_carrier_aggregate_count_proof/{mod,v0/mod}.rs— verifier side.packages/rs-drive/src/drive/contract/insert/insert_contract/v0/mod.rs— count-tree wiring needed by the new fixtures.packages/rs-drive/benches/document_count_worst_case.rs— fixture bodies.book/src/drive/count-index-group-by-examples.md— chapter expanded with the new G* sections (proof sizes, verified shapes, tree-walk descriptions).How Has This Been Tested?
cargo test -p drive(executor + verifier paths for the new variants)document_count_worst_case) regenerated; proof sizes match the documented G* numbers in the book chapter.Breaking Changes
None. Drive-internal additions on top of the v1 count surface; no wire-protocol or consensus changes.
Checklist:
Summary by CodeRabbit
New Features
Improvements