Background
The AVG dispatcher's no-prove path in
packages/rs-drive/src/query/drive_document_average_query/drive_dispatcher.rs
currently splits a single DocumentAverageRequest into two
sub-requests (DocumentCountRequest + DocumentSumRequest),
dispatches each through its respective per-surface dispatcher,
then zips the responses.
let count_request = DocumentCountRequest { /* … 9 fields … */ };
let sum_request = DocumentSumRequest { /* … same 9 fields … */ };
// open shared transaction
let count_response = self.execute_document_count_request(count_request, …);
let sum_response = self.execute_document_sum_request(sum_request, …);
match (count_response, sum_response) { /* zip */ }
Maintainer feedback on PR #3661:
We should not have 2 requests... We should have just one CountSumRequest
The current shape is correct (the local read-transaction guarantees
both sub-reads see the same grovedb snapshot, no off-by-one race),
just does more grovedb work than strictly necessary — every AVG query
on the no-prove path walks grovedb twice.
The PCPS prove path at execute_document_average_prove already does
the right thing: one walk against a ProvableCountProvableSumTree
terminator, both metrics extracted from each visited element via
count_sum_value_or_default(). The no-prove path lags behind.
Goal
Single DocumentCountSumRequest (or unified use of
DocumentAverageRequest) drives a single dispatcher
execute_document_count_and_sum_request that walks grovedb once and
yields (count, sum) per visited element / range.
let response = self.execute_document_count_and_sum_request(
request, // unified
transaction,
platform_version,
)?;
match response {
DocumentCountAndSumResponse::Aggregate { count, sum } => …,
DocumentCountAndSumResponse::Entries(entries) => …,
}
Removes the dual-routing fragility (count + sum routing tables
have to stay in sync — already caught one routing bug in PR #3661 when
sum's GroupByCompound routing diverged from count's), halves
grovedb work on AVG no-prove, and gives AVG a single source of truth
for routing decisions.
Required work (rs-drive only — no grovedb dependency)
Everything needed is already present in grovedb:
| Grovedb primitive |
Where it's used today |
What we'd use it for |
Element::count_sum_value_or_default() |
prove path (PCPS verifier) |
no-prove element decoding |
grove_get_raw on PCPS element |
sum + count separate executors |
unified single-element read |
grove_query on PCPS range |
sum + count separate executors |
accumulate (count, sum) per element |
Query::new_aggregate_count_and_sum_on_range |
prove path |
(could be reused) |
No new grovedb primitive is required. There's a possible optional
optimization where grovedb could expose a no-prove analog of
AggregateCountAndSumOnRange that does the accumulation in-engine
rather than returning all PCPS elements over the FFI for Drive to
fold — but that's a perf tune, not a blocker. Today's grove_query
over PCPS elements + in-Drive accumulation is functionally
equivalent.
Drive surfaces to add
-
DocumentCountSumRequest struct in
packages/rs-drive/src/query/drive_document_average_query/mod.rs
(or repurpose DocumentAverageRequest if its field set already
matches — likely yes).
-
DocumentCountSumResponse enum with Aggregate { count, sum }
and Entries(Vec<CountSumEntry>) variants. Already exists as
DocumentAverageResponse — can be reused directly.
-
Joint per-mode executors (these are the new functions, mostly
thin compositions of existing count + sum per-mode executors):
| Mode |
New executor |
Reuses (existing) |
New work |
| Total (Aggregate, no where) |
execute_document_count_and_sum_total_no_proof |
grove_get on […doctype, 0] |
decode both via count_sum_value_or_default() |
| PointLookup (Equal/In on summable+countable index) |
execute_document_count_and_sum_point_lookup_no_proof |
grove_get per key |
decode both per key |
| PerInValue (In on a non-range field) |
execute_document_count_and_sum_per_in_value_no_proof |
grove_get per In branch |
decode both per In branch |
| RangeNoProof (range or distinct on PCPS index) |
execute_document_count_and_sum_range_no_proof |
grove_query over PCPS range |
accumulate (count, sum) per element |
Each is structurally a fold of the existing pair of per-mode
executors with a shared grovedb walk. The sum side already knows
how to walk these shapes; the count extraction is just adding
count_value_or_default() to what the sum walker already
produces (it currently reads sum_value_or_default() only).
-
Single dispatcher Drive::execute_document_count_and_sum_request
that:
- Resolves mode via the existing
detect_sum_mode_from_inputs
routing table (the sum side's table — count's identical-shape
table is no longer separately needed for the AVG path).
- Dispatches to one of the joint executors above.
- The current AVG dispatcher's body collapses to:
pub fn execute_document_average_request(…) -> … {
if request.prove { return self.execute_document_average_prove(…); }
self.execute_document_count_and_sum_request(request.into(), …)
}
-
Delete the count↔avg / sum↔avg adapter logic that currently
maps AverageMode → (CountMode, SumMode) (lines 86-94 of the
current dispatcher) and the response-shape zipping at lines
148-176. Both go away once the joint executors return the
correct shape directly.
Drive surfaces to remove
count_request + sum_request struct construction (lines 100-120
of drive_document_average_query/drive_dispatcher.rs).
zip_entries helper used to merge the two response vectors
(still needed for the prove path's carrier case? — verify; if so,
keep it where the prove path consumes it).
- The
let local_tx = self.grove.start_transaction(); … shared
transaction plumbing (still needed but moves inside the joint
executor where it has fewer hops).
Validation
- The prove path is unchanged — it already works via a single PCPS
walk. Tests in verify/document_sum/ and the AVG bench's Q1-Q7
should remain green without modification.
- The no-prove path's existing tests (
drive_dispatcher.rs:test_…)
should produce identical (count, sum) outputs as before — only
the underlying grovedb-call count changes.
- AVG bench's no-prove mode should show ~half the per-query grovedb
time on AVG queries (one walk instead of two). Worth pinning the
speedup in the chapter.
Acceptance criteria
execute_document_average_request no longer constructs separate
DocumentCountRequest + DocumentSumRequest instances on the
no-prove path.
- AVG no-prove queries do at most one grovedb read per result
row / one grovedb walk per range query (confirm with a probe
counter or a tracing span if available).
- All existing AVG tests pass (
cargo test -p drive --lib drive_document_average_query).
- New tests: one per joint executor confirming
(count, sum)
match what the current double-dispatch produces, against the
same grades-contract fixture.
- PR description notes the perf-win on the no-prove path (and
updates the AVG chapter's What's Next if it currently
mentions the composition strategy).
Grovedb dependency
None required. All primitives needed are already present in the
current pinned grovedb rev. The optional no-prove-aggregate engine-side
optimization (avoid materializing all PCPS elements over FFI) would
be a separate grovedb PR if performance profiling warrants it after
this refactor lands.
Why not bundle into PR #3661
The refactor needs four new no-prove joint executors, each with their
own per-mode dispatch logic and per-mode test coverage. PR #3661 was
already large enough that bundling this in would have made it
unreviewable. Splitting this out gets it its own focused review window.
Context
Referenced from PR #3661 inline comment:
#3268402399.
The dispatcher carries an inline architectural-note comment at
drive_dispatcher.rs:100-115
pointing at this issue.
Background
The AVG dispatcher's no-prove path in
packages/rs-drive/src/query/drive_document_average_query/drive_dispatcher.rscurrently splits a single
DocumentAverageRequestinto twosub-requests (
DocumentCountRequest+DocumentSumRequest),dispatches each through its respective per-surface dispatcher,
then zips the responses.
Maintainer feedback on PR #3661:
The current shape is correct (the local read-transaction guarantees
both sub-reads see the same grovedb snapshot, no off-by-one race),
just does more grovedb work than strictly necessary — every AVG query
on the no-prove path walks grovedb twice.
The PCPS prove path at
execute_document_average_provealready doesthe right thing: one walk against a
ProvableCountProvableSumTreeterminator, both metrics extracted from each visited element via
count_sum_value_or_default(). The no-prove path lags behind.Goal
Single
DocumentCountSumRequest(or unified use ofDocumentAverageRequest) drives a single dispatcherexecute_document_count_and_sum_requestthat walks grovedb once andyields
(count, sum)per visited element / range.Removes the dual-routing fragility (count + sum routing tables
have to stay in sync — already caught one routing bug in PR #3661 when
sum's
GroupByCompoundrouting diverged from count's), halvesgrovedb work on AVG no-prove, and gives AVG a single source of truth
for routing decisions.
Required work (rs-drive only — no grovedb dependency)
Everything needed is already present in grovedb:
Element::count_sum_value_or_default()grove_get_rawon PCPS elementgrove_queryon PCPS range(count, sum)per elementQuery::new_aggregate_count_and_sum_on_rangeNo new grovedb primitive is required. There's a possible optional
optimization where grovedb could expose a no-prove analog of
AggregateCountAndSumOnRangethat does the accumulation in-enginerather than returning all PCPS elements over the FFI for Drive to
fold — but that's a perf tune, not a blocker. Today's
grove_queryover PCPS elements + in-Drive accumulation is functionally
equivalent.
Drive surfaces to add
DocumentCountSumRequeststruct inpackages/rs-drive/src/query/drive_document_average_query/mod.rs(or repurpose
DocumentAverageRequestif its field set alreadymatches — likely yes).
DocumentCountSumResponseenum withAggregate { count, sum }and
Entries(Vec<CountSumEntry>)variants. Already exists asDocumentAverageResponse— can be reused directly.Joint per-mode executors (these are the new functions, mostly
thin compositions of existing count + sum per-mode executors):
execute_document_count_and_sum_total_no_proof[…doctype, 0]count_sum_value_or_default()execute_document_count_and_sum_point_lookup_no_proofexecute_document_count_and_sum_per_in_value_no_proofexecute_document_count_and_sum_range_no_proof(count, sum)per elementEach is structurally a fold of the existing pair of per-mode
executors with a shared grovedb walk. The sum side already knows
how to walk these shapes; the count extraction is just adding
count_value_or_default()to what the sum walker alreadyproduces (it currently reads
sum_value_or_default()only).Single dispatcher
Drive::execute_document_count_and_sum_requestthat:
detect_sum_mode_from_inputsrouting table (the sum side's table — count's identical-shape
table is no longer separately needed for the AVG path).
Delete the count↔avg / sum↔avg adapter logic that currently
maps
AverageMode→(CountMode, SumMode)(lines 86-94 of thecurrent dispatcher) and the response-shape zipping at lines
148-176. Both go away once the joint executors return the
correct shape directly.
Drive surfaces to remove
count_request+sum_requeststruct construction (lines 100-120of
drive_document_average_query/drive_dispatcher.rs).zip_entrieshelper used to merge the two response vectors(still needed for the prove path's carrier case? — verify; if so,
keep it where the prove path consumes it).
let local_tx = self.grove.start_transaction(); …sharedtransaction plumbing (still needed but moves inside the joint
executor where it has fewer hops).
Validation
walk. Tests in
verify/document_sum/and the AVG bench's Q1-Q7should remain green without modification.
drive_dispatcher.rs:test_…)should produce identical
(count, sum)outputs as before — onlythe underlying grovedb-call count changes.
time on AVG queries (one walk instead of two). Worth pinning the
speedup in the chapter.
Acceptance criteria
execute_document_average_requestno longer constructs separateDocumentCountRequest+DocumentSumRequestinstances on theno-prove path.
row / one grovedb walk per range query (confirm with a probe
counter or a tracing span if available).
cargo test -p drive --lib drive_document_average_query).(count, sum)match what the current double-dispatch produces, against the
same grades-contract fixture.
updates the AVG chapter's
What's Nextif it currentlymentions the composition strategy).
Grovedb dependency
None required. All primitives needed are already present in the
current pinned grovedb rev. The optional no-prove-aggregate engine-side
optimization (avoid materializing all PCPS elements over FFI) would
be a separate grovedb PR if performance profiling warrants it after
this refactor lands.
Why not bundle into PR #3661
The refactor needs four new no-prove joint executors, each with their
own per-mode dispatch logic and per-mode test coverage. PR #3661 was
already large enough that bundling this in would have made it
unreviewable. Splitting this out gets it its own focused review window.
Context
Referenced from PR #3661 inline comment:
#3268402399.
The dispatcher carries an inline architectural-note comment at
drive_dispatcher.rs:100-115
pointing at this issue.