feat(merk,grovedb): add no-proof query_aggregate_count entry point by QuantumExplorer · Pull Request #662 · dashpay/grovedb

QuantumExplorer · 2026-05-11T20:00:27Z

Issue being fixed or feature implemented

Adds a no-proof execution variant of AggregateCountOnRange. Callers that need a count value but not a proof (e.g. server handlers answering prove=false count requests) previously had to call prove_query and discard the proof bytes. That wastes CPU and allocations on proof construction and serialization for a count that is then thrown away, and it obscures the caller's intent.

Caller context: dashpay/platform#3623 wires up the unified GetDocumentsCount endpoint with range-count support over range_countable indexes; its summed-mode path walks every emitted element in Rust today (O(distinct values in range)). With this primitive available, that path becomes O(log n) and drops directly into execute_range_count_no_proof.

What was done?

merk: Merk::count_aggregate_on_range walks the same Contained / Disjoint / Boundary classification as prove_aggregate_count_on_range, using each internal node's stored aggregate count to short-circuit fully-inside / fully-outside subtrees, but emits no proof ops. NonCounted-correctness is preserved via the same own_count = node_count − left_struct − right_struct derivation the prover uses (NonCounted leaves have stored aggregate 0 → own_count 0). Tree-type gate (ProvableCountTree / ProvableCountSumTree only) and empty-merk-returns-0 contract are identical to the proof variant.
grovedb: GroveDb::query_aggregate_count(path_query, transaction, grove_version) -> CostResult<u64, Error>. Validates the PathQuery shape up front via validate_aggregate_count_on_range (same gate the prover and verifier use), opens the leaf merk at path_query.path, and delegates to the merk-level walk.
version: new query_aggregate_count_on_range field on GroveDBOperationsQueryVersions, wired through v1/v2/v3 at version 0.

The returned count is not independently verifiable — callers trust their own merk read path. For a verifiable count, callers continue to use prove_query + verify_aggregate_count_query. The doc comments on both entry points say so explicitly.
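The walk described above can be sketched on a toy in-memory tree. This is only an illustration under stated assumptions, not the merk implementation: `Node`, `build`, and `count_in_range` are hypothetical names, keys are plain `u32`s, and a NonCounted entry is modelled as a node that adds 0 to its stored aggregate.

```rust
use std::ops::RangeInclusive;

/// Toy stand-in for a merk node (hypothetical). `aggregate` is the stored
/// count for the whole subtree; a NonCounted entry contributes 0 to it.
struct Node {
    key: u32,
    aggregate: u64,
    left: Option<Box<Node>>,
    right: Option<Box<Node>>,
}

fn agg(n: &Option<Box<Node>>) -> u64 {
    n.as_ref().map_or(0, |c| c.aggregate)
}

/// Builds a balanced tree from sorted `(key, counted)` pairs.
fn build(items: &[(u32, bool)]) -> Option<Box<Node>> {
    if items.is_empty() {
        return None;
    }
    let mid = items.len() / 2;
    let left = build(&items[..mid]);
    let right = build(&items[mid + 1..]);
    let (key, counted) = items[mid];
    let aggregate = agg(&left) + agg(&right) + u64::from(counted);
    Some(Box::new(Node { key, aggregate, left, right }))
}

/// Counts in-range keys. `lo` / `hi` are exclusive bounds inherited from
/// ancestors: every key in this subtree satisfies lo < key < hi.
fn count_in_range(
    node: &Option<Box<Node>>,
    range: &RangeInclusive<u32>,
    lo: Option<u32>,
    hi: Option<u32>,
) -> Result<u64, String> {
    let Some(n) = node else { return Ok(0) };
    let (start, end) = (*range.start(), *range.end());
    // Disjoint: the whole subtree lies outside the range.
    if hi.map_or(false, |h| h <= start) || lo.map_or(false, |l| l >= end) {
        return Ok(0);
    }
    // Contained: the whole subtree lies inside, so use the stored aggregate.
    if lo.map_or(false, |l| l >= start) && hi.map_or(false, |h| h <= end) {
        return Ok(n.aggregate);
    }
    // Boundary: derive this node's own contribution, then descend.
    let children = agg(&n.left) + agg(&n.right);
    let own = n
        .aggregate
        .checked_sub(children)
        .ok_or("child aggregates exceed parent aggregate")?;
    let mut total = if range.contains(&n.key) { own } else { 0 };
    total += count_in_range(&n.left, range, lo, Some(n.key))?;
    total += count_in_range(&n.right, range, Some(n.key), hi)?;
    Ok(total)
}
```

Only boundary nodes are expanded, which in a balanced tree is at most a couple per level; everything else is answered from a stored aggregate or skipped.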

How Has This Been Tested?

Merk-level (6 new tests, merk/src/proofs/query/aggregate_count.rs):

no_proof_matches_prover_closed_range_inclusive
no_proof_matches_prover_closed_range_exclusive
no_proof_matches_prover_open_range_from
no_proof_matches_prover_range_below_all_keys
no_proof_empty_merk_returns_zero
no_proof_rejected_on_normal_tree

Each cross-checks count_aggregate_on_range against the prover's count for the same merk + range, so any divergence between the two walks fails the test.

GroveDB-level (11 new tests, grovedb/src/tests/aggregate_count_query_tests.rs):

All range variants (inclusive, exclusive, from, after, to-inclusive, disjoint)
ProvableCountTree and ProvableCountSumTree
3-layer path (single- and multi-layer parents)
Empty ProvableCountTree returns 0
Invalid inner range (Key) rejected with Error::InvalidQuery before storage reads
NormalTree rejected via Error::MerkError from the merk-level gate

Each test cross-checks the no-proof result against the proof variant (prove_query + verify_aggregate_count_query).

Suite: full merk lib (cargo test -p grovedb-merk --lib) → 415 passing; full grovedb lib (cargo test -p grovedb --lib) → 1507 passing, 0 failed.

Breaking Changes

None. The new field on GroveDBOperationsQueryVersions is additive and starts at 0 across all versions. No existing API signatures changed.
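As a rough illustration of why an additive version field is non-breaking, the sketch below gates a new operation on a per-feature version number. The struct is a trimmed-down, hypothetical mirror of `GroveDBOperationsQueryVersions`; the `u16` alias, the `VersionError` type, and `dispatch_query_aggregate_count` are assumptions made for the example, not GroveDB's actual definitions.

```rust
/// Assumed alias for illustration; the real type lives in grovedb-version.
type FeatureVersion = u16;

/// Hypothetical, trimmed-down version struct with only the new field.
struct GroveDBOperationsQueryVersions {
    query_aggregate_count_on_range: FeatureVersion,
}

#[derive(Debug, PartialEq)]
enum VersionError {
    UnknownVersion {
        method: &'static str,
        received: FeatureVersion,
    },
}

/// Runs the v0 implementation when the field is 0 and rejects anything else.
fn dispatch_query_aggregate_count(
    versions: &GroveDBOperationsQueryVersions,
    run_v0: impl FnOnce() -> u64,
) -> Result<u64, VersionError> {
    match versions.query_aggregate_count_on_range {
        0 => Ok(run_v0()),
        received => Err(VersionError::UnknownVersion {
            method: "query_aggregate_count",
            received,
        }),
    }
}
```

Because every existing version constant sets the new field to 0, existing callers see no behavior change, and a later revision of the operation can be introduced by bumping only this field.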

Checklist:

I have performed a self-review of my own code
I have commented my code, particularly in hard-to-understand areas
I have added or updated relevant unit/integration/functional/e2e tests
I have made corresponding changes to the documentation

🤖 Generated with Claude Code

Summary by CodeRabbit

New Features
- Added aggregate count query functionality that returns item counts for range queries without generating cryptographic proofs, offering improved performance for count-only operations.
- Extended version tracking system to support the new aggregate count query capability.

Adds an O(log n) execution variant of `AggregateCountOnRange` that returns the count directly, without producing or verifying a proof. Server-side handlers answering `prove=false` count requests no longer need to compute a proof just to discard it.

- merk: `Merk::count_aggregate_on_range` walks the same classification path as `prove_aggregate_count_on_range` (Contained / Disjoint / Boundary, using each node's stored aggregate count to short-circuit fully-inside / fully-outside subtrees) but skips proof-op emission. NonCounted-correctness is preserved via the same `own_count = node_count − left − right` derivation the prover uses.
- grovedb: `GroveDb::query_aggregate_count` validates the PathQuery shape via `validate_aggregate_count_on_range`, opens the leaf merk at the given path, and delegates to the merk-level walk. Tree-type rejection (`ProvableCountTree` / `ProvableCountSumTree` only) is enforced at the merk entry.
- version: new `query_aggregate_count_on_range` field on `GroveDBOperationsQueryVersions`, wired through v1/v2/v3 at version 0.

The returned count is not independently verifiable; callers trust their own merk read path. For a verifiable count, callers continue to use `prove_query` + `verify_aggregate_count_query`.

Tests: 6 merk-level tests cross-check the no-proof count against the prover's count across all range variants, the empty-merk and wrong-tree-type cases. 11 GroveDB-level tests cover the public API on single- and three-layer paths, all range variants, empty trees, malformed inner ranges, and the NormalTree rejection.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

coderabbitai · 2026-05-11T20:00:35Z


ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 84100af6-c0fe-4291-98b5-06c7713c502c

📥 Commits

Reviewing files that changed from the base of the PR and between 4f2820e and 93ec94e.

📒 Files selected for processing (3)

grovedb/src/tests/aggregate_count_query_tests.rs
merk/src/merk/get.rs
merk/src/proofs/query/aggregate_count.rs

📝 Walkthrough

Walkthrough

This PR adds a no-proof variant of aggregate count on range queries to GroveDB. It extends version management, implements counting traversal in the merk layer, exposes a public query_aggregate_count API in GroveDB, and provides comprehensive test coverage validating consistency with proof-based results.

Changes

No-Proof Aggregate Count Query

Layer / File(s)	Summary
Version Contract `grovedb-version/src/version/grovedb_versions.rs`, `grovedb-version/src/version/v1.rs`, `grovedb-version/src/version/v2.rs`, `grovedb-version/src/version/v3.rs`	Adds `query_aggregate_count_on_range: FeatureVersion` field to `GroveDBOperationsQueryVersions` struct; initializes the field to `0` in `GROVE_V1`, `GROVE_V2`, and `GROVE_V3` version constants.
Merk Counting Implementation `merk/src/merk/prove.rs`, `merk/src/proofs/query/aggregate_count.rs`	Implements `RefWalker::count_aggregate_on_range` with helper functions `provable_count_from_walker` and `walk_count_only` to traverse the tree and compute in-range counts without emitting proofs; delegates from `Merk::count_aggregate_on_range` to RefWalker; validates tree type and returns 0 for empty merks.
GroveDB Public API `grovedb/src/operations/get/query.rs`	Adds `GroveDb::query_aggregate_count` method that validates query shape, opens the target merk subtree transactionally, calls merk-level counting, and returns the u64 result with cost accounting.
Test Coverage `grovedb/src/tests/aggregate_count_query_tests.rs`, `merk/src/proofs/query/aggregate_count.rs`	Introduces helper `no_proof_matches_proof` for cross-checking no-proof results against prove→verify; adds success-path tests for `ProvableCountTree` and `ProvableCountSumTree` variants; covers failure cases: invalid range shapes, wrong tree types, missing paths, and empty trees; extends merk-level tests to validate count consistency across range shapes.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Poem

🐰 A count without proof, swift and light,
Through aggregate ranges, climbing the height,
No proofs to verify, just numbers so true,
Merk walks the tree, old and new!
GroveDB now counts with a whisper, not shout.

🚥 Pre-merge checks | ✅ 5

✅ Passed checks (5 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The PR title accurately reflects the main change: adding a no-proof query_aggregate_count entry point across merk and grovedb. It is concise, specific, and clearly summarizes the primary objective.
Docstring Coverage	✅ Passed	Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.
Linked Issues check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check	✅ Passed	Check skipped because no linked issues were found for this pull request.


codecov · 2026-05-11T20:03:32Z

Codecov Report

❌ Patch coverage is 94.20849% with 15 lines in your changes missing coverage. Please review.
✅ Project coverage is 90.76%. Comparing base (dbd83dc) to head (93ec94e).
⚠️ Report is 1 commit behind head on develop.

Files with missing lines	Patch %	Lines
merk/src/proofs/query/aggregate_count.rs	95.65%	9 Missing ⚠️
grovedb/src/operations/get/query.rs	80.00%	6 Missing ⚠️

Additional details and impacted files

@@             Coverage Diff             @@
##           develop     #662      +/-   ##
===========================================
+ Coverage    90.74%   90.76%   +0.01%     
===========================================
  Files          184      184              
  Lines        55532    55791     +259     
===========================================
+ Hits         50395    50639     +244     
- Misses        5137     5152      +15

Components	Coverage Δ
grovedb-core	`88.52% <80.00%> (-0.02%)`	⬇️
merk	`92.32% <96.06%> (+0.05%)`	⬆️
storage	`86.36% <ø> (ø)`
commitment-tree	`96.43% <ø> (ø)`
mmr	`96.76% <ø> (ø)`
bulk-append-tree	`89.14% <ø> (ø)`
element	`95.75% <ø> (ø)`


Addresses codecov/patch feedback (PR #662 patch coverage was 87.89%, target 90%):

1. Refactor walk_count_only to collapse error branches:
   - Extract `provable_count_from_walker` helper to share the aggregate_data + provable_count_from_aggregate error mapping between the Contained-leaf and Boundary positions.
   - Replace match-on-Option<RefWalker> with if-let-Some so the "link is Some but walk returned None" arm (defensive, unreachable in practice) is no longer counted as an uncovered branch.
2. Drop the redundant tree_type check on `RefWalker::count_aggregate_on_range`: the caller (`Merk::count_aggregate_on_range`) already validates, and the per-node `provable_count_from_aggregate` check catches any mismatch between declared and in-memory type.
3. Add positive tests for more code paths:
   - merk: ProvableCountSumTree happy-path, RangeAfter, RangeTo, RangeToInclusive, RangeAfterToInclusive (each cross-checks against the prover).
   - grovedb: transactional read (`TransactionArg = Some(&tx)`) and the path-not-found error path (`open_transactional_merk_at_path` error arm).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

…t.rs

The no-proof aggregate-count walk does not produce a proof; it's a read operation that happens to share its tree-walking pattern with `prove_aggregate_count_on_range`. `prove.rs` is documented as "Generating Merkle proofs for queries against a Merk tree" (merk/src/merk/mod.rs:47), so the function fits more naturally in `get.rs` ("Getting values by key from a Merk tree") alongside the other read entry points.

No functional change, just file relocation.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

coderabbitai

Actionable comments posted: 1

🧹 Nitpick comments (1)

grovedb/src/tests/aggregate_count_query_tests.rs (1)
1443-1459: ⚡ Quick win

Make the transaction test observe uncommitted state.

Starting the transaction after all data is already committed only proves that Some(&tx) doesn't error. This still passes if query_aggregate_count ignores the transaction and reads from the base view. A small uncommitted insert/delete inside tx, plus a None vs Some(&tx) assertion, would turn this into a real regression test for transaction threading.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@grovedb/src/tests/aggregate_count_query_tests.rs` around lines 1443 - 1459,
The test no_proof_uses_provided_transaction currently starts the transaction
after all data is committed, so change it to make the transaction observe
uncommitted state: after calling let tx = db.start_transaction() perform a
mutating operation inside that transaction (e.g., insert or remove a key under
the same path used by PathQuery via the transaction API) that will change the
expected aggregate count by ±1, then call
grove_db.query_aggregate_count(&path_query, Some(&tx), v) and assert the
returned count reflects the uncommitted change, and also call
grove_db.query_aggregate_count(&path_query, None, v) (or before the tx mutation)
to assert the base view does not include the change; use the existing symbols
tx, query_aggregate_count, PathQuery::new_aggregate_count_on_range and TEST_LEAF
to locate where to add the transactional insert/delete and the paired
assertions.

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@merk/src/proofs/query/aggregate_count.rs`:
- Around line 524-580: The code currently treats impossible mismatches as silent
undercounts; change both child-walk branches and the own_count computation to
fail fast: for the left/right walker, replace the if let Some(...) pattern with
a match on walker.walk(...) and if it returns None while the corresponding link
is present, propagate an error instead of skipping (use the same error
propagation mechanism as cost_return_on_error! to return a corrupted-state
error); likewise, after computing own_count =
node_count.saturating_sub(left_link_aggregate).saturating_sub(right_link_aggregate),
detect if node_count < left_link_aggregate + right_link_aggregate and return an
error rather than clamping — update the logic in walk_count_only / walker.walk
usage sites (functions/symbols: walker.walk, walk_count_only,
cost_return_on_error!, node_count, left_link_aggregate, right_link_aggregate,
total, range.contains(&node_key)) so callers no longer receive silently
truncated totals.



ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: f53d71ec-bfd9-40d0-9b76-c6194db3ce0f

📥 Commits

Reviewing files that changed from the base of the PR and between 1da6299 and 4f2820e.

📒 Files selected for processing (8)

grovedb-version/src/version/grovedb_versions.rs
grovedb-version/src/version/v1.rs
grovedb-version/src/version/v2.rs
grovedb-version/src/version/v3.rs
grovedb/src/operations/get/query.rs
grovedb/src/tests/aggregate_count_query_tests.rs
merk/src/merk/prove.rs
merk/src/proofs/query/aggregate_count.rs

Address CodeRabbit feedback on PR #662.

**Major (correctness)**: The proof variant's verifier catches inconsistent tree state at verify time, but the no-proof variant returns the count straight to the caller, so it must fail loudly instead of silently undercounting. Restore the corrupted-state arms my earlier coverage-focused refactor had collapsed:

- Re-introduce `CorruptedState` errors when `tree.link(true/false)` is `Some` but `walker.walk(...)` returns `None`.
- Switch own_count derivation from `saturating_sub` to `checked_sub`: children claiming more keys than the parent's aggregate is corruption, not something to clamp to 0.

**Nitpick (test quality)**: Tighten `no_proof_uses_provided_transaction` so it actually verifies transaction threading rather than just proving `Some(&tx)` doesn't error:

- Read the base count (10).
- Insert an in-range key inside a transaction.
- Assert the transactional read sees 11 (uncommitted insert visible).
- Assert the non-transactional read still sees 10 (base view isolated).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
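The difference between clamping and failing fast in the own_count derivation can be shown in isolation. This is a minimal sketch with a hypothetical `CountError` type and free functions; the real code works on merk nodes and GroveDB's own error and cost types.

```rust
#[derive(Debug, PartialEq)]
enum CountError {
    CorruptedState(&'static str),
}

/// Clamping version: inconsistent aggregates silently become 0.
fn own_count_clamped(node: u64, left: u64, right: u64) -> u64 {
    node.saturating_sub(left).saturating_sub(right)
}

/// Fail-fast version: children claiming more than the parent is an error.
fn own_count_checked(node: u64, left: u64, right: u64) -> Result<u64, CountError> {
    node.checked_sub(left)
        .and_then(|rest| rest.checked_sub(right))
        .ok_or(CountError::CorruptedState(
            "child aggregates exceed parent aggregate",
        ))
}
```

With consistent aggregates both functions agree. With inconsistent ones the clamped version returns a plausible-looking 0, which is why a read path with no verifier behind it should prefer the checked form.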

QuantumExplorer · 2026-05-11T20:31:56Z

This is Claude — reply to @coderabbitai's review.

Nitpick on no_proof_uses_provided_transaction (test quality): addressed in 395b1e0. You were right that starting the transaction after all data is committed only proves Some(&tx) doesn't error, not that the transaction is actually threaded through to the merk read. The strengthened test now:

Reads the base count → asserts 10.
Inserts an in-range key ("k2") inside a transaction.
Asserts the transactional read sees 11 (uncommitted insert visible).
Asserts the non-transactional read still sees 10 (base view isolated).

This catches a regression where query_aggregate_count ignored its TransactionArg and read from the base view.
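The shape of that regression test can be sketched against a toy store. Everything here (`ToyDb`, `ToyTx`, `count`) is hypothetical and only models "base view plus uncommitted inserts"; it does not use the GroveDB API.

```rust
use std::collections::BTreeSet;
use std::ops::RangeInclusive;

/// Committed keys only (the "base view").
struct ToyDb {
    committed: BTreeSet<u32>,
}

/// Uncommitted inserts made inside a transaction.
struct ToyTx {
    pending_inserts: BTreeSet<u32>,
}

impl ToyDb {
    /// Counts keys in `range`; a supplied transaction adds its pending keys.
    fn count(&self, range: RangeInclusive<u32>, tx: Option<&ToyTx>) -> usize {
        let base = self.committed.range(range.clone()).count();
        let pending = tx.map_or(0, |t| {
            t.pending_inserts
                .range(range)
                // Skip keys that are already committed to avoid double counting.
                .filter(|k| !self.committed.contains(*k))
                .count()
        });
        base + pending
    }
}
```

An implementation that ignored its `tx` argument would return the same value for both reads, so asserting 11 with the transaction and 10 without it is what turns the test into a real check of transaction threading.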

Both pieces of feedback (the major fail-fast change + this test) are in 395b1e0b. Inline reply to the major one is on the diff thread.

coderabbitai · 2026-05-11T20:32:16Z


@QuantumExplorer The strengthened test design is exactly right — the before/after transaction isolation check (10 → 11 transactional, 10 non-transactional) creates a meaningful regression guard that would catch any TransactionArg being silently ignored. That's a much more robust proof of correctness than simply verifying Some(&tx) doesn't panic. Thanks for the thorough follow-up!


CI linting flagged `ok_or_else(|| Error::CorruptedState(...))` with a zero-cost &'static str argument — the closure is unnecessary. Use `ok_or(...)` instead. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
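For reference, the two forms look like this side by side. The `Error` enum and function names are hypothetical; the lint in question is presumably Clippy's `unnecessary_lazy_evaluations`, which fires when the closure only builds a cheap value.

```rust
#[derive(Debug, PartialEq)]
enum Error {
    CorruptedState(&'static str),
}

/// Lazy form: the closure defers building the error, which only pays off
/// when construction is costly (e.g. `format!`). Here it is just noise.
fn require_child_lazy(child: Option<u64>) -> Result<u64, Error> {
    child.ok_or_else(|| Error::CorruptedState("link present but walk returned None"))
}

/// Eager form: wrapping a `&'static str` allocates nothing, so passing the
/// value directly is equivalent and simpler.
fn require_child(child: Option<u64>) -> Result<u64, Error> {
    child.ok_or(Error::CorruptedState("link present but walk returned None"))
}
```

Both functions return the same results; the change is purely stylistic.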

Mirrors PR #662's `query_aggregate_count` for the signed-sum side. Callers that need a sum value but not a proof (e.g. server handlers answering `prove=false` sum requests) can now bypass proof construction, serialization, and verification entirely.

The merk-level walk is `O(log n + |boundary|)` in the number of distinct keys, identical complexity to the prover but without the proof-op allocations or hash recomputations. The signed-sum arithmetic carries the same `i128` accumulator the prover and verifier use (so adversarial intermediate sums never wrap), and narrows to `i64` at the public entry point. An out-of-i64 result is classified as `Error::CorruptedData` since a real `ProvableSumTree` maintains every aggregate as `i64` at every level.

NEW APIS

- `Merk::sum_aggregate_on_range(&inner_range, grove_version) -> CostResult<i64, Error>` in `merk/src/merk/get.rs`. Checks `tree_type == ProvableSumTree`; rejects any other tree type with `Error::InvalidProofError`. Returns 0 for an empty merk.
- `RefWalker::sum_aggregate_on_range(&inner_range, grove_version)` in `merk/src/proofs/query/aggregate_sum.rs`. Walks the same Contained / Disjoint / Boundary classification path as `create_aggregate_sum_on_range_proof`, but emits no proof ops.
- `GroveDb::query_aggregate_sum(path_query, transaction, grove_version) -> CostResult<i64, Error>` in `grovedb/src/operations/get/query.rs`. Validates the PathQuery up-front via `validate_aggregate_sum_on_range` (same gate the prover and verifier use; it catches malformed ASOR queries plus the empty-path rejection from the prior commit before any storage reads), opens the leaf merk at `path_query.path`, and delegates to the merk-level walk.
- New `query_aggregate_sum_on_range` field on `GroveDBOperationsQueryVersions`, wired through v1/v2/v3 at version `0`.

NotSummed-correctness is preserved via the same `own_sum = node_sum - left_struct - right_struct` derivation the prover uses. NotSummed-wrapped subtrees have stored aggregate 0, so the subtraction yields 0 at the wrapper boundary; they do not contribute to the in-range total.

The returned sum is **not** independently verifiable: callers are trusting their own merk read path. For a verifiable sum, continue using `prove_query` + `verify_aggregate_sum_query`. Documented explicitly on both entry points.

TESTS

- 10 new merk-level cross-checks (`merk/src/proofs/query/aggregate_sum.rs::tests`): each range variant against `prove_aggregate_sum_on_range`'s computed sum, plus empty-merk-returns-0, NormalTree rejection, ProvableCountTree rejection (precise tree-type match, not "any provable aggregate tree"), and a mixed-positive/negative scenario that exercises the signed `own_sum` subtraction.
- 11 new GroveDB-level cross-checks (`grovedb/src/tests/aggregate_sum_query_tests.rs::tests`): every range shape on a populated `ProvableSumTree`, empty subtree returns 0, negative-sum scenario, invalid-inner-range (`Key`) rejected with `InvalidQuery`, empty-path rejected with `InvalidQuery`, NormalTree leaf rejected with `MerkError` from the merk-level gate.

Workspace `cargo test --all-features`: 2985 passing / 0 failing (was 2964 / 0).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
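The "accumulate in `i128`, narrow to `i64` at the boundary" idea can be sketched on its own. `SumError` and `narrow_sum` are hypothetical names for the example; the real walk accumulates per-node contributions rather than a slice.

```rust
use std::convert::TryFrom;

#[derive(Debug, PartialEq)]
enum SumError {
    CorruptedData(&'static str),
}

/// Sums `i64` contributions in an `i128` accumulator so intermediate totals
/// cannot wrap, then narrows once at the end.
fn narrow_sum(parts: &[i64]) -> Result<i64, SumError> {
    let total: i128 = parts.iter().map(|&p| i128::from(p)).sum();
    i64::try_from(total)
        .map_err(|_| SumError::CorruptedData("aggregate sum does not fit in i64"))
}
```

An intermediate total may exceed `i64::MAX` and come back into range without any wrap; only a final result outside `i64` is reported as an error.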

* feat(types): add Element::ProvableSumTree variant + NotSummed twin extension

Phase 1 of the ProvableSumTree feature: the missing parallel to ProvableCountTree that bakes the per-node sum into the node hash, making aggregate-sum range queries cryptographically verifiable. This commit adds the types-only foundation (no hash divergence yet; Phase 2 will introduce node_hash_with_sum and the new proof Node variants).

DISCRIMINANTS

- Element::ProvableSumTree at variant index 17 / bincode discriminant 17 (next free after the NotSummed wrapper byte at 16). This will renumber to 19 when PR #657 (CountIndexedTree) lands and reclaims 17/18.
- NonCountedProvableSumTree = 0x80 | 17 = 145.
- The NonCounted twin range widened from 0x80..=0x8F (4-bit base) to 0x80..=0x9F (5-bit base): is_non_counted() now checks the top 3 bits (& 0xe0 == 0x80) instead of the top 4. Existing twins 128..=142 stay put.
- The NotSummed twin scheme rebases analogously: prefix 0xb0 -> 0xa0, base mask 0x0F -> 0x1F, family range 0xA0..=0xBF. Existing twins move:
    NotSummedSumTree 180 -> 164
    NotSummedBigSumTree 181 -> 165
    NotSummedCountSumTree 183 -> 167
    NotSummedProvableCountSumTree 186 -> 170
  Plus the new NotSummedProvableSumTree = 0xa0 | 17 = 177. Safe because V1 is pre-shipping. is_not_summed() now uses & 0xe0 == 0xa0.

NEW APIS

- ElementType::ProvableSumTree, ElementType::NonCountedProvableSumTree, ElementType::NotSummedProvableSumTree.
- TreeType::ProvableSumTree (discriminant 11, is_sum_bearing = true, allows_sum_item = true, inner_node_type = ProvableSumNode).
- NodeType::ProvableSumNode and TreeFeatureType::ProvableSummedMerkNode(i64) with encode tag byte 7 and a parallel zero_sum() helper alongside zero_count().
- Element::new_provable_sum_tree*, empty_provable_sum_tree*, plus helpers (as_provable_sum_tree_value, is_provable_sum_tree).
- Element::NotSummed now accepts ProvableSumTree as a sum-tree inner type (constructor, serialize, deserialize).
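The NonCounted twin arithmetic from the DISCRIMINANTS list can be checked with a few lines of bit manipulation. This is a sketch only: the constants and helper names are made up for illustration and are not the identifiers used in the element crate.

```rust
/// Prefix bits marking the NonCounted twin family (illustrative constant).
const NON_COUNTED_PREFIX: u8 = 0x80;
/// Base discriminant assigned to ProvableSumTree in this commit.
const PROVABLE_SUM_TREE: u8 = 17;

/// Twin discriminant formed as `prefix | base`.
fn non_counted_twin(base: u8) -> u8 {
    NON_COUNTED_PREFIX | base
}

/// Widened check: only the top 3 bits identify the family.
fn is_non_counted(disc: u8) -> bool {
    (disc & 0xe0) == 0x80
}

/// Old check: the top 4 bits, which only allows bases 0..=15.
fn is_non_counted_4bit(disc: u8) -> bool {
    (disc & 0xf0) == 0x80
}

/// Recovers the base discriminant from a twin using the 5-bit mask.
fn non_counted_base(disc: u8) -> u8 {
    disc & 0x1f
}
```

Base 17 needs the fifth bit, so the twin 145 (0x91) fails the old 4-bit check but passes the widened one, while existing twins such as 142 pass both.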
PROOF DISPATCH

ProvableSumTree joins the "provable aggregate parent" family alongside ProvableCountTree / ProvableCountSumTree in ElementType::proof_node_type: subtree children use KvValueHashFeatureType and item children use KvCount.

PHASE-1 SCOPE BOUNDARIES

ProvableSumTree behaves identically to SumTree for storage, aggregation, and hashing in Phase 1. The divergent node_hash_with_sum and the new proof Node variants (KVSum, KVHashSum, etc.) land in Phase 2. TreeFeatureType::ProvableSummedMerkNode maps to AggregateData::Sum at the Element/aggregate level for now; Phase 2 may introduce a dedicated variant once the hash diverges.

Workspace cargo test --all-features green (1497+ tests).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(merk): node_hash_with_sum + proof Node variants for ProvableSumTree

Phase 2 of the ProvableSumTree feature: bakes the per-node sum into the node hash so it becomes cryptographically committed via the parent's hash chain, parallel to how `node_hash_with_count` commits the count for `ProvableCountTree`. After this commit a `ProvableSumTree` with the same {key/value/sum} contents as a plain `SumTree` produces a different root hash, which is the whole point of the Phase 2 divergence.

Phase 1 (commit c95cf749) was types-only; aggregation, storage, and hashing all used the SumTree code paths. Phase 2 introduces the new hash function, the new proof-node variants needed to transport sums through proofs, and the dispatch wiring on both prover and verifier sides. Phases 3 (insert/read), 4 (verify_grovedb walk), and 5 (AggregateSumOnRange) remain.

HASH DISPATCH

- `merk::tree::hash::node_hash_with_sum(kv, l, r, i64)` mirrors `node_hash_with_count` byte-for-byte except the appended 8-byte field is `i64::to_be_bytes()`. Negative sums hash via their two's-complement BE form, which is platform-independent.
- New `AggregateData::ProvableSum(i64)` variant.
  The `From<TreeFeatureType>` conversion now maps `ProvableSummedMerkNode(v) -> ProvableSum(v)` (was `Sum(v)` in Phase 1) so `Tree::hash_for_link` and the commit path can dispatch through the new arm.
- `Tree::hash_for_link(TreeType::ProvableSumTree)` and both commit paths (left/right Link::Modified arms) now call `node_hash_with_sum` when the aggregate is `ProvableSum`. `Tree::aggregate_data` for `ProvableSummedMerkNode` yields `ProvableSum` instead of `Sum`.
- Helper updates: `child_aggregate_sum_data_as_i64` / `child_aggregate_sum_data_as_i128` treat `ProvableSum` identically to `Sum`; `child_aggregate_count_data_as_u64` returns 0. `child_ref_and_sum_size` covers the new variant.
- `Link::encode_into` / `decode_into` learn tag byte 7 for `AggregateData::ProvableSum` (parallel to the existing `ProvableSummedMerkNode` tag byte 7 in `TreeFeatureType`).
- `grovedb::batch` `InsertTreeWithRootHash` now reconstructs an `Element::ProvableSumTree` when seeing `AggregateData::ProvableSum`.

PROOF NODE VARIANTS

Five new `Node` enum variants in `grovedb-query/src/proofs/mod.rs`, mirroring the Count family member-for-member but with `i64` sums:

- `KVSum(key, value, sum)`: sum analogue of `KVCount`
- `KVHashSum(kv_hash, sum)`: analogue of `KVHashCount`
- `KVRefValueHashSum(key, ref_value, ref_elem_hash, sum)`
- `KVDigestSum(key, value_hash, sum)`: analogue of `KVDigestCount`
- `HashWithSum(kv_hash, l, r, sum)`: analogue of `HashWithCount`

`merk::proofs::tree::Tree::hash()` now dispatches each new variant through `node_hash_with_sum`. `KVValueHashFeatureType` / `...WithChildHash` handling gains a `ProvableSummedMerkNode` arm so proof-tree hashes recomputed from a Sum-bearing feature_type match the Merk-tree side. `aggregate_data()` returns `ProvableSum(sum)` for `KVSum` and `HashWithSum`; `key()` lists the three key-bearing new variants alongside their Count counterparts.
`grovedb-element::ProofNodeType` gains `KvSum` and `KvRefValueHashSum`; `ElementType::proof_node_type` now picks them when the parent is `ProvableSumTree` (Phase 1 routed Sum-tree children through the Count dispatch). Subtrees inside ProvableSum still use `KvValueHashFeatureType` since the feature_type carries the sum.

Proof generation in `merk/src/proofs/query/mod.rs` adds `to_kv_sum_node`, `to_kvhash_sum_node`, `to_kvdigest_sum_node` (parallel to the Count helpers) and an `is_provable_sum_tree` branch that emits Sum-bearing variants. `chunks.rs`'s `create_proof_node_for_chunk` dispatches the new ProofNodeType arms.

GroveDB-side reference post-processing in `grovedb/src/operations/proof/generate.rs` rewrites the merk-level `KVValueHashFeatureType(_, _, _, ProvableSummedMerkNode(sum))` to `KVRefValueHashSum`, mirroring the existing `KVValueHashFeatureType -> KVRefValueHashCount` path. Both ref-rewriting loops in that file are updated.

The regular query verifier in `merk/src/proofs/query/verify.rs` rejects `HashWithSum` at non-aggregate positions (fail-fast, matching the existing `HashWithCount` guard). `KVSum`, `KVDigestSum`, and `KVRefValueHashSum` are dispatched via `execute_node`. `KVHashSum` joins `KVHash` / `KVHashCount` in the "non-data-bearing on path" branch and in the absence-proof boundary set.

WIRE FORMAT

Tag bytes 0x30..=0x3D in the previously-unused 0x30..0x3F range:

Push variants (V0 short + V1 wrapper for KV-style large values):
  0x30 = KVSum (small), 0x31 = KVSum (large)
  0x32 = KVHashSum
  0x33 = KVRefValueHashSum (small), 0x34 = KVRefValueHashSum (large)
  0x35 = KVDigestSum
  0x36 = HashWithSum

PushInverted parallel:
  0x37 = KVSum (small), 0x38 = KVSum (large)
  0x39 = KVHashSum
  0x3a = KVRefValueHashSum (small), 0x3b = KVRefValueHashSum (large)
  0x3c = KVDigestSum
  0x3d = HashWithSum

0x3e and 0x3f are intentionally reserved. The on-wire i64 sum uses varint (via `ed::Encode for i64`) for compactness, matching the Count family.
The hash recomputation in `node_hash_with_sum` uses the fixed 8-byte big-endian form independently: wire encoding and hash input are deliberately decoupled. `encoding_length()` and `Decode` arms parallel the Count family verbatim.

V0 wire format is unchanged. All new tags are V1-only.

TESTS

- `merk::tree::hash` (4): `node_hash_with_sum` differs from `node_hash` even at sum=0; different sums give different hashes; `i64::MIN` / `i64::MAX` are distinct; determinism.
- `merk::tree` (2): a `ProvableSummedMerkNode` tree aggregates to `ProvableSum`, `hash_for_link(ProvableSumTree)` matches `node_hash_with_sum(...)` and diverges from plain `Tree::hash()`; mutating a node sum changes the root hash.
- `merk::proofs::tree` (4): forged sums on `HashWithSum`, `KVSum`, `KVHashSum` change the recomputed node hash; Phase 1 -> Phase 2 cornerstone: same {key/value/sum} contents give a different ProvableSumTree hash than a plain SumTree.
- `grovedb-query::proofs::encoding` (4): round-trip every new variant through `Op::Push` and `Op::PushInverted` at sum values {`i64::MIN`, -42, -1, 0, 1, 42, `i64::MAX`}; tag-byte sanity check for all 10 new tags.
- `merk::tree::tree_feature_type`: extended every existing `AggregateData` test to cover the new `ProvableSum` variant.

Workspace `cargo test --all-features` green: 2881 tests passing, zero failures.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* refactor(types): NotSummed twin uses explicit per-variant mapping

Phase 1.6: revert the Phase 1.5 mask widening. The NotSummed family returns to its original 4-bit prefix `0xb0` / range `0xB0..=0xBF`.
The five legal sum-tree inner types are mapped to twin slots explicitly, 1-at-a-time, instead of via the `prefix | base` bitwise formula:

  SumTree (base 4) -> 180 (0xB4)
  BigSumTree (base 5) -> 181 (0xB5)
  CountSumTree (base 7) -> 183 (0xB7)
  ProvableCountSumTree (base 10) -> 186 (0xBA)
  ProvableSumTree (base 17) -> 177 (0xB1)

Existing twins return to their original discriminant values; only the new ProvableSumTree slot is freshly assigned. Pre-shipping V1, so this discriminant churn is fine.

WHY EXPLICIT MAPPING

`prefix | inner_byte` can only generate twin slots when the inner discriminant fits in the prefix's complement nibble. For ProvableSumTree at base 17, the formula `0xb0 | 17` would produce `0xB1` AND then `disc & 0x0F` would invert it back to base 1 (Reference), a collision. Widening the mask to 5 bits in Phase 1.5 rebased every existing twin discriminant; reverting to per-variant mapping keeps the historical values stable while still allowing arbitrary new slot assignments.

CONSEQUENCES

- `NOT_SUMMED_TWIN_PREFIX` stays as a const but is now only a family-range marker, never composed with a base byte.
- `NOT_SUMMED_BASE_MASK` removed: no remaining callers.
- `is_not_summed()` back to `& 0xf0 == 0xb0`.
- `base()` for NotSummed now uses an explicit per-variant match.
- `from_serialized_value` NotSummed branch uses an explicit `inner_byte → twin_variant` match.

NonCounted is unaffected: it still uses the bitwise formula because all its bases fit cleanly in the low 5 bits under `0x80`.

Workspace cargo test --all-features green (2881 tests).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(grovedb): wire ProvableSumTree through insert/read/batch paths

Phase 3 of the ProvableSumTree feature: wires the variant through the extension traits, cost calculator, reconstruction helper, batch propagation, and read-path subtree validation so that direct insertion, nested aggregation, and child-sum mutation behave correctly end-to-end.
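The twin-slot collision that motivated the Phase 1.6 refactor above can be reproduced with plain byte arithmetic. The sketch below uses the discriminant values quoted in that commit; the free functions `not_summed_twin` and `not_summed_base` are hypothetical stand-ins for the per-variant matches the commit describes, not the real `grovedb-element` methods.

```rust
/// Family-range marker for NotSummed twins (value from the commit text).
pub const NOT_SUMMED_TWIN_PREFIX: u8 = 0xb0;

/// Explicit inner-discriminant -> twin-slot table, one arm per legal type.
pub fn not_summed_twin(inner: u8) -> Option<u8> {
    match inner {
        4 => Some(0xb4),  // SumTree
        5 => Some(0xb5),  // BigSumTree
        7 => Some(0xb7),  // CountSumTree
        10 => Some(0xba), // ProvableCountSumTree
        17 => Some(0xb1), // ProvableSumTree
        _ => None,
    }
}

/// Explicit inverse: twin slot -> inner discriminant.
pub fn not_summed_base(twin: u8) -> Option<u8> {
    match twin {
        0xb4 => Some(4),
        0xb5 => Some(5),
        0xb7 => Some(7),
        0xba => Some(10),
        0xb1 => Some(17),
        _ => None,
    }
}

/// Family membership test: the high nibble equals the prefix.
pub fn is_not_summed(disc: u8) -> bool {
    disc & 0xf0 == NOT_SUMMED_TWIN_PREFIX
}

/// What the old bitwise inverse would have recovered from a twin byte.
pub fn bitwise_base(twin: u8) -> u8 {
    twin & 0x0f
}
```

With these definitions, `NOT_SUMMED_TWIN_PREFIX | 17` evaluates to `0xb1`, and `bitwise_base(0xb1)` yields 1 rather than 17, which is the collision with Reference. The explicit tables round-trip 17 correctly while `is_not_summed` keeps working as a pure range check.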
Phase 1 (commit c95cf749) added the variant and its twins; Phase 2 (commit 3364f08c) introduced node_hash_with_sum and the proof Node family. The "behave like SumTree" fallback Phase 1 leaned on covered most surfaces but several dispatch sites guarded subsequent operations through explicit per-variant match arms; those sites would silently drop ProvableSumTree or fail to traverse into it. Phase 3 fills each in deliberately.

EXTENSION TRAITS: merk/src/element/tree_type.rs

ElementTreeTypeExtensions had six trait methods that enumerated tree variants explicitly: root_key_and_tree_type_owned, root_key_and_tree_type, tree_flags_and_type, tree_type, maybe_tree_type, tree_feature_type. Each was missing its ProvableSumTree arm; get_feature_type was the only one already wired (Phase 1). Adding the missing arms unblocks callers across get, batch, and visualize that thread tree types through these helpers to decide layout, hashing, and aggregate-data extraction. tree_feature_type now maps ProvableSumTree -> ProvableSummedMerkNode(sum) explicitly, matching the parallel ProvableCountedMerkNode wiring.

COST CALCULATOR: merk/src/element/costs.rs

get_specialized_cost, the layered_value_byte_cost path in specialized_costs_for_key_value, the layered_value_defined_cost type filter, and the value_defined_cost dispatch all enumerated the eight Merk-tree variants and would have either returned None or mis-sized a ProvableSumTree element. Added explicit ProvableSumTree => SUM_TREE_COST_SIZE arms (parity with SumTree as established in Phase 1) and the matching LayeredValueDefinedCost branch.

RECONSTRUCTION: merk/src/element/reconstruct.rs

ElementReconstructExtensions::reconstruct_with_root_key, used by batch propagation to rebuild a tree element after a root-key update, returned None for ProvableSumTree.
Added the arm that pulls aggregate_data.as_sum_i64() into Element::ProvableSumTree(root, sum, flags); without this, batch operations that mutated a ProvableSumTree subtree would lose their tree element entirely during the parent's upward propagation. as_sum_i64 already handles the AggregateData::ProvableSum case (Phase 2).

BATCH PROPAGATION: grovedb/src/batch/mod.rs

The InsertTreeWithRootHash else-if chain that transcribes a Merk-tree mutation into the appropriate root-hash-bearing operation enumerated each tree-element variant explicitly. ProvableSumTree was missing, so a batch that mutated a ProvableSumTree subtree would fall through to the CommitmentTree arm, which is the wrong shape entirely. Mirrored the ProvableCountSumTree arm directly. The accompanying tree-cost match list above (used by the apply_batch storage-cost callback) was also missing the variant.

READ-PATH SUBTREE VALIDATION: grovedb/src/operations/get/mod.rs

check_subtree_exists rejected paths whose final segment resolved to a ProvableSumTree because the variant wasn't in its accepted-tree match list. This would have broken every query that traversed INTO a ProvableSumTree.

TESTS: grovedb/src/tests/provable_sum_tree_tests.rs

Ten tests covering Phase 3's externally-observable surface:

- Round-trip insert/read with aggregate-sum tracking.
- Aggregation across mixed positive/negative/zero values + i64::MIN / i64::MAX extremes.
- Root-hash divergence vs a plain SumTree with identical children (the Phase 2 cornerstone, verified end-to-end via open_transactional_merk_at_path).
- Nested ProvableSumTree[A] -> ProvableSumTree[B] aggregate propagation; mutation of B's children shifts the grovedb root hash.
- Wrapper interactions: NonCounted(ProvableSumTree) contributes 0 to a CountTree parent; NotSummed(ProvableSumTree) contributes 0 to a SumTree parent; the wrapped tree's own aggregate is preserved (verified via get_raw, which retains the wrapper byte).
- Deleting a SumItem child shifts the ProvableSumTree root hash because the aggregate sum is hash-bound.
- Direct insert of a ProvableSumTree built from an existing template.

The wrapper round-trip tests use db.get_raw rather than db.get because db.get strips wrappers via into_underlying, by design.

Workspace cargo test --all-features green: 2891 tests passing (was 2881 in Phase 2 + 10 new), zero failures.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(grovedb): verify_grovedb consistency check for aggregate fields

Phase 4 of the ProvableSumTree feature. The existing `verify_merk_and_submerks_in_transaction` walk is cryptographically complete: `combine_hash(value_hash(parent_bytes), inner_merk_root) == stored_element_value_hash` catches any byte-level tampering, and for ProvableSumTree the inner aggregate is bound into the inner Merk's root_hash via `node_hash_with_sum` (Phase 2).

What that walk did not catch was the *software-consistency* class of drift: a parent `ProvableSumTree(_, N, _)` whose stored sum field N disagrees with the inner Merk's actual `aggregate_data()` value M. For provable variants both N and M are bound into element_value_hash, but they live on disk independently and could disagree if Phase 3's propagation logic drifts. For non-provable variants (SumTree, BigSumTree, CountTree, CountSumTree) the recorded aggregate isn't hash-bound at all, so a pure software bug in propagation would silently corrupt the tree.

LIB.RS: verify_merk_and_submerks_in_transaction

After the existing cryptographic check, for any tree element whose inner Merk holds actual data (i.e. excluding the non-Merk-data trees CommitmentTree/MmrTree/BulkAppendTree/DenseTree, which already short-circuit via `uses_non_merk_data_storage`), the verifier now opens the inner Merk, reads its `aggregate_data()`, and compares against the parent's recorded aggregate field via a new free helper `aggregate_consistency_labels`.
The helper covers all seven aggregate-bearing tree variants:

- SumTree vs. AggregateData::Sum
- ProvableSumTree vs. AggregateData::ProvableSum
- BigSumTree vs. AggregateData::BigSum
- CountTree vs. AggregateData::Count
- CountSumTree vs. AggregateData::CountAndSum
- ProvableCountTree vs. AggregateData::ProvableCount
- ProvableCountSumTree vs. AggregateData::ProvableCountAndSum

Plus an empty-Merk identity case (NoAggregateData with zero recorded aggregate matches), and a fallback that reports any variant-shape mismatch (e.g. ProvableSumTree paired with AggregateData::Count(_)).

VERIFICATIONISSUES SHAPE: placeholder hashes, not type extension

`VerificationIssues` is a private type alias `HashMap<Vec<Vec<u8>>, (CryptoHash, CryptoHash, CryptoHash)>` whose shape is consumed by `visualize_verify_grovedb`. To avoid breaking its callers and the visualize hex output, mismatched aggregates are packed into deterministic placeholder CryptoHashes via `blake3(format!("recorded ..."))` and `blake3(format!("inner ..."))`, slotted into the "expected" and "actual" fields. The "root" slot reuses the inner-Merk root_hash for path locality. Documented inline.

INTEGRITY WALK TESTS: 7 new tests

A new `integrity_walk_tests` module in `provable_sum_tree_tests.rs` exercises the verifier end-to-end via two raw-storage tampering helpers:

- `tamper_value_no_hash_update` decodes the on-disk TreeNode for a leaf, replaces only its element bytes, re-encodes (leaving the stored value_hash stale), writes back via the immediate storage context. Simulates byte-level tampering caught by the SumItem arm's `value_hash(bytes) != stored_value_hash` check.
- `tamper_parent_element_with_consistent_hashes` splices in fresh element bytes AND recomputes hash + value_hash to remain crypto-consistent with the inner Merk's existing root_hash. Used for aggregate-mismatch scenarios: the crypto check passes, but the new aggregate-consistency check fires.
Offsets into the on-disk TreeNodeInner encoding are derived from the decoded `value_as_slice().len()`.

Scenarios covered:

1. Inner SumItem value tamper (different bytes): crypto check catches it.
2. Inner SumItem same-length value tamper: crypto check catches it (assert: hashes, not lengths, are what's verified).
3. Parent ProvableSumTree aggregate mismatch (sum=999 stored vs. 40 actual): new aggregate-consistency check fires.
4. Clean ProvableSumTree verifies clean (with mixed positive, negative, zero, and large values).
5. Clean ProvableCountTree verifies clean.
6. Parent ProvableCountTree aggregate mismatch (count=9999 vs. 3): sanity check that the generalized helper handles the count variant too.
7. Reload-after-write determinism: insert, drop the db handle, reopen, verify_grovedb reports zero issues; the parent's ProvableSumTree.sum_value field round-trips.

Workspace cargo test --all-features green: 2898 passing (Phase 3 baseline of 2891 + 7 new), zero failures.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat: AggregateSumOnRange query + proof + verify for ProvableSumTree

Adds the marquee Phase 5 feature for ProvableSumTree: a query that asks "what's the cryptographically-verifiable signed sum of children with keys in range [a, b]?" against a ProvableSumTree, with proof size O(log n + |boundary|) and a verify path that returns the root hash plus the aggregate i64 sum.
Mirrors AggregateCountOnRange line-for-line:

- QueryItem::AggregateSumOnRange(Box<QueryItem>) variant (wire tag 11)
- Query / SizedQuery / PathQuery::validate_aggregate_sum_on_range with the same nested-rejection, no-subquery, no-pagination, allowed-inner-range rules
- merk/src/proofs/query/aggregate_sum.rs (~760 lines) implementing create_aggregate_sum_on_range_proof + verify_aggregate_sum_on_range_proof with the same Disjoint/Contained/Boundary classification, HashWithSum self-verifying compression at fully-inside/outside subtrees, and KVDigestSum at boundaries
- grovedb/src/operations/proof/aggregate_sum.rs (~330 lines) for the GroveDB-level multi-layer envelope chain check
- prove_query / verify_query dispatch in generate.rs and verify.rs
- Tree-type rejection arms in BulkAppendTree, DenseTree, MMR for the new variant

Key correctness points handled differently from count:

- i128 accumulator throughout the verifier (sum can validly be 0 with non-zero children, so no "if sum == 0" short-circuit; final narrow to i64 with an explicit overflow error)
- No checked_sub equivalent for own_sum derivation: signed sums make arithmetic-only corruption detection meaningless; the hash chain binds the values regardless
- ProvableSumTree-only at the merk-level gate (Sum/BigSum use different hash dispatches and can't host this proof shape)

Tests: 35 new tests total (14 merk-level in aggregate_sum.rs, 21 GroveDB-level in aggregate_sum_query_tests.rs) covering empty trees, single-key ranges, full/sub/boundary ranges, negative sums, mixed-sign extremes including i64::MAX + i64::MIN = -1, tampering rejection, wrong-tree rejection, validation rejection of nested/Key/RangeFull/orthogonal-aggregate inners, multi-layer paths, NotSummed-wrapped subtree exclusion, V0 envelope round-trip.

Workspace test count: 2898 → 2938, zero failures.
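The i128-accumulator point above is easy to illustrate in isolation. This is a minimal sketch, assuming a hypothetical `narrow_sum` helper with a `String` error; the real verifier accumulates while walking proof ops and reports its own error type.

```rust
use std::convert::TryFrom;

/// Accumulate signed i64 contributions in i128 and narrow to i64 once at the
/// end. Hypothetical helper for illustration; the error type is a plain
/// String rather than the crate's real Error enum.
pub fn narrow_sum(parts: &[i64]) -> Result<i64, String> {
    let mut acc: i128 = 0;
    for &p in parts {
        // i128 has enough headroom that intermediate sums of i64 values
        // cannot wrap for any slice that fits in memory.
        acc += i128::from(p);
    }
    // No "if acc == 0" shortcut: zero is a legitimate total for non-zero inputs.
    i64::try_from(acc).map_err(|_| format!("aggregate sum {} does not fit in i64", acc))
}
```

For example, `narrow_sum(&[i64::MAX, 1, -1])` succeeds even though the intermediate total exceeds `i64::MAX`, while `narrow_sum(&[i64::MAX, 1])` reports the overflow instead of wrapping.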
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* docs: ProvableSumTree element + AggregateSumOnRange query

Final phase of the ProvableSumTree feature: documentation. Adds:

- `docs/book/src/aggregate-sum-on-range-queries.md`: new dedicated chapter describing the AggregateSumOnRange query, the ProvableSumTree tree type it operates on, why the existing sum trees can't be queried this way, the proof node vocabulary (KVSum / KVHashSum / HashWithSum / KVDigestSum / KVRefValueHashSum at wire tags 0x30..=0x3D), and the signed-sum correctness notes (no zero-sum short-circuit; i128 accumulator with i64 narrowing at the entry points; overflow handling at i64::MAX extremes).
- `docs/book/src/element-system.md`: ProvableSumTree row added to the aggregate-tree table; ProvableSummedMerkNode added to the TreeFeatureType enum block; NonCounted/NotSummed wrapper indices surfaced; explanation of when to choose ProvableSumTree over plain SumTree (sum is part of the protocol invariant vs metadata) and the rationale for the explicit `NotSummedProvableSumTree = 177` slot.
- `docs/book/src/hashing.md`: parallel "Aggregate Hashing for ProvableSumTree" section showing node_hash_with_sum's i64 BE input layout and the wire-vs-hash encoding split.
- `docs/book/src/appendix-a.md`: rows for NonCounted (15), NotSummed (16), and ProvableSumTree (17) added to the discriminant table.
- `docs/book/src/aggregate-sum-queries.md`: disambiguation banner at the top distinguishing the existing sum-budget iterator from the new AggregateSumOnRange query, with a cross-link.
- `docs/book/src/SUMMARY.md`: registers the new chapter.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* test: targeted coverage for ProvableSumTree code paths

Adds focused unit tests for production code added by this PR that the existing test suite happened to exercise indirectly. None of these tests add new behavior; they only pin down branches that codecov flagged as uncovered on the patch.
Covers:

- grovedb-query/src/proofs/encoding.rs: long-value (>= 65536 bytes) round-trips for the four KV-style ProvableSumTree wire variants (KVSum, KVRefValueHashSum in both Push and PushInverted directions; tag bytes 0x31, 0x34, 0x38, 0x3b).
- grovedb-query/src/proofs/mod.rs: Display tests for KVSum, KVHashSum, KVRefValueHashSum, KVDigestSum, and HashWithSum proof nodes.
- grovedb-element/src/element_type.rs: proof_node_type dispatch on ProvableSumTree parents (Items -> KvSum, References -> KvRefValueHashSum), plus as_str / Display for ProvableSumTree, NonCountedProvableSumTree, NotSummedProvableSumTree.
- grovedb-element/tests/element_constructors_helpers.rs: every ProvableSumTree constructor + is_provable_sum_tree / as_provable_sum_tree_value / into_provable_sum_tree_value helpers, including the wrong-element error paths.
- grovedb-element/tests/element_display_and_serialization.rs: extends the all-variants Display test to include ProvableSumTree.
- merk/src/proofs/tree.rs: forged-sum sensitivity for KVDigestSum and KVRefValueHashSum (the latter exercises the full combine(referenced_value_hash, node_value_hash) -> node_hash_with_sum path), aggregate_data() returning ProvableSum for KVSum / HashWithSum, and key() returning the right thing for all five sum node variants.
- merk/src/proofs/query/aggregate_sum.rs: two new regression tests for Merk::prove (regular query) on a ProvableSumTree, asserting that the emitted proof contains KVSum + KVHashSum (and KVDigestSum for an absent-key boundary). These hit the to_kv_sum_node / to_kvhash_sum_node / to_kvdigest_sum_node helpers whose only callers are inside create_proof_internal's ProvableSumTree branches.

All 2956 workspace tests pass (was 2938 before this commit).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(verify): reject KVRefValueHashSum in trunk/branch chunk proofs

The trunk/branch proof extractor was rejecting Node::KVRefValueHash and Node::KVRefValueHashCount as having an opaque value_hash, but the new Phase 2 KVRefValueHashSum variant was missing from the rejection arm. Without this guard, get_key_value_from_node still surfaces (key, value) for KVRefValueHashSum nodes and the verifier would deserialize and insert the value bytes into the elements map. The embedded node_value_hash is opaque (combine_hash of the node_value_hash and the referenced_value_hash) and cannot be recomputed from the value bytes alone, so a forged value could ride along while the per-node merk hash chain still appears valid.

Add KVRefValueHashSum to the rejection arm in extract_elements_and_leaf_keys, alongside KVRefValueHash and KVRefValueHashCount, with a regression test that mirrors the existing KVRefValueHash trunk-proof rejection test.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(query): scan conditional-branch selectors for AggregateSumOnRange

has_aggregate_sum_on_range_anywhere previously walked only branch.subquery for each conditional branch, ignoring the branch selector (the IndexMap key). Selectors are themselves QueryItems and the type system permits an AggregateSumOnRange tag there even though it is not a meaningful conditional matcher. The shape check is meant to be exhaustive (if any ASOR is present "anywhere", the prover must refuse to route through the regular-proof path), so the walker must surface a selector-tagged ASOR too.

Iterate `(selector, branch)` instead of `branches.values()` and short-circuit on `selector.is_aggregate_sum_on_range()` before recursing into `branch.subquery`. Add a regression test that mirrors the existing count walker test and explicitly covers the selector case.
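The selector fix described above amounts to iterating key-value pairs rather than values only. The sketch below uses toy stand-ins for the query types: the real `Query` keys an `IndexMap` by selector and has more fields (including a default subquery branch), while this version uses a `Vec` of pairs and keeps only what the walker needs. All type definitions here are assumptions for illustration.

```rust
/// Toy query item; the real enum has many more range variants.
pub enum QueryItem {
    Key(Vec<u8>),
    AggregateSumOnRange(Box<QueryItem>),
}

impl QueryItem {
    pub fn is_aggregate_sum_on_range(&self) -> bool {
        matches!(self, QueryItem::AggregateSumOnRange(_))
    }
}

/// Toy conditional branch holding an optional nested query.
pub struct Branch {
    pub subquery: Option<Box<Query>>,
}

/// Toy query: top-level items plus (selector, branch) pairs.
pub struct Query {
    pub items: Vec<QueryItem>,
    pub conditional_branches: Vec<(QueryItem, Branch)>,
}

/// True if an AggregateSumOnRange item appears in the items, in any branch
/// selector, or anywhere inside a branch subquery.
pub fn has_aggregate_sum_on_range_anywhere(q: &Query) -> bool {
    if q.items.iter().any(QueryItem::is_aggregate_sum_on_range) {
        return true;
    }
    q.conditional_branches.iter().any(|(selector, branch)| {
        // Check the selector first, then recurse into the subquery.
        selector.is_aggregate_sum_on_range()
            || branch
                .subquery
                .as_deref()
                .map_or(false, has_aggregate_sum_on_range_anywhere)
    })
}
```

A walker that only visited `branch.subquery` would return `false` for a query whose sole ASOR sits in a selector with no subquery; checking the selector before recursing closes that gap.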
The same gap exists for has_aggregate_count_on_range_anywhere but is left as-is here since it predates this PR and changing it would be out of scope.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix: error classification and entry preservation in aggregate-sum paths

Three small correctness improvements flagged by review:

  * grovedb/src/lib.rs verify_merk_and_submerks_in_transaction: when both the cryptographic combined_value_hash check and the aggregate_consistency check fail for the same subtree path, the aggregate-consistency branch's `issues.insert(...)` clobbered the cryptographic mismatch entry. The real merk-hash chain mismatch is the more diagnostic message, so switch to `.entry().or_insert(...)` to preserve the first-inserted entry per path.
  * merk/src/proofs/query/aggregate_sum.rs: the prover walks our *own* in-memory merk. If `aggregate_data()` refuses to surface a `ProvableSum` for a node in a tree we already gated as `ProvableSumTree`, that is local storage/state corruption, not a peer-supplied invalid proof. Reclassify three sites from `Error::InvalidProofError` to `Error::CorruptedData` to match the repo error-handling convention.
  * grovedb/src/operations/proof/generate.rs: the two ASOR call sites forwarded bare `Error::MerkError` for `prove_aggregate_sum_on_range` failures, making them indistinguishable from surrounding proof-generation merk errors. Wrap each with `Error::CorruptedData(format!("prove_aggregate_sum_on_range failed: {}", e))` per the repo convention.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* test: strengthen aggregate-sum regression coverage

Address review feedback on test gaps in the ProvableSumTree / AggregateSumOnRange suite:

  * provable_sum_tree_tests::populated_provable_sum_tree_round_trips: the per-iteration loop fetched the parent psum but only asserted its variant; the running aggregate after each insert was never checked.
    Track an `expected_sum` accumulator and assert `as_provable_sum_tree_value() == expected_sum` after every insert so the running aggregate (7 → 20 → 40) is pinned down, not just the final sum.
  * direct_insert_provable_sum_tree_with_root_key_and_sum: the test was named for a direct-insert path but never actually performed one; it built a populated template tree and inspected its on-disk shape, then stopped. Capture the template's root_key + sum, then run a batch `insert_only_known_to_not_already_exist_op` over a ProvableSumTree element carrying those values at a fresh top-level key (the non-batch insert path forbids non-empty Tree elements with "a tree should be empty at the moment of insertion when not using batches", so the documented direct-insert semantics are only reachable via the batch API). Assert the round-tripped element preserves the captured root_key + sum.
  * element/tree_type.rs get_feature_type_zeros_sum_for_not_summed_in_sum_parents: extend the NotSummed wrapper regression to include `TreeType::ProvableSumTree` so the new Phase 2 sum-bearing parent variant's zero-sum semantics stay pinned alongside SumTree, BigSumTree, CountSumTree, and ProvableCountSumTree.
  * tree/link.rs round_trip_aggregate_data_provable_sum_negative: pin down the new wire tag 7 introduced for `AggregateData::ProvableSum`. Encode a negative-sum reference link (also exercises the signed-i64 varint path), assert tag 7 is present in the bytes, then decode and assert the link's fields round-trip identically.
  * aggregate_sum_query_tests::aggregate_sum_with_subquery_is_rejected_at_validation: previously only fed a dummy proof to the verifier-side validator, leaving the new prover-side gate (`prove_query_non_serialized` short-circuit) unasserted. Build the malformed PathQuery, set up the standard 15-key fixture, and assert `db.prove_query(...)` returns `Err` so a regression in the prover gate can't slip through with this test still passing.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* docs: refresh ProvableSumTree doc comments

The TreeType::ProvableSumTree doc said "Phase 1: behaves identically to SumTree everywhere except in inner_node_type / empty_tree_feature_type. Phase 2 will diverge the hash computation." Phase 2 has shipped: the hash dispatch now goes through node_hash_with_sum and the new proof-node families (KVSum, KVHashSum, KVDigestSum, KVRefValueHashSum, HashWithSum) plus the AggregateSumOnRange query are all in place. Rewrite the comment to describe the current (post-Phase-2) semantics: an i64 sum baked into every node's hash, making sum tampering catchable via proof verification, as the sum-side counterpart of ProvableCountTree.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(verify): require provable tree type at aggregate query terminal layer

Security finding (Codex): the `verify_aggregate_sum_query` and `verify_aggregate_count_query` chain walkers only checked `element.is_any_tree()` for path elements. At the terminal (leaf) layer this is insufficient: if the honest tree at the queried path happens to be an EMPTY Merk-backed tree of any type (NormalTree, SumTree, BigSumTree, CountTree, CountSumTree, ProvableCountTree, ProvableCountSumTree, ProvableSumTree), its stored `value_hash = combine_hash(H(element_bytes), NULL_HASH)`. The merk verifier accepts empty proof bytes as `(NULL_HASH, 0)`, so an attacker can construct a forged proof with:

- layer 0: honest single-key proof of the leaf path key in its parent
- layer 1: empty bytes (forged)

and the chain check passes uniformly. The verifier returns `sum = 0` (or `count = 0`) against the trusted root hash, even though the leaf isn't a Provable{Sum,Count}Tree.
The numeric answer is correct (an empty tree has sum 0 / count 0), so this isn't a value forgery, but it IS a type-confusion soundness gap: a caller that infers "leaf is a ProvableSumTree" from "the aggregate verifier accepted" is deceived. The prover-side gate in `Merk::prove_aggregate_{sum,count}_on_range` already rejects non-provable inputs, but the verifier didn't mirror that invariant.

THE FIX

In `enforce_lower_chain`, add an `is_terminal: bool` parameter. At intermediate depths nothing changes (`is_any_tree()` still suffices, since the GroveDB grove can route through any tree type on the way down). At the terminal depth (passed `is_terminal = true` when `depth + 1 == path_keys.len()`) the verifier now requires:

- aggregate-sum: `matches!(element, Element::ProvableSumTree(..))`
- aggregate-count: `matches!(element, Element::ProvableCountTree(..) | Element::ProvableCountSumTree(..))`

Wrapper variants (NonCounted, NotSummed) are stripped via the existing `into_underlying()` so they continue to work transparently.

TESTS

Three new regression tests that surgically construct the forgery from a real honest single-key envelope and confirm the verifier now rejects:

- `empty_leaf_type_confusion_forgery_rejected` (sum side, empty NormalTree at leaf)
- `empty_provable_count_tree_at_leaf_rejected_for_sum` (sum side, empty ProvableCountTree at leaf; confirms type-specificity)
- `empty_leaf_type_confusion_forgery_rejected` (count side, empty NormalTree at leaf)

The path == 0 case is unaffected: the merk-level hash divergence between `node_hash` and `node_hash_with_sum` / `node_hash_with_count` makes it computationally infeasible to forge a proof that matches the trusted root, so the path-elements check is unnecessary at the root.
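The terminal-layer gate from the fix above can be sketched with a toy element type. Everything here is an assumption for illustration: the real `Element` variants carry root keys, aggregates and flags, the real check lives inside `enforce_lower_chain`, and `AggregateKind` and `check_path_element` are hypothetical names.

```rust
/// Toy element enum with payloads stripped.
#[derive(Debug)]
pub enum Element {
    Item,
    Tree,
    SumTree,
    ProvableCountTree,
    ProvableCountSumTree,
    ProvableSumTree,
}

impl Element {
    pub fn is_any_tree(&self) -> bool {
        !matches!(self, Element::Item)
    }
}

/// Which aggregate query is being verified.
pub enum AggregateKind {
    Sum,
    Count,
}

/// Intermediate layers accept any tree; the terminal layer must be the
/// provable tree type that matches the query kind.
pub fn check_path_element(
    element: &Element,
    is_terminal: bool,
    kind: AggregateKind,
) -> Result<(), String> {
    if !is_terminal {
        return if element.is_any_tree() {
            Ok(())
        } else {
            Err(format!("path element {:?} is not a tree", element))
        };
    }
    let ok = match kind {
        AggregateKind::Sum => matches!(element, Element::ProvableSumTree),
        AggregateKind::Count => matches!(
            element,
            Element::ProvableCountTree | Element::ProvableCountSumTree
        ),
    };
    if ok {
        Ok(())
    } else {
        Err(format!("terminal element {:?} has the wrong tree type", element))
    }
}
```

Under this gate an empty `Tree` or `SumTree` at the leaf is rejected for both query kinds, and an empty `ProvableCountTree` is rejected for a sum query, matching the type-specificity the regression tests pin down.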
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(verify): reject empty-path aggregate-sum/count queries at validation

Codex follow-up + CodeRabbit: the previous fix added a terminal-type gate in `enforce_lower_chain`, but `verify_v0_layer` and `verify_v1_layer` short-circuit to the leaf verifier when `depth == path_keys.len()`. With an empty path (`path == []`) that's true at depth 0, so the type gate is never invoked.

In practice the empty-path case is already protected by hash divergence: the GroveDB root merk is always a `NormalTree` (built with `Element::empty_tree()` by API), so its root_hash uses `node_hash`. An attacker's forged proof of `HashWithSum` / `HashWithCount` ops would reconstruct via `node_hash_with_sum` / `node_hash_with_count`, which are distinct hash functions with no collision. So the caller's root-hash compare catches the forgery cryptographically.

But the defense-in-depth principle says: don't rely on the cryptographic divergence implicitly. Reject up-front, before any proof handling.

PathQuery::validate_aggregate_{sum,count}_on_range now check `self.path.is_empty()` and return a clear InvalidQuery error naming why (root is always NormalTree, no valid Provable* target at root). The check fires at the entry of `verify_aggregate_{sum,count}_query` (which call `validate_*` first thing) and at `prove_query` (the generator also validates the path query before dispatch).

TESTS

- `empty_path_aggregate_sum_rejected_at_validation`
- `empty_path_aggregate_count_rejected_at_validation`

Both pin the rejection at both the PathQuery validator and the verify entrypoint. 2964 workspace tests pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(merk,grovedb): add no-proof query_aggregate_sum entry point

Mirrors PR #662's `query_aggregate_count` for the signed-sum side. Callers that need a sum value but not a proof (e.g.
server handlers answering `prove=false` sum requests) can now bypass proof construction, serialization, and verification entirely. The merk-level walk is `O(log n + |boundary|)` in the number of distinct keys, identical complexity to the prover but without the proof-op allocations or hash recomputations.

The signed-sum arithmetic carries the same `i128` accumulator the prover and verifier use (so adversarial intermediate sums never wrap), and narrows to `i64` at the public entry point. An out-of-i64 result is classified as `Error::CorruptedData` since a real `ProvableSumTree` maintains every aggregate as `i64` at every level.

NEW APIS

- `Merk::sum_aggregate_on_range(&inner_range, grove_version) -> CostResult<i64, Error>` in `merk/src/merk/get.rs`. Checks `tree_type == ProvableSumTree`; rejects any other tree type with `Error::InvalidProofError`. Returns 0 for an empty merk.
- `RefWalker::sum_aggregate_on_range(&inner_range, grove_version)` in `merk/src/proofs/query/aggregate_sum.rs`. Walks the same Contained / Disjoint / Boundary classification path as `create_aggregate_sum_on_range_proof`, but emits no proof ops.
- `GroveDb::query_aggregate_sum(path_query, transaction, grove_version) -> CostResult<i64, Error>` in `grovedb/src/operations/get/query.rs`. Validates the PathQuery up-front via `validate_aggregate_sum_on_range` (same gate the prover and verifier use; catches malformed ASOR queries plus the empty-path rejection from the prior commit before any storage reads), opens the leaf merk at `path_query.path`, and delegates to the merk-level walk.
- New `query_aggregate_sum_on_range` field on `GroveDBOperationsQueryVersions`, wired through v1/v2/v3 at version `0`.

NotSummed-correctness is preserved via the same `own_sum = node_sum - left_struct - right_struct` derivation the prover uses. NotSummed-wrapped subtrees have stored aggregate 0, so the subtraction yields 0 at the wrapper boundary - they do not contribute to the in-range total.
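The no-proof walk and the `own_sum` subtraction can be sketched on a toy binary search tree. This is a simplified model under stated assumptions: keys are `u32` instead of byte strings, the range is inclusive on both ends, nodes live in memory rather than behind a `RefWalker`, and `Node`, `node`, and `sum_on_range` are hypothetical names.

```rust
/// Toy node: `agg` is the stored aggregate sum of the whole subtree.
pub struct Node {
    pub key: u32,
    pub agg: i64,
    pub left: Option<Box<Node>>,
    pub right: Option<Box<Node>>,
}

/// Build a node from its own contribution; the aggregate is derived.
pub fn node(
    key: u32,
    own: i64,
    left: Option<Box<Node>>,
    right: Option<Box<Node>>,
) -> Option<Box<Node>> {
    let agg = own + left.as_ref().map_or(0, |c| c.agg) + right.as_ref().map_or(0, |c| c.agg);
    Some(Box::new(Node { key, agg, left, right }))
}

fn agg_of(n: &Option<Box<Node>>) -> i128 {
    n.as_ref().map_or(0, |c| i128::from(c.agg))
}

/// Sum of own contributions for keys in [lo, hi]. `min`/`max` bound the keys
/// that can occur in this subtree (inherited from ancestors).
pub fn sum_on_range(node: &Option<Box<Node>>, lo: u32, hi: u32, min: u32, max: u32) -> i128 {
    let n = match node {
        Some(n) => n,
        None => return 0,
    };
    if max < lo || min > hi {
        return 0; // Disjoint: nothing in this subtree is in range.
    }
    if lo <= min && max <= hi {
        return i128::from(n.agg); // Contained: stored aggregate is the answer.
    }
    // Boundary: derive this node's own contribution, then recurse.
    let own = i128::from(n.agg) - agg_of(&n.left) - agg_of(&n.right);
    let mut total = if lo <= n.key && n.key <= hi { own } else { 0 };
    if n.key > min {
        total += sum_on_range(&n.left, lo, hi, min, n.key - 1);
    }
    if n.key < max {
        total += sum_on_range(&n.right, lo, hi, n.key + 1, max);
    }
    total
}
```

A node whose own contribution is 0 (the NotSummed-wrapped case in the commit text) adds nothing at a boundary and nothing to its ancestors' stored aggregates, so it drops out of every range total without special handling.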
The returned sum is **not** independently verifiable: callers are trusting their own merk read path. For a verifiable sum, continue using `prove_query` + `verify_aggregate_sum_query`. Documented explicitly on both entry points.

TESTS

- 10 new merk-level cross-checks (`merk/src/proofs/query/aggregate_sum.rs::tests`): each range variant against `prove_aggregate_sum_on_range`'s computed sum, plus empty-merk-returns-0, NormalTree rejection, ProvableCountTree rejection (precise tree-type match, not "any provable aggregate tree"), and a mixed-positive/negative scenario that exercises the signed `own_sum` subtraction.
- 11 new GroveDB-level cross-checks (`grovedb/src/tests/aggregate_sum_query_tests.rs::tests`): every range shape on a populated `ProvableSumTree`, empty subtree returns 0, negative-sum scenario, invalid-inner-range (`Key`) rejected with `InvalidQuery`, empty-path rejected with `InvalidQuery`, NormalTree leaf rejected with `MerkError` from the merk-level gate.

Workspace `cargo test --all-features`: 2985 passing / 0 failing (was 2964 / 0).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* test(query_item): mirror AggregateCountOnRange tests for AggregateSumOnRange

Adds parallel coverage for the variant-11 dispatch paths in encode/decode_with_depth/borrow_decode_with_depth/Display/serde, plus helper accessors (lower_bound, upper_bound, is_aggregate_sum_on_range, aggregate_sum_inner, ...). Mirrors the existing AggregateCountOnRange test set so each match arm has direct exercise.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* test(element/helpers): per-variant flag-accessor coverage

Adds direct round-trip tests of get_flags_owned / get_flags_mut / set_flags on every aggregate-bearing variant (CountSumTree, ProvableCountTree, ProvableCountSumTree, ProvableSumTree, ItemWithSumItem, CommitmentTree, MmrTree, BulkAppendTree, DenseAppendOnlyFixedSizeTree, BigSumTree, CountTree) plus NotSummed / NonCounted delegation arms.
Each test pins one match arm.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* test(grovedb): unit coverage for aggregate_consistency_labels

Direct unit tests for the previously untestable internal helper. Each aggregate-bearing tree variant now has both a matching (returns None) and mismatching (returns Some) case, plus tests for the empty-merk identity arms (zero-recorded with NoAggregateData), non-Merk data tree arms (always None), and the catch-all variant/shape mismatch.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* test(grovedb/proof): verifier error-path coverage for aggregate-sum

Adds 10 mutation-style tests for the GroveDB-side aggregate-sum verifier (verify_v0_layer / verify_v1_layer / verify_sum_leaf / verify_single_key_layer_proof_v0 / enforce_lower_chain). Each test pins one previously-uncovered error arm:

- V1 unexpected non-merk leaf bytes
- V0 and V1 missing lower_layer for path key
- Malformed leaf sum proof (Phase 1 rejection)
- Corrupted non-leaf merk bytes (single-key proof failure)
- Non-leaf proof without target key
- KV replaced by KVDigest in non-leaf (no value bytes)
- Undeserializable value bytes on the path
- Intermediate-tree non-tree element rejection
- Unparsable envelope bincode decode error

Plus mirrors of the count-side AggregateSumOnRange rejection tests in proof/generate.rs for dense/MMR/BulkAppend index helpers.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* test(merk/aggregate_sum): unit coverage for helpers and edge cases

Direct unit tests for previously-uncovered internal helpers in merk/src/proofs/query/aggregate_sum.rs:

- provable_sum_from_aggregate Err arm for every non-ProvableSum AggregateData variant (CorruptedData classification check)
- provable_sum_from_aggregate happy path including i64::MIN/MAX
- is_provable_sum_bearing false for every non-ProvableSumTree TreeType variant
- classify_subtree additional disjoint-above / contained-within / boundary-overlapping-upper cases
- key_strictly_inside unbounded endpoint and equality cases
- empty ProvableSumTree prove+verify round-trip

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* test(non-merk trees): aggregate-sum/count rejection in index helpers

Adds parallel-variant rejection tests in the BulkAppendTree and dense fixed-size Merkle tree proof modules. Both tree types have no count or sum commitment in their node hash, so their index-resolution helpers reject AggregateCountOnRange and AggregateSumOnRange query items outright. This exercises the previously-uncovered rejection arms in both proof modules.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* test: per-variant tree_type extensions + sum-proof Display arms

Two small targeted coverage additions:

- merk/src/element/tree_type.rs: direct per-variant tests for the ProvableSumTree / CommitmentTree / BulkAppendTree / DenseAppendOnlyFixedSizeTree / MmrTree arms of root_key_and_tree_type, tree_flags_and_type, tree_type, maybe_tree_type, and tree_feature_type, plus a ProvableSumTree-through-NotSummed delegation test.
- grovedb/src/tests/aggregate_sum_query_tests.rs: tests that drive node_to_string's KVSum / KVHashSum / KVDigestSum / HashWithSum / KVRefValueHashSum Display arms in grovedb/src/operations/proof/mod.rs by formatting real ProvableSumTree proofs.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* nit(coderabbit): doc-comment + test + error-wrap polish

Four low-value but clean tweaks from CodeRabbit on PR #661:

- `grovedb-query/src/query_item/mod.rs`: refresh the stale `NonAggregateInner::deserialize` inline comment to mention both excluded aggregate variants (Count + Sum), matching the struct-level doc and `NON_AGGREGATE_VARIANTS`.
- `grovedb/src/tests/aggregate_sum_query_tests.rs`: drop the redundant disjunction `msg.contains("must be a ProvableSumTree") || msg.contains("ProvableSumTree")`. The first clause already implies the second; pin the exact phrase.
- `grovedb/src/tests/aggregate_sum_query_tests.rs`: harden `provable_sum_tree_overflow_at_i64_max_is_rejected` so it no longer silently passes when insert AND prove AND verify all accept an overflow. Replace the early-return-on-both-inserts-accepted with an explicit "at least one stage must reject" assertion.
- `grovedb/src/operations/get/query.rs`: wrap the MerkError from `Merk::sum_aggregate_on_range` (and the count sibling) with contextual `CorruptedData(format!("query_aggregate_{sum,count} at path {:?}: {}", path_slices, e))` per the repo error-wrapping convention. Two test assertions updated from `MerkError(_)` to `CorruptedData(_)` to match.

Workspace `cargo test --all-features`: 3102 pass / 0 fail (unchanged).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* docs: clarify appendix-a Element/TreeType discriminant columns

CodeRabbit flagged the TreeType column for ProvableSumTree as "should be 17". The technical claim was wrong (11 is the correct TreeType discriminant per `merk/src/tree_type/mod.rs:77`), but the confusion was fair: the column header "TreeType" was ambiguous and the table had several pre-existing inaccuracies in adjacent rows. This commit fixes the ambiguity AND the bugs.
Changes:

- Rename column header from "TreeType" to "TreeType disc" and add an intro paragraph explaining that "Element disc" and "TreeType disc" are discriminants of two SEPARATE enums.
- Add the TreeType-variant label to every tree row for consistency (some had it, most didn't). The new format is `N (VariantName)`, e.g. `5 (ProvableCountTree)`, which CodeRabbit-style auto-review can't misread.
- Fix three pre-existing wrong TreeType disc values:
  - `BigSumTree`: 4 -> 2
  - `CountTree`: 2 -> 3
  - `CountSumTree`: 3 -> 4

  (These were drift from the actual `TreeType::discriminant()` implementation; the file had `4 (BigSumTree)` etc. but those labels were wrong.)
- Swap the row order at Element discriminants 8 and 9 to match the actual `Element` enum order:
  - 8 = `ProvableCountTree` (was incorrectly listed as `ItemWithSumItem`)
  - 9 = `ItemWithSumItem` (was incorrectly listed as `ProvableCountTree`)
- Tighten the `ProvableCountSumTree` Purpose blurb to note "only count in hash" since the sum is tracked metadata, not bound. This is the half-step variant a future `ProvableCountAndSumTree` would replace.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(verify): accept Sum-family boundary nodes in range bound checks

Codex security finding: the regular query verifier's lower- and upper-bound `last_push` match arms in `merk/src/proofs/query/verify.rs::execute_proof` (lines 206 / 241) accept the Count-family boundary node variants (`KVCount`, `KVDigestCount`, `KVRefValueHashCount`) but omit the parallel Sum family (`KVSum`, `KVDigestSum`, `KVRefValueHashSum`). Regular proofs against a `ProvableSumTree` can legitimately emit `KVDigestSum` as the absence-boundary node for a queried key, so a multi-item query like `Key("aa")` followed by `Range("g".."j")` would reject the perfectly valid proof with `Cannot verify lower bound of queried range` whenever the preceding boundary happened to be sum-flavored.
The downstream absence check at line ~572 already handled all six node types (Count + Sum), making the omission an asymmetry between the two checks within the same function.

THE FIX

Add `KVSum`, `KVDigestSum`, `KVRefValueHashSum` to both the lower- and upper-bound `last_push` match arms. While at it, also extend `boundaries_in_proof` (line ~742) to surface `KVDigestSum` boundary keys alongside `KVDigest` and `KVDigestCount`: same class of omission, same trivial extension.

TESTS

New `provable_sum_tree_bound_regression_tests` module at the bottom of `verify.rs` covering:

- `key_plus_range_on_provable_sum_tree_left_to_right_verifies`: the exact `[Key("aa"), Range("g".."j")]` shape Codex flagged, in forward iteration. Without the fix this returns `InvalidProofError("Cannot verify lower bound of queried range")`.
- `key_plus_range_on_provable_sum_tree_right_to_left_verifies`: same query with `left_to_right = false`. The bug is symmetric, so the regression coverage is too.
- `kv_digest_sum_appears_in_boundaries_in_proof`: proves that `boundaries_in_proof` now surfaces `KVDigestSum`-flavor boundary keys produced by `ProvableSumTree` proofs.

Workspace `cargo test --all-features`: 3150 pass / 0 fail (was 3147 / 0).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(query): split aggregate validator error label per count/sum variant

Develop landed a regression test (`query_validation_error_to_static_str_projects_invalid_operation_and_catches_other_variants`, commit 7a649386) that pins the catch-all fallback string returned by `query_validation_error_to_static_str` to `"AggregateCountOnRange query validation failed"`. This PR had generalised the helper to serve both count and sum, returning `"aggregate query validation failed"`, which broke the develop test under GitHub's "merge into base" CI workflow.
Split the helper into two so each aggregate variant's error surface stays self-describing:

- `query_validation_error_to_static_str`: count side, restored to the `"AggregateCountOnRange query validation failed"` label so develop's regression test stays green.
- `sum_query_validation_error_to_static_str`: new sum-side helper returning `"AggregateSumOnRange query validation failed"`. Used by `SizedQuery::validate_aggregate_sum_on_range`.

Both follow the same projection contract: `InvalidOperation(msg)` passes the static string through unchanged; any other variant (unreachable from real validators) gets the variant-specific fallback. No behavior change at the InvalidOperation happy path, which is all real callers reach.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(verify): key_exists_as_boundary_in_proof must accept KVDigestSum

CodeRabbit symmetry finding on top of commit 5b3a0460: that fix extended `boundaries_in_proof` to recognize `KVDigestSum` boundary nodes from `ProvableSumTree` proofs, but missed the parallel helper `key_exists_as_boundary_in_proof`. The two public helpers are documented to behave identically; without this both helpers disagreed on valid `ProvableSumTree` absence proofs.

Add `Op::Push(Node::KVDigestSum(..))` and the PushInverted variant to the match in `key_exists_as_boundary_in_proof`. Tighten the doc-comment to spell out that the two helpers share node-type coverage.

Extended the regression test `kv_digest_sum_appears_in_boundaries_in_proof` (now renamed to `kv_digest_sum_appears_in_both_boundary_helpers`) so every boundary key surfaced by `boundaries_in_proof` is also reported by `key_exists_as_boundary_in_proof`, pinning the symmetry.

Workspace `cargo test --all-features` for the affected module: 3 of 3 regression tests pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* chore(comments): strip implementation-phase labels from ProvableSumTree code

Remove "Phase 1 / Phase 2 / etc." prefixes that referred to the PR's implementation timeline. Retains "Phase 1 / Phase 2" labels that describe runtime decode-vs-walk algorithm steps of the aggregate-count/sum verifiers (documented in docs/book/src/aggregate-count-queries.md). In the test-fixture stack-builder (merk/src/proofs/tree.rs) and the provable_sum_tree direct-insert test, renamed enumeration-style "Phase N" labels to "Step N" for clarity. Also renamed the phase2_* test fn prefixes in encoding.rs and tree.rs to drop the timeline label.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(serde): QueryItem Serialize emits snake_case variant tags

CodeRabbit re-flagged a pre-existing asymmetry in `QueryItem`'s manual serde implementations: the `Serialize` impl emitted PascalCase variant tags via `serialize_newtype_variant` (e.g. `"AggregateSumOnRange"`, `"Range"`), but the `Deserialize` impl uses `Field` enums marked `#[serde(field_identifier, rename_all = "snake_case")]`, expecting snake_case (`"aggregate_sum_on_range"`, `"range"`).

The asymmetry was invisible to bincode (the format GroveDB actually uses in proofs and storage) because bincode identifies variants by index, not by the textual tag. But it broke round-trip through every text-based format that carries variant names verbatim (JSON, YAML, TOML). An in-code comment at the existing token-stream test site even documented the issue as "a pre-existing mismatch ... that breaks JSON round-trip but is invisible to formats that don't carry variant names textually."

THE FIX

Change every `serialize_newtype_variant` (and the one `serialize_unit_variant` for `RangeFull`) call in `grovedb-query/src/query_item/mod.rs` to emit snake_case variant tags. The variant indices stay the same so the bincode wire format is unchanged; only textual formats see the new tag names.
Affected variants: `Key`, `Range`, `RangeInclusive`, `RangeFull`, `RangeFrom`, `RangeTo`, `RangeToInclusive`, `RangeAfter`, `RangeAfterTo`, `RangeAfterToInclusive`, `AggregateCountOnRange`, `AggregateSumOnRange`, i.e. every variant, not just the aggregate ones. This is the symmetric fix; doing only the new `AggregateSumOnRange` variant would have diverged it from the existing `AggregateCountOnRange` (and from the other ten variants that have always been broken the same way).

Also updated the in-code comment at the token-stream test site to reflect the new contract.

TESTS

Two new `serde_test::assert_tokens` round-trip regression tests pin the snake_case contract on both aggregate variants:

- serde_round_trip_aggregate_sum_on_range_uses_snake_case_tag
- serde_round_trip_aggregate_count_on_range_uses_snake_case_tag

assert_tokens exercises both Serialize AND Deserialize against the same token stream, so any future regression on either side fails the test immediately.

Workspace cargo test --all-features: 3152 pass / 0 fail (was 3150 / 0).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* refactor(aggregate-sum): mirror PR #663 (split into subdir, reject V0 envelopes)

V0 (`MerkOnlyLayerProof`) envelopes predate the aggregate-sum feature and cannot legitimately carry an aggregate-sum proof, so the V0 layer walker was unreachable in any honestly-produced proof. Mirror the count-side changes from PR #663:

- Convert `aggregate_sum.rs` into a subdirectory mirroring `aggregate_count/`: `mod.rs` (public API + V0 rejection), `helpers.rs` (envelope decode, single-key layer verification, chain enforcement, leaf sum verification), `leaf_chain.rs` (V1 leaf-chain walker). Removes the dead `verify_v0_layer` path.
- Add a prover-side V0 gate in `prove_query_non_serialized`: when grove_version dispatches to V0 and the path query carries an `AggregateSumOnRange` anywhere, return `NotSupported` instead of emitting a V0 envelope the verifier would (correctly) reject.
- Update tests: replace the V0 round-trip test with a V0-rejection test; broaden the empty-leaf type-confusion test to accept either the V0-rejection or the terminal-type-gate error; remove the now-unreachable V0 missing-lower-layer test (V1 counterpart already pins the missing-layer behavior); refresh stale doc-comments that pointed at `aggregate_sum.rs` line numbers.

No carrier-shape support yet: `AggregateSumOnRange` is still leaf-only on the merk side, so there is no `classification.rs` / `per_key.rs` mirror. The leaf-only shape can be extended later in parallel with matching merk-level work, just like the count side did.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* harden(aggregate-sum): strict lower_layers shape + V1 terminal-type test

Addresses CodeRabbit findings on PR #661.

**Finding 1 (Major, fixed)**: `verify_v1_leaf_chain` accepted arbitrary `lower_layers` entries at non-leaf depths and under the leaf merk. That let two byte-distinct envelopes verify to the same `(root, sum)` because the smuggled siblings/children were never inspected, and gave downstream syntactic scanners an unverified surface to read from. Added a strict shape gate that mirrors the honest prover (`prove_aggregate_sum_on_range` short-circuit always emits `lower_layers: BTreeMap::new()` at the leaf, and each path-prefix wrapper inserts exactly one entry):

- At `depth == path_keys.len()` (leaf merk): require `lower_layers.is_empty()`.
- At non-leaf depths: require `lower_layers.len() == 1` and the sole key to equal the expected descent key `path_keys[depth]`.

Two new regression tests (`sum_v1_envelope_with_extra_lower_layer_*` and `sum_v1_envelope_with_lower_layers_under_leaf_*`) construct byte-modified envelopes that the gate rejects.

**Finding 4 (nitpick, addressed)**: the existing empty-leaf type-confusion tests build V0 envelopes and now hit the V0-rejection gate before the terminal-type gate in `enforce_lower_chain` runs. Added `empty_leaf_type_confusion_forgery_rejected_under_v1_envelope` which builds the same forgery under a V1 `LayerProof` envelope so the terminal-type gate fires directly. The test asserts the specific "must be a ProvableSumTree" error from `enforce_lower_chain` so future refactors that drop the gate are caught.

**Finding 2 (Major, skipped with reason)**: CodeRabbit asks to make `hash_for_link` fail-closed (panic/Err) when a `ProvableSumTree` node's `aggregate_data()` doesn't return `ProvableSum`. The current fallback to `self.hash()` is identical across all three `Provable*` variants (`ProvableCountTree`, `ProvableCountSumTree`, `ProvableSumTree`) and also appears in commit-time dispatch (lines 1233-1304). Fixing only the sum arm creates asymmetry with the count side; the broader refactor (plus the matching dispatch-centralization in Finding 3) is out of scope for this sum-feature PR and is documented in MEMORY M1 as intentional.

**Finding 3 (nitpick, skipped with reason)**: the duplicated `AggregateData → hash` dispatch in `merk/src/tree/mod.rs` predates this PR and applies to all three `Provable*` variants. Centralizing it would touch hot proof-emission paths; out of scope here.

Also updated `sum_v1_envelope_with_missing_lower_layer_is_rejected` to accept the new strict-shape error message: removing an entry now trips the shape gate first instead of the older missing-layer arm. Both messages pin the same property.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* test(aggregate-sum): split strict-shape gate + cover trailing-bytes / wrong-key

After splitting the strict-shape gate in `verify_v1_leaf_chain` into two reachable error arms (entry-count vs. entry-key), the ok_or_else-arm on the subsequent `lower_layers.get(&next_key)` became naturally reachable for the wrong-key case (single entry but under an unexpected key). Previously the combined-OR gate made both error sites mutually exclusive, leaving the get-arm dead.
Coverage impact (aggregate_sum/ subdir, local tarpaulin):

- leaf_chain.rs: 36/41 (88%) -> 44/44 (100%)
- Subdir total: ~80% -> 89.94%

New tests in aggregate_sum_query_tests:

- sum_v1_envelope_with_wrong_keyed_lower_layer_is_rejected: single `lower_layers` entry under "impostor" instead of "st". Exercises the key-shape gate distinctly from the count-shape gate.
- sum_proof_with_trailing_bytes_is_rejected: mirror of `aggregate_count_proof_with_trailing_bytes_is_rejected`. Pins the canonical-decode invariant in `decode_grovedb_proof`.

Tightened assertion in `sum_v1_envelope_with_extra_lower_layer_is_rejected` to match the new "lower-layer entries at depth N" error string, and broadened the assertion in `sum_v1_envelope_with_missing_lower_layer_is_rejected` to accept either the old missing-layer message or the new entry-count message (removing the only entry trips the count gate first).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* refactor(proof): hoist canonical proof decoder to operations/proof/mod.rs

The same `decode_grovedb_proof` function lived in both `aggregate_count/helpers.rs` and `aggregate_sum/helpers.rs`, differing only in whether the error string named "aggregate-count" or "aggregate-sum" as the offending shape. Two copies of the same canonical-decode contract is a maintenance hazard: drift between the two would mean one aggregate path could (e.g.) accept trailing bytes that the other rejects, breaking the equality-by-bytes assumption the contract is meant to guarantee.

Hoist the function to `operations/proof/mod.rs` as `decode_grovedb_proof_canonical` with `pub(super)` visibility. Both helper modules now call `super::decode_grovedb_proof_canonical`.
The error message generalizes from "aggregate-{count,sum} proof has N trailing bytes" to "proof has N trailing bytes" since the call site provides the surrounding context; the existing `*_proof_with_trailing_bytes_is_rejected` tests assert `msg.contains("trailing bytes")` and remain green.

No behavior change beyond the wording adjustment in the trailing-bytes error.

Tests: workspace 3088 / 0 fail; aggregate tests 223 / 0 fail (113 sum + 75 count + 35 others).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* harden(proof): general verify paths now use canonical decoder

Migrate the four `bincode::decode_from_slice(...)?.0` sites in `operations/proof/verify.rs` (`verify_query_with_options`, `verify_query_get_parent_tree_info_with_options`, `verify_query_raw`, `verify_trunk_chunk_proof`) to call `super::decode_grovedb_proof_canonical(proof)?`, the same canonical decoder the aggregate-count and aggregate-sum entry points already use.

Before this change, the general verifier silently accepted trailing bytes after the encoded `GroveDBProof` envelope, because the `.0` discarded the bincode `consumed` count. The aggregate-count and aggregate-sum entry points rejected trailing bytes via their own private decoder. That asymmetry meant the same logical proof could have many distinct byte encodings through the general-verify surface but only one through aggregate-verify: a malleability gap that breaks any equality-by-bytes / caching / dedup assumption a consumer might rely on. The cryptographic chain still bound the answer, so this wasn't a soundness break.

The general-verify path now matches: trailing bytes are rejected with `Error::CorruptedData("proof has N trailing bytes after the encoded envelope")`.

Added `verify_query_rejects_proof_with_trailing_bytes` in `proof_advanced_tests.rs` to lock the new behavior in place, mirror of `sum_proof_with_trailing_bytes_is_rejected` and `aggregate_count_proof_with_trailing_bytes_is_rejected`.
Tests: workspace 3089 / 0 fail (3088 + 1 new regression test). No honest test/proof emitted trailing bytes, so no existing test needed updating.

Co-Authored-…
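The canonical-decode rule described in the last two commits (reject a proof when the decoder leaves bytes unconsumed) can be illustrated with a minimal, std-only sketch. Everything here is an assumption for illustration: `decode_prefix` is a hypothetical stand-in for the bincode call, `DecodeError` stands in for `Error::CorruptedData`, and the toy wire format (one length byte plus payload) is not the GroveDB envelope.

```rust
/// Error type for this sketch only (hypothetical; the real code reports
/// `Error::CorruptedData` with a "trailing bytes" message).
#[derive(Debug, PartialEq)]
enum DecodeError {
    Malformed(&'static str),
    TrailingBytes(usize),
}

/// Hypothetical stand-in for a bincode-style decoder: reads a one-byte
/// length prefix followed by that many payload bytes, and reports how many
/// input bytes it consumed.
fn decode_prefix(bytes: &[u8]) -> Result<(Vec<u8>, usize), DecodeError> {
    let (&len, rest) = bytes
        .split_first()
        .ok_or(DecodeError::Malformed("empty input"))?;
    let len = len as usize;
    if rest.len() < len {
        return Err(DecodeError::Malformed("payload shorter than its length prefix"));
    }
    Ok((rest[..len].to_vec(), 1 + len))
}

/// Canonical decode: accept the value only if the decoder consumed every
/// input byte, so each logical value has exactly one accepted encoding.
fn decode_canonical(bytes: &[u8]) -> Result<Vec<u8>, DecodeError> {
    let (value, consumed) = decode_prefix(bytes)?;
    if consumed != bytes.len() {
        // Discarding `consumed` here is what made the old path malleable.
        return Err(DecodeError::TrailingBytes(bytes.len() - consumed));
    }
    Ok(value)
}
```

Appending any byte to an accepted encoding turns it into a `TrailingBytes` error, which is the property the `*_with_trailing_bytes_is_rejected` tests pin.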

Extend PR #656 (AggregateCountOnRange) and PR #662 (no-proof query_aggregate_count entry point) to CountIndexedTree. The cidx secondary IS a ProvableCountTree, so both merk-level APIs accept it directly; only the cidx wrapper layer was missing.

Three new public APIs on `GroveDb`:

- `count_indexed_count_range_aggregate(path, lo_count, hi_count, ...)`: no-proof variant returning `u64`. Wraps `Merk::count_aggregate_on_range` against the cidx secondary; O(log n + boundary) merk cost regardless of how many entries match. Answers "how many cidx entries have count_value in [lo, hi]?" without surfacing the entries themselves. Use this when callers only need the count: leaderboard size queries, membership-counting endpoints, etc. `lo > hi` returns `Ok(0)`; `(0, u64::MAX)` returns the total entry count.
- `prove_count_indexed_count_range_aggregate(path, lo_count, hi_count, ...)`: proof variant. Reuses the cidx layer-walking + primary-attestation + ancestor-cidx-secondary-attestation shape from the existing top_k/paginated builders, but the secondary half is `secondary_merk.prove_aggregate_count_on_range(...)`, emitting only `HashWithCount` and `KVDigestCount` ops. Proof size is O(log n + boundary).
- `verify_count_indexed_count_range_aggregate(proof, path, lo_count, hi_count)`: verifier. Authenticates echoed (lo, hi) before decoding the merk proof (same defense-in-depth pattern as the other cidx verifiers). Verifies the secondary count proof via `verify_aggregate_count_on_range_proof`, then chains the secondary's root hash through the cidx H1-A composition layer by layer. Returns `CountIndexedAggregateCountResult { root_hash, count }`.

Inner-range construction: secondary keys are `count_be ‖ original_key`. A cidx count in `[lo_count, hi_count]` maps to the secondary key range `[lo_count_be ‖ ∅, (hi_count+1)_be ‖ ∅)` (exclusive upper). `hi_count == u64::MAX` uses `RangeFrom` to avoid the +1 overflow. `lo > hi` is handled in the prover by emitting a proof against the guaranteed-empty range `(hi+1)_be..(hi+1)_be` so the proof still hashes to the real secondary root (an empty proof bytes would give NULL_HASH and break the H1-A chain). The verifier rebuilds the same item shape so the verify call hashes to the same root.

Wire format: new `CountIndexedAggregateCountProof` envelope mirrors the existing cidx envelopes (layer_proofs, primary_root_hash, ancestor_cidx_secondary_root_hashes, secondary_proof) but echoes `lo_count: u64` and `hi_count: u64` instead of the top_k-style `requested_limit` / `descending`. New envelope rather than reusing CountIndexedRangeProof because the secondary_proof bytes have a different shape (AggregateCountOnRange proof vs regular Merk range proof).

Notes:

- Tree-type gating already happens at the merk level (`prove_aggregate_count_on_range` rejects anything that isn't `ProvableCountTree` / `ProvableCountSumTree`). The cidx secondary is always opened as `ProvableCountTree`, so this is satisfied by construction.
- The merk-level `count` returned by `prove_aggregate_count_on_range` is intentionally unused at the cidx prover side: the verifier independently re-derives the count from the proof shape, and that's the only value that's trustworthy. The prover-side count exists only for caller introspection on the merk-level call.

Tests (8 new in `count_indexed_tree_tests.rs`):

- `count_indexed_count_range_aggregate_returns_inrange_count`: five entries, exercising mid-range, full-range, no-match, degenerate `lo>hi`, and single-point `[c, c]` cases.
- `count_indexed_count_range_aggregate_on_empty_cidx_returns_zero`: empty cidx returns 0 cleanly.
- `prove_and_verify_count_indexed_count_range_aggregate_round_trip`: 7 entries; mid-range proof returns the expected count and matches the actual GroveDB root hash; full-range repeats with count = 7.
- `prove_and_verify_count_indexed_count_range_aggregate_no_matches`: out-of-range proof yields count = 0 with a real root hash.
- `verify_count_indexed_count_range_aggregate_rejects_bounds_mismatch`: verifier rejects both lo and hi mismatches; honest call succeeds.
- `prove_count_indexed_count_range_aggregate_at_root_path_errors`: empty path → `InvalidPath`.
- `verify_count_indexed_count_range_aggregate_rejects_corrupted_bytes`: bincode decode failure on garbage → `CorruptedData`.
- `count_indexed_count_range_aggregate_matches_proof_count`: cross-check: for five different ranges, the no-proof variant and the verified proof count agree exactly.

Verified:

- `cargo test --workspace --lib`: 3300+ tests, 0 failures.
- `cargo clippy --workspace --all-features -- -D warnings` clean.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
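The inner-range construction from this commit message can be sketched as a small pure function. This is only an illustration under stated assumptions: `SecondaryRange` and `secondary_range` are hypothetical names, and the real code builds a `QueryItem` rather than this toy enum. The sketch relies on big-endian `u64` bytes sorting in numeric order, and on an 8-byte prefix sorting before any longer key that starts with it.

```rust
/// Hypothetical secondary-key range covering every cidx entry whose count
/// lies in `[lo, hi]`. Keys are assumed to be `count_be || original_key`.
#[derive(Debug, PartialEq)]
enum SecondaryRange {
    /// Half-open `[start, end)` over big-endian count prefixes.
    Range { start: [u8; 8], end: [u8; 8] },
    /// `[start, ..)`, used when `hi == u64::MAX` so `hi + 1` is never computed.
    RangeFrom { start: [u8; 8] },
}

fn secondary_range(lo: u64, hi: u64) -> SecondaryRange {
    if lo > hi {
        // `lo > hi` implies `hi < u64::MAX`, so `hi + 1` cannot overflow.
        // The range `(hi+1)..(hi+1)` is empty by construction.
        let edge = (hi + 1).to_be_bytes();
        return SecondaryRange::Range { start: edge, end: edge };
    }
    match hi.checked_add(1) {
        // Exclusive upper bound: the first prefix strictly above `hi`.
        Some(next) => SecondaryRange::Range {
            start: lo.to_be_bytes(),
            end: next.to_be_bytes(),
        },
        // `hi == u64::MAX`: no representable exclusive upper bound.
        None => SecondaryRange::RangeFrom { start: lo.to_be_bytes() },
    }
}
```

For example, `secondary_range(3, 5)` spans the prefixes for 3 up to (but excluding) 6, so a key such as `5_be || "zz"` still falls inside it, while `secondary_range(0, u64::MAX)` covers the whole secondary tree.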

Removed references that read as stale dev-process noise to a reader encountering the code months later, with no information loss about the code's current behavior or rationale:

**PR # references (8 sites)**

- generate.rs (combined-aggregate gate): "PR #670 / grove v3+ feature" → "grove v3+ feature".
- count_offset_paginated_tests.rs (assert message + 2 docs): "PR #672 closes the P1 finding" / "the P1 finding's root cause" → describe the actual rule (NonCounted inserts into ProvableCountTree rejected).
- provable_count_provable_sum_tree_tests.rs (4 docs): "PR #672 for ProvableCountTree…" → "Same rejection rule that applies to…".
- aggregate_count_query_tests.rs (forgery test branches): drop "added in PR #663" / "added in this PR" qualifiers from documentation of two error-branches.
- not_counted_or_summed_tests.rs: drop "PR #666's contract" qualifier.
- reference_with_sum_item_tests.rs (2 sites): drop "added in PR #667" / "added in this PR" / "PR #667 already covers"; rephrase as factual statements.
- aggregate_sum_query_tests.rs (test-section header): drop "PR #662's no-proof query_aggregate_count" → "Sum-side mirror of the no-proof query_aggregate_count tests."
- query.rs (query_aggregate_sum doc): drop "Mirrors PR #662's".
- non_counted_tests.rs (module header): drop "Codex review of PR #654".

**CodeRabbit references (3 sites)**

- count_offset_paginated_tests.rs (2 sites): drop "(CodeRabbit review on grovedb#669)" parenthetical.
- merk/mod.rs (test doc): drop "Tightened (per CodeRabbit review)".
- merk/proofs/query/aggregate_count/tests.rs (comment): drop "(per CodeRabbit review)".

**Temporal markers: "Before this fix" / "in this PR" framing (4 sites)**

- merk/mod.rs (2 docs): "Before this fix the supports_count match…" → "the support check delegates to is_count_bearing(), so any hand-rolled match here would be a drift-risk regression." / "previously omitted from this manual match" → drop the temporal qualifier.
- aggregate_sum_carrier_query_tests.rs (test doc): "previously blanket — it blocked legitimate root-carrier queries" → describe current shape-aware behavior.
- provable_count_provable_sum_tree_tests.rs (2 sites): "fixed in this PR — without KVRefValueHashCountSum" → "rely on the KVRefValueHashCountSum dispatch arm — without it". "both gained Element::ProvableCountProvableSumTree arms in this PR" → "both carry Element::ProvableCountProvableSumTree arms".

**"before this feature" comments (2 sites)**

- aggregate_count_query_tests.rs + aggregate_sum_carrier_query_tests.rs (per-key symmetry tests): "same proof bytes it did before this feature" → "same proof bytes whether the caller verifies via verify_aggregate_X_query or the per-key entry point."

**"Before this module existed" / "forthcoming" framing (2 sites)**

- aggregate_common.rs: "Before this module existed each axis carried its own private copy" → "These items would otherwise be byte-identical copies across each axis. Centralizing them here…"
- aggregate_count/mod.rs: "The same leaf/carrier shape will apply to forthcoming aggregate variants (sum, average)" → "The same leaf/carrier shape applies to the sum and combined axes — see the sibling … modules."
- grovedb-query/src/aggregate_count.rs: "Forthcoming aggregate variants (sum, average) will live in sibling modules" → "The sum and combined axes live in sibling modules."

**Renamed:** `p1_noncounted_in_provable_count_tree_rejected_at_insert` → `noncounted_in_provable_count_tree_rejected_at_insert` to drop the "P1" audit-finding prefix from the test name itself.

No behavior changes. All tests still pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

QuantumExplorer and others added 2 commits May 12, 2026 03:21

coderabbitai Bot reviewed May 11, 2026


Comment thread merk/src/proofs/query/aggregate_count.rs Outdated

QuantumExplorer merged commit a917d92 into develop May 11, 2026
10 of 11 checks passed

QuantumExplorer deleted the claude/determined-edison-b2dd07 branch May 11, 2026 20:40

coderabbitai Bot mentioned this pull request May 20, 2026

feat(grovedb,merk): no-prove query_aggregate_count_and_sum accumulator #676

Merged

4 tasks

feat(merk,grovedb): add no-proof query_aggregate_count entry point #662

QuantumExplorer merged 5 commits into develop from claude/determined-edison-b2dd07
