Skip to content

feat(grovedb,element): add Element::ReferenceWithSumItem variant + RefreshReferenceWithSumItem batch op#667

Merged
QuantumExplorer merged 22 commits into
developfrom
claude/practical-kalam-bc19ed
May 17, 2026
Merged

feat(grovedb,element): add Element::ReferenceWithSumItem variant + RefreshReferenceWithSumItem batch op#667
QuantumExplorer merged 22 commits into
developfrom
claude/practical-kalam-bc19ed

Conversation

@QuantumExplorer
Copy link
Copy Markdown
Member

@QuantumExplorer QuantumExplorer commented May 16, 2026

Issue being fixed or feature implemented

Adds Element::ReferenceWithSumItem — a hybrid element variant that resolves like an Element::Reference AND contributes an explicit SumValue to a sum-bearing parent tree like Element::SumItem / Element::ItemWithSumItem.

The sum value is independent of whatever the reference resolves to — it is the caller-supplied weight associated with the link itself. The motivating use case is ranked / sortable index entries where the key encodes the rank, the reference points to a canonical record elsewhere, and the sum is the entry's monetary weight that aggregates into the parent's total. Today this requires storing the sum redundantly in the target record; with this variant, the link cell carries it natively.

A first-class batch op GroveOp::RefreshReferenceWithSumItem is added alongside, so refreshing the link and the sum is a single atomic, deterministic operation.

What was done?

Element::ReferenceWithSumItem

Layout:

ReferenceWithSumItem(
    ReferencePathType,
    MaxReferenceHop,
    SumValue,           // independent of target
    Option<ElementFlags>,
)
Property Reference ItemWithSumItem ReferenceWithSumItem
Resolves on get() yes (chain) no yes (chain)
Cycle / hop check (MAX_REFERENCE_HOPS = 10) yes n/a yes
is_reference() true false true
Value-hash shape combined simple combined
Proof node type (normal / ProvableCount parent) KVRefValueHash / KVRefValueHashCount Kv / KvCount KVRefValueHash / KVRefValueHashCount
sum_value_or_default() 0 inner i64 inner i64 (independent of target)
Legal parents any any any (sum dropped silently in non-sum parents)
  • Bincode discriminant: 18 (after the new NotCountedOrSummed wrapper byte at 17 introduced in feat(grovedb,merk,element): add Element::NotCountedOrSummed wrapper for combined sum+count opt-out #666).
  • ElementType::ReferenceWithSumItem = 18, NonCountedReferenceWithSumItem = 146 (= 0x80 | 18).
  • NonCounted wrapper compatibility: full. NotSummed / NotCountedOrSummed: rejected — their whitelists accept only sum-tree variants.
  • Version gate: reuses existing element.serialize / element.deserialize gates (same approach NotSummed and NotCountedOrSummed took).

Wrapper bit-encoding update

The pre-existing is_non_counted upper-nibble compare (disc & 0xf0 == 0x80) assumed base discriminants ≤ 15 so that NonCounted twins fit in 0x80..=0x8F. Base discriminant 18 → twin 0x92 broke that assumption. Switched to a range check (0x80..=0xAF is NonCounted, 0xB0..=0xBF is NotSummed, 0xC0..=0xCF is NotCountedOrSummed) so the next few base-variant additions are safe.

GroveOp::RefreshReferenceWithSumItem

First-class peer to RefreshReference at to_u8 = 17 (next free after InsertNonMerkTree = 16). Carries the same fields plus sum_value. Cross-type refresh (this op against a plain Reference on disk, or RefreshReference against a ReferenceWithSumItem) is rejected at apply time when trust_refresh_reference = false — silent coercion would corrupt parent sum aggregates. Public constructor: QualifiedGroveDbOp::refresh_reference_with_sum_item_op.

Touched files (~22)

  • grovedb-element: element/mod.rs, element_type.rs, element/helpers.rs, element/constructor.rs, element/visualize.rs; tests extended in tests/element_constructors_helpers.rs, tests/element_display_and_serialization.rs.
  • merk: element/get.rs (proof-node arms only — feature-type dispatch flows through helpers automatically).
  • grovedb: lib.rs, reference_path.rs, debugger.rs, operations/insert/mod.rs, operations/get/{mod,query}.rs (4 sites), operations/proof/{generate,verify}.rs (4 sites), batch/{mod,batch_structure}.rs, batch/estimated_costs/{average,worst}_case_costs.rs, element/aggregate_sum_query/mod.rs, tests/mod.rs.
  • New file: grovedb/src/tests/reference_with_sum_item_tests.rs (12 end-to-end tests).

How Has This Been Tested?

cargo test -p grovedb-element -p grovedb-merk -p grovedb (no regressions; final counts: grovedb-element 109, grovedb-merk 432, grovedb 1565, +12 new integration tests).

Discriminant-pinning tests (load-bearing safety net for the on-disk format)

The existing test_element_serialization_discriminants_match_element_type test was extended with a 16th entry (ReferenceWithSumItem at discriminant 18). The companion test_non_counted_wrapper_discriminant_pinned test now asserts that NonCounted(ReferenceWithSumItem(..)) serializes to [15, 18] and resolves to NonCountedReferenceWithSumItem = 146. The test_from_serialized_value test asserts the new variant is rejected as an inner of NotSummed and NotCountedOrSummed wrappers.

Integration tests in grovedb/src/tests/reference_with_sum_item_tests.rs

  • insert_in_sum_tree_aggregates_sum — direct insert in SumTree parent; sum propagates.
  • multiple_refs_with_sum_items_accumulate — three variants with distinct positive/negative sums add up.
  • insert_in_normal_tree_sum_silently_ignored — sum silently dropped in non-sum parents (matches ItemWithSumItem).
  • get_resolves_to_target_item_bytesget() resolves the reference like Reference does.
  • get_raw_returns_reference_with_sum_item_unfollowedget_raw() returns the variant verbatim.
  • chain_through_reference_with_sum_item_resolves_to_terminal_itemRefWithSum → Ref → Item chain exercises the shared follow_reference match arm.
  • non_counted_reference_with_sum_item_in_count_sum_treeNonCounted wrapper zeros count, sum still propagates.
  • batch_insert_reference_with_sum_item_propagates_sum — batch path produces same parent aggregate as direct insert.
  • batch_refresh_reference_with_sum_item_updates_sum_and_path — refresh moves both link and sum atomically.
  • batch_refresh_cross_type_rejected — refreshing a Reference with the new op (or vice versa) errors at apply time.
  • predicates_persist_through_round_tripis_reference() / is_reference_with_sum_item() work after deserialize.
  • new_not_summed_rejects_reference_with_sum_itemNotSummed constructor whitelist rejects the new variant.

Lints

cargo clippy -p grovedb -p grovedb-element -p grovedb-merk --all-targets produces no new warnings touching the new code.

Breaking Changes

On-disk format: this commits a new bincode discriminant (18) and a new batch-op tag (17) to the wire format. Replicas running an older binary cannot deserialize an Element::ReferenceWithSumItem or apply a GroveOp::RefreshReferenceWithSumItem. Following project precedent (NotSummed, NotCountedOrSummed), this is gated only through element.serialize / element.deserialize version checks — no new per-feature GroveVersion field. Coordinate the rollout the same way as other Element additions.

The PR title is non-! because reading old data with new code works (existing variants unchanged); the breaking direction is only old code reading new data.

Open risks / explicitly-deferred items

  1. grovedbg-types wire schema — first cut renders the new variant as a plain Reference in the debugger UI, dropping the sum (TODO at the site). Follow-up should add a ReferenceWithSumItem { reference, sum_item_value, element_flags } arm.
  2. Cost-estimate regressions — the new variant and new batch op are ~8 bytes heavier than their Reference counterparts. Downstream fee tables should be audited.
  3. V0/V1 proof versions — both reference proof arms in generate.rs / verify.rs are extended. Existing reference proof tests continue to pass.

Checklist:

  • I have performed a self-review of my own code
  • I have commented my code, particularly in hard-to-understand areas
  • I have added or updated relevant unit/integration/functional/e2e tests
  • I have made corresponding changes to the documentation

🤖 Generated with Claude Code

Summary by CodeRabbit

  • New Features

    • Database operations now support references that carry explicit sum values alongside their target paths, with support across get, insert, batch refresh, and query operations.
    • Cost estimation now accounts for element wrapper overhead in batch operations.
  • Tests

    • Added extensive test coverage for references with sum values, including multi-hop resolution, aggregate propagation, batch refresh semantics, and query operations.

Review Change Stack

QuantumExplorer and others added 2 commits May 16, 2026 22:35
…freshReferenceWithSumItem batch op

Adds a new element variant that fuses Reference and SumItem semantics: a row
that resolves like Element::Reference on get() (hop-limited, cycle-detected,
combined value hash) AND contributes an explicit SumValue to a sum-bearing
parent like SumItem / ItemWithSumItem. The sum is independent of the resolved
target's value — it is a caller-supplied weight associated with the link
itself, e.g. for ranked / sortable index entries where the key encodes the
rank, the reference points to a canonical record elsewhere, and the sum is
the entry's monetary weight that aggregates into the parent's total.

Element variant signature:

    ReferenceWithSumItem(ReferencePathType, MaxReferenceHop, SumValue,
                         Option<ElementFlags>)

- Bincode discriminant: 17 (appended after NotSummed = 16).
- ElementType::ReferenceWithSumItem = 17, NonCountedReferenceWithSumItem = 145
  (= 0x80 | 17, NonCounted twin).
- Permitted in any parent tree type — the sum simply doesn't propagate in
  non-sum parents (same rule ItemWithSumItem follows).
- NonCounted wrapper compatibility: full. NotSummed: rejected — its whitelist
  only accepts sum-tree variants.
- Version gate: reuses existing element.serialize / element.deserialize gates
  (same approach NotSummed took).

Wrapper bit-encoding fix:
The existing scheme assumed base discriminants ≤ 15 so that NonCounted twins
fit in the 0x80..=0x8F upper nibble. Base discriminant 17 → twin 0x91 broke
that assumption. Switched is_non_counted from an upper-nibble compare to a
range check (0x80..=0xAF = NonCounted, 0xB0..=0xBF = NotSummed) so future
base variants up through 0x2F work without further changes. NotSummed bases
still must fit in the low nibble.

Batch op GroveOp::RefreshReferenceWithSumItem (to_u8 = 17):
First-class peer to RefreshReference. Carries the same fields plus sum_value,
so refresh becomes deterministic: the on-disk variant and the parent's sum
aggregate stay in sync without re-reading the previous element. Cross-type
refresh (this op against a Reference on disk, or RefreshReference against a
ReferenceWithSumItem) is rejected at apply time when trust_refresh_reference
is false — silent coercion would corrupt parent aggregates.

Public API: QualifiedGroveDbOp::refresh_reference_with_sum_item_op.

Tests:
- Discriminant-pinning tests updated (test_cases.len bumped from 15 to 16)
  and continue to pass — load-bearing safety net for the on-disk format.
- 12 new end-to-end integration tests in
  grovedb/src/tests/reference_with_sum_item_tests.rs covering: insert into
  SumTree with sum aggregation, insertion into non-sum parents (sum dropped),
  get() resolution to terminal item, get_raw() preserving the variant
  verbatim, multi-hop chain (RefWithSum → Ref → Item), NonCounted-wrapped
  variant in CountSumTree, batch insert with sum propagation, batch refresh
  updating both link and sum, cross-type refresh rejection, predicates
  through round-trip, NotSummed rejection, multiple-link accumulation.
- Test totals (no regressions): grovedb-element 90 (was 81), grovedb-merk
  429 (unchanged), grovedb lib 1559 (was 1544).

Deferred (documented as open risks):
- grovedbg-types wire schema: the variant renders as a plain Reference in
  the debugger UI; the sum is dropped from the wire format with a TODO.
- Cost-estimate regressions in downstream fee tables (the new variant +
  batch op are ~8 bytes heavier than their Reference counterparts).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Resolves conflicts with the NotCountedOrSummed wrapper (PR #666). Both
features add new Element variants; the merge keeps both:
- NotCountedOrSummed wrapper byte at discriminant 17 (from develop)
- ReferenceWithSumItem base type moved from discriminant 17 to 18

Updates to my variant for the new discriminant slot:
- ElementType::ReferenceWithSumItem = 18 (was 17)
- ElementType::NonCountedReferenceWithSumItem = 146 (was 145, = 0x80 | 18)
- TryFrom<u8> arms bumped: 18 → ReferenceWithSumItem, 146 → twin
- from_serialized_value NonCounted-inner allowlist: 0..=14 | 18 (was | 17),
  and the NotSummed / NotCountedOrSummed rejection lists now include byte
  18 and twin 146 as illegal inners
- Discriminant-pinning tests updated to assert byte 18, twin 146
- Round-trip tests updated to assert serialized[0] == 18

Doc comments combined: count/sum/big_sum value helpers describe all four
wrapper behaviors (NonCounted suppresses count, NotSummed suppresses sum,
NotCountedOrSummed suppresses both, ReferenceWithSumItem contributes
explicit sum). Flag accessors / setters merge the NotCountedOrSummed inner
arm with my existing NonCounted | NotSummed delegation.

Test totals after merge (no regressions):
- grovedb-element: 85 + 17 + 3 + 4 = 109 (was 90; develop adds NotCountedOrSummed coverage)
- grovedb-merk: 432 (was 429; develop adds insert/reconstruct guards)
- grovedb lib: 1565 (was 1556 with both feature sets present)

The wrapper bit-encoding range check I introduced for is_non_counted
(0x80..=0xAF instead of upper-nibble compare) survives the merge unchanged
and remains required because base discriminant 18 produces a twin (0x92)
that falls outside the original 0x80 upper-nibble assumption.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented May 16, 2026

Caution

Review failed

Pull request was closed or merged during review

📝 Walkthrough

Walkthrough

This PR adds a new Element::ReferenceWithSumItem variant that references a target element while carrying an explicit sum value weight independent of the resolved target. It includes batch refresh-operation support via a new RefreshReferenceMode enum, serialization/deserialization infrastructure, and integration throughout reference-following code paths in get, query, insert, and proof operations.

Changes

ReferenceWithSumItem Feature

Layer / File(s) Summary
Element Variant & Type System Definition
grovedb-element/src/element/mod.rs, grovedb-element/src/element_type.rs
Adds Element::ReferenceWithSumItem(ReferencePathType, MaxReferenceHop, SumValue, Option<ElementFlags>) variant and corresponding ElementType::ReferenceWithSumItem (discriminant 18) and ElementType::NonCountedReferenceWithSumItem (discriminant 146) with wrapper validation and proof-node type classification.
Constructors & Element Helpers
grovedb-element/src/element/constructor.rs, grovedb-element/src/element/helpers.rs
Implements four constructor methods (with/without flags/hops) and extends all helper methods (sum-value accessors, predicates, flag getters/setters, reference-path conversion, wrapper support) to uniformly handle ReferenceWithSumItem alongside existing references.
Serialization, Display & Visualization
grovedb-element/src/element/mod.rs (serde), grovedb-element/src/element/visualize.rs, grovedb-element/tests/element_display_and_serialization.rs
Adds serde ElementShadow support, Display formatting with max-hop/sum/flags, visualization output, and extends type tests to validate discriminant pinning and bincode round-trips.
Element Unit Tests
grovedb-element/tests/element_constructors_helpers.rs
Validates constructor variants, predicate behavior, non-counted wrapping, sum propagation, flag manipulation, reference conversion, and serialization round-trips.
RefreshReferenceMode Operation Type
grovedb/src/batch/refresh_reference_mode.rs
Introduces RefreshReferenceMode enum with five variants encoding both reference shape (plain vs sum-item) and trust mode (trusted vs untrusted), plus is_trusted() method for dispatch logic.
GroveOp::RefreshReference Refactoring
grovedb/src/batch/mod.rs
Replaces trust_refresh_reference: bool field with mode: RefreshReferenceMode, adds non_counted flag, updates QualifiedGroveDbOp constructors (refresh_reference_op, refresh_reference_with_sum_item_op, refresh_reference_with_sum_item_keep_sum_op), and adjusts Debug formatting to display refresh modes.
Batch Reference Resolution & Dispatch
grovedb/src/batch/mod.rs
Updates hop=1 fast-path, slow-path control flow, hop>1 dispatch, in-batch value-hash computation, and dependent-reference following to treat ReferenceWithSumItem the same as Reference during chain following.
Batch Insertion & Override-Check Updates
grovedb/src/batch/mod.rs
Updates wrapper-aware override-checking via element.is_reference() and reference-insertion wiring to handle ReferenceWithSumItem uniformly with Reference.
RefreshReference Execution Handler
grovedb/src/batch/mod.rs
Rewrites apply-time logic to dispatch on RefreshReferenceMode: trusted modes reconstruct elements directly from op fields, untrusted modes read on-disk element and enforce variant/wrapper compatibility, optionally override sum, and compute refreshed value-hash via follow-reference.
Batch Cost Estimation Infrastructure
grovedb/src/batch/estimated_costs/mod.rs, grovedb/src/estimated_costs/average_case_costs.rs, grovedb/src/estimated_costs/worst_case_costs.rs
Adds wrapper_overhead_for helper function and updates average_case_merk_insert_tree/worst_case_merk_insert_tree signatures to accept wrapper_overhead parameter.
Average-Case Cost Estimation
grovedb/src/batch/estimated_costs/average_case_costs.rs, grovedb/src/tests/estimated_costs_average_case_tests.rs
Updates InsertTreeWithRootHash, RefreshReference (rebuilds element based on mode), and InsertNonMerkTree to include wrapper overhead; adds refresh-with-sum-item and wrapper-byte accounting tests.
Worst-Case Cost Estimation
grovedb/src/batch/estimated_costs/worst_case_costs.rs, grovedb/src/tests/estimated_costs_worst_case_tests.rs
Adds RefreshReferenceWithSumItem handling to estimator, updates wrapper overhead calculation for InsertTreeWithRootHash and InsertNonMerkTree, includes tests validating worst-case estimates and wrapper overhead.
Reference-Following Integration
grovedb/src/operations/get/mod.rs, grovedb/src/operations/insert/mod.rs, grovedb/src/operations/proof/generate.rs, grovedb/src/operations/proof/verify.rs, grovedb/src/reference_path.rs, grovedb/src/element/aggregate_sum_query/mod.rs, grovedb/src/lib.rs
Updates all reference-following code paths (get, insert, proof generation/verification, path resolution, aggregate sum queries) to treat ReferenceWithSumItem uniformly with Reference during chain following.
Query Layer Integration
grovedb/src/operations/get/query.rs
Updates query result-handling layers to match both reference variants, enforce absolute-path requirements, follow via follow_reference, and clarify that carried sums are parent-aggregation properties.
Test Infrastructure & Aggregate Sum Tests
grovedb/src/tests/mod.rs, grovedb/src/element/aggregate_sum_query/tests.rs
Adds test module integration and extends aggregate sum query tests for direct and chained ReferenceWithSumItem resolution.
End-to-End Integration Tests
grovedb/src/tests/reference_with_sum_item_tests.rs
Comprehensive test suite covering sum propagation to parent trees, multi-hop resolution chains, batch operations, cross-type rejection, predicate persistence, query/proof coverage, and dependent-reference hash recomputation.
Batch Test Updates
grovedb/src/tests/batch_coverage_tests.rs, grovedb/src/tests/batch_unit_tests.rs, grovedb/src/tests/misc_coverage_tests.rs
Updates batch tests to pass non_counted parameter and use RefreshReferenceMode in op construction.
Debugger Wire-Format Support
grovedb/src/debugger.rs
Adds reference_path_to_grovedbg helper and extends element_to_grovedbg to emit grovedbg_types::Element::ReferenceWithSumItem, preserving sum_item_value on wire; includes conversion tests.
Debugger Type Definition
grovedbg-types/src/lib.rs, grovedbg-types/Cargo.toml
Adds Element::ReferenceWithSumItem variant carrying Reference and i64 sum_item_value, adds serde_json dev-dependency, includes JSON round-trip tests.
Merk Storage Cost Accounting
merk/src/element/get.rs
Updates deserialization cost accounting to treat ReferenceWithSumItem as a free sum-item for storage-loaded-bytes calculation.

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related PRs

  • dashpay/grovedb#654: Both PRs touch reference-handling pathways in grovedb batch and get operations, with this PR extending chain logic for ReferenceWithSumItem while the related PR alters the same dispatch to properly unwrap Element::NonCounted.

🐰 A sum now weighs on references' backs,
Through hopping chains and queried tracks,
New modes refresh, old paths stay true,
With carried sums the journey through! 🏃‍♂️✨

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title clearly and specifically describes the main change: adding a new Element::ReferenceWithSumItem variant and RefreshReferenceWithSumItem batch operation, which is the central focus of this large changeset.
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch claude/practical-kalam-bc19ed

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@codecov
Copy link
Copy Markdown

codecov Bot commented May 16, 2026

Codecov Report

❌ Patch coverage is 90.50132% with 72 lines in your changes missing coverage. Please review.
✅ Project coverage is 91.01%. Comparing base (6611acb) to head (92b7599).
⚠️ Report is 1 commits behind head on develop.

Files with missing lines Patch % Lines
grovedb/src/batch/mod.rs 85.95% 25 Missing ⚠️
...db/src/batch/estimated_costs/average_case_costs.rs 91.91% 16 Missing ⚠️
...vedb/src/batch/estimated_costs/worst_case_costs.rs 90.64% 16 Missing ⚠️
grovedb/src/element/aggregate_sum_query/mod.rs 44.44% 5 Missing ⚠️
grovedb/src/operations/get/query.rs 60.00% 4 Missing ⚠️
grovedb-element/src/element/mod.rs 72.72% 3 Missing ⚠️
grovedb-element/src/element/helpers.rs 97.22% 1 Missing ⚠️
grovedb-element/src/element_type.rs 98.41% 1 Missing ⚠️
grovedb/src/operations/proof/generate.rs 75.00% 1 Missing ⚠️
Additional details and impacted files
@@             Coverage Diff             @@
##           develop     #667      +/-   ##
===========================================
+ Coverage    90.87%   91.01%   +0.14%     
===========================================
  Files          189      192       +3     
  Lines        57139    57792     +653     
===========================================
+ Hits         51923    52598     +675     
+ Misses        5216     5194      -22     
Components Coverage Δ
grovedb-core 88.85% <88.75%> (+0.32%) ⬆️
merk 92.39% <ø> (ø)
storage 86.36% <ø> (ø)
commitment-tree 96.43% <ø> (ø)
mmr 96.76% <ø> (ø)
bulk-append-tree 89.14% <ø> (ø)
element 96.56% <96.37%> (+0.10%) ⬆️
🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
grovedb/src/batch/mod.rs (1)

1794-1818: ⚠️ Potential issue | 🔴 Critical | 🏗️ Heavy lift

Resolve refreshed sum-item references through the new path, not the on-disk one.

execute_ops_on_path later rebuilds RefreshReferenceWithSumItem from the op payload even when trust_refresh_reference is false, but this branch still falls back to process_reference(..., None, ...) in that case. If another reference in the same batch points at the refreshed key, its value hash is computed against the stale on-disk target instead of the new reference_path_type, so the dependent reference can be committed with the wrong hash.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@grovedb/src/batch/mod.rs` around lines 1794 - 1818, The branch handling
GroveOp::RefreshReferenceWithSumItem currently passes None to process_reference
when trust_refresh_reference is false, causing dependent references to resolve
against the on-disk target; always resolve refreshed sum-item references through
the op's new path by passing Some(reference_path_type) to process_reference
(i.e., remove the conditional around trust_refresh_reference and supply
reference_info = Some(reference_path_type)), so
GroveOp::RefreshReferenceWithSumItem (and code that rebuilds
RefreshReferenceWithSumItem in execute_ops_on_path) will use the refreshed
reference_path_type when calling process_reference.
🧹 Nitpick comments (1)
grovedb/src/debugger.rs (1)

748-757: ⚡ Quick win

Add a focused unit test for ReferenceWithSumItem debugger conversion.

This new branch intentionally drops the carried sum in debugger wire output; a dedicated test would lock that contract and prevent accidental behavior drift.

Proposed test addition
 #[cfg(test)]
 mod tests {
     use super::*;
+    use crate::reference_path::ReferencePathType;

     #[test]
     fn element_to_grovedbg_converts_item_with_sum_item() {
         let flags = Some(vec![1, 2, 3]);
         let element = crate::Element::ItemWithSumItem(b"dbg".to_vec(), -5, flags.clone());
@@
             _ => panic!("unexpected debugger conversion"),
         }
     }
+
+    #[test]
+    fn element_to_grovedbg_converts_reference_with_sum_item_as_reference() {
+        let flags = Some(vec![9]);
+        let element = crate::Element::ReferenceWithSumItem(
+            ReferencePathType::AbsolutePathReference(vec![b"root".to_vec(), b"k".to_vec()]),
+            None,
+            42,
+            flags.clone(),
+        );
+
+        match element_to_grovedbg(element) {
+            grovedbg_types::Element::Reference(
+                grovedbg_types::Reference::AbsolutePathReference { path, element_flags },
+            ) => {
+                assert_eq!(path, vec![b"root".to_vec(), b"k".to_vec()]);
+                assert_eq!(element_flags, flags);
+            }
+            _ => panic!("unexpected debugger conversion"),
+        }
+    }
 }
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@grovedb/src/debugger.rs` around lines 748 - 757, Add a focused unit test that
constructs a crate::Element::ReferenceWithSumItem (with a nonzero _sum_value)
and verifies that element_to_grovedbg returns the same debugger output as
calling element_to_grovedbg on crate::Element::Reference(created with the same
reference_path, max_hop, element_flags), thereby locking the current behavior
that the carried sum is dropped; place the test in the debugger.rs tests module
and assert equality of serialized/debug outputs (or the returned structure) to
detect any future drift in Element::ReferenceWithSumItem handling.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@grovedb/src/batch/mod.rs`:
- Around line 2020-2026: The new match arm handling
Element::ReferenceWithSumItem shares the same insertion path as
Element::Reference but the guard function
validate_insertion_does_not_override_tree only exempts Element::Reference (on
the outer enum), causing ReferenceWithSumItem and
NonCounted(ReferenceWithSumItem) to be incorrectly rejected; update the
validation logic so it treats ReferenceWithSumItem the same as Reference (and
recognize NonCounted wrapping it) — e.g., extend the exemption pattern(s) in
validate_insertion_does_not_override_tree to include
Element::ReferenceWithSumItem and the NonCounted(ReferenceWithSumItem) variant
(or normalize the inner enum check) so both reference kinds pass the same
validation.
- Around line 2332-2338: The trusted-refresh branch currently builds a bare
Element::ReferenceWithSumItem when trust_refresh_reference is true, which strips
a stored NonCounted(...) wrapper and changes serialized bytes and aggregates;
update the logic that constructs the element (the branch using
trust_refresh_reference) to detect and preserve the NonCounted wrapper bit
(i.e., reconstruct and write back NonCounted(ReferenceWithSumItem(...)) when the
stored element was wrapped) or, if preserving wrapped refs is undesired,
explicitly reject/return an error when encountering a
NonCounted(ReferenceWithSumItem) during trusted refresh so you do not silently
drop the wrapper; adjust the codepaths that read/construct elements (the place
constructing Element::ReferenceWithSumItem) to carry the wrapper information
through to the write.

---

Outside diff comments:
In `@grovedb/src/batch/mod.rs`:
- Around line 1794-1818: The branch handling
GroveOp::RefreshReferenceWithSumItem currently passes None to process_reference
when trust_refresh_reference is false, causing dependent references to resolve
against the on-disk target; always resolve refreshed sum-item references through
the op's new path by passing Some(reference_path_type) to process_reference
(i.e., remove the conditional around trust_refresh_reference and supply
reference_info = Some(reference_path_type)), so
GroveOp::RefreshReferenceWithSumItem (and code that rebuilds
RefreshReferenceWithSumItem in execute_ops_on_path) will use the refreshed
reference_path_type when calling process_reference.

---

Nitpick comments:
In `@grovedb/src/debugger.rs`:
- Around line 748-757: Add a focused unit test that constructs a
crate::Element::ReferenceWithSumItem (with a nonzero _sum_value) and verifies
that element_to_grovedbg returns the same debugger output as calling
element_to_grovedbg on crate::Element::Reference(created with the same
reference_path, max_hop, element_flags), thereby locking the current behavior
that the carried sum is dropped; place the test in the debugger.rs tests module
and assert equality of serialized/debug outputs (or the returned structure) to
detect any future drift in Element::ReferenceWithSumItem handling.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 61c2fd10-9580-4c41-93c2-31eb3b6154fd

📥 Commits

Reviewing files that changed from the base of the PR and between 6611acb and f86f68d.

📒 Files selected for processing (23)
  • grovedb-element/src/element/constructor.rs
  • grovedb-element/src/element/helpers.rs
  • grovedb-element/src/element/mod.rs
  • grovedb-element/src/element/visualize.rs
  • grovedb-element/src/element_type.rs
  • grovedb-element/tests/element_constructors_helpers.rs
  • grovedb-element/tests/element_display_and_serialization.rs
  • grovedb/src/batch/batch_structure.rs
  • grovedb/src/batch/estimated_costs/average_case_costs.rs
  • grovedb/src/batch/estimated_costs/worst_case_costs.rs
  • grovedb/src/batch/mod.rs
  • grovedb/src/debugger.rs
  • grovedb/src/element/aggregate_sum_query/mod.rs
  • grovedb/src/lib.rs
  • grovedb/src/operations/get/mod.rs
  • grovedb/src/operations/get/query.rs
  • grovedb/src/operations/insert/mod.rs
  • grovedb/src/operations/proof/generate.rs
  • grovedb/src/operations/proof/verify.rs
  • grovedb/src/reference_path.rs
  • grovedb/src/tests/mod.rs
  • grovedb/src/tests/reference_with_sum_item_tests.rs
  • merk/src/element/get.rs

Comment thread grovedb/src/batch/mod.rs
Comment thread grovedb/src/batch/mod.rs Outdated
QuantumExplorer and others added 3 commits May 16, 2026 23:04
…gate-sum paths for ReferenceWithSumItem

Addresses Codecov failure on PR #667 (patch coverage 69.76% < 90% target).
The added arms in query.rs, batch refresh, cost estimation, aggregate-sum-
query, and proof generation/verify all lacked test coverage. Adds:

batch/estimated_costs/average_case_costs.rs:
- test_refresh_reference_with_sum_item_average_case_cost: covers the new
  GroveOp::RefreshReferenceWithSumItem arm in average_case_cost.

batch/estimated_costs/worst_case_costs.rs:
- test_refresh_reference_with_sum_item_worst_case_cost: same for worst-case.

element/aggregate_sum_query/tests.rs:
- test_reference_with_sum_item_chain_followed_to_sum_item: aggregate-sum
  query follows the new variant to a SumItem and returns the target's sum
  (the carried sum is parent-aggregation-only).
- test_reference_with_sum_item_chain_through_intermediate_reference:
  exercises the chain-continuation arm with a RefWithSum -> Ref -> SumItem
  chain.

tests/reference_with_sum_item_tests.rs:
- query_item_value_follows_reference_with_sum_item: covers operations/get/
  query.rs query_item_value reference arm.
- query_item_value_or_sum_follows_reference_with_sum_item: same for
  query_item_value_or_sum.
- query_sums_follows_reference_with_sum_item_to_sum_item: same for
  query_sums.
- query_encoded_many_follows_reference_with_sum_item: covers the multi-
  path query arm near query.rs:98.
- batch_refresh_reference_with_sum_item_untrusted: covers the
  trust_refresh_reference=false branch which reads the on-disk element
  and verifies it is also a ReferenceWithSumItem before rebuilding.
- prove_and_verify_reference_with_sum_item: prove_query + verify round-
  trip exercises both reference proof arms.

Test totals: grovedb 1575 (was 1565, +10), grovedb-element 85, grovedb-merk 432.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…mItem; exempt new refs from tree-override guard

Addresses CodeRabbit review feedback on PR #667:

Finding 1 (Minor, batch/mod.rs:2026): The
validate_insertion_does_not_override_tree guard exempted only
Element::Reference(..) on the outer enum via matches!, rejecting
ReferenceWithSumItem and any NonCounted-wrapped reference as
"attempting to overwrite a tree". Switched to element.is_reference(),
which looks through NonCounted and recognizes both reference variants.

Finding 2 (Major, batch/mod.rs:2338): The trusted-refresh branch
unconditionally rebuilt a bare Element::ReferenceWithSumItem(...),
silently stripping any NonCounted wrapper that was on disk. This
would change count_value_or_default from 0 to 1 and corrupt the
parent's count aggregate. Fixed by adding an explicit non_counted: bool
field to GroveOp::RefreshReferenceWithSumItem and the
refresh_reference_with_sum_item_op constructor:

- Trusted path (trust_refresh_reference = true): caller's non_counted
  declaration is taken at face value; wraps in NonCounted iff the flag
  is set.
- Untrusted path (trust_refresh_reference = false): cross-checks the
  declared non_counted against the on-disk element and returns
  Error::InvalidInput on mismatch, preventing silent wrapper drop or
  injection.

Added two tests:
- batch_refresh_reference_with_sum_item_trusted_preserves_non_counted_wrapper:
  seeds NonCounted(RefWithSum) in a CountSumTree, refreshes trusted with
  non_counted=true, asserts the parent's CountAndSum aggregate stays
  (0, new_sum) - wrapper preserved.
- batch_refresh_reference_with_sum_item_untrusted_rejects_wrapper_mismatch:
  seeds a bare RefWithSum, attempts untrusted refresh with non_counted=true,
  asserts the apply path errors out.

Total: 1577 grovedb tests (was 1575, +2).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…ss_reference

Addresses the critical "outside diff" CodeRabbit finding on PR #667.

When a batch contained both a RefreshReference / RefreshReferenceWithSumItem
op AND another reference (in the same batch) that pointed at the refreshed
key, the dependent reference's value hash was computed against the stale
on-disk target. The bug:

    let reference_info = if *trust_refresh_reference {
        Some(reference_path_type)
    } else {
        None  // <-- falls back to on-disk lookup (pre-batch state)
    };
    self.process_reference(.., reference_info, ..);

When trust=false, process_reference received None, treated the reference
as if it weren't being refreshed in the batch, and read the stale
on-disk value hash. The apply path later wrote the refreshed element
to disk, but other dependent refs in the same batch had already been
committed with the wrong hash.

Fix: always pass Some(reference_path_type). The op payload IS the
authoritative new path during batch processing; the trust flag is
orthogonal and only controls cross-validation against disk at apply
time in execute_ops_on_path, not path resolution for dependent refs.

This also fixes the same pre-existing bug in plain RefreshReference,
since both ops share the merged match arm in process_reference's
in-batch branch.

Added regression test
batch_dependent_reference_resolves_through_refreshed_path: builds a
batch with a RefreshReferenceWithSumItem (trust=false, path: old to new)
and a dependent Reference that points at the refreshed key.
Re-resolving the dependent ref after the batch must return the NEW
target's payload, proving the in-batch hash computation followed the
refreshed path.

1578 grovedb tests (was 1577, +1).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@QuantumExplorer
Copy link
Copy Markdown
Member Author

This is Claude. Two follow-ups on the latest review:

Outside-diff finding (Critical) — process_reference passing None when trust=false: fixed in 9384dc0. The reference_info argument to process_reference now unconditionally passes Some(reference_path_type) for both RefreshReference and RefreshReferenceWithSumItem. The op's payload is the authoritative new path during batch processing; the trust_refresh_reference flag is orthogonal and only controls cross-validation against disk at apply time, not in-batch path resolution for dependent references. This also fixes the same pre-existing bug in plain RefreshReference since both ops share the merged arm.

Added regression test batch_dependent_reference_resolves_through_refreshed_path: builds a batch with a RefreshReferenceWithSumItem (trust=false, path changes item_olditem_new) and a dependent plain Reference that resolves through the refreshed key. Resolving the dependent ref after the batch must return the new target's payload, proving the in-batch hash computation followed the refreshed path.

Optional cost-path follow-up (from your reply on comment 3253081380): the average/worst-case cost estimators construct a bare ReferenceWithSumItem even when non_counted=true, so the +1 wrapper byte is undercounted. I'm deferring this — pre-existing RefreshReference cost estimators have the same imprecision and the byte difference is tiny relative to the path/sum/flags bytes. Will track as a separate follow-up. Happy to address now if you'd prefer.

Run totals: 1578 grovedb tests (was 1577, +1), all green.

QuantumExplorer and others added 2 commits May 16, 2026 23:39
…erve, missing key

Boosts patch coverage on PR #667 (codecov was 86.66% < 90% target).

grovedb/src/operations/get/query.rs:
Restructured the first match arm to keep `=> match reference_path { ... }`
on the same line as the `| Element::ReferenceWithSumItem(...)` pattern
extension. The previous form wrapped the inner match in an extra block,
re-indenting the entire match body and causing codecov to flag all
pre-existing inner branches as "patch" lines.

Test additions in grovedb/src/tests/reference_with_sum_item_tests.rs:

- refresh_reference_with_sum_item_op_tag_pin: compares the new op
  against Delete and InsertOrReplace via Ord::cmp, exercising the
  to_u8 arm at line 443 (RefreshReferenceWithSumItem to 17). This is
  the wire-format pin for batch op serialization.
- refresh_reference_with_sum_item_debug_format: asserts the
  fmt::Debug arm produces a string with the op name, max_hop, sum,
  and trust flag.
- batch_refresh_reference_with_sum_item_trusted_with_nc_wraps: trusted
  refresh + non_counted=true on a CountSumTree parent goes through
  Element::new_non_counted on the rebuilt inner.
- batch_refresh_reference_with_sum_item_untrusted_matches_wrapper_succeeds:
  untrusted refresh against a NonCounted(RefWithSum) with
  non_counted=true succeeds (the cross-check passes and the wrapper
  is rebuilt).
- batch_refresh_reference_with_sum_item_untrusted_missing_key_errors:
  refresh of a non-existing key with trust=false exercises the
  "trying to refresh a non existing reference" error arm.

1583 grovedb tests (was 1578, +5).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…ost arms honor non_counted

P2 fix - trusted refresh could persist NonCounted in non-count-bearing trees.
The direct insert/replace/patch path at grovedb/src/batch/mod.rs lines
2016-2037 rejects NonCounted-wrapped elements in non-count-bearing
parents. The RefreshReferenceWithSumItem apply branch added in this PR
constructs the element internally from the op's non_counted flag and
bypassed that guard: with trust_refresh_reference = true, a caller could
refresh a bare ReferenceWithSumItem in a NormalTree into
NonCounted(ReferenceWithSumItem) - violating the invariant that
NonCounted-wrapped elements only live in count-bearing trees.

Fix: replicate the per-merk wrapper invariant in the refresh apply
branch, after rebuilding element and before get_feature_type:

    if element.is_non_counted() && !in_tree_type.is_count_bearing() {
        return Err(Error::InvalidBatchOperation(
            "RefreshReferenceWithSumItem with non_counted=true requires \
             a count-bearing parent",
        )).wrap_with_cost(cost);
    }

Added regression test
batch_refresh_reference_with_sum_item_trusted_with_nc_rejected_in_normal_tree:
seeds a bare RefWithSum in a NormalTree, attempts a trusted refresh
with non_counted=true, asserts the apply errors out AND the on-disk
shape is unmodified.

P3 fix - cost estimators ignored non_counted. The
RefreshReferenceWithSumItem arms in average_case_costs.rs:130 and
worst_case_costs.rs:123 destructured the op with `..` and always
constructed a bare Element::ReferenceWithSumItem for cost computation,
even when execution writes NonCounted(...). That undercounted the
serialized value by the wrapper byte.

Fix: build the element shape that the apply path will actually write:

    let inner = Element::ReferenceWithSumItem(...);
    let element = if *non_counted {
        Element::NonCounted(Box::new(inner))
    } else { inner };

Added two tests that compute the cost for non_counted=true and bare
variants and assert the non_counted estimate is >= the bare estimate
(it must include the wrapper byte).

1586 grovedb tests (was 1583, +3).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick comments (1)
grovedb/src/tests/reference_with_sum_item_tests.rs (1)

621-653: ⚡ Quick win

Assert the exact refresh op wire tag, not just ordering

This check currently proves relative ordering only. At Line 650-Line 652, the test still passes if the tag changes from 17 to any larger value. Please assert the exact encoded/tag value so compatibility is truly pinned.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@grovedb/src/tests/reference_with_sum_item_tests.rs` around lines 621 - 653,
Add an explicit assertion that the refresh op's wire tag equals 17: after
creating refresh via QualifiedGroveDbOp::refresh_reference_with_sum_item_op (the
variable named refresh, which is a GroveOp), assert refresh.to_u8() == 17 (e.g.
assert_eq!(refresh.to_u8(), 17u8)) so the test pins the exact encoded/tag value
rather than only relative ordering.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Nitpick comments:
In `@grovedb/src/tests/reference_with_sum_item_tests.rs`:
- Around line 621-653: Add an explicit assertion that the refresh op's wire tag
equals 17: after creating refresh via
QualifiedGroveDbOp::refresh_reference_with_sum_item_op (the variable named
refresh, which is a GroveOp), assert refresh.to_u8() == 17 (e.g.
assert_eq!(refresh.to_u8(), 17u8)) so the test pins the exact encoded/tag value
rather than only relative ordering.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 1b2aa6bf-4aad-4f31-9a48-5f3f23a2c616

📥 Commits

Reviewing files that changed from the base of the PR and between f86f68d and 4330362.

📒 Files selected for processing (6)
  • grovedb/src/batch/estimated_costs/average_case_costs.rs
  • grovedb/src/batch/estimated_costs/worst_case_costs.rs
  • grovedb/src/batch/mod.rs
  • grovedb/src/element/aggregate_sum_query/tests.rs
  • grovedb/src/operations/get/query.rs
  • grovedb/src/tests/reference_with_sum_item_tests.rs
🚧 Files skipped from review as they are similar to previous changes (3)
  • grovedb/src/batch/estimated_costs/worst_case_costs.rs
  • grovedb/src/operations/get/query.rs
  • grovedb/src/batch/mod.rs

QuantumExplorer and others added 2 commits May 17, 2026 03:25
…tale/wrong hashes

P1 was REAL. Confirmed with a probe test that builds a batch with
RefreshReferenceWithSumItem(link) plus an InsertOrReplace of a
dependent reference dep to link. With the pre-fix code, verify_grovedb
reported a hash mismatch on [test_leaf, dep] (the load-bearing
assertion from CodeRabbit's local probe). The probe is now turned
into a regression test.

Root cause: process_reference at grovedb/src/batch/mod.rs:1332 had a
`recursions_allowed == 1` fast path that returned
merk.get_value_hash(target_key). Two ways that was wrong:

1. For an in-batch refreshed target, the on-disk merk value_hash is
   stale - the apply path hasn't written the refreshed element yet
   when the dependent ref's hash is being computed in the same batch.

2. For ANY Reference target (even unrefreshed), the merk value_hash
   is combine_hash(H(serialize(ref)), referenced_value) - NOT the
   simple hash of the terminal item, which is what insert_reference
   bakes in via Op::PutCombinedReference. So a dep to link to target
   chain committed via batch with max_hop=1 stored a different hash
   than what verify_grovedb (which follows the chain to terminal
   with MAX_REFERENCE_HOPS budget) recomputes.

Fix: remove the fast path entirely. The dispatch is now:
- intermediate_reference_info=Some (target in batch as refresh):
  hop through the op's new path, decrementing recursions_allowed.
- intermediate_reference_info=None (target not in batch): always
  go through process_reference_with_hop_count_greater_than_one,
  which reads the actual on-disk element. Item terminals return
  H(serialize(item)) (consistent with direct insert + verify); a
  Reference target recurses with recursions_allowed - 1 and produces
  ReferenceLimit when the user-declared max_hop is exhausted.

Behavior changes:
- A dep to link to item chain inserted via batch now stores the same
  combined value_hash as the direct insert path computes - so
  verify_grovedb reports clean afterwards.
- max_hop=Some(1) pointing at another Reference cleanly fails with
  ReferenceLimit at batch time, replacing the previous silent-stale
  behavior. The existing batch::tests::test_references explicitly
  covers this rejection (line 5997) and continues to pass.

Tests added in grovedb/src/tests/reference_with_sum_item_tests.rs:

- batch_dependent_ref_resolves_through_refreshed_path_via_chain:
  dep to link to target with link being refreshed in same batch.
  Post-batch verify_grovedb must be clean - the regression check
  for the P1 finding.

- batch_one_hop_dependent_ref_into_ref_chain_rejected: dep with
  max_hop=Some(1) pointing at a Reference is rejected with
  ReferenceLimit. Documents that the strict enforcement is preserved.

1588 grovedb tests (was 1586, +2).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Addresses CodeRabbit nitpick on PR #667
(grovedb/src/tests/reference_with_sum_item_tests.rs:621-653): the
prior test only checked relative `Ord::cmp` ordering, so the new op's
sort tag could silently drift from 17 to any larger value without
failing the test. Bump `GroveOp::to_u8` from private to `pub(crate)`
so the test can pin the exact value, and assert
`refresh.to_u8() == 17` directly. Existing relative-ordering checks
are retained as a sanity belt.

1588 grovedb tests, unchanged count.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@QuantumExplorer
Copy link
Copy Markdown
Member Author

This is Claude. Addressed the nitpick on grovedb/src/tests/reference_with_sum_item_tests.rs:621-653 in d4055e7: bumped GroveOp::to_u8 from private to pub(crate) and added an exact assert refresh.to_u8() == 17 to the refresh_reference_with_sum_item_op_tag_pin test. The previous relative-ordering checks are retained as a sanity belt. Any silent renumbering to a larger value will now fail the test.

The average-case and worst-case cost estimators for the batch ops that
insert merk trees (InsertTreeWithRootHash) and non-merk trees
(InsertNonMerkTree) were under-counting the on-disk payload by 1 byte
when the op's element is wrapped in NonCounted (or, for the merk-tree
case, NotSummed/NotCountedOrSummed). The wrapper prepends a 1-byte
discriminant ahead of the inner element's bincode payload — exactly
the same accounting fix already applied for InsertReference and the
RefreshReferenceWithSumItem cost arms.

Mechanics:
- average_case_merk_insert_tree / worst_case_merk_insert_tree gain a
  wrapper_overhead: u32 parameter that's folded into value_len next
  to flags_len.
- A new pub(in crate::batch) helper wrapper_overhead_for(non_counted,
  not_summed) -> u32 in batch/estimated_costs/mod.rs returns 1 when
  either wrapper bit is set (the two are mutually exclusive on any
  element, so the result is always 0 or 1).
- The four batch cost arms (avg / worst × InsertTreeWithRootHash /
  InsertNonMerkTree) destructure their wrapper bits and feed the
  helper. InsertNonMerkTree only carries non_counted, so it bypasses
  the helper and uses the literal.
- Test callsites that called the two helpers with 6 args are updated
  to pass 0 for the new wrapper_overhead slot.

Why this matters: the cost estimators are used to pre-budget storage
fees before applying batches. A 1-byte under-count on every wrapped
tree insert is small individually but compounds at batch size and
silently lowers reservation requirements relative to actual storage
consumption. Same shape as the bug already fixed for InsertReference
in this PR.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@QuantumExplorer
Copy link
Copy Markdown
Member Author

This is Claude.

Extended the cost-arm wrapper-byte fix to the analogous spots in InsertTreeWithRootHash and InsertNonMerkTree (commit 9c38ade).

The same 1-byte under-count pattern that was lurking in the reference cost arms was also present in the tree-insert arms: average_case_merk_insert_tree / worst_case_merk_insert_tree sized the inner element + flags but didn't account for the bincode wrapper discriminant when the on-disk element is NonCounted(...), NotSummed(...), or NotCountedOrSummed(...).

Mechanics:

  • Both helpers gain a wrapper_overhead: u32 parameter, folded into value_len.
  • New pub(in crate::batch) fn wrapper_overhead_for(non_counted, not_summed) -> u32 helper in grovedb/src/batch/estimated_costs/mod.rs — the two wrapper bits are mutually exclusive on any element (validator enforces this), so the result is always 0 or 1.
  • The four batch cost arms (avg / worst × InsertTreeWithRootHash / InsertNonMerkTree) destructure their wrapper bits and feed the helper. InsertNonMerkTree only carries non_counted, so it uses the literal directly.
  • 6 in-tree test callsites updated to pass 0 for the new slot.

Verified: 1588/1588 cargo test -p grovedb --lib --tests.

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@grovedb/src/batch/estimated_costs/average_case_costs.rs`:
- Around line 1293-1304: The assertion that the NonCounted-wrapped cost must be
at least as large as the bare variant is too weak; update the check comparing
nc_cost and bare_cost sums (nc_cost.storage_cost.added_bytes +
nc_cost.storage_cost.replaced_bytes vs bare_cost.storage_cost.added_bytes +
bare_cost.storage_cost.replaced_bytes) to require a strict increase (i.e. ensure
the left-hand sum is greater than the right-hand sum or >= right-hand sum + 1)
so identical (undercount) results fail the test; modify the assertion message
accordingly to reflect the stricter expectation.

In `@grovedb/src/batch/estimated_costs/worst_case_costs.rs`:
- Around line 991-997: The assertion that verifies wrapper-byte overhead uses a
non-strict comparison; update the check comparing nc_cost and bare_cost (the
expression summing storage_cost.added_bytes and storage_cost.replaced_bytes for
nc_cost vs bare_cost) to use a strict greater-than (>) so non_counted=true must
be strictly larger than bare, and adjust the assertion message accordingly to
reflect a strict increase for the nc_cost vs bare_cost check.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 298d80eb-397f-40b1-bb45-7a2dab985b15

📥 Commits

Reviewing files that changed from the base of the PR and between 4330362 and 9c38ade.

📒 Files selected for processing (9)
  • grovedb/src/batch/estimated_costs/average_case_costs.rs
  • grovedb/src/batch/estimated_costs/mod.rs
  • grovedb/src/batch/estimated_costs/worst_case_costs.rs
  • grovedb/src/batch/mod.rs
  • grovedb/src/estimated_costs/average_case_costs.rs
  • grovedb/src/estimated_costs/worst_case_costs.rs
  • grovedb/src/tests/estimated_costs_average_case_tests.rs
  • grovedb/src/tests/estimated_costs_worst_case_tests.rs
  • grovedb/src/tests/reference_with_sum_item_tests.rs

Comment thread grovedb/src/batch/estimated_costs/average_case_costs.rs
Comment thread grovedb/src/batch/estimated_costs/worst_case_costs.rs
…rapper byte

The previous commit added wrapper-byte accounting for InsertTreeWithRootHash
but only threaded the non_counted and not_summed flags. The op also carries
a third wrapper bit, not_counted_or_summed (introduced in #666 as the
combined sum+count opt-out), which was still being silently dropped.

Mechanics:
- wrapper_overhead_for now takes three flags (non_counted, not_summed,
  not_counted_or_summed). All three are mutually exclusive on any element
  (the wrapper-invariant validator enforces this), so the result is still
  0 or 1, but the helper now matches the full set of wrappers that
  InsertTreeWithRootHash actually carries.
- The two InsertTreeWithRootHash cost arms (avg + worst) destructure the
  third field and feed it to the helper.
- InsertNonMerkTree arms are unchanged (non-Merk trees are never
  sum-bearing, so not_summed / not_counted_or_summed cannot apply).

Coverage tests added (4 in-file tests):
- test_insert_tree_with_root_hash_wrapper_bits_average_case_cost_direct
- test_insert_tree_with_root_hash_wrapper_bits_worst_case_cost_direct
  Both construct InsertTreeWithRootHash with each wrapper bit set in
  turn and assert the bare and wrapped costs differ. Pins the +1 byte
  delta per wrapper bit.
- test_insert_non_merk_tree_non_counted_average_case_cost_direct
- test_insert_non_merk_tree_non_counted_worst_case_cost_direct
  Same for InsertNonMerkTree's single wrapper bit.

These tests also exercise the previously-uncovered true-branch of
wrapper_overhead_for, lifting patch coverage above the 90% threshold.

1592 grovedb tests pass (was 1588, +4).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@QuantumExplorer
Copy link
Copy Markdown
Member Author

This is Claude.

While investigating the codecov/patch failure, I found a real bug in the cost-arm fix from 9c38aded:

GroveOp::InsertTreeWithRootHash carries three wrapper bits — non_counted, not_summed, and not_counted_or_summed (the third was added in #666 as the combined sum+count opt-out). My helper only took the first two; the third was silently dropped, so a batch that inserts a NotCountedOrSummed(SumTree) would still under-count the wrapper byte.

Fix:

  • wrapper_overhead_for now takes all three flags. All three are mutually exclusive on any element (the validator enforces this), so the result remains 0 or 1.
  • Both InsertTreeWithRootHash cost arms (avg + worst) destructure the third field and feed it.
  • InsertNonMerkTree arms are unchanged — non-Merk trees are never sum-bearing, so only non_counted applies.

Coverage tests added (the missing patch lines):

  • test_insert_tree_with_root_hash_wrapper_bits_average_case_cost_direct / ..._worst_case_cost_direct — construct the op with each wrapper bit in turn and assert each wrapped variant costs strictly more than bare. Pins the +1 byte delta for every wrapper bit, including not_counted_or_summed.
  • test_insert_non_merk_tree_non_counted_average_case_cost_direct / ..._worst_case_cost_direct — same for InsertNonMerkTree's single bit.

These tests also hit the previously-uncovered if non_counted || not_summed || not_counted_or_summed { 1 } true-branch of wrapper_overhead_for, which is what tripped codecov.

Verified: 1592 grovedb tests pass (was 1588, +4).

QuantumExplorer and others added 10 commits May 17, 2026 04:32
Addresses CodeRabbit feedback on PR #667:

  > Line 1305 / 998 uses `>=`, so an undercount regression that produces
  > an identical cost would still pass. Assert a strict increase (or `+1`
  > minimum) to make this test load-bearing.

The bug being pinned is "estimator silently ignores the non_counted
flag and produces the same cost as bare". Equality is exactly the
failure mode — non-strict `>=` would let that regression slip past
green CI. Switching to strict `>` makes the assertions actually
load-bearing.

Two assertions tightened (matching the test pattern I already used in
the InsertTreeWithRootHash / InsertNonMerkTree direct cost tests):

- test_refresh_reference_with_sum_item_non_counted_average_case_cost
- test_refresh_reference_with_sum_item_non_counted_worst_case_cost

1592 grovedb tests pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…user contract

Reverts the fast-path removal from bc421eb. The user contract is now
explicit: when `max_reference_hop` is set to 1, the caller is asserting
the target is an `Item` (or `SumItem` / `ItemWithSumItem`) terminal.
Ill-formed callers (`max_hop = 1` pointing at another `Reference`) are
out of scope — their behavior is undefined, not a system-level invariant
to enforce on every well-formed hop=1 ref.

What the fast path does and why it's sound under the contract:

- `recursions_allowed == 1` means the user's budget allows exactly one
  more hop.
- Under the contract that hop lands on an `Item` terminal. The merk-
  stored `value_hash` of an `Item` IS `H(serialize(item))` — the simple
  hash that `insert_reference` bakes into the dependent ref via
  `Op::PutCombinedReference`. So `merk.get_value_hash(target_key)`
  returns the exact value we'd otherwise derive via a full element
  decode.
- Skipping the decode saves the cost of deserializing the terminal
  element on every leaf-of-chain dereference.

Ill-formed input (covered by the deleted `batch_one_hop_dependent_ref_
into_ref_chain_rejected` test): if a user violates the contract by
setting `max_hop = 1` and pointing at another `Reference`, the fast
path returns the target's merk-combined hash as if it were a terminal
simple hash. The dependent ref's stored hash then disagrees with what
`verify_grovedb` recomputes (which walks the chain with the global
`MAX_REFERENCE_HOPS` budget). The user broke their own budget; we don't
pin behavior for that case.

What stays from the earlier P1 work:

- `9384dc09`'s fix is independent and correct: the
  `RefreshReference[WithSumItem]` arm in `follow_reference_get_value_hash`
  unconditionally threads the op's new path to `process_reference`. The
  `trust_refresh_reference` flag is orthogonal — it gates apply-time
  cross-checking, not path resolution for dependent refs.
- `batch_dependent_ref_resolves_through_refreshed_path_via_chain` stays:
  dep has no `max_hop` set (budget = 10), so it never reaches the
  `recursions_allowed == 1` branch. The test's true coverage is the
  `9384dc09` path-threading fix, and the doc comment is updated to say
  so accurately.

Test deleted:
- `batch_one_hop_dependent_ref_into_ref_chain_rejected`: pinned the
  out-of-scope ill-formed case to `ReferenceLimit`. Under the new
  contract that case has no specified behavior, so the test is gone.

1591 grovedb tests pass (was 1592, -1 for the deleted out-of-scope test).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…s sum only

Reworks the `trust_refresh_reference` semantics on
`RefreshReferenceWithSumItem` so the two trust modes are clean and
non-overlapping:

* `trust = true`: apply writes the full op payload (path, max_hop,
  sum_value, flags, non_counted wrapper). No on-disk read. Use this
  to repoint and/or adjust the carried sum atomically.

* `trust = false`: apply reads disk, cross-checks variant +
  `non_counted` wrapper, then writes back with the on-disk path /
  max-hop / flags / wrapper — only `sum_value` is taken from the op.
  Op fields `reference_path_type`, `max_reference_hop`, `flags`, and
  the wrapper bit are intentionally ignored in this mode.

Previously the untrusted mode validated the on-disk shape but then
silently substituted the op's payload anyway — a shallow check that
let a `trust=false` caller change path AND sum. That was hard to
justify: "untrusted" effectively meant "validate one axis, clobber
the rest." The new contract is that `trust=false` strictly means "I
don't know / don't want to assert the path; just refresh the
weight."

Knock-on changes:

- `follow_reference_get_value_hash` `RefreshReference` /
  `RefreshReferenceWithSumItem` arm: gate the path threaded into
  `process_reference` on `trust_refresh_reference`. `Some(op_path)`
  when trusted (apply will write op's path), `None` when untrusted
  (apply will keep on-disk's path). This keeps dependent-ref hash
  computation consistent with whichever path apply actually writes
  — and is now symmetric for both refresh ops, because both share
  the rule "trust=true uses op's path, trust=false keeps on-disk".

- `GroveOp::RefreshReferenceWithSumItem` doc comment rewritten to
  document the two modes precisely (which op fields apply in each).

- Public constructor `refresh_reference_with_sum_item_op` doc
  rewritten to match.

Tests:

- `batch_refresh_reference_with_sum_item_untrusted`: rewritten to
  assert the new contract — pass a bogus `ref_b`, observe the
  on-disk path stayed `ref_a`, only `sum_value` updated to 42.

- `batch_dependent_reference_resolves_through_refreshed_path`:
  switched to `trust=true` (the only mode that rewrites the path).
  Added a `verify_grovedb` clean-state check.

- New `batch_untrusted_refresh_keeps_on_disk_path_only_sum_updates`:
  positive coverage for the new untrusted contract — observe the
  path stays unchanged and the sum updates.

1592 grovedb tests pass (was 1591, +1 for the new untrusted test).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…for refresh ops

`GroveOp::RefreshReference` and `GroveOp::RefreshReferenceWithSumItem`
both follow the same trust-mode contract, but the docs were vague /
in places outright wrong about what happens on cross-type mismatch.
Specifically, the existing `RefreshReferenceWithSumItem` doc claimed
"Cross-type refresh ... is rejected at apply time", which is only
true under `trust=false` — under `trust=true` both ops silently
coerce the on-disk element to the variant the op declares.

This is intentional. The trust=true contract is "I'm asserting the
on-disk shape; write my payload verbatim". A caller using trust=true
against a mismatching on-disk variant gets:

- `RefreshReference` (trust=true) on a `ReferenceWithSumItem`:
  silently overwrites with plain `Reference`, dropping the sum.
  Parent sum aggregate becomes inconsistent.
- `RefreshReferenceWithSumItem` (trust=true) on a plain `Reference`:
  silently writes a `ReferenceWithSumItem` carrying the op's
  `sum_value`. Parent sum aggregate jumps by `+sum_value`.

These are the caller's responsibility, the same way `max_hop = 1`
pointing at a `Reference` is the caller's responsibility (the
fast-path-restored-under-well-formed-contract decision in 4a1a73c).

Changes:

- `GroveOp::RefreshReference` doc rewritten to document the two
  trust modes precisely (which fields apply in each; what happens
  on cross-type mismatch).

- `GroveOp::RefreshReferenceWithSumItem` doc rewritten to drop the
  incorrect "is rejected at apply time" claim and explicitly call
  out the trust=true silent-coercion behavior.

- Public constructors `refresh_reference_op` and
  `refresh_reference_with_sum_item_op`: doc comments rewritten to
  match the variants', so callers reading the constructor see the
  contract at the API surface too.

Tests (contract pins — will fail if a future contributor "fixes"
the silent coercion as a bug, forcing a contract review):

- `batch_refresh_reference_trusted_silently_coerces_ref_with_sum_item`:
  seed a `ReferenceWithSumItem(sum=10)` in a SumTree, refresh with
  `RefreshReference` trust=true, assert on-disk is now plain
  `Reference`.

- `batch_refresh_reference_with_sum_item_trusted_silently_coerces_plain_reference`:
  symmetric — seed a plain `Reference` in a SumTree, refresh with
  `RefreshReferenceWithSumItem` trust=true sum=77, assert on-disk
  is now `ReferenceWithSumItem(sum=77)`.

1594 grovedb tests pass (was 1592, +2 for the two contract pins).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…rence

Drops the awkward `let _ = merk;` no-op from the slow path of
`process_reference`. The previous shape was:

    let merk = ... self.merks.entry(..) ...;  // borrows self.merks
    if recursions_allowed == 1 {
        // uses merk
        return ...;
    }
    if let Some(_) = ... {
        self.follow_reference_get_value_hash(..)  // &mut self
    } else {
        let _ = merk;  // explicit drop of the &mut self.merks borrow
        self.process_reference_with_hop_count_greater_than_one(..)
    }

The `let _ = merk;` released the `&mut self.merks` borrow so the
next call could take `&mut self`. NLL usually handles this, but the
binding was also confusing — the slow path never used `merk` and
the helper internally opens (or reuses the cached) merk via the
same `self.merks.entry(..)` path.

Cleaner: open the merk only inside the fast-path branch where we
actually use it. The slow path no longer carries a stale binding;
the helpers open their own merk handle, which is a HashMap entry
lookup against the same `self.merks` cache — no extra disk work.

No behavior change. 1594 grovedb tests pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…thSumItem

Resolves the TODO in `grovedb/src/debugger.rs::element_to_grovedbg`
that was rendering `Element::ReferenceWithSumItem` as a plain
`Reference` and dropping the carried sum from the debugger UI.

grovedbg-types changes (additive):

- New `Element::ReferenceWithSumItem { reference: Reference,
  sum_item_value: i64 }` variant on the wire `Element` enum. Reuses
  the existing wire `Reference` enum (which already encodes
  `element_flags` inside every path discriminant), and tacks on the
  independent `sum_item_value` at the outer level — same shape as
  the existing `Element::ItemWithSumItem` variant.

- `serde_json` added as a dev-dependency for the round-trip tests.

debugger.rs changes:

- Extracted `reference_path_to_grovedbg(reference_path,
  element_flags) -> grovedbg_types::Reference` helper that handles
  the seven-way `ReferencePathType` → wire `Reference`
  discriminant match. Removes the (previously inlined) 70+ lines
  of per-discriminant boilerplate from `element_to_grovedbg`.

- `element_to_grovedbg`'s `Reference` arm collapses to one
  delegation call. New `ReferenceWithSumItem` arm uses the same
  helper to encode the path and wraps it in the new wire variant
  with the carried `sum_item_value`.

Tests:

- grovedbg-types: two JSON round-trip pins for the new wire variant
  — absolute-path-with-flags and sibling-no-flags. These lock in
  the wire schema so a future renumbering or rename trips the
  test.

- grovedb (under the `grovedbg` feature, alongside the existing
  `element_to_grovedbg_converts_item_with_sum_item` test): two
  conversion tests — one absolute-path RefWithSumItem and one
  SiblingReference RefWithSumItem — exercise both the helper and
  the new wire variant end-to-end.

Wire-schema impact: this is an additive change. Existing JSON
payloads continue to round-trip unchanged. The new variant only
appears on the wire when the producer is upgraded; older consumers
will see an unknown-variant decode error if they encounter one,
which is the standard serde behavior for additive enum changes.

1597 grovedb tests pass (with the `grovedbg` feature); 2 new
grovedbg-types tests pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
… into one variant

Previously the two refresh ops were separate `GroveOp` variants with
mostly-duplicate fields, differing only in:
1. `sum_value: SumValue` (only on the sum-item variant)
2. `non_counted: bool` (only on the sum-item variant; plain
   `RefreshReference` handled the wrapper transparently)

This split forced parallel arms across the entire batch pipeline
(apply, dispatch, cost-estimate, ordering, transient-op check,
batch-only rejection, Display, propagation) and made the call sites
hard to keep in sync. It also made the wrapper-handling story
asymmetric: trusted `RefreshReference` silently dropped the
`NonCounted` wrapper, while trusted `RefreshReferenceWithSumItem`
preserved it.

Unification: single `GroveOp::RefreshReference` with
`sum_value: Option<SumValue>` and an explicit `non_counted: bool`.

  GroveOp::RefreshReference {
      reference_path_type: ReferencePathType,
      max_reference_hop:   MaxReferenceHop,
      sum_value:           Option<SumValue>,  // None=plain, Some=RefWithSum
      flags:               Option<ElementFlags>,
      non_counted:         bool,
      trust_refresh_reference: bool,
  }

Semantics, by `sum_value`:
* `None`  → writes `Element::Reference(..)`
* `Some(v)` → writes `Element::ReferenceWithSumItem(.., v, ..)`

Trust modes (unchanged from the prior sum-item op, now apply to
both shapes):
* `trust=true`: writes the op's payload verbatim; if the on-disk
  variant or wrapper disagrees, it's silently coerced. Caller's
  responsibility (same caller-asserted-shape contract).
* `trust=false`: reads on-disk, cross-checks variant + wrapper
  (`sum_value=None` ↔ `Reference`, `sum_value=Some(..)` ↔
  `ReferenceWithSumItem`), and writes back with on-disk path /
  max-hop / flags / wrapper. For sum-item refreshes the op's
  `sum_value` overrides on-disk's sum; plain refreshes write the
  on-disk element back verbatim. Mismatches are rejected.

Sites collapsed (~9 dual arms became single):
- `to_u8`: dropped the `RefreshReferenceWithSumItem => 17` arm.
  Tag 17 is now a "do not reuse" hole — comment in place.
- `fmt::Debug` op-shape rendering: one arm; label switches between
  "Refresh Reference" and "Refresh Reference With Sum Item" based
  on `sum_value.is_some()`.
- `follow_reference_get_value_hash` dispatch arm: collapsed to one
  pattern; threading behavior on `trust_refresh_reference` is
  unchanged (Some(op_path) when trusted, None when untrusted).
- Apply path: one arm constructs the element by matching on
  `sum_value` (None → plain Reference, Some → RefWithSumItem),
  applies the `NonCounted` wrapper if declared, and does the
  cross-check + rebuild for `trust=false`.
- Cost estimators (avg + worst): one arm builds the same element
  shape the apply path will write, so the wrapper byte is counted.
- `batch_structure.rs` ordering classification: collapsed.
- Insert-under-refreshed-reference rejection (line ~3269):
  collapsed.
- Batch-only NotSupported rejection (line ~3721): collapsed and
  message updated.

Public API:
- `refresh_reference_op(..)`: unchanged signature; now builds the
  unified op with `sum_value=None`, `non_counted=false`.
- `refresh_reference_with_sum_item_op(..)`: unchanged signature;
  now builds the unified op with `sum_value=Some(..)` and
  `non_counted` from the caller.

Wire-format note: `GroveOp` is in-memory only (used for batch
processing within a single apply call), so collapsing the variants
is not a persisted-format break. `to_u8` ordering values for all
other ops are unchanged.

Tests:
- `batch_unit_tests::test_grove_op_ord_all_variants`: updated to
  match the new shape (16 variants — RefreshReferenceWithSumItem
  was already missing from this list, so the "all 16" claim now
  matches the test's content).
- `refresh_reference_op_tag_pin` (renamed from
  `refresh_reference_with_sum_item_op_tag_pin`): pins both
  constructors to op-tag 5 (the unified RefreshReference tag);
  asserts they share the same variant.
- `refresh_reference_with_sum_item_debug_format`: updated
  assertion to expect "sum Some(42)" (Option-rendered) instead of
  "sum 42" (raw int).
- `refresh_reference_constructors_share_unified_variant` (new):
  structural regression test pinning that both public
  constructors produce `GroveOp::RefreshReference` distinguished
  only by `sum_value`.

1595 grovedb tests pass (was 1594, +1 for the structural pin).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…nts, no invalid pairs)

Switches `GroveOp::RefreshReference` from a 3-variant mode +
separate `trust_refresh_reference: bool` to a 5-variant enum where
trust is encoded directly in the variant name. The invalid
combination (SumItemReferenceNoValueUpdate + trust=true) is now
unrepresentable by construction instead of being a runtime error.

```rust
pub enum RefreshReferenceMode {
    PlainReferenceTrusted,
    PlainReferenceUntrusted,
    SumItemReferenceTrusted(SumValue),
    SumItemReferenceUntrustedValueUpdate(SumValue),
    SumItemReferenceUntrustedNoValueUpdate,
}
```

There is no `SumItemReferenceTrustedNoValueUpdate` variant —
"refresh a RefWithSumItem without changing its carried sum" only
makes sense in untrusted mode (under trusted the apply path has no
sum to write without reading disk). The previous design exposed
that combination at compile time and rejected it at runtime; this
design makes it impossible at compile time, which is the
type-safe option C from the discussion.

GroveOp::RefreshReference field changes:
- `sum_value: Option<SumValue>` + `trust_refresh_reference: bool`
  → replaced by single `mode: RefreshReferenceMode`.
- Other fields unchanged: `reference_path_type`,
  `max_reference_hop`, `flags`, `non_counted`.

Knock-on changes:
- `RefreshReferenceMode::is_trusted()` helper. Used by
  `follow_reference_get_value_hash` (replaces the prior
  `*trust_refresh_reference` check) to decide whether dependent
  refs should resolve against the op's path or the on-disk path.
- Apply path collapses to a 5-way `match mode`. The runtime error
  path that previously rejected (NoValueUpdate, trust=true) is
  gone — that combination doesn't exist anymore.
- Cost estimators (avg + worst): the sum-item match arms collapse
  `SumItemReferenceTrusted` and
  `SumItemReferenceUntrustedValueUpdate` into one pattern (both
  carry a sum value); the `NoValueUpdate` variant uses 0 as the
  cost-only stand-in sum.
- Display: dispatches on the 5 mode variants; drops the
  `trust_reference` suffix (trust is in the mode name now).

Public constructors (signatures unchanged for the existing two —
backward compatible for callers):
- `refresh_reference_op(.., trust_refresh_reference)`: builds
  `PlainReferenceTrusted` or `PlainReferenceUntrusted` based on
  the bool.
- `refresh_reference_with_sum_item_op(.., sum_value, ..,
  trust_refresh_reference)`: builds `SumItemReferenceTrusted(v)`
  or `SumItemReferenceUntrustedValueUpdate(v)` based on the bool.
- `refresh_reference_with_sum_item_keep_sum_op(..)`: builds
  `SumItemReferenceUntrustedNoValueUpdate`. No trust parameter —
  untrusted is the only valid variant.

Tests:
- `refresh_reference_constructors_share_unified_variant`: extended
  to cover all 5 mode variants (the four trusted-vs-untrusted
  cells plus the untrusted-no-value-update). Pins the
  `is_trusted()` helper too.
- `refresh_reference_with_sum_item_debug_format`: updated to
  expect the new "mode SumItemReferenceTrusted(42)" rendering and
  drops the now-removed `trust_reference` suffix.
- `batch_refresh_keep_sum_trusted_rejected` (removed): the
  combination it was pinning is no longer constructible, so the
  apply-path runtime check is gone and the test is moot.

1596 grovedb tests pass; 1599 with `--features grovedbg`.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Moves the `RefreshReferenceMode` enum + its `impl` (with the
`is_trusted()` helper) out of the now-large `batch/mod.rs` into a
dedicated `batch/refresh_reference_mode.rs` module. The type is
re-exported from `batch` so existing call sites (e.g.
`crate::batch::RefreshReferenceMode`) continue to work without
changes.

No behavior change. The intra-doc links to `Element::Reference`,
`Element::ReferenceWithSumItem`, and `GroveOp::RefreshReference`
are spelled with explicit `crate::` / `super::` paths since the
type now lives in a sibling submodule.

1596 grovedb tests pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
GroveOp is in-memory only (used for batch processing within a single
apply call) — there's no persisted wire format to reserve a tag for.
Future ops are free to use 17.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Copy link
Copy Markdown
Member Author

@QuantumExplorer QuantumExplorer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed

Restores symmetry with the other two refresh constructors. Previously
`refresh_reference_op` hard-coded `non_counted: false`, while
`refresh_reference_with_sum_item_op` and
`refresh_reference_with_sum_item_keep_sum_op` both accepted the
wrapper bit. Callers refreshing a plain `Reference` couldn't preserve
or set a `NonCounted` wrapper through the public API.

New signature:

```rust
pub fn refresh_reference_op(
    path: Vec<Vec<u8>>,
    key: Vec<u8>,
    reference_path_type: ReferencePathType,
    max_reference_hop: MaxReferenceHop,
    flags: Option<ElementFlags>,
    non_counted: bool,            // <-- new
    trust_refresh_reference: bool,
) -> Self
```

Behavior:
- Under the trusted variant (`PlainReferenceTrusted`): written at
  face value into the rebuilt element.
- Under the untrusted variant (`PlainReferenceUntrusted`):
  cross-checked against the on-disk wrapper; mismatch is rejected
  (a silent wrapper drop would corrupt the parent's count
  aggregate). Same contract as the sum-item constructors.

This is a breaking change to the function signature. Updated all 17
in-tree callsites to pass `/* non_counted = */ false` explicitly —
preserves the prior behavior at every call.

The structural pin
`refresh_reference_constructors_share_unified_variant` is extended
to (a) call `refresh_reference_op` with `non_counted = true` and
(b) assert that `non_counted` is threaded through correctly by all
three constructors.

1596 grovedb tests pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@QuantumExplorer QuantumExplorer merged commit 8e93631 into develop May 17, 2026
7 of 9 checks passed
@QuantumExplorer QuantumExplorer deleted the claude/practical-kalam-bc19ed branch May 17, 2026 00:46
QuantumExplorer added a commit that referenced this pull request May 17, 2026
Resolves conflicts with develop's PR #667 (Element::ReferenceWithSumItem
+ RefreshReferenceWithSumItem batch op). PR #667 added a new sum-bearing
reference variant + a non_counted field on InsertNonMerkTree; this PR
adds CountIndexedTree / ProvableCountIndexedTree + the
ReplaceAggregateIndexedTreeRootKeys batch op.

**Second on-disk discriminant collision fixed.** Develop's
`ElementType::ReferenceWithSumItem = 18` collides with this PR's prior
post-merge assignment `ElementType::CountIndexedTree = 18`. (Same
shape as the prior merge with NotCountedOrSummed at byte 17.)

Resolution: shift cidx discriminants by another 1 and re-derive the
NonCounted twin bytes:
  - `ElementType::CountIndexedTree`              18 → 19
  - `ElementType::ProvableCountIndexedTree`      19 → 20
  - `ElementType::NonCountedCountIndexedTree`   146 → 147 (`0x80 | 19`)
  - `ElementType::NonCountedProvableCountIndexedTree`
                                                147 → 148 (`0x80 | 20`)

The `Element` enum variant order now places `ReferenceWithSumItem` at
index 18 (matching develop) and the two cidx variants at 19 and 20.
The `ElementShadow` serde decoder mirrors this order. Updated:
  - `try_from` numeric branches, `from_serialized_value` tests
  - NonCounted-wrapper allowlist `0..=14 | 18` → `0..=14 | 18 | 19 | 20`
  - Range assertions: NonCounted twin range `[128, 147]` → `[128, 148]`;
    invalid-byte tests adjusted (149..=179 instead of 148..=179)
  - 18-base-case test grew to 18 cases with explicit counts

Other conflict resolutions union both sides' match arms in:
  - `grovedb-element/element/helpers.rs` (get_flags / get_flags_owned /
    get_flags_mut / set_flags — single combined arm for cidx + RWSI)
  - `grovedb-element/element/mod.rs` (Element enum, ElementShadow,
    element_type() dispatch including the wrapper path)
  - `grovedb-element/element_type.rs` (display strings, try_from,
    NonCounted wrapper allowlist, from_serialized_value tests, the
    canonical base-variant round-trip test grown to 18 cases)
  - `grovedb/batch/estimated_costs/average_case_costs.rs`
    (`InsertNonMerkTree` gained the `non_counted` field on develop;
    union keeps both that signature change and the existing
    `ReplaceAggregateIndexedTreeRootKeys` arm)
  - `grovedb/batch/mod.rs` (RefreshReference doc-comment expansion
    on develop kept alongside the
    `ReplaceAggregateIndexedTreeRootKeys` variant definition)

And one non-exhaustive-match fix flagged by rustc post-merge:
  - `grovedb/operations/count_indexed_tree.rs:438` — fold
    `ReferenceWithSumItem` into the existing `Reference` arm
    (both resolve identically and `insert_reference` already does
    the variant-aware feature-type lookup via `get_feature_type`).

Verified:
- `cargo check -p grovedb` clean
- `cargo test --workspace --lib` ~3000 tests, 0 failures
- `cargo fmt --all` applied

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
QuantumExplorer added a commit that referenced this pull request May 17, 2026
…to 19

Pulls in PR #667 (`Element::ReferenceWithSumItem` +
`GroveOp::RefreshReferenceWithSumItem`) and uses the merge as an
opportunity to re-position `Element::ProvableSumTree` from slot 17 to
slot 19 (the end of the Element enum). This restores develop's
on-disk encoding for every variant that landed before this PR.

## The skew this fixes

Our earlier merge of PR #666 slid `NotCountedOrSummed` from develop's
slot 17 to slot 18 so it wouldn't collide with our `ProvableSumTree`
at slot 17. PR #667 then added `ReferenceWithSumItem` at slot 18 in
develop. If we'd slid that to slot 19, every PR that lands on develop
after `ProvableSumTree` would inherit a 2-slot rotation
`[NotCountedOrSummed, ReferenceWithSumItem]` →
`[ProvableSumTree, NotCountedOrSummed, ReferenceWithSumItem]`,
breaking the wire format of every variant that landed before us.

Re-slotting `ProvableSumTree` to the end keeps develop's encoding
verbatim and limits the divergence to a single new appended variant.

## Final layout

| Slot | Variant         | Notes                                        |
|------|-----------------|----------------------------------------------|
| 15   | NonCounted      | wrapper byte                                 |
| 16   | NotSummed       | wrapper byte                                 |
| 17   | NotCountedOrSummed | wrapper byte (matches develop)            |
| 18   | ReferenceWithSumItem | base type (matches develop)             |
| 19   | ProvableSumTree | this branch's addition, appended at the end  |

## Discriminant changes

- `ElementType::ProvableSumTree`: 17 → **19** (matches new Element
  position).
- `NonCountedProvableSumTree`: 145 (`0x80|17`) → **147 (`0x80|19`)**
  (NonCounted twin formula `base | 0x80`).
- `NotSummedProvableSumTree`: stays at **177** (`0xB1`) — already
  hand-assigned because base 17/19 both overflow the formula's low
  nibble.
- `NotCountedOrSummedProvableSumTree`: stays at **193** (`0xC1`) —
  same hand-assigned-slot rationale.

## Conflict resolution

- `Element` enum (mod.rs): kept develop's NotCountedOrSummed=17,
  ReferenceWithSumItem=18; moved ProvableSumTree to 19 with an updated
  doc comment explaining the positioning.
- `ElementType` enum: restored develop's discriminant assignments for
  the wrapper-byte comment block (17 = NCOS wrapper) and updated the
  ProvableSumTree base + NonCounted twin to 19/147.
- `is_non_counted`: adopted develop's range check
  (`0x80 <= disc < 0xB0`) — required to cover synthetic twins for
  high-numbered base discriminants like 147 (= `0x80|19`).
- `from_serialized_value`: NonCounted-inner allowlist now includes
  `18` (ReferenceWithSumItem) AND `19` (ProvableSumTree).
- NotSummed / NotCountedOrSummed inner-byte matches: `17 =>
  *ProvableSumTree` → `19 => *ProvableSumTree`.
- `element/helpers.rs`: collapsed the two-sided `match` arms in
  `sum_value_or_default`, `count_sum_value_or_default`, and
  `big_sum_value_or_default` so both `ProvableSumTree` and
  `ReferenceWithSumItem` route through the same `(_, sum_value, _)`
  pattern.
- Tests: extended the discriminant table to 17 variants (added byte
  19 for ProvableSumTree); merged the ProvableSumTree and
  ReferenceWithSumItem constructor-helper test blocks so both sets of
  tests live in the same module.
- Updated all stale `base 17` / `145` / `0x80|17` references in doc
  comments to match the new layout.

## Verification

- `cargo build --workspace`: clean.
- `cargo clippy --workspace --all-features`: clean.
- `cargo test --workspace`: 3193 / 0 fail (3138 before, +55 from
  merging in #667's new tests, all green after the discriminant
  fix-up).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
QuantumExplorer added a commit that referenced this pull request May 17, 2026
Audit found that PR #667 (ReferenceWithSumItem) and this PR
(ProvableSumTree) shipped without any tests covering their interaction
— despite the fact that the motivating use-case for RWSI ("ranked /
sortable index entries where the key encodes the rank, the reference
points to a canonical record, the sum is the entry's monetary weight
that aggregates into the parent's total") is exactly the use-case that
ProvableSumTree was added to make cryptographically verifiable.

Adds five crossover tests in
`tests/reference_with_sum_item_tests.rs`:

  - `insert_reference_with_sum_item_in_provable_sum_tree_aggregates_sum`:
    a single RWSI's `sum_value` propagates into the ProvableSumTree's
    hash-bound aggregate. This is the contract that makes the
    PR-#667 use-case work against the proof-bearing tree variant.

  - `multiple_reference_with_sum_items_in_provable_sum_tree_accumulate`:
    three RWSI entries (30, -8, 100) — two pointing at the same
    target — accumulate to the expected ProvableSum(122) on the
    parent. Confirms sums are link-level (not target-level) under
    the provable variant just like under SumTree.

  - `aggregate_sum_on_range_over_reference_with_sum_item_in_provable_sum_tree`:
    end-to-end proof round-trip. Builds a ProvableSumTree of 15 RWSI
    links with weights 1..=15, runs `AggregateSumOnRange` over the
    c..=l sub-range, proves + verifies, and asserts the returned sum
    is exactly 75 (3+4+...+12). Exercises the merk `aggregate_sum`
    prover/verifier, the GroveDB envelope walker, the
    terminal-type gate, AND the `node_hash_with_sum` binding — all
    in one shot.

  - `aggregate_sum_handles_negative_weight_from_reference_with_sum_item`:
    a single RWSI with `sum_value = -42` verifies to -42 through the
    full proof round-trip. Confirms the i64-signed contract works
    end-to-end for RWSI-carried weights.

  - `non_counted_reference_with_sum_item_rejected_in_provable_sum_tree`:
    documents the wrapper-compatibility rule that
    `NonCounted(RWSI)` is REJECTED inside `ProvableSumTree`
    (ProvableSumTree is sum-only, not count-bearing, so the
    NonCounted parent-type guard fires). Catches my own initial
    wrong assumption and pins the parent-type rule.

  - `new_not_counted_or_summed_rejects_reference_with_sum_item`:
    parity with PR #667's `new_not_summed_rejects_reference_with_sum_item`
    — `NotCountedOrSummed`'s sum-bearing-tree-only allow-list rejects
    RWSI just like `NotSummed` does.

Also leaves the existing element-level tests in
`element::aggregate_sum_query::tests` (which already cover
RWSI-resolution chains through aggregate-sum queries) untouched and
relies on them for the no-proof side.

Tests: 3199 / 0 fail (3193 → 3199, +6 crossover tests).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant