Performance and memory optimizations for CellDb #154

Merged
bvscd merged 1 commit into release/node/v0.6.2 from celldb
May 19, 2026
Collaborator

@bvscd bvscd commented May 19, 2026

No description provided.

Copilot AI review requested due to automatic review settings May 19, 2026 10:59
Contributor

Copilot AI left a comment

Pull request overview

This PR focuses on performance/memory improvements around Cell/BOC storage and shardstate application, adding lazy loading to reduce immediate allocations and expanding telemetry to observe memory behavior under jemalloc and Linux.
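The lazy-loading idea in the overview can be illustrated with a minimal sketch. It is not the PR's `LazyLoadCell` implementation: `LazyCell`, `CellData`, `Loader`, and `counting_loader` are hypothetical names, and the sketch only assumes that a cell is known by its hash up front while its body is fetched through a stored loader on first access.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::{Arc, OnceLock};

/// Hypothetical stand-in for a cell body; not the node's real cell type.
#[derive(Debug, PartialEq)]
pub struct CellData {
    pub bytes: Vec<u8>,
}

/// Hypothetical loader: fetches a cell body from storage by its hash.
pub type Loader = Arc<dyn Fn(&[u8; 32]) -> Result<CellData, String> + Send + Sync>;

/// A cell whose hash is known immediately and whose body is loaded on demand.
pub struct LazyCell {
    hash: [u8; 32],
    loader: Loader,
    data: OnceLock<CellData>,
}

impl LazyCell {
    /// Creating the cell allocates no body and does not call the loader.
    pub fn new(hash: [u8; 32], loader: Loader) -> Self {
        Self { hash, loader, data: OnceLock::new() }
    }

    /// The hash is available without loading anything.
    pub fn hash(&self) -> &[u8; 32] {
        &self.hash
    }

    pub fn is_loaded(&self) -> bool {
        self.data.get().is_some()
    }

    /// Loads the body on first access and returns the cached body afterwards.
    pub fn data(&self) -> Result<&CellData, String> {
        if let Some(loaded) = self.data.get() {
            return Ok(loaded);
        }
        let loaded = (self.loader)(&self.hash)?;
        // If another thread stored a value first, that value is kept and ours is dropped.
        Ok(self.data.get_or_init(|| loaded))
    }
}

/// Test helper: a loader that counts how often it is invoked.
pub fn counting_loader(counter: Arc<AtomicUsize>) -> Loader {
    Arc::new(move |hash: &[u8; 32]| -> Result<CellData, String> {
        counter.fetch_add(1, Ordering::SeqCst);
        Ok(CellData { bytes: hash[..4].to_vec() })
    })
}
```

In this sketch a failed load is not cached, so a later call retries the loader; whether the PR behaves the same way is not stated on this page.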

Changes:

  • Add LazyLoadCell support and a new MerkleUpdate::apply_lazy_unchecked path, plus tests validating lazy behavior.
  • Make DynamicBoc counter-cache updates consistent with RocksDB commits (flush cache only post-commit) and add atomic batching for shardstate delete + BOC delete.
  • Improve observability and ops behavior: new telemetry metrics (cell bytes, jemalloc stats, /proc/self/status logging), archive GC adjustments, and additional sync/download logging.
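The second bullet (flush the counter cache only after the database commit) can be sketched as follows. This is a toy model, not the `DynamicBoc` code: `FakeDb`, `CounterStore`, and `fail_next_commit` are hypothetical, and an in-memory map stands in for RocksDB.

```rust
use std::collections::HashMap;

/// Hypothetical key-value backend; `fail_next_commit` simulates a failed write.
#[derive(Default)]
pub struct FakeDb {
    pub rows: HashMap<String, u32>,
    pub fail_next_commit: bool,
}

impl FakeDb {
    /// Applies the whole batch or nothing, mimicking an atomic write batch.
    /// `None` means "delete this key".
    pub fn commit(&mut self, batch: &[(String, Option<u32>)]) -> Result<(), String> {
        if self.fail_next_commit {
            self.fail_next_commit = false;
            return Err("simulated commit failure".to_string());
        }
        for (key, value) in batch {
            match value {
                Some(v) => {
                    self.rows.insert(key.clone(), *v);
                }
                None => {
                    self.rows.remove(key);
                }
            }
        }
        Ok(())
    }
}

/// Reference counters with an in-memory cache in front of the database.
#[derive(Default)]
pub struct CounterStore {
    pub db: FakeDb,
    pub cache: HashMap<String, u32>,
}

impl CounterStore {
    /// Current counter as readers see it: cache first, then the database.
    pub fn get(&self, key: &str) -> u32 {
        self.cache.get(key).or_else(|| self.db.rows.get(key)).copied().unwrap_or(0)
    }

    /// Adds each delta to its counter; the cache changes only if the commit succeeds.
    pub fn apply(&mut self, deltas: &[(&str, i64)]) -> Result<(), String> {
        // Stage the new values without touching the cache.
        let mut staged: HashMap<String, u32> = HashMap::new();
        for (key, delta) in deltas {
            let current = staged.get(*key).copied().unwrap_or_else(|| self.get(key)) as i64;
            let next = current + delta;
            if next < 0 {
                return Err(format!("counter underflow for {key}"));
            }
            staged.insert(key.to_string(), next as u32);
        }
        // A counter that drops to zero becomes a delete in the batch.
        let batch: Vec<(String, Option<u32>)> = staged
            .iter()
            .map(|(k, v)| (k.clone(), if *v == 0 { None } else { Some(*v) }))
            .collect();
        self.db.commit(&batch)?;
        // The commit succeeded: only now publish the staged values to the cache.
        for (k, v) in staged {
            if v == 0 {
                self.cache.remove(&k);
            } else {
                self.cache.insert(k, v);
            }
        }
        Ok(())
    }
}
```

Because the cache is written after `commit` returns `Ok`, a failed commit leaves both the cache and the database at their previous values, so readers never observe counters that were not persisted.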

Reviewed changes

Copilot reviewed 22 out of 23 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| src/node/tests/test_run_net/test_run_net_ci.sh | Improves CI failure diagnostics and uses an absolute stop script path. |
| src/node/storage/src/tests/test_dynamic_boc_rc_db.rs | Updates tests for new delete_boc signature. |
| src/node/storage/src/shardstate_db_async.rs | Retries shardstate jobs on failure; batches shardstate index delete with BOC delete; callback API change. |
| src/node/storage/src/lib.rs | Adds storing_cells_bytes telemetry metric. |
| src/node/storage/src/dynamic_boc_rc_db.rs | Defers counter-cache mutations until after RocksDB commit; adds atomic extra ops to delete_boc; lazy-cell counter lookup tweak. |
| src/node/storage/src/db/rocksdb.rs | Adds helper to append deletes into an existing WriteBatch. |
| src/node/storage/src/cell_db.rs | Tracks storing-cells bytes in telemetry; implements lazy-load cell creation via stored loader. |
| src/node/storage/src/archives/package.rs | Fixes truncate cursor positioning; improves read_entry error context. |
| src/node/storage/src/archives/file_maps.rs | Refactors archive GC and log formatting; introduces rollback reinsertion logic. |
| src/node/storage/src/archives/archive_manager.rs | Adds timing instrumentation via time_checker! around archive operations. |
| src/node/src/sync.rs | Adjusts archive download queue logic and adds detailed downloads accounting logs. |
| src/node/src/main.rs | Updates jemalloc config symbol/name and decay settings; adds panic hook to trigger shutdown. |
| src/node/src/internal_db/mod.rs | Updates shardstate callback interface and call sites. |
| src/node/src/full_node/apply_block.rs | Switches fast merkle-update apply to apply_lazy_unchecked with an added old-hash check. |
| src/node/src/engine.rs | Adds telemetry metrics (cells MB, jemalloc stats), Linux /proc/self/status logger, and more timing instrumentation. |
| src/node/src/engine_traits.rs | Extends EngineTelemetry with cell/jemalloc metrics. |
| src/node/src/collator_test_bundle.rs | Updates telemetry construction to include new metrics. |
| src/node/Cargo.toml | Adds Tokio signal feature; adds optional tikv-jemalloc-ctl dependency and updates jemalloc feature. |
| src/Cargo.lock | Locks new dependencies (tikv-jemalloc-ctl, paste). |
| src/block/src/tests/test_merkle_update.rs | Adds tests covering apply_lazy_unchecked behavior and laziness properties. |
| src/block/src/merkle_update.rs | Extends CellsFactory with lazy creation; implements apply_lazy_unchecked and lazy traversal behavior. |
| src/block/src/cell/tests/test_cell.rs | Adds extensive LazyLoadCell tests, including concurrency and mask handling. |
| src/block/src/cell/mod.rs | Implements LazyLoadCell variant, heap cell byte counters, and safer deallocation on early errors. |
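The `/proc/self/status` logger mentioned for `src/node/src/engine.rs` is not shown on this page. As a rough sketch of what such a logger needs, the helper below extracts a kB-valued field such as `VmRSS` from the file's `Name:\tvalue kB` line format. `parse_status_kb` and `read_self_status_kb` are hypothetical names, not functions from the PR.

```rust
use std::fs;

/// Returns the value in kB of a field such as "VmRSS" from text in the
/// /proc/<pid>/status format. Fields without a "kB" unit yield `None`.
pub fn parse_status_kb(status: &str, field: &str) -> Option<u64> {
    status.lines().find_map(|line| {
        let (name, rest) = line.split_once(':')?;
        if name.trim() != field {
            return None;
        }
        let mut parts = rest.split_whitespace();
        let value = parts.next()?.parse::<u64>().ok()?;
        match parts.next() {
            Some("kB") => Some(value),
            _ => None,
        }
    })
}

/// Reads the field for the current process. On platforms without
/// /proc/self/status the read fails and the result is `None`.
pub fn read_self_status_kb(field: &str) -> Option<u64> {
    let status = fs::read_to_string("/proc/self/status").ok()?;
    parse_status_kb(&status, field)
}
```

Keeping the parser separate from the file read makes it testable on any platform with a fixed input string.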
Comments suppressed due to low confidence (1)

src/node/src/full_node/apply_block.rs:166

  • The fast path now uses merkle_update.apply_lazy_unchecked(&cf), which (by design) skips MerkleUpdate::check(...) and therefore does not validate that pruned branches in the update are contained in the previous state. The added old_hash check only verifies the root hash, not branch consistency. This can allow a malformed (or adversarial) merkle update to be accepted and persisted, potentially leading to later panics/failed loads when the lazy branches are accessed, or to an incorrect state being stored. Consider running merkle_update.check(&prev_ss_root, ...) before the unchecked apply, or using a fast path that still performs the consistency check (even if it produces lazy cells).
        let merkle_update = block.block()?.read_state_update()?;
        if merkle_update.old_hash != *prev_ss_root.repr_hash() {
            fail!(
                "Merkle update old_hash mismatch with prev state for block {}: \
                 update.old_hash = {:x}, prev_ss_root.repr_hash = {:x}",
                block.id(),
                merkle_update.old_hash,
                prev_ss_root.repr_hash()
            );
        }
        let block_id = block.id().clone();
        let engine_cloned = engine.clone();

        let block_descr_clone = block_descr.clone();
        let ss = tokio::task::spawn_blocking(move || -> Result<Arc<ShardStateStuff>> {
            let now = std::time::Instant::now();
            let cf = engine_cloned.db_cells_factory()?;
            let mut fast_attempt = true;
            let (ss_root, _metrics) =
                match merkle_update.apply_lazy_unchecked(&cf) {
                    Err(e) => {
                        log::debug!(
                            "Failed the fast attempt of Merkle update applying for block {}: {}. Trying classic approach...",
                            block_id, e
                        );
                        fast_attempt = false;
                        merkle_update.apply_with_factory(&prev_ss_root, &cf).map_err(|e| {
                            error!(
                                "Error applying Merkle update for block {}: {}\
                                prev_ss_root: {:#.2}\
                                merkle_update: {}",
                                block_id, e, prev_ss_root, merkle_update
                            )
                        })?
                    }
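The gap described in the suppressed comment above (a root-hash comparison alone does not prove that pruned branches exist in the previous state) can be shown with a toy model. Everything here is hypothetical: `Hash` is a plain `u64`, and `PrevState`, `Update`, `root_matches`, `check_pruned_branches`, and `validate` are illustrative names that do not reflect the real `MerkleUpdate::check` signature.

```rust
use std::collections::HashSet;

/// Toy hash type; real cells use 256-bit representation hashes.
pub type Hash = u64;

/// Toy previous state: its root hash plus the hashes of all cells it contains.
pub struct PrevState {
    pub root_hash: Hash,
    pub cell_hashes: HashSet<Hash>,
}

/// Toy update: the root it claims to start from and the branches it prunes.
pub struct Update {
    pub old_hash: Hash,
    pub pruned_branches: Vec<Hash>,
}

/// The root-only check: compares just the claimed old root hash.
pub fn root_matches(update: &Update, prev: &PrevState) -> bool {
    update.old_hash == prev.root_hash
}

/// The consistency check: every pruned branch must exist in the previous state.
pub fn check_pruned_branches(update: &Update, prev: &PrevState) -> Result<(), String> {
    for hash in &update.pruned_branches {
        if !prev.cell_hashes.contains(hash) {
            return Err(format!("pruned branch {hash:#x} is not part of the previous state"));
        }
    }
    Ok(())
}

/// Both checks, run before any unchecked or lazy apply.
pub fn validate(update: &Update, prev: &PrevState) -> Result<(), String> {
    if !root_matches(update, prev) {
        return Err("old_hash does not match the previous state root".to_string());
    }
    check_pruned_branches(update, prev)
}
```

An update whose `old_hash` is correct but which references an unknown branch passes `root_matches` and fails `validate`, which is the case the comment asks the fast path to reject before persisting anything.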

Comment thread src/node/src/engine.rs
Comment thread src/node/storage/src/archives/file_maps.rs
@bvscd bvscd merged commit 3b2efc6 into release/node/v0.6.2 May 19, 2026
16 of 17 checks passed
@bvscd bvscd deleted the celldb branch May 19, 2026 14:40

3 participants