Performance and memory optimizations for CellDb #154
Merged
Pull request overview
This PR focuses on performance/memory improvements around Cell/BOC storage and shardstate application, adding lazy loading to reduce immediate allocations and expanding telemetry to observe memory behavior under jemalloc and Linux.
Changes:
- Add `LazyLoadCell` support and a new `MerkleUpdate::apply_lazy_unchecked` path, plus tests validating lazy behavior.
- Make DynamicBoc counter-cache updates consistent with RocksDB commits (flush cache only post-commit) and add atomic batching for shardstate delete + BOC delete.
- Improve observability and ops behavior: new telemetry metrics (cell bytes, jemalloc stats, `/proc/self/status` logging), archive GC adjustments, and additional sync/download logging.
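To illustrate the lazy-loading idea in the first item, here is a minimal sketch of a cell that carries its hash up front and fetches its body only on first access. It is not the PR's `LazyLoadCell`; the names `LoadedCell`, `Loader`, `LazyCell`, and `counting_loader` are hypothetical, and the real implementation may differ in how it caches, handles level masks, or reports errors.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::{Arc, OnceLock};

/// Hypothetical stand-in for a fully loaded cell (not the real `Cell` type).
#[derive(Debug, PartialEq)]
pub struct LoadedCell {
    pub data: Vec<u8>,
}

/// Hypothetical loader: fetches a cell body by hash, e.g. from a database.
pub type Loader = Arc<dyn Fn(&[u8; 32]) -> Result<LoadedCell, String> + Send + Sync>;

/// A cell that knows its hash immediately but loads its body on demand.
pub struct LazyCell {
    hash: [u8; 32],
    loader: Loader,
    body: OnceLock<LoadedCell>,
}

impl LazyCell {
    pub fn new(hash: [u8; 32], loader: Loader) -> Self {
        Self { hash, loader, body: OnceLock::new() }
    }

    /// The hash is available without touching storage.
    pub fn hash(&self) -> &[u8; 32] {
        &self.hash
    }

    /// Loads the body on first call and caches it; load errors are not cached.
    pub fn body(&self) -> Result<&LoadedCell, String> {
        if let Some(cell) = self.body.get() {
            return Ok(cell);
        }
        let loaded = (self.loader)(&self.hash)?;
        // If another thread stored a value first, that value is kept and ours is dropped.
        Ok(self.body.get_or_init(|| loaded))
    }
}

/// Builds a loader that counts how many times it was invoked (for demonstration).
pub fn counting_loader(counter: Arc<AtomicUsize>) -> Loader {
    Arc::new(move |hash: &[u8; 32]| -> Result<LoadedCell, String> {
        counter.fetch_add(1, Ordering::SeqCst);
        Ok(LoadedCell { data: hash[..4].to_vec() })
    })
}
```

Constructing a `LazyCell` and reading only its hash never calls the loader, which is where the allocation savings would come from. Under contention the loader may run more than once in this sketch, but only one result is retained.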
Reviewed changes
Copilot reviewed 22 out of 23 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| src/node/tests/test_run_net/test_run_net_ci.sh | Improves CI failure diagnostics and uses an absolute stop script path. |
| src/node/storage/src/tests/test_dynamic_boc_rc_db.rs | Updates tests for new delete_boc signature. |
| src/node/storage/src/shardstate_db_async.rs | Retries shardstate jobs on failure; batches shardstate index delete with BOC delete; callback API change. |
| src/node/storage/src/lib.rs | Adds storing_cells_bytes telemetry metric. |
| src/node/storage/src/dynamic_boc_rc_db.rs | Defers counter-cache mutations until after RocksDB commit; adds atomic extra ops to delete_boc; lazy-cell counter lookup tweak. |
| src/node/storage/src/db/rocksdb.rs | Adds helper to append deletes into an existing WriteBatch. |
| src/node/storage/src/cell_db.rs | Tracks storing-cells bytes in telemetry; implements lazy-load cell creation via stored loader. |
| src/node/storage/src/archives/package.rs | Fixes truncate cursor positioning; improves read_entry error context. |
| src/node/storage/src/archives/file_maps.rs | Refactors archive GC and log formatting; introduces rollback reinsertion logic. |
| src/node/storage/src/archives/archive_manager.rs | Adds timing instrumentation via time_checker! around archive operations. |
| src/node/src/sync.rs | Adjusts archive download queue logic and adds detailed downloads accounting logs. |
| src/node/src/main.rs | Updates jemalloc config symbol/name and decay settings; adds panic hook to trigger shutdown. |
| src/node/src/internal_db/mod.rs | Updates shardstate callback interface and call sites. |
| src/node/src/full_node/apply_block.rs | Switches fast merkle-update apply to apply_lazy_unchecked with an added old-hash check. |
| src/node/src/engine.rs | Adds telemetry metrics (cells MB, jemalloc stats), Linux /proc/self/status logger, and more timing instrumentation. |
| src/node/src/engine_traits.rs | Extends EngineTelemetry with cell/jemalloc metrics. |
| src/node/src/collator_test_bundle.rs | Updates telemetry construction to include new metrics. |
| src/node/Cargo.toml | Adds Tokio signal feature; adds optional tikv-jemalloc-ctl dependency and updates jemalloc feature. |
| src/Cargo.lock | Locks new dependencies (tikv-jemalloc-ctl, paste). |
| src/block/src/tests/test_merkle_update.rs | Adds tests covering apply_lazy_unchecked behavior and laziness properties. |
| src/block/src/merkle_update.rs | Extends CellsFactory with lazy creation; implements apply_lazy_unchecked and lazy traversal behavior. |
| src/block/src/cell/tests/test_cell.rs | Adds extensive LazyLoadCell tests, including concurrency and mask handling. |
| src/block/src/cell/mod.rs | Implements LazyLoadCell variant, heap cell byte counters, and safer deallocation on early errors. |
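The `dynamic_boc_rc_db.rs` row describes deferring counter-cache mutations until after the RocksDB commit. A minimal sketch of that ordering follows, using hypothetical stand-in types (`Batch`, `Store`, `CounterCache`, `save_counters`) rather than the RocksDB API or the PR's actual code: updates are staged, the batch is committed, and the in-memory cache is touched only if the commit succeeded.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for a database write batch (not the RocksDB API).
#[derive(Default)]
pub struct Batch {
    puts: Vec<(String, u32)>,
}

/// Hypothetical store whose commit can be made to fail, to show the ordering.
#[derive(Default)]
pub struct Store {
    pub persisted: HashMap<String, u32>,
    pub fail_next_commit: bool,
}

impl Store {
    /// Applies the whole batch or nothing.
    pub fn commit(&mut self, batch: Batch) -> Result<(), String> {
        if self.fail_next_commit {
            self.fail_next_commit = false;
            return Err("commit failed".to_string());
        }
        self.persisted.extend(batch.puts);
        Ok(())
    }
}

/// In-memory reference-counter cache that should mirror what is on disk.
#[derive(Default)]
pub struct CounterCache {
    pub counters: HashMap<String, u32>,
}

/// Stages counter updates, commits them, and updates the cache only afterwards.
pub fn save_counters(
    store: &mut Store,
    cache: &mut CounterCache,
    updates: &[(&str, u32)],
) -> Result<(), String> {
    let mut batch = Batch::default();
    let mut pending = Vec::with_capacity(updates.len());
    for (key, value) in updates {
        batch.puts.push((key.to_string(), *value));
        pending.push((key.to_string(), *value));
    }
    // If the commit fails we return here and the cache is left untouched.
    store.commit(batch)?;
    cache.counters.extend(pending);
    Ok(())
}
```

With this ordering a failed commit cannot leave the cache ahead of the database, so a retry starts from a consistent view.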
Comments suppressed due to low confidence (1)
src/node/src/full_node/apply_block.rs:166
- The fast path now uses `merkle_update.apply_lazy_unchecked(&cf)`, which (by design) skips `MerkleUpdate::check(...)` and therefore does not validate that pruned branches in the update are contained in the previous state. The added `old_hash` check only verifies the root hash, not branch consistency. This can allow a malformed (or adversarial) merkle update to be accepted and persisted, potentially leading to later panics/failed loads when the lazy branches are accessed, or to an incorrect state being stored. Consider running `merkle_update.check(&prev_ss_root, ...)` before the unchecked apply, or using a fast path that still performs the consistency check (even if it produces lazy cells).
let merkle_update = block.block()?.read_state_update()?;
if merkle_update.old_hash != *prev_ss_root.repr_hash() {
    fail!(
        "Merkle update old_hash mismatch with prev state for block {}: \
        update.old_hash = {:x}, prev_ss_root.repr_hash = {:x}",
        block.id(),
        merkle_update.old_hash,
        prev_ss_root.repr_hash()
    );
}
let block_id = block.id().clone();
let engine_cloned = engine.clone();
let block_descr_clone = block_descr.clone();
let ss = tokio::task::spawn_blocking(move || -> Result<Arc<ShardStateStuff>> {
    let now = std::time::Instant::now();
    let cf = engine_cloned.db_cells_factory()?;
    let mut fast_attempt = true;
    let (ss_root, _metrics) =
        match merkle_update.apply_lazy_unchecked(&cf) {
            Err(e) => {
                log::debug!(
                    "Failed the fast attempt of Merkle update applying for block {}: {}. Trying classic approach...",
                    block_id, e
                );
                fast_attempt = false;
                merkle_update.apply_with_factory(&prev_ss_root, &cf).map_err(|e| {
                    error!(
                        "Error applying Merkle update for block {}: {}\
                        prev_ss_root: {:#.2}\
                        merkle_update: {}",
                        block_id, e, prev_ss_root, merkle_update
                    )
                })?
            }
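The gap the comment describes, between a root-hash check and a branch consistency check, can be shown with a toy model. The sketch below is an assumption-laden illustration, not the semantics of the real `MerkleUpdate::check`: `PrevState`, `Update`, `check_root`, `check_branches`, and `validate_before_fast_apply` are hypothetical names, and the previous state is reduced to a flat set of cell hashes.

```rust
use std::collections::HashSet;

/// Hypothetical 32-byte hash type used only in this sketch.
pub type Hash = [u8; 32];

/// Toy previous state: its root hash plus the hashes of all cells it contains.
pub struct PrevState {
    pub root_hash: Hash,
    pub cell_hashes: HashSet<Hash>,
}

/// Toy Merkle update: the expected old root and the pruned branches it references.
pub struct Update {
    pub old_hash: Hash,
    pub pruned_branches: Vec<Hash>,
}

/// Root-only check, analogous to the `old_hash` comparison in the excerpt.
pub fn check_root(update: &Update, prev: &PrevState) -> Result<(), String> {
    if update.old_hash != prev.root_hash {
        return Err("old_hash does not match previous state root".to_string());
    }
    Ok(())
}

/// Branch consistency check: every pruned branch must exist in the previous state.
pub fn check_branches(update: &Update, prev: &PrevState) -> Result<(), String> {
    for (i, hash) in update.pruned_branches.iter().enumerate() {
        if !prev.cell_hashes.contains(hash) {
            return Err(format!("pruned branch #{i} is not part of the previous state"));
        }
    }
    Ok(())
}

/// Runs both checks; only an update passing both would proceed to an unchecked fast path.
pub fn validate_before_fast_apply(update: &Update, prev: &PrevState) -> Result<(), String> {
    check_root(update, prev)?;
    check_branches(update, prev)
}
```

An update whose `old_hash` matches but which references an unknown pruned branch passes `check_root` and fails `check_branches`, which is the case the root-only guard would let through.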
mnogoborec approved these changes on May 19, 2026