
perf: improve daemon sync speed#65

Merged
sn1f3rt merged 12 commits into master from fix/slow-daemon-sync
May 12, 2026

Conversation

@sn1f3rt
Member

@sn1f3rt sn1f3rt commented Mar 22, 2026

Summary

P2P / network throughput

  • Increases BLOCKS_SYNCHRONIZING_DEFAULT_COUNT from 20 → 100
  • Implements adaptive sync in get_block_sync_size(): requests up to 256 blocks/batch when far behind the tip, scaling down to exactly blocks_behind when within 100–256 blocks of the target, then falling back to 100 at tip
  • Caps batch size at BLOCKS_SYNCHRONIZING_SAFE_BATCH_COUNT (256) — CNA v3/v4/v5 look up block data at height - 256; a larger uncommitted write batch would make those lookups invisible to new read-only transactions
  • Raises P2P_DEFAULT_SYNC_SEARCH_CONNECTIONS_COUNT from 2 → 4
  • Raises P2P_DEFAULT_LIMIT_RATE_UP from 2048 → 8192 KB/s to match the download rate limit (the previous asymmetry was causing the outbound throttle to fire on seed nodes)
  • Moves CURRENCY_PROTOCOL_MAX_OBJECT_REQUEST_COUNT from cryptonote_protocol_handler.h into cryptonote_config.h
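The adaptive sizing described above can be sketched as follows. The constant values come from the summary; the exact branch structure is an assumption for illustration, not the PR's actual code:

```cpp
#include <cstdint>

// Constants named in the PR summary.
constexpr uint64_t BLOCKS_SYNCHRONIZING_DEFAULT_COUNT    = 100;
constexpr uint64_t BLOCKS_SYNCHRONIZING_SAFE_BATCH_COUNT = 256;

// Sketch of the adaptive logic: far behind the tip -> 256-block
// batches; within 100..256 blocks of the target -> exactly
// blocks_behind; at or near the tip -> the 100-block default.
uint64_t get_block_sync_size(uint64_t height, uint64_t target_height)
{
  if (target_height <= height)
    return BLOCKS_SYNCHRONIZING_DEFAULT_COUNT;
  const uint64_t blocks_behind = target_height - height;
  if (blocks_behind > BLOCKS_SYNCHRONIZING_SAFE_BATCH_COUNT)
    return BLOCKS_SYNCHRONIZING_SAFE_BATCH_COUNT;
  if (blocks_behind >= BLOCKS_SYNCHRONIZING_DEFAULT_COUNT)
    return blocks_behind;
  return BLOCKS_SYNCHRONIZING_DEFAULT_COUNT;
}
```

The effect is that a node syncing from genesis requests full 256-block batches until it closes within 256 blocks of the target, then shrinks the final requests to fit.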

LMDB / block cache

  • Fixes sync crash at HF7 (block 173500): build_block_cache now reuses the active batch write transaction instead of opening a fresh read-only txn that cannot see uncommitted blocks
  • Introduces block_cache_data compact struct (96 → 56 bytes per entry), storing only the four fields accessed by get_cna_v4_data / get_cna_v5_data — reduces L3 cache pressure at lower chain heights
  • Eliminates 5 random LMDB reads per block for CNA v2 hash lookups by routing them through the block cache

Weight limit / median computation

  • Replaces hash-based long-term weight cache validation with height comparison, saving one get_block_hash_from_height LMDB read per block on the incremental path
  • Passes the long-term median computed in get_next_long_term_block_weight directly into update_next_cumulative_weight_limit to avoid recomputing it; uses std::optional<uint64_t> to make presence explicit (replaces a uint64_t == 0 sentinel)
  • Skips short-term weight median recomputation for non-final blocks within a sync batch (mid_batch); the 50k-block long-term median is stable within a 256-block batch, so the stale limit safely covers intra-batch validation

Closes #64

@sn1f3rt sn1f3rt added the bug Something isn't working label Mar 22, 2026
@sn1f3rt sn1f3rt self-assigned this Mar 22, 2026
@sn1f3rt sn1f3rt changed the title fix: improve daemon sync speed perf: improve daemon sync speed Mar 22, 2026
@sn1f3rt sn1f3rt added enhancement New feature or request and removed bug Something isn't working labels Mar 22, 2026
@github-actions

github-actions Bot commented Mar 22, 2026

Build Artifacts

Target Download
nerva-linux-armv7 Download
nerva-linux-armv8 Download
nerva-windows-x32 Download
nerva-linux-i686 Download
nerva-windows-x64 Download
nerva-linux-x86_64 Download
nerva-macos-x64 Download
nerva-macos-armv8 Download
nerva-freebsd-x86_64 Download
nerva-android-armv8 Download

10 succeeded, 0 failed | View workflow run

@benevanoff
Contributor

These seem like reasonable changes. Did you make any measurements for how much this actually improves the sync time?

@sn1f3rt
Member Author

sn1f3rt commented Mar 23, 2026

These seem like reasonable changes. Did you make any measurements for how much this actually improves the sync time?

I've asked Rob, Grey and others to test this out - will update the PR description when I have the results.

sn1f3rt added 5 commits March 24, 2026 21:55
…ing adaptive sync

Increases BLOCKS_SYNCHRONIZING_DEFAULT_COUNT from 20 to 100 for a 5x
throughput improvement, implements real adaptive sync logic in
get_block_sync_size() that scales up to 1000 blocks/batch when far
behind the chain tip, and raises P2P_DEFAULT_SYNC_SEARCH_CONNECTIONS_COUNT
from 2 to 4 to double peer discovery parallelism.

Closes #64
The adaptive sync introduced COMMAND_RPC_GET_BLOCKS_FAST_MAX_COUNT (1000)
as the ceiling for P2P block request batches, but the P2P protocol enforces
CURRENCY_PROTOCOL_MAX_OBJECT_REQUEST_COUNT (500). Nodes requesting >500 blocks
are immediately dropped by all peers, breaking initial sync from genesis.

Cap the adaptive batch size at CURRENCY_PROTOCOL_MAX_OBJECT_REQUEST_COUNT and
move that constant to cryptonote_config.h where it belongs alongside the other
sync-related constants.
CNA v3/v4/v5 PoW variants look up block data at height-256 using a
fresh read-only LMDB transaction. When blocks are added in a batch
write transaction, that read-only txn cannot see uncommitted blocks
in the same batch.

With a 500-block batch starting below HF9 (height 240500), any block
at position >= 257 in the batch will attempt to read block data that
is still uncommitted, causing MDB_NOTFOUND. The PoW hash then uses
garbage stack data, the block fails the "enough proof of work" check,
and the peer is dropped and blocked.

Fix: cap the adaptive sync batch size at 256 (BLOCKS_SYNCHRONIZING_SAFE_BATCH_COUNT).
A batch of <=256 blocks can only look up data at heights strictly
below the batch start, which are always committed.

The P2P protocol limit (CURRENCY_PROTOCOL_MAX_OBJECT_REQUEST_COUNT=500)
is unchanged; 256 is well within that limit.
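The visibility argument above can be checked with a little arithmetic (zero-based positions here, where the commit text counts from 1; names are illustrative):

```cpp
#include <cstdint>

// CNA v3/v4/v5 read block data at (height - 256). For a batch starting
// at height `batch_start`, the block at zero-based position `pos` has
// height batch_start + pos and looks up batch_start + pos - 256. That
// lookup is still inside the uncommitted batch (invisible to a fresh
// read-only txn) exactly when it lands at or above batch_start,
// i.e. when pos >= 256.
constexpr uint64_t POW_LOOKBACK = 256;

constexpr bool lookup_hits_uncommitted_batch(uint64_t pos)
{
  return pos >= POW_LOOKBACK;
}

// Hence a batch is safe iff every position is below 256,
// i.e. batch_size <= 256.
constexpr bool batch_is_safe(uint64_t batch_size)
{
  return batch_size <= POW_LOOKBACK;
}
```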
build_block_cache stores mdb_block_info (96 bytes/entry) but
get_cna_v4_data and get_cna_v5_data only read four fields:
hash, timestamp, diff_lo, and coins (56 bytes total).

Introduce a compact block_cache_data struct with only those fields,
replacing std::vector<mdb_block_info> with std::vector<block_cache_data>.

Cache size at height 930k: ~49 MB (down from ~85 MB)
Cache size at current tip (~4.1M): ~218 MB (down from ~374 MB)

On CPUs with >=48 MB L3, the full 930k cache now fits in L3,
restoring compute-bound behaviour that was lost when the cache grew
past L3 capacity at the HF12 boundary.
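A minimal sketch of the compact entry. The PR names the four retained fields (hash, timestamp, diff_lo, coins); the exact field types below are assumptions chosen to match the stated 56-byte total:

```cpp
#include <cstdint>

// Hypothetical layout of the compact cache entry: 32-byte hash plus
// three 8-byte fields = 56 bytes, versus 96 bytes for mdb_block_info.
struct block_cache_data
{
  unsigned char hash[32]; // block hash
  uint64_t      timestamp; // block timestamp
  uint64_t      diff_lo;   // low 64 bits of cumulative difficulty
  uint64_t      coins;     // already-generated coins
};

static_assert(sizeof(block_cache_data) == 56, "compact entry is 56 bytes");

// Approximate cache footprint (MiB) at a given chain height.
constexpr double cache_mb(uint64_t height, uint64_t entry_size)
{
  return double(height) * double(entry_size) / (1024.0 * 1024.0);
}
```

Plugging in the heights from the commit message: 930k entries at 56 bytes is about 49.7 MiB, and at 96 bytes about 85.1 MiB, matching the stated figures.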
@sn1f3rt sn1f3rt force-pushed the fix/slow-daemon-sync branch from 31b094e to ba8931a March 24, 2026 16:25
R0BC0D3R and others added 4 commits March 29, 2026 16:08
…ost-HF12

This caches the median computed in get_next_long_term_block_weight and passes it to update_next_cumulative_weight_limit, avoiding a duplicate 50k-window calculation per block. Improves post-HF12 sync speed by ~37%.
…reads per block

get_cna_v2_data previously called get_block_hash_from_height() five times,
each requiring a random LMDB read. During initial sync these are cold reads
that stall the CPU waiting on I/O.

build_block_cache() already loads a contiguous window of block hashes into
memory for get_cna_v4_data and get_cna_v5_data. Calling it at the start of
get_cna_v2_data (a no-op if the cache is already current) and reading h0–h4
from m_block_cache eliminates all 5 LMDB reads per block for CNA v2 PoW.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
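The cache read described in this commit can be sketched as below. Types and names are illustrative, not the real DB layer; the point is that h0–h4 become array indexes into a contiguous window rather than five random LMDB reads:

```cpp
#include <array>
#include <cstdint>
#include <vector>

using hash32 = std::array<unsigned char, 32>;

// Toy model of the hash window that build_block_cache() maintains.
struct block_cache
{
  uint64_t            base_height = 0; // height of hashes[0]
  std::vector<hash32> hashes;          // contiguous window of hashes

  // Caller ensures `height` lies inside the window; build_block_cache
  // is a no-op when the cache is already current.
  const hash32& hash_at(uint64_t height) const
  {
    return hashes.at(height - base_height);
  }
};

// Demo helper: fill a small window and read one entry back by height.
inline unsigned char first_byte_at(uint64_t height)
{
  block_cache c;
  c.base_height = 100;
  c.hashes.resize(5);
  c.hashes[3][0] = 0xAB; // hash for height 103
  return c.hash_at(height)[0];
}
```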
sn1f3rt added a commit that referenced this pull request Apr 6, 2026
The daemon's getblocks.bin RPC handler returns up to
COMMAND_RPC_GET_BLOCKS_FAST_MAX_COUNT (1000) blocks per request. The
wallet held m_daemon_rpc_mutex for the entire duration of each response,
meaning one 1000-block fetch (potentially several seconds on Windows)
blocked all other wallet RPC operations — balance queries, height
updates, etc. — causing the GUI to show timeouts during wallet scan.

Note: the adaptive sync cap introduced in PR #65 applies only to P2P
block propagation via get_block_sync_size(); it has no effect on the
wallet getblocks.bin endpoint, which always used the 1000-block server
default.

Fix: add an optional max_count field to COMMAND_RPC_GET_BLOCKS_FAST
request (0 = server default, backwards-compatible). The wallet sets it
to BLOCKS_SYNCHRONIZING_DEFAULT_COUNT (100), reducing each mutex hold
to a fraction of its previous duration and allowing status calls to
interleave between batches.
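A sketch of the server-side clamp implied by this commit message (function name is illustrative; the constant is from the text). The 0-means-default convention keeps old clients unaffected:

```cpp
#include <algorithm>
#include <cstdint>

constexpr uint64_t COMMAND_RPC_GET_BLOCKS_FAST_MAX_COUNT = 1000;

// max_count == 0 preserves the old behaviour, so clients that never
// set the new field see no change; any other value is honoured up to
// the server's hard ceiling.
uint64_t effective_block_count(uint64_t max_count)
{
  if (max_count == 0)
    return COMMAND_RPC_GET_BLOCKS_FAST_MAX_COUNT;
  return std::min(max_count, COMMAND_RPC_GET_BLOCKS_FAST_MAX_COUNT);
}
```

With the wallet sending max_count = 100, each response shrinks to a tenth of the old worst case, which is what bounds the mutex hold time.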
R0BC0D3R and others added 3 commits April 15, 2026 16:56
…ailure at HF7 (block 173500)

build_block_cache was always opening a new read-only LMDB transaction,
which cannot see blocks written in the current uncommitted batch
transaction. At HF7 (block 173500), CNA v2 PoW is first activated and
calls build_block_cache to read the preceding blocks — all of which are
in the active batch and invisible to a read-only txn. This caused a DB
error, block rejection, peer ban, and sync halt.

Fix: reuse m_write_txn when a batch is active, matching the established
TXN_BLOCK_PREFIX pattern used elsewhere in this file. Also move
m_block_cache_height.store() inside the critical section to close a
secondary race condition.
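The visibility rule behind the fix can be modelled with a toy store. This is a deliberately simplified stand-in for LMDB snapshot semantics, not the real DB code: a read-only txn sees only committed data, while reading through the active write txn also sees the uncommitted batch:

```cpp
#include <cstdint>
#include <map>

// Toy model: `committed` is what a fresh read-only txn can see;
// `pending` holds blocks written in the active batch write txn.
struct toy_db
{
  std::map<uint64_t, int> committed;
  std::map<uint64_t, int> pending;
  bool batch_active = false;

  // Mirrors the fix: route reads through the write txn when a batch
  // is active, so uncommitted blocks in the current batch are found.
  bool get(uint64_t height, int& out) const
  {
    if (batch_active)
    {
      auto it = pending.find(height);
      if (it != pending.end()) { out = it->second; return true; }
    }
    auto it = committed.find(height);
    if (it == committed.end()) return false; // MDB_NOTFOUND analogue
    out = it->second;
    return true;
  }
};

// Demo helper: is `height` visible given the txn-selection rule?
inline bool visible(uint64_t height, bool batch_active)
{
  toy_db db;
  db.batch_active   = batch_active;
  db.committed[100] = 1;
  db.pending[101]   = 2; // written in the current uncommitted batch
  int v = 0;
  return db.get(height, v);
}
```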
m_prepare_height is reset to 0 inside prepare_handle_incoming_blocks
before the function returns, so by the time handle_block_to_main_chain
runs the condition `new_height < m_prepare_height + m_prepare_nblocks`
reduces to `new_height < m_prepare_nblocks`, which is false for any
block past the genesis batch.

Introduce m_batch_start_height, set alongside m_prepare_nblocks at the
start of the prepare phase and not reset mid-function, so the mid_batch
guard correctly covers all non-final blocks in each sync batch.
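A sketch of the corrected guard (simplified state, illustrative names): m_batch_start_height is set once at the start of the prepare phase and never reset mid-function, unlike m_prepare_height, which was zeroed before the check ran:

```cpp
#include <cstdint>

struct batch_state
{
  uint64_t batch_start_height = 0; // set at prepare, not reset mid-batch
  uint64_t prepare_nblocks    = 0; // blocks in the current sync batch

  // True for every non-final block in the current batch: the batch
  // covers heights [batch_start_height, batch_start_height + nblocks),
  // and new_height after adding the final block equals the upper bound.
  bool mid_batch(uint64_t new_height) const
  {
    return prepare_nblocks != 0 &&
           new_height < batch_start_height + prepare_nblocks;
  }
};
```

With the old code, the left-hand anchor collapsed to 0, so the guard was false for essentially every block and the optimization never fired.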
…d long-term median

Using 0 as "not computed" silently misfires if the long-term median ever
evaluates to exactly zero (theoretically impossible in a live chain, but
a hidden invariant). Replace with std::optional<uint64_t> so presence is
explicit, and clean up the NULL/nullptr inconsistency in the default arg.
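The sentinel problem can be shown in a few lines (names are illustrative, not the PR's signatures): with a `uint64_t == 0` sentinel, a median that legitimately computes to 0 is indistinguishable from "not computed", whereas `std::optional` makes absence and a zero value distinct:

```cpp
#include <cstdint>
#include <optional>

// If a precomputed median is present, use it; otherwise fall back to
// the (expensive) recomputation. A present value of 0 is honoured,
// which the old 0-as-sentinel scheme could not express.
uint64_t resolve_median(std::optional<uint64_t> precomputed,
                        uint64_t (*recompute)())
{
  return precomputed.has_value() ? *precomputed : recompute();
}
```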
@sn1f3rt sn1f3rt merged commit a01db4b into master May 12, 2026
14 checks passed
@sn1f3rt sn1f3rt deleted the fix/slow-daemon-sync branch May 12, 2026 18:07
sn1f3rt added a commit that referenced this pull request May 12, 2026

Labels

enhancement New feature or request


Development

Successfully merging this pull request may close these issues.

investigate: slow daemon sync times

4 participants