perf: improve daemon sync speed #65
Merged
Conversation
Build Artifacts: 10 succeeded, 0 failed
ngeojiajun (Contributor) approved these changes · Mar 23, 2026
These seem like reasonable changes. Did you make any measurements for how much this actually improves the sync time?

Author (Member):
I've asked Rob, Grey and others to test this out - will update the PR description when I have the results.
…ing adaptive sync Increases BLOCKS_SYNCHRONIZING_DEFAULT_COUNT from 20 to 100 for a 5x throughput improvement, implements real adaptive sync logic in get_block_sync_size() that scales up to 1000 blocks/batch when far behind the chain tip, and raises P2P_DEFAULT_SYNC_SEARCH_CONNECTIONS_COUNT from 2 to 4 to double peer discovery parallelism. Closes #64
The adaptive sync introduced COMMAND_RPC_GET_BLOCKS_FAST_MAX_COUNT (1000) as the ceiling for P2P block request batches, but the P2P protocol enforces CURRENCY_PROTOCOL_MAX_OBJECT_REQUEST_COUNT (500). Nodes requesting >500 blocks are immediately dropped by all peers, breaking initial sync from genesis. Cap the adaptive batch size at CURRENCY_PROTOCOL_MAX_OBJECT_REQUEST_COUNT and move that constant to cryptonote_config.h where it belongs alongside the other sync-related constants.
CNA v3/v4/v5 PoW variants look up block data at height-256 using a fresh read-only LMDB transaction. When blocks are added in a batch write transaction, that read-only txn cannot see uncommitted blocks in the same batch. With a 500-block batch starting below HF9 (height 240500), any block at position >= 257 in the batch will attempt to read block data that is still uncommitted, causing MDB_NOTFOUND. The PoW hash then uses garbage stack data, the block fails the "enough proof of work" check, and the peer is dropped and blocked. Fix: cap the adaptive sync batch size at 256 (BLOCKS_SYNCHRONIZING_SAFE_BATCH_COUNT). A batch of <=256 blocks can only look up data at heights strictly below the batch start, which are always committed. The P2P protocol limit (CURRENCY_PROTOCOL_MAX_OBJECT_REQUEST_COUNT=500) is unchanged; 256 is well within that limit.
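The capped adaptive sizing described in these two commits can be sketched as follows. The function and constant names come from the commit messages, but the parameters and the exact piecewise thresholds are an illustration of the described behaviour (up to 256 blocks/batch when far behind, scaling down near the target, default of 100 at tip), not the project's actual code:

```cpp
#include <cassert>
#include <cstdint>

// Constants with the values stated in the PR description.
constexpr uint64_t BLOCKS_SYNCHRONIZING_DEFAULT_COUNT    = 100;
constexpr uint64_t BLOCKS_SYNCHRONIZING_SAFE_BATCH_COUNT = 256; // CNA v3/v4/v5 read at height - 256

// Sketch of the adaptive sizing: large batches when far behind the tip,
// exactly blocks_behind in the 100..256 window, and the default near the tip.
// The 256 cap keeps every height - 256 lookup below the batch start, so the
// looked-up blocks are always already committed.
uint64_t get_block_sync_size(uint64_t our_height, uint64_t target_height)
{
  const uint64_t blocks_behind =
      target_height > our_height ? target_height - our_height : 0;

  if (blocks_behind > BLOCKS_SYNCHRONIZING_SAFE_BATCH_COUNT)
    return BLOCKS_SYNCHRONIZING_SAFE_BATCH_COUNT;
  if (blocks_behind > BLOCKS_SYNCHRONIZING_DEFAULT_COUNT)
    return blocks_behind; // scale down smoothly as we approach the target
  return BLOCKS_SYNCHRONIZING_DEFAULT_COUNT;
}
```

Note that 256 is also well under the P2P protocol ceiling of CURRENCY_PROTOCOL_MAX_OBJECT_REQUEST_COUNT (500), so no peer drops the request.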
build_block_cache stores mdb_block_info (96 bytes/entry) but get_cna_v4_data and get_cna_v5_data only read four fields: hash, timestamp, diff_lo, and coins (56 bytes total). Introduce a compact block_cache_data struct with only those fields, replacing std::vector<mdb_block_info> with std::vector<block_cache_data>.
Cache size at height 930k: ~49 MB (down from ~85 MB).
Cache size at current tip (~4.1M): ~218 MB (down from ~374 MB).
On CPUs with >=48 MB L3, the full 930k cache now fits in L3, restoring compute-bound behaviour that was lost when the cache grew past L3 capacity at the HF12 boundary.
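A minimal sketch of the compact cache entry. The four field names are taken from the commit message; the hash32 stand-in type and the exact field ordering are assumptions (the real code would use the project's crypto::hash):

```cpp
#include <cassert>
#include <cstdint>

// Assumed 32-byte hash type standing in for the project's crypto::hash.
struct hash32 { unsigned char data[32]; };

// Compact cache entry holding only the four fields read by the CNA v4/v5
// data builders, per the commit message: hash, timestamp, the low word of
// cumulative difficulty, and already-generated coins.
struct block_cache_data
{
  hash32   hash;       // 32 bytes
  uint64_t timestamp;  //  8 bytes
  uint64_t diff_lo;    //  8 bytes
  uint64_t coins;      //  8 bytes
};                      // 56 bytes total, down from mdb_block_info's 96

// 32 is a multiple of alignof(uint64_t), so there is no padding: the
// struct really is 56 bytes per entry.
static_assert(sizeof(block_cache_data) == 56, "compact entry should be 56 bytes");
```

At 930k entries, 56 bytes/entry gives roughly 49 MiB, matching the figure above.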
Force-pushed 31b094e to ba8931a
…ost-HF12 This caches the median computed in get_next_long_term_block_weight and passes it to update_next_cumulative_weight_limit, avoiding a duplicate 50k-window calculation per block. Improves post-HF12 sync speed by ~37%.
…ong-term weight median
…reads per block get_cna_v2_data previously called get_block_hash_from_height() five times, each requiring a random LMDB read. During initial sync these are cold reads that stall the CPU waiting on I/O. build_block_cache() already loads a contiguous window of block hashes into memory for get_cna_v4_data and get_cna_v5_data. Calling it at the start of get_cna_v2_data (a no-op if the cache is already current) and reading h0–h4 from m_block_cache eliminates all 5 LMDB reads per block for CNA v2 PoW. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
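The in-memory lookup pattern described above can be sketched like this. block_cache, hash_at, and load_cna_v2_hashes are hypothetical stand-ins (the real code reads h0–h4 from m_block_cache inside the LMDB backend); the point is the contiguous-window indexing that replaces five random LMDB reads:

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Stand-in for crypto::hash (32 bytes).
struct hash32 { unsigned char data[32]; };

// Hypothetical stand-in for the m_block_cache window: a contiguous run of
// block hashes starting at base_height, loaded once by build_block_cache().
struct block_cache
{
  uint64_t base_height = 0;
  std::vector<hash32> hashes;

  // In-memory lookup replacing a random get_block_hash_from_height() LMDB read.
  const hash32& hash_at(uint64_t height) const
  {
    return hashes[height - base_height];
  }
};

// For illustration, take h0..h4 to be the five heights directly below the
// current block; reading them from the cache avoids five cold LMDB reads.
void load_cna_v2_hashes(const block_cache& c, uint64_t height, hash32 out[5])
{
  for (int i = 0; i < 5; ++i)
    out[i] = c.hash_at(height - 1 - i);
}
```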
sn1f3rt added a commit that referenced this pull request · Apr 6, 2026
The daemon's getblocks.bin RPC handler returns up to COMMAND_RPC_GET_BLOCKS_FAST_MAX_COUNT (1000) blocks per request. The wallet held m_daemon_rpc_mutex for the entire duration of each response, meaning one 1000-block fetch (potentially several seconds on Windows) blocked all other wallet RPC operations — balance queries, height updates, etc. — causing the GUI to show timeouts during wallet scan. Note: the adaptive sync cap introduced in PR #65 applies only to P2P block propagation via get_block_sync_size(); it has no effect on the wallet getblocks.bin endpoint, which always used the 1000-block server default. Fix: add an optional max_count field to COMMAND_RPC_GET_BLOCKS_FAST request (0 = server default, backwards-compatible). The wallet sets it to BLOCKS_SYNCHRONIZING_DEFAULT_COUNT (100), reducing each mutex hold to a fraction of its previous duration and allowing status calls to interleave between batches.
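The server-side clamp implied by the optional field can be sketched as follows. COMMAND_RPC_GET_BLOCKS_FAST_MAX_COUNT and the 0-means-default convention come from the commit message; effective_block_count is a hypothetical helper name, not the project's actual handler:

```cpp
#include <cassert>
#include <cstdint>
#include <algorithm>

// Server-side ceiling stated in the commit message.
constexpr uint64_t COMMAND_RPC_GET_BLOCKS_FAST_MAX_COUNT = 1000;

// Clamp the batch size using the optional client-supplied max_count.
// 0 means "server default", so old clients that never serialize the
// field behave exactly as before (backwards-compatible).
uint64_t effective_block_count(uint64_t requested_max_count)
{
  if (requested_max_count == 0)
    return COMMAND_RPC_GET_BLOCKS_FAST_MAX_COUNT;
  return std::min(requested_max_count, COMMAND_RPC_GET_BLOCKS_FAST_MAX_COUNT);
}
```

The wallet then sends max_count = 100, so each response (and each mutex hold) covers a tenth of the previous batch size.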
…ailure at HF7 (block 173500) build_block_cache was always opening a new read-only LMDB transaction, which cannot see blocks written in the current uncommitted batch transaction. At HF7 (block 173500), CNA v2 PoW is first activated and calls build_block_cache to read the preceding blocks — all of which are in the active batch and invisible to a read-only txn. This caused a DB error, block rejection, peer ban, and sync halt. Fix: reuse m_write_txn when a batch is active, matching the established TXN_BLOCK_PREFIX pattern used elsewhere in this file. Also move m_block_cache_height.store() inside the critical section to close a secondary race condition.
m_prepare_height is reset to 0 inside prepare_handle_incoming_blocks before the function returns, so by the time handle_block_to_main_chain runs, the condition `new_height < m_prepare_height + m_prepare_nblocks` reduces to `new_height < m_prepare_nblocks`, which is false for any block past the genesis batch. Introduce m_batch_start_height, set alongside m_prepare_nblocks at the start of the prepare phase and not reset mid-function, so the mid_batch guard correctly covers all non-final blocks in each sync batch.
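A minimal sketch of the corrected guard. The member names follow the commit message; the batch_state wrapper and the convention that new_height is the chain height after adding a block are assumptions for illustration:

```cpp
#include <cassert>
#include <cstdint>

// m_prepare_height was reset to 0 before handle_block_to_main_chain ran;
// m_batch_start_height is set once at the start of the prepare phase and
// survives, so the comparison below stays sound for the whole batch.
struct batch_state
{
  uint64_t m_batch_start_height = 0; // first height in the current sync batch
  uint64_t m_prepare_nblocks = 0;    // number of blocks in the batch

  // True for every block except the final one in the batch (for the final
  // block, new_height == m_batch_start_height + m_prepare_nblocks).
  bool mid_batch(uint64_t new_height) const
  {
    return m_prepare_nblocks > 0 &&
           new_height < m_batch_start_height + m_prepare_nblocks;
  }
};
```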
…d long-term median Using 0 as "not computed" silently misfires if the long-term median ever evaluates to exactly zero (theoretically impossible in a live chain, but a hidden invariant). Replace with std::optional<uint64_t> so presence is explicit, and clean up the NULL/nullptr inconsistency in the default arg.
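The sentinel-to-optional change can be sketched like this. The function names follow the commits; the signatures, the stand-in median computation, and the placeholder limit formula are assumptions, not the project's actual code:

```cpp
#include <cassert>
#include <cstdint>
#include <optional>

// Stand-in for the expensive 50k-window pass over block weights.
uint64_t compute_long_term_median_50k_window() { return 300000; }

// Caller passes the median it already computed; std::nullopt means
// "not computed", so a genuine median of 0 is no longer misread as absent
// (the old uint64_t == 0 sentinel conflated the two).
uint64_t update_next_cumulative_weight_limit(
    std::optional<uint64_t> cached_long_term_median = std::nullopt)
{
  const uint64_t median = cached_long_term_median
                              ? *cached_long_term_median
                              : compute_long_term_median_50k_window();
  return median * 2; // placeholder for the real weight-limit formula
}
```

With the sentinel, passing 0 would have silently triggered a recomputation; with std::optional, an explicit 0 is honoured and only std::nullopt falls back to the full pass.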
sn1f3rt added a commit that referenced this pull request · May 12, 2026
Summary
P2P / network throughput
- BLOCKS_SYNCHRONIZING_DEFAULT_COUNT from 20 → 100
- get_block_sync_size(): requests up to 256 blocks/batch when far behind the tip, scaling down to exactly blocks_behind when within 100–256 blocks of the target, then falling back to 100 at tip
- BLOCKS_SYNCHRONIZING_SAFE_BATCH_COUNT (256) — CNA v3/v4/v5 look up block data at height - 256; a larger uncommitted write batch would make those lookups invisible to new read-only transactions
- P2P_DEFAULT_SYNC_SEARCH_CONNECTIONS_COUNT from 2 → 4
- P2P_DEFAULT_LIMIT_RATE_UP from 2048 → 8192 KB/s to match the download rate limit (the previous asymmetry was causing the outbound throttle to fire on seed nodes)
- Moved CURRENCY_PROTOCOL_MAX_OBJECT_REQUEST_COUNT from cryptonote_protocol_handler.h into cryptonote_config.h

LMDB / block cache
- build_block_cache now reuses the active batch write transaction instead of opening a fresh read-only txn that cannot see uncommitted blocks
- block_cache_data compact struct (96 → 56 bytes per entry), storing only the four fields accessed by get_cna_v4_data/get_cna_v5_data — reduces L3 cache pressure at lower chain heights

Weight limit / median computation
- get_block_hash_from_height LMDB reads per block eliminated on the incremental path
- Passes the median computed in get_next_long_term_block_weight directly into update_next_cumulative_weight_limit to avoid recomputing it; uses std::optional<uint64_t> to make presence explicit (replaces a uint64_t == 0 sentinel)
- Non-final blocks in a sync batch skip the recomputation (mid_batch); the 50k-block long-term median is stable within a 256-block batch, so the stale limit safely covers intra-batch validation

Closes #64