DeroGold: Full‐Node IBD Optimisation Proposal
Status: Concept Draft — v0.1
Author: Community RFC
Date: 2026-03
Related: [Bulletproofs Concept Draft RFC]
Syncing a DeroGold full node from genesis block zero today takes 7–10 days on modern hardware (NVMe SSD, 1 Gbps connection). The root cause is not slow hardware or a slow network — it is a combination of a structurally sequential sync pipeline and the permanent legacy of three deliberate blockchain bloat attacks between blocks ~1,100,000 and ~2,500,000, which produced a chain exceeding 600 GB at block 2,400,000.
The network went from approximately 1,000 full nodes globally to 10–15 active nodes today. The primary driver of that collapse was the hardware and time cost of running an archival node. Every new operator who wants to run a full node from genesis faces the same wall.
This proposal identifies the precise bottlenecks through codebase analysis and proposes five targeted optimisations, ranked by implementation effort and expected impact. Together, the highest-priority three are estimated to reduce a full genesis sync to 1–2 days on equivalent hardware — without requiring a consensus change or hard fork.
This section is included so future contributors understand the context behind the chain's shape. It is not a complaint — the countermeasures deployed at the time were correct given what was known. But the damage is permanently encoded in the chain and every new full-node operator inherits it.
A maketransaction.py script was used to generate as many valid transactions as possible within each block, exploiting the absence of per-transaction fees or output count limits at the time. The result was approximately 600,000 blocks of abnormally high transaction density. The transactions were valid by every rule in force at the time.
A more sophisticated two-vector attack: mass 0.01 DEGO outputs to create heavily fragmented UTXOs, combined with mass fusion transactions to consolidate them — keeping the mempool saturated and blocks full from both directions simultaneously. This coincided with the block target slowdown to 300 seconds at block 2,325,000, which meant more transactions could accumulate per block. The dynamic block size growth formula (designed for organic growth) was exploited to produce blocks averaging 1.7 MB, with peaks of 2.5 MB, sustained for an extended period.
| Height | Measure |
|---|---|
| 700,000 | MAX_EXTRA_SIZE reduced to 1,024 bytes |
| 1,470,000 | MAX_EXTRA_SIZE reduced further to 512 bytes |
| 2,325,000 | Block target increased to 300 s; fusion fee introduced |
| 2,361,823 | Hard block size cap: MAX_BLOCK_SIZE_V1 = 614,400 bytes (600 KB) |
| 2,370,000 | Transaction PoW activated (difficulty 17,000) |
All of these were correct responses. The block size cap and TX PoW ended the attacks. The chain has been clean from approximately block 2,500,000 onward. The problem is that the attack-era data is immutable — it cannot be removed from history — and every new full node must process all of it.
- Chain size at block 2,400,000: 600+ GB
- Node count: ~1,000 (pre-attack) → ~10–15 (today)
- Effective status: production-level testnet
The core sync function is processObjects() in CryptoNoteProtocolHandler.cpp:
```cpp
for (size_t index = 0; index < rawBlocks.size(); ++index)
{
    auto addResult = m_core.addBlock(cachedBlocks[index], std::move(rawBlocks[index]));
    // ... error handling ...
    m_dispatcher.yield();
}
```

This is a plain for loop. One block is written to RocksDB, then the coroutine yields to let the network stack breathe, then the next block is written. There is no pipelining between download and write. The `yield()` is a coroutine yield, not concurrent processing.
Crucially, addBlock() is called from a single peer's connection context. There is no coordinator that distributes block ranges across multiple peer connections. Each peer independently tracks its own m_needed_objects queue but they all call the same sequential addBlock().
```cpp
const size_t BLOCKS_IDS_SYNCHRONIZING_DEFAULT_COUNT = 1000; // hash requests
const uint64_t BLOCKS_SYNCHRONIZING_DEFAULT_COUNT = 100; // full block download batch
```

The dynamic rate algorithm (`adjust_block_rate()`) measures how long the last batch took and adjusts `m_next_request_block_rate`, but this governs hash requests, not full block downloads. The full block download batch is hardcoded at 100 blocks per round trip. Through the attack range, that is 4,000+ P2P round trips for Wave 2 alone (400,000 blocks ÷ 100).
For every block, the write path executes:
1. `insertSpentKeyImages`: key images for all inputs, written to RocksDB
2. `insertCachedTransaction`: tx hash → metadata, for every transaction
3. `insertKeyOutputGlobalIndexes`: per-amount output index entries, for every output
4. `insertCachedBlock`: block header metadata
5. `insertRawBlock`: the complete serialised `BlockTemplate` plus every complete serialised transaction body, stored verbatim as a single RocksDB key-value entry
Step 5 is the dominant cost in the attack range. A 1.7 MB average block means 1.7 MB written to RocksDB on every iteration of the loop. For the ~400,000 attack-era blocks in Wave 2, that is approximately 680 GB of raw blob writes to RocksDB from step 5 alone, in a sequential loop.
insertRawBlock exists so the node can serve blocks to other syncing peers via the /sync/raw and /getrawblocks P2P endpoints. It is an archival function, not a consensus function. The structured index (steps 1–4) is what the node uses for its own validation and double-spend checking. These are two separate concerns sharing one code path.
validateTransactionInputsExpensive() — the ring signature verification — is already skipped for all blocks inside a checkpoint zone:
```cpp
if (m_checkpoints.isInCheckpointZone(m_blockHeight + 1))
{
    return true;
}
```

Checkpoints exist every 5,000 blocks through the attack range and up to block 2,775,000. Ring sig verification is therefore already bypassed for ~96% of the chain. The bottleneck is not cryptographic; it is serialisation and disk I/O.
Priority: Highest | Complexity: Low | No consensus change required
Add a --compact-sync flag (or similar). When enabled, skip storage->pushBlock() — the insertRawBlock write — for all blocks inside a checkpoint zone. The structured index (key images, output indexes, block metadata) is written normally. The raw transaction blobs are discarded after indexing.
The raw blob is only needed to serve other syncing peers historical blocks via P2P. It is not needed for the node's own consensus operation. A node that has never stored blobs for blocks 0–2,775,000 can still:
- Validate all future transactions (it has the complete UTXO set and key image set)
- Mine blocks
- Participate in consensus
- Serve recent blocks (above the checkpoint floor) normally
It cannot serve the historical attack-range blocks to other peers. This is an explicit, bounded trade-off.
| Range | Avg block size | Raw blob total |
|---|---|---|
| Wave 1 (~1.1M–1.7M) | ~150 KB (tx-dense, small) | ~90 GB |
| Wave 2 (~1.9M–2.5M) | ~1.7 MB | ~1,020 GB |
| Clean history (0–1.1M, 1.7M–1.9M) | ~20 KB | ~20 GB |
Compact sync reduces the total chain storage from ~600+ GB to approximately 15–25 GB — only the structured index survives for checkpoint-covered blocks.
Eliminates the single most expensive write operation per block in the attack range. For a 1.7 MB block, writing the structured index is O(hundreds of bytes); writing the raw blob is O(1.7 MB). Removing step 5 from the attack-range loop is estimated to reduce attack-range write time by 60–80%.
A node synced with --compact-sync should be clearly identified in peer advertisements as a compact node (or equivalent), so syncing peers know to request attack-range blocks from archival nodes (i.e., nodes seeded from bootstrap.derogold.online) instead.
The existing --sync-from-height flag already establishes this pattern — nodes that used it are already non-archival for historical blocks. This is a natural extension of that concept.
- One conditional in `doPushBlock()` / `DatabaseBlockchainCache::pushBlock()`: if compact mode is enabled and `m_checkpoints.isInCheckpointZone(blockIndex)`, skip `storage->pushBlock()`
- One new CLI flag and config option
- Peer capability advertisement (handshake flag or version string extension)
- Clear documentation
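The conditional in the first bullet could look roughly like the sketch below. `SyncOptions`, `CheckpointZone`, and `shouldStoreRawBlock()` are hypothetical names introduced for illustration; `CheckpointZone` is only a stand-in for `m_checkpoints`, and its "at or below the last checkpoint" rule is an assumption about how `isInCheckpointZone()` behaves.

```cpp
#include <cstdint>

// Hypothetical settings struct; the real daemon config type is not shown in this RFC.
struct SyncOptions
{
    bool compactSync = false; // would be set by the proposed --compact-sync flag
};

// Stand-in for m_checkpoints. Assumption: a block is "in the checkpoint zone"
// when its index is at or below the highest checkpoint height.
struct CheckpointZone
{
    uint32_t lastCheckpointHeight = 0;

    bool isInCheckpointZone(uint32_t blockIndex) const
    {
        return blockIndex <= lastCheckpointHeight;
    }
};

// Decide whether the raw blob write (step 5 of the write path) should run
// for this block. The structured index writes are unaffected either way.
bool shouldStoreRawBlock(const SyncOptions &options, const CheckpointZone &checkpoints, uint32_t blockIndex)
{
    if (options.compactSync && checkpoints.isInCheckpointZone(blockIndex))
    {
        return false; // compact mode: discard the blob after indexing
    }
    return true; // archival mode, or block above the checkpoint floor
}
```

Keeping the decision in one small predicate would make it easy to unit-test separately from the RocksDB write path, and the default (`compactSync = false`) preserves archival behaviour.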
Priority: High | Complexity: Medium | No consensus change required
Accumulate write operations for N blocks (suggested: 500–1,000) into a single RocksDB WriteBatch and commit once, rather than committing per-block.
RocksDB's per-commit overhead — WAL sync, memtable pressure, compaction triggers — is significant when committing millions of small writes. Batching amortises this overhead. The safety argument: if the node crashes mid-batch, the worst case is re-syncing from the last checkpoint, which is at most 5,000 blocks back. This is acceptable during IBD. The existing checkpoint infrastructure already provides the recovery anchor.
This optimisation is only active during IBD in a checkpoint zone. At the chain tip (live operation), per-block commit semantics are preserved: losing even a single block to a crash at the tip is not acceptable in the way a bounded re-sync from a checkpoint is during IBD.
Benchmarks from RocksDB documentation and similar CryptoNote codebases suggest batched writes can be 3–10× faster than individual commits for sequential write workloads. Applied to the structured index writes (OPT-1 already removes the blob writes), this is a meaningful secondary gain.
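A minimal sketch of the batching rule described above, under stated assumptions: `IbdWriteBatcher` and its commit callback are hypothetical, and the callback stands in for a single RocksDB `WriteBatch` commit so that the example compiles without the RocksDB library. It is not the daemon's actual write path.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// One key-value write destined for the structured index.
using KeyValue = std::pair<std::string, std::string>;

// Accumulates index writes for several blocks and hands them to a commit
// callback in one call. In the daemon the callback would wrap one
// WriteBatch commit; here it is injected so the sketch is standalone.
class IbdWriteBatcher
{
public:
    using CommitFn = std::function<void(const std::vector<KeyValue> &)>;

    IbdWriteBatcher(std::size_t maxBlocksPerBatch, CommitFn commit)
        : m_maxBlocks(maxBlocksPerBatch), m_commit(std::move(commit))
    {
        assert(m_maxBlocks > 0);
    }

    // Queue one block's writes. Outside a checkpoint zone (live operation)
    // everything pending is flushed at once, preserving per-block commits.
    void addBlock(std::vector<KeyValue> blockWrites, bool inCheckpointZone)
    {
        for (auto &kv : blockWrites)
        {
            m_pending.push_back(std::move(kv));
        }
        ++m_pendingBlocks;

        if (!inCheckpointZone || m_pendingBlocks >= m_maxBlocks)
        {
            flush();
        }
    }

    // Commit whatever is pending; call on shutdown or when IBD ends.
    void flush()
    {
        if (m_pendingBlocks == 0)
        {
            return;
        }
        m_commit(m_pending);
        m_pending.clear();
        m_pendingBlocks = 0;
    }

private:
    std::size_t m_maxBlocks;
    CommitFn m_commit;
    std::vector<KeyValue> m_pending;
    std::size_t m_pendingBlocks = 0;
};
```

A fixed block count is used here for simplicity; committing at checkpoint boundaries instead would only change the flush condition in `addBlock()`.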
Priority: High | Complexity: Trivial | No consensus change required
Raise BLOCKS_SYNCHRONIZING_DEFAULT_COUNT from 100 to 500 or 1,000 when the node detects it is in IBD mode (i.e., more than N blocks behind the observed network height). Restore to 100 at the tip.
100 blocks per P2P round trip through a 400,000-block attack range means 4,000 round trips minimum. Even at 50 ms round-trip time, that is 200 seconds of pure round-trip overhead, not counting transfer time. At 500 blocks per request, this becomes 800 round trips. At 1,000 blocks, 400 round trips.
More importantly: each round trip involves the requesting node being idle while the peer reads from RocksDB, serialises, and transmits. Larger batches mean the downloader is idle less often.
The primary constraint on batch size is memory: 1,000 blocks × 1.7 MB average = ~1.7 GB buffered in RAM. A configurable limit with a sensible default (e.g., cap batch size such that in-flight data does not exceed 256 MB) would handle this gracefully.
- Runtime check: if `getTopBlockIndex() + IBD_THRESHOLD < observedNetworkHeight`, use large batch; else use standard batch
- Suggested `IBD_THRESHOLD`: 10,000 blocks
- Suggested large batch: 500 blocks (conservative) or 1,000 (aggressive, memory-permitting)
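The runtime check and the memory cap can be combined in one small function, sketched below. All constants are the values suggested in the text rather than values from the current codebase, and `selectBlockBatchSize()`, `roundTrips()`, and the `recentAvgBlockBytes` input (a running average of recent block sizes) are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Suggested values from the text; tunable assumptions, not existing constants.
constexpr uint64_t STANDARD_BATCH = 100;   // current BLOCKS_SYNCHRONIZING_DEFAULT_COUNT
constexpr uint64_t IBD_BATCH = 500;        // conservative large batch
constexpr uint64_t IBD_THRESHOLD = 10000;  // blocks behind before IBD mode applies
constexpr uint64_t MAX_IN_FLIGHT_BYTES = 256ULL * 1024 * 1024; // 256 MiB cap

// Pick how many full blocks to request in the next round trip.
uint64_t selectBlockBatchSize(uint64_t topBlockIndex, uint64_t observedNetworkHeight, uint64_t recentAvgBlockBytes)
{
    assert(recentAvgBlockBytes > 0);

    const bool inIbd = topBlockIndex + IBD_THRESHOLD < observedNetworkHeight;
    if (!inIbd)
    {
        return STANDARD_BATCH;
    }

    // Largest batch whose expected payload fits under the memory cap.
    const uint64_t memoryLimited = std::max<uint64_t>(1, MAX_IN_FLIGHT_BYTES / recentAvgBlockBytes);
    return std::min(IBD_BATCH, memoryLimited);
}

// Round trips needed to fetch blockCount blocks at a given batch size (rounded up).
constexpr uint64_t roundTrips(uint64_t blockCount, uint64_t batchSize)
{
    return (blockCount + batchSize - 1) / batchSize;
}
```

Under these assumptions a 256 MiB cap limits batches to about 157 blocks while blocks average 1.7 MB, so the full 500-block batch would only apply outside the attack range unless the cap is raised. The right cap is one of the values that needs empirical tuning.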
Priority: Medium | Complexity: Medium | No consensus change required
Decouple the download phase and the write phase into a producer-consumer pipeline. While the write thread is processing batch N, a download coroutine is already fetching batch N+1 (and optionally N+2) from the peer.
Today the flow is:
[request batch N] → [wait for peer] → [write batch N] → [request batch N+1] → ...
Both the network link and the disk are idle during each other's work. On fast hardware, the write phase and the download phase are of similar duration through the attack range (large blocks are expensive to both transfer and write). The overlap potential is high.
[request batch N+1] ──────────────────────────────→ [buffer]
[write batch N] [write batch N+1]
A bounded queue (capped at e.g. 500 MB of buffered raw blocks) between the download coroutine and the write loop prevents unbounded memory growth. If the queue is full, the download coroutine parks until the writer drains it.
On a 1 Gbps link with NVMe storage, rough estimates:
- Download time for 100 × 1.7 MB blocks ≈ 1.36 seconds
- Write time for 100 × 1.7 MB blocks (blob write) ≈ 1–3 seconds
With pipelining, these overlap. Combined with OPT-1 (no blob write) and OPT-3 (larger batches), the effective throughput approaches the faster of the two rather than the sum.
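The bounded queue between downloader and writer could be sketched as follows. This is an illustration only: `BoundedBatchQueue` is a hypothetical name, and it uses standard-library threads primitives (`std::mutex`, `std::condition_variable`), whereas the daemon's sync path runs on its own coroutine dispatcher and would need the equivalent parking primitive there.

```cpp
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <deque>
#include <mutex>
#include <string>
#include <utility>
#include <vector>

// A downloaded batch of serialised blocks (raw bytes held in std::string).
using BlockBatch = std::vector<std::string>;

// Byte-bounded FIFO queue between a downloader (producer) and the writer (consumer).
class BoundedBatchQueue
{
public:
    explicit BoundedBatchQueue(std::size_t maxBytes) : m_maxBytes(maxBytes)
    {
        assert(maxBytes > 0);
    }

    // Parks the producer while the queue is over budget. A batch larger than
    // the whole budget is still admitted when the queue is empty, to avoid deadlock.
    void push(BlockBatch batch)
    {
        const std::size_t bytes = sizeOf(batch);
        std::unique_lock<std::mutex> lock(m_mutex);
        m_notFull.wait(lock, [&] { return m_batches.empty() || m_bytes + bytes <= m_maxBytes; });
        m_bytes += bytes;
        m_batches.push_back(std::move(batch));
        m_notEmpty.notify_one();
    }

    // Parks the consumer until a batch is available, then returns the oldest one.
    BlockBatch pop()
    {
        std::unique_lock<std::mutex> lock(m_mutex);
        m_notEmpty.wait(lock, [&] { return !m_batches.empty(); });
        BlockBatch batch = std::move(m_batches.front());
        m_batches.pop_front();
        m_bytes -= sizeOf(batch);
        m_notFull.notify_one();
        return batch;
    }

    std::size_t bufferedBytes()
    {
        std::lock_guard<std::mutex> lock(m_mutex);
        return m_bytes;
    }

private:
    static std::size_t sizeOf(const BlockBatch &batch)
    {
        std::size_t total = 0;
        for (const auto &block : batch)
        {
            total += block.size();
        }
        return total;
    }

    const std::size_t m_maxBytes;
    std::mutex m_mutex;
    std::condition_variable m_notFull;
    std::condition_variable m_notEmpty;
    std::deque<BlockBatch> m_batches;
    std::size_t m_bytes = 0;
};
```

Bounding by bytes rather than by batch count matters here because block sizes vary by roughly two orders of magnitude between clean history and the attack range.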
Priority: Lower (given current network size) | Complexity: High | No consensus change required
A coordinator assigns non-overlapping block ranges to different peer connections during IBD. A reorder buffer accumulates out-of-order arrivals and feeds addBlock() in sequence once a contiguous window is complete.
With 10–15 active nodes today, parallelism across peers is constrained by peer availability. The gain from 2–3 parallel download streams is real but bounded. OPT-1 through OPT-4 attack the write-side bottleneck, which is the dominant cost through the attack range. Multi-peer download primarily attacks the download-side bottleneck.
If node count recovers (which the other optimisations are intended to help), this becomes more valuable. It is worth designing with this in mind from the start — specifically, the pipelined architecture of OPT-4 is a natural foundation for multi-peer extension.
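The reorder buffer mentioned above can be sketched with an ordered map keyed by height. `ReorderBuffer` and its methods are hypothetical names; the block range coordinator, peer failure handling, and memory limits are deliberately left out.

```cpp
#include <cstdint>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Collects blocks that arrive out of order from several peers and releases
// them strictly by height, so addBlock() still sees a sequential stream.
class ReorderBuffer
{
public:
    explicit ReorderBuffer(uint64_t nextHeight) : m_nextHeight(nextHeight) {}

    // Store a block fetched from any peer. Duplicates and blocks below the
    // next expected height are ignored.
    void insert(uint64_t height, std::string rawBlock)
    {
        if (height < m_nextHeight)
        {
            return;
        }
        m_pending.emplace(height, std::move(rawBlock));
    }

    // Remove and return the longest contiguous run starting at the next
    // expected height; empty if that height has not arrived yet.
    std::vector<std::string> drainContiguous()
    {
        std::vector<std::string> ready;
        auto it = m_pending.begin();
        while (it != m_pending.end() && it->first == m_nextHeight)
        {
            ready.push_back(std::move(it->second));
            it = m_pending.erase(it);
            ++m_nextHeight;
        }
        return ready;
    }

    uint64_t nextHeight() const { return m_nextHeight; }

private:
    uint64_t m_nextHeight;
    std::map<uint64_t, std::string> m_pending;
};
```

The writer would call `drainContiguous()` after each arrival and pass the result to the existing sequential loop, which is why the design sits naturally on top of the OPT-4 pipeline.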
| | Optimisation | Estimated effort |
|---|---|---|
| ✓ | OPT-3: Raise IBD download batch size | 1–2 hours |
| ✓ | OPT-1: Compact sync flag (skip blob write in checkpoint zone) | 2–3 days |
| ✓ | OPT-2: RocksDB write batching during IBD | 2–4 days |
These three have no hard dependencies on one another and can be tested independently. OPT-3 is a one-line config change plus a runtime IBD detection check. OPT-1 and OPT-2 touch the same write path, so developing them together avoids conflicting changes.
| | Optimisation | Estimated effort |
|---|---|---|
| ○ | OPT-4: Pipelined download + write | 1–2 weeks |
| ○ | OPT-5: Multi-peer parallel download | 2–4 weeks |
OPT-4 requires separating the download and write concerns in processObjects() and request_missing_objects(). OPT-5 requires a block range coordinator and reorder buffer on top of OPT-4's foundation.
OPT-1 from this proposal and the compact checkpoint-mode storage concept described in the related chain-state RFC are the same feature, viewed from two angles:
- Speed angle (this RFC): Skipping blob writes makes IBD faster because the dominant per-block write is eliminated
- Storage angle (chain-state RFC): Skipping blob writes makes the resulting node much smaller
A node operator running --compact-sync gets both benefits simultaneously. The two proposals should be implemented together and documented as a unified feature.
The following estimates are based on codebase analysis and RocksDB write characteristics. They assume NVMe storage and a 1 Gbps connection to a well-connected peer.
| Configuration | Estimated genesis-to-tip sync time |
|---|---|
| Current (no changes) | 7–10 days |
| OPT-3 only (larger batches) | 5–7 days |
| OPT-1 + OPT-3 (compact + larger batches) | 2–3 days |
| OPT-1 + OPT-2 + OPT-3 (Phase 1 complete) | 1–2 days |
| Phase 1 + OPT-4 (pipelined) | < 1 day |
These figures are estimates. Actual results depend on peer bandwidth, RocksDB compaction behaviour, and the specific hardware profile. Community benchmarking after each phase is encouraged.
The bootstrap service already solves the "I don't want to sync from genesis at all" case. This proposal solves the "I want to or need to sync from genesis" case. They are complementary:
- Bootstrap service → fast onboarding, non-archival from the start
- Phase 1 optimisations → makes full archival sync feasible for those who want it
- Compact sync flag → makes genesis sync fast and produces a smaller node, at the cost of not being able to serve historical attack-range blocks to peers
The long-term healthy state is a small number of full archival nodes (likely seeded from bootstrap but then extended back to genesis) alongside a larger number of compact-sync nodes that can participate in consensus and serve recent history. The bootstrap infrastructure supports the archival nodes; the compact sync flag supports the majority.
1. Compact sync default: Should `--compact-sync` be opt-in (default off, preserving archival behaviour) or opt-out (default on, improving onboarding at the cost of fewer archival nodes)? Given the current node count crisis, a default-on argument exists.
2. Peer advertisement: What is the right mechanism for advertising compact-node status to peers? A version string extension, a handshake flag, or a dedicated service bit?
3. Batch size tuning: The suggested IBD batch size of 500–1,000 blocks needs empirical tuning. What memory limit should govern the cap?
4. Write batch commit boundary: Should write batches commit at checkpoint boundaries (every 5,000 blocks) or at a fixed count (every 500–1,000 blocks)? Checkpoint boundaries are semantically cleaner but may produce uneven commit intervals.
5. Phase 2 scope: Is OPT-4 (pipelined download+write) worth prioritising over node count recovery efforts (documentation, onboarding improvements, exchange listings)? The optimisations enable recovery but do not cause it.
- `src/cryptonoteprotocol/CryptoNoteProtocolHandler.cpp`: `processObjects()`, `request_missing_objects()`, `adjust_block_rate()`
- `src/cryptonotecore/BlockchainCache.cpp`: `doPushBlock()`, `pushTransaction()`
- `src/cryptonotecore/DatabaseBlockchainCache.cpp`: `insertRawBlock()`, P2P raw block serving
- `src/cryptonotecore/BlockchainWriteBatch.cpp`: RocksDB write batch construction
- `src/cryptonotecore/ValidateTransaction.cpp`: `validateTransactionInputsExpensive()`, checkpoint zone bypass
- `src/config/CryptoNoteConfig.h`: `BLOCKS_SYNCHRONIZING_DEFAULT_COUNT`, `BLOCKS_IDS_SYNCHRONIZING_DEFAULT_COUNT`, attack-response constants
- `src/config/CryptoNoteCheckpoints.h`: checkpoint density through attack range
- RocksDB documentation: Write Batch performance
This is a living concept document. Figures marked as estimates should be replaced with benchmark data as implementation proceeds. Community review and corrections are welcome.