bug: QRInfo masternode sync fails with missing rotated chain lock sig at tip after retries #470

@lklimek

Summary

During SPV initial sync, the masternodes sync manager fails permanently with:

ERROR dash_spv::sync::masternodes::sync_manager: QRInfo failed after 4 retries: Required rotated chain lock sig at h - 0 not present for masternode diff block hash: 0000000ad913ed94a9a64e95a51c82d40c49807a207894e83ebe466f589399f4
ERROR dash_spv::sync::sync_manager: Masternode error handling message: Masternode sync failed: Required rotated chain lock sig at h - 0 not present for masternode diff block hash: 0000000ad913ed94a9a64e95a51c82d40c49807a207894e83ebe466f589399f4

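The h - 0 offset means the chain lock signature for the current tip block is missing from the QRInfo response. The retry logic (MAX_RETRY_ATTEMPTS = 3 with a fixed 5s delay, about 15s in total) is exhausted without recovery. Note that the log line reports 4 retries although the constant is 3, which may indicate an off-by-one in the retry counter or in the log message.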

Error Origin

dash/src/sml/masternode_list_engine/mod.rs ~line 660:

// Offset 0 is the tip block itself ("h - 0" in the error message).
let sigmtip = maybe_sigmtip.ok_or(
    QuorumValidationError::RequiredRotatedChainLockSigNotPresent(
        0, mn_list_diff_tip_block_hash,
    ),
)?;

This fires when apply_diff() for the tip block's MnListDiff doesn't yield a chain lock signature, which happens when the QRInfo response references a tip block whose chain lock hasn't propagated yet.

Retry Logic

dash-spv/src/sync/masternodes/sync_manager.rs lines 268–303:

The retry logic specifically handles RequiredRotatedChainLockSigNotPresent(0, _) (tip only) with:

  • MAX_RETRY_ATTEMPTS = 3
  • CHAINLOCK_RETRY_DELAY_SECS = 5
  • Fixed delay between retries (not exponential backoff)
  • Re-requests QRInfo from peers on each retry via tick() (lines 509–518)
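
The behaviour listed above can be modelled roughly as follows. This is a sketch for discussion, not the code in sync_manager.rs: `SketchError`, `RetryDecision`, and `decide` are hypothetical names, the hash is a plain string for brevity, and only the two constants are taken from this issue.

```rust
use std::time::Duration;

// Constants as quoted in this issue.
const MAX_RETRY_ATTEMPTS: u32 = 3;
const CHAINLOCK_RETRY_DELAY_SECS: u64 = 5;

/// Hypothetical stand-in for `QuorumValidationError`.
#[derive(Debug, Clone, PartialEq)]
enum SketchError {
    /// (offset from tip, block hash as hex string)
    RequiredRotatedChainLockSigNotPresent(u8, String),
    Other(String),
}

#[derive(Debug, PartialEq)]
enum RetryDecision {
    RetryAfter(Duration),
    GiveUp,
}

/// Decide what to do after a failed QRInfo attempt.
/// `retries_so_far` counts the retries already performed.
fn decide(err: &SketchError, retries_so_far: u32) -> RetryDecision {
    match err {
        // Only the tip variant (offset 0) is retried, with a fixed delay.
        SketchError::RequiredRotatedChainLockSigNotPresent(0, _)
            if retries_so_far < MAX_RETRY_ATTEMPTS =>
        {
            RetryDecision::RetryAfter(Duration::from_secs(CHAINLOCK_RETRY_DELAY_SECS))
        }
        // Any other error, or an exhausted retry budget, is terminal.
        _ => RetryDecision::GiveUp,
    }
}
```

With these values the total wait before giving up is 3 × 5s = 15s, regardless of how long the chain lock actually takes to propagate.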

Potential Issues

  1. Fixed retry delay may be insufficient. Chain locks typically propagate within a few seconds, but under network congestion or with stale peers, 15 seconds total may not be enough. Exponential backoff (e.g., 5s, 10s, 20s) would be more resilient.

  2. Retries may hit the same stale peer. If the QRInfo re-request goes to the same peer that returned incomplete data, all retries will fail identically. Requesting from a different peer on retry would improve success rate.

  3. Tip may advance during retries. If a new block is mined during the 15-second retry window, the original tip's chain lock may never appear in QRInfo responses because peers have moved on to the new tip. Each retry should probably re-request with updated block locators rather than repeat the same request.

  4. Failure is permanent. Once MAX_RETRY_ATTEMPTS is exceeded, MasternodeSyncFailed is returned and the sync manager transitions to SyncState::Error with no automatic recovery. The user must manually restart the SPV client.
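
Issues 1 and 2 could be addressed along these lines. This is a minimal sketch under assumptions: `backoff_delay` and `next_peer_index` are hypothetical helpers, peers are represented by an index into some peer list, and the cap value is arbitrary. None of this reflects the existing dash-spv API.

```rust
use std::time::Duration;

/// Delay before retry number `attempt` (0-based): base * 2^attempt,
/// never exceeding `cap`.
fn backoff_delay(attempt: u32, base: Duration, cap: Duration) -> Duration {
    let factor = 2u32.saturating_pow(attempt);
    // `checked_mul` returns None on overflow; fall back to the cap then.
    base.checked_mul(factor).map_or(cap, |d| d.min(cap))
}

/// Round-robin to the next peer so a retry does not hit the peer that
/// just returned incomplete data. Returns None if there is no other peer.
fn next_peer_index(current: usize, peer_count: usize) -> Option<usize> {
    if peer_count < 2 {
        None
    } else {
        Some((current + 1) % peer_count)
    }
}
```

With a 5s base and three retries this gives delays of 5s, 10s, and 20s, so 35s in total instead of 15s, while the first retry still happens as early as it does today.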

Observed Context

  • Network: Dash Testnet
  • Client: Dash Evo Tool using dash-spv via dash-sdk
  • Scenario: Initial SPV sync; headers/filters sync completed successfully, masternode sync fails during QRInfo processing
  • rust-dashcore revision: a05d256f59743c69df912462dd77dd487e1ff5b2

Suggested Improvements

  • Increase MAX_RETRY_ATTEMPTS or use exponential backoff
  • Rotate to a different peer on retry
  • Consider re-initiating QRInfo request with fresh block locators on retry (in case the tip advanced)
  • Optionally allow automatic recovery by restarting the masternode sync stage instead of failing the entire sync permanently
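
The last suggestion could look something like the sketch below. It assumes a simplified state type: `StageState`, `MAX_STAGE_RESTARTS`, and `on_retries_exhausted` are hypothetical and do not correspond to the actual `SyncState` enum in dash-spv. The idea is to restart the masternode stage a bounded number of times before falling back to a permanent error.

```rust
/// Hypothetical, simplified state of the masternode sync stage.
#[derive(Debug, Clone, Copy, PartialEq)]
enum StageState {
    /// Stage is running; `restarts` counts how often it was restarted.
    Syncing { restarts: u32 },
    /// Terminal state, as today after MAX_RETRY_ATTEMPTS is exceeded.
    Error,
}

/// Assumed upper bound on automatic stage restarts.
const MAX_STAGE_RESTARTS: u32 = 2;

/// Called when the per-request retry budget for QRInfo is exhausted.
/// Restarts the stage while the restart budget allows, otherwise fails.
fn on_retries_exhausted(state: StageState) -> StageState {
    match state {
        StageState::Syncing { restarts } if restarts < MAX_STAGE_RESTARTS => {
            // A restart would also be the place to pick a fresh tip
            // and a different peer before re-requesting QRInfo.
            StageState::Syncing { restarts: restarts + 1 }
        }
        _ => StageState::Error,
    }
}
```

Bounding the restarts keeps the current failure path for persistent problems while letting a transient chain lock delay resolve without a manual restart.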

🤖 Co-authored by Claudius the Magnificent AI Agent
