
Gossip: fd_crds unnecessarily flushes staked values upon initial transition to "stake-aware" state #7393

@ravyu-jump



Summary

When fd_crds first receives stake information, it transitions the table to a "stake-aware" mode. Because existing CRDS values have not yet been updated with their origin stake, the eviction policy incorrectly treats them as values from zero-stake origins.

This causes a mass eviction of valid, staked values because their effective expiry threshold drops from 1 epoch to 15 seconds. The node must then re-acquire these values from the cluster (via push or pull), causing unnecessary network churn.

Technical Details

fd_crds relies on the caller to provide origin stake during fd_crds_insert().

  1. Initial State (Unstaked): Before stake info is known, the caller supplies 0 for origin_stake. fd_crds applies a blanket expiry threshold of 1 epoch to all values.
  2. Transition: On the first call where a non-zero origin_stake is supplied, the table transitions to a "stake-aware" state.
  3. The Bug:
  • In the stake-aware state, the expiry policy dictates:
    - Non-zero stake: 1 epoch threshold.
    - Zero stake: 15s threshold.
  • Existing entries in the table still possess 0 stake (from Step 1).
  • They are not "promoted" to staked status until they are individually refreshed via a successful upsert.
  • Result: If the table is pruned before these values are refreshed, they are judged against the 15s threshold. CRDS values, especially those with a third level of indexing beyond (pubkey, crds_type) such as EpochSlots and Votes, are not refreshed on the network often enough to stay within 15s of the current time. Even values that are less than 15s old at the transition are not refreshed quickly enough to avoid eviction at a later prune.
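
The sequence above can be reproduced with a small stand-alone model. This is a minimal sketch, not the fd_crds implementation: every `sketch_*` name, the fixed-size array, and the epoch duration constant are assumptions made for illustration. Only the policy itself (1 epoch for all values before the transition; afterwards 1 epoch for non-zero stake and 15s for zero stake) is taken from the description above.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical constants for illustration only. The epoch length here is an
   assumed placeholder; the real value depends on the cluster. */
#define SKETCH_SEC            (1000000000LL)
#define SKETCH_EPOCH_NANOS    (172800LL * SKETCH_SEC)
#define SKETCH_UNSTAKED_NANOS (15LL * SKETCH_SEC)
#define SKETCH_MAX            (8UL)

typedef struct {
  uint64_t origin_stake;    /* stake supplied by the caller at insert time */
  int64_t  wallclock_nanos; /* time of the last successful insert/refresh */
} sketch_entry_t;

typedef struct {
  int            stake_aware; /* set on the first non-zero origin_stake */
  size_t         cnt;
  sketch_entry_t entries[ SKETCH_MAX ];
} sketch_table_t;

/* Append a value. The first non-zero origin_stake flips the table into the
   stake-aware state. Existing entries are NOT updated, which is the bug.
   Inserts into a full table are silently dropped in this sketch. */
static void
sketch_insert( sketch_table_t * t, uint64_t origin_stake, int64_t now ) {
  if( origin_stake ) t->stake_aware = 1;
  if( t->cnt<SKETCH_MAX ) {
    t->entries[ t->cnt ].origin_stake    = origin_stake;
    t->entries[ t->cnt ].wallclock_nanos = now;
    t->cnt++;
  }
}

/* Expiry threshold as described in the issue. */
static int64_t
sketch_threshold( sketch_table_t const * t, sketch_entry_t const * e ) {
  if( !t->stake_aware ) return SKETCH_EPOCH_NANOS;
  return e->origin_stake ? SKETCH_EPOCH_NANOS : SKETCH_UNSTAKED_NANOS;
}

/* Drop every entry older than its threshold; returns the number evicted. */
static size_t
sketch_prune( sketch_table_t * t, int64_t now ) {
  size_t kept = 0UL;
  size_t evicted = 0UL;
  for( size_t i=0UL; i<t->cnt; i++ ) {
    sketch_entry_t e = t->entries[ i ];
    if( now - e.wallclock_nanos > sketch_threshold( t, &e ) ) evicted++;
    else t->entries[ kept++ ] = e;
  }
  t->cnt = kept;
  return evicted;
}
```

In this model, inserting three values with `origin_stake` 0 in the first few seconds and then pruning at the 60 second mark evicts nothing, because the blanket 1 epoch threshold applies. Inserting one value with non-zero stake at 60 seconds and pruning again one second later evicts all three earlier values, even though their origins may in fact be staked.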
