Make gossip synchronization on bootstrap more robust #2866
Currently, bootstrap simply waits for ring_delay (30 seconds).
If processing the gossip state takes longer than that, e.g. because of the number of nodes in the cluster, the node will start bootstrapping before it has finished processing all the nodes, leading to data loss. Refs #2855.
The wait could be improved by waiting, in addition to the fixed sleep, for:
Is that not basically asking for having "wait_for_gossip_to_settle()" moved to the storage_service::bootstrap() code (or called there as well mayhap, since the waiting part is conditional in bootstrap)? Plus maybe adding some message state awareness to it?
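To make the idea concrete, here is a minimal sketch of a "fixed sleep plus settle check" wait. It is not the actual wait_for_gossip_to_settle() code: the function name `wait_until_settled`, its parameters, and the "endpoint count unchanged for N consecutive polls" criterion are all assumptions made for illustration, and it uses plain blocking threads rather than Seastar futures.

```cpp
#include <chrono>
#include <cstddef>
#include <functional>
#include <thread>

// Hypothetical helper (not Scylla's API): sleep for a fixed minimum time,
// then poll until the observed endpoint count stops changing.
// Returns true once the count was unchanged for `required_stable_polls`
// consecutive polls, or false if `max_polls` polls ran out first.
bool wait_until_settled(const std::function<std::size_t()>& endpoint_count,
                        std::chrono::milliseconds min_wait,
                        std::chrono::milliseconds poll_interval,
                        int required_stable_polls,
                        int max_polls) {
    // Fixed sleep, playing the role of ring_delay in the issue.
    std::this_thread::sleep_for(min_wait);

    std::size_t last = endpoint_count();
    int stable = 0;
    for (int i = 0; i < max_polls; ++i) {
        std::this_thread::sleep_for(poll_interval);
        const std::size_t current = endpoint_count();
        if (current == last) {
            // No change since the previous poll: one step closer to settled.
            if (++stable >= required_stable_polls) {
                return true;
            }
        } else {
            // State is still changing: restart the stability counter.
            stable = 0;
            last = current;
        }
    }
    return false;  // Caller decides whether to abort or keep waiting.
}
```

A caller in the bootstrap path would pass a callback that reports how many endpoints gossip currently knows about, and would only proceed when the function returns true. The "message state awareness" mentioned above could be added as a further condition in the same loop, e.g. also requiring that no gossip messages are still queued for processing.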