Later updates to EpochSlots can overwrite earlier updates #17711
behzadnouri
added a commit
to behzadnouri/solana
that referenced
this issue
Jun 3, 2021
epoch-slots may be overwritten before they are written to crds table: solana-labs#17711 This commit writes new epoch-slots to crds table synchronously with push_epoch_slots. The function is still not thread-safe, as commented in the code; however, currently only one thread is invoking this code.
carllin
pushed a commit
to carllin/solana
that referenced
this issue
Jun 3, 2021
epoch-slots may be overwritten before they are written to crds table: solana-labs#17711 This commit writes new epoch-slots to crds table synchronously with push_epoch_slots. The function is still not thread-safe, as commented in the code; however, currently only one thread is invoking this code.
behzadnouri
added a commit
that referenced
this issue
Jun 4, 2021
epoch-slots may be overwritten before they are written to crds table: #17711 This commit writes new epoch-slots to crds table synchronously with push_epoch_slots. The function is still not thread-safe, as commented in the code; however, currently only one thread is invoking this code.
mergify bot
pushed a commit
that referenced
this issue
Jun 4, 2021
epoch-slots may be overwritten before they are written to crds table: #17711 This commit writes new epoch-slots to crds table synchronously with push_epoch_slots. The function is still not thread-safe, as commented in the code; however, currently only one thread is invoking this code. (cherry picked from commit 60b0a13)
mergify bot
added a commit
that referenced
this issue
Jun 4, 2021
epoch-slots may be overwritten before they are written to crds table: #17711 This commit writes new epoch-slots to crds table synchronously with push_epoch_slots. The function is still not thread-safe, as commented in the code; however, currently only one thread is invoking this code. (cherry picked from commit 60b0a13) Co-authored-by: behzad nouri <behzadnouri@gmail.com>
This issue has been automatically locked since there has not been any activity in the past 7 days after it was closed. Please open a new issue for related bugs.
Problem
We currently track 255 EpochSlots bitvectors that are updated via push_epoch_slots(slots), where we:
1) solana/gossip/src/cluster_info.rs, line 907 in 0b0b9d9
2) solana/gossip/src/cluster_info.rs, line 948 in 0b0b9d9

An issue arises when two updates hit the same indexed EpochSlots structure (say index 0) before the gossip thread asynchronously flushes the pending flush queue, so the first updated value is not yet observable in the Crds table.

In this case, the second update reads a stale EpochSlots from the Crds table in step 1) above, one that doesn't include the first update. It applies the second update to that stale value, then pushes the result to the pending flush queue. When the flush happens later, the later value lacks the first update and therefore overwrites it when both values are pushed to the network.

We could probably simulate the problem in this test if we got rid of the calls to flush_push_queue() in between the updates:
solana/gossip/src/cluster_info.rs, lines 4136 to 4137 in 0b0b9d9
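To make the lost update concrete, here is a minimal toy model of the sequence described above. It is only a sketch, not the code in cluster_info.rs: ToyGossip, its table and pending fields, and the simplified push_epoch_slots and flush_push_queue signatures are all hypothetical stand-ins, and an EpochSlots value is reduced to a plain set of slot numbers.

```rust
use std::collections::{BTreeSet, HashMap};

/// Toy stand-in for one indexed EpochSlots value: just a set of slots.
type EpochSlots = BTreeSet<u64>;

/// Hypothetical model: `table` plays the role of the Crds table and
/// `pending` the queue that the gossip thread flushes asynchronously.
#[derive(Default)]
struct ToyGossip {
    table: HashMap<u8, EpochSlots>,
    pending: Vec<(u8, EpochSlots)>,
}

impl ToyGossip {
    /// Reads the current value from the table only (never from `pending`),
    /// merges in the new slots, and queues the result.
    fn push_epoch_slots(&mut self, index: u8, slots: &[u64]) {
        let mut current = self.table.get(&index).cloned().unwrap_or_default();
        current.extend(slots.iter().copied());
        self.pending.push((index, current));
    }

    /// Writes queued values into the table in order, so a later entry
    /// for the same index replaces an earlier one.
    fn flush_push_queue(&mut self) {
        for (index, value) in self.pending.drain(..) {
            self.table.insert(index, value);
        }
    }
}

/// Two updates to index 0 with no flush in between.
fn lost_update_demo() -> Vec<u64> {
    let mut gossip = ToyGossip::default();
    gossip.push_epoch_slots(0, &[1, 2]);
    gossip.push_epoch_slots(0, &[3]); // the table still lacks slots 1 and 2
    gossip.flush_push_queue();
    gossip.table.get(&0u8).map(|s| s.iter().copied().collect()).unwrap_or_default()
}
```

In this model, lost_update_demo() ends with only slot 3 stored at index 0; calling flush_push_queue() between the two pushes would preserve slots 1 and 2 as well.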
Proposed Solution
Should we also check the push queue for more recent versions of the same message label in crds.get()?
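As a rough illustration of that idea, the sketch below has the read path consult the pending queue before falling back to the table. Everything here is an assumption made for illustration: ToyGossip, get_latest, and the simplified push_epoch_slots and flush_push_queue are hypothetical names in a toy model, not the real Crds or ClusterInfo API, and an EpochSlots value is reduced to a set of slot numbers.

```rust
use std::collections::{BTreeSet, HashMap};

/// Toy stand-in for one indexed EpochSlots value: just a set of slots.
type EpochSlots = BTreeSet<u64>;

/// Hypothetical model: `table` plays the role of the Crds table and
/// `pending` the queue that is flushed later.
#[derive(Default)]
struct ToyGossip {
    table: HashMap<u8, EpochSlots>,
    pending: Vec<(u8, EpochSlots)>,
}

impl ToyGossip {
    /// Returns the newest queued value for `index` if one exists,
    /// otherwise whatever is already in the table.
    fn get_latest(&self, index: u8) -> Option<&EpochSlots> {
        self.pending
            .iter()
            .rev() // the most recently queued entry wins
            .find(|(i, _)| *i == index)
            .map(|(_, value)| value)
            .or_else(|| self.table.get(&index))
    }

    /// Same as before, except the starting value comes from `get_latest`.
    fn push_epoch_slots(&mut self, index: u8, slots: &[u64]) {
        let mut current = self.get_latest(index).cloned().unwrap_or_default();
        current.extend(slots.iter().copied());
        self.pending.push((index, current));
    }

    /// Writes queued values into the table in order.
    fn flush_push_queue(&mut self) {
        for (index, value) in self.pending.drain(..) {
            self.table.insert(index, value);
        }
    }
}

/// Two updates to index 0 with no flush in between.
fn no_lost_update_demo() -> Vec<u64> {
    let mut gossip = ToyGossip::default();
    gossip.push_epoch_slots(0, &[1, 2]);
    gossip.push_epoch_slots(0, &[3]); // sees the queued value holding 1 and 2
    gossip.flush_push_queue();
    gossip.table.get(&0u8).map(|s| s.iter().copied().collect()).unwrap_or_default()
}
```

With the queue consulted first, the second update builds on the first even though nothing has been flushed yet, so the final table entry in this model contains slots 1, 2, and 3. The linear scan of the queue is a simplification; a real implementation would have to weigh lookup cost and locking.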