Hangover logic includes previous block chain sends after restart #8652
This issue is likely what caused a replay error when Nodes.Guru attempted to rollback after an AppHash:
Describe the bug
Our hangover logic aims to handle a one-block discrepancy in committed state between cosmos and swingset. It does so by saving the "chain sends" performed by JS during a block, and replaying those instead of executing if cosmic-swingset realizes it is being asked to process the last recorded block.
There have been issues in the past with the logic in cosmic-swingset incorrectly mutating some state during detected hangovers, or clearing chain sends during flush, so that hangover recovery worked only once.
We experienced a remaining issue where the saved chain sends of the block committed before a halt of the node are not cleared, and are instead included in the saved chain sends of the first block after the restart. If the node experiences a hangover at that point, it will incorrectly replay the chain sends of both blocks.
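The failure mode can be pictured with a toy model. This is only a sketch under stated assumptions: makeToyNode, processBlock, afterCommit and the store object are hypothetical names, not the cosmic-swingset API, and it assumes the saved sends are normally cleared by a post-commit step that the first halt skipped.

```javascript
// Toy model only: every name here is hypothetical, not the cosmic-swingset API.
// `store` stands in for swingset-side state that survives a node restart.
function makeToyNode(store) {
  return {
    // Returns the chain sends handed to cosmos for this block.
    processBlock(height, sends) {
      if (height === store.savedHeight) {
        // Hangover: this block was already executed, so replay what was saved.
        return [...store.chainSends];
      }
      // Normal execution: record each send as it is performed.
      store.chainSends.push(...sends);
      store.savedHeight = height;
      return [...sends];
    },
    // Assumed cleanup step that runs once cosmos has committed.
    // It never runs if the node halts first.
    afterCommit() {
      store.chainSends = [];
    },
  };
}

function runDoubleHaltScenario() {
  const store = { savedHeight: 0, chainSends: [] };

  // Block 1 executes, then the node halts before the saved sends are cleared.
  makeToyNode(store).processBlock(1, ['send-from-block-1']);

  // After restart, block 2 appends to the stale list instead of starting fresh.
  makeToyNode(store).processBlock(2, ['send-from-block-2']);

  // Second halt: cosmos did not commit block 2 and asks for it again.
  return makeToyNode(store).processBlock(2, ['send-from-block-2']);
}

console.log(runDoubleHaltScenario());
```

In this model the final replay contains the send from block 1 as well as the send from block 2, which is the faulty behavior described above.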
To Reproduce
Steps to reproduce the behavior:
COMMIT_BLOCK after saveOutsideState)
Expected behavior
Hangover logic handles double halts like this.
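As a rough illustration of a double halt being handled, the sketch below clears stale sends at the start of any block that needs execution. It is a toy model, not the actual change: blockNeedsExecution and clearChainSends borrow names from this issue, but their bodies, along with makeToyKernel, runBlock and the store object, are assumptions made for the example.

```javascript
// Toy model only: these helpers borrow names from the issue text but are
// hypothetical stand-ins, not the real cosmic-swingset implementations.
function makeToyKernel(store) {
  const blockNeedsExecution = height => height !== store.savedHeight;
  const clearChainSends = async () => {
    store.chainSends = [];
  };

  return {
    async beginBlock(height) {
      // Assumption: being asked to execute a new block means cosmos has
      // committed the previous one, so its saved sends are no longer needed.
      if (blockNeedsExecution(height)) {
        await clearChainSends();
      }
    },
    async endBlock(height, sends) {
      if (!blockNeedsExecution(height)) {
        // Hangover: replay the saved sends instead of executing.
        return [...store.chainSends];
      }
      store.chainSends.push(...sends);
      store.savedHeight = height;
      return [...sends];
    },
  };
}

async function runBlock(store, height, sends) {
  // A fresh kernel per call stands in for a node restart.
  const kernel = makeToyKernel(store);
  await kernel.beginBlock(height);
  return kernel.endBlock(height, sends);
}

async function runDoubleHaltWithClear() {
  const store = { savedHeight: 0, chainSends: [] };
  await runBlock(store, 1, ['send-from-block-1']); // halt before any cleanup
  await runBlock(store, 2, ['send-from-block-2']); // stale sends cleared first
  return runBlock(store, 2, ['send-from-block-2']); // hangover replay of block 2
}

runDoubleHaltWithClear().then(replayed => console.log(replayed));
```

Here the hangover replay of block 2 contains only the send from block 2, because the list saved for block 1 was dropped before block 2 executed.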
More specifically, the "chain sends" are cleared once cosmic-swingset is sure that cosmos has correctly committed its DB. While we could rely on the new AFTER_COMMIT_BLOCK action, because of #6736 we likely should call await clearChainSends() in BEGIN_BLOCK if blockNeedsExecution(). Once we introduce execution between blocks, we'll have to be careful to clear chain sends at the appropriate time for that.
Testing
Historically it has been difficult to test these integration issues between cosmos and cosmic-swingset. However, we realized we could simulate the cosmos driver by importing main from chain-main.js and providing a mock agcc.
Additional context
Experienced by validator-2 at block 2843573-2843574. See #8650.
Screenshots
TBD: capture of faulty saved chain sends