Geyser drops finalized notifications #31124
Comments
@linuskendall Have you seen any messages in the log files like the following? "bank_notification_sender failed: ..." |
Hm, looks like we're not getting any entries from bank_notification_sender.
Is this on INFO level?
|
That message is from https://github.com/solana-labs/solana/blob/master/core/src/replay_stage.rs#L2062 and is emitted when we fail to send a notification. It is logged at WARN level. Do you actually see the following message in the log file for the slot? "new root " This is logged at INFO level. |
In the example above we did not get "new root" for the missing finalized slot notifications (34/35):
|
Okay -- geyser did not see the notifications because replay_stage did not see them when replaying the banks. Did you see in your slot table that slot 187594840 has parent 187594835? That can be used to automatically mark 187594835 as root. |
Yes, so we did that workaround - we basically just trace parents of the
current finalized slot and mark those as finalized too. However, it is
indeed a bit unexpected behaviour when reading the geyser docs. Do we
expect this to be the 'standard' behaviour?
|
I am investigating why replay_stage is missing the root events for these slots, and whether a workaround is possible in the plugin framework. |
This is due to the same root cause as #19040 and will be addressed together |
This is fixed by #31124 |
…backport of #31180) (#31650)
Problem
It is reported that Geyser is missing some Root notifications for slots (#31124). The Root notification is sent from replay_stage's code in handle_votable_bank (https://github.com/solana-labs/solana/blob/master/core/src/replay_stage.rs#L1981). However, the validator does not necessarily vote on every slot on the rooted chain.
From @carllin:
For instance, if the rooted chain is 1->2->3->4, you might only vote on 1 and 4. But when 4 is rooted, 2 is also rooted, yet handle_votable_bank is not called on 2.
As a result, we may miss notifications for slots 2 and 3.
Summary of Changes
Enhanced BankNotification to add NewRootedChain enum to send the chains of parent roots.
Renamed BankNotification::Root -> BankNotification::NewRootBank.
Introduced SlotNotification for SlotStatusObserver interfaces to send slot status without Bank.
In the OptimisticallyConfirmedBankTracker, notify parents of a new root if these parents were not notified.
Modified and added unit test cases to verify the logic.
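The 1->2->3->4 case can be illustrated with a small, self-contained Rust sketch. It is not the validator's code: `unnotified_rooted_chain`, the `parents` map and the `notified` set are hypothetical names chosen for illustration. It only shows the idea of walking from a new root back through its parents until reaching a slot that was already announced as rooted.

```rust
use std::collections::{HashMap, HashSet};

type Slot = u64;

/// Hypothetical helper (an illustration, not the validator's implementation):
/// returns the new root followed by every ancestor that has not yet been
/// announced as rooted, walking child -> parent.
fn unnotified_rooted_chain(
    parents: &HashMap<Slot, Slot>,
    notified: &HashSet<Slot>,
    new_root: Slot,
) -> Vec<Slot> {
    let mut chain = vec![new_root];
    let mut current = new_root;
    while let Some(&parent) = parents.get(&current) {
        if notified.contains(&parent) {
            // Everything at or below this ancestor was already announced.
            break;
        }
        chain.push(parent);
        current = parent;
    }
    chain
}

/// The chain from the description: 1->2->3->4, where only slot 1 was
/// announced before slot 4 became the new root.
fn rooted_chain_example() -> Vec<Slot> {
    let parents: HashMap<Slot, Slot> = [(2, 1), (3, 2), (4, 3)].into_iter().collect();
    let notified: HashSet<Slot> = [1].into_iter().collect();
    unnotified_rooted_chain(&parents, &notified, 4)
}
```

In this sketch the chain comes back newest first (4, 3, 2), so a consumer that wants slot order would iterate it in reverse.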
Problem
It seems that quite often, Geyser will drop the 'Finalized' slot notification. This seems to happen more commonly around skipped slots, but we have also found examples that did not involve skipped slots. An example is illustrated below:
Slot 187594840 has parent 187594835. However, 187594835 and 187594834 never received slot notifications for finalization, only for processed and confirmed. Slot 187594840, in contrast, received the finalized slot notification correctly.
This is different from #28871 since it happens during general runtime, not just at start up.
Proposed Solution
This is possible to work around in the geyser plugin (just assume that the 'tree' below the last finalized slot is also finalized), but it leads to delays in ingesting state since we cannot know that the tree is finalized until 187594840 is finalized. Therefore it would be good to identify why geyser doesn't notify the finalized state of the slots in between.
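The plugin-side workaround can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the Geyser plugin interface: `SlotTable`, `SlotEntry`, `Status` and `mark_finalized` are hypothetical names, and the example assumes that 187594834 is the parent of 187594835 and that an earlier slot, 187594833, was already finalized.

```rust
use std::collections::HashMap;

type Slot = u64;

/// Hypothetical local status type, not the plugin interface's own enum.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Status {
    Processed,
    Confirmed,
    Finalized,
}

#[derive(Debug)]
struct SlotEntry {
    parent: Option<Slot>,
    status: Status,
}

/// Hypothetical in-memory slot table kept by the plugin.
#[derive(Default)]
struct SlotTable {
    entries: HashMap<Slot, SlotEntry>,
}

impl SlotTable {
    fn upsert(&mut self, slot: Slot, parent: Option<Slot>, status: Status) {
        self.entries.insert(slot, SlotEntry { parent, status });
    }

    fn status(&self, slot: Slot) -> Option<Status> {
        self.entries.get(&slot).map(|entry| entry.status)
    }

    /// Marks `slot` as finalized, then walks its parents and marks each one
    /// finalized too, stopping at an ancestor that is already finalized or
    /// unknown. Returns the slots whose status changed, newest first.
    fn mark_finalized(&mut self, slot: Slot) -> Vec<Slot> {
        let mut newly_finalized = Vec::new();
        let mut cursor = Some(slot);
        while let Some(current) = cursor {
            let Some(entry) = self.entries.get_mut(&current) else {
                break;
            };
            if entry.status == Status::Finalized {
                break;
            }
            entry.status = Status::Finalized;
            newly_finalized.push(current);
            cursor = entry.parent;
        }
        newly_finalized
    }
}

/// Replays the situation from the report, with the assumed parent links.
fn finalize_example() -> Vec<Slot> {
    let mut table = SlotTable::default();
    table.upsert(187594833, None, Status::Finalized);
    table.upsert(187594834, Some(187594833), Status::Confirmed);
    table.upsert(187594835, Some(187594834), Status::Confirmed);
    table.upsert(187594840, Some(187594835), Status::Processed);
    table.mark_finalized(187594840)
}
```

As the proposal notes, this only helps once 187594840 itself is finalized, so the intermediate slots are still ingested late.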