New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
p2p/conn: large (or a lot of) outgoing messages from other reactors can block consensus reactor from making progress #5816
Comments
This is basically #2888. The new P2P stack will have separate queues per reactor channel. |
How's that? The issue here is incorrect dispatching (sending) while multiplexing over single TCP stream, while #2888 is about individual Reactor#Receive blocking receiving messages => sending != receiving. But you're right that if we adopt QUIC (independent streams), this issue will go away. |
Ah, got it -- you're right, different issue. New P2P stack will handle this as well, by having separate outbound queues per peer with some scheduling policy, and dropping messages if a peer can't keep up to avoid blocking reactors. |
blockchain/vX reactor priority was decreased because during the normal operation (i.e. when the node is not fast syncing) blockchain priority can't be the same as consensus reactor priority. Otherwise, it's theoretically possible to slow down consensus by constantly requesting blocks from the node. NOTE: ideally blockchain/vX reactor priority would be dynamic. e.g. when the node is fast syncing, the priority is 10 (max), but when it's done fast syncing - the priority gets decreased to 5 (only to serve blocks for other nodes). But it's not possible now, therefore I decided to focus on the normal operation (priority = 5). evidence and consensus critical messages are more important than the mempool ones, hence priorities are bumped by 1 (from 5 to 6). statesync reactor priority was changed from 1 to 5 to be the same as blockchain/vX priority. Refs #5816
blockchain/vX reactor priority was decreased because during the normal operation (i.e. when the node is not fast syncing) blockchain priority can't be the same as consensus reactor priority. Otherwise, it's theoretically possible to slow down consensus by constantly requesting blocks from the node. NOTE: ideally blockchain/vX reactor priority would be dynamic. e.g. when the node is fast syncing, the priority is 10 (max), but when it's done fast syncing - the priority gets decreased to 5 (only to serve blocks for other nodes). But it's not possible now, therefore I decided to focus on the normal operation (priority = 5). evidence and consensus critical messages are more important than the mempool ones, hence priorities are bumped by 1 (from 5 to 6). statesync reactor priority was changed from 1 to 5 to be the same as blockchain/vX priority. Refs #5816
blockchain/vX reactor priority was decreased because during the normal operation (i.e. when the node is not fast syncing) blockchain priority can't be the same as consensus reactor priority. Otherwise, it's theoretically possible to slow down consensus by constantly requesting blocks from the node. NOTE: ideally blockchain/vX reactor priority would be dynamic. e.g. when the node is fast syncing, the priority is 10 (max), but when it's done fast syncing - the priority gets decreased to 5 (only to serve blocks for other nodes). But it's not possible now, therefore I decided to focus on the normal operation (priority = 5). evidence and consensus critical messages are more important than the mempool ones, hence priorities are bumped by 1 (from 5 to 6). statesync reactor priority was changed from 1 to 5 to be the same as blockchain/vX priority. Refs #5816
See #5796 and #3920 (comment)
What happened
Large messages (or many messages) produced by other reactors and scheduled to be send can temporarily block consensus reactor from making progress.
Why do you think it happens?
Either
libs/flow
library does not perform as expected or p2p/conn/connection scheduling logic is invalid.What did you expect?
No halting. Tendermint top priority is exchanging votes and making blocks with a few transactions + evidence (if any). Consensus reactor messages should have a top priority, while other (e.g. mempool gossip, evidence) should have a lower priority.
The text was updated successfully, but these errors were encountered: