mempool: disable MaxBatchBytes #5800
Conversation
My current theory is that the flowrate lib we're using to control flow (multiplexing over a single TCP connection) was not designed with large blobs (a 1MB batch of txs) in mind. I've tried decreasing the Mempool reactor priority, but that had no visible effect. What actually worked was adding a time.Sleep into mempool.Reactor#broadcastTxRoutine after each successful send, i.e., manual flow control of sorts. Closes #5796
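The "time.Sleep after each successful send" idea can be sketched as a minimal Go loop. This is not the actual Tendermint code: `sendTx` and `broadcastTxs` are hypothetical stand-ins for the reactor's peer send and broadcast routine.

```go
package main

import (
	"fmt"
	"time"
)

// sendTx stands in for the real peer send; it always succeeds here.
// (Hypothetical stub, not the Tendermint API.)
func sendTx(tx []byte) bool {
	fmt.Printf("sent %d bytes\n", len(tx))
	return true
}

// broadcastTxs pushes transactions one at a time, sleeping after each
// successful send: manual flow control of the kind described above.
// It returns the number of transactions sent.
func broadcastTxs(txs [][]byte, delay time.Duration) int {
	sent := 0
	for _, tx := range txs {
		if sendTx(tx) {
			sent++
			time.Sleep(delay) // pace the sends instead of batching them
		}
	}
	return sent
}

func main() {
	txs := [][]byte{[]byte("tx1"), []byte("tx2"), []byte("tx3")}
	n := broadcastTxs(txs, 10*time.Millisecond)
	fmt.Println("sent", n, "txs")
}
```

The sleep bounds the rate at which bytes hit the multiplexed connection, which is roughly what disabling batching (sending txs one by one) achieves without a hand-tuned delay.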
this is expensive
Codecov Report
@@ Coverage Diff @@
## master #5800 +/- ##
==========================================
+ Coverage 59.77% 59.82% +0.05%
==========================================
Files 262 262
Lines 23705 23688 -17
==========================================
+ Hits 14169 14171 +2
+ Misses 8023 8007 -16
+ Partials 1513 1510 -3
I hope you are sure about this @melekes; you know Tendermint better than me, but this workaround looks like it is burying a very serious problem that will eventually need to be solved with a complete refactor.
The rate limits are configurable. @p4u, can you see if increasing them helps?
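For reference, the per-connection rate limits live in Tendermint's `config.toml` under the `[p2p]` section. A sketch with the 0.34 defaults (values are bytes/second):

```toml
[p2p]
# Rate at which packets can be sent, in bytes/second
send_rate = 5120000

# Rate at which packets can be received, in bytes/second
recv_rate = 5120000
```

Raising these gives the flowrate limiter more headroom per multiplexed connection, which is the knob being suggested here.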
This is something we're trying to balance. Although we intend to do a complete refactor, it's not slated to begin for a few months, and it will probably take a few months itself. I think we're looking at mid-2021 before the mempool refactor is ready to roll. In the meantime, we'd like to see if there's a quick (but safe) fix that we can apply to get everything working for you again.
LGTM. I think the original PR that introduced this made some other changes as well; it might be worth verifying that this change actually fixes the problems, i.e. that other bugs weren't introduced too.
Sure, I understand. I just wanted to point out that if the workaround is not safe enough, it would become a mess for anyone using Tendermint and upgrading to 0.34. If we are not sure of its safety, I'd revert the whole Batch Tx feature for now.
Well, not only for me but for anyone using Tendermint, I hope 👍 EDIT: I just noticed that the PR has been changed, so my comment applied to the previous version, which was quite risky IMO.
Is this PR description still accurate, @melekes?
yes
@p4u from vocdoni.io reported that the mempool might behave incorrectly under a high load. The consequences can range from pauses between blocks to peers disconnecting from this node. My current theory is that the flowrate lib we're using to control flow (multiplexing over a single TCP connection) was not designed with large blobs (a 1MB batch of txs) in mind. I've tried decreasing the Mempool reactor priority, but that had no visible effect. What actually worked was adding a time.Sleep into mempool.Reactor#broadcastTxRoutine after each successful send, i.e., manual flow control of sorts. As a temporary remedy (until the mempool package is refactored), max-batch-bytes was disabled. Transactions will be sent one by one without batching. Closes #5796
This configuration is not used anymore; it's a leftover of batching txs in the mempool, which was deprecated (tendermint/tendermint#5800)
This configuration is not used anymore; it's a leftover of batching txs in the mempool, which was deprecated (tendermint/tendermint#5800) (cherry picked from commit dab72ad)