mempool lock can induce large adversarial delay in block reap functions #2972
Currently the mempool takes a mutex lock on every CheckTx (https://github.com/tendermint/tendermint/blob/master/mempool/mempool.go#L300) and Reap (https://github.com/tendermint/tendermint/blob/master/mempool/mempool.go#L467).
However, Reap has no prioritization in selection once the lock is free: there will be contention between all queued CheckTx processes and the queued Reap process, and there can be a large number of queued CheckTx processes. In the cosmos-sdk, Tendermint is run in-process and CheckTxs happen synchronously, so the CheckTx mutex lock will not unlock until this computation (i.e. signature verification) is complete. Secp256k1 in the golang implementation takes ~250 microseconds to verify a signature. Since a cosmos tx can be made at ~150 bytes such that signature verification would still get run, a 1 Mbit/s line would allow an adversary to cause 218 ms of verification time in txs to the validator, per second. Linearly scaling the connection's throughput to 10 Mbit/s implies 2.18 seconds of delay induced per second. (This assumes that exponential back-off on the mutexes doesn't further increase the time.) Note that the attacker can keep scaling the number of txs they send with network throughput, since the check for the mempool already being full happens after the mutex lock.
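The arithmetic above can be reproduced with a few lines of Go. The constants (250 µs per verification, 150-byte txs, binary megabits) are the issue's assumptions, not measured values:

```go
package main

import "fmt"

func main() {
	const (
		sigVerifySec = 250e-6  // ~250 µs per secp256k1 verification (Go implementation)
		txBytes      = 150.0   // minimal cosmos tx that still triggers sig verification
		mbit         = 1 << 20 // binary megabit; this convention reproduces the 218 ms figure
	)
	for _, mbps := range []float64{1, 10} {
		txPerSec := mbps * mbit / 8 / txBytes // txs an attacker can push per second
		delay := txPerSec * sigVerifySec      // verification work induced per second
		fmt.Printf("%2.0f Mbit/s -> %.0f tx/s -> %.2f s of verification per second\n",
			mbps, txPerSec, delay)
	}
}
```

At 1 Mbit/s this yields ~874 tx/s and ~0.22 s of induced verification per second; at 10 Mbit/s, ~2.18 s per second, matching the numbers above.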
Supposing no new CheckTx contentions were made, since the Reap process has no prioritization over CheckTxs, one would expect it to run around the halfway mark. (So with one second's worth of induced validator delay, Reap could be delayed by 1.09 seconds.) However, the attacker would continue their CheckTx spam, further increasing this bound.* Additionally, the newer CheckTxs will have faster "refresh rates", since their exponential back-off is smaller, whereas Reap's refresh rate will have become longer. This increases the delay further.
The maximum mempool size was recently decreased, partly to mitigate large delays on rechecking within Update. We don't check the maximum mempool size before locking; we should. Then the attacker's power no longer scales with their internet connection to you: it is bounded by your mempool size less the number of txs you already have. This is still highly applicable, since a small mempool size of
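A minimal sketch of that fix, using a toy mempool (the names and fields here are illustrative, not Tendermint's actual API): the size bound is checked once before the mutex as a cheap fast path, and again under the lock for exactness.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
	"sync/atomic"
)

var errMempoolIsFull = errors.New("mempool is full")

// Toy mempool illustrating the proposed fix: check the size bound *before*
// taking the mutex, so a flood of CheckTxs no longer queues up on the lock.
type mempool struct {
	mtx     sync.Mutex
	maxTxs  int64
	txCount int64 // read atomically for the lock-free pre-check
}

func (m *mempool) checkTx(tx []byte) error {
	// Fast-path rejection without touching the mutex. A slightly stale count
	// is fine: the locked re-check below enforces the bound exactly.
	if atomic.LoadInt64(&m.txCount) >= m.maxTxs {
		return errMempoolIsFull
	}
	m.mtx.Lock()
	defer m.mtx.Unlock()
	if atomic.LoadInt64(&m.txCount) >= m.maxTxs {
		return errMempoolIsFull
	}
	// ... expensive signature verification would run here, under the lock ...
	atomic.AddInt64(&m.txCount, 1)
	return nil
}

func main() {
	m := &mempool{maxTxs: 2}
	fmt.Println(m.checkTx([]byte("tx1")), m.checkTx([]byte("tx2")), m.checkTx([]byte("tx3")))
	// the third tx is rejected before any lock contention once the pool is full
}
```

With this shape, once the mempool is full the attacker's txs are rejected without ever contending on the mutex, so their spam stops delaying Reap.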
Making the CheckTxs actually asynchronous, such that the mempool doesn't hold its lock over their execution time, would remove this problem; however, the delays that may induce in normal execution may not be worth it.
In a similar theme to the recent DeliverTx vs. DeliverBlock discussions, making a batched CheckTx would strongly alleviate this problem, provided the mempool buffered its CheckTxs before engaging in a lock.
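One way to sketch that buffering (all names hypothetical): incoming txs land on a channel, and a single worker drains whatever has accumulated into one batch, taking the lock once per batch rather than once per tx.

```go
package main

import (
	"fmt"
	"sync"
)

// Sketch of the batching idea: a queued Reap now contends with at most one
// batch admission, not with thousands of individual CheckTx lock acquisitions.
type batchMempool struct {
	mtx  sync.Mutex
	txs  [][]byte
	in   chan []byte
	done chan struct{}
}

func newBatchMempool() *batchMempool {
	m := &batchMempool{in: make(chan []byte, 1024), done: make(chan struct{})}
	go m.loop()
	return m
}

func (m *batchMempool) loop() {
	defer close(m.done)
	for tx := range m.in {
		// Drain everything already buffered into a single batch.
		batch := [][]byte{tx}
		for more := true; more; {
			select {
			case t, ok := <-m.in:
				if !ok {
					more = false
				} else {
					batch = append(batch, t)
				}
			default:
				more = false
			}
		}
		// One lock acquisition covers the whole batch.
		m.mtx.Lock()
		m.txs = append(m.txs, batch...)
		m.mtx.Unlock()
	}
}

func main() {
	m := newBatchMempool()
	for i := 0; i < 5; i++ {
		m.in <- []byte{byte(i)}
	}
	close(m.in)
	<-m.done
	fmt.Println(len(m.txs)) // all 5 txs admitted, in at most a handful of lock acquisitions
}
```

The real mempool would run CheckTx's verification per tx inside the batch, but the lock-contention profile is the point: the number of lock acquisitions no longer scales with the attacker's tx rate.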
* We should do the math for the continually refreshed case; it may be that the above solutions don't actually solve it, and the removal of mutexes should happen relatively soon.
Just realized: we can actually probably remove the lock between CheckTx and Reap, and that would solve the problem. Reap in the current implementation is non-mutative and begins its search from the beginning of the mempool linked list, while CheckTx appends to the end. (CheckTx may mutate txs in between, but it's okay if the mempoolTx that Reap pulls is not the latest mempoolTx from CheckTx, as CheckTx is just changing metadata.) Reap doesn't actually remove the txs. Thus Reap and CheckTx could happen concurrently. What can't happen concurrently is Update and CheckTx, and Update and Reap.
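That exclusion pattern maps naturally onto a sync.RWMutex: CheckTx and Reap take the read lock (so they may overlap each other), and Update takes the write lock (so it excludes both). A toy sketch, with a slice plus a small tail lock standing in for Tendermint's concurrent linked list (a real implementation needs a list safe for concurrent append and scan; all names here are illustrative):

```go
package main

import (
	"fmt"
	"sync"
)

type mempool struct {
	updateMtx sync.RWMutex // write side: Update; read side: CheckTx and Reap
	listMtx   sync.Mutex   // briefly guards the slice, in lieu of a concurrent list
	txs       [][]byte
}

func (m *mempool) CheckTx(tx []byte) {
	m.updateMtx.RLock() // excludes Update, but NOT a concurrent Reap
	defer m.updateMtx.RUnlock()
	m.listMtx.Lock()
	m.txs = append(m.txs, tx) // appends at the tail
	m.listMtx.Unlock()
}

func (m *mempool) Reap(max int) [][]byte {
	m.updateMtx.RLock() // excludes Update, but NOT a concurrent CheckTx
	defer m.updateMtx.RUnlock()
	m.listMtx.Lock()
	defer m.listMtx.Unlock()
	// Non-mutative scan from the head; missing the very latest append is fine.
	n := len(m.txs)
	if n > max {
		n = max
	}
	return m.txs[:n:n]
}

func (m *mempool) Update(committed int) {
	m.updateMtx.Lock() // excludes both CheckTx and Reap
	defer m.updateMtx.Unlock()
	m.txs = m.txs[committed:] // only Update removes txs
}

func main() {
	m := &mempool{}
	m.CheckTx([]byte("a"))
	m.CheckTx([]byte("b"))
	fmt.Println(len(m.Reap(10))) // 2
	m.Update(1)
	fmt.Println(len(m.txs)) // 1
}
```

With this shape, a flood of CheckTxs never blocks Reap on the coarse mutex; only Update serializes against everything, which matches the constraint stated above.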
This was exploited during GoS. With a mempool of
Note that our mempool grew very large because we set min fees in
If we expect to have a large mempool on mainnet (thousands of txs), which I think is a reasonable expectation, this indeed needs to be fixed.