feat(mempool): revert transactions after 3 failures#1832
feat(mempool): revert transactions after 3 failures#1832Mirko-von-Leipzig wants to merge 3 commits intomirko/mempool-v4from
Conversation
0ded389 to
ccecfa5
Compare
| } | ||
| let failed_txs = block | ||
| .iter() | ||
| .flat_map(|batch| batch.transactions().as_slice().iter().map(TransactionHeader::id)); |
There was a problem hiding this comment.
Would it be possible / desirable for us to pinpoint which transactions actually failed rather than treating all as failed?
There was a problem hiding this comment.
iiuc this is supposed to only occur if there is a bug in the proving system, or one of the kernels, or the batch was invalid. As in, this is a really unexpected outcome once we are at this stage.
But maybe there is some information that could be surfaced. cc @PhilippGackstatter
There was a problem hiding this comment.
I'm missing / have forgotten the context in what scenarios we get here, could you remind me?
There was a problem hiding this comment.
We select transactions for inclusion in a batch, and select batches for inclusion in a block. The mempool ensures that this selection process is coherent and that the state should be in order.
However, in the scenario where a batch or block fails, we want to prevent a buggy transaction from halting the network e.g. if it causes every batch/block it is in to fail then we want to evict that transaction and investigate wtf later.
How or why this could happen is.. likely only due to an internal bug. I think either in:
- mempool selection / DAG
- batch kernel
- block kernel
- node database corruption
Since we cannot (trivially? at all?) pinpoint which exact transaction, or set thereof, triggered this, the mempool assigns each participating transaction a strike, and after three strikes they're evicted.
There was a problem hiding this comment.
Thanks!
Since we cannot (trivially? at all?) pinpoint which exact transaction, or set thereof, triggered this, the mempool assigns each participating transaction a strike, and after three strikes they're evicted.
Yeah, I agree. I think each transaction in the mempool is internally consistent and valid from the protocol perspective, so we can indeed not really pinpoint which transaction was the issue by looking at them. So, I think the current strategy is coarse but the best thing we can do - afaict.
igamigo
left a comment
There was a problem hiding this comment.
LGTM! Left a couple of minor comments
7706fc9 to
f01d124
Compare
This PR changes the mempool behaviour when rolling back blocks and batches. Transactions are now reverted once they have participated in more than three failed batches and blocks.
The current behaviour is that transactions are never reverted for batch failures, and always reverted for block failures.
The intention here is to weed out transactions which cause these failures e.g. due to a bug in the prover/mempool/node. The new implementation is a bit more forgiving and gives innocent bystander transactions a couple of chances to avoid the buggy one.
The limit of three is arbitrary, but I also don't see a good reason to make this more configurable. Fight me :)
This PR builds on the mempool refactor in #1820.
Closes #594