Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Transaction is removed from Mempool cache if CheckTX result != 0 #5751

Closed
p4u opened this issue Dec 6, 2020 · 12 comments · Fixed by #5813
Closed

Transaction is removed from Mempool cache if CheckTX result != 0 #5751

p4u opened this issue Dec 6, 2020 · 12 comments · Fixed by #5813
Labels
C:mempool Component: Mempool
Milestone

Comments

@p4u
Copy link
Contributor

p4u commented Dec 6, 2020

Tendermint version (use tendermint version or git rev-parse --verify HEAD if installed from source):

v0.34.0

ABCI app (name for built-in, URL for self-written if it's publicly available):

https://github.com/vocdoni/go-dvote

What happened:

If a transaction executed by checkTX returns an error code value (different from 0), the transaction is removed from the mempool cache as can be shown here: https://github.com/tendermint/tendermint/blob/master/mempool/clist_mempool.go#L453

That opens the door to spam/flood the tendermint application with invalid transactions, since the cache layer will let it pass always. The flow would be something like this: Invalid transaction -> Cache -> CheckTx -> Remove from Cache -> Receive it again -> Cache -> CheckTx -> etc..

As far as I understand the mempool cache objective is to avoid this kind of problems, when a transaction is considered invalid or already executed it remains in the cache in order not let it go into the mempool again. So IMO this remove does not have much sense. In order to avoid a future-valid-transaction enter the mempool again, a TTL could be used for purging the cache.

What you expected to happen:

If this behavior does not want to be changed, the mempool implementation should export some method to enable/disable this cache removal, so the application developer might choose what to do.

p4u added a commit to vocdoni/tendermint that referenced this issue Dec 6, 2020
@p4u
Copy link
Contributor Author

p4u commented Dec 6, 2020

What do you think about something like this?
vocdoni@8692cf8

p4u added a commit to vocdoni/vocdoni-node that referenced this issue Dec 6, 2020
p4u added a commit to vocdoni/tendermint that referenced this issue Dec 6, 2020
@melekes
Copy link
Contributor

melekes commented Dec 7, 2020

I'm in favor of adding TTLs. Related tendermint/spec#154

@melekes melekes added the C:mempool Component: Mempool label Dec 7, 2020
p4u added a commit to vocdoni/tendermint that referenced this issue Dec 8, 2020
@p4u
Copy link
Contributor Author

p4u commented Dec 8, 2020

@melekes that's nice, however this thread seems to be kind of stopped. So I am not sure when this new approach will be available. Would it be fine for you meanwhile to accept my modification in order to let the user decide about the cache removal?

@tessr
Copy link
Contributor

tessr commented Dec 8, 2020

this thread seems to be kind of stopped

It's been all of 29 hours and we're only human 😅

Anyways, thanks for all the info in Discord and for your diligent debugging. Again, I think we can patch this without waiting for our full upcoming mempool refactor.

Going to capture a few notes from that conversation here:

  • It seems the problem is that the cache is not preventing bad transactions from getting re-added to the mempool (as opposed to a problem where transactions are never clearing the mempool itself)
  • Root cause of this is likely that checkTx removes transactions from the cache if they fail, meaning that bad transactions can get readded again
  • Possible fix: Keep transactions in the cache even if they fail checkTx
  • Do we need to worry about an ever-growing cache? Maybe, but cache implementation drops the first element added once it fills up:
    if cache.list.Len() >= cache.size {
    (This is probably not optimal but it could be OK.)
  • See also:
    mem.cache.Remove(tx)

@tessr tessr added this to Untriaged 🥚 in Tendermint Core Project Board via automation Dec 8, 2020
@tessr tessr moved this from Untriaged 🥚 to To Do 📝 in Tendermint Core Project Board Dec 8, 2020
@tessr tessr added this to the v0.34.1 milestone Dec 8, 2020
p4u added a commit to vocdoni/tendermint that referenced this issue Dec 8, 2020
@p4u
Copy link
Contributor Author

p4u commented Dec 8, 2020

It's been all of 29 hours and we're only human sweat_smile

I was talking about the link that @melekes shared with me regarding the mempool refactor.

Thanks for your summary and comments @tessr :)

@melekes
Copy link
Contributor

melekes commented Dec 9, 2020

What do you think about something like this?
vocdoni@750e289

The more correct way to fix this may be introducing the PermanentlyInvalid flag in ResponseCheckTx, which, if true, indicates that the tx will never be valid and hence should remain in cache. Comparing to the above config setting, it enables more granular control over what txs are evicted from the cache vs not.

I would also like to remind you that mempool cache is the first layer of protection against replay attacks, but since it's limited in size, ABCI application must also implement replay protection.

@p4u
Copy link
Contributor Author

p4u commented Dec 9, 2020

Agree that's a better approach @melekes
But I'd not call it PermanentlyInvalid since once the cache is full the transaction could be purged and added back to the mempool. I'd propose KeepInCache or something like that. Would you like me to send a PR?

I would also like to remind you that mempool cache is the first layer of protection against replay attacks, but since it's limited in size, ABCI application must also implement replay protection.

Yes... I have been reminded several times for the replay protection. We do have it implemented :)

@alexanderbez
Copy link
Contributor

Let's keep this in mind when considering the mempool refactor redesign. I don't think keeping it in the cache is a viable idea also changing ABCI might seem overkill (but maybe that is the preferred solution?).

@p4u
Copy link
Contributor Author

p4u commented Dec 9, 2020

Here a clear example that shows the point of this issue:

[mempool] Added good transaction        {"tx": "A23E764BB9F4B0883373B0314E07AC23207495695D097480E2E444D014A32FB4", "res": {"check_tx":{"code":0,"data":"","log":"","info":"","gas_wanted":"0","gas_used":"0","events":[],"codespace":""}}, "height": 1850, "total": 1}
[tendermint] Could not check tx {"tx": "A23E764BB9F4B0883373B0314E07AC23207495695D097480E2E444D014A32FB4", "err": "tx already exists in app cache"}
[tendermint] Could not check tx {"tx": "A23E764BB9F4B0883373B0314E07AC23207495695D097480E2E444D014A32FB4", "err": "tx already exists in app cache"}
[tendermint] Could not check tx {"tx": "A23E764BB9F4B0883373B0314E07AC23207495695D097480E2E444D014A32FB4", "err": "tx already exists in app cache"}
[tendermint] Could not check tx {"tx": "A23E764BB9F4B0883373B0314E07AC23207495695D097480E2E444D014A32FB4", "err": "tx already exists in app cache"}

That's a Tendermint network of 6 nodes. So the first time the TX enters the mempool it is OK. But then the same transaction is received 4 more times from other p2p nodes. So the checkTx is executed 5 times. Our replay protection will prevent it to go further but that's not a good strategy IMO.... The Tendermint cache should prevent the application from this, only in case the cache is Full this behavior should be possible.

@p4u
Copy link
Contributor Author

p4u commented Dec 10, 2020

@alexanderbez as far as I understand, since you are now using protobuf for types, we don't break backwards compatibility so ABCI should not be affected.

@melekes I'd need some help with protobuf. I am not sure how to compile it... Maybe it's worth you make it instead of me?

diff --git a/proto/tendermint/abci/types.proto b/proto/tendermint/abci/types.proto
index 91ac34c04..4e08975f7 100644
--- a/proto/tendermint/abci/types.proto
+++ b/proto/tendermint/abci/types.proto
@@ -139,6 +139,7 @@ message Response {
     ResponseOfferSnapshot      offer_snapshot       = 13;
     ResponseLoadSnapshotChunk  load_snapshot_chunk  = 14;
     ResponseApplySnapshotChunk apply_snapshot_chunk = 15;
+    ResponseKeepTxInCache      keep_tx_in_cache     = 16;
   }
 }
 
@@ -262,6 +263,10 @@ message ResponseApplySnapshotChunk {
   }
 }
 
+message ResponseKeepTxInCache {
+  bool keepInCahe = 1;
+}
+
 //----------------------------------------
 // Misc.

@tac0turtle
Copy link
Contributor

tac0turtle commented Dec 10, 2020

as far as I understand, since you are now using protobuf for types, we don't break backwards compatibility so ABCI should not be affected.

I'm wary of making additions like this at the abci level primarily because it sets a precedent. This is an implementation detail or lack there of that we are pushing into the abci level. This also requires specification updates in tendermint/spec

Maybe the config approach is better if it really needs to happen in this codebase. The config is due for a cleanup so it would be easy to hide something in there that won't affect users.

I'd need some help with protobuf. I am not sure how to compile it.

There is a make command that uses docker: make proto-gen-docker.

@p4u
Copy link
Contributor Author

p4u commented Dec 10, 2020

I undrestand @marbar3778

I think that since the whole mempool needs to be probably rewritten, an approach such as the config flag should be enough for now. At least for us, so +1

p4u added a commit to vocdoni/tendermint that referenced this issue Dec 15, 2020
p4u added a commit to vocdoni/tendermint that referenced this issue Dec 19, 2020
If set to true, an invalid transaction will be kept in cache in order to
avoid processing it again.

see tendermint#5751

Signed-off-by: p4u <pau@dabax.net>
melekes pushed a commit that referenced this issue Dec 21, 2020
If set to true, an invalid transaction will be kept in cache in order to
avoid processing it again.

see #5751

Signed-off-by: p4u <pau@dabax.net>
Tendermint Core Project Board automation moved this from To Do 📝 to Done 🎉 Dec 21, 2020
melekes added a commit that referenced this issue Dec 21, 2020
When set to true, an invalid transaction will be kept in the cache (this may help some applications to protect against spam).

NOTE: this is a temporary config option. The more correct solution would be to add a TTL to each transaction (i.e. CheckTx may return a TTL in ResponseCheckTx).

Closes: #5751
melekes added a commit that referenced this issue Dec 21, 2020
When set to true, an invalid transaction will be kept in the cache (this may help some applications to protect against spam).

NOTE: this is a temporary config option. The more correct solution would be to add a TTL to each transaction (i.e. CheckTx may return a TTL in ResponseCheckTx).

Closes: #5751
melekes added a commit that referenced this issue Dec 21, 2020
When set to true, an invalid transaction will be kept in the cache (this may help some applications to protect against spam).

NOTE: this is a temporary config option. The more correct solution would be to add a TTL to each transaction (i.e. CheckTx may return a TTL in ResponseCheckTx).

Closes: #5751
cmwaters pushed a commit that referenced this issue Jan 4, 2021
When set to true, an invalid transaction will be kept in the cache (this may help some applications to protect against spam).

NOTE: this is a temporary config option. The more correct solution would be to add a TTL to each transaction (i.e. CheckTx may return a TTL in ResponseCheckTx).

Closes: #5751
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
C:mempool Component: Mempool
Projects
No open projects
5 participants