Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

prevent stuck transactions by async locking nonce sequencing (+ estimate gas) #55

Merged
merged 10 commits into from
Oct 24, 2023

Conversation

emizzle
Copy link
Contributor

@emizzle emizzle commented Sep 27, 2023

During populateTransaction, a new nonce is obtained using signer.getNonce. After this, the transaction's gas is estimated, often times resulting in a failure because the simulated transaction failed (due to a contract revert, nonce too high/low, gas error). The next transaction created will have a nonce that is one too high, resulting in that transaction getting stuck until the previous transaction is executed. To solve this, the previous transaction can be cancelled by sending a 0-value transaction to the sender of the transaction with the same nonce as the previous transaction. This will ensure that any queued transactions with higher nonces will be able to be executed.

Copy link
Member

@AuHau AuHau left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This also LGTM, but again I would live approve to @markspanbroek

Copy link
Member

@markspanbroek markspanbroek left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd like to avoid sending transactions that were not specifically asked for by the user of the library. To solve the problem of stuck transactions, I think we should either:

  1. ensure that the situation never occurs
  2. signal when this occurs, and let the user of the library decide what to do with it (e.g. cancel the transaction)

If we go for option (2), then we could introduce a specific error that is only raised when a transaction will get stuck, and expose a means of handling it (e.g. the cancelTransaction() proc from this PR)

But thinking about this a bit more, we might actually be able to pull off option (1). A transaction can get stuck if there are multiple concurrent invocations of populateTransaction, whereby the nonce is increased a couple of times, but one of the invocations fails on estimateGas. If we ensure that we handle a single populateTransaction at a time, and never do this concurrently, then I think we will not run into this problem anymore. Either an AsyncLock or an AsyncQueue can probably be used to ensure that increasing the nonce and estimating gas either both succeed, or both fail.

In pseudocode:

try:
  await lock.acquire()
  populated.nonce = await signer.getNonce()
  populated.gasLimit = await signer.estimateGas()
except EstimateGasError:
  signer.decreaseNonce()
  raise
finally:
  await lock.release()

ethers/contract.nim Outdated Show resolved Hide resolved
ethers/contract.nim Outdated Show resolved Hide resolved
ethers/signer.nim Outdated Show resolved Hide resolved
ethers/testing.nim Outdated Show resolved Hide resolved
@emizzle
Copy link
Contributor Author

emizzle commented Oct 18, 2023

I'm not sure if that will solve the issue because the problem occurs when sendTransaction is called concurrently, which will increase the nonce for each transaction, and queue any transactions where there is a nonce gap. Adding an async lock would prevent concurrency in populating the transaction (and nonce), but it would not wait for the transaction to be sent, nor do I think that's something we want.

I think we can come up with a version of #2, where something like a StuckTransactionError is raised in sendTransaction which would allow the user to call cancelTransaction. Alternatively or in addition to, we could add a parameter to sendTransaction that enables cancelling potentially stuck transactions instead automatically.

@emizzle
Copy link
Contributor Author

emizzle commented Oct 18, 2023

Actually thinking about this further, async locking populateTransaction might actually work, because of the decreaseNonce (which I hadn't really thought through the first time I looked).

- remove auto-cancellation of failed transaction (failed during estimate gas) to prevent stuck txs
- replace it with an async lock during nonce sequencing + gas estimation
- simplified cancelTransaction (still exported) such that the new transaction is populated using populateTransaction, so that all gas and fees are reset
- moved reverting contract function into its own testing helpers module, and refactored any tests to use it
- updated the test helper reverts to check EstimateGasErrors
@emizzle emizzle changed the title cancel transaction after estimateGas failure prevent stuck transactions by async locking nonce sequencing (+ estimate gas) Oct 20, 2023
@emizzle
Copy link
Contributor Author

emizzle commented Oct 20, 2023

Ready for another review 👍

Copy link
Member

@markspanbroek markspanbroek left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @emizzle, this is great!
I have a few small remarks, but otherwise good to go!

Comment on lines 6 to 12
import pkg/stew/byteutils
import ./basics
import ./provider
import ./signer
import ./events
import ./fields
import ./exceptions
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are these imports still needed? I see no changes anymore in this file?

@@ -0,0 +1,7 @@
import ./basics

func msgStack*(error: ref EthersError): string =
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Doesn't seem to be used anymore?

@@ -3,6 +3,7 @@ import std/tables
import std/uri
import pkg/json_rpc/rpcclient
import pkg/json_rpc/errors
import pkg/stew/byteutils
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unused import?

Comment on lines 3 to 9
import pkg/chronicles

export basics
export chronicles

logScope:
topics = "ethers signer"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's no logging anymore in this PR, so this can probably also be removed?

## nonce is decremented. Disallows concurrent calls by use of AsyncLock.
## Immediately returns if tx already has a nonce or gasLimit.

if not (tx.nonce.isNone and tx.gasLimit.isNone):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Although equivalent, this reads a little better, I think:

Suggested change
if not (tx.nonce.isNone and tx.gasLimit.isNone):
if tx.nonce.isSome or tx.gasLimit.isSome:

Comment on lines 132 to 143
var populated = await signer.ensureNonceSequence(transaction)

if transaction.sender.isNone:
if populated.sender.isNone:
populated.sender = some(await signer.getAddress())
if transaction.nonce.isNone:
populated.nonce = some(await signer.getNonce())
if transaction.chainId.isNone:
if populated.chainId.isNone:
populated.chainId = some(await signer.getChainId())
if transaction.gasPrice.isNone and (transaction.maxFee.isNone or transaction.maxPriorityFee.isNone):
if populated.gasPrice.isNone and (populated.maxFee.isNone or populated.maxPriorityFee.isNone):
populated.gasPrice = some(await signer.getGasPrice())
if transaction.gasLimit.isNone:
populated.gasLimit = some(await signer.estimateGas(populated))
if populated.nonce.isNone:
populated.nonce = some(await signer.getNonce())
if populated.gasLimit.isNone:
populated.gasLimit = some(await signer.estimateGas(populated))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It feels a bit weird that we're estimating gas before setting all properties, such as gasLimit and sender.
I would suggest to either do the call to ensureNonceSequence last, or to combine populateTransaction and ensureNonceSequence into a single method.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fair point. I've merged ensureNonceSequence into populateTransaction.

@emizzle emizzle merged commit 7eac841 into main Oct 24, 2023
4 checks passed
@emizzle emizzle deleted the feat/cancel-tx-on-estimateGas-failure branch October 24, 2023 23:42
Comment on lines +117 to +125
try:
populated.nonce = some(await signer.getNonce())
populated.gasLimit = some(await signer.estimateGas(populated))
except ProviderError, EstimateGasError:
let e = getCurrentException()
signer.decreaseNonce()
raise e
finally:
signer.populateLock.release()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The try:finally should be around all the statements following lock.acquire(). If one of the getAddress() or getChainId() or ... fails, then the lock is stuck forever.

Copy link
Contributor Author

@emizzle emizzle Oct 26, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in #58

@@ -73,17 +95,55 @@ method populateTransaction*(signer: Signer,
if chainId =? transaction.chainId and chainId != await signer.getChainId():
raiseSignerError("chain id mismatch")

if signer.populateLock.isNil:
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You want to initialize this in the constructor.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What happened to this @emizzle ?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants