feat: Gas Utils for L1 operations#9834
Conversation
There was a problem hiding this comment.
The scripts look good, but I think the gas utils have a few issues mostly around race conditions and maxFee vs maxPrioFee. We could split up this PR to get the scripts in, while we keep iterating on the gas utils.
Aside from the issues, I'm worried the API for using it makes it much more verbose. It'd be good to have an API where we just pass in the result of simulate, and then the util handles the first send, the monitoring, the retries, etc. Something like:
const { request } = await this.rollupContract.simulate.submitEpochRootProof(txArgs);
return await waitWithRetries(request, walletClient, gasOpts);We should also tweak the e2e test (which maybe we can degrade to a "unit" test that calls startAnvil, using a lower block time?) so that it exercises more scenarios: priority fee spiking instead of just base fee spiking, tx getting dropped (anvil_dropTransaction cheat code is your friend), an older replacement tx getting mined, etc.
| if (isAnvilTestChain(chainId)) { | ||
| dualLog(`Funding validator on L1`); | ||
| const cheatCodes = new EthCheatCodes(rpcUrl, debugLogger); | ||
| await cheatCodes.setBalance(validatorAddress, 10n ** 20n); | ||
| } |
There was a problem hiding this comment.
Should we check balance if we're not on anvil, and emit a noisy log remembering the user to fund the address manually?
| .addOption(pxeOption) | ||
| .description('Gets the information of an Aztec node from a PXE or directly from an Aztec node.') | ||
| .option('--node-url <string>', 'URL of the node.', `http://${LOCALHOST}:8080`) | ||
| .option('--from-node', 'Get the info directly from an Aztec node.', false) |
There was a problem hiding this comment.
Why not decide based on whether node-url or pxe-url is set, instead of having this flag?
There was a problem hiding this comment.
I realize now that's what it actually does in the action below, I forgot to remove this option
| export DEBUG=${DEBUG:-"aztec:*,-aztec:avm_simulator*,-aztec:libp2p_service*,-aztec:circuits:artifact_hash,-json-rpc*,-aztec:l2_block_stream,-aztec:world-state:*"} | ||
| export ETHEREUM_HOST="http://127.0.0.1:8545" | ||
| export ETHEREUM_HOST=${ETHEREUM_HOST:-"http://127.0.0.1:8545"} | ||
| export IS_ANVIL=${IS_ANVIL:-"true"} |
There was a problem hiding this comment.
You can use something along these lines (with a try/catch!) to query whether it is anvil or not, so we don't risk forgetting it:
curl -s -H "Content-Type: application/json" -X POST --data '{"method":"web3_clientVersion","params":[],"id":49,"jsonrpc":"2.0"}' $ETHEREUM_HOST | jq .result | grep -q anvil
| P2P_PORT=$((40401 + i)) | ||
| IDX=$((i + 1)) | ||
|
|
||
| # Use empty string as default if variables are not set |
There was a problem hiding this comment.
Wouldn't it be safer to fail loudly if the private key is not set?
There was a problem hiding this comment.
we don't actually want to fail if it's not set, these would only be set when running against public networks like Sepolia where we need to use funded accounts.
In local testing the keys & addresses are later generated in validator.sh and funded when doing add-l1-validator
There was a problem hiding this comment.
Makes sense! Could you add a comment with that, so next reader doesn't fall into the same?
| // Transaction not found and enough time has passed - might be stuck | ||
| if (Date.now() - lastSeen > config.stallTimeMs && attempts < config.maxAttempts) { |
There was a problem hiding this comment.
IIUC this bumps gas price if nodes drop the tx altogether, but not if it just sits in the mempool without being mined. We should add another condition for that.
| return retry( | ||
| async () => { | ||
| const gas = await estimator(); | ||
| return gas + (gas * this.gasConfig.bufferPercentage) / 100n; | ||
| }, | ||
| 'gas estimation', | ||
| makeBackoff([1, 2, 3]), | ||
| this.logger, | ||
| false, | ||
| ); |
There was a problem hiding this comment.
Why the retry here as opposed to on other rpc calls? If we are seeing flaky rpc endpoints where we need to retry, we should tackle that at a lower level (on viem's "transport" most likely).
| this.logger?.info(`Transaction ${currentTxHash} pending`); | ||
| if (tx) { | ||
| lastSeen = Date.now(); | ||
| await new Promise(resolve => setTimeout(resolve, config.checkIntervalMs)); |
There was a problem hiding this comment.
| await new Promise(resolve => setTimeout(resolve, config.checkIntervalMs)); | |
| await sleep(config.checkIntervalMs); |
| const tx = await this.publicClient.getTransaction({ hash: currentTxHash }); | ||
| this.logger?.info(`Transaction ${currentTxHash} pending`); |
There was a problem hiding this comment.
You can do the same trick as with the receipt here. Instead of doing getTx on every hash you've sent, you can query getTxCount using the pending tag, which should consider the txs that are sitting on the mempool for the given sender.
| data: originalTx.data, | ||
| nonce: originalTx.nonce, | ||
| gas: originalTx.gasLimit, | ||
| maxFeePerGas: newGasPrice, |
There was a problem hiding this comment.
I'm not sure if bumping only maxFeePerGas without touching maxPriorityFeePerGas has the intended effect. If maxPriorityFeePerGas is not set, then viem defaults it to either a chain default or to whatever the json rpc interface returns (see here). So we're not actually bumping the priority fee, we're just setting it to whatever nodes say is the latest value to use, and ignoring the gasPriceIncrease.
Maybe we should be setting maxPriorityFeePerGas as well? I'm thinking we could just always set maxFeePerGas to gasConfig.maxGwei, and then just handle stuck txs by bumping the maxPriorityFeePerGas, since the user never gets charged more than baseFee + priorityFee. We could get the max prio fee from either eth_maxPriorityFeePerGas RPC as viem does, or from a gas price oracle like BlockNative. Then, for every successive retry, we bump the maxPrioFee to the new eth_maxPriorityFeePerGas we see, or to a multiplier of the previous prio fee, whatever's greater.
|
I guess you're not going to choose me as reviewer in the future, sorry for so throwing in many comments. |
68af7ec to
f96b1d8
Compare
spalladino
left a comment
There was a problem hiding this comment.
New API looks a lot cleaner! I left a few more comments after digging through geth.
| "start": "node ./dest/index.js", | ||
| "test": "NODE_NO_WARNINGS=1 node --experimental-vm-modules ../node_modules/.bin/jest --passWithNoTests" | ||
| "test": "NODE_NO_WARNINGS=1 node --experimental-vm-modules ../node_modules/.bin/jest --passWithNoTests", | ||
| "gas": "node ./dest/test_gas_stuff.js" |
| export const deployerAbi = [ | ||
| { | ||
| inputs: [ | ||
| { internalType: 'bytes32', name: 'salt', type: 'bytes32' }, | ||
| { internalType: 'bytes', name: 'init_code', type: 'bytes' }, | ||
| ], | ||
| name: 'deploy', | ||
| outputs: [{ internalType: 'address', name: '', type: 'address' }], | ||
| stateMutability: 'nonpayable', | ||
| type: 'function', | ||
| }, | ||
| ]; |
There was a problem hiding this comment.
Are we using this abi at all? IIRC we had found that the deployer was not abi-compliant, right?
| let publicClient: any; | ||
| let walletClient: any; |
There was a problem hiding this comment.
| let publicClient: any; | |
| let walletClient: any; | |
| let walletClient: WalletClient<HttpTransport, Chain, Account>, | |
| let publicClient: PublicClient<HttpTransport, Chain>, |
Viem can be super annoying with types.
| // Reset base fee | ||
| await publicClient.transport.request({ | ||
| method: 'anvil_setNextBlockBaseFeePerGas', | ||
| params: [initialBaseFee.toString()], | ||
| }); | ||
| await publicClient.transport.request({ | ||
| method: 'evm_mine', | ||
| params: [], | ||
| }); |
There was a problem hiding this comment.
Should we move this to an afterEach block?
| // Get initial priority fee from the network | ||
| let priorityFee = await this.publicClient.estimateMaxPriorityFeePerGas(); | ||
| if (attempt > 0) { | ||
| // Bump priority fee by configured percentage for each attempt | ||
| priorityFee = (priorityFee * (100n + (gasConfig.priorityFeeBumpPercentage ?? 20n) * BigInt(attempt))) / 100n; | ||
| } |
There was a problem hiding this comment.
Heads up that this may return a priorityFee that's lower than the one used in the previous attempt, if there was a sudden drop in the current max priority fee estimated by the public client.
If that happens, the node will reject the tx replacement as "underpriced" when we try to send it. Not just that, but the node also requires that there is a minimum bump.
There was a problem hiding this comment.
Looking at geth's code, there is a minimum bump enforced for replacements for both the priority fee and the max fee per gas. So we'll need to ensure both are bumped wrt the previous tx we had sent. This means we cannot set the maxFee to the config maximum as I had suggested originally (apologies) since we would not be able to bump it at all if that were the case.
There was a problem hiding this comment.
And to make things even more annoying, it seems the replacement threshold is different for blob and regular txs.
There was a problem hiding this comment.
So, to consolidate: we need to at least bump both the maxFeePerGas and the maxPrioFeePerGas by a percentage wrt the previous tx we had sent (ie the one we want to replace), but also we want to take into account the current base fee and priority fee we see on the chain.
| let attempts = 0; | ||
| let lastSeen = Date.now(); | ||
|
|
||
| while (true) { |
There was a problem hiding this comment.
Seems like we're missing a global timeout where we just give up and throw?
| */ | ||
| private async estimateGas(account: Account, request: L1TxRequest, _gasConfig?: L1TxUtilsConfig): Promise<bigint> { | ||
| const gasConfig = { ...this.config, ..._gasConfig }; | ||
| const initialEstimate = await this.publicClient.estimateGas({ account, ...request }); |
There was a problem hiding this comment.
Viem throws here if the tx would revert (or so I believe). It'd be good if we catch that error, extract the revert reason (if possible), and throw it back to the caller in an easily digestable format. There's a prettyLogViemErrorMsg we could use for that.
There was a problem hiding this comment.
it does and it's caught by the propose methods in l1-publisher. I've made #10066 to track this as there's a larger need for consistency in how we handle the error in this case so it's properly recorded in metrics
| // Spike gas price to 3x the initial base fee | ||
| await publicClient.transport.request({ | ||
| method: 'anvil_setNextBlockBaseFeePerGas', | ||
| params: [(initialBaseFee * 3n).toString()], | ||
| }); |
There was a problem hiding this comment.
Can we add another test where we spike the priority fee, instead of the base one? Not sure if anvil supports that.
| // @note We perform this guesstimate instead of the usual `gasEstimate` since | ||
| // viem will use the current state to simulate against, which means that | ||
| // we will fail estimation in the case where we are simulating for the | ||
| // first ethereum block within our slot (as current time is not in the | ||
| // slot yet). | ||
| const gasGuesstimate = computeTxsEffectsHashGas + gasGuess; |
There was a problem hiding this comment.
How did you manage to work around this..?
There was a problem hiding this comment.
I've dded an option to add a fixed buffer when sending a tx so we're still adding the same amount to whatever estimate the tx gets.
There was a problem hiding this comment.
But the catch here is that the gas estimation is done on a different function than the one actually called in the transaction, because the called one would revert.
There was a problem hiding this comment.
huh so it should be failing? I haven't seen this revert at all 🤔
There was a problem hiding this comment.
It should, yeah. Maybe it's working just because of timings, things being slow enough that we don't end up on the first ethereum block?
| // Bump both priority fee and max fee by at least 10% | ||
| priorityFee = (priorityFee * (100n + bumpPercentage * BigInt(attempt))) / 100n; | ||
| maxFeePerGas = (maxFeePerGas * (100n + bumpPercentage * BigInt(attempt))) / 100n; |
There was a problem hiding this comment.
Heads up I think this will not yield the correct bump for the second retry. If the first tx is sent with 100, the first replacement must be sent with 110, and the next replacement will need to go in with 121 (110 * 1.1), not 120.
There was a problem hiding this comment.
Also, this is applying a bump on the current values fetched from the chain, not on the values sent with the previous attempt. I think our logic should be sending a retry with the maximum between the current chain values (bumped by a little extra) and the last attempt (bumped by 10% or whatever % we config). And then ensure that is capped by maxGwei as mentioned above.
|
|
||
| // Get initial priority fee from the network | ||
| let priorityFee = await this.publicClient.estimateMaxPriorityFeePerGas(); | ||
| let maxFeePerGas = gasConfig.maxGwei! * WEI_CONST; |
There was a problem hiding this comment.
This means that the user specified maxGwei will not be the maximum, but the value used for the first attempt.
I'd suggest calculating the maxFeePerGas as baseFee (fetched from the getBlock with some padding, keeping in mind that baseFee is capped to increase no more than 12.5% per block, so based on how fast you plan to replace you know up to how much base fee can climb until then) plus whatever you choose as priority fee. Then, after all bumps, check if you've hit max, and if so, bail.
There was a problem hiding this comment.
right, first part makes sense as we end up bumping the specified maxGwei
confused about the 2nd part with the 12.5% increase. We run this on each attempt so it will fetch the new block's baseFee, why do we need to calculate this 12.5% max increase per block?
There was a problem hiding this comment.
Sorry, re-reading my comment I see I wasn't very clear. What I meant is the following: let's say base fees are increasing consistently block over block, and the priority fee is zero (for simplicity). So if block N is at a base fee of 100, then:
N: 100
N+1: 112
N+2: 127
N+3: 142
Now, let's say that stallTime is 24s (2 L1 blocks). If you sent the tx at block N with a base fee of exactly 100, then your tx will not be elligible for inclusion at N+1, and you'll have to wait until N+2 to read the new base fee and resubmit.
If instead you submit your tx at N with a base fee of 127, you know that your tx will always be elligible until the time you retry, at which point you just bump the priority fee (and the total fee to make geth happy). This way you guarantee that you don't "miss" the opportunity of going in N+1.
| const baseFee = block.baseFeePerGas ?? 0n; | ||
|
|
||
| // Get initial priority fee from the network | ||
| let priorityFee = await this.publicClient.estimateMaxPriorityFeePerGas(); |
There was a problem hiding this comment.
WDYT about throwing in a little extra, to make sure we're picked?
There was a problem hiding this comment.
ah yes, that was the initial method and then ended up under if (attempt > 0)
There was a problem hiding this comment.
hmm actually no, it was the gas cost that's bumped on each attempt, not the fee
There was a problem hiding this comment.
I guess I wonder if we should be bumping by the defined priorityFeeBumpPercentage or a small fixed percentage, like 3%
There was a problem hiding this comment.
I'd avoid hardcoding anything. Gas prices can fluctuate so much that constants become a pain. I recall hardcoding a max fee of 1000gwei in a system at a time where the average fee was below 1, and a few months later it shot past 1500. So I'd say we go with that bump percentage, or introduce a new config variable for that.
By the way, priority fees (these days) are low enough that even a 100% bump over what's reported is reasonable.
| const startAnvil = async (l1BlockTime?: number) => { | ||
| const ethereumHostPort = await getPort(); | ||
| const rpcUrl = `http://127.0.0.1:${ethereumHostPort}`; | ||
| const anvil = createAnvil({ | ||
| port: ethereumHostPort, | ||
| blockTime: l1BlockTime, | ||
| }); | ||
| await anvil.start(); | ||
| return { anvil, rpcUrl }; | ||
| }; |
There was a problem hiding this comment.
There's a import { startAnvil } from '@aztec/ethereum/test' now that should take care of all this
| }); | ||
| }); | ||
| afterAll(async () => { | ||
| // disabling interval mining as it seems to cause issues with stopping anvil |
There was a problem hiding this comment.
Hah! I didn't know that was it! Good find!
| await publicClient.transport.request({ | ||
| method: 'anvil_setNextBlockBaseFeePerGas', | ||
| params: [initialBaseFee.toString()], | ||
| }); |
There was a problem hiding this comment.
WDYT about hiding these methods behind EthCheatCodes?
| // Verify that a replacement transaction was created | ||
| expect(receipt.transactionHash).not.toBe(txHash); |
There was a problem hiding this comment.
I'd add an assertion that the gas price was indeed bumped, just for extra safety
| // Add priority fee to maxFeePerGas | ||
| maxFeePerGas += priorityFee; |
There was a problem hiding this comment.
This should happen before the maxFeePerGas = maxFeePerGas > minMaxFee ? maxFeePerGas : minMaxFee, since the minMaxFee already has the priority fee component accounted for.
Otherwise, if maxFeePerGas gets set to minMaxFee and then we add priorityFee on top of it, we'd be accounting for the priorityFee twice.
Fixes #9833
Also
getNodeInfotoAztecNodeinterface