Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Exhausted speculative trx not retried #1247

Closed
heifner opened this issue Jun 7, 2023 · 4 comments · Fixed by #1319 or #1337
Closed

Exhausted speculative trx not retried #1247

heifner opened this issue Jun 7, 2023 · 4 comments · Fixed by #1319 or #1337
Assignees
Labels
bug Something isn't working 👍 lgtm
Milestone

Comments

@heifner
Copy link
Member

heifner commented Jun 7, 2023

When running in non-default --read-mode speculative, an exhausted trx is not retried until a block is received. A test failure where no new blocks arrived illustrated this: https://github.com/AntelopeIO/leap/actions/runs/5202501518

This condition should not be hit for read-mode=head as an exhausted trx is an exhausted block as the billed cpu/net are reset at the end of each trx.

Currently we have:

                                       if (!self->process_incoming_transaction_async(result, api_trx, return_failure_traces, next)) {
                                          if (self->_pending_block_mode == pending_block_mode::producing) {
                                             self->schedule_maybe_produce_block(true);
                                          } else {
                                             self->restart_speculative_block();
                                          }
                                       }

Where process_incoming_transaction_async returns false for an exhausted block. Instead process_incoming_transaction_async should check self->_pending_block_mode == pending_block_mode::producing and return false if exhausted block and producing but return false if not producing and either trx is exhausted or block is exhausted.

@heifner heifner added the bug Something isn't working label Jun 7, 2023
@BenjaminGormanPMP BenjaminGormanPMP added this to the Leap 3.2.4 milestone Jun 8, 2023
@greg7mdp greg7mdp self-assigned this Jun 20, 2023
@greg7mdp
Copy link
Contributor

greg7mdp commented Jun 20, 2023

@kevin, in the last sentence of the description above, you wrote should return false for both cases. which one is wrong?

Also, in the 4.0 release notes I see:

read-mode=speculative remains supported for users who want old behavior but with an important change that transactions submitted via API endpoints will not be reapplied speculatively upon a new block being applied. The user is required to resubmit any transactions after a new block should those transactions' side effects need to be observed.

So should we still retry exhausted transactions in speculative move? When? Why would it be more likely to succeed when we retry it?

@heifner
Copy link
Member Author

heifner commented Jun 20, 2023

The retry talked about in the release notes is the api-persisted-trx option. See https://github.com/AntelopeIO/leap/blob/v3.2.3/plugins/producer_plugin/producer_plugin.cpp#L788 and persist_until_expired implementation in producer_plugin. This was a feature of API RPC trxs that they would be re-applied at the start of each block until they expired or were included in a block from a block producer. This allowed, for example, a user to submit a create account trx followed by a transfer to that account; this would work as long as the trxs were processed in order of sending (which is not guaranteed).

The retry talked about in this issue is the retry that happens for trxs when they do not fit into a block. When near a block boundary (CPU or NET), if the transaction exceeds the block limit it is retried in the next block. It will be more likely to succeed because it is likely to not hit the block boundary again.

Fun fact: if you run with a max-transaction-time set to larger than block time (500ms) and the transaction runs longer than block time (500ms), it will be retried over and over until it expires.

@greg7mdp
Copy link
Contributor

Thanks a lot Kevin, this helps a lot!

What about the first question: in the last sentence of the description above, you wrote should return false for both cases. which one is wrong?

@heifner
Copy link
Member Author

heifner commented Jun 20, 2023

If _pending_block_mode != pending_block_mode::producing then an exhausted trx should be seen as an exhausted block since you want to retry the transaction. This should not happen when in head mode, but doesn't hurt to have the same logic no matter the mode. The key thing is to treat an exhausted trx as an exhausted block when not producing. When producing you want to fit as many trx in a block that you can. If you are not producing, then the block is only a side-effect of how trx processing is implemented; just restart the speculative block.

greg7mdp added a commit that referenced this issue Jun 21, 2023
[3.2] Report transaction failed if trx was exhausted in non-producing mode
greg7mdp added a commit that referenced this issue Jun 22, 2023
[3.2 -> 4.0] Report transaction failed if trx was exhausted in non-producing mode (GH 1247)
greg7mdp added a commit that referenced this issue Jun 22, 2023
[4.0 -> main] Report transaction failed if trx was exhausted in non-producing mode (GH 1247)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment