Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

REST API: Check for pending transactions in most recent rounds first. #3836

Merged
merged 5 commits into from
Apr 19, 2022

Conversation

winder
Copy link
Contributor

@winder winder commented Mar 28, 2022

Summary

More efficient check for pending transactions. In most cases, the check will be done right away, so it would be in a recent round. The check starts at the oldest round, which is the opposite of what would normally happen.

Test Plan

Existing unit tests.

@winder winder self-assigned this Mar 28, 2022
barnjamin
barnjamin previously approved these changes Mar 28, 2022
Copy link
Contributor

@barnjamin barnjamin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nice, thanks!

@winder winder changed the title Check for pending transactions in most recent rounds first. REST API: Check for pending transactions in most recent rounds first. Mar 28, 2022
algorandskiy
algorandskiy previously approved these changes Mar 28, 2022
@barnjamin
Copy link
Contributor

Before merging, why do we not bail here? https://github.com/algorand/go-algorand/blob/master/node/node.go#L643

@winder
Copy link
Contributor Author

winder commented Mar 30, 2022

Before merging, why do we not bail here? https://github.com/algorand/go-algorand/blob/master/node/node.go#L643

@algorandskiy do you happen to know how likely (or how long) a transaction might exist in both the txn pool and be committed to a round? If that case is common, I could understand why we search the ledger when a txn is pending, otherwise it seems pretty wasteful.

@winder winder requested a review from gmalouf March 30, 2022 14:55
@algorandskiy
Copy link
Contributor

Will I think the comment above makes sense - the txn not in the ledger when looking there, then the thread gets suspended, new block assembled and written, txn gone from the pool, then thread resumed, looking up the pool - no such txn id found.

@winder
Copy link
Contributor Author

winder commented Mar 30, 2022

Will I think the comment above makes sense - the txn not in the ledger when looking there, then the thread gets suspended, new block assembled and written, txn gone from the pool, then thread resumed, looking up the pool - no such txn id found.
@algorandskiy That race condition only exists if you check the ledger first, followed by the tx pool.

Since we check the tx pool first, why not just return if the tx is found? Basically, when an unconfirmed transaction is in the tx pool, we STILL check the last 1000 rounds even though we could return immediately.

@algorandskiy
Copy link
Contributor

We need to check ledger to return a round number at which it was persisted.

@barnjamin
Copy link
Contributor

What would be the consequence of returning the transaction we find in the pending pool right away? Is there a risk involved in returning a transaction that says it hasn't been confirmed yet?

@algorandskiy
Copy link
Contributor

It is up to our REST clients (the indexer, SDKs users etc). I think in general users want to know the txn they submit is confirmed (and it is in some block X).

@winder
Copy link
Contributor Author

winder commented Mar 30, 2022

@barnjamin I think the short answer is that we are not confident enough about the operation order to know if we can exit early or not.

@Blackglade
Copy link

It is up to our REST clients (the indexer, SDKs users etc). I think in general users want to know the txn they submit is confirmed (and it is in some block X).

I strongly disagree. This feels like preemptively trying to optimize the API UX. Why wouldn’t a user just query both the pending txns and the txns in the latest block and filter client side for their txn id? As a dev I’d rather be exposed to a faster API I can query repeatedly than just having it just wait around to find the confirmed txns.

@urtho
Copy link
Contributor

urtho commented Apr 1, 2022

It is up to our REST clients (the indexer, SDKs users etc). I think in general users want to know the txn they submit is confirmed (and it is in some block X).

With Humble & Folks + some crazy TX committers hitting Algonode with 200+ /pending/{txid} requests per second we were forced to either rate limit the endpoint or risk the early return on TX POOL hit and cut the CPU usage by x100 (for this endpoint).

The only 2 usage patterns for TX POSTing we've observed:

  • POST the TX
  • repeat GET /v2/transactions/pending/{txid} @ algod till it returns 404 or committed

There are zero cases where /pending/{txid} is repeated for more than 2 rounds.

  • POST the TX
  • repeat /v2/transactions/{txid} @ indexer until 200 OK or a timer (like 3 rounds) elapses

The second pattern seems to be the only "legal" way on algoexplorer's endpoints.
Probably due to load-balancer sending the query to random nodes.

Anyways - Algonode is running alogd with both reverse order search and early return on TX POOL hit patches.
In case of TX POOL hit the latency dropped from 600ms to ~6ms. Same for recently commited TX.

Considered a patch that scans recent 2 rounds instead 1k in case of TX POOL hit but found no rationale for that.
But I am quick to make decisions on things I know little about. ;)

If the code is used only for this endpoint and the usage is as above then it was all I needed to proceed.

@barnjamin
Copy link
Contributor

If we're still concerned about the txnPool holding transactions that have been confirmed maybe we can just set the maxLife value to some value representing the max number of rounds the transaction may stay in pending?

diff --git a/node/node.go b/node/node.go
index 262c50bb..63577dd1 100644
--- a/node/node.go
+++ b/node/node.go
@@ -645,11 +645,15 @@ func (node *AlgorandFullNode) GetPendingTransaction(txID transactions.Txid) (res
 
        var maxLife basics.Round
        latest := node.ledger.Latest()
-       proto, err := node.ledger.ConsensusParams(latest)
-       if err == nil {
-               maxLife = basics.Round(proto.MaxTxnLife)
+       if found {
+               maxLife = 3 // Max 3 round lookback if the transaction is stil in pending
        } else {
-               node.log.Errorf("node.GetPendingTransaction: cannot get consensus params for latest round %v", latest)
+               proto, err := node.ledger.ConsensusParams(latest)
+               if err == nil {
+                       maxLife = basics.Round(proto.MaxTxnLife)
+               } else {
+                       node.log.Errorf("node.GetPendingTransaction: cannot get consensus params for latest round %v", latest)
+               }
        }
        maxRound := latest
        minRound := maxRound.SubSaturate(maxLife)

@barnjamin barnjamin dismissed stale reviews from algorandskiy and themself via 47f5f74 April 12, 2022 23:00
@codecov-commenter
Copy link

codecov-commenter commented Apr 12, 2022

Codecov Report

Merging #3836 (47f5f74) into master (7126597) will increase coverage by 0.00%.
The diff coverage is n/a.

@@           Coverage Diff           @@
##           master    #3836   +/-   ##
=======================================
  Coverage   49.96%   49.97%           
=======================================
  Files         392      392           
  Lines       68378    68378           
=======================================
+ Hits        34167    34169    +2     
- Misses      30466    30472    +6     
+ Partials     3745     3737    -8     
Impacted Files Coverage Δ
node/node.go 23.26% <ø> (-1.85%) ⬇️
network/wsNetwork.go 62.79% <0.00%> (ø)
data/transactions/verify/txn.go 45.02% <0.00%> (+0.86%) ⬆️
data/abi/abi_type.go 88.62% <0.00%> (+0.94%) ⬆️
catchup/service.go 70.12% <0.00%> (+1.23%) ⬆️
ledger/tracker.go 76.82% <0.00%> (+2.14%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 7126597...47f5f74. Read the comment docs.

node/node.go Outdated Show resolved Hide resolved
cce and others added 3 commits April 13, 2022 09:21
Co-authored-by: Ben Guidarelli <ben.guidarelli@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

7 participants