Networking: mempool anti-entropy protocol #2193

Closed
jcnelson opened this issue Dec 17, 2020 · 9 comments

@jcnelson
Member

Right now, Stacks nodes do not synchronize their mempools with one another. They should try to do so, in order to ensure that a transaction sent to one node will eventually reach all potential miners.

There are many ways to do this, and they will need some research before anything is implemented. I will follow up on this issue with proposals.

@jcnelson
Member Author

This is high-priority, but #1805 is higher priority.

@jcnelson
Member Author

Notes from the architecture meeting:

  • One low-hanging fix is to add a "mempool sync" step that runs when a miner node boots up, so that it can start mining transactions that are already in the mempool.
  • We need to get a good handle on what causes transactions to not be mined before tackling this. Is it because we do a bad job at replicating transactions? Is it because a missing transaction (or under-replicated transaction) causes dependent transactions to not be mined?
  • We should track the time from mempool arrival to being mined in a chain, and from there, see what causes transactions to languish in the mempool. To achieve this, we'll first want to implement #2378 ("DB: make tx_log a runtime setting").
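As a rough illustration of the last point, here is a minimal sketch of the arrival-to-mined measurement. The input shape (two dicts keyed by txid, holding Unix timestamps) and the function name `mempool_latency_report` are assumptions made for the example; the actual tx_log format is not described in this thread.

```python
from typing import Dict, List, Tuple


def mempool_latency_report(
    arrivals: Dict[str, int],
    mined: Dict[str, int],
    now: int,
    stale_after: int,
) -> Tuple[Dict[str, int], List[str]]:
    """Hypothetical helper: compare mempool arrival times with mined times.

    arrivals: txid -> time the tx was accepted into the mempool (seconds)
    mined:    txid -> time the tx was mined in a chain (seconds)
    Returns (latency per mined tx, txids still pending after `stale_after` seconds).
    """
    # Latency for every transaction that arrived and was later mined.
    latencies = {
        txid: mined[txid] - arrived
        for txid, arrived in arrivals.items()
        if txid in mined
    }
    # Transactions that have been waiting at least `stale_after` seconds.
    languishing = sorted(
        txid
        for txid, arrived in arrivals.items()
        if txid not in mined and now - arrived >= stale_after
    )
    return latencies, languishing
```

The languishing list is the set worth inspecting by hand, since those are the transactions that may be stuck for some other reason (for example, a missing dependency).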

@diwakergupta diwakergupta added the mempool Mempool related bugs or features label Jan 29, 2021
@jcnelson
Member Author

Action items:

  • Enable tx_log on all nodes so we can see which transactions get into the blockchain
  • Find out which transactions are pending and have been pending for a long, long time, and factor out the ones that are pending for a different reason

@zone117x
Member

zone117x commented Jun 8, 2021

Is anything in place to ensure a newly-synchronized node gets a correct, up-to-date view of the mempool -- comparable to a node that has already been running for a while?

When syncing a node, I see logs that appear to show mempool txs being provided to the node and then quickly rejected or discarded because of the incompatible/old (still syncing) chain tip. Then, once the node has caught up to its neighbors, it is missing most or all of the mempool txs.

@jcnelson
Member Author

jcnelson commented Jun 8, 2021

There isn't an API for this, but it can (and should) be added.

Basically, once the node reaches the chain tip, it could fetch the sequences of transactions for origin addresses that have recently sent transactions, since these are the ones that are about to be mined. To do this, the node would do the following:

  • Ask a bootstrap node for the (paginated) list of origin addresses that have sent a transaction in the past 2 hours or so.
  • For each address, request all transactions from it that have not yet been mined.
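The two steps above can be sketched as follows. This is only an illustration under stated assumptions: the three callables stand in for whatever RPC client and mempool-admission code a node would actually use, and `PendingTx`, `sync_mempool_at_tip`, and their fields are hypothetical names. Sorting by nonce is also an assumption, made so that chained transactions from one origin are submitted in order.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass(frozen=True)
class PendingTx:
    # Hypothetical minimal view of an unconfirmed transaction.
    txid: str
    origin: str
    nonce: int


def sync_mempool_at_tip(
    fetch_recent_origins: Callable[[], Iterable[str]],
    fetch_unmined_txs: Callable[[str], List[PendingTx]],
    submit_tx: Callable[[PendingTx], bool],
) -> int:
    """Run once the node has reached the chain tip; returns how many txs were accepted."""
    accepted = 0
    seen_txids = set()
    # Step 1: origins that recently sent transactions (paging is hidden in the callable).
    for origin in fetch_recent_origins():
        # Step 2: all not-yet-mined transactions for that origin, lowest nonce first.
        for tx in sorted(fetch_unmined_txs(origin), key=lambda t: t.nonce):
            if tx.txid in seen_txids:
                continue  # the same origin may be listed on more than one page
            seen_txids.add(tx.txid)
            if submit_tx(tx):
                accepted += 1
    return accepted
```

Because this runs only after the node is caught up, the fetched transactions are validated against the current chain tip rather than a stale one.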

Each of these would be an API endpoint. To keep things simple, you could just have the p2p thread pull out rows from mempool.sqlite directly. But, it's going to be important to make sure the query time is bounded. Just a bit of a design sketch:

  • The first one could be something like GET /v2/mempool/origins/{:min_time}/{:time_range}, which takes the minimum accept_time value and the length of the time range you are interested in, in seconds (e.g. 7200 for two hours), and returns the distinct origin_address values from the mempool whose accept_time values fall into the range [min_time, min_time + time_range). To keep query times low, it would return at most a fixed number of addresses (e.g. 4096), plus the maximum accept_time among them, so the caller could page through lots of origin_addresses. For example, the caller might call GET /v2/mempool/origins/10000/7200, and then if the 4096th row has an accept_time of 11000, the caller would follow up with GET /v2/mempool/origins/11000/7200. The caller would keep going until min_time reaches the current time.

  • The second API call could be something like GET /v2/mempool/origins/{:origin_address}/transactions, which would return the list of unconfirmed transactions for that origin address. There will be at most MEMPOOL_TX_CHAINING entries, so the row count is always bounded above.

  • You could offer a similar pair of API calls for sponsor addresses.
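To make the paging scheme for the first endpoint concrete, here is a small sketch using an in-memory SQLite database. The table name `mempool`, the demo rows, and the function names are assumptions for illustration; only the `origin_address` and `accept_time` column names come from the description above. The page limit is set to 4 instead of 4096 so the paging is visible with a handful of rows.

```python
import sqlite3

PAGE_LIMIT = 4  # stand-in for the suggested 4096


def make_demo_db() -> sqlite3.Connection:
    # Hypothetical schema: one row per unconfirmed transaction.
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE mempool (txid TEXT PRIMARY KEY, origin_address TEXT, accept_time INTEGER)"
    )
    rows = [(f"tx{i}", f"addr{i}", 10000 + i) for i in range(10)]
    rows.append(("tx0b", "addr0", 10005))  # a second tx from an origin seen earlier
    conn.executemany("INSERT INTO mempool VALUES (?, ?, ?)", rows)
    return conn


def query_origins(conn, min_time, time_range, limit=PAGE_LIMIT):
    """Sketch of the handler for GET /v2/mempool/origins/{min_time}/{time_range}."""
    rows = conn.execute(
        "SELECT origin_address, MIN(accept_time) AS first_seen FROM mempool "
        "WHERE accept_time >= ? AND accept_time < ? "
        "GROUP BY origin_address ORDER BY first_seen, origin_address LIMIT ?",
        (min_time, min_time + time_range, limit),
    ).fetchall()
    addresses = [addr for addr, _ in rows]
    max_accept = rows[-1][1] if rows else None
    return addresses, max_accept


def sync_origins(conn, start_time, now, time_range):
    """Caller side: page until min_time reaches the current time."""
    seen = set()
    min_time = start_time
    while min_time < now:
        addresses, max_accept = query_origins(conn, min_time, time_range)
        seen.update(addresses)  # pages overlap at the boundary, so dedupe
        if len(addresses) == PAGE_LIMIT and max_accept > min_time:
            min_time = max_accept  # full page: resume from the last accept_time
        else:
            min_time += time_range  # short page (or no progress): next window
    return seen
```

Two caveats follow from the design as sketched. Consecutive pages overlap at the boundary accept_time, so the caller has to deduplicate. And if more than a full page of origins share a single accept_time, resuming from that value makes no progress; the loop above then skips to the next window, which may miss addresses, so a real implementation would need a tie-breaker in the cursor.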

@314159265359879

314159265359879 commented Jan 15, 2022

image

I see these blocks with few transactions in them again. I know this can be caused by Bitcoin forking and STX miners being on the "wrong" chain. And onstacks.com/mining seems to have evidence for that (blocks with no or very few miners):
image

I am wondering, though, whether it can also have something to do with the sheer number of transactions in the mempool; there are over 10,000 now. This is a copy from stxstats.co:
image

Perhaps it is a coincidence, but I often see blocks with few transactions when the mempool holds more transactions than average. Is it possible that miners have a hard time finding transactions that can be mined? (I suspect there will also be transactions that cannot be mined because of leather-io/extension#2129 and leather-io/extension#1991.)

Would the mempool anti-entropy protocol be a way to battle this (suspected) problem?

@314159265359879

A bit more data:
On the 14th of January, about 19k transactions were processed; the day started with fewer than 1k transactions in the mempool.
On the 15th of January, about 16k transactions were processed; the day started with 10k transactions in the mempool.
Data from stxstats.co

If the miners have more transactions to pick from, I would expect more transactions to be processed (more opportunity for optimal filling of blocks?), but that is not the case. This leads me to believe something may be up with the miner logic and that further improvement is possible.

@314159265359879

stacksonchain.btc says stxstats.co may be overestimating the number of transactions. Their data shows a smaller difference between the 14th and the 15th: 12.8k and 12.3k transactions, respectively.
https://stacksonchain.com/dashboards/Transactions-per-Day/14

@diwakergupta diwakergupta moved this from New Issues to In progress in Stacks Blockchain Board Feb 1, 2022
@diwakergupta
Member

Addressed with #2884

Stacks Blockchain Board automation moved this from In progress to Done Feb 1, 2022