New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[p2p] Introduce node rebroadcast module #21061
[p2p] Introduce node rebroadcast module #21061
Conversation
Future workto be addressed either in this PR or a follow up:
Merge Plan & Next Steps:All of the functionality in this PR is hidden behind a configuration switch that defaults to off, and the wallet rebroadcast logic is currently untouched. The idea is to make these changes as safe as possible to merge into master, and allow reviewers to easily enable and observe the new rebroadcast mechanism in today’s network conditions. If we build confidence that these changes are safe and desirable, we can default enable this new node rebroadcast logic and disable the legacy wallet rebroadcast logic. This is the desired end goal. Project History:A version of these changes were originally proposed in #16698. #18038 broke out a subset of the functionality, enabling the node to provide a guarantee around minimal initial broadcast & reducing the frequency of the existing wallet rebroadcast functionality (unbroadcast set) Since #16698, there have been some changes in approach, and all of the unbroadcast conversation is no longer relevant. So, to help with review, I've decided to open a new PR. To preserve the relevant feedback and concerns, I've tried my best to capture the history of conversations in the following write-up: https://github.com/amitiuttarwar/bitcoin-notes/blob/main/rebroadcast-history.md. warning: it goes quite in depth. 🙌🏽 Huge shout out to the village of core contributors who have provided feedback & guidance on this project so far. Honestly too many to name, but this project is much better for it. Thank you :) |
The following sections might be updated with supplementary metadata relevant to reviewers and maintainers. ConflictsReviewers, this pull request conflicts with the following ones:
If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first. |
0994ead
to
4171930
Compare
marking as draft until I resolve CI issues |
4171930
to
6a24a6c
Compare
6a24a6c
to
3766479
Compare
68eba7b
to
47060cf
Compare
rebased master, added a functional test, some small fixups. this PR is ready for review! 🎈 |
47060cf
to
1da2dc8
Compare
rebased |
1da2dc8
to
4d16cad
Compare
…o make mempool transaction 1363b6c [doc / util] Use comments to clarify time unit for int64_t type. (Amiti Uttarwar) 47a7a16 [util] Introduce a SetMockTime that takes chrono time (Amiti Uttarwar) df6a5fc [util] Change GetMockTime to return chrono type instead of int (Amiti Uttarwar) a2d908e [test] Throw error instead of segfaulting in failure scenario (Amiti Uttarwar) 9a3bbe8 [test] Introduce a unit test helper to create a valid mempool transaction. (Amiti Uttarwar) Pull request description: Some miscellaneous improvements that came up when working on #21061 - The first commit is a helper to make valid mempool transactions & submit via ATMP. Introducing in this PR, using in #21061. - The second commit is a small improvement in `miner_tests.cpp` that uses `BOOST_REQUIRE_EQUAL` to properly terminate the program instead of segfaulting in the failure scenario where the blocks do not include the expected number of transactions. - The third commit changes the function signature of `GetMockTime()` to return a chrono type. - The fourth & fifth commit overload `SetMockTime` to also accept chrono type, and adds documentation to indicate that the `int64_t` function signature is deprecated. ACKs for top commit: vasild: ACK 1363b6c Tree-SHA512: c72574d73668ea04ee4c33858f8de68b368780f445e05afb569aaf8564093f8112259b3afe93cf6dc2ee12a1ab5af1130ac73c16416132c1ba2851c054a67d78
…elper to make mempool transaction 1363b6c [doc / util] Use comments to clarify time unit for int64_t type. (Amiti Uttarwar) 47a7a16 [util] Introduce a SetMockTime that takes chrono time (Amiti Uttarwar) df6a5fc [util] Change GetMockTime to return chrono type instead of int (Amiti Uttarwar) a2d908e [test] Throw error instead of segfaulting in failure scenario (Amiti Uttarwar) 9a3bbe8 [test] Introduce a unit test helper to create a valid mempool transaction. (Amiti Uttarwar) Pull request description: Some miscellaneous improvements that came up when working on bitcoin#21061 - The first commit is a helper to make valid mempool transactions & submit via ATMP. Introducing in this PR, using in bitcoin#21061. - The second commit is a small improvement in `miner_tests.cpp` that uses `BOOST_REQUIRE_EQUAL` to properly terminate the program instead of segfaulting in the failure scenario where the blocks do not include the expected number of transactions. - The third commit changes the function signature of `GetMockTime()` to return a chrono type. - The fourth & fifth commit overload `SetMockTime` to also accept chrono type, and adds documentation to indicate that the `int64_t` function signature is deprecated. ACKs for top commit: vasild: ACK 1363b6c Tree-SHA512: c72574d73668ea04ee4c33858f8de68b368780f445e05afb569aaf8564093f8112259b3afe93cf6dc2ee12a1ab5af1130ac73c16416132c1ba2851c054a67d78
src/miner.cpp
Outdated
if (min_package_fee_rate) { | ||
// Compare package fee rate and potentially update new minimum | ||
CFeeRate newFeeRate(packageFees, packageSize); | ||
if (newFeeRate < *min_package_fee_rate) *min_package_fee_rate = newFeeRate; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Earlier in this function, there's code to skip transactions if they're too large for the block. I think this can result in the mempool containing txs with fee rates of: 120 s/vb, 119 s/vb, 118 s/vb, 117 s/vb, 115 s/vb, 110 s/vb, 105 s/vb, 100 s/vb, 95 s/vb, 80 s/vb, 79 s/vb, ...
and the block containing txs with fee rates of: 120, 119, 118, 117, 95, 80
-- because the 110-100 s/vb txs just didn't fit in the block, but the 95 and 80 s/vb txs happened to be tiny. This would then incorrectly result in txs with fee rates above 80 s/vb being rebroadcast, whereas really only txs with fee rates above 117 s/vb should be.
The functionality introduced in this commit is currently unused. The ability to calculate the minimum fee rate for a transaction to be included into a block is used to identify the top of the mempool, and later applied to filter the rebroadcast candidates before sending them to peers.
Every minute, calculate the threshold fee rate for a transaction to be considered at the top of the mempool, and cache this value. Apply this as a condition when identifying rebroadcast candidates.
Introduce a boost::multi_index to track our previous rebroadcast attempts. We use this information primarily to ensure we don't exceed a maximum number of rebroadcast attempts. This allows transactions to genuinely expire from the majority of mempools. We disable automatic rebroadcasts because the conceptually desired behavior should be that after a certain amount of time, only the source wallet should rebroadcast the transaction. (Of course, we cannot enforce this on a decentralized network, but design the default behavior to uphold this idea.)
Since the attempt tracker is implemented as a unique pointer, it should not be copied. Delete the special member functions that would allow copying the TxRebroadcastHandler, and thus the unique pointer member variable (copy constructor and copy assignment operator).
…asted. When a transaction hits the maximum on the attempt tracker, add it to a bloom filter. Don't rebroadcast transactions present in the bloom filter. This ensures that we won't repeatedly rebroadcast transactions that have hit the MAX_REBROADCAST_COUNT, even if there are more transactions than can fit in m_attempt_tracker at once.
Ensure the index does not grow unbounded, and delete entries that were last updated >3 months ago.
…ions Depending on the reason a transaction is removed from the mempool, it can be extremely unlikely for it to be able to re-enter. Some examples are if it was mined into a block or replaced using RBF. In these circumstances, we can remove the transaction entry from the rebroadcast tracker. Under an unlikely circumstance where the transaction does re-enter the mempool and gets rebroadcast, it will simply be re-added to the attempt tracker.
Includes adding a helper to the shared unit test utilities to create a valid transaction and submit to the mempool. update test to use normal key generation
We test that when a block comes in, we rebroadcast missing transactions based on time and fee rate.
b927ab9
to
9b757c2
Compare
I've been simulating the behavior of this logic using historical transaction and block data, and I have some general concept thoughts:
Taking the above into account, I'd propose this approach: a) Any given node should rebroadcast way less often -- perhaps each node could attempt to rebroadcast on the order of once every 24 hours or every 48 hours on average, whether on a poisson timer or a random offset. In a network with 10k listening nodes this is already a lot of rebroadcasting (10,000 / (24*60) is about 7 such nodes attempting rebroadcast every minute, not even taking into account the non-listening nodes on the network which will also be doing this!). b) If we do (a) then we've probably greatly mitigated bandwidth spikes on the network, but I'd go further and limit the number of transactions that are ever selected to be rebroadcast at some small value like 25 or 50, to ensure we don't ever try to re-announce several hundred (or more) transactions onto the network at once. c) I think the logic in this PR could use some refining to avoid edge-case behavior when we might rebroadcast a lot of transactions that are not valuable to rebroadcast. These first two are cases that I've seen so far in my simulations, and the last one is a conjecture:
d) Given the issue described above with
e) I think that we should only rebroadcast transactions to peers that are using wtxidrelay (BIP 339). Bitcoin Core software that predates wtxidrelay will not cache validation failure for segwit transactions that don't make it into the mempool. This means that if a transaction doesn't propagate for some random policy reason (eg feerate), then the peer would still try download it from every connection that announces it. As this PR is drafted now, where all nodes are on the same announcement timer, that means a lot of wasted transaction download bandwidth (not just wasted INV bandwidth!) whenever a transaction that doesn't get accepted is rebroadcast. This would be mitigated by randomizing rebroadcasts and reducing the frequency, but I think we might as well just limit this to wtxidrelay peers as well for an extra safety margin. My simulations of this PR on historical data are still ongoing and I don't have a good summary yet, but please let me know if there are statistics or other specific questions that would be helpful for us to consider (I have a few ideas in mind but I'm still gathering my thoughts). |
I had one other thought about how we select transactions for rebroadcast; I think it could be problematic to base transaction rebroadcast solely on how transactions in our mempool compare to transactions that we would select in Imagine that you are running a somewhat older node, and there have been releases of Bitcoin Core that enable new standard transaction types that others are using. Then your node will not be aware of those new transactions, and it's possible that only looking at the set of transactions that you know to be standard would result in rebroadcasting transactions from your mempool that are actually quite a bit below the feerate that the rest of the network thinks is needed to be confirmed soon. (Or really there could be any reason that miners might have access to better feerate transactions than a single node does -- there's still no point in aggressively rebroadcasting transactions that are worse than what we see are being confirmed.) This problem would also be largely mitigated by making this behavior occur much less frequently and limiting the number of transactions that we'd consider rebroadcasting at once, but I think the heuristic of calling |
Marking this PR as draft- |
I was searching for privacy related PRs to review and write tests using PowerShell scripts for my project. Found this PR in the search results which I had already reviewed in March 2021, however I don't think writing test for it makes sense based on last two comments. Also agree with the things mentioned by Suhas Daftuar.
Hoping new approach will be better and we soon improve privacy in rebroadcast mechanism. Thanks for working on this. |
🐙 This pull request conflicts with the target branch and needs rebase. Want to unsubscribe from rebase notifications on this pull request? Just convert this pull request to a "draft". |
There hasn't been much activity lately and the patch still needs rebase. What is the status here?
|
Closing this for now. It may be reopened at a later point. |
Status
(this section will be updated as the PR progresses, and eventually removed)
Motivation:
Our legacy rebroadcast mechanism lives in the wallet code. It rebroadcasts only & all transactions which are “mine”, not discerning if the fee rate indicates the transaction should actually have been confirmed by now. This is bad for privacy because it leaks information that allows spy nodes to link bitcoin addresses with IP addresses, a relationship we aim to obfuscate.
PR Overview:
This PR introduces a rebroadcast mechanism in the node. Instead of only rebroadcasting our own transactions, we will notify the network about any transactions we identified as missing from blocks based on fee rate expectations in our own mempool.
The new module is currently behind a configuration flag that defaults to off.
The end goal is to enable node rebroadcast by default, and remove wallet rebroadcast. This would improve privacy for a few main reasons:
Approach:
Conceptually, we want to rebroadcast transactions that we believe “should” have been mined by now. Since we expect miners to prioritize transactions with the highest package fee rates, we select high fee rate transactions that have been in our mempool for some time, but have not yet been mined.
This PR introduces a
txrebroadcast
module that encapsulates the selection logic. WhenPeerManager
gets notified that we have received and processed a block, we trigger the rebroadcast logic. The module calculates the highest fee rate mempool packages that meet some additional conditions- the transaction must be > 30 minutes old, and surpass a fee threshold. This threshold is calculated by periodically (every minute) identifying the top block of transactions in our mempool, and caching the fee rate for a package to be included. We eliminate any of these candidates that we have rebroadcasted recently (last 4 hours), or already rebroadcast more than a maximum amount of times (6). We store any remaining candidates on each peer’ssetInventoryTxToSend
, which they will potentially relay next time we hitSendMessages
for that peer (subject to general transaction relay conditions).