Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[WIP] Mempool: rework rebroadcast logic to improve privacy #16698

Open
wants to merge 7 commits into
base: master
from

Conversation

@amitiuttarwar
Copy link
Contributor

commented Aug 23, 2019

The current rebroadcast logic is terrible for privacy because only the source wallet will rebroadcast transactions, and does so quite frequently. This PR aims to improve privacy dramatically while preserving the benefits of rebroadcasting that ensure txns are successfully propagated through the network.

This PR introduces changes so nodes will resend transactions that it believes should have already been mined. It extracts the logic from the wallet into the mempool, so nodes will rebroadcast txns regardless of the originating wallet. Txns are defined as "should have been mined" by using the block assembler logic, and excluding txns that were recently added to the mempool. The wallet will resubmit txns to the mempool on a regular cadence to ensure those txns aren't dropped (due to eviction, expiry, etc..) before being confirmed.

For more information, see: https://gist.github.com/amitiuttarwar/b592ee410e1f02ac0d44fcbed4621dba

@DrahtBot

This comment has been minimized.

Copy link
Contributor

commented Aug 23, 2019

The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

Conflicts

Reviewers, this pull request conflicts with the following ones:

  • #16756 (test: Connection eviction logic tests by mzumsande)
  • #16688 (log: Add validation interface logging by jkczyz)
  • #16341 (Introduce ScriptPubKeyMan interface and use it for key and script management (aka wallet boxes) by achow101)
  • #15759 (p2p: Add 2 outbound block-relay-only connections by sdaftuar)

If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

@amitiuttarwar amitiuttarwar force-pushed the amitiuttarwar:rebroadcast branch 2 times, most recently from 9e72fb0 to 13fd122 Aug 23, 2019

[mining] add recency filter on block creation to get rebroadcast set
To reduce noise, apply a filter to ensure recent txns are not rebroadcast
[p2p] implement mempool rebroadcast functionality
Update rebroadcast logic to live in the mempool & have two
fundamental differences from the existing wallet logic:
1. Apply to all transactions (not just mine)
2. Rebroadcast txns expected to have been mined by now.

The main reasoning behind these changes are to improve privacy.

@amitiuttarwar amitiuttarwar force-pushed the amitiuttarwar:rebroadcast branch from 13fd122 to c383bde Aug 23, 2019

@amitiuttarwar amitiuttarwar force-pushed the amitiuttarwar:rebroadcast branch from c383bde to ba33693 Aug 23, 2019

@amitiuttarwar

This comment has been minimized.

Copy link
Contributor Author

commented Aug 23, 2019

This code is ready for initial review.

there are still some to dos before it would be ready for merge:

  • identify expected rebroadcast traffic & worst case bandwidth usage.
  • persist setUnbroadcastTxIDs to mempool.dat
  • add functionality to run an automated job to cache min fee rate for txns to be included in block, then apply that filter to exclude txns with fee rate < min from rebroadcast set. this will reduce rebroadcast noise in scenarios where the mempool is emptying out.

there are also some follow-ups that can be addressed in separate PRs:

  • m_best_block_time is no longer used & can be removed & the wallet no longer needs to subscribe to UpdatedBlockTip() validation interface method
  • functionality to mark a peer (as "local" or such) so the mempool would still enforce initial broadcast for transactions received from one of these peers.
@jnewbery

This comment has been minimized.

Copy link
Member

commented Aug 23, 2019

Concept ACK

[test] Test rebroadcast functionality
Add functional tests to ensure that rebroadcast behavior works
as expected, namely that only txns at the top of the mempool
(according to block assembler logic) are rebroadcast, and txns
that are recent are excluded from the set.
[wallet] update wallet to resubmit to node rather than rebroadcast
With the new functionality added to the mempool, the wallet
no longer needs to rebroadcast transactions to peers directly.
Instead, it resubmits txns to the nodes once a day in case they
were dropped (expired, evicted, etc.) from the local mempool
before being confirmed.

@amitiuttarwar amitiuttarwar force-pushed the amitiuttarwar:rebroadcast branch from ba33693 to b1d0a54 Aug 23, 2019

@fanquake fanquake added the P2P label Aug 24, 2019

[mempool] mempool tracks wallet txns & ensures successful initial bro…
…adcast

Since the newly implemented functionality only rebroadcasts txns at
the top of the mempool, its possible to hit a case where if a txn was
submitted locally & not immediately relayed, it would take a long time
to be included in the rebroadcast set & ever succesfully be initially
broadcast.

To prevent this case, the mempool keeps track of `setUnbroadcastTxIDs`,
and deems the txn relay successful when an associated GETDATA message is
received. On the rebroadcast timer, txns from this set are added to the
txns being relayed.

@amitiuttarwar amitiuttarwar force-pushed the amitiuttarwar:rebroadcast branch from b1d0a54 to 4bda142 Aug 24, 2019

@practicalswift

This comment has been minimized.

Copy link
Member

commented Aug 24, 2019

Concept ACK

@MarcoFalke
Copy link
Member

left a comment

Concept ACK. Some questions about why microseconds are truncated to seconds among other stuff.

/** Return a timestamp in the future (in microseconds) for exponentially distributed events. */
int64_t PoissonNextSend(int64_t now, int average_interval_seconds);

/** Wrapper to return mockable type */
inline std::chrono::seconds PoissonNextSend(std::chrono::seconds now, int average_interval_seconds)

This comment has been minimized.

Copy link
@MarcoFalke

MarcoFalke Aug 25, 2019

Member

in commit 25a3b0a:

The legacy signature of this function takes as argument type and return type microseconds. Can you explain why this one is different?
Note that you are allowed to pass in std::chrono::seconds when the function takes std::chrono::microseconds.

inline std::chrono::seconds PoissonNextSend(std::chrono::seconds now, int average_interval_seconds)
{
int64_t now_micros = (std::chrono::duration_cast<std::chrono::microseconds>(now)).count();
return std::chrono::duration_cast<std::chrono::seconds>(std::chrono::microseconds{PoissonNextSend(now_micros, average_interval_seconds)});

This comment has been minimized.

Copy link
@MarcoFalke

MarcoFalke Aug 25, 2019

Member

same commit:

Can you explain how the cast to seconds has an effect on the distribution? It appears that the most likely return value will be exactly 0? Alternatively, I'd rather return microseconds just like the existing function.

This comment has been minimized.

Copy link
@amitiuttarwar

amitiuttarwar Aug 26, 2019

Author Contributor

ah my bad, will fix

based on this tip

Note that you are allowed to pass in std::chrono::seconds when the function takes std::chrono::microseconds.

I'm thinking of updating the function signature to both take in & return microseconds. And the caller can pass through seconds when needed. I'm interested in moving all the poisson invocations to the chrono in a follow up PR. Its less of a gotcha since chrono requires the duration to be explicit, but it would be nice for it to be consistent.

@@ -135,6 +135,7 @@ class BlockAssembler
bool fIncludeWitness;
unsigned int nBlockMaxWeight;
CFeeRate blockMinFeeRate;
int64_t nMaxTxTime;

This comment has been minimized.

Copy link
@MarcoFalke

MarcoFalke Aug 25, 2019

Member

In commit fd92d22:

I'd prefer if new members are prefixed with m_ and used snake case according to the dev notes. This makes it easier to guess from the variable name if something is a member, a global, or just a symbol in the local scope. Additionally, I'd prefer to use std::chrono::seconds (or whatever the type is) for those reasons:

  • It documents the type for reviewers
  • It enforces the type at compile time and prevents unwanted casts
  • It documents that the time is mockable. (I know that the memepool currently is mockable and uses the legacy GetTime() function and types, but at least new code should use the new GetTime<>() functions and types.)

LOCK(cs);
for (const auto& tx : pblocktemplate->block.vtx) {
if (mapTx.find(tx->GetHash()) == mapTx.end()) continue;

This comment has been minimized.

Copy link
@MarcoFalke

MarcoFalke Aug 25, 2019

Member

in commit af38c70

Can you add a comment why this would ever be true, otherwise remove the dead code.

This comment has been minimized.

Copy link
@amitiuttarwar

amitiuttarwar Aug 26, 2019

Author Contributor

I added it as a safeguard. Do you feel confident it would never happen? If so, I will take another careful look at the logic to build my own confidence & remove.

This comment has been minimized.

Copy link
@MarcoFalke

MarcoFalke Aug 26, 2019

Member

Even if it would, any follow-up logic that deals with the returned set needs to be robust against missing txs anyway.

This comment has been minimized.

Copy link
@MarcoFalke

MarcoFalke Aug 26, 2019

Member

Note that ::mempool.cs is released as soon as this method returns.

@mzumsande
Copy link
Contributor

left a comment

Txns are defined as "should have been mined" by using the block assembler logic

I don’t understand 1) the concept “should have been mined”, and 2) how the block assembler logic achieves this.
As to 1) do you mean the txns should have been mined in a specific block/range of blocks, but weren’t? Should no rebroadcasts happen in an ideal world where miners have an identical mempool to ours and mine rationally?
As to 2) from what I understand, BlockAssembler creates a block template with 3/4*MAX_BLOCK_WEIGHT including the top feerate packages of our current mempool. Wouldn’t it always fill this block template with txns if our mempool is large enough, and therefore rather include 75% of the txns that we expect to be mined in the next block, instead of txns that should have been mined in the past?


// also ensure inclusion of wallet txns that haven't been successfully broadcast yet
// since set elements are unique, this will be a no-op if the txns are already in setInventoryTxToSend
pto->setInventoryTxToSend.insert(mempool.setUnbroadcastTxIDs.begin(), mempool.setUnbroadcastTxIDs.end());

This comment has been minimized.

Copy link
@mzumsande

mzumsande Aug 26, 2019

Contributor

If we add wallet transactions to the rebroadcast INV that have a smaller feerate than the top of our mempool, wouldn’t there be a feerate gap to the other INVs of the message, making the wallet transactions easily identifiable as such and reducing privacy?

This comment has been minimized.

Copy link
@amitiuttarwar

amitiuttarwar Aug 26, 2019

Author Contributor

correct. I will add a comment to document.

this logic should only trigger when a user submits a txn locally and it doesn't get initially relayed.. an example where this would be needed.. a user submits a low fee rate txn with p2p disabled, inspects it in the local mempool, and enables p2p. currently, the txn would get initially relayed due to the wallet rebroadcast logic. with the proposed changes (w/out this additional mempool-force-relays-unbroadcast-txns mechansim), the node could have to be running at a pretty specific time (when the mempool is clearing out and mining low fee rate txns) in order for the txn to ever initially get broadcast. thus, setInventoryTxToSend.

In terms of privacy guarantees (or lack thereof) it mimics the current behavior. If you have any suggestions for how we could improve while preserving the propagation guarantees, I'm all ears :)

This comment has been minimized.

Copy link
@MarcoFalke

MarcoFalke Sep 3, 2019

Member

making the wallet transactions easily identifiable as such and reducing privacy?

I don't think this is true (at least no worse than today). While it is possible to guess whether a list of INVs is a rebroadcast inv or not, it shouldn't be possible to trivially find the source of the low feerate txs in that inv. Txs broadcast for the first time from our node should be no different from txs broadcast for the first time from another node to us and then relayed by us. It is know to not be perfect (see dandelion tx relay), but improving that seems like a separate issue to me.


# confirm that its back in the mempool
self.log.info("Transaction should be resubmitted to mempool")
wait_until(lambda: txhsh in node.getrawmempool(), timeout=30)

This comment has been minimized.

Copy link
@mzumsande

mzumsande Aug 26, 2019

Contributor

This goes into timeout for me, causing wallet_resendwallettransactions to fail when I run the entire test suite (same as in AppVeyor build here) but the test succeeds if I run it in isolation. If you can’t reproduce this I could look deeper into why it fails.

This comment has been minimized.

Copy link
@amitiuttarwar

amitiuttarwar Aug 26, 2019

Author Contributor

oh interesting. I haven't been able to reproduce locally. I would love some help!

This comment has been minimized.

Copy link
@mzumsande

mzumsande Aug 27, 2019

Contributor

I think that this is due to a bug with the time: RESEND_TXS_FREQUENCY is in microseconds. It is added to GetTime() which is in seconds, so an actual resend doesn't ever happen in test.
Yet the test sometimes succeeds because there is an initial resend (nNextResend is initialized to 0) which in some runs happens after the mocktime is set to two weeks and the tx has been expired (test passes), in some runs happens earlier (test fails).

This comment has been minimized.

Copy link
@amitiuttarwar

amitiuttarwar Aug 29, 2019

Author Contributor

thanks for digging in!! I will fix the bug and ensure the times are consistent in microseconds.

could you tell me more about how you debugged? were you able to isolate the failure, or was it always when run in the entire suite? I'm curious why this behavior wouldn't manifest as a flaky test when run in isolation.

This comment has been minimized.

Copy link
@mzumsande

mzumsande Aug 29, 2019

Contributor

On my other computer it also sometimes failed in the single run, which made testing easier. I found this the bug adding some debug statements in ResendWalletTransactions.

Fixing the times might not solve this completely, because the initial rebroadcast after startup also happened for me in a few runs right before assert txhsh not in node.getrawmempool(), which then fails. Adding a sleep before you jump ahead in time could help.

@amitiuttarwar

This comment has been minimized.

Copy link
Contributor Author

commented Aug 26, 2019

thanks for the review @mzumsande !

I don’t understand 1) the concept “should have been mined" [...] do you mean the txns should have been mined in a specific block/range of blocks, but weren’t? Should no rebroadcasts happen in an ideal world where miners have an identical mempool to ours and mine rationally?

based on the local mempool, we are attempting to answer the question of what txns we think should have been mined. And saying if it wasn't, maybe there was an issue with relay.
You are correct- in a world where all mempools are consistent there wouldn't be rebroadcasts.

As to 2) from what I understand, BlockAssembler creates a block template with 3/4*MAX_BLOCK_WEIGHT including the top feerate packages of our current mempool. Wouldn’t it always fill this block template with txns if our mempool is large enough, and therefore rather include 75% of the txns that we expect to be mined in the next block, instead of txns that should have been mined in the past?

yes. you will start with txns you expect to be mined in the next block. the recency filter will (likely) remove some of those transactions. however, in the case of a mempool thats emptying out, the recency filter might not have much effect. for that I have this todo before the PR would be ready for merge:

add functionality to run an automated job to cache min fee rate for txns to be included in block, then apply that filter to exclude txns with fee rate < min from rebroadcast set

@mzumsande

This comment has been minimized.

Copy link
Contributor

commented Aug 26, 2019

based on the local mempool, we are attempting to answer the question of what txns we think should have been mined.

What confuses me is how we can answer that without actually looking into recent blocks.
Considering that txns are removed from the mempool once they are included in a valid block, it is possible that the previous blocks removed the respective top of our previous versions of the mempool, so we are left with txns that were not at the top of the mempool earlier but are now, which we wouldn't want to rebroadcast.
Or the miners could have left out several high-fee-rate txns in favor of lower ones, in which case the higher ones are still present in our mempool and we would like to rebroadcast them.
Or there just might have been no new blocks found in the hour since the last rebroadcast, in which case we wouldn't need to rebroadcast.

How could we distinguish between these cases by just looking at our current mempool?

@MarcoFalke MarcoFalke referenced this pull request Aug 27, 2019
@amitiuttarwar

This comment has been minimized.

Copy link
Contributor Author

commented Aug 29, 2019

great questions @mzumsande. I've thought about this a lot, so let me share some of my reasoning-

What confuses me is how we can answer that without actually looking into recent blocks.

We can't. And further, even if we do look at the recent blocks, we still cannot answer exactly what "should" have been included. The two main pieces of information we are missing are 1. what the miner's mempool looked like and 2. any weight applied through prioritisetransaction. By looking at a block, it is difficult to extrapolate the exact minimum fee rate for transactions to be included. So instead, the approach here is for a node to look at its local mempool and work towards the picture of what it believes should have already been included.

so we are left with txns that were not at the top of the mempool earlier but are now, which we wouldn't want to rebroadcast.

yup, specifically in the case of the emptying out mempool, we would currently rebroadcast a full set of txns, thats why I want to build a cache of min fee rate and apply an extra filter to reduce noise in this circumstance ( from #16698 (comment) ):

add functionality to run an automated job to cache min fee rate for txns to be included in block, then apply that filter to exclude txns with fee rate < min from rebroadcast set. this will reduce rebroadcast noise in scenarios where the mempool is emptying out.

--

The crux of the thinking is that we are not going to get the rebroadcast set perfect, but that is ok. When a node rebroadcasts a txn, it sends an INV message to new connections (see filterInventoryKnown). Since INV messages are relatively small & can incorporate many transactions, we have some leeway.

All these different mechanisms are to reduce noise. I want to ensure the parameters chosen allow the worst case scenario (rebroadcasting the full set) to not be disruptive to the network. And these mechanisms should make the worst case infrequent.

Does this make sense to you? Let me know if you still have questions.

@mzumsande

This comment has been minimized.

Copy link
Contributor

commented Aug 31, 2019

Thanks for the answer! I think that your approach makes sense and am looking forward to the traffic/bandwidth analysis.
I am still not so sure if your approach is best described with the notion of "should have been mined", but to a degree that's just semantics.

@DrahtBot

This comment has been minimized.

Copy link
Contributor

commented Sep 7, 2019

Needs rebase
@laanwj

This comment has been minimized.

Copy link
Member

commented Sep 16, 2019

Concept ACK

@MarcoFalke

This comment has been minimized.

Copy link
Member

commented Sep 18, 2019

Are you still working on this? This missed the 0.19 feature freeze, but I'd like to see it in 0.20 (hopefully early in the cycle)

@MarcoFalke MarcoFalke added this to the 0.20.0 milestone Sep 18, 2019

@amitiuttarwar

This comment has been minimized.

Copy link
Contributor Author

commented Sep 19, 2019

@MarcoFalke yes will resume soon

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
9 participants
You can’t perform that action at this time.