New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
p2p: Fill reconciliation sets (Erlay) #28765
Conversation
The following sections might be updated with supplementary metadata relevant to reviewers and maintainers. Code CoverageFor detailed information about the code coverage, see the test coverage report. ReviewsSee the guideline for information on the review process.
If your review is incorrectly listed, please react with 👎 to this comment and the bot will ignore it on the next update. ConflictsNo conflicts as of last run. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Approach ACK
3d69f45
to
983f8c6
Compare
Concept ACK |
0e460a0
to
2af4d12
Compare
2af4d12
to
4d43c7d
Compare
|
4d43c7d
to
50dc9bf
Compare
// | ||
// Potentially reconciling parent+child would mean that for every | ||
// child we need to to check if any of the parents is currently | ||
// reconciled so that the child isn't fanouted ahead. But then |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Although this reduced it, I think the situation where the child is fanouted ahead could still happen if we receive the parent first, add it to the recon set, and only after that receive the child and decide to fanout it.
Not sure if that is a problem though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You're right.
My fear with this is unexpected behavior for tx sender: e.g., you craft a "package" thinking parent always goes ahead, but then child gets ahead (potentially with the attacker's help) and dropped on the floor due to some policy. Something along this, but maybe I'm making it up.
Are these concerns at least semi-valid? @glozow
I can add "see whether a parent is in the set already" check, when looking at a child, if we think it's worth it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I confirm with current implemented bip331 approach there is currently a MAX_ORPHAN_TOTAL_SIZE
limit.
You can always get a non-standard parent (e.g an under-dust output) and yet the child be policy valid.
There is currently no sanitization of policy equivalence among a set of parent within ancpkginfo
.
In the future, you could have reconciliation at the pkgtxns
-level or at package announcement (ancpkginfo
).
Ideally both, though that something that can be seen once erlay or bip331 are deployed.
Assuming there is no substantial timely delay between the parent being reconciliated and the child being fanout to the peer which would allow an overflow of MAX_ORPHAN_TOTAL_SIZE
, I don’t think it’s altering the package acceptance of the receiving peer.
Assuming no exploitable timers, one can still make the simulation to quantity the “child drift” risk for a distribution of parent / child being reconciliated / fanout on the average time discrepancies between those 2 tx announcement strategy. Ideally in the future, we would move to sender-initiated package, which would remove this concern from my understanding. However, this is already a post-bip331 future, we’re talking about.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Concept ACK
50dc9bf
to
94e6ea6
Compare
94e6ea6
to
e5018f2
Compare
Addressed the comments, mostly refactoring. Some conversations pending above. The code is good for review. |
0a59a3f
to
f895ae4
Compare
Addressed all comments. Ready for review. |
They will be used later on.
7f12d4f
to
4d81bec
Compare
4d81bec
to
82126d3
Compare
// can't just be added; b) removing parents from reconciliation | ||
// sets for this one child is not good either. | ||
if ((*txiter)->GetCountWithDescendants() <= 1) { | ||
fanout = m_txreconciliation->ShouldFanoutTo(wtxid, pto->GetId(), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can add a LogPrint(BCLog::NET, “Non-signaling reconciliation inbound peers flooding %d Outbound peers flooding %d for debug”);
for debug purpose and observation
// | ||
// Potentially reconciling parent+child would mean that for every | ||
// child we need to to check if any of the parents is currently | ||
// reconciled so that the child isn't fanouted ahead. But then |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I confirm with current implemented bip331 approach there is currently a MAX_ORPHAN_TOTAL_SIZE
limit.
You can always get a non-standard parent (e.g an under-dust output) and yet the child be policy valid.
There is currently no sanitization of policy equivalence among a set of parent within ancpkginfo
.
In the future, you could have reconciliation at the pkgtxns
-level or at package announcement (ancpkginfo
).
Ideally both, though that something that can be seen once erlay or bip331 are deployed.
Assuming there is no substantial timely delay between the parent being reconciliated and the child being fanout to the peer which would allow an overflow of MAX_ORPHAN_TOTAL_SIZE
, I don’t think it’s altering the package acceptance of the receiving peer.
Assuming no exploitable timers, one can still make the simulation to quantity the “child drift” risk for a distribution of parent / child being reconciliated / fanout on the average time discrepancies between those 2 tx announcement strategy. Ideally in the future, we would move to sender-initiated package, which would remove this concern from my understanding. However, this is already a post-bip331 future, we’re talking about.
Transactions eligible for reconciliation are added to the reconciliation sets. For the remaining txs, low-fanout is used. Co-authored-by: Martin Zumsande <mzumsande@gmail.com> Co-authored-by: Pieter Wuille <pieter.wuille@gmail.com>
82126d3
to
b3db2bc
Compare
if (m_tx_fanout_targets_cache_order.size() == FANOUT_TARGETS_PER_TX_CACHE_SIZE) { | ||
auto expired_tx = m_tx_fanout_targets_cache_order.front(); | ||
m_tx_fanout_targets_cache_data.erase(expired_tx); | ||
m_tx_fanout_targets_cache_order.pop_front(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In the eventuality of an influx of inbound transactions, faster than we can flush out them to low-fanout flooding peers, my understanding of dropping the upfront wtxid candidate, we would keep propagating this transaction only according to tx-relay policy and connection state of other peers (not this NodeId
anymore).
I understand we’re fanning out only to outbound peers (m_tx_fanout_targets_cache_data
doc), though here it’s more a dependency on the perfomance capabilities of the full-node itself (i.e how fast you process vInv(MSG_WTX
) and how fast you-reannounce them to downstream peers if valid). To interferes with a transaction propagation, assuming a non-listening node, an attacker would have to be puppet or compromise all our low-fanout outbound peers, I think ? Obviously more outbound peers would make things better on this front, which should be allowed by Erlay tx-relay bandwidth savings.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed up to be8ef38d29
, still reading back txrelayism issue on announcement-related bandwidth / latency and responsibilities trade-off for the choice of current constants.
* These values are used to salt short IDs, which is necessary for transaction reconciliations. | ||
*/ | ||
uint64_t m_k0, m_k1; | ||
|
||
/** | ||
* Store all wtxids which we would announce to the peer (policy checks passed, etc.) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In terms of peers-side policy check (i.e m_fee_filter_received
), this policy limit is at the tx-relay link level and this is unilaterally initiated by the peer. As such I think there is no guarantee that between time point A we add a Wtxid
in m_local_set
and time point B we reconciliate, we have not received a new bip133 message, updating the m_fee_filter_received
. I believe we can retro-actively stale stored Wtxid
and as such a bandwidth performance leak, under situations of sudden network mempool spikes.
I don’t think there is that much a tx-announcement strategy (either flooding or reconciliation) can do it in itself, unless assuming some extensions to bip133 messages to commit on a feerate-level duration. As such, I think any improvement is out of scope for this PR.
// - limit CPU use for sketch computations. | ||
// | ||
// Since we reconcile frequently, reaching capacity either means: | ||
// (1) a peer for some reason does not request reconciliations from us for a long while, or |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think “(1)” can be extended a bit more e.g “Memory DoS issue for a laggy peer are bounded by DEFAULT_MAX_PEER_CONNECTIONS
and reconciliation state is clean up with FinalizeNode
".
It helps to avoid recomputing every time we consider a transaction for fanout/reconciliation.
b3db2bc
to
a14dfd9
Compare
// it gets tricky when reconciliation sets are full: a) the child | ||
// can't just be added; b) removing parents from reconciliation | ||
// sets for this one child is not good either. | ||
if ((*txiter)->GetCountWithDescendants() <= 1) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One follow-up improvement, all the descendants in GetCountWithDescendants()
could be marked with parent_fanout=true
, that way we guarantee more stringently that all the members of a chain of transactions are tx-announcement relayed through the same strategy (either erlay or low-fanout flooding). I’ll check if there is test coverage here.
@@ -175,6 +175,8 @@ static constexpr double MAX_ADDR_RATE_PER_SECOND{0.1}; | |||
static constexpr size_t MAX_ADDR_PROCESSING_TOKEN_BUCKET{MAX_ADDR_TO_SEND}; | |||
/** The compactblocks version we support. See BIP 152. */ | |||
static constexpr uint64_t CMPCTBLOCKS_VERSION{2}; | |||
/** Used to determine whether to use low-fanout flooding (or reconciliation) for a tx relay event. */ | |||
static const uint64_t RANDOMIZER_ID_FANOUTTARGET = 0xbac89af818407b6aULL; // SHA256("fanouttarget")[0:8] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
constexpr
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FANOUTTARGET
-> FANOUT_TARGET
?
|
||
std::vector<NodeId> new_fanout_candidates; | ||
new_fanout_candidates.reserve(targets_size); | ||
for_each(best_peers.begin(), best_peers.end(), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
std::for_each
?
// We use the pre-determined randomness to give a consistent result per transaction, | ||
// thus making sure that no transaction gets "unlucky" if every per-peer roll fails. | ||
CSipHasher deterministic_randomizer{m_deterministic_randomizer}; | ||
deterministic_randomizer.Write(wtxid.ToUint256()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looking on CSipHasher
, given it’s a pseudo-random hash function, verified it’s well-initialized from two hidden random 64-bit seeds in src/init.cpp
(L1239
). Then we add a CSipHasher
instance provided by CConman
at TxReconciliationTracker
initialization in src/net_processing.cpp
. This respect the SipHash’s PRF’s requirement to initialize it with a random 128-bit key. I still wonder if in the future TxReconciliationTracker
shouldn’t get it’s own random seed (i.e use GetRand()
, it promises fast entropy generation) to isolate tx-announcement from the rest of network connection management.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems to be consistent with how a deterministic randomizer is seeded in many other places in the codebase. What is your rationale for making it different here?
@@ -142,9 +250,104 @@ class TxReconciliationTracker::Impl | |||
return (recon_state != m_states.end() && | |||
std::holds_alternative<TxReconciliationState>(recon_state->second)); | |||
} | |||
|
|||
// Not const because of caching. | |||
bool IsFanoutTarget(const Wtxid& wtxid, NodeId peer_id, bool we_initiate, double limit) EXCLUSIVE_LOCKS_REQUIRED(m_txreconciliation_mutex) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This variable name can be called destination
rather than limit
to be consistent with ShouldFanoutTo
and denotates more clearly it’s the sample space boundary.
for (const auto& indexed_state : m_states) { | ||
const auto cur_state = std::get_if<TxReconciliationState>(&indexed_state.second); | ||
if (cur_state && cur_state->m_we_initiate == we_initiate) { | ||
uint64_t hash_key = CSipHasher(deterministic_randomizer).Write(cur_state->m_k0).Finalize(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the comment L66 in src/node/txreconciliation.cpp
can be updated to reflect the usage of m_k0
as a siphash input string for low-fanout flood peers selection. Not only used in ComputeShortID
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this can actually be seeded with anything, it doesn't have to be m_k0
. IMO it'd better not be, to not repurpose something that is meant for something completely different
auto salt_or_state = m_states.find(peer_id); | ||
if (salt_or_state == m_states.end()) return nullptr; | ||
|
||
auto* state = std::get_if<TxReconciliationState>(&salt_or_state->second); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: you could return it directly.
// Since we reconcile frequently, reaching capacity either means: | ||
// (1) a peer for some reason does not request reconciliations from us for a long while, or | ||
// (2) really a lot of valid fee-paying transactions were dumped on us at once. | ||
// We don't care about a laggy peer (1) because we probably can't help them even if we fanout transactions. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What does "laggy peer" mean?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm guessing "a peer for some reason does not request reconciliations from us for a long while", hence why it references (1)
Superseded by #30116 |
Keep track of per-peer reconciliation sets containing transactions to be exchanged efficiently. The remaining transactions are announced via usual flooding.
Erlay Project Tracking: #28646