p2p: Fill reconciliation sets (Erlay) #28765

naumenkogs · 2023-11-01T09:06:40Z

Keep track of per-peer reconciliation sets containing transactions to be exchanged efficiently. The remaining transactions are announced via usual flooding.

Erlay Project Tracking: #28646

DrahtBot · 2023-11-01T09:06:43Z

The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

Code Coverage

For detailed information about the code coverage, see the test coverage report.

Reviews

See the guideline for information on the review process.

Type	Reviewers
Concept ACK	brunoerg
Stale ACK	sr-gi, mzumsande

If your review is incorrectly listed, please react with 👎 to this comment and the bot will ignore it on the next update.

Conflicts

No conflicts as of last run.

mzumsande

Approach ACK

src/net_processing.cpp

src/test/txreconciliation_tests.cpp

src/node/txreconciliation.cpp

src/test/txreconciliation_tests.cpp

brunoerg · 2023-11-02T13:16:30Z

Concept ACK

src/net_processing.cpp

src/test/txreconciliation_tests.cpp

src/node/txreconciliation.cpp

DrahtBot · 2023-11-16T11:43:01Z

node/txreconciliation.cpp:173 AddToSet: Assertion `recon_state.m_local_set.insert(wtxid).second' failed.

src/node/txreconciliation.h

src/net_processing.cpp

src/node/txreconciliation.cpp

mzumsande · 2023-11-21T18:37:33Z

src/net_processing.cpp

+                                //
+                                // Potentially reconciling parent+child would mean that for every
+                                // child we need to to check if any of the parents is currently
+                                // reconciled so that the child isn't fanouted ahead. But then


Although this reduced it, I think the situation where the child is fanouted ahead could still happen if we receive the parent first, add it to the recon set, and only after that receive the child and decide to fanout it.
Not sure if that is a problem though.

You're right.

My fear with this is unexpected behavior for the tx sender: e.g., you craft a "package" assuming the parent always goes ahead, but then the child gets ahead (potentially with an attacker's help) and is dropped on the floor due to some policy. Something along these lines, but maybe I'm making it up.
Are these concerns at least semi-valid? @glozow

I can add a "see whether a parent is already in the set" check when looking at a child, if we think it's worth it.

I confirm that with the currently implemented BIP 331 approach there is a MAX_ORPHAN_TOTAL_SIZE limit.
You can always get a non-standard parent (e.g. one with an under-dust output) while the child is policy-valid.
There is currently no sanitization of policy equivalence among a set of parents within ancpkginfo.
In the future, you could have reconciliation at the pkgtxns level or at package announcement (ancpkginfo).
Ideally both, though that is something that can be assessed once Erlay or BIP 331 are deployed.

Assuming there is no substantial delay between the parent being reconciled and the child being fanned out to the peer (a delay that would allow an overflow of MAX_ORPHAN_TOTAL_SIZE), I don't think it alters the package acceptance of the receiving peer.

Assuming no exploitable timers, one can still run a simulation to quantify the "child drift" risk for a distribution of parents / children being reconciled / fanned out, based on the average time discrepancy between those two tx announcement strategies. Ideally, in the future, we would move to sender-initiated packages, which would remove this concern as far as I understand. However, that is already a post-BIP 331 future we're talking about.
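
For illustration, the "see whether a parent is already in the set" check discussed in this thread might look roughly like the sketch below. It is not code from the PR: `TxId`, `MayFanoutChild`, and the plain `std::set` standing in for a peer's reconciliation set are all hypothetical simplifications.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <vector>

// Hypothetical stand-in for a wtxid; the PR itself works with the Wtxid type.
using TxId = std::string;

// Returns true if the child may be fanned out right away, i.e. none of its
// parents is still waiting in the peer's reconciliation set. If a parent is
// queued for reconciliation, the child should follow it into the set so that
// it cannot be announced ahead of the parent.
bool MayFanoutChild(const std::set<TxId>& recon_set, const std::vector<TxId>& parents)
{
    for (const TxId& parent : parents) {
        if (recon_set.count(parent) > 0) return false;
    }
    return true;
}
```

A caller would run this before the fanout decision for a child and add the child to the reconciliation set whenever it returns false. What to do when that set is already full is the open question raised in the code comment under review.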

sr-gi

Concept ACK

src/node/txreconciliation.h

src/node/txreconciliation.cpp

src/net_processing.cpp

src/node/txreconciliation.cpp

src/net_processing.cpp

naumenkogs · 2023-11-24T08:53:58Z

Addressed the comments, mostly refactoring. Some conversations pending above. The code is good for review.

naumenkogs · 2023-12-04T09:13:29Z

Addressed all comments. Ready for review.

src/net_processing.cpp

ariard · 2024-02-20T23:36:45Z

src/net_processing.cpp

+                                // can't just be added; b) removing parents from reconciliation
+                                // sets for this one child is not good either.
+                                if ((*txiter)->GetCountWithDescendants() <= 1) {
+                                    fanout = m_txreconciliation->ShouldFanoutTo(wtxid, pto->GetId(),


You could add a LogPrint(BCLog::NET, "Non-signaling reconciliation inbound peers flooding %d Outbound peers flooding %d for debug"); here, for debugging and observation.

ariard · 2024-02-20T23:56:42Z

src/net_processing.cpp

src/node/txreconciliation.cpp

ariard · 2024-02-22T19:37:19Z

src/node/txreconciliation.cpp

+        if (m_tx_fanout_targets_cache_order.size() == FANOUT_TARGETS_PER_TX_CACHE_SIZE) {
+            auto expired_tx = m_tx_fanout_targets_cache_order.front();
+            m_tx_fanout_targets_cache_data.erase(expired_tx);
+            m_tx_fanout_targets_cache_order.pop_front();


In the event of an influx of inbound transactions arriving faster than we can flush them out to low-fanout flooding peers, my understanding is that, once the front wtxid candidate is dropped, we would keep propagating this transaction only according to tx-relay policy and the connection state of other peers (not this NodeId anymore).

I understand we're fanning out only to outbound peers (m_tx_fanout_targets_cache_data doc), though here it's more a dependency on the performance capabilities of the full node itself (i.e. how fast you process vInv(MSG_WTX) and how fast you re-announce them to downstream peers if valid). To interfere with a transaction's propagation, assuming a non-listening node, an attacker would have to puppet or compromise all our low-fanout outbound peers, I think? Obviously more outbound peers would make things better on this front, which should be allowed by Erlay's tx-relay bandwidth savings.
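
As a reading aid, the eviction pattern in the hunk above (an order queue plus a data map, dropping the oldest entry once the cache is full) can be sketched in isolation. This is a simplified model, not the PR's code: `FanoutTargetsCache`, `TxKey`, `PeerId`, and the capacity passed to the constructor are hypothetical, and integer keys stand in for wtxids.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <deque>
#include <map>
#include <utility>
#include <vector>

// Hypothetical simplified types: the PR keys the cache by wtxid and stores NodeIds.
using TxKey = uint64_t;
using PeerId = int64_t;

class FanoutTargetsCache
{
    const size_t m_capacity;
    std::deque<TxKey> m_order;                   // insertion order, oldest at the front
    std::map<TxKey, std::vector<PeerId>> m_data; // cached fanout targets per transaction

public:
    explicit FanoutTargetsCache(size_t capacity) : m_capacity{capacity} {}

    void Insert(TxKey tx, std::vector<PeerId> targets)
    {
        if (m_capacity == 0) return;
        auto it = m_data.find(tx);
        if (it != m_data.end()) {
            // Already cached: refresh the targets, keep the original position.
            it->second = std::move(targets);
            return;
        }
        if (m_order.size() == m_capacity) {
            // Cache is full: evict the oldest transaction first.
            m_data.erase(m_order.front());
            m_order.pop_front();
        }
        m_order.push_back(tx);
        m_data.emplace(tx, std::move(targets));
    }

    // Returns the cached targets, or nullptr if the entry was never added or was evicted.
    const std::vector<PeerId>* Find(TxKey tx) const
    {
        auto it = m_data.find(tx);
        return it == m_data.end() ? nullptr : &it->second;
    }
};
```

In this model an evicted transaction only loses its cached targets. Whether the targets are then recomputed on the next lookup is a property of the surrounding code and is not shown here.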

ariard

Reviewed up to be8ef38d29. Still reading back the txrelayism issue on the announcement-related bandwidth / latency and responsibilities trade-offs behind the choice of the current constants.

ariard · 2024-02-24T02:08:42Z

src/node/txreconciliation.cpp

     * These values are used to salt short IDs, which is necessary for transaction reconciliations.
     */
    uint64_t m_k0, m_k1;

+    /**
+     * Store all wtxids which we would announce to the peer (policy checks passed, etc.)


In terms of peer-side policy checks (i.e. m_fee_filter_received), this policy limit is at the tx-relay link level and is unilaterally initiated by the peer. As such, I think there is no guarantee that between time point A, when we add a Wtxid to m_local_set, and time point B, when we reconcile, we have not received a new BIP 133 message updating m_fee_filter_received. I believe stored Wtxids can thereby become stale retroactively, which means a bandwidth performance leak in situations of sudden network mempool spikes.

I don't think there is much that a tx-announcement strategy (either flooding or reconciliation) can do about it by itself, short of assuming some extension to BIP 133 messages to commit to a feerate level for some duration. As such, I think any improvement is out of scope for this PR.
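
To make the staleness window concrete, here is a purely illustrative sketch of re-checking queued announcements against the most recent fee filter at reconciliation time. The PR does not do this, and the comment above considers it out of scope. `DropBelowFeeFilter`, `TxKey`, `FeeRate`, and the map-based set are hypothetical simplifications.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <vector>

// Hypothetical simplified model: transaction key -> feerate of a queued announcement.
using TxKey = uint64_t;
using FeeRate = int64_t;

// Removes every queued entry whose feerate is below the peer's latest fee
// filter and returns the keys that were dropped as stale.
std::vector<TxKey> DropBelowFeeFilter(std::map<TxKey, FeeRate>& local_set, FeeRate fee_filter)
{
    std::vector<TxKey> stale;
    for (auto it = local_set.begin(); it != local_set.end();) {
        if (it->second < fee_filter) {
            stale.push_back(it->first);
            it = local_set.erase(it); // erase returns the next valid iterator
        } else {
            ++it;
        }
    }
    return stale;
}
```

Entries added at time point A with a feerate that was acceptable then are dropped at time point B if the peer has raised its filter in between. Without such a pass they would still take part in the reconciliation.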

ariard · 2024-02-24T02:33:49Z

src/node/txreconciliation.cpp

+        // - limit CPU use for sketch computations.
+        //
+        // Since we reconcile frequently, reaching capacity either means:
+        // (1) a peer for some reason does not request reconciliations from us for a long while, or


I think "(1)" can be extended a bit more, e.g. "Memory DoS issues for a laggy peer are bounded by DEFAULT_MAX_PEER_CONNECTIONS and reconciliation state is cleaned up with FinalizeNode".

src/bench/txreconciliation.cpp

ariard · 2024-02-27T21:12:13Z

src/net_processing.cpp

+                                // it gets tricky when reconciliation sets are full: a) the child
+                                // can't just be added; b) removing parents from reconciliation
+                                // sets for this one child is not good either.
+                                if ((*txiter)->GetCountWithDescendants() <= 1) {


One follow-up improvement: all the descendants in GetCountWithDescendants() could be marked with parent_fanout=true. That way we guarantee more stringently that all the members of a chain of transactions are announced through the same strategy (either Erlay or low-fanout flooding). I'll check if there is test coverage here.

Empact · 2024-02-29T20:46:36Z

src/net_processing.cpp

@@ -175,6 +175,8 @@ static constexpr double MAX_ADDR_RATE_PER_SECOND{0.1};
 static constexpr size_t MAX_ADDR_PROCESSING_TOKEN_BUCKET{MAX_ADDR_TO_SEND};
 /** The compactblocks version we support. See BIP 152. */
 static constexpr uint64_t CMPCTBLOCKS_VERSION{2};
+/** Used to determine whether to use low-fanout flooding (or reconciliation) for a tx relay event. */
+static const uint64_t RANDOMIZER_ID_FANOUTTARGET = 0xbac89af818407b6aULL; // SHA256("fanouttarget")[0:8]


constexpr?

FANOUTTARGET -> FANOUT_TARGET?

Empact · 2024-02-29T21:01:43Z

src/node/txreconciliation.cpp

+
+        std::vector<NodeId> new_fanout_candidates;
+        new_fanout_candidates.reserve(targets_size);
+        for_each(best_peers.begin(), best_peers.end(),


std::for_each?
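
Some background on this nit: an unqualified `for_each` only compiles when argument-dependent lookup happens to find `std::for_each` through the iterator type, which is not guaranteed (for example, if a container's iterator is a raw pointer). A minimal sketch of the qualified form follows. The element type of `best_peers` and the helper name `CollectPeerIds` are assumptions made for the example, not taken from the PR.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <utility>
#include <vector>

using NodeId = int64_t;

// Collects the peer ids out of (hash key, peer id) pairs. The pair layout is
// an assumption for this sketch.
std::vector<NodeId> CollectPeerIds(const std::vector<std::pair<uint64_t, NodeId>>& best_peers)
{
    std::vector<NodeId> ids;
    ids.reserve(best_peers.size());
    // Qualified call: does not depend on argument-dependent lookup.
    std::for_each(best_peers.begin(), best_peers.end(),
                  [&ids](const auto& keyed_peer) { ids.push_back(keyed_peer.second); });
    return ids;
}
```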

ariard · 2024-03-01T19:30:44Z

src/node/txreconciliation.cpp

+        // We use the pre-determined randomness to give a consistent result per transaction,
+        // thus making sure that no transaction gets "unlucky" if every per-peer roll fails.
+        CSipHasher deterministic_randomizer{m_deterministic_randomizer};
+        deterministic_randomizer.Write(wtxid.ToUint256());


Looking at CSipHasher: given it's a pseudo-random hash function, I verified that it's well-initialized from two hidden random 64-bit seeds in src/init.cpp (L1239). Then we add a CSipHasher instance provided by CConnman at TxReconciliationTracker initialization in src/net_processing.cpp. This respects the SipHash PRF's requirement of being initialized with a random 128-bit key. I still wonder whether in the future TxReconciliationTracker shouldn't get its own random seed (i.e. use GetRand(), which promises fast entropy generation) to isolate tx-announcement from the rest of network connection management.

This seems to be consistent with how a deterministic randomizer is seeded in many other places in the codebase. What is your rationale for making it different here?

ariard · 2024-03-01T19:33:28Z

src/node/txreconciliation.cpp

@@ -142,9 +250,104 @@ class TxReconciliationTracker::Impl
        return (recon_state != m_states.end() &&
                std::holds_alternative<TxReconciliationState>(recon_state->second));
    }
+
+    // Not const because of caching.
+    bool IsFanoutTarget(const Wtxid& wtxid, NodeId peer_id, bool we_initiate, double limit) EXCLUSIVE_LOCKS_REQUIRED(m_txreconciliation_mutex)


This variable could be named destination rather than limit, to be consistent with ShouldFanoutTo; it also denotes more clearly that it's the sample space boundary.

ariard · 2024-03-07T20:53:10Z

src/node/txreconciliation.cpp

+        for (const auto& indexed_state : m_states) {
+            const auto cur_state = std::get_if<TxReconciliationState>(&indexed_state.second);
+            if (cur_state && cur_state->m_we_initiate == we_initiate) {
+                uint64_t hash_key = CSipHasher(deterministic_randomizer).Write(cur_state->m_k0).Finalize();


I think the comment at L66 in src/node/txreconciliation.cpp can be updated to reflect the usage of m_k0 as a SipHash input for low-fanout flood peer selection. It is not only used in ComputeShortID.

I think this can actually be seeded with anything; it doesn't have to be m_k0. IMO it'd be better if it weren't, so as not to repurpose something that is meant for something completely different.
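
The idea being discussed, deriving one hash key per peer from a per-transaction seed and choosing fanout targets from those keys, can be sketched as follows. This is a minimal model under stated assumptions: `Mix` is a splitmix64-style mixer used only as a stand-in (the PR uses CSipHasher, and this is not SipHash), `Peer::salt` is a hypothetical per-peer value that need not be m_k0, and picking the lowest keys is an arbitrary choice made for the example.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

using NodeId = int64_t;

// Stand-in 64-bit mixer (splitmix64 finalizer). NOT SipHash; it only keeps
// the sketch self-contained.
uint64_t Mix(uint64_t x)
{
    x += 0x9e3779b97f4a7c15ULL;
    x = (x ^ (x >> 30)) * 0xbf58476d1ce4e5b9ULL;
    x = (x ^ (x >> 27)) * 0x94d049bb133111ebULL;
    return x ^ (x >> 31);
}

struct Peer {
    NodeId id;
    uint64_t salt; // hypothetical per-peer value, any stable value works
};

// Deterministically picks up to targets_size peers for one transaction:
// the same (node_seed, tx_key, peers) input always yields the same targets.
std::vector<NodeId> SelectFanoutTargets(uint64_t node_seed, uint64_t tx_key,
                                        const std::vector<Peer>& peers, size_t targets_size)
{
    const uint64_t tx_seed = Mix(node_seed ^ tx_key);
    std::vector<std::pair<uint64_t, NodeId>> keyed;
    keyed.reserve(peers.size());
    for (const Peer& peer : peers) {
        keyed.emplace_back(Mix(tx_seed ^ peer.salt), peer.id);
    }
    std::sort(keyed.begin(), keyed.end()); // order peers by their hash key
    std::vector<NodeId> targets;
    for (size_t i = 0; i < keyed.size() && i < targets_size; ++i) {
        targets.push_back(keyed[i].second);
    }
    return targets;
}
```

Because the per-peer value only has to be stable and distinct per peer, nothing in this scheme requires it to be the short-ID salt, which is the point of the reply above.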

brunoerg · 2024-03-21T17:58:23Z

src/node/txreconciliation.cpp

+        auto salt_or_state = m_states.find(peer_id);
+        if (salt_or_state == m_states.end()) return nullptr;
+
+        auto* state = std::get_if<TxReconciliationState>(&salt_or_state->second);


nit: you could return it directly.
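
A sketch of what returning it directly could look like, using simplified stand-ins: the `TxReconciliationState` struct below is reduced to one field, `SaltOrState` and `GetRegisteredPeerState` are hypothetical names, and a plain `std::map` is passed in rather than using the tracker's member.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <variant>

using NodeId = int64_t;

// Reduced stand-in for the PR's per-peer state.
struct TxReconciliationState {
    bool m_we_initiate;
};

// Before registration completes only a salt is stored, afterwards the full state.
using SaltOrState = std::variant<uint64_t, TxReconciliationState>;

TxReconciliationState* GetRegisteredPeerState(std::map<NodeId, SaltOrState>& states, NodeId peer_id)
{
    auto salt_or_state = states.find(peer_id);
    if (salt_or_state == states.end()) return nullptr;
    // std::get_if yields nullptr when the variant holds the other alternative,
    // so its result can be returned without an intermediate variable.
    return std::get_if<TxReconciliationState>(&salt_or_state->second);
}
```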

brunoerg · 2024-03-21T18:08:29Z

src/node/txreconciliation.cpp

+        // Since we reconcile frequently, reaching capacity either means:
+        // (1) a peer for some reason does not request reconciliations from us for a long while, or
+        // (2) really a lot of valid fee-paying transactions were dumped on us at once.
+        // We don't care about a laggy peer (1) because we probably can't help them even if we fanout transactions.


What does "laggy peer" mean?

I'm guessing "a peer for some reason does not request reconciliations from us for a long while", hence why it references (1)

achow101 · 2024-05-16T14:15:04Z

Superseded by #30116

DrahtBot added the P2P label Nov 1, 2023

This was referenced Nov 1, 2023

p2p: Fill reconciliation sets and request reconciliation (Erlay) #26283

Closed

Erlay Project Tracking #28646

Open

DrahtBot mentioned this pull request Nov 1, 2023

Erlay: bandwidth-efficient transaction relay protocol #21515

Draft

mzumsande reviewed Nov 1, 2023

naumenkogs force-pushed the 2023-11-erlay2.1 branch 2 times, most recently from 3d69f45 to 983f8c6 Compare November 2, 2023 08:35

brunoerg reviewed Nov 2, 2023

brunoerg reviewed Nov 2, 2023

naumenkogs force-pushed the 2023-11-erlay2.1 branch 2 times, most recently from 0e460a0 to 2af4d12 Compare November 7, 2023 08:04

ariard reviewed Nov 12, 2023

naumenkogs force-pushed the 2023-11-erlay2.1 branch from 2af4d12 to 4d43c7d Compare November 15, 2023 11:04

DrahtBot added the CI failed label Nov 15, 2023

naumenkogs force-pushed the 2023-11-erlay2.1 branch from 4d43c7d to 50dc9bf Compare November 17, 2023 11:37

glozow reviewed Nov 21, 2023

mzumsande reviewed Nov 21, 2023

mzumsande reviewed Nov 21, 2023

sr-gi reviewed Nov 21, 2023

naumenkogs force-pushed the 2023-11-erlay2.1 branch from 50dc9bf to 94e6ea6 Compare November 23, 2023 11:42

DrahtBot removed the CI failed label Nov 23, 2023

naumenkogs force-pushed the 2023-11-erlay2.1 branch from 94e6ea6 to e5018f2 Compare November 24, 2023 08:52

DrahtBot mentioned this pull request Nov 28, 2023

Nuke adjusted time from validation (attempt 2) #28956

Merged

naumenkogs force-pushed the 2023-11-erlay2.1 branch 2 times, most recently from 0a59a3f to f895ae4 Compare December 4, 2023 09:13

p2p: Functions to add/remove wtxids to tx reconciliation sets

be8ef38

They will be used later on.

naumenkogs force-pushed the 2023-11-erlay2.1 branch from 7f12d4f to 4d81bec Compare January 19, 2024 11:30

This was referenced Jan 19, 2024

Improve new LogDebug/Trace/Info/Warning/Error Macros #29256

Open

logging: Update to new logging API #29231

Closed

naumenkogs force-pushed the 2023-11-erlay2.1 branch from 4d81bec to 82126d3 Compare January 22, 2024 09:40

DrahtBot removed the CI failed label Jan 22, 2024

ariard reviewed Feb 20, 2024

ariard reviewed Feb 21, 2024

p2p: Add transactions to reconciliation sets

cad6f0a

Transactions eligible for reconciliation are added to the reconciliation sets. For the remaining txs, low-fanout is used. Co-authored-by: Martin Zumsande <mzumsande@gmail.com> Co-authored-by: Pieter Wuille <pieter.wuille@gmail.com>

naumenkogs force-pushed the 2023-11-erlay2.1 branch from 82126d3 to b3db2bc Compare February 22, 2024 07:20

ariard reviewed Feb 22, 2024

ariard reviewed Feb 24, 2024

brunoerg reviewed Feb 26, 2024

mzumsande and others added 3 commits February 27, 2024 12:04

add bench for ShouldFanoutTo

87e0ec0

p2p: Cache fanout candidates to optimize txreconciliation

07f3ad4

p2p: Cache inbound reconciling peers count

a14dfd9

It helps to avoid recomputing every time we consider a transaction for fanout/reconciliation.

naumenkogs force-pushed the 2023-11-erlay2.1 branch from b3db2bc to a14dfd9 Compare February 27, 2024 10:05

ariard reviewed Feb 27, 2024

Empact reviewed Feb 29, 2024

ariard reviewed Mar 1, 2024

ariard reviewed Mar 7, 2024

This was referenced Mar 8, 2024

doc: fix typos #29593

Closed

p2p: opportunistically accept 1-parent-1-child packages #28970

Merged

brunoerg reviewed Mar 21, 2024

Prabhat1308 mentioned this pull request May 11, 2024

Review : p2p: Fill reconciliation sets (Erlay) Bitshala/BitcoinCore-PR-Review-Club#60

Open

sr-gi mentioned this pull request May 15, 2024

p2p: Fill reconciliation sets (Erlay) attempt 2 #30116

Open

achow101 closed this May 16, 2024
