Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

net_processing: Retry notfounds with more urgency #18238

Open
wants to merge 1 commit into
base: master
from

Conversation

@ajtowns
Copy link
Contributor

ajtowns commented Mar 2, 2020

Anytime we see a NOTFOUND in response to a request for a tx, look through
each of our peers for anyone else who announced the tx, find one who
doesn't already have its inflight tx count maxed out, and of those,
make the one who'd look at it first, look at it asap.

Anytime we see a NOTFOUND in response to a request for a tx, look through
each of our peers for anyone else who announced the tx, find one who
doesn't already have its inflight tx count maxed out, and of those,
make the one who'd look at it first, look at it asap.
@fanquake fanquake added the P2P label Mar 2, 2020
@ajtowns

This comment has been minimized.

Copy link
Contributor Author

ajtowns commented Mar 2, 2020

This is an alternative approach to #15505 that's minimally invasive on the data structures. Doing something about this was suggested as a pre-req for #17303 in #17303 (comment)

@sdaftuar expressed some concerns about how much of a CPU hit it could be in worst case scenarios, but it doesn't look too bad in testing:

diff --git a/src/net_processing.cpp b/src/net_processing.cpp
index 3373f7f544..1b9e66456a 100644
--- a/src/net_processing.cpp
+++ b/src/net_processing.cpp
@@ -735,8 +735,13 @@ std::chrono::microseconds CalculateTxGetDataTime(const uint256& txid, std::chron
     return process_time;
 }
 
+static uint64_t xxx_time_spent GUARDED_BY(cs_main) = 0;
+static uint64_t xxx_invocations GUARDED_BY(cs_main) = 0;
+
 static void RetryProcessTx(CConnman& connman, const uint256& txid, const std::chrono::microseconds current_time) EXCLUSIVE_LOCKS_REQUIRED(cs_main)
 {
+    uint64_t time_start = GetTimeMicros();
+
     CNodeState::TxDownloadState* best_d = nullptr;
     std::chrono::microseconds best;
 
@@ -764,6 +769,12 @@ static void RetryProcessTx(CConnman& connman, const uint256& txid, const std::ch
             }
         }
     }
+
+    xxx_time_spent += GetTimeMicros() - time_start;
+    ++xxx_invocations;
+    if (xxx_invocations % 1000 == 0) {
+        LogPrintf("Time spent in RetryProcessTx %d.%03ds, %d us per call (%d calls)\n", xxx_time_spent / 1000000, (xxx_time_spent / 1000) % 1000, xxx_time_spent/xxx_invocations, xxx_invocations);
+    }
 }
 
 void RequestTx(CNodeState* state, const uint256& txid, std::chrono::microseconds current_time) EXCLUSIVE_LOCKS_REQUIRED(cs_main)
diff --git a/test/functional/p2p_notfound_perf.py b/test/functional/p2p_notfound_perf.py
new file mode 100755
index 0000000000..970452696c
--- /dev/null
+++ b/test/functional/p2p_notfound_perf.py
@@ -0,0 +1,63 @@
+#!/usr/bin/env python3
+# Copyright (c) 2017-2018 The Bitcoin Core developers
+# Distributed under the MIT software license, see the accompanying
+# file COPYING or http://www.opensource.org/licenses/mit-license.php.
+"""Test that we don't leak txs to inbound peers that we haven't yet announced to"""
+
+import time
+from test_framework.messages import msg_notfound, msg_inv, CInv
+from test_framework.mininode import P2PDataStore
+from test_framework.test_framework import BitcoinTestFramework
+from test_framework.util import (
+    assert_equal,
+)
+
+
+class P2PNode(P2PDataStore):
+    def on_inv(self, msg):
+        pass
+
+    def on_getdata(self, msg):
+        t = time.time()
+        self.notfound_queue.extend(msg.inv)
+        for inv in msg.inv:
+            self.getdata[inv] = t
+        while len(self.notfound_queue) >= 100:
+            self.send_message(msg_notfound(vec=self.notfound_queue[:100]))
+            self.notfound_queue = self.notfound_queue[100:]
+
+    def summary(self):
+        return len(self.getdata), len(self.notfound_queue)
+
+class P2PNotFoundPerf(BitcoinTestFramework):
+    def set_test_params(self):
+        self.num_nodes = 1
+
+    def run_test(self):
+        PEERS = 11
+        TRANSACTIONS = 99000
+
+        gen_node = self.nodes[0]  # The block and tx generating node
+        gen_node.generate(1)
+
+        inbound_peers = [ self.nodes[0].add_p2p_connection(P2PNode()) for _ in range(PEERS) ]
+        for inbound in inbound_peers:
+            inbound.getdata = {}
+            inbound.notfound_queue = []
+
+        for txbatch in range(TRANSACTIONS//100):
+            self.log.info("Doing batch %d" % (txbatch+1))
+            ann = [CInv(t=1, h=(txbatch*1000+i)) for i in range(100)]
+            for inbound in inbound_peers:
+                inbound.send_message(msg_inv(inv=ann))
+
+        #gen_node.logging(exclude=['net'])
+
+
+        for i in range(60):
+            self.log.info("State: " + " ".join("%d:%d" % inbound.summary() for inbound in inbound_peers))
+            time.sleep(15)
+
+
+if __name__ == '__main__':
+    P2PNotFoundPerf().main()

Gives: 2020-03-02T06:58:54.778347Z [msghand] Time spent in RetryProcessTx 18.754s, 17 us per call (1089000 calls) by the end for me, which doesn't seem too bad. It took about 12 minutes for all 99k transactions to get requested/notfound by all 11 peers, so 19 seconds total doesn't seem too bad. (The check for MAX_PEER_TX_IN_FLIGHT significantly cuts down on whether this actually does anything -- if there's lots of tx's announced by each peer, then most of them will always have most tx's in flight)

@DrahtBot

This comment has been minimized.

Copy link
Contributor

DrahtBot commented Mar 2, 2020

The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

Conflicts

Reviewers, this pull request conflicts with the following ones:

  • #18446 (test: Add test for wtxid transaction relay by fjahr)
  • #18261 (Erlay: bandwidth-efficient transaction relay protocol by naumenkogs)
  • #18044 (Use wtxid for transaction relay by sdaftuar)

If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

@DrahtBot DrahtBot mentioned this pull request Mar 2, 2020
1 of 2 tasks complete
@naumenkogs

This comment has been minimized.

Copy link
Contributor

naumenkogs commented Mar 2, 2020

Concept ACK. Code looks good to me.
Can you provide more context to understand the test? Are you basically trying to show that DoS-vector your PR opens is probably less than those which already exist?

@ajtowns

This comment has been minimized.

Copy link
Contributor Author

ajtowns commented Mar 4, 2020

@naumenkogs Yeah; in a worst case scenario you could get 100 txids in a NOTFOUND message, have 125 peers to cycle through, and each peer could have announced 100k txs, so you end up with something like O( 12,500 * log(100000) ) operations as a result of a single message (albeit with a fair chunk of setup). The test indicates that 11 peers on my hardware gives a time of about 20us per txid, so per NOTFOUND with 110 peers (so a factor of 100*10) that might result in 20ms processing time on a single message. That seemed low enough to be worth a PR to me.

If that's not good enough, could make it a bit lower in the worst case fairly easily by not retrying NOTFOUNDs as aggressively (eg, only do it for the first 10 txids in a NOTFOUND message, or ban peers if they often report a txid as NOTFOUND despite having INVed it recently, or change the retry delay from 0s-2s to 2s-4s, etc); but it'd need more rework of the data structures to get it to work efficiently, and then you have to ensure the new data structures don't introduce different DoS possibilities.

@jnewbery jnewbery mentioned this pull request Mar 4, 2020
@naumenkogs

This comment has been minimized.

Copy link
Contributor

naumenkogs commented Mar 4, 2020

No, I think this sounds reasonable. I'd rather not introduce new data structures. As I said, we probably already have easier ways to DoS a node.

luke-jr added a commit to bitcoinknots/bitcoin that referenced this pull request Mar 5, 2020
Anytime we see a NOTFOUND in response to a request for a tx, look through
each of our peers for anyone else who announced the tx, find one who
doesn't already have its inflight tx count maxed out, and of those,
make the one who'd look at it first, look at it asap.

Github-Pull: bitcoin#18238
Rebased-From: a204d15
@mzumsande

This comment has been minimized.

Copy link
Contributor

mzumsande commented Mar 20, 2020

Concept ACK.

One difference to #15505 is that the retry-request there would be from outbound peers only, whereas this PR also includes inbound peers.
If I understand it correctly, this does not create any additional attack surface (e.g. for InvBlocks), because we keep the exact order of retries from the case where the peer does not answer at all (not even with NOTFOUND), just ask our next peer in line earlier than we would have otherwise.

@naumenkogs

This comment has been minimized.

Copy link
Contributor

naumenkogs commented Mar 22, 2020

@mzumsande there's a little threat that, unlike with prior behavior, we won't have a 1 minute window between first peer announced a tx and followed up with NOTFOUND, and we execute requests for the next peers, during which more honest outbound peers can come with announcements and get prioritized against inbound dishonest peers.

But the only ways I think this can be exploited require dropping a transaction from an honest announcer's mempool, which is probably a more critical vulnerability anyway, so this doesn't reduce the overall security.

}
}

std::chrono::microseconds process_time = current_time + GetRandMicros(MAX_NOTFOUND_RETRY_RANDOM_DELAY);

This comment has been minimized.

Copy link
@naumenkogs

naumenkogs Mar 22, 2020

Contributor

Why is 2-seconds random delay useful here? It can accidentally prioritize inbound nodes over outbound :)

This comment has been minimized.

Copy link
@ajtowns

ajtowns Mar 24, 2020

Author Contributor

It already tests that process_time < best so the 2s delay will only bring a process time forward, not push it back past an inbound peer's process time.

I think the scenario is likely to be:

  • bunch of peers announce a new tx, outbound connections get polled asap, inbound connections get polled after 2s
  • you pick one of those, likely an outbound, and send a GETDATA
  • other peers announce, get a 60s-64s delay
  • the early announcing peers get their process time rescheduled with an additional 60-64s delay
  • a NOTFOUND comes back, and you pick whoever's delay was shortest; if it came back in under 2s, you'll next ask an inbound connection who announced early; if it came back later, you'll probably ask an outbound connection who announced after you'd sent the first GETDATA

Which is to say, I don't think we're consistently prioritising outbounds after the first request anyway.

I don't have a good rationale for the 2s random delay; instantly going from a NOTFOUND from one peer to a GETDATA to another peer seems risky to me though.

This comment has been minimized.

Copy link
@naumenkogs

naumenkogs Mar 24, 2020

Contributor

Yeah, I was a bit confused, now what's on your mind makes sense to me.
I'm probably fine with leaving your 2s random delay.

Which is to say, I don't think we're consistently prioritising outbounds after the first request anyway.

I think we do? That's probably off-topic here, although i'd be curious why you think so.

std::chrono::microseconds best;

for (auto& el : mapNodeState) {
CNodeState::TxDownloadState* d = &el.second.m_tx_download;

This comment has been minimized.

Copy link
@naumenkogs

naumenkogs Mar 24, 2020

Contributor

Could we come up with a better variable name? :(
Maybe "tx_communication" or something. Yeah it sucks, but probably nothing is worse than 1-letter name for something meaningful like this.

This comment has been minimized.

Copy link
@naumenkogs

naumenkogs Mar 24, 2020

Contributor

also best->best_time

This comment has been minimized.

Copy link
@ariard

ariard Mar 25, 2020

Contributor

Or I renamed best to lower_process_time while reviewing, but I agree names can be more meaningful.

@@ -4039,6 +4076,7 @@ bool PeerLogicValidation::SendMessages(CNode* pto)
// Erase this entry from tx_process_time (it may be added back for
// processing at a later time, see below)
tx_process_time.erase(tx_process_time.begin());
state.m_tx_download.m_tx_announced[txid] = std::chrono::microseconds::zero();

This comment has been minimized.

Copy link
@naumenkogs

naumenkogs Mar 24, 2020

Contributor

I think this line handles a tricky corner case. Can we add comment to reduce the cognitive burned on code reviewer?
What I think happens is (correct me if I'm wrong):

  • request from X1, not found
  • request from X2, ignores us for 1 minute
  • request from X3, notfound
  • this line prevents us from querying X2 again (we would if this line is deleted)

Now that I wrote this, one could simply check the presence of m_tx_in_flight for that peer, and if it is in flight, not consider for re-query?

Copy link
Contributor

ariard left a comment

Maybe commit message can be clearer and points to issue solved :

"Removing mapRelay would be a direct privacy improvement, but it may turn small-mempool peers as Dosers. If announced transactions are dropped from mempools and aren't available anymore in mapRelay, requesters will keep sending GETDATA until download expiration. By retrying requesting NOTFOUND transactions with different peers we avoid this issue. Note a dishonest peer can still withhold NOTFOUND to trigger same behavior from requester."?

std::chrono::microseconds best;

for (auto& el : mapNodeState) {
CNodeState::TxDownloadState* d = &el.second.m_tx_download;

This comment has been minimized.

Copy link
@ariard

ariard Mar 25, 2020

Contributor

Or I renamed best to lower_process_time while reviewing, but I agree names can be more meaningful.

CNodeState::TxDownloadState* best_d = nullptr;
std::chrono::microseconds best;

for (auto& el : mapNodeState) {

This comment has been minimized.

Copy link
@ariard

ariard Mar 25, 2020

Contributor

Did you consider to filter by nPreferredDownload to favor outbound peers and lower DoS risk? Only querying outbound peers should be good enough to achieve goal of finding NOTFOUND transactions but even if we don't success due to bad-connectivity of our outbounds, worst-case scenario we hit the 1-min window (and a transaction not being announced by our outbounds is less likely to be a honest announcement?)

This comment has been minimized.

Copy link
@jnewbery

jnewbery Mar 25, 2020

Member

I think it'd be even better to change the 'best' definition to prefer outbound peers where we can, but fall back to inbound peers if there are no outbound peers that have announced the tx.

Copy link
Member

jnewbery left a comment

Concept ACK, but lots of questions.

My main concern is that TxDownloadState is becoming more complex, with data duplicated between member variables and only certain combinations being valid states. To reduce cognitive overload and risk of bugs, perhaps it should turned into a class with a well-defined interface for callers.

static void RetryProcessTx(CConnman& connman, const uint256& txid, const std::chrono::microseconds current_time) EXCLUSIVE_LOCKS_REQUIRED(cs_main)
{
CNodeState::TxDownloadState* best_d = nullptr;
std::chrono::microseconds best;

This comment has been minimized.

Copy link
@jnewbery

jnewbery Mar 25, 2020

Member

I get a compiler warning that best might be used before initialization:

net_processing.cpp: In function ‘bool ProcessMessage(CNode*, const string&, CDataStream&, int64_t, const CChainParams&, CConnman*, BanMan*, const std::atomic<bool>&)’:
net_processing.cpp:756:27: warning: ‘best’ may be used uninitialized in this function [-Wmaybe-uninitialized]
     if (best_d != nullptr && process_time < best) {
         ~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~
net_processing.cpp:741:31: note: ‘best’ was declared here
     std::chrono::microseconds best;
                               ^~~~

The logic below means that this can't be read before set, but I don't see any harm in initializing to 0 or std::chrono::microseconds::max().

/* Store all the transactions a peer has recently announced,
* along with their process time
*/
std::map<uint256, std::chrono::microseconds> m_tx_announced;

This comment has been minimized.

Copy link
@jnewbery

jnewbery Mar 25, 2020

Member

This can be an unordered map for average constant-time lookup:

diff --git a/src/net_processing.cpp b/src/net_processing.cpp
index 3373f7f54..2e097d44e 100644
--- a/src/net_processing.cpp
+++ b/src/net_processing.cpp
@@ -30,6 +30,7 @@
 
 #include <memory>
 #include <typeinfo>
+#include <unordered_map>
 
 #if defined(NDEBUG)
 # error "Bitcoin cannot be compiled without assertions."
@@ -204,6 +205,11 @@ namespace {
     static std::vector<std::pair<uint256, CTransactionRef>> vExtraTxnForCompact GUARDED_BY(g_cs_orphans);
 } // namespace
 
+struct TxHasher
+{
+    size_t operator()(const uint256& hash) const { return ReadLE64(hash.begin()); }
+};
+
 namespace {
 /**
  * Maintain validation-specific state about nodes, protected by cs_main, instead
@@ -349,7 +355,7 @@ struct CNodeState {
         /* Store all the transactions a peer has recently announced,
          * along with their process time
          */
-        std::map<uint256, std::chrono::microseconds> m_tx_announced;
+        std::unordered_map<uint256, std::chrono::microseconds, TxHasher> m_tx_announced;
 
         //! Store transactions which were requested by us, with timestamp

This comment has been minimized.

Copy link
@ajtowns

ajtowns Mar 26, 2020

Author Contributor

I think you'd need that to remain constant/log time in worst-case/attack scenarios because the announced hash is completely under the control of an attacker.

//! Store all the transactions a peer has recently announced
std::set<uint256> m_tx_announced;
/* Store all the transactions a peer has recently announced,
* along with their process time

This comment has been minimized.

Copy link
@jnewbery

jnewbery Mar 25, 2020

Member

I think this comment could be slightly improved. The time is set to:

  • std::chrono::microseconds::zero() if the transaction has been requested from this peer (ie exists in m_tx_in_flight)
  • the process time if the transaction has not yet been requested from this peer (ie exists in m_tx_process_time)
@@ -3222,6 +3257,7 @@ bool static ProcessMessage(CNode* pfrom, const std::string& strCommand, CDataStr
std::vector<CInv> vInv;
vRecv >> vInv;
if (vInv.size() <= MAX_PEER_TX_IN_FLIGHT + MAX_BLOCKS_IN_TRANSIT_PER_PEER) {

This comment has been minimized.

Copy link
@jnewbery

jnewbery Mar 25, 2020

Member

Did you consider moving all of the NOTFOUND processing to its own function ProcessNotFound()? Splitting the logic between here and RetryProcessTx() makes it more difficult to follow than if it was all in one place.

auto end = best_d->m_tx_process_time.end();
for (auto it = best_d->m_tx_process_time.lower_bound(best); it != end && it->first == best; ++it) {
if (it->second == txid) {
best_d->m_tx_process_time.erase(it);

This comment has been minimized.

Copy link
@jnewbery

jnewbery Mar 25, 2020

Member

Maybe I'm misunderstanding the logic here, but this will bring the NodeState's m_tx_process_time forward (to now + ~1second), but does nothing to the global g_already_asked_for time, which means that next time we go round SendMessages(), we won't actually rerequest the transaction, since last_request_time > current_time - GETDATA_TX_INTERVAL. Am I missing something?

If I'm right, I think we need to do something like reset g_already_asked_for to one minute ago.

@@ -3234,6 +3270,7 @@ bool static ProcessMessage(CNode* pfrom, const std::string& strCommand, CDataStr
}
state->m_tx_download.m_tx_in_flight.erase(in_flight_it);
state->m_tx_download.m_tx_announced.erase(inv.hash);

This comment has been minimized.

Copy link
@jnewbery

jnewbery Mar 25, 2020

Member

Now that we're potentially rerequesting txs from different peers after receiving a NOTFOUND, I think we shouldn't erase them from a peer's m_tx_in_flight and m_tx_announced structures in receipt of a NOTFOUND, and keep them there until the TX_EXPIRY_INTERVAL in SendMessages(), otherwise there might be an attack where an adversary has multiple connections to you, and then juggles you between them by sending an INV for a transaction, holding on to the GETDATA, and then sending a NOTFOUND immediately followed by an INV reannouncing the same tx.

@fjahr

This comment has been minimized.

Copy link
Contributor

fjahr commented Mar 25, 2020

Concept ACK but echoing jnewbery's feedback I also still have questions and agree with most of his code comments. I also think that code is currently not tested, so adding an explicit test would be very valuable. Additionally, comments in the RetryProcessTx code would be helpful for understanding.

nit on the commit message: s/find one who doesn't already have its inflight/find all who don't already have their inflight/

Copy link
Contributor

jkczyz left a comment

Concept ACK

if (d->m_tx_in_flight.size() >= MAX_PEER_TX_IN_FLIGHT) continue;
auto it = d->m_tx_announced.find(txid);
if (it != d->m_tx_announced.end()) {
if (best_d == nullptr || (it->second != std::chrono::microseconds::zero() && it->second < best)) {

This comment has been minimized.

Copy link
@jkczyz

jkczyz Mar 25, 2020

Contributor

This condition could probably be simplified considerably if (1) best is initialized appropriately and (2) the entry for the txid in m_tx_announced is removed on line 4079 rather than zeroed (like what is done on line 3272). Was zeroing chosen on line 4079 for a specific reason?

This comment has been minimized.

Copy link
@naumenkogs

naumenkogs Mar 25, 2020

Contributor

My guess for the reason is here :)

CNodeState::TxDownloadState* best_d = nullptr;
std::chrono::microseconds best;

for (auto& el : mapNodeState) {

This comment has been minimized.

Copy link
@jnewbery

jnewbery Mar 25, 2020

Member

I think it'd be even better to change the 'best' definition to prefer outbound peers where we can, but fall back to inbound peers if there are no outbound peers that have announced the tx.

@@ -344,8 +346,10 @@ struct CNodeState {
*/
std::multimap<std::chrono::microseconds, uint256> m_tx_process_time;

//! Store all the transactions a peer has recently announced
std::set<uint256> m_tx_announced;
/* Store all the transactions a peer has recently announced,

This comment has been minimized.

Copy link
@jnewbery

jnewbery Mar 25, 2020

Member

We discussed this in PR Review Club (https://bitcoincore.reviews/18238.html#l-115 and https://bitcoincore.reviews/18238.html#l-180), and there was some agreement that even for a small struct like this, the dependencies between the fields make it difficult to reason about and potentially bug-prone.

I think from a high level, in the future we might want a TxDownloadState class which contains:

  • {txid, timestamp, state} objects where
    • state can be 'announced' or 'requested'
    • timestamp refers to 'when to request' if state is announced, and 'when requested' if state is requested
  • a way to lookup the object by txid
  • a way to iterate through the 'announced' objects sorted by timestamp
  • a way to iterate through the 'requested' objects (doesn't necessarily need to be sorted by timestamp since there can only be MAX_PEER_TX_IN_FLIGHT = 100 of them)
  • public functions to add, remove, refresh timestamps, etc

This comment has been minimized.

Copy link
@ajtowns

ajtowns Mar 26, 2020

Author Contributor

I'm +1 on this, but maybe it'd be better to get wtxid relay #18044 in first?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Linked issues

Successfully merging this pull request may close these issues.

None yet

9 participants
You can’t perform that action at this time.