
p2p: attempt to fill full outbound connection slots with peers that support tx relay #28538

Draft: mzumsande wants to merge 3 commits into master
Conversation

@mzumsande (Contributor) commented Sep 26, 2023:

As described in issues #16418 and #28371, we could end up with an insufficient number of outbound peers relaying our transactions. Having fewer outbound peers that support tx relay could also have implications for privacy and fee estimation.

While #28488 suggests measures based on comparing fee filters / mempool sizes, there is also the simpler issue of peers that tell us they don't want transactions, signalled via the fRelay field in the version message. If such a peer runs Bitcoin Core, that usually means it's running in -blocksonly mode while accepting inbound connections. The status quo is that we don't avoid these peers; we just don't send them any transactions.

This PR proposes not to waste any of our 8 full-relay outbound slots on such peers (unless we are in -blocksonly mode ourselves). Using them for block-relay-only outbound connections is fine though, and is not affected by this change.

As suggested by ajtowns below, we don't disconnect during version processing, but later during the regularly scheduled EvictExtraOutboundPeers() task, and only if we are at the maximum number of full-outbound peers.

If reviewers think this proposal is too aggressive, an alternative solution (with a little more complexity) would be to restrict the maximum number of outbound connections to -blocksonly peers to 1 or 2. Since I currently think it's ok to be selective with outbound connections, I didn't implement that right away.
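To illustrate the mechanism, here is a minimal C++ sketch of an eviction-time check along these lines. It is a hypothetical simplification, not the patch itself: GetFullOutboundDelta() and m_relays_txs follow names that appear in snippets quoted later in this thread, and the exact wiring into the scheduled eviction task is assumed.

    // Hypothetical sketch (not the actual patch): during the scheduled peer
    // eviction task, if all full-relay outbound slots are taken, pick the
    // newest full outbound peer that signalled fRelay=false and drop it.
    void PeerManagerImpl::MaybeEvictNonTxRelayPeer()
    {
        // Only act when we care about tx relay: not -blocksonly ourselves,
        // not in IBD, and no free full-outbound slots left.
        if (m_opts.ignore_incoming_txs || m_chainman.IsInitialBlockDownload()) return;
        if (m_connman.GetFullOutboundDelta() < 0) return;

        std::optional<NodeId> node_to_evict;
        m_connman.ForEachNode([&node_to_evict](CNode* node) {
            // Skip peers that aren't full-relay outbound, or that do relay txs.
            if (!node->IsFullOutboundConn() || node->m_relays_txs) return;
            // Prefer evicting the newest such peer (highest NodeId).
            if (!node_to_evict.has_value() || *node_to_evict < node->GetId()) {
                node_to_evict = node->GetId();
            }
        });
        if (node_to_evict) {
            LogPrint(BCLog::NET, "disconnecting full outbound peer not participating in tx relay: peer=%d\n", *node_to_evict);
            m_connman.DisconnectNode(*node_to_evict);
        }
    }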

@DrahtBot (Contributor) commented Sep 26, 2023:

The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

Code Coverage

For detailed information about the code coverage, see the test coverage report.

Reviews

See the guideline for information on the review process.

Type Reviewers
ACK stratospher
Concept ACK glozow, sr-gi
Stale ACK naumenkogs, amitiuttarwar

If your review is incorrectly listed, please react with 👎 to this comment and the bot will ignore it on the next update.

Conflicts

No conflicts as of last run.

@DrahtBot DrahtBot added the P2P label Sep 26, 2023
@mzumsande mzumsande marked this pull request as draft September 26, 2023 19:15
brunoerg added a commit to brunoerg/bitcoin that referenced this pull request Sep 26, 2023
@mzumsande mzumsande marked this pull request as ready for review September 26, 2023 21:28
@naumenkogs (Member) left a comment:

Concept ACK.

I think this measure makes perfect sense: it's simple and directly addresses the mentioned issues.

I would be fine with restricting this (as suggested by the OP message) too.

On src/net_processing.cpp:
// Don't waste any of our full outbound slots on peers that don't want tx relay (because they might be in -blocksonly mode)
if (!fRelay && !m_opts.ignore_incoming_txs && pfrom.IsFullOutboundConn())
{
LogPrint(BCLog::NET, "peer=%d (full-relay outbound) does not participate in transaction relay; disconnecting\n", pfrom.GetId());
Contributor commented:

Rather than doing this immediately upon connection (i.e., after receiving their VERSION message), would it be better to do it only when we have all our outbound slots filled up (i.e., as part of EvictExtraOutboundPeers)?

Also, doing the check repeatedly would mean that we could keep blocks-only peers while we're doing IBD, only dropping them when we exit IBD and start caring about tx relay.

@mzumsande (Contributor Author) replied:

> Rather than doing this immediately upon connection (i.e., after receiving their VERSION message), would it be better to do it only when we have all our outbound slots filled up (i.e., as part of EvictExtraOutboundPeers)?

I like that idea. If for some strange reason we weren't able to find any other peers, we wouldn't end up disconnecting the only ones we have. But I think we would want to start evicting if we are at the limit (so in CheckForStaleTipAndEvictPeers), not just if we have exceeded it (EvictExtraOutboundPeers).

> Also, doing the check repeatedly would mean that we could keep blocks-only peers while we're doing IBD, only dropping them when we exit IBD and start caring about tx relay.

Do we really want that? The main reason to run in -blocksonly mode is to save traffic. I would assume that most such nodes wouldn't enable listening anyway, but for those that do (maybe not even on purpose), wouldn't it be nicer not to use them for IBD?

Member replied:

> start evicting if we are at the limit (so in CheckForStaleTipAndEvictPeers), not just if we have exceeded it (EvictExtraOutboundPeers).

I agree.

> Do we really want that? The main reason to run in -blocksonly mode is to save traffic. I would assume that most such nodes wouldn't enable listening anyway, but for those that do (maybe not even on purpose), wouldn't it be nicer not to use them for IBD?

Not sure about this. You seem to care specifically about those users who run -blocksonly and forget to disable inbounds. Perhaps this is better communicated through documentation and logging. I think we should leave them a chance to strengthen the block-relay network.

@amitiuttarwar (Contributor) replied:

+1 to having the logic in eviction instead of connection

RE IBD:
it seems odd / logically inconsistent to run a node in -blocksonly and enable inbounds, so I don't think it's worth overly optimizing for the use case. If we want to improve the user experience, we could have a soft param interaction between the two (independent of these changes)

@mzumsande (Contributor Author) replied:

> it seems odd / logically inconsistent to run a node in -blocksonly and enable inbounds, so I don't think it's worth overly optimizing for the use case

Maybe there is a use case for wanting to help the network by providing inbound capacity while still minimizing traffic? Although then it often doesn't make a lot of sense to run a non-pruned node - unless you also need the entire chain for your own purposes (e.g. txindex).
In any case, I have included the suggestion to not evict while in IBD in the latest push!

Contributor replied:

> Do we really want that? The main reason to run in -blocksonly mode is to save traffic.

In that case you'd be better off using -maxuploadtarget, either instead of or in addition to -blocksonly?

> it seems odd / logically inconsistent to run a node in -blocksonly and enable inbounds,

I think it would make sense to expect that if 100% (or 99.9%) of nodes were -blocksonly, they would still manage to propagate blocks sensibly, bring up new nodes efficiently, etc. In theory, with this PR we would also be able to propagate txs successfully in the 99.9% -blocksonly case.

@amitiuttarwar (Contributor) left a comment:

concept ACK


@mzumsande (Contributor Author) commented:
I've now changed the approach per @ajtowns' suggestion: disconnect during eviction.

@mzumsande mzumsande changed the title p2p: don't make full outbound connections to peers that don't support tx relay p2p: attempt to fill full outbound connection slots with peers that support tx relay Oct 4, 2023
@glozow (Member) left a comment:

concept/approach ACK

@@ -673,6 +673,7 @@ def add_p2p_connection(self, p2p_conn, *, wait_for_verack=True, **kwargs):
def add_outbound_p2p_connection(self, p2p_conn, *, wait_for_verack=True, p2p_idx, connection_type="outbound-full-relay", **kwargs):
"""Add an outbound p2p connection from node. Must be an
"outbound-full-relay", "block-relay-only", "addr-fetch" or "feeler" connection.
The txrelay arg determines whether our peer will signal support for txrelay in their version message.
Member commented:

831ed49: seems like this should be in the peer_connect_send_version function docstring?

@mzumsande (Contributor Author) replied:

Removed the doc, since it's hidden in the kwargs anyway (it wasn't when I added it in an earlier revision). Within peer_connect_send_version it's kind of self-explanatory; I had added it here to avoid confusion about which side of the connection txrelay refers to.

@sr-gi (Member) left a comment:

Concept ACK, just some nits here and there

Comment on lines 5258 to 5256
m_connman.ForEachNode([&node_to_evict](CNode* node) {
if (!node->IsFullOutboundConn() || node->m_relays_txs) return;
if (!node_to_evict.has_value() || *node_to_evict < node->GetId()) {
node_to_evict = node->GetId();
}
});
@sr-gi (Member) commented Oct 11, 2023:

What if there is at least one node already flagged for disconnection? Shouldn't we skip this altogether? Otherwise, it could be the case that we end up evicting more nodes than we really need to.

@sr-gi (Member) replied:

Nvm, ForEachNode already checks NodeFullyConnected

@mzumsande (Contributor Author) replied:

Yes, that's why I didn't include the check here (the similar checks a few lines above aren't necessary either)

Member replied:

Assume(fDisconnect) could be useful here too, just in case ForEachNode gets updated.

Contributor replied:

Updating ForEachNode to no longer check NodeFullyConnected seems pretty intrusive, and all callers would need to be carefully checked, so imo it doesn't seem super valuable to repeat the check here. But the cost of a redundant check is also pretty low, so it doesn't seem like a big deal either way.

@mzumsande (Contributor Author) commented Dec 12, 2023:

I also don't see that as a real risk, but I do see the problem that whoever reads this code (or decides to use ForEachNode somewhere else) could easily get confused, especially since some callers check fDisconnect again.
I added a refactor commit renaming ForEachNode() to ForEachFullyConnectedNode() and removing the duplicate checks / comments. What do you think of that approach?

Member replied:

To be clear, I suggested literally having Assume(fDisconnect), not an extra if check (that would actually make the code worse).

Even if you refactor it into ForEachFullyConnectedNode, I think this Assume(fDisconnect) is kinda useful, because who knows what exactly ForEachFullyConnectedNode means :)

But I'm fine either way, whether you rename or add the Assume (both would be ideal, I think :))

Contributor replied:

> I added a refactor commit renaming ForEachNode() to ForEachFullyConnectedNode() and removing the duplicate checks / comments.

I like it, I find it clear

@mzumsande (Contributor Author) commented:

Pushed to address feedback.

I also removed a sub-test with a block-relay-only peer: there isn't really a reason to think that peer would be disconnected anyway, and it didn't make much sense in that form because we weren't at any connection limit.

blocksonly_peer1.peer_disconnect()

self.log.info("Check that we don't evict full-relay outbound connections to blocksonly peers if we are blocksonly ourselves")
self.restart_node(0, ["-blocksonly", "-maxconnections=1"])
Member commented:

Not sure why anyone would use -maxconnections=1 in practice, but if their firewall only lets them find one blocksonly node (and they are not in blocksonly themselves), the behavior might be confusing: they will be frequently disconnected from this peer.

Perhaps this could be logged if maxconnections=1.

@mzumsande (Contributor Author) replied:

I agree that it makes no sense: if you want just one connection, you should pick a node you trust and use -connect. Using automatic connections for that means eclipsing yourself - maybe we should warn about that in general?
That being said, this PR does log disconnections of -blocksonly peers, although only in BCLog::NET - do you suggest making it unconditional based on -maxconnections?

Member replied:

The idea was to log if maxconnections is set to 1, so at startup (when the argument is parsed).
I don't insist though.

Contributor replied:

imo users should get an error if they set -maxconnections to less than 8, or at least a "this is unsafe. are you SURE?!". but that's probably getting off topic here :)

@mzumsande (Contributor Author) replied:

I'd agree with adding this warning, but yeah, this PR is not the best place for it.
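As an aside, such a startup warning might look roughly like the following sketch. It is purely hypothetical and not part of this PR; it assumes Bitcoin Core's existing InitWarning() helper and ArgsManager, and the include paths are approximate.

    #include <common/args.h>        // ArgsManager (path approximate)
    #include <node/interface_ui.h>  // InitWarning (path approximate)
    #include <util/translation.h>   // _()

    // Hypothetical init-time check: warn when -maxconnections is below the
    // number of automatic outbound connections the node tries to maintain.
    void WarnOnLowMaxConnections(const ArgsManager& args)
    {
        const int max_connections{static_cast<int>(args.GetIntArg("-maxconnections", 125))};
        if (max_connections < 8) {
            InitWarning(strprintf(_("-maxconnections is set to %d, fewer than the %d automatic "
                                    "outbound connections. This weakens connectivity and makes "
                                    "eclipsing easier."),
                                  max_connections, 8));
        }
    }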

@naumenkogs (Member) commented:

ACK 840a022

Not sure if you noticed this idea, might be a worthy precaution.

@amitiuttarwar (Contributor) left a comment:

code looks good to me. left a comment about the test that seems worth fixing, then I'm ready to ACK the patch.

trying to think about any potential negative network effects that could occur, but so far it seems fine to disconnect OB full relay peers that signal they don't want to receive transactions.


@mzumsande (Contributor Author) commented Dec 12, 2023:

840a022 to d636e38:
Addressed feedback, and also added a new commit to rename ForEachNode and remove duplicate checks, see #28538 (comment).

ToDo: address #28538 (comment)

@amitiuttarwar (Contributor) left a comment:

ACK d636e38


@mzumsande (Contributor Author) commented Dec 14, 2023:

d636e38 to 2a17ac3:
Renamed GetExtraFullOutboundCount to GetFullOutboundDelta, since it can now return negative values; plus a small doc change in the test.
2a17ac3 to ad4866:
Rebased due to conflict.
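For reference, a plausible reconstruction of the renamed helper based on the description above (a sketch, not the actual diff; member names follow CConnman conventions):

    // Sketch of CConnman::GetFullOutboundDelta() as described above: the number
    // of full-relay outbound peers minus the allowed maximum, so the result is
    // now negative while free full-outbound slots remain.
    int CConnman::GetFullOutboundDelta()
    {
        int full_outbound_peers{0};
        {
            LOCK(m_nodes_mutex);
            for (const CNode* node : m_nodes) {
                if (node->fSuccessfullyConnected && !node->fDisconnect && node->IsFullOutboundConn()) {
                    ++full_outbound_peers;
                }
            }
        }
        return full_outbound_peers - m_max_outbound_full_relay;
    }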

…bound maximum

...unless we are in -blocksonly mode ourselves or in IBD.
If we had too many outbound peers not participating in tx relay, we would run
into problems getting our own transactions relayed. This would also be
bad for privacy and fee estimation.

We evict non-txrelay peers only when all full outbound slots
are filled, so that we still have peers in hypothetical situations where
-blocksonly peers are the only ones available.

Idea for this approach by ajtowns.
@amitiuttarwar (Contributor) commented:

ACK ad48667

// evict the newest of them (highest Node Id) so we can fill the slot with a tx-relaying outbound peer
// This is not done if we are in -blocksonly mode ourselves or still in IBD,
// because then we don't care if our peer supports tx relay.
else if (full_outbound_delta == 0 && !m_opts.ignore_incoming_txs && !m_chainman.IsInitialBlockDownload()) {
Member commented:

1393bf6

It's not super clear why we won't do this if full_outbound_delta > 0. Maybe a comment explaining why that could happen in the first place would help (probably above the first if block). Even now it's hard to tell whether it's possible to be over the limit most of the time, let alone if we add some new complexity that makes being over the limit more frequent (for some security reason or whatever).

@mzumsande (Contributor Author) replied:

Hmm, I think the main reason was to avoid interaction with the existing extra-full-outbound eviction. Which raises the question of whether there should be any interaction:
Should nodes that are protected from that (either because m_protect is true or because they are the only ones for their network) also be exempt from this new disconnection for not supporting tx relay?
I'm not really sure about that: it would be unfortunate if a non-tx-relaying outbound peer were the only one with the best chain, but how realistic is such a scenario?

@sr-gi (Member) commented Jan 5, 2024:

> It's not super clear why we won't do this if full_outbound_delta > 0.

> Which raises the question of whether there should be any interaction.

I think this would create a weird interaction. The current approach of this PR is trying to make all 8 full-outbound connections relay transactions, given we already have blocks-only connections, so we should not waste full-outbound slots on that purpose. However, our full-outbound connection protection criteria are based on whether peers have provided valid blocks with enough accumulated work, which has nothing to do with how good they are at providing transactions (there is also the recently added network-diversity criterion, more on that later).

This raises the question of what we value most: having 8 full-outbound peers relaying transactions, or our protected peers (and under what criteria). If the former, we could merge this under full_outbound_delta > 0 but pick a node without taking the protection criteria into account (which raises the question of how useful the protection is to start with). If the latter, we exclude the protected peers first and then run the eviction over the remaining peers. This slightly changes the goal of the PR, given we may not achieve 8 transaction-relaying outbound peers. Finally, there are the additional peers protected for network diversity. Depending on how these are treated, we may end up with an even smaller number of full-outbounds relaying transactions (8 - 4 - n, where n is the number of networks we support).

@mzumsande (Contributor Author) commented Jan 5, 2024:

Yes, I agree - I think there are colliding goals. My thinking is as follows:

  1. The m_protect protection attempts to prevent pathological situations where we wouldn't have enough peers with the honest best chain (that shouldn't happen normally even without the protection, but it would be very serious if it did).
  2. The network-specific protection aims to keep us connected to all networks we support, to help interconnection and also to reduce the risk of eclipse attacks. Ideally, that means relaying both blocks and transactions from one network to another.
  3. The added code attempts to make sure we have as many full-outbound peers actually participating in tx relay as possible, mostly for selfish reasons (e.g. privacy, fee estimation).

These goals conflict. I'm pretty confident that 3) should take precedence over 2), because the network-specific protection isn't really about a particular peer: we just want any connection to that network, and would be happy to exchange a current peer that doesn't support tx relay for a tx-relaying peer from the same network.

I'm kind of torn between 1) and 3). It would be easy to exempt non-tx-relaying m_protect peers from disconnection, but that would mean we could end up with up to 4 of these peers permanently.

Contributor replied:

> It would be easy to exempt non-tx-relaying m_protect peers from disconnection, but that would mean we could end up with up to 4 of these peers permanently.

The commit message where m_protect was introduced says "we pick 4 of our outbound peers and do not subject them to this logic, to be more conservative. We don't wish to permit temporary network issues (or an attacker) to excessively disrupt network topology."

Wondering how someone can disrupt network topology with this, and whether it's still relevant.

blocksonly_peer.sync_with_ping()
blocksonly_peer.peer_disconnect()

self.log.info("Check that we evict full-relay outbound connections to blocksonly peers at the full outbound limit")
Member commented:

68769ff

You could expand this test by adding some irrelevant peers (e.g. ADDRFETCH), but no big deal.

@mzumsande (Contributor Author) replied:

I tried it, but it turned out to be annoying: in the test we simulate being at the full outbound limit by artificially running with a lower -maxconnections. If we then also add an unrelated peer like ADDRFETCH, the test fails because we don't have enough slots in semOutbound.

-    // marked for disconnection
-    if (!pnode->IsFullOutboundConn() || pnode->fDisconnect) return;
+    // Only consider outbound-full-relay peers
+    if (!pnode->IsFullOutboundConn()) return;
     CNodeState *state = State(pnode->GetId());
     if (state == nullptr) return; // shouldn't be possible, but just in case
Member commented:

ad48667

You might update this too to if (!Assume(state)) return; and drop the comment, if you feel like it :)

@mzumsande (Contributor Author) replied:

I think I'd prefer not to do it here; for my taste it's not related enough to the core of this PR.

Rename ForEachNode to ForEachFullyConnectedNode because it
has an inbuilt check that only executes the passed function
for fully connected nodes (i.e. if fSuccessfullyConnected is set
and fDisconnect isn't).

This means callers don't need to worry about that, so remove
those duplicate checks and related comments.
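Based on that commit message, the renamed helper plausibly has this shape (a sketch only; the real ForEachNode in net.cpp performs the same check via a NodeFullyConnected helper):

    // Sketch of CConnman::ForEachFullyConnectedNode() reconstructed from the
    // commit message above: only run func on nodes that completed the version
    // handshake (fSuccessfullyConnected) and aren't marked for disconnection.
    void CConnman::ForEachFullyConnectedNode(const std::function<void(CNode*)>& func)
    {
        LOCK(m_nodes_mutex);
        for (CNode* node : m_nodes) {
            if (node->fSuccessfullyConnected && !node->fDisconnect) {
                func(node);
            }
        }
    }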
@mzumsande (Contributor Author) commented:

Updated to address feedback. I would love to hear more opinions on the protection issue discussed in the thread following #28538 (comment).

@stratospher (Contributor) left a comment:

ACK 8f07458. Useful to fill outbound slots with peers we can relay transactions with, and I liked the approach in this PR of swapping away blocks-only peers in full-relay outbound slots only after the maximum number of full-relay outbound connections is reached.

# Have one blocksonly peer that is not being disconnected
blocksonly_peer1 = self.nodes[0].add_outbound_p2p_connection(P2PInterface(), p2p_idx=1, txrelay=False)
blocksonly_peer2 = self.nodes[0].add_outbound_p2p_connection(P2PInterface(), p2p_idx=2, txrelay=False)
self.nodes[0].mockscheduler(46)
Contributor commented:

68769ff: nit if you retouch.

Suggested change:
-        self.nodes[0].mockscheduler(46)
+        with self.nodes[0].assert_debug_log(expected_msgs=["disconnecting full outbound peer not participating in tx relay: peer=1"]):
+            self.nodes[0].mockscheduler(46)
+        blocksonly_peer2.wait_for_disconnect()


@mzumsande (Contributor Author) commented:

Thanks for the review!

We had an offline conversation about this today (FYI @sdaftuar @sipa @sr-gi). Some takeaways:

  • It would probably be better to first have a replacement peer ready before disconnecting a blocksonly peer. That would mean a similar approach as for the extra network-specific peers; it would also mean adding a way to make extra connections when we have a -blocksonly peer we want to get rid of.
  • We should avoid situations where there could be perpetual churn. For example, imagine the handful of peers on the cjdns network suddenly all went blocksonly. Then our network-specific logic would try to make new connections to them, while this logic would disconnect them, resulting in a lot of churn.
  • To avoid interactions with protected peers, it might be best to not protect fRelay=false peers in the first place via m_protect (a sketch of this idea follows below).

I think I'll look into reworking the PR and will put it in draft until that has happened.
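A rough sketch of that last bullet, modelled on the existing protection logic in net_processing (hypothetical and not implemented in this PR; the only new condition is the m_relays_txs check):

    // Hypothetical: where net_processing grants chain-sync protection to an
    // outbound peer whose chain has enough work, additionally require that the
    // peer relays transactions, so fRelay=false peers are never pinned.
    if (nodestate->pindexBestKnownBlock != nullptr &&
        nodestate->pindexBestKnownBlock->nChainWork >= m_chainman.ActiveChain().Tip()->nChainWork &&
        !nodestate->m_chain_sync.m_protect &&
        pfrom.m_relays_txs && // new condition: don't protect fRelay=false peers
        m_outbound_peers_with_protect_from_disconnect < MAX_OUTBOUND_PEERS_TO_PROTECT_FROM_DISCONNECT) {
        nodestate->m_chain_sync.m_protect = true;
        ++m_outbound_peers_with_protect_from_disconnect;
    }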

@mzumsande mzumsande marked this pull request as draft January 9, 2024 17:25
@DrahtBot (Contributor) commented:

🤔 There hasn't been much activity lately and the CI seems to be failing.

If no one reviewed the current pull request by commit hash, a rebase can be considered. While the CI failure may be a false positive, the CI hasn't been running for some time, so there may be a real issue hiding as well. A rebase triggers the latest CI and makes sure that no silent merge conflicts have snuck in.
