Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Do not add random inbound peers to addrman. #8594

Merged
merged 1 commit into from Sep 7, 2016
Merged

Do not add random inbound peers to addrman. #8594

merged 1 commit into from Sep 7, 2016

Conversation

gmaxwell
Copy link
Contributor

We should learn about new peers via address messages.

An inbound peer connecting to us tells us nothing about
its ability to accept incoming connections from us, so
we shouldn't assume that we can connect to it based on
this.

The vast majority of nodes on the network do not accept
incoming connections, adding them will only slow down
the process of making a successful connection in the
future.

Nodes which have configured themselves to not announce would prefer we
not violate their privacy by announcing them in GETADDR responses.

We should learn about new peers via address messages.

An inbound peer connecting to us tells us nothing about
 its ability to accept incoming connections from us, so
 we shouldn't assume that we can connect to it based on
 this.

The vast majority of nodes on the network do not accept
 incoming connections, adding them will only slow down
 the process of making a successful connection in the
 future.

Nodes which have configured themselves to not announce would prefer we
 not violate their privacy by announcing them in GETADDR responses.
@dcousens
Copy link
Contributor

dcousens commented Aug 25, 2016

@gmaxwell how does this impact on the connectivity of those [non-incoming] nodes? (that is, does it hinder them at all?)

@pstratem
Copy link
Contributor

utACK 53f8f22

travis is borked

@rebroad
Copy link
Contributor

rebroad commented Aug 26, 2016

utACK

@sipa
Copy link
Member

sipa commented Aug 26, 2016 via email

@laanwj
Copy link
Member

laanwj commented Aug 26, 2016

utACK, this has bothered me before

@sipa
Copy link
Member

sipa commented Aug 26, 2016

utACK

@sipa
Copy link
Member

sipa commented Aug 26, 2016

Can we backport this to 0.13?

@EthanHeilman
Copy link
Contributor

EthanHeilman commented Aug 26, 2016

@sipa @gmaxwell Reading main.cpp:5033, my understanding is that incoming connections are only added to the tried table if the source IP address of the TCP connection is the same as the IP address supplied in the version message. The assumption here is that if the peer's IP in the version message is the same as the IP in TCP connect, the peer can accept incoming connections. This assumption can be violated under two conditions:

  1. Nothing prevents a malicious peer behind a NAT from altering the IP address in the version message to make it look like it is not behind a NAT. While this is less than ideal an attacker can only do this once per IP address they control. This is certainly better than the protections provided by address messages where a malicious peer controlling a single IP address can inject up to ~2048 IP addresses into the new table via address messages (this is an approximate back of the envelope calculation).
  2. A peer might not accept incoming connections but still may not be behind a NAT. It is unclear how common this is, but my expectation from reading bitcoind log files over the years would be that it would be that it is uncommon. We can answer this question with data.

From a security standpoint IP addresses from address messages are more vulnerable to manipulation than IP addresses from incoming connections.

The vast majority of nodes on the network do not accept incoming connections, adding them will only slow down the process of making a successful connection in the future.

I am operating under the assumption that most nodes that do not accept incoming connections do so because they are behind NATs and therefore would not be added to the tried table.

For this and other reasons I would be against this change. If we want to improve the quality of the tried table perhaps confirming that an incoming connection can be connected to via a feeler connection might be a solution. Test-before-evict which I'm currently working on (proof of concept can be found at #6355) does provide some functionality like this and my recently accepted pull request #8282 did, in tests, boost the reachable nodes in the tried table from 55 IPs to 590 IPs. I suspect the real problem with reachable IPs in the tried table is nodes switching IP addresses.

@laanwj
Copy link
Member

laanwj commented Aug 26, 2016

Reading main.cpp:5033, my understanding is that incoming connections are only added to the tried table if the source IP address of the TCP connection is the same as the IP address supplied in the version message.

Do we leak internal LAN IPs here, or is that prevented some way? Nodes should probably not advertise their address at all on outgoing connections, certainly not if they don't want / can't be connected to.

@EthanHeilman
Copy link
Contributor

EthanHeilman commented Aug 26, 2016

Do we leak internal LAN IPs here.

When sending a version message you include your local IP address (see net.cpp:503) but the logic which determines this is fairly complicated (see net.cpp:13).

I've checked a bunch of older bitcoind logs and it appears that when the host is behind a NAT it sets the IP address in the version message to either 0.0.0.0 or 127.0.0.1. Not sure if that is always true. Can someone who is behind a NAT confirm this behavior in more recent code?

Nodes should probably not advertise their address at all on outgoing connections, certainly not if they don't want / can't be connected to.

If a node sets its IP address in the version message to 0.0.0.0 or 127.0.0.1 when making an outgoing connection they will not be connected to.

@sipa
Copy link
Member

sipa commented Aug 26, 2016

@EthanHeilman It seems dangerous to me to let an attacker directly enter their IP addresses into our tried table. By murmuring addresses they're certainly able to get into the known table, but the tried table should be reserved for IPs which we selected ourselves. Feel free to contradict my intuition here if you have numbers that show otherwise, though.

@EthanHeilman
Copy link
Contributor

@sipa Its an interesting question and I'm not sure what the right answer is.

It seems dangerous to me to let an attacker basically enter their IP addresses into our tried table [..] the tried table should be reserved for IPs which we selected ourselves.

If tried addresses are only added when we select them I agree it would make it harder for an attacker to control tried but it would also have two downsides:

  1. An attacker who eclipsed a node would not need to worry about incoming connections telling the eclipsed node about non-attacker IP addresses except via address messages. An attacker could flush the new table regularly. I'm not sure how this composes with the new connection exhaustion countermeasures.
  2. The tried table will be smaller. It is hard to judge exactly how small since feeler connections will increase the number of short lived incoming connections boosting the tried table, but feeler connections will also increase the number of selected outgoing connections so adding incoming connections to the tried table might not be as valuable.

Determining how the pros balance against the cons would require more research and experiments which I plan on doing but not in the next few months. That said, doing this for security reasons seems more justifiable than doing it to decrease the time it takes to make an outgoing connection.

@gmaxwell What do you think about putting incoming nodes into the new table?

            if (((CNetAddr)pfrom->addr) == (CNetAddr)addrFrom)      
             {      
                 addrman.Add(addrFrom, addrFrom);       
             }

@sipa
Copy link
Member

sipa commented Aug 26, 2016

I would be fine with adding them to the new table... that is identical to what would happen (and likely happens anyway) when they send us an addr message after connecting.

@gmaxwell
Copy link
Contributor Author

@EthanHeilman hosts already send an addr message on connect (see the above stanza), which will add it to new. So I think also doing it there doesn't add anything new except making it more likely that they'll advertise themselves accidentally.

The fact that it was ever added to tried at all was a mistake that violated the design of the system, tried table was supposed to be protected against self-selection by actual trying :) -- I think there could be some legitimate concern that actual useful hosts will be promoted to tried somewhat slower, but I believe that is addressed by the feeler connections.

I would not be surprised if, after this change is widespread, if tried has more reachable hosts in it-- because nodes will pollute their tried table with hosts that aren't reachable less often.

I've checked a bunch of older bitcoind logs and it appears that when the host is behind a NAT it sets the IP address in the version message to either 0.0.0.0 or 127.0.0.1. Not sure if that is always true.

No, it sets it to its current discovered address. If you are behind NAT and don't have discovery (e.g. via UPNP or configuration) then you won't have a discovered address, and so you'll get 0.0.0.0. Because of security concerns with the upnp library we were using we've currently got it off by default, but I hope to have discovery come back in the future.

I am operating under the assumption that most nodes that do not accept incoming connections do so because they are behind NATs and therefore would not be added to the tried table

Eliminating inbound connections is a good way to mitigate a number of DOS vulnerabilities. A standard recommended commercial configuration is to have production nodes which you care about connect only to other somewhat trusted nodes.

From a security standpoint IP addresses from address messages are more vulnerable to manipulation than IP addresses from incoming connections.

If we want to privilege address announcements that also happen to come from the host telling us about them (e.g. using a lower last seen time for them), I think that would be completely reasonable. But there is no need to infer an addr announcement from the version message.

In the future we'll likely want to carry additional metadata with announcements, which might not be possible to just infer from version handshakes in any case. (Already we can't get port numbers that way).

@luke-jr
Copy link
Member

luke-jr commented Aug 27, 2016

I've checked a bunch of older bitcoind logs and it appears that when the host is behind a NAT it sets the IP address in the version message to either 0.0.0.0 or 127.0.0.1. Not sure if that is always true. Can someone who is behind a NAT confirm this behavior in more recent code?

Wireshark tells me I am sending my local-scope IPv6 address, when connecting over IPv4 (behind NAT)... :/

@sipa
Copy link
Member

sipa commented Aug 27, 2016

@luke-jr Can you open a separate issue for that? We should never send local addresses out.

@gmaxwell
Copy link
Contributor Author

gmaxwell commented Sep 1, 2016

Where are we on this?

@sipa
Copy link
Member

sipa commented Sep 1, 2016

@EthanHeilman Since we are announcing our own address as soon as we're past IBD anyway, and there is nothing to prevent other nodes from doing the same, what value does this suggested call to addrMan.Add(addr, addr) have?

@laanwj
Copy link
Member

laanwj commented Sep 1, 2016

Where are we on this?

I still think we should continue with this change. I think it would be good eventually to deprecate "addrFrom" in version messages completely: to just send a dummy value and ignore what is sent there. It's a nest of potential privacy and spoofing issues (akin to the X-Forwarded-For header the for HTTP proxies). Also from a protocol point of view advertisement should be separate from version.

That said, @EthanHeilman does have a point that the if (((CNetAddr)pfrom->addr) == (CNetAddr)addrFrom) check at least makes sure that the advertiser can use the IP address advertised, so is a little bit more trustworthy than just a random addr.

On the other hand this same heuristic (if desired) could be used for an address advertised in addr messages. There's no need to couple this to version messages.

@gmaxwell
Copy link
Contributor Author

gmaxwell commented Sep 3, 2016

@laanwj Yes, I agree, I think we should do something when we receive address advertisements to up-weight self ones-- probably by giving the source a time penalty of zero for it's own address in CAddrMan::Add. But I think that makes sense as a separate PR. (I'll go do that now).

@gmaxwell
Copy link
Contributor Author

gmaxwell commented Sep 6, 2016

@laanwj PR #8661 now applies that heuristic for self announcements in addr messages.

@EthanHeilman
Copy link
Contributor

@gmaxwell After thinking about this PR more and performing some back of the envelope calculations I am no longer opposed to this change. To understand the trade-offs I performed a short analysis.

Eclipse attackers have two resources:

  1. time the attack requires,
  2. and attacker controlled IPs in different IP groups (\16 prefixes).

This PR helps attackers by reducing the number of IP addresses an attacker needs to eclipse a node by shrinking the tried table (by how much I don't know) but, as shown below, increases by months the time an attacker needs to spend murmur the attacker addresses into tried.

Disclaimer: I didn't model a more effective hybrid strategy where an attacker strategically places attacker controlled IP addresses in both new and tried as I don't know what the optimal hybrid new and tried strategy should be. The success probability shown below measures only attacker IPs in tried.

A Murmur Model:

I built a very simple model of the tried and new table to determine the length of time it would take feelers to move via a feeler connection running at a 2 minute interval the attacker controlled IP addresses from the new table into the tried table.

  • The honest IPs are the number of honest IPs in the tried table at the start of an attack.
  • The attacker IPs are the attacker control IP addresses in different "groups" (\16 prefixes). Some of these will never make it into the new or tried table due to collisions.
  • The success probability is the probability that the node makes all 8 outgoing connections to the attacker IP addresses.
  • For simplicity sake we only measure addresses selected from tried. As noted in the disclaimer this is a not a fair assumption in an actual attack but allows us to look at murmured addresses without having to worry about the new table.

murmerattack

@sipa
Copy link
Member

sipa commented Sep 7, 2016

@EthanHeilman Thanks a lot for the analysis.

@sipa sipa merged commit eb3596f into bitcoin:master Sep 7, 2016
sipa added a commit that referenced this pull request Sep 7, 2016
eb3596f Do not add random inbound peers to addrman. (Gregory Maxwell)
@maflcko maflcko added this to the 0.13.1 milestone Sep 7, 2016
@sipa sipa mentioned this pull request Sep 7, 2016
laanwj added a commit to laanwj/bitcoin that referenced this pull request Sep 15, 2016
After bitcoin#8594 the addrFrom sent by a node is not used anymore at all,
so don't bother sending it.

Also mitigates the privacy issue in (bitcoin#8616). It doesn't completely solve
the issue as GetLocalAddress is also called in AdvertiseLocal, but at
least when advertising addresses it stands out less as *our* address.
luke-jr pushed a commit to luke-jr/bitcoin that referenced this pull request Sep 21, 2016
After bitcoin#8594 the addrFrom sent by a node is not used anymore at all,
so don't bother sending it.

Also mitigates the privacy issue in (bitcoin#8616). It doesn't completely solve
the issue as GetLocalAddress is also called in AdvertiseLocal, but at
least when advertising addresses it stands out less as *our* address.

Github-Pull: bitcoin#8740
Rebased-From: d9c99c3
@sipa sipa mentioned this pull request Jan 10, 2017
18 tasks
OlegGirko pushed a commit to OlegGirko/dash that referenced this pull request Aug 31, 2017
After bitcoin#8594 the addrFrom sent by a node is not used anymore at all,
so don't bother sending it.

Also mitigates the privacy issue in (bitcoin#8616). It doesn't completely solve
the issue as GetLocalAddress is also called in AdvertiseLocal, but at
least when advertising addresses it stands out less as *our* address.
UdjinM6 pushed a commit to dashpay/dash that referenced this pull request Sep 1, 2017
After bitcoin#8594 the addrFrom sent by a node is not used anymore at all,
so don't bother sending it.

Also mitigates the privacy issue in (bitcoin#8616). It doesn't completely solve
the issue as GetLocalAddress is also called in AdvertiseLocal, but at
least when advertising addresses it stands out less as *our* address.
pyritepirate referenced this pull request in pyritepirate/pyrite Jan 14, 2019
random-zebra added a commit to PIVX-Project/PIVX that referenced this pull request Jul 1, 2020
6f41b3e [QA] Missing mempool sync in pos_coldStaking and zc_publicspends tests (random-zebra)
efaf727 net: correctly initialize nMinPingUsecTime (Wladimir J. van der Laan)
61c8ffe Do not add random inbound peers to addrman. (Gregory Maxwell)
e6a1726 Added feeler connections increasing good addrs in the tried table. (Ethan Heilman)
070b6fb Actually only use filterInventoryKnown with MSG_TX inventory messages. (Gregory Maxwell)
9ac6b28 Only use filterInventoryKnown with MSG_TX inventory messages. (Patick Strateman)
01273db Rename setInventoryKnown filterInventoryKnown (Patick Strateman)
4f11eb2 Remove mruset as it is no longer used. (Gregory Maxwell)
2e3b05c Replace setInventoryKnown with a rolling bloom filter. (Gregory Maxwell)
409aa83 Replace trickle nodes with per-node/message Poisson delays (Pieter Wuille)
93e8c46 Move recentRejects initialization to top of InitBlockIndex (Wladimir J. van der Laan)
9a59420 Keep track of recently rejected transactions (Peter Todd)
34ee777 Only use randomly created nonces in CRollingBloomFilter. (Pieter Wuille)
338d346 Make CRollingBloomFilter set nTweak for you (Peter Todd)
dcd15bc Reuse vector hashing code for uint256 (Pieter Wuille)
3230143 Add uint256 support to CRollingBloomFilter (Peter Todd)
128d644 Better mruset unit test (Pieter Wuille)
89740ed Use ring buffer of set iterators instead of deque of copies in mruset (Pieter Wuille)
14c88ee Replace mruset setAddrKnown with CRollingBloomFilter addrKnown (Gavin Andresen)
e0bebbd Rolling bloom filter class (Gavin Andresen)
7c03bd5 Add correct bool combiner for net signals (Pieter Wuille)
5cb5fd6 Stop exporting ConnectNode (Fuzzbawls)
819295d Stop using ConnectNode in layer 2 code (Fuzzbawls)
851b6b4 net: No need to export DumpBanlist (Cory Fields)
4486d4e net: make Ban/Unban/ClearBan functionality consistent (Cory Fields)
a5e278d net: Drop CNodeRef for AttemptToEvictConnection (Cory Fields)
9fd357d net: use the exposed GetNodeSignals() rather than g_signals directly (Cory Fields)
7962bcc net: remove unused set (Cory Fields)
fabf358 Use network group instead of CNetAddr in final pass to select node to disconnect (Patrick Strateman)
7f030fe Fix comment (Patrick Strateman)
18af800 Acquire cs_vNodes before changing refrence counts (Patrick Strateman)
7aa827f CNodeRef copy constructor and assignment operator (Patrick Strateman)
b3f95e7 Return false early if vEvictionCandidates is empty (Patrick Strateman)
85886c9 Better support for nodes with non-standard nMaxConnections (Patrick Strateman)
9c9e55b RAII wrapper for CNode* (Patrick Strateman)
e92780d Add comments to AttemptToEvictConnection (Patrick Strateman)
0ca7ce3 Prefer to disconnect peers in favor of whitelisted peers (Patrick Strateman)
a1c4aaf AttemptToEvictConnection (Patrick Strateman)
aa7ce9b Record nMinPingUsecTime (Patrick Strateman)
fd7bab0 Refactor: Move failure conditions to the top of AcceptConnection (Patrick Strateman)
fcb732b Refactor: Bail early in AcceptConnection (Patrick Strateman)
411766d Refactor: AcceptConnection (Patrick Strateman)

Pull request description:

  This is a culmination of several upstream PRs touching the P2P/Networking code, and resulting in a state just prior to the P2P/Network encapsulation, which will be it's own PR.

  Backported upstream PRs included here:
  - bitcoin#5859
  - bitcoin#6064
  - bitcoin#6374
  - bitcoin#6498
  - bitcoin#6636
  - bitcoin#7133
  - bitcoin#7125
  - bitcoin#7906
  - bitcoin#8282
  - bitcoin#8594

ACKs for top commit:
  random-zebra:
    Great job. ACK 6f41b3e and merging...
  furszy:
    Have been running it the past days and all is looking good, ACK 6f41b3e .

Tree-SHA512: 1cc4b1271516f000a06141b5b069f3ee00f3eb77056e40d2c021c484f749d9d8db2b76ce490f63572372705b646fad342666f6f90ca5fc69abcacf7b207d058f
Fuzzbawls pushed a commit to Fuzzbawls/PIVX that referenced this pull request Aug 19, 2020
After bitcoin#8594 the addrFrom sent by a node is not used anymore at all,
so don't bother sending it.

Also mitigates the privacy issue in (bitcoin#8616). It doesn't completely solve
the issue as GetLocalAddress is also called in AdvertiseLocal, but at
least when advertising addresses it stands out less as *our* address.
Fuzzbawls pushed a commit to Fuzzbawls/PIVX that referenced this pull request Aug 25, 2020
After bitcoin#8594 the addrFrom sent by a node is not used anymore at all,
so don't bother sending it.

Also mitigates the privacy issue in (bitcoin#8616). It doesn't completely solve
the issue as GetLocalAddress is also called in AdvertiseLocal, but at
least when advertising addresses it stands out less as *our* address.
@bitcoin bitcoin locked as resolved and limited conversation to collaborators Sep 8, 2021
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

10 participants