ASN-based bucketing of the network nodes #16599

Open
naumenkogs opened this issue Aug 13, 2019 · 20 comments
Labels
P2P

Comments

@naumenkogs
Contributor

@naumenkogs naumenkogs commented Aug 13, 2019

Currently we bucket peers (or potential peers) based on /16 network groups which directly correlate to the IP-addresses. This is done to diversify connections every node maintains, for example to avoid connecting to the nodes all belonging to the same region/provider.

Currently peers.dat (serialized version of addrman) does not store ip->bucket mappings explicitly, and all the known ips from peers.dat are re-hashed and re-bucketed at every restart (although it's very cheap).
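A minimal sketch of the scheme described above, assuming IPv4 only. The names `netgroup_v4`, `bucket_for` and the constant `NUM_BUCKETS` are illustrative, not Bitcoin Core identifiers, and the real addrman hashing takes more inputs than this. It only shows why nothing needs to be stored in peers.dat: the bucket is a pure function of the address and the node's salt, so it can be recomputed at every restart.

```python
import hashlib
import ipaddress

NUM_BUCKETS = 1024  # illustrative value, an assumption for this sketch


def netgroup_v4(ip: str) -> bytes:
    """Return the /16 network group of an IPv4 address (its first two octets)."""
    return ipaddress.IPv4Address(ip).packed[:2]


def bucket_for(ip: str, salt: bytes) -> int:
    """Derive a bucket from the address group and a per-node secret salt."""
    digest = hashlib.sha256(salt + netgroup_v4(ip)).digest()
    return int.from_bytes(digest[:8], "little") % NUM_BUCKETS
```

Two addresses in the same /16 always land in the same bucket for a given salt, while a different salt gives a different, unpredictable layout.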

Idea

It was recently suggested by @TheBlueMatt to use ASN-based bucketing instead. If the goal is to diversify connections, this is strictly better: the distribution of IPs among ASNs is not uniform, so netgroup-based bucketing may leave a node with 8 peers from just 2 large ASNs.
If we allow at most one connection per ASN, this would increase the security of the network.

We have @sipa's script to create a compressed representation of mapping (ip->ASN), which is less than 2 megabytes.
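To make the ip->ASN mapping concrete, here is a hedged sketch of a longest-prefix lookup. It is not the compressed format produced by the script mentioned above; it is a plain linear scan over a hypothetical table (`TOY_MAP`) that uses documentation prefixes and documentation ASNs, and the function names are invented for the example.

```python
import ipaddress

# Hypothetical prefix table: documentation prefixes and documentation ASNs.
TOY_MAP = {
    "198.51.100.0/24": 64500,
    "203.0.113.0/24": 64501,
    "203.0.113.128/25": 64502,
}


def build_table(raw):
    """Parse the prefixes and order them so the most specific comes first."""
    table = [(ipaddress.ip_network(prefix), asn) for prefix, asn in raw.items()]
    table.sort(key=lambda entry: entry[0].prefixlen, reverse=True)
    return table


def lookup_asn(table, ip):
    """Return the ASN of the longest matching prefix, or None if unmapped."""
    addr = ipaddress.ip_address(ip)
    for network, asn in table:
        if addr in network:
            return asn
    return None
```

An address covered by both the /24 and the /25 resolves to the /25's ASN, and an address outside every prefix returns `None`.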

However, there are integration-related design questions.

Distribution of the .map file

During the meeting there was a rough consensus (not unanimous though, @jnewbery ) that mapping file should be distributed along with the release, instead of becoming part of the binary.

If you want to question these, feel free to comment below.

Legacy /16 bucketing

There was a suggestion to keep the old method available as well. I think we should.

Loading the map

There may be concerns here, but I have a working understanding for now.

@fanquake fanquake added the P2P label Aug 13, 2019
@laanwj laanwj added this to Chasing Concept ACK in High-priority for review Aug 14, 2019
@Sjors

Member

@Sjors Sjors commented Aug 15, 2019

Concept ACK on ASN-based bucketing, no preference at the moment for how updates should work.

@laanwj laanwj removed this from Chasing Concept ACK in High-priority for review Aug 15, 2019
@practicalswift

Member

@practicalswift practicalswift commented Aug 15, 2019

Concept ACK on ASN-based bucketing in addition to legacy /16 bucketing.

Making sure peers are diverse both across a.) the AS-number axis (ASN-based bucketing) and b.) the prefix axis (legacy /16 bucketing) should maximise overall network robustness.

In addition to this: given that we'll have a prefix-to-ASN map -- has the wild idea of opening a connection to one peer from within the same AS-number as oneself been discussed?

@Sjors

Member

@Sjors Sjors commented Aug 15, 2019

@practicalswift I believe so, see IRC discussion from a few days ago: http://www.erisian.com.au/bitcoin-core-dev/log-2019-08-09.html#l-266

Such a nearby node can be useful for fetching blocks quickly, but at the same time e.g. creates a privacy risk for transaction broadcast. I believe that's why @TheBlueMatt suggested to connect to them in blocksonly mode.

@kristapsk

Contributor

@kristapsk kristapsk commented Aug 15, 2019

Definitely Concept ACK

> mapping file should be distributed along with the release, instead of becoming part of the binary.

There's the GeoLite2 ASN database, which the user could update independently of Bitcoin Core updates (e.g. from a cron script). Actually, I'm not even sure it should be bundled with Bitcoin Core. If the database is unavailable, fall back to the legacy /16 bucketing.
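The fallback described here could look roughly like the following. This is only a sketch under assumptions: `group_key` and its `asn_lookup` parameter are hypothetical names, the lookup is any callable that returns an ASN or `None`, and only IPv4 is handled.

```python
import ipaddress
from typing import Callable, Optional, Tuple


def group_key(
    ip: str, asn_lookup: Optional[Callable[[str], Optional[int]]]
) -> Tuple[str, bytes]:
    """Group by ASN when a map is loaded and knows the address, else by /16."""
    if asn_lookup is not None:
        asn = asn_lookup(ip)
        if asn is not None:
            return ("asn", asn.to_bytes(4, "big"))
    # No map loaded, or the address is not covered by it: legacy /16 group.
    return ("net16", ipaddress.IPv4Address(ip).packed[:2])
```

Tagging the key with its kind keeps an ASN group from colliding with a /16 group that happens to have the same bytes.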

Anybody planning to actually work on this?

@practicalswift

Member

@practicalswift practicalswift commented Aug 16, 2019

@Sjors The "connect-to-own-ASN" idea has another drawback if implemented naïvely without considering the node's total connectivity in terms of ASN distribution:

Consider an organisation disallowing Bitcoin traffic at the edge router level. A newly launched bitcoind would only be able to connect to ASN-local nodes. The node would risk becoming part of an ASN-local "Bitcoin network" which may not be connected to the global network.

Perhaps we should require N connections to prefixes announced by external ASN:s before considering opening connections within our own ASN :-)
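The rule suggested above can be sketched as a simple gate. The function name and parameters are hypothetical, and peers whose ASN is unknown (`None`) are assumed not to count as external.

```python
def may_open_own_asn_connection(peer_asns, own_asn, min_external):
    """True once at least min_external current peers sit outside own_asn."""
    external = sum(
        1 for asn in peer_asns if asn is not None and asn != own_asn
    )
    return external >= min_external
```

A node that can only reach ASN-local peers never passes the gate, so it would not mistake an isolated ASN-local cluster for healthy connectivity.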

@naumenkogs

Contributor Author

@naumenkogs naumenkogs commented Aug 16, 2019

> Anybody planning to actually work on this?

@kristapsk I am sketching an implementation.

@Sjors

Member

@Sjors Sjors commented Aug 16, 2019

@practicalswift alternatively "blocks only" in this case would mean only downloading blocks for which you already have the headers (unlike -blocksonly).

@sipa

Member

@sipa sipa commented Aug 17, 2019

@kristapsk @naumenkogs and I are working on a way to load a compressed IP-to-ASN map into bitcoind and use it for grouping IPs. The compressed scheme currently needs slightly less than 1 MB for a full map of the Internet.

It's an open question I think how this map will come to be. Initially I guess it'll just be optional and available for people to experiment with (using a command line flag and a file). Eventually we may want to bundle a map with Bitcoin Core or even make it part of the binary if this approach turns out to be useful.

I wasn't aware of that publicly available GeoLite2 database; that looks useful to experiment with. So far I've been using a BGP router dump I got from somewhere.

@Sjors

Member

@Sjors Sjors commented Aug 17, 2019

Relevant background reading: https://erebus-attack.comp.nus.edu.sg

One thing this attack leverages is that it can fake ASes "behind" it, from the victim node's perspective. In light of that, for nodes that know their own IP address, would it make sense to divide AS buckets from an individual node's perspective rather than from a global perspective?

For example if our node is in AS 1 and AS 1 has routes to AS 2 and 3, then for each new peer we check if it will route through AS 2 or AS 3 and spread equally over buckets. When AS 2 connects to AS 5 and 6, we again split the AS2 "bucket" between those two. No idea how much number crunching that involves (could wait until after IBD), or if there's even a reasonable algorithm.

@Sjors

Member

@Sjors Sjors commented Aug 28, 2019

@TheBlueMatt wrote in #16702 (comment):

> One thing we can play with after we build an initial table is to look at the paths, instead of looking only at the last ASN in the path. eg if, from many vantage points on the internet, a given IP block always passes from AS 1 to AS 2, we could consider it as a part of AS 1 (given it appears to only have one provider - AS 1). In order to avoid Western bias we'd need to do it across geographic regions and from many vantage points (eg maybe contact a Tier 1 and get their full routing table view, not just the selected routes), but once we get the infrastructure in place, further filtering can be played with.

Would it make sense to traceroute some of the nodes we connect to and re-bucket based on the ASNs of the first couple of hops? Or does such active probing draw too much attention?

@practicalswift

Member

@practicalswift practicalswift commented Aug 28, 2019

@Sjors To do the equivalent of what traceroute does would require setting time-to-live on outgoing packets (bypassing the socket interface).

That would require the end-user to run bitcoind as root (bad), or having bitcoind invoke a third-party SUID root binary such as traceroute which is also bad: the various traceroute:s were clearly not written with security in mind -- see history of heap overflows, etc.

@sdaftuar

Member

@sdaftuar sdaftuar commented Sep 20, 2019

So there are two parts to this proposal:

  • Use ASN information for addrman bucketing.
  • Use ASN information for determining which peers to connect to.

It seems to me that using ASN information for addrman bucketing is likely to be very beneficial, as it prevents being blinded to peers outside of (eg) ASNs that an attacker might want to divert traffic towards. I figure we should be able to make addrman large enough (if it's not already) that the increased collisions for AS's that have more nodes should not be a big problem.

But it's less clear to me if enforcing ASN-diversity on our outbound peers is beneficial or not, as it might drive connections to a relatively small part of the overall network graph. For instance if there is an ASN with only 1 node (A) in it, and let's say there are a couple hundred ASNs in total with any nodes at all (fair assumption?), then rather than having a ~1 in 10000 chance of A being selected by a node making an outbound connection, A's chances will be more like 1 in a couple hundred. This could have unfortunate side effects for other network-graph attacks (and EREBUS might even be easier in some situations as a result, since that attack revolves around using any AS for which the victim would route through the adversary's network to reach). So this seems like a potentially large effect that I think would be worthy of more careful study before deploying.

An alternative approach might be to just aim for node diversity at the addrman level (using ASN information if available, as suggested here), and then use peer rotation or frequent chain-tip-sync with random peers (like #16859) in order to reduce the likelihood of being eclipsed.

EDIT: I realized I missed the IRC conversation on this, which I just read. I did misunderstand the sampling effect of randomly sampling from all of addrman, as we do now with our existing /16 group limit, and presumably we would also do if we were to enforce an ASN connection limit as well. Still, this seems to me like something we should model and study before deployment, as it's not clear at all to me what effect this would have on the network topology. AS-diversity of our immediate peers is not clearly the thing we should be trying to maximize; if for instance dishonest peers can gain a connectivity advantage by locating themselves in small AS groups, that seems potentially problematic.

@practicalswift

Member

@practicalswift practicalswift commented Sep 20, 2019

@sdaftuar It should also be noted that it is possible for a BGP-speaking attacker to export routes on behalf of an arbitrary number of fake downstream ASN:s (making it look like the attacker is providing transit service to the fake downstream ASN:s). Thereby gaining access to an arbitrary number of tickets in the ASN lottery :)

As described more in depth here: #16702 (comment)

@naumenkogs

Contributor Author

@naumenkogs naumenkogs commented Nov 21, 2019

To understand how the proposed asmap solves the problem and affects the topology of the network, I think it’s important to distinguish the physical and logical levels.

Physical level

Legacy /16 bucketing always attempted to not create more than one connection per node to the same /16 subnet, even if a lot of nodes are located there.
It turns out that the correlation between location/owner and /16 is much weaker these days.
Asmap diversifies by ASN, which is a better representation of a piece of infrastructure than a /16 group.
Asmap adjusts bucketing to be robust in more realistic scenarios (large AS gets corrupted, trouble with the particular AS-level infrastructure). I think we can agree that in terms of the physical level, this is an improvement.

Asmap might make it worse if an attacker manages to spin up fake AS, but that can be handled upon asmap file distribution.

Logical level

Any diversification makes certain nodes (placed in rare /16 groups or rare ASes) more likely to be chosen for connection.
This might make topology look less like a random graph. This is generally not a good sign, as it creates weakly connected components, which are easier to attack.
The effect on topology can be represented by the variation of the [(AS/netgroup)->nodes] distribution.
In practice, however, salted hashing + bucketing makes the effect of the uneven distribution less noticeable. (see next message)

@naumenkogs

Contributor Author

@naumenkogs naumenkogs commented Nov 21, 2019

I made this simulating script (operating over the real current list of reachable nodes) to understand two things:

  • how much less of a random graph we are getting because of AS/netgroup bucketing
  • how much benefit can an attacker get from this non-uniformity

Both questions have the same answer: the probability of choosing a node from a rare [AS/netgroup].

In a perfectly uniform network, if we pick 10% of nodes to be placed in the rarest groups, one per group, the probability of other nodes (with different salts) choosing them should be 10%.

As I mentioned before, I believe the result depends on the variation (AS/netgroup)->nodes.
Since for netgroups the variation is low, it does not affect the graph, and the probability, in this case, is around 10%, no matter how many peers every node chooses.

For Asmap it does slightly affect the graph, and the probability is 11.5% if every node chooses 8 peers.
If every node chooses 32 peers, the probability is 15%.
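The kind of measurement described above can be illustrated with a toy model. This is not the simulating script from this comment and it does not use the real node list: group sizes are made up, salted bucketing is ignored, and a node simply samples peers uniformly while rejecting any peer whose group it already uses. `rare_share` is a hypothetical name.

```python
import random


def rare_share(group_sizes, k, trials, rng):
    """Share of outbound picks landing on marked nodes in the rarest groups."""
    # One (group, index) entry per node.
    nodes = [(g, i) for g, size in enumerate(group_sizes) for i in range(size)]
    # Mark 10% of all nodes: one node in each of the smallest groups.
    n_marked = len(nodes) // 10
    smallest = sorted(range(len(group_sizes)), key=lambda g: group_sizes[g])
    marked = {(g, 0) for g in smallest[:n_marked]}

    hits = picks = 0
    for _ in range(trials):
        used_groups = set()
        while len(used_groups) < k:
            g, i = rng.choice(nodes)
            if g in used_groups:
                continue  # at most one peer per group
            used_groups.add(g)
            picks += 1
            hits += (g, i) in marked
    return hits / picks


rng = random.Random(1)
even = rare_share([1] * 1000, k=8, trials=2000, rng=rng)
skewed = rare_share([1] * 100 + [100] * 9, k=8, trials=2000, rng=rng)
print(f"even groups: {even:.3f}, skewed groups: {skewed:.3f}")
```

With evenly sized groups the marked nodes are picked in proportion to their number; once a few large groups hold most nodes, the one-per-group limit pushes extra picks toward the rare groups. The toy numbers are not comparable to the figures quoted in this comment.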

To reduce this, we can split the top N ASes into artificial sub-ASes, which reduces the variation and makes the distribution more even.
For instance, if we randomly split top-25 AS each into 20 sub-AS, the probability becomes 10% no matter how many peers every node chooses.
Alternatively, we can bucket small ASes together, which would also reduce the variance.
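One hedged way to picture the sub-AS split: derive a sub-group index from a hash of the address, so every node computes the same split for a given address. The hashing choice, the function name `sub_as_group` and its parameters are assumptions for illustration; the thread does not say how the random split was done.

```python
import hashlib
import ipaddress


def sub_as_group(ip, asn, top_ases, parts):
    """Return (asn, sub_index); only ASes listed in top_ases are split."""
    if asn not in top_ases:
        return (asn, 0)
    digest = hashlib.sha256(ipaddress.ip_address(ip).packed).digest()
    return (asn, int.from_bytes(digest[:4], "big") % parts)
```

For the example in this comment, `top_ases` would hold the 25 largest ASes and `parts` would be 20.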

Overall, I believe this is a positive result, and asmap should be integrated. But let me know if you would like any other measurements or if you have an opinion on AS-splitting.

@practicalswift

Member

@practicalswift practicalswift commented Nov 21, 2019

@naumenkogs Thanks for the simulation and analysis.

If going the asmap route (which I think is a nice idea) I guess there is the option of using the AS/netgroup bucketing method with probability p and using the legacy /16 bucketing with probability 1-p. Could that make us more robust against adversaries able to game only one of the methods (but not both), and increase the cost of attack for adversaries who have the capability to game both methods? Is it overkill?

Sorry if that has been discussed previously or if I've misunderstood how things are meant to work.

@sipa

Member

@sipa sipa commented Nov 25, 2019

@practicalswift I believe that would be overkill. If we think using an AS map is the wrong approach, we should just improve the map. For example we could choose to merge the smallest AS'es until they're all (say) as large as a (random example) /22.
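A rough sketch of the merging idea, assuming we know how many addresses each AS announces. The greedy strategy, the names `merge_small_ases` and `MIN_ADDRESSES`, and the group labels are all invented for illustration; the /22 threshold is the random example from this comment.

```python
MIN_ADDRESSES = 2 ** (32 - 22)  # address count of a /22, i.e. 1024


def merge_small_ases(as_sizes):
    """Map each ASN to a group label, merging ASes smaller than a /22."""
    groups = {}
    pending, pending_size, next_id = [], 0, 0
    # Smallest first, so small ASes are combined with each other.
    for asn, size in sorted(as_sizes.items(), key=lambda kv: (kv[1], kv[0])):
        if size >= MIN_ADDRESSES:
            groups[asn] = f"AS{asn}"
            continue
        pending.append(asn)
        pending_size += size
        if pending_size >= MIN_ADDRESSES:
            for member in pending:
                groups[member] = f"merged-{next_id}"
            next_id += 1
            pending, pending_size = [], 0
    # Leftovers join the last merged group (or form one if none exists).
    for member in pending:
        groups[member] = f"merged-{max(next_id - 1, 0)}"
    return groups
```

Large ASes keep their own group, while the small ones are pooled until each pool is at least as large as the threshold.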

@jnewbery

Member

@jnewbery jnewbery commented Dec 11, 2019

> During the meeting there was a rough consensus (not unanimous though, @jnewbery ) that mapping file should be distributed along with the release, instead of becoming part of the binary.

Just noticed this. I said in the meeting that it should be distributed with the release:

601 2019-06-20T19:22:58  <jnewbery> yes, i think the distribution should include it

http://www.erisian.com.au/bitcoin-core-dev/log-2019-06-20.html#l-601

@narula

Contributor

@narula narula commented Dec 11, 2019

I have questions about the generation and maintenance of the asmap file (independent of distribution method):

  • How often should this be regenerated?
  • What constitutes a "bad" asmap file?
  • How might an attacker take advantage of a very stale asmap file (let's say people forget to update it for a few months... or years)?
  • What is the mechanism by which one detects a "bad" generated asmap? What should people look for?
@practicalswift

Member

@practicalswift practicalswift commented Dec 11, 2019

Good questions!

> • How might an attacker take advantage of a very stale asmap file (let's say people forget to update it for a few months... or years)?

From a BGP speaking attacker's perspective wouldn't a very stale asmap be harder to take advantage of compared to the other extreme of say a daily updated asmap? My thinking is that the attacker can easily influence the result of say next day's asmap generation process by adjusting what prefixes and AS paths he/she chooses to communicate.

I'm not suggesting we should use a very stale asmap, just making the point that newer is not necessarily less attacker-friendly :)

One mitigation could perhaps be to only consider routes and/or AS paths that have been stable for N months when generating the asmap.

Assumptions I'm making:

  • Active attackers are typically participating during shorter time horizons compared to active non-attackers.
  • Routing table anomalies may go unnoticed over short time horizons, but are less likely to go unnoticed over long time horizons. (Perhaps that can be said generally for all types of observable anomalies? :))
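The stability mitigation mentioned above (only using routes that have been stable for N months) could be sketched as a filter over periodic routing snapshots. The snapshot format (one `prefix -> origin ASN` dict per month, oldest first) and the name `stable_mappings` are assumptions made for this example.

```python
def stable_mappings(snapshots, min_snapshots):
    """Keep prefix->ASN entries unchanged across the last min_snapshots."""
    if len(snapshots) < min_snapshots:
        return {}  # not enough history to call anything stable
    recent = snapshots[-min_snapshots:]
    latest = recent[-1]
    return {
        prefix: asn
        for prefix, asn in latest.items()
        if all(snapshot.get(prefix) == asn for snapshot in recent)
    }
```

A prefix that changed origin or only appeared recently is dropped from the generated map until it has been stable for the whole window.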