Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

p2p: supplying and using asmap to improve IP bucketing in addrman #16702

Merged
merged 4 commits into from Jan 29, 2020

Conversation

@naumenkogs
Copy link
Contributor

naumenkogs commented Aug 23, 2019

This PR attempts to solve the problem explained in #16599.
A particular attack which encouraged us to work on this issue is explained here [Erebus Attack against Bitcoin Peer-to-Peer Network] (by @muoitranduc)

Instead of relying on /16 prefix to diversify the connections every node creates, we would instead rely on the (ip -> ASN) mapping, if this mapping is provided.

A .map file can be created by every user independently based on a router dump, or provided along with the Bitcoin release. Currently we use the python scripts written by @sipa to create a .map file, which is no larger than 2MB (awesome!).

Here I suggest adding a field to peers.dat which would represent a hash of asmap file used while serializing addrman (or 0 for /16 prefix legacy approach).
In this case, every time the file is updated (or grouping method changed), all buckets will be re-computed.
I believe that alternative selective re-bucketing for only updated ranges would require substantial changes.

TODO:

  • more unit tests
  • find a way to test the code without including >1 MB mapping file in the repo.
  • find a way to check that mapping file is not corrupted (checksum?)
  • comments and separate tests for asmap.cpp
  • make python code for .map generation public
  • figure out asmap distribution (?)

Interesting corner case: I’m using std::hash to compute a fingerprint of asmap, and std::hash returns size_t. I guess if a user updates the OS to 64-bit, then the hash of asap will change? Does it even matter?

@fanquake fanquake added the P2P label Aug 23, 2019
src/addrman.h Outdated Show resolved Hide resolved
src/test/addrman_tests.cpp Outdated Show resolved Hide resolved
@DrahtBot

This comment has been minimized.

Copy link
Contributor

DrahtBot commented Aug 23, 2019

The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

Conflicts

Reviewers, this pull request conflicts with the following ones:

  • #17812 (config, test: asmap functional tests and feature refinements by jonatack)
  • #17809 (rpc: Auto-format RPCResult by MarcoFalke)
  • #17804 (doc: Misc RPC help fixes by MarcoFalke)
  • #17428 (p2p: Try to preserve outbound block-relay-only connections during restart by hebasto)
  • #17399 (validation: Templatize ValidationState instead of subclassing by jkczyz)
  • #17383 (Refactor: Move consts to their correct translation units by jnewbery)
  • #16748 ([WIP] Add support for addrv2 (BIP155) by dongcarl)
  • #16673 (Relog configuration args on debug.log rotation by LarryRuane)
  • #16115 (On bitcoind startup, write config args to debug.log by LarryRuane)
  • #10102 ([experimental] Multiprocess bitcoin by ryanofsky)

If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

src/util/asmap.cpp Outdated Show resolved Hide resolved
src/util/asmap.cpp Outdated Show resolved Hide resolved
src/util/asmap.h Outdated Show resolved Hide resolved
src/util/asmap.cpp Outdated Show resolved Hide resolved
src/addrman.h Outdated Show resolved Hide resolved
@sipa

This comment has been minimized.

Copy link
Member

sipa commented Aug 24, 2019

FWIW, the asmap generated from https://dev.maxmind.com/geoip/geoip2/geolite2/ is 988387 bytes in size.

@sipa

This comment has been minimized.

Copy link
Member

sipa commented Aug 24, 2019

Encoding the map as bool[] array in the source code will add to the executable 4 bytes per bit in the map, that's a bit excessive. You should probably encode it as a encode it as a uint8_t[] instead (with 8 bits per array element).

@wiz

This comment has been minimized.

Copy link

wiz commented Aug 24, 2019

@sipa Would you mind providing the python script used to generate the asmap? I think Bitcoiners like @TheBlueMatt and myself who maintain their own BGP full table view of the Internet would like to generate their own asmap instead of trusting maxmind, or at least verify it.

@practicalswift

This comment has been minimized.

Copy link
Member

practicalswift commented Aug 24, 2019

Concept ACK

@sipa

This comment has been minimized.

Copy link
Member

sipa commented Aug 24, 2019

@wiz The script is here: https://gist.github.com/sipa/b90070570597b950f29a6297772a7636 though we need more tooling to convert from common dump formats, sanity check, ... I'll publish some when they're a bit cleaned up and usable.

@wiz

This comment has been minimized.

Copy link

wiz commented Aug 24, 2019

Thanks for the script, it helps a lot with my own implementation and I also wanted to point out a few potential attack vectors to circumvent the asmap protection:

  1. Private ASN ranges

There are several ASN ranges which can be used as originating ASNs without registration or verification and could be used to circumvent the asmap protection. For example I could announce one /24 from 65000, one /24 from 65001, and so on, all behind my actual ASN, in order to launch an Erebus attack. I propose that we map any IP blocks originating from the following invalid ASNs to the next-valid ASN upstream of it:

0                       # Reserved RFC7607                                     
23456                   # AS_TRANS RFC6793                                     
64496-64511             # Reserved for use in docs and code RFC5398            
64512-65534             # Reserved for Private Use RFC6996                     
65535                   # Reserved RFC7300                                     
65536-65551             # Reserved for use in docs and code RFC5398            
65552-131071            # Reserved                                             
4200000000-4294967294   # Reserved for Private Use RFC6996                     
4294967295              # Reserved RFC7300                                     
  1. Multiple originating ASNs

There are actually around 50 routing prefixes on the Internet that legitimately have multiple originating ASNs, so it's not always a strict 1:1 mapping. For example, a valid use case could be an ISP that aggregates multiple customer /29 or /30 prefixes into a single /24 announcement (since /24 is the smallest globally routable prefix size that would be generally accepted) and my router indicates all of the multiple originating ASNs in curly braces like this:

141.145.0.0/19 {AS7160,43898} 
144.21.0.0 {AS7160,AS43894}
158.13.154.0/24 {AS367,AS1479,AS1504,AS1526,AS1541}
160.34.0.0/20 {AS4192,AS7160}
160.34.16.0/20 {AS7160,AS43898} 

Of course this could be used to circumvent the asmap protection as well, so for this case I also propose we instead identify the upstream aggregating ASN as the originating ASN. For my implementation I'm simply going to use a regex to strip out the curly braces aggregation to ignore it.

@practicalswift

This comment has been minimized.

Copy link
Member

practicalswift commented Aug 24, 2019

@wiz Very good points!

  1. Private ASN ranges

Besides using private ASN ranges another obvious attack vector would be to make use of non-reserved but unused or unallocated AS-numbers as a faux downstream for a specific attacker controlled prefix:

Assume that AS54321 is a non-private but unused or unallocated ASN.

An attacker AS666 with upstream transit provider AS123 could fake having AS54321 as a downstream transit customer with the network 200.201.202.0/24.

The global view would hence be:

  • 200.201.202.0/24: AS123 AS666 AS54321

Traffic to 200.201.202.0/24 would end up in the hands of AS666 who get a free extra slot our ASN lottery :-) This could obviously be repeated with N additional prefixes to get N extra slots.

Assuming that AS54321 is not in active use by any organisation no one will complain about this behaviour. (In contrast to "prefix hijacking" where the hijacked network will notice that their traffic is routed in a strange way and complain.)

How can we guard against this attack?

To guard against an attacker making use of non-IANA allocated ASNs as faux specific attacker controlled prefix:

Instead of blacklisting known reserved AS numbers, we could be stricter and apply a whitelisting approach: allow only AS number ranges that have been been allocated to the five Regional Internet Registries (AFRINIC, APNIC, ARIN, LACNIC and RIPE) by IANA. That data is available in machine readable form.

To guard against an attacker making use of IANA allocated but RIR unallocated ASNs as faux downstream for a specific attacker controlled prefix:

I believe all five RIR:s provide machine readable lists of the individual ASNs they have assigned to organisations.

(Note that this data comes with the country code for the organisation that has been assigned the AS-number. That could be handy if we some time in the future would like to implement logic for avoiding country or region based netsplits. If incorporated in the asmap data we could say not only what ASN a peer belongs to but also the region (based on which RIR that handed out the ASN) and also which country the organisation announcing the peers prefix belongs to. This would be a good approximation for the actual country/region of the peer.)

To guard against an attacker making use of IANA allocated and RIR allocated ASNs as faux downstreams for a specific attacker controlled prefix:

This is harder to guard against.

Perhaps we could require that a prefix-to-ASN mapping has been stable over say N months before being included in the asmap. That would make the described attack harder to carry out without being noticed.

Another approach would be to analyse a full routing table when creating the asmap and reason about the ASNs that are in the path to the announcing origin AS. Could be tricky to distinguish between tier-1:s and attackers :-\

@wiz

This comment has been minimized.

Copy link

wiz commented Aug 24, 2019

Yeah, the AS map mitigation may be flawed. How about requiring everyone to also have a few Tor peers, presumably bypassing the network partition attack?

@TheBlueMatt

This comment has been minimized.

Copy link
Contributor

TheBlueMatt commented Aug 24, 2019

Indeed. ASN mappings are not a foolproof solution, but they're better than just using /16s (after all, there are lots of unused /16s you could announce if you wanted to). Ultimately some monitoring and building up filtering lists over time as we observe malicious behavior may improve things, but, indeed, ensuring redundant connectivity is the only ultimate solution. Once #15759 lands, I'd really like to propose a default of 2 additional blocksonly Tor connections if Tor support is enabled (see-also https://twitter.com/TheBlueMatt/status/1160620919775211520, in which someone suggested their ISP was censoring Bitcoin P2P traffic, and only after setting bitcoind to Tor-only did it manage to connect).

@TheBlueMatt

This comment has been minimized.

Copy link
Contributor

TheBlueMatt commented Aug 24, 2019

One thing we can play with after we build an initial table is to look at the paths, instead of looking only at the last ASN in the path. eg if, from many vantage points on the internet, a given IP block always passes from AS 1 to AS 2, we could consider it as a part of AS 1 (given it appears to only have one provider - AS 1). In order to avoid Western bias we'd need to do it across geographic regions and from many vantage points (eg maybe contact a Tier 1 and get their full routing table view, not just the selected routes), but once we get the infrastructure in place, further filtering can be played with.

src/util/asmap.cpp Outdated Show resolved Hide resolved
src/util/asmap.cpp Outdated Show resolved Hide resolved
src/util/asmap.cpp Outdated Show resolved Hide resolved
src/util/asmap.cpp Outdated Show resolved Hide resolved
src/util/asmap.cpp Outdated Show resolved Hide resolved
fanquake added a commit that referenced this pull request Aug 26, 2019
92528c2 Support serialization of std::vector<bool> (Pieter Wuille)

Pull request description:

  This adds support for serialization of `std::vector<bool>`, as a prerequisite of #16702.

ACKs for top commit:
  MarcoFalke:
    ACK 92528c2 (only looked at the diff on GitHub)
  practicalswift:
    ACK 92528c2 -- diff looks correct
  jamesob:
    ACK 92528c2

Tree-SHA512: 068246a55a889137098f817bf72d99fc3218a560d62a97c59cccc0392b110f6778843cee4e89d56f271ac64e03a0f64dc5e2cc22daa833fbbbe9234c5f42d7b9
@naumenkogs naumenkogs force-pushed the naumenkogs:asn_buckets branch from a944e38 to 0921912 Aug 27, 2019
Copy link
Member

Sjors left a comment

Concept ACK. It makes sense that this is opt-in for now, and I like that it can rebucket if the map changes, so it doesn't have to be perfect.

Can you get rid of the merge (and oops) commit? A rebase is much easier to review commit-by-commit.

I'd like to try this out, but I need something to feed @sipa's script with (and then pass that into -asmap=<file>). Is there an easy incantation to convert the geolite IPv4/IPv6 ASN tables? Or to produce it myself from a VPS?

Do we need to reconsider the number of buckets and their size?

src/init.cpp Outdated Show resolved Hide resolved
src/addrman.h Outdated Show resolved Hide resolved
@naumenkogs naumenkogs force-pushed the naumenkogs:asn_buckets branch from d688a65 to 712d3d7 Aug 28, 2019
src/netaddress.cpp Outdated Show resolved Hide resolved
@jamesob

This comment has been minimized.

Copy link
Member

jamesob commented Jan 15, 2020

For what it's worth I think reviewing @sipa's asmap format is prerequisite to signing off on this PR (since the data this PR is dependent on is encoded using the former). I've reviewed that code (and added some commentary: sipa/asmap#2), and will now be starting on this code.

Copy link
Member

jamesob left a comment

This is looking pretty close. It seems like a great change to me conceptually. FWIW, at some point the addrman doc preamble merits updating to mention bucketing-by-ASN when available. https://github.com/bitcoin/bitcoin/blob/master/src/addrman.h#L92-L116

The only point that I think is worth addressing before I ACK is the asmap vector copy in CNode::copyStats.

We could use more test coverage IMO but that can be done in followups.

if (asmap_file.empty()) {
asmap_file = DEFAULT_ASMAP_FILENAME;
}
const fs::path asmap_path = GetDataDir() / asmap_file;

This comment has been minimized.

Copy link
@jamesob

jamesob Jan 16, 2020

Member

Agree this should accept an absolute path. We should either pull @jonatack's change in to this PR or follow up shortly after this merges.

while (1) {
assert(pos != asmap.end());
opcode = DecodeType(pos);
if (opcode == 0) {

This comment has been minimized.

Copy link
@jamesob

jamesob Jan 16, 2020

Member

Agree with @EthanHeilman. I know we're trying to avoid nits but this seems like basic hygiene.

} else if (opcode == 3) {
default_asn = DecodeASN(pos);
} else {
assert(0);

This comment has been minimized.

Copy link
@jamesob

jamesob Jan 16, 2020

Member

I'm sort of inclined to say we should log really loudly and then fall back to IP-based bucketing since crashing based on asmap failure seems pretty drastic - especially since this is called not just at startup but throughout runtime. If we crash, a problem with this code could then potentially cause widespread outage.

return val;
}
}
return -1;

This comment has been minimized.

Copy link
@jamesob

jamesob Jan 16, 2020

Member

This code is a great candidate for unittesting - I'm inclined to say that you should use @laanwj's code here but it's somewhat difficult to verify that it will behave in exactly the same way sipa's original code does. Happy to add some tests in a follow-up PR.


#include <hash.h>
#include <netbase.h>
#include <random.h>

class CAddrManTest : public CAddrMan
{
private:
bool deterministic;

This comment has been minimized.

Copy link
@jamesob

jamesob Jan 16, 2020

Member

nit: convention would say m_deterministic = ..., then you don't have to do awkward naming to avoid shadowing below

BOOST_CHECK(buckets.size() == 8);
}

BOOST_AUTO_TEST_CASE(caddrinfo_get_new_bucket)

This comment has been minimized.

Copy link
@jamesob

jamesob Jan 16, 2020

Member

A lot of commonalities between this test case and the last. I know this is nitty stuff but you could save a lot of code (and thus review effort) by consolidating it into a function that takes a function parameter that returns info.Get{Tried,New}Bucket(...). Again not a blocker, just would be easier review without having to read through something that's kind of the same but not.

And these tests are basically duplicates of the existing tests with a null asmap (caddrinfo.*_legacy), so it seems like there's a lot of code duplication that could be removed here. Might make a good follow-up PR I guess.

src/test/addrman_tests.cpp Show resolved Hide resolved
BOOST_CHECK(ResolveIP("2001:0:9999:9999:9999:9999:FEFD:FCFB").GetGroup(asmap) == std::vector<unsigned char>({(unsigned char)NET_IPV4, 1, 2})); // RFC4380
BOOST_CHECK(ResolveIP("FD87:D87E:EB43:edb1:8e4:3588:e546:35ca").GetGroup(asmap) == std::vector<unsigned char>({(unsigned char)NET_ONION, 239})); // Tor
BOOST_CHECK(ResolveIP("2001:470:abcd:9999:9999:9999:9999:9999").GetGroup(asmap) == std::vector<unsigned char>({(unsigned char)NET_IPV6, 32, 1, 4, 112, 175})); //he.net
BOOST_CHECK(ResolveIP("2001:2001:9999:9999:9999:9999:9999:9999").GetGroup(asmap) == std::vector<unsigned char>({(unsigned char)NET_IPV6, 32, 1, 32, 1})); //IPv6

This comment has been minimized.

Copy link
@jamesob

jamesob Jan 16, 2020

Member

If you wanted better (albeit incidental) coverage of the ASN parsing stuff, you could use asmap.raw to verify that IPs are mapped properly. Can be done in a follow-up.

src/net.cpp Outdated Show resolved Hide resolved
naumenkogs added 2 commits Dec 24, 2019
If ASN bucketing is used, return a corresponding AS
used in bucketing for a given peer.
@naumenkogs naumenkogs force-pushed the naumenkogs:asn_buckets branch from 50f655e to 3c1bc40 Jan 23, 2020
@naumenkogs

This comment has been minimized.

Copy link
Contributor Author

naumenkogs commented Jan 23, 2020

I think no more pending critical comments, everything unaddressed will be easier to get through via a follow-up, so let's get this merged? Looking for acks :)

@laanwj

This comment has been minimized.

Copy link
Member

laanwj commented Jan 23, 2020

re-ACK 3c1bc40
only change is std::vector<bool> &m_asmap

Copy link
Member

jamesob left a comment

ACK 3c1bc40 (jamesob/ackr/16702.3.naumenkogs.p2p_supplying_and_using)

Only change since last ACK is changing the m_asmap param from value to reference. Built locally.

Show platform data

Tested on Linux-4.15.0-47-generic-x86_64-with-Ubuntu-18.04-bionic

Configured with ./configure LDFLAGS=-L/home/james/src/bitcoin/db4/lib/ CPPFLAGS=-I/home/james/src/bitcoin/db4/include/ CXXFLAGS=-std=c++11 -Wthread-safety-analysis -Werror -Wthread-safety-analysis -Werror --enable-wallet --enable-debug --with-daemon CXX=/usr/bin/clang++-4.0 CC=/usr/bin/clang-4.0 PYTHONPATH= --disable-shared --with-pic --enable-benchmark=no --with-bignum=no --enable-module-recovery --disable-jni --disable-shared --with-pic --enable-benchmark=no --with-bignum=no --enable-module-recovery --disable-jni --no-create --no-recursion

Compiled with /usr/bin/ccache /usr/bin/clang++-4.0 -std=c++11 -mavx -mavx2 -std=c++11 -Wthread-safety-analysis -Werror -Wthread-safety-analysis -Werror -O0 -g3 -ftrapv  -Wstack-protector -fstack-protector-all -msse4 -msha -msse4.1 -msse4.2  i

Compiler version: clang version 4.0.1-10 (tags/RELEASE_401/final)

@jonatack

This comment has been minimized.

Copy link
Member

jonatack commented Jan 27, 2020

I think reading the asmap happens too late in the initialization process, unless there's a strong reason to do it like this that I'm missing.
When reading it before loading the block index and doing initial verification, there would be much faster feedback on invalid -asmap command line arguments. It's a bit silly to spin up the entire node just to it shut down again.
Could be changed in a later PR (maybe #17812) anyhow.

I can confirm that moving the asmap init.cpp code from the end of Step 12: start node to the end of the Step 6: network initialization section speeds up the feature_asmap.py tests in #17812 from 60 seconds down to 5, while they continue to function as expected... by dramatically speeding up the 2 tests that use assert_start_raises_init_error. Updating #17812.

Copy link
Member

jonatack left a comment

ACK 3c1bc40

Built, ran tests, bitcoind, and getpeerinfo and observed the mapped_as fields. Idem rebased onto current master 3253b5d:

$ bitcoind -asmap=../projects/bitcoin/asmap/demo.map
2020-01-27T13:20:06Z Feeding 36895 bytes of environment data into RNG
2020-01-27T13:20:06Z Bitcoin Core version v0.19.99.0-3c1bc40205 (debug build)
.../...
2020-01-27T13:20:43Z init message: Starting network threads...
2020-01-27T13:20:43Z Opened asmap file "/home/jon/.bitcoin/../projects/bitcoin/asmap/demo.map" (932999 bytes) from disk.
2020-01-27T13:20:43Z msghand thread start
2020-01-27T13:20:43Z addcon thread start
2020-01-27T13:20:43Z net thread start
2020-01-27T13:20:43Z opencon thread start
2020-01-27T13:20:43Z dnsseed thread start
2020-01-27T13:20:44Z Using asmap version 7ead927689f884bc4021618017cc1438d9328578779edd2adfbed044fb52928e for IP bucketing.
2020-01-27T13:20:44Z init message: Done loading
$ bitcoind
2020-01-27T13:34:19Z Feeding 36908 bytes of environment data into RNG
2020-01-27T13:34:19Z Bitcoin Core version v0.19.99.0-3c1bc40205 (debug build)
.../...
2020-01-27T13:36:13Z init message: Starting network threads...
2020-01-27T13:36:13Z Using /16 prefix for IP bucketing.
2020-01-27T13:36:13Z init message: Done loading
$ bitcoin-cli getpeerinfo
[
  {
    "id": 0,
    "addr": "88.99.150.13:8333",
    "addrlocal": "213.152.162.154:37866",
    "addrbind": "10.24.132.40:37866",
    "mapped_as": 24940,
    "services": "000000000000040d",

Code re-review. Agree with improving test coverage and tests, docs, running the code earlier in init.cpp, passing absolute filenames, adding the default filename to the help, and various other little things that can be done in follow-ups.

@@ -83,6 +83,7 @@ static UniValue getpeerinfo(const JSONRPCRequest& request)
" \"addr\":\"host:port\", (string) The IP address and port of the peer\n"
" \"addrbind\":\"ip:port\", (string) Bind address of the connection to the peer\n"
" \"addrlocal\":\"ip:port\", (string) Local address as reported by the peer\n"
" \"mapped_as\":\"mapped_as\", (string) The AS in the BGP route to the peer used for diversifying peer selection\n"

This comment has been minimized.

Copy link
@jonatack

jonatack Jan 27, 2020

Member

The mapped_as value is currently returned as an integer.

This comment has been minimized.

Copy link
@jonatack

jonatack Jan 27, 2020

Member

Will address this in #17812.

This comment has been minimized.

Copy link
@laanwj

laanwj Jan 29, 2020

Member

Doesn't returning it as an integer make sense, as it is an integer? So I think we should change the documentation.

This comment has been minimized.

Copy link
@jonatack

jonatack Jan 29, 2020

Member

Doesn't returning it as an integer make sense, as it is an integer? So I think we should change the documentation.

Thanks! Agreed.

@luke-jr

This comment has been minimized.

Copy link
Member

luke-jr commented Jan 29, 2020

Is the plan to actually generate the asmap files from GeoIP's GeoLite database?

Has anyone reviewed their EULA which presumably controls asmaps derived from their database?

@laanwj

This comment has been minimized.

Copy link
Member

laanwj commented Jan 29, 2020

I think how the particular asmap database is generated is an important discussion if and when it's going to be shipped with bitcoin core itself. This PR just adds a feature to load and interpret one, independent of where it comes from.

laanwj added a commit that referenced this pull request Jan 29, 2020
…in addrman

3c1bc40 Add extra logging of asmap use and bucketing (Gleb Naumenko)
e4658aa Return mapped AS in RPC call getpeerinfo (Gleb Naumenko)
ec45646 Integrate ASN bucketing in Addrman and add tests (Gleb Naumenko)
8feb4e4  Add asmap utility which queries a mapping (Gleb Naumenko)

Pull request description:

  This PR attempts to solve the problem explained in #16599.
  A particular attack which encouraged us to work on this issue is explained here  [[Erebus Attack against Bitcoin Peer-to-Peer Network](https://erebus-attack.comp.nus.edu.sg/)] (by @muoitranduc)

  Instead of relying on /16 prefix to diversify the connections every node creates, we would instead rely on the (ip -> ASN) mapping, if this mapping is provided.

  A .map file can be created by every user independently based on a router dump, or provided along with the Bitcoin release. Currently we use the python scripts written by @sipa to create a .map file, which is no larger than 2MB (awesome!).

  Here I suggest adding a field to peers.dat which would represent a hash of asmap file used while serializing addrman (or 0 for /16 prefix legacy approach).
  In this case, every time the file is updated (or grouping method changed), all buckets will be re-computed.
  I believe that alternative selective re-bucketing for only updated ranges would require substantial changes.

  TODO:
  - ~~more unit tests~~
  - ~~find a way to test the code without including >1 MB mapping file in the repo.~~
  - find a way to check that mapping file is not corrupted (checksum?)
  - comments and separate tests for asmap.cpp
  - make python code for .map generation public
  - figure out asmap distribution (?)

  ~Interesting corner case: I’m using std::hash to compute a fingerprint of asmap, and std::hash returns size_t. I guess  if a user updates the OS to 64-bit, then the hash of asap will change? Does it even matter?~

ACKs for top commit:
  laanwj:
    re-ACK 3c1bc40
  jamesob:
    ACK 3c1bc40 ([`jamesob/ackr/16702.3.naumenkogs.p2p_supplying_and_using`](https://github.com/jamesob/bitcoin/tree/ackr/16702.3.naumenkogs.p2p_supplying_and_using))
  jonatack:
    ACK 3c1bc40

Tree-SHA512: e2dc6171188d5cdc2ab2c022fa49ed73a14a0acb8ae4c5ffa970172a0365942a249ad3d57e5fb134bc156a3492662c983f74bd21e78d316629dcadf71576800c
@laanwj laanwj merged commit 3c1bc40 into bitcoin:master Jan 29, 2020
1 of 2 checks passed
1 of 2 checks passed
continuous-integration/appveyor/pr AppVeyor build failed
Details
continuous-integration/travis-ci/pr The Travis CI build passed
Details
@laanwj laanwj removed this from Blockers in High-priority for review Jan 29, 2020
@sipa

This comment has been minimized.

Copy link
Member

sipa commented Jan 29, 2020

@luke-jr sipa/asmap#1 adds tooling to build asmap files from RIPE's dumps.

@Empact

This comment has been minimized.

Copy link
Member

Empact commented Jan 29, 2020

For the sake of documentation, this apparently introduced an appveyor test failure #18020, fixed in #18022.

@sipa

This comment has been minimized.

Copy link
Member

sipa commented Jan 29, 2020

Sorry for the late review. I've opened #18023 with things that I think can be improved here.

jonatack added a commit to jonatack/bitcoin that referenced this pull request Jan 30, 2020
and update feature_asmap.py test and test_runner.py.

This commit moves the asmap init.cpp code from the end of "Step 12: start node"
to "Step 6: network initialization" to provide feedback on passing an -asmap
config arg much more quickly.

This change speeds up the feature_asmap.py functional tests from 60 to 5 seconds
by dramatically speeding up the 2 tests that use assert_start_raises_init_error.
See this comment: bitcoin#16702 (comment)
jonatack added a commit to jonatack/bitcoin that referenced this pull request Jan 30, 2020
and update feature_asmap.py test and test_runner.py.

This commit moves the asmap init.cpp code from the end of "Step 12: start node"
to "Step 6: network initialization" to provide feedback on passing an -asmap
config arg much more quickly.

This change speeds up the feature_asmap.py functional tests from 60 to 5 seconds
by dramatically speeding up the 2 tests that use assert_start_raises_init_error.
See this comment: bitcoin#16702 (comment)

Credit to laanj for the suggestion.
jonatack added a commit to jonatack/bitcoin that referenced this pull request Jan 31, 2020
and update feature_asmap.py test and test_runner.py.

This commit moves the asmap init.cpp code from the end of "Step 12: start node"
to "Step 6: network initialization" to provide feedback on passing an -asmap
config arg much more quickly.

This change speeds up the feature_asmap.py functional tests from 60 to 5 seconds
by dramatically speeding up the 2 tests that use assert_start_raises_init_error.
See this comment: bitcoin#16702 (comment)

Credit to laanj for the suggestion.
jonatack added a commit to jonatack/bitcoin that referenced this pull request Jan 31, 2020
and update feature_asmap.py test and test_runner.py.

This commit moves the asmap init.cpp code from the end of "Step 12: start node"
to "Step 6: network initialization" to provide feedback on passing an -asmap
config arg much more quickly.

This change speeds up the feature_asmap.py functional tests from 60 to 5 seconds
by dramatically speeding up the 2 tests that use assert_start_raises_init_error.
See this comment: bitcoin#16702 (comment)

Credit to laanwj for the suggestion.
sidhujag added a commit to syscoin/syscoin that referenced this pull request Feb 1, 2020
…keting in addrman

3c1bc40 Add extra logging of asmap use and bucketing (Gleb Naumenko)
e4658aa Return mapped AS in RPC call getpeerinfo (Gleb Naumenko)
ec45646 Integrate ASN bucketing in Addrman and add tests (Gleb Naumenko)
8feb4e4  Add asmap utility which queries a mapping (Gleb Naumenko)

Pull request description:

  This PR attempts to solve the problem explained in bitcoin#16599.
  A particular attack which encouraged us to work on this issue is explained here  [[Erebus Attack against Bitcoin Peer-to-Peer Network](https://erebus-attack.comp.nus.edu.sg/)] (by @muoitranduc)

  Instead of relying on /16 prefix to diversify the connections every node creates, we would instead rely on the (ip -> ASN) mapping, if this mapping is provided.

  A .map file can be created by every user independently based on a router dump, or provided along with the Bitcoin release. Currently we use the python scripts written by @sipa to create a .map file, which is no larger than 2MB (awesome!).

  Here I suggest adding a field to peers.dat which would represent a hash of asmap file used while serializing addrman (or 0 for /16 prefix legacy approach).
  In this case, every time the file is updated (or grouping method changed), all buckets will be re-computed.
  I believe that alternative selective re-bucketing for only updated ranges would require substantial changes.

  TODO:
  - ~~more unit tests~~
  - ~~find a way to test the code without including >1 MB mapping file in the repo.~~
  - find a way to check that mapping file is not corrupted (checksum?)
  - comments and separate tests for asmap.cpp
  - make python code for .map generation public
  - figure out asmap distribution (?)

  ~Interesting corner case: I’m using std::hash to compute a fingerprint of asmap, and std::hash returns size_t. I guess  if a user updates the OS to 64-bit, then the hash of asap will change? Does it even matter?~

ACKs for top commit:
  laanwj:
    re-ACK 3c1bc40
  jamesob:
    ACK 3c1bc40 ([`jamesob/ackr/16702.3.naumenkogs.p2p_supplying_and_using`](https://github.com/jamesob/bitcoin/tree/ackr/16702.3.naumenkogs.p2p_supplying_and_using))
  jonatack:
    ACK 3c1bc40

Tree-SHA512: e2dc6171188d5cdc2ab2c022fa49ed73a14a0acb8ae4c5ffa970172a0365942a249ad3d57e5fb134bc156a3492662c983f74bd21e78d316629dcadf71576800c
luke-jr added a commit to bitcoinknots/bitcoin that referenced this pull request Feb 16, 2020
The scripts for creating a compact IP->ASN mapping are here:
https://github.com/sipa/asmap

Co-authored-by: Pieter Wuille <pieter.wuille@gmail.com>

Github-Pull: bitcoin#16702
Rebased-From: 8feb4e4
luke-jr added a commit to bitcoinknots/bitcoin that referenced this pull request Feb 16, 2020
Instead of using /16 netgroups to bucket nodes in Addrman for connection
diversification, ASN, which better represents an actor in terms
of network-layer infrastructure, is used.
For testing, asmap.raw is used. It represents a minimal
asmap needed for testing purposes.

Github-Pull: bitcoin#16702
Rebased-From: ec45646
luke-jr added a commit to bitcoinknots/bitcoin that referenced this pull request Feb 16, 2020
If ASN bucketing is used, return a corresponding AS
used in bucketing for a given peer.

Github-Pull: bitcoin#16702
Rebased-From: e4658aa
luke-jr added a commit to bitcoinknots/bitcoin that referenced this pull request Feb 16, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Linked issues

Successfully merging this pull request may close these issues.

None yet

You can’t perform that action at this time.