Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

net: Feeler connections to increase online addrs in the tried table. #8282

Merged
merged 1 commit into from Aug 25, 2016

Conversation

Projects
None yet
7 participants
@EthanHeilman
Copy link
Contributor

EthanHeilman commented Jun 28, 2016

These changes implement countermeasures 3 (feeler connections) suggested in our paper: "Eclipse Attacks on Bitcoin’s Peer-to-Peer Network".

Design:

We observe that a node's resistance to eclipse attacks grows as the number of online addresses in the tried table grows. To increase the number of online addresses in the tried table the following logic is implemented in net.cpp's ThreadOpenConnections:

  1. Every 2 minutes a short lived feeler connection is made to a randomly selected address in new.
  2. If the tested address is online and running bitcoind, this address is moved to the tried table.
  3. The feeler connection to the tested address is closed.

Only one feeler connection is attempted at any one time and feeler connections are only attempted after all outgoing connections slots of filled.

Advantages:

  • In our paper we sample several peer lists. We found that a large percentage of addresses in tried tables are stale IP addresses (the lowest was 72 percent stale, the highest was 95 percent stale). This large number of stale IP addresses increases the risk of eclipse attacks. This change remedies this by ensuring that the tried table grows quickly and contains many recently online addresses.
  • Not only do feeler connections provide a direct benefit to the node doing the testing but the tested node, if online, learns about our node and adds it to its tried table (conferring herd immunity even under partial deployment).
  • Countermeasure 3 (test-before-evict) builds on the feeler connections introduced in this change. This is the first step to deploying countermeasure 3.

Risk mitigation:

  • To limit the network impact of the feeler connections we only make one new connection every 2 minutes. Compared with other networking tasks that bitcoind performs the bandwidth increase is very slight.
  • To avoid issues of synchronization we introduce a random sleep of between 0 and 1000 milliseconds prior to making a feeler connection.
  • To avoid threading issues the feeler connections are made in the same thread as non-feeler connections.

Test plan:

  1. I'm currently running two EC2 nodes with public IP addresses. One is running this pull request and the other is running standard bitcoind. Both are running with arg -debug=1. At the end of the week I will post results from the logs looking at differences in the size of the tried table and examining the logs for any errors. I will also use the peer.dat files generated from this test to evaluate each nodes resistance to eclipse attacks.
  2. I will also use packet captures to ensure behavior and bandwidth behaves as expected.
  3. I invite anyone would like to help to run standard and feeler connection nodes and email me (Ethan.R.Heilman@gmail.com) the pcaps, logs and peers.dat files. I will post the results of this analysis here.

This change was suggested as Countermeasure 4 in
Eclipse Attacks on Bitcoin’s Peer-to-Peer Network,
Ethan Heilman, Alison Kendler, Aviv Zohar, Sharon Goldberg.
ePrint Archive Report 2015/263. March 2015.

@MarcoFalke MarcoFalke added the P2P label Jun 28, 2016

@EthanHeilman

This comment has been minimized.

Copy link
Contributor Author

EthanHeilman commented Jun 28, 2016

This pull request implements the feeler functionality in #6355.

@EthanHeilman EthanHeilman force-pushed the EthanHeilman:feelers3 branch Jun 28, 2016

@paveljanik

This comment has been minimized.

Copy link
Contributor

paveljanik commented Jun 30, 2016

tests need some love...

@EthanHeilman

This comment has been minimized.

Copy link
Contributor Author

EthanHeilman commented Jun 30, 2016

@paveljanik I am currently working on fixes these days, however I am running into issues with the rpc-tests. When I run rpc-tests locally both with this pull request and against bitcoin/master I get intermittent failures.

I run against bitcoin/master

python3 qa/pull-tester/rpc-tests.py

tests fail

TEST                           | PASSED | DURATION
bip68-112-113-p2p.py           | True   | 52 s
listtransactions.py            | True   | 177 s
wallet.py                      | True   | 287 s
mempool_resurrect_test.py      | True   | 20 s
receivedby.py                  | True   | 93 s
txn_doublespend.py --mineblock | True   | 50 s
txn_clone.py                   | True   | 40 s
getchaintips.py                | True   | 43 s
p2p-fullblocktest.py           | False  | 447 s
rawtransactions.py             | True   | 83 s
mempool_spendcoinbase.py       | True   | 10 s
mempool_reorg.py               | True   | 22 s
rest.py                        | True   | 71 s
multi_rpc.py                   | True   | 5 s
httpbasics.py                  | True   | 11 s
mempool_limit.py               | True   | 24 s
proxy_test.py                  | True   | 15 s
merkle_blocks.py               | True   | 41 s
signrawtransactions.py         | True   | 5 s
zapwallettxes.py               | True   | 49 s
nodehandling.py                | True   | 24 s
decodescript.py                | True   | 5 s
reindex.py                     | True   | 31 s
blockchain.py                  | True   | 6 s
disablewallet.py               | True   | 5 s
keypool.py                     | True   | 18 s
prioritise_transaction.py      | True   | 25 s
invalidblockrequest.py         | True   | 11 s
sendheaders.py                 | True   | 60 s
invalidtxrequest.py            | True   | 12 s
walletbackup.py                | True   | 647 s
p2p-versionbits-warning.py     | True   | 24 s
abandonconflict.py             | True   | 39 s
fundrawtransaction.py          | False  | 201 s
importprunedfunds.py           | True   | 37 s
signmessages.py                | True   | 6 s
segwit.py                      | True   | 51 s
zmq_test.py                    | True   | 17 s
p2p-segwit.py                  | True   | 88 s
ALL                            | False  | 2852 s (accumulated)

I rerun it and sometimes different tests fail

TEST                           | PASSED | DURATION

p2p-fullblocktest.py           | False  | 33 s
bip68-112-113-p2p.py           | True   | 52 s
listtransactions.py            | True   | 64 s
receivedby.py                  | True   | 46 s
walletbackup.py                | False  | 107 s
mempool_resurrect_test.py      | True   | 21 s
txn_clone.py                   | True   | 58 s
txn_doublespend.py --mineblock | True   | 67 s
getchaintips.py                | True   | 71 s
wallet.py                      | True   | 193 s
mempool_spendcoinbase.py       | True   | 8 s
mempool_reorg.py               | True   | 18 s
mempool_limit.py               | True   | 20 s
httpbasics.py                  | True   | 12 s
rest.py                        | True   | 57 s
multi_rpc.py                   | True   | 6 s
rawtransactions.py             | True   | 67 s
proxy_test.py                  | True   | 16 s
signrawtransactions.py         | True   | 7 s
merkle_blocks.py               | True   | 32 s
zapwallettxes.py               | True   | 46 s
nodehandling.py                | True   | 23 s
decodescript.py                | True   | 7 s
blockchain.py                  | True   | 7 s
disablewallet.py               | True   | 4 s
reindex.py                     | True   | 25 s
keypool.py                     | True   | 18 s
prioritise_transaction.py      | True   | 28 s
invalidblockrequest.py         | True   | 13 s
invalidtxrequest.py            | True   | 10 s
sendheaders.py                 | True   | 62 s
p2p-versionbits-warning.py     | True   | 26 s
abandonconflict.py             | True   | 38 s
importprunedfunds.py           | True   | 27 s
signmessages.py                | True   | 5 s
segwit.py                      | True   | 43 s
zmq_test.py                    | True   | 24 s
fundrawtransaction.py          | True   | 184 s
p2p-segwit.py                  | True   | 90 s

ALL                            | False  | 1635 s (accumulated)

Also sometimes it just crashes.

...........................................Traceback (most recent call last):
  File "qa/pull-tester/rpc-tests.py", line 340, in <module>
    runtests()
  File "qa/pull-tester/rpc-tests.py", line 209, in runtests
    (name, stdout, stderr, passed, duration) = job_queue.get_next()
  File "qa/pull-tester/rpc-tests.py", line 264, in get_next
    (stdout, stderr) = proc.communicate(timeout=3)
  File "/usr/lib/python3.4/subprocess.py", line 960, in communicate
    stdout, stderr = self._communicate(input, endtime, timeout)
  File "/usr/lib/python3.4/subprocess.py", line 1618, in _communicate
    self._check_timeout(endtime, orig_timeout)
  File "/usr/lib/python3.4/subprocess.py", line 986, in _check_timeout
    raise TimeoutExpired(self.args, orig_timeout)
subprocess.TimeoutExpired: Command '['/home/e0/work/bitcoin-standard/qa/rpc-tests/walletbackup.py', '--srcdir=/home/e0/work/bitcoin-standard/src', '--portseed=37']' timed out after 3 seconds
@MarcoFalke

This comment has been minimized.

Copy link
Member

MarcoFalke commented Jun 30, 2016

If you get a timeout, you may try to run the test directly qa/rpc-tests/test.py and see if it still fails. Also, you may want to check if there are any zombie bitcoin processes which you want to kill. And finally, the test_framework will no longer clean up after a failure, so you need to clean your temp dir manually in case it has limited space available.

Usually none of this should be a problem on travis, as a fresh vm is used every time, so you may want to look into the failure on travis.

@MarcoFalke

This comment has been minimized.

Copy link
Member

MarcoFalke commented Jun 30, 2016

You should assert that nodes in the existing tests are not marked a feel connection and thus get disconnected.

@sdaftuar

This comment has been minimized.

Copy link
Member

sdaftuar commented Jun 30, 2016

From my first glance at a failing travis test, it seems that perhaps pnode->fFeeler can be set to true on an incoming connection, causing an immediate disconnect? That would result in intermittent test failures for sure.

@EthanHeilman EthanHeilman force-pushed the EthanHeilman:feelers3 branch 2 times, most recently Jun 30, 2016

@EthanHeilman

This comment has been minimized.

Copy link
Contributor Author

EthanHeilman commented Jul 1, 2016

@sdaftuar @MarcoFalke I'm working on a simple unittest to make sure that fFeelers is false by default and never happens when fIncoming = true. Should be pushed by EOD.

@EthanHeilman EthanHeilman force-pushed the EthanHeilman:feelers3 branch Jul 1, 2016

@EthanHeilman

This comment has been minimized.

Copy link
Contributor Author

EthanHeilman commented Jul 1, 2016

I added a simple test and an assert to ensure that that fFeelers are set false by default and don't get assigned to incoming connection.

@EthanHeilman EthanHeilman force-pushed the EthanHeilman:feelers3 branch Jul 1, 2016

@EthanHeilman

This comment has been minimized.

Copy link
Contributor Author

EthanHeilman commented Aug 3, 2016

Summary: To test feeler connections I ran two nodes on EC2 for ~one month. One node ran this pull request, the second node ran default Bitcoin. According to this test feeler connections (this pull request) increase the size of the tried table by about an order of magnitude and consequently increase the eclipse resistance of a node by an order of magnitude. I also measured the impact of a future pull request test-before-evict where IPs are not evicted from the tried table if they are online.

Tried Table

Comparing the size of the tried table over time

One node ran this pull request for 32 days (July 1st to August 1st), the second node ran default Bitcoin for 35 Days (June 28 to August 1st).

On August 1st I disconnected the nodes and graphed the results. I tested which nodes in the tried table were online on August 1st.

  • The Feeler connections node had 1309 IPs in the tried table of which 38.8% or 590 IPs were online.
  • The default Bitcoin node had 195 IPs in the tried table of which 28.2% or 55 IPs were online.

Security Benefit

To determine the security benefit of these larger numbers of IPs in the tried table I modeled the attack presented in Eclipse Attacks on Bitcoin’s Peer-to-Peer Network. Full details on how this was carried out are given at bottom of this comment.

attackergraph40000-10-1000short-line

Default node: 595 attacker IPs for ~50% attack success.
Default node + test-before-evict: 620 attacker IPs for ~50% attack success.
Feeler node: 5540 attacker IPs for ~50% attack success.
Feeler node + test-before-evict: 8600 attacker IPs for ~50% attack success.

The node running feeler connections has 10 times as many online IP addresses in its tried table making an attack 10 times harder (i.e. requiring the an attacker require 10 times as many IP addresses in different /16s). Adding test-before-evict increases resistance of the node by an additional 3000 attacker IP addresses.

Below I graph the attack over even greater attacker resources (i.e. more attacker controled IP addresses). Note that test-before-evict maintains some security far longer even against an attacker with 50,000 IPs. If this node had a larger tried table test-before-evict could greatly boost a nodes resistance to eclipse attacks.

attacker graph long view

Modeling an Eclipse Attack

Using the peers.dat file from the default node and the feeler connections node I reconstructed their tried tables (I ignore the new table in our attacks as it is trivial to fill it with trash IP addresses). The attack test was run as follows:

  1. Attempt to insert X attacker IP addresses each from distinct \16s into a model of a node's tried table.
  2. Select nodes from each nodes tried table based on the logic in net.cpp (each outgoing connection must be from a different \16, only connect to online nodes).
  3. If the attacker's IPs are selected for all 8 outgoing connections the attacker wins as the attacker has partitioned the node from the rest of the network otherwise the attacker loses. For realism I only allowed the node to connect to other nodes which were online.

For each value X of number of attacker IPs I ran the attack 1000 times. For each of these 1000 runs I calculated the attack success probability by performing the selection of 8 outgoing connections 10 times. I used the peers.dat file from the feelers connection node and the default node to build the model of the tried table for the attack. I simulated the impact of enabling test-before-evict would have if turned on just prior to the attack (this should not be different than if the node has been running test-before-evict all along) by preventing the attacker from evicting an IP in the tried table if that node was online.

@sipa

View changes

src/net.cpp Outdated
@@ -1609,6 +1614,7 @@ void ThreadOpenConnections()

// Initiate network connections
int64_t nStart = GetTime();
int64_t nLastFeeler = GetTime();

This comment has been minimized.

Copy link
@sipa

sipa Aug 3, 2016

Member

No need to call GetTime() again.

@sipa

View changes

src/net.cpp Outdated

if (fFeeler) {
// Add small amount of random noise before connection to avoid syncronization.
int randsleep = GetRandInt(FEELER_SLEEP_WINDOW * 1000);

This comment has been minimized.

Copy link
@sipa

sipa Aug 3, 2016

Member

Would it make sense to use PoissonNextSend here, resulting in maximally unpredictable times?

EDIT: This comment was meant to appy to FEELER_INTERVAL instead.

This comment has been minimized.

Copy link
@EthanHeilman

EthanHeilman Aug 4, 2016

Author Contributor

Oh, I didn't know about PoissonNextSend, very cool. I moved FEELER_INTERVAL to use PoissonNextSend.

@sipa

View changes

src/net.cpp Outdated
bool fFeeler = false;
if (nOutbound >= MAX_OUTBOUND_CONNECTIONS){
if (GetTime() - nLastFeeler > FEELER_INTERVAL) {
nLastFeeler = GetTime();

This comment has been minimized.

Copy link
@sipa

sipa Aug 3, 2016

Member

GetTime() is relatively expensive; I think you can introduce an nTime = GetTime() above and then reuse that value 4 times. It probably doesn't matter, but gettimeofday calls appear so many times in traces already :)

This comment has been minimized.

Copy link
@EthanHeilman

EthanHeilman Aug 4, 2016

Author Contributor

I managed to get rid of one call to GetTime(). PoissonNextSend uses GetTimeMicros() so replacing additional GetTime() calls would require nNow_(1000_1000) which looks a little ugly.

In a local branch I have some non-ugly changes which can reduce the number of GetTime() and GetTimeMicros() calls in ThreadOpenConnections but it touches code which is out of scope for this pull request. Would it be worthwhile to create a separate pull request for this?

@paveljanik

View changes

src/net.cpp Outdated
addr = addrman.Select(true);
} else {
addr = addrman.Select();
}

This comment has been minimized.

Copy link
@paveljanik

paveljanik Aug 3, 2016

Contributor

Can't this be simply changed to

CAddrInfo addr = addrman.Select(fFeeler);
@paveljanik

View changes

src/net.cpp Outdated
if (addrConnect.IsValid()) {

if (fFeeler) {
// Add small amount of random noise before connection to avoid syncronization.

This comment has been minimized.

Copy link
@paveljanik

paveljanik Aug 3, 2016

Contributor

syncronization -> synchronization

@sipa

This comment has been minimized.

Copy link
Member

sipa commented Aug 3, 2016

Thanks a lot for the thorough analysis! Concept ACK with a few nits. I'm surprised by how little code was needed.

@paveljanik

This comment has been minimized.

Copy link
Contributor

paveljanik commented Aug 3, 2016

How will feeler endpoints view us when we disconnect just after their version message?

@sipa

This comment has been minimized.

Copy link
Member

sipa commented Aug 3, 2016

@paveljanik Similar to oneshot connections that are used for DNS seeding over tor.

@paveljanik

View changes

src/net.cpp Outdated
// Add small amount of random noise before connection to avoid syncronization.
int randsleep = GetRandInt(FEELER_SLEEP_WINDOW * 1000);
MilliSleep(randsleep);
LogPrint("net", "Making feeler connection to %s\n",addrConnect.ToString());

This comment has been minimized.

Copy link
@paveljanik

paveljanik Aug 3, 2016

Contributor

space after , please.

@paveljanik

This comment has been minimized.

Copy link
Contributor

paveljanik commented Aug 4, 2016

Concept ACK. Nice work and interesting reading!

bool fInboundIn = false;

// Test that fFeeler is false by default.
CNode* pnode1 = new CNode(hSocket, addr, pszDest, fInboundIn);

This comment has been minimized.

Copy link
@paveljanik

paveljanik Aug 4, 2016

Contributor

New compile warning here:

+test/net_tests.cpp:154:31: warning: variable 'hSocket' is uninitialized when used here [-Wuninitialized]
+    CNode* pnode1 = new CNode(hSocket, addr, pszDest, fInboundIn);
+                              ^~~~~~~
+test/net_tests.cpp:148:19: note: initialize the variable 'hSocket' to silence this warning
+    SOCKET hSocket;
+                  ^
+                   = 0

This comment has been minimized.

Copy link
@EthanHeilman

EthanHeilman Aug 8, 2016

Author Contributor

This compile warning has been fixed.

@EthanHeilman EthanHeilman force-pushed the EthanHeilman:feelers3 branch Aug 4, 2016

if (addrConnect.IsValid()) {

if (fFeeler) {
// Add small amount of random noise before connection to avoid synchronization.

This comment has been minimized.

Copy link
@sipa

sipa Aug 4, 2016

Member

Is this still needed with interval randomization?

This comment has been minimized.

Copy link
@EthanHeilman

EthanHeilman Aug 4, 2016

Author Contributor

Yes. If we remove the random sleep synchronization becomes possible.

Consider a network event which freezes ThreadOpenConnections for 120+ seconds. When that event unfreezes the thread all nodes will connect to an address in the new table all at once.

This anti-synchronization mechanism becomes more important when test-before-evict is added. Under test-before-evict a connection to a node can trigger the connected-to node to make another connection. My thinking is to get this protection in earlier rather than later since it does provide a small benefit to feeler connections.

@@ -60,6 +63,7 @@

namespace {
const int MAX_OUTBOUND_CONNECTIONS = 8;
const int MAX_FEELER_CONNECTIONS = 1;

This comment has been minimized.

Copy link
@rebroad

rebroad Aug 6, 2016

Contributor

This constant seems not to be used other than to effectively increase MAX_OUTBOUND_CONNECTIONS.

This comment has been minimized.

Copy link
@rebroad

rebroad Aug 7, 2016

Contributor

I'm perhaps reading the code incorrectly, but it seems that it will only ever make one feeler connection at a time, regardless of the value of this constant. I.e., increasing this value does not cause it to make concurrent feeler connections.

This comment has been minimized.

Copy link
@EthanHeilman

EthanHeilman Aug 8, 2016

Author Contributor

I reread the pull request and my understanding is still that MAX_FEELER_CONNECTIONS > 1 allows concurrent feeler connections. I have run tests in response to your comment which appear to show multiple concurrent feeler connections.

Would you be willing to write out your thought process which led you to the conclusion above in case there is something I am missing?

@EthanHeilman

This comment has been minimized.

Copy link
Contributor Author

EthanHeilman commented Aug 19, 2016

I'm going to be away from my computer for a few days. I will fix the newly broken tests when I get back.

Added feeler connections increasing good addrs in the tried table.
Tests if addresses are online or offline by briefly connecting to them. These short lived connections are referred to as feeler connections. Feeler connections are designed to increase the number of fresh online addresses in tried by selecting and connecting to addresses in new. One feeler connection is attempted on average once every two minutes.

This change was suggested as Countermeasure 4 in
Eclipse Attacks on Bitcoin’s Peer-to-Peer Network, Ethan Heilman,
Alison Kendler, Aviv Zohar, Sharon Goldberg. ePrint Archive Report
2015/263. March 2015.

@EthanHeilman EthanHeilman force-pushed the EthanHeilman:feelers3 branch to dbb1f64 Aug 23, 2016

@@ -4903,6 +4903,12 @@ bool static ProcessMessage(CNode* pfrom, string strCommand, CDataStream& vRecv,

if (strCommand == NetMsgType::VERSION)
{
// Feeler connections exist only to verify if address is online.

This comment has been minimized.

Copy link
@rebroad

rebroad Aug 24, 2016

Contributor

This would be better after the LogPrint to display version message I think, as otherwise the debug.log shows the disconnect and then after, the version message received, which is potentially confusing.

@@ -1017,7 +1021,8 @@ static void AcceptConnection(const ListenSocket& hListenSocket) {
SOCKET hSocket = accept(hListenSocket.socket, (struct sockaddr*)&sockaddr, &len);
CAddress addr;
int nInbound = 0;
int nMaxInbound = nMaxConnections - MAX_OUTBOUND_CONNECTIONS;
int nMaxInbound = nMaxConnections - (MAX_OUTBOUND_CONNECTIONS + MAX_FEELER_CONNECTIONS);
assert(nMaxInbound > 0);

This comment has been minimized.

Copy link
@rebroad

rebroad Aug 24, 2016

Contributor

Why add this assert here?

@laanwj

This comment has been minimized.

Copy link
Member

laanwj commented Aug 25, 2016

code review ACK dbb1f64

@laanwj laanwj added this to the 0.13.1 milestone Aug 25, 2016

@laanwj laanwj merged commit dbb1f64 into bitcoin:master Aug 25, 2016

1 check passed

continuous-integration/travis-ci/pr The Travis CI build passed
Details

laanwj added a commit that referenced this pull request Aug 25, 2016

Merge #8282: net: Feeler connections to increase online addrs in the …
…tried table.

dbb1f64 Added feeler connections increasing good addrs in the tried table. (Ethan Heilman)

@EthanHeilman EthanHeilman deleted the EthanHeilman:feelers3 branch Aug 25, 2016

jonasschnelli added a commit to jonasschnelli/bitcoin that referenced this pull request Sep 1, 2016

Added feeler connections increasing good addrs in the tried table.
Tests if addresses are online or offline by briefly connecting to them. These short lived connections are referred to as feeler connections. Feeler connections are designed to increase the number of fresh online addresses in tried by selecting and connecting to addresses in new. One feeler connection is attempted on average once every two minutes.

This change was suggested as Countermeasure 4 in
Eclipse Attacks on Bitcoin’s Peer-to-Peer Network, Ethan Heilman,
Alison Kendler, Aviv Zohar, Sharon Goldberg. ePrint Archive Report
2015/263. March 2015.

Backport for 0.12 from bitcoin#8282

@sipa sipa referenced this pull request Sep 7, 2016

Merged

[0.13] Various backports #8679

sickpig referenced this pull request in sickpig/BitcoinUnlimited Apr 14, 2017

Merge #8282: net: Feeler connections to increase online addrs in the …
…tried table.

dbb1f64 Added feeler connections increasing good addrs in the tried table. (Ethan Heilman)

@dgenr8 dgenr8 referenced this pull request May 18, 2017

Merged

Networking improvements #203

laanwj added a commit that referenced this pull request Mar 6, 2018

Merge #9037: net: Add test-before-evict discipline to addrman
e68172e Add test-before-evict discipline to addrman (Ethan Heilman)

Pull request description:

  This change implement countermeasures 3 (test-before-evict) suggested in our paper: ["Eclipse Attacks on Bitcoin’s Peer-to-Peer Network"](http://cs-people.bu.edu/heilman/eclipse/).
  # Design:

  A collision occurs when an address, addr1, is being moved to the tried table from the new table, but maps to a position in the tried table which already contains an address (addr2). The current behavior is that addr1 would evict addr2 from the tried table.

  This change ensures that during a collision, addr1 is not inserted into tried but instead inserted into a buffer (setTriedCollisions). The to-be-evicted address, addr2, is then tested by [a feeler connection](#8282). If addr2 is found to be online, we remove addr1 from the buffer and addr2 is not evicted, on the other hand if addr2 is found be offline it is replaced by addr1.

  An additional small advantage of this change is that, as no more than ten addresses can be in the test buffer at once, and addresses are only cleared one at a time from the test buffer (at 2 minute intervals), thus an attacker is forced to wait at least two minutes to insert a new address into tried after filling up the test buffer. This rate limits an attacker attempting to launch an eclipse attack.
  # Risk mitigation:
  - To prevent this functionality from being used as a DoS vector, we limit the number of addresses which are to be tested to ten. If we have more than ten addresses to test, we drop new addresses being added to tried if they would evict an address. Since the feeler thread only creates one new connection every 2 minutes the additional network overhead is limited.
  - An address in tried gains immunity from tests for 4 hours after it has been tested or successfully connected to.
  # Tests:

  This change includes additional addrman unittests which test this behavior.

  I ran an instance of this change with a much smaller tried table (2 buckets of 64 addresses) so that collisions were much more likely and observed evictions.

  ```
  2016-10-27 07:20:26 Swapping 208.12.64.252:8333 for 68.62.95.247:8333 in tried table
  2016-10-27 07:20:26 Moving 208.12.64.252:8333 to tried
  ```

  I documented tests we ran against similar earlier versions of this change in #6355.
  # Security Benefit

  This is was originally posted in PR #8282 see [this comment for full details](#8282 (comment)).

  To determine the security benefit of these larger numbers of IPs in the tried table I modeled the attack presented in [Eclipse Attacks on Bitcoin’s Peer-to-Peer Network](https://eprint.iacr.org/2015/263).

  ![attackergraph40000-10-1000short-line](https://cloud.githubusercontent.com/assets/274814/17366828/372af458-595b-11e6-81e5-2c9f97282305.png)

  **Default node:** 595 attacker IPs for ~50% attack success.
  **Default node + test-before-evict:** 620 attacker IPs for ~50% attack success.
  **Feeler node:** 5540 attacker IPs for ~50% attack success.
  **Feeler node + test-before-evict:** 8600 attacker IPs for ~50% attack success.

  The node running feeler connections has 10 times as many online IP addresses in its tried table making an attack 10 times harder (i.e. requiring the an attacker require 10 times as many IP addresses in different /16s). Adding test-before-evict increases resistance of the node by an additional 3000 attacker IP addresses.

  Below I graph the attack over even greater attacker resources (i.e. more attacker controled IP addresses). Note that test-before-evict maintains some security far longer even against an attacker with 50,000 IPs. If this node had a larger tried table test-before-evict could greatly boost a nodes resistance to eclipse attacks.

  ![attacker graph long view](https://cloud.githubusercontent.com/assets/274814/17367108/96f46d64-595c-11e6-91cd-edba160598e7.png)

Tree-SHA512: fdad4d26aadeaad9bcdc71929b3eb4e1f855b3ee3541fbfbe25dca8d7d0a1667815402db0cb4319db6bd3fcd32d67b5bbc0e12045c4252d62d6239b7d77c4395
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.