
fix: protect useful peers in low buckets #634

Merged
merged 3 commits into master on May 25, 2020

Conversation

Stebalien
Member

That way, we never disconnect from the core of the network.

We don't protect our closest peers as:

  • We'll reconnect to them frequently when we query for ourselves.
  • We use the core nodes most frequently.

query.go Outdated
// Restrict to buckets 0, 1 (75% of requests, max 40 peers), so we don't
// protect _too_ many peers.
commonPrefixLen := kb.CommonPrefixLen(q.dht.selfKey, kb.ConvertPeerID(p))
if commonPrefixLen < 2 {
Contributor

nit: magic value.
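
One way to address the nit, sketched purely as an illustration (the constant name protectedBuckets is hypothetical, not from this PR):

// protectedBuckets is the number of low buckets (by common prefix length
// with our own key) whose useful peers we protect from the connection manager.
const protectedBuckets = 2

commonPrefixLen := kb.CommonPrefixLen(q.dht.selfKey, kb.ConvertPeerID(p))
if commonPrefixLen < protectedBuckets {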

dht.go Outdated
commonPrefixLen := kb.CommonPrefixLen(self, kb.ConvertPeerID(p))
cmgr.TagPeer(p, "kbucket", BaseConnMgrScore+commonPrefixLen)
commonPrefixLen := kb.CommonPrefixLen(dht.selfKey, kb.ConvertPeerID(p))
cmgr.TagPeer(p, kbucketTag, BaseConnMgrScore+commonPrefixLen)
Contributor

@Stebalien

Do we give a higher score to peers with higher CPL so that the connection manager doesn't select them as candidates if it has a choice?

Basically, we will protect the "furthest peers" but try to give a good score to the closer peers so that we de-prioritise them for pruning.

If yes, please can we document this?

Member Author

Yeah, basically. It's fuzzy and not the best solution, but it ensures that we:

  1. Never cut connections to the most useful peers.
  2. Try to keep connections to our nearest neighbors.

I'll add some documentation.
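
For example, the tagging site in dht.go could carry a comment along these lines (a sketch of the intent, not the exact wording that landed):

// Give closer peers (higher common prefix length) a higher score so the
// connection manager keeps connections to our nearest neighbours, while
// useful peers in the lowest buckets are protected outright elsewhere so
// we never cut connections to the most useful peers.
cmgr.TagPeer(p, kbucketTag, BaseConnMgrScore+commonPrefixLen)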

@Stebalien
Member Author

Q for @aarshkshah1992 and @aschmahmann:

  • I'm now tagging all useful peers, but only protecting ones in low buckets. Should we protect all useful peers?
  • Should we protect all peers in low buckets, not just the useful ones? That would give us a chance to use them first.

For context, I haven't seen this make any difference on a bootstrapper. We're still churning through nodes in the first two buckets every few hours. Honestly, I think this is just because we effectively pick peers at random. We need a better peer picking algorithm to replace old peers with better peers, ideally ones that seem to be sticking around.

@Stebalien
Member Author

We're still churning through nodes in the first two buckets every few hours.

I take this back. After a day to stabilize, it looks like the churn is dying down a bit.

	cmgr.Protect(p, dhtUsefulTag)
} else {
	cmgr.TagPeer(p, dhtUsefulTag, usefulConnMgrScore)
}
Contributor

@Stebalien The TagPeer call is idempotent and overwrites the existing score for that tag.
How about adding to the current score here, so we "reward" peers who are often useful to us?
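
A minimal sketch of that incremental idea, reusing the identifiers from the snippet above and assuming the connection manager's UpsertTag helper (this is not what the PR does):

// Add to the existing score each time the peer proves useful, instead of
// overwriting it with a fixed value.
cmgr.UpsertTag(p, dhtUsefulTag, func(current int) int {
	return current + usefulConnMgrScore
})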

Contributor

We could probably do that if we utilized decaying tags like pubsub is planning on using (libp2p/go-libp2p-pubsub#306). However, I don't think it's currently worth the complexity, especially if we take into account the various ways the connection manager can be gamed to promote one peer over another.

Member Author

I don't want to give them too much weight; I just want to make sure that they're marked useful.

@aarshkshah1992
Contributor

aarshkshah1992 commented May 15, 2020

  • I'm now tagging all useful peers, but only protecting ones in low buckets. Should we protect all useful peers?

How about protecting useful peers, or maybe ALL peers, above a certain CPL threshold? I am not sure of the benefit of protecting useful peers in the low buckets, as it's easy to discover those.

On the other hand, if I get a peer with a CPL of 16, I'd protect it with my life.

  • Should we protect all peers in low buckets, not just the useful ones? That would give us a chance to use them first.

I am not sure this is a great idea because it's "easy" to discover peers for the low buckets (because there are many of them), and this can create more problems than the benefit it gives us when the connection manager is constantly hitting its limits.

@aschmahmann
Contributor

@aarshkshah1992 this PR is a stab at improving #566. The major point here is that since it's valid to have peers in our routing tables that we're not connected to, we only really need to keep connections alive to peers we are using frequently.

On the other hand, if I get a peer with a CPL of 16, I'd protect it with my life.

While it initially seems like protecting peers with high CPLs is in your best interest, it's not really all that helpful since you query those peers a very small number of times per refresh interval (i.e. once or less).

I am not sure of the benefit of protecting useful peers in the low buckets as it's easy to discover those.

Yes, they are easy to discover and replace. However, we use them frequently (i.e. 50% of the time a user does a query it will use the peers in bucket 0) and therefore being able to reuse an existing connection instead of forming a new libp2p connection (TCP, multiplexer, encryption, ...) could save us a bunch of round trips and essentially shave a hop off of our lookups.
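
For a rough sense of those numbers: a uniformly random target shares a common prefix of exactly n bits with our key with probability 1/2^(n+1), so bucket 0 serves about half of all lookups and buckets 0-1 together about 75%, matching the comment in query.go above. A purely illustrative sketch of that arithmetic:

package main

import "fmt"

// bucketFraction returns the probability that a uniformly random target key
// has a common prefix of exactly n bits with our key: 1/2^(n+1).
func bucketFraction(n int) float64 {
	return 1 / float64(uint64(1)<<uint(n+1))
}

func main() {
	fmt.Println(bucketFraction(0))                     // 0.5  -> bucket 0
	fmt.Println(bucketFraction(0) + bucketFraction(1)) // 0.75 -> buckets 0 and 1
}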


Before we had dissociated connection state from routing table membership we needed to ensure we were connected to our closest peers to avoid harming the network (i.e. peers far from us are useful to us, but peers close to us are useful for the network). However, we are no longer under that obligation. Note: If we decide to sync state between nearby DHT peers then maintaining active connections to our closest peers will become important again.

@Stebalien
Member Author

How about protecting useful peers, or maybe ALL peers, above a certain CPL threshold?

I want to be careful about protecting too many peers. That will completely exclude these peers from the connection manager.

That way, we never disconnect from the core of the network.

We don't protect our closest peers as:

* We'll reconnect to them frequently when we query for ourselves.
* We use the core nodes most frequently.
And tag useful peers in higher buckets
@Stebalien
Member Author

Bump @aarshkshah1992.

Should we protect all peers in low buckets, not just the useful ones? That would give us a chance to use them first.

I am not sure this is a great idea because it's "easy" to discover peers for the low buckets (because there are many of them), and this can create more problems than the benefit it gives us when the connection manager is constantly hitting its limits.

Note: Protected peers do not count towards the limit. They're simply exempt.
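
Restating the difference between the two calls in this PR with comments (no new behaviour, just the semantics being discussed):

// Protect exempts the peer from connection-manager trimming entirely; per the
// note above, protected peers don't count towards the limit.
cmgr.Protect(p, dhtUsefulTag)

// TagPeer only raises the peer's priority when the manager decides which
// connections to prune once it hits its limits.
cmgr.TagPeer(p, dhtUsefulTag, usefulConnMgrScore)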

@aarshkshah1992
Contributor

aarshkshah1992 commented May 21, 2020

Q for @aarshkshah1992 and @aschmahmann:

  • I'm now tagging all useful peers, but only protecting ones in low buckets. Should we protect all useful peers?

Let me try to reason through this as it's all fuzzy in my head:

  • If we protect useful peers, our connection manager will NOT disconnect them when it kicks in. This will prevent an attacker from making too many connections to us causing us to disconnect those useful peers.
  • This means the peers that we have protected can continue to be useful in the future as we don't have to pay the connection costs for them. So, we will keep them around instead of preferring new peers.
  • Even if we give protection to a peer, we will remove it if a query to it fails, if it's not alive during our periodic ping, if its performance starts to degrade, etc. So there's no "leak" here.
  • By protecting the useful peers in the first two buckets, we are protecting useful peers for 75% of our queries.

This is good. However,

We do not give any protection to our closest peers (even though we do give them a high score). If an attacker wanted to eclipse a peer from the network, it could still flush out that peer from the routing tables of its closest peers.

I think we should offer protection to useful peers above a certain CPL threshold (or come up with another algo to protect those closest to us). I agree with @aschmahmann that this isn't of much help to us query-wise but if everyone does this, it is mutually beneficial in terms of NOT getting eclipsed.

@aschmahmann
Contributor

aschmahmann commented May 21, 2020

@aarshkshah1992

If an attacker wanted to eclipse a peer from the network, it could still flush out that peer from the routing tables of its closest peers.

If peer X is in bucket 15 how is the attacker going to flush it out, and if it was how would having the connection be protected prevent it from being flushed out?

In my understanding, the only way for the peer to get flushed out of the routing table is if it's deemed "no longer useful", which requires a malicious peer to make its way into our high-CPL bucket and respond much faster than other peers (e.g. lower latency, more "valuable" connections so that the good connections are pruned, which makes them slower, etc.).

However, since this attack is predicated on the attacker being able to generate high-CPL keys, they could just generate even higher-CPL keys and take over our high-CPL buckets that way (although it may be more expensive since they have to respond to queries occasionally). Additionally, even with protecting the connections, they will die/churn over time, allowing the attacker to slowly work their way into the table.

It's only K extra peers to protect so in some sense why not do it, but I don't think this buys us much from an eclipse-protection standpoint.

@aarshkshah1992
Contributor

@aarshkshah1992

If peer X is in bucket 15 how is the attacker going to flush it out, and if it was how would having the connection be protected prevent it from being flushed out?

Create a lot of connections to the peer -> make its connection manager disconnect non-protected peers, which means peers with CPL > 1 -> this will hurt the usefulness of those peers as they won't be able to reply fast enough because of having to pay the connection cost.

At the same time, generate peer IDs with those higher CPLs and connect to the peer so that you become useful for those CPLs. But you are right, this is expensive compared to attacking the lower CPLs because of the compute power involved. And, to eclipse a peer, you'd have to do this attack on all its closest peers. We can think about this/do this later.

Contributor

@aarshkshah1992 aarshkshah1992 left a comment

We can still eclipse a peer by flushing it out from the routing tables of its closest peers. But that's harder to do because:

i. You'd have to generate peer IDs that hash to higher CPLs.
ii. You'd have to do this for all of the peer's closest peers.

We can worry about this later.

@Stebalien
Member Author

Note: you could eclipse a peer that way, but it would be harder to split the peer off from the network.

@Stebalien Stebalien merged commit 3d294c7 into master May 25, 2020
@Stebalien Stebalien mentioned this pull request May 26, 2020