
fix: protect useful peers in low buckets #634

Merged
merged 3 commits into master on May 25, 2020

Conversation

Stebalien
Member

That way, we never disconnect from the core of the network.

We don't protect our closest peers as:

  • We'll reconnect to them frequently when we query for ourselves.
  • We use the core nodes most frequently.

query.go Outdated
// Restrict to buckets 0, 1 (75% of requests, max 40 peers), so we don't
// protect _too_ many peers.
commonPrefixLen := kb.CommonPrefixLen(q.dht.selfKey, kb.ConvertPeerID(p))
if commonPrefixLen < 2 {
Contributor

nit: magic value.
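
One way to address the nit, sketched purely as an illustration (the constant name protectedBuckets is hypothetical, not from this PR):

// protectedBuckets is the number of low buckets (by common prefix length
// with our own key) whose useful peers we protect from the connection manager.
const protectedBuckets = 2

commonPrefixLen := kb.CommonPrefixLen(q.dht.selfKey, kb.ConvertPeerID(p))
if commonPrefixLen < protectedBuckets {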

dht.go Outdated
commonPrefixLen := kb.CommonPrefixLen(self, kb.ConvertPeerID(p))
cmgr.TagPeer(p, "kbucket", BaseConnMgrScore+commonPrefixLen)
commonPrefixLen := kb.CommonPrefixLen(dht.selfKey, kb.ConvertPeerID(p))
cmgr.TagPeer(p, kbucketTag, BaseConnMgrScore+commonPrefixLen)
Contributor

@Stebalien

Do we give a higher score to peers with higher CPL so that the connection manager doesn't select them as candidates if it has a choice?

Basically, we will protect the "furthest peers" but try to give a good score to the closer peers so that we de-prioritise them for pruning.

If yes, please can we document this?

Member Author

Yeah, basically. It's fuzzy and not the best solution, but it ensures that we:

  1. Never cut connections to the most useful peers.
  2. Try to keep connections to our nearest neighbors.

I'll add some documentation.
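
For example, the tagging site in dht.go could carry a comment along these lines (a sketch of the intent, not the exact wording that landed):

// Give closer peers (higher common prefix length) a higher score so the
// connection manager keeps connections to our nearest neighbours, while
// useful peers in the lowest buckets are protected outright elsewhere so
// we never cut connections to the most useful peers.
cmgr.TagPeer(p, kbucketTag, BaseConnMgrScore+commonPrefixLen)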

@Stebalien
Member Author

Q for @aarshkshah1992 and @aschmahmann:

  • I'm now tagging all useful peers, but only protecting ones in low buckets. Should we protect all useful peers?
  • Should we protect all peers in low buckets, not just the useful ones? That would give us a chance to use them first.

For context, I haven't seen this make any difference on a bootstrapper. We're still churning through nodes in the first two buckets every few hours. Honestly, I think this is just because we effectively pick peers at random. We need a better peer picking algorithm to replace old peers with better peers, ideally ones that seem to be sticking around.

@Stebalien
Member Author

We're still churning through nodes in the first two buckets every few hours.

I take this back. After a day to stabilize, it looks like the churn is dying down a bit.

	cmgr.Protect(p, dhtUsefulTag)
} else {
	cmgr.TagPeer(p, dhtUsefulTag, usefulConnMgrScore)
}
Contributor

@Stebalien The TagPeer call is idempotent and overwrites the existing score for that tag.
How about adding to the current score here, so we "reward" peers who are often useful to us?
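
A minimal sketch of that incremental idea, reusing the identifiers from the snippet above and assuming the connection manager's UpsertTag helper (this is not what the PR does):

// Add to the existing score each time the peer proves useful, instead of
// overwriting it with a fixed value.
cmgr.UpsertTag(p, dhtUsefulTag, func(current int) int {
	return current + usefulConnMgrScore
})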

Contributor

We could probably do that if we utilized decaying tags like pubsub is planning on using (libp2p/go-libp2p-pubsub#306). However, I don't think it's currently worth the complexity, especially if we take into account the various ways the connection manager can be gamed to promote one peer over another.

Member Author

I don't want to give them too much weight; I just want to make sure that they're marked useful.

@aarshkshah1992
Contributor

aarshkshah1992 commented May 15, 2020

  • I'm now tagging all useful peers, but only protecting ones in low buckets. Should we protect all useful peers?

How about protecting useful peers, or maybe ALL peers, above a certain CPL threshold? I am not sure of the benefit of protecting useful peers in the low buckets, as it's easy to discover those.

On the other hand, if I get a peer with a CPL of 16, I'd protect it with my life.

  • Should we protect all peers in low buckets, not just the useful ones? That would give us a chance to use them first.

I am not sure this is a great idea because it's "easy" to discover peers for the low buckets (because there are many of them), and this can create more problems than the benefit it gives us when the connection manager is constantly hitting its limits.

@aschmahmann
Contributor

@aarshkshah1992 this PR is a stab at improving #566. The major point here is that since it's valid to have peers in our routing tables that we're not connected to, we only really need to keep connections alive to peers we are using frequently.

On the other hand, if I get a peer with a CPL of 16, I'd protect it with my life.

While it initially seems like protecting peers with high CPLs is in your best interest, it's not really all that helpful since you query those peers a very small number of times per refresh interval (i.e. once or less).

I am not sure of the benefit of protecting useful peers in the low buckets as it's easy to discover those.

Yes, they are easy to discover and replace. However, we use them frequently (i.e. 50% of the time a user does a query it will use the peers in bucket 0) and therefore being able to reuse an existing connection instead of forming a new libp2p connection (TCP, multiplexer, encryption, ...) could save us a bunch of round trips and essentially shave a hop off of our lookups.
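
For a rough sense of those numbers: a uniformly random target shares a common prefix of exactly n bits with our key with probability 1/2^(n+1), so bucket 0 serves about half of all lookups and buckets 0-1 together about 75%, matching the comment in query.go above. A purely illustrative sketch of that arithmetic:

package main

import "fmt"

// bucketFraction returns the probability that a uniformly random target key
// has a common prefix of exactly n bits with our key: 1/2^(n+1).
func bucketFraction(n int) float64 {
	return 1 / float64(uint64(1)<<uint(n+1))
}

func main() {
	fmt.Println(bucketFraction(0))                     // 0.5  -> bucket 0
	fmt.Println(bucketFraction(0) + bucketFraction(1)) // 0.75 -> buckets 0 and 1
}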


Before we had dissociated connection state from routing table membership we needed to ensure we were connected to our closest peers to avoid harming the network (i.e. peers far from us are useful to us, but peers close to us are useful for the network). However, we are no longer under that obligation. Note: If we decide to sync state between nearby DHT peers then maintaining active connections to our closest peers will become important again.

@Stebalien
Member Author

How about protecting useful peers, or maybe ALL peers, above a certain CPL threshold?

I want to be careful about protecting too many peers. That will completely exclude these peers from the connection manager.

That way, we never disconnect from the core of the network.

We don't protect our closest peers as:

* We'll reconnect to them frequently when we query for ourselves.
* We use the core nodes most frequently.
And tag useful peers in higher buckets
@Stebalien
Member Author

Bump @aarshkshah1992.

Should we protect all peers in low buckets, not just the useful ones? That would give us a chance to use them first.

I am not sure this is a great idea because it's "easy" to discover peers for the low buckets (because there are many of them), and this can create more problems than the benefit it gives us when the connection manager is constantly hitting its limits.

Note: Protected peers do not count towards the limit. They're simply exempt.
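
Restating the difference between the two calls in this PR with comments (no new behaviour, just the semantics being discussed):

// Protect exempts the peer from connection-manager trimming entirely; per the
// note above, protected peers don't count towards the limit.
cmgr.Protect(p, dhtUsefulTag)

// TagPeer only raises the peer's priority when the manager decides which
// connections to prune once it hits its limits.
cmgr.TagPeer(p, dhtUsefulTag, usefulConnMgrScore)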

@aarshkshah1992
Contributor

aarshkshah1992 commented May 21, 2020

Q for @aarshkshah1992 and @aschmahmann:

  • I'm now tagging all useful peers, but only protecting ones in low buckets. Should we protect all useful peers?

Let me try to reason through this as it's all fuzzy in my head:

  • If we protect useful peers, our connection manager will NOT disconnect them when it kicks in. This will prevent an attacker from making too many connections to us causing us to disconnect those useful peers.
  • This means the peers that we have protected can continue to be useful in the future as we don't have to pay the connection costs for them. So, we will keep them around instead of preferring new peers.
  • Even if we give protection to a peer, we will remove it if a query to it fails, if it's not alive during our periodic ping, if its performance starts to degrade, etc. So there's no "leak" here.
  • By protecting the useful peers in the first two buckets, we are protecting useful peers for 75% of our queries.

This is good. However,

We do not give any protection to our closest peers (even though we do give them a high score). If an attacker wanted to eclipse a peer from the network, it could still flush out that peer from the routing tables of its closest peers.

I think we should offer protection to useful peers above a certain CPL threshold (or come up with another algo to protect those closest to us). I agree with @aschmahmann that this isn't of much help to us query-wise but if everyone does this, it is mutually beneficial in terms of NOT getting eclipsed.

@aschmahmann
Contributor

aschmahmann commented May 21, 2020

@aarshkshah1992

If an attacker wanted to eclipse a peer from the network, it could still flush out that peer from the routing tables of its closest peers.

If peer X is in bucket 15 how is the attacker going to flush it out, and if it was how would having the connection be protected prevent it from being flushed out?

In my understanding, the only way for the peer to get flushed out of the routing table is if it's deemed "no longer useful", which requires a malicious peer to make its way into our high-CPL bucket and respond much faster than other peers (e.g. lower latency, more "valuable" connections so that the good connections are pruned, which makes them slower, etc.).

However, since this attack is predicated on the attacker being able to generate high-CPL keys, they could just generate even higher-CPL keys and take over our high-CPL buckets that way (although it may be more expensive since they have to respond to queries occasionally). Additionally, even with protecting the connections, they will die/churn over time, allowing the attacker to slowly work their way into the table.

It's only K extra peers to protect so in some sense why not do it, but I don't think this buys us much from an eclipse-protection standpoint.

@aarshkshah1992
Contributor

@aarshkshah1992

If peer X is in bucket 15 how is the attacker going to flush it out, and if it was how would having the connection be protected prevent it from being flushed out?

Create a lot of connections to the peer -> make its connection manager disconnect non-protected peers, which means peers with CPL > 1 -> this will hurt the usefulness of those peers as they won't be able to reply fast enough because of having to pay the connection cost.

At the same time, generate peer IDs with those higher CPLs and connect to the peer so that you become useful for those CPLs. But you are right, this is expensive compared to attacking the lower CPLs because of the compute power involved. And, to eclipse a peer, you'd have to do this attack on all its closest peers. We can think about this/do this later.

Contributor

@aarshkshah1992 aarshkshah1992 left a comment

We can still eclipse a peer by flushing it out from the routing tables of its closest peers. But that's harder to do because:

i. You'd have to generate peer IDs that hash to higher CPLs.
ii. You'd have to do this for all of the peer's closest peers.

We can worry about this later.

@Stebalien
Member Author

Note: you could eclipse a peer that way, but it would be harder to split the peer off from the network.

@Stebalien Stebalien merged commit 3d294c7 into master May 25, 2020
@Stebalien Stebalien mentioned this pull request May 26, 2020