Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

discv5: decide on proof-of-work for node identities #122

Open
fjl opened this issue Oct 22, 2019 · 8 comments
Open

discv5: decide on proof-of-work for node identities #122

fjl opened this issue Oct 22, 2019 · 8 comments

Comments

@fjl
Copy link
Collaborator

fjl commented Oct 22, 2019

This issue attempts to collect our options regarding proof-of-work on node identities.
This isn't a complete solution just yet, more a place where we can dump thoughts and
collect research.

First, let's describe what the problem is:

discv5 is a Kademlia-inspired system. As such, it relies on the XOR distance measure to
establish routing tables, forming an overlay network. All known structured overlay
networks are known to be susceptible to attacks (such as the eclipse attack) based on
attacker-chosen node IDs. Proof-of-work is one solution to this problem. If generating a
new node identity required a certain amount of computational work, attackers would have a
harder time creating chosen node IDs to influence routing tables.

Another solution is to design an overlay network that doesn't use node IDs for routing, or
using an unstructured network topology. If such a design exists, and fulfills the
requirements we have for node discovery, I'd strongly prefer to use it instead of adding
proof-of-work. Unfortunately I don't know of any overlay design with those properties.

How much work?

Creating a new node identity is a one time process for most users, but a repeated task for
the attacker. The kind of attacker we want to rule out is one using many cheap cloud
servers. These are typically memory-constrained. If creating a new node required 4GB of
memory, it'd be harder to parallelize this operation at the cost of raising the barrier of
entry into the network. We are less concerned with CPU time consumption of the
proof-of-work function. Computation times up to ten seconds are no issue for most people,
especially for a one-time task.

Suitable proof-of-work functions

Since all network participants should verify the proof-of-work value if provided, the
function chosen must be asymmetric, i.e. verification should take very little effort. We
also need to consider that the chosen function must be simple to implement in multiple
programming languages. It'd be best to choose something that's already available.

There aren't a lot of functions to choose from here, so we might as well look at all the
current options:

  • Ethash. This is widely implemented across languages already. A huge downside with ethash
    is the verification time. We're already struggling with ethash verification performance
    for Ethereum block validation on smaller devices.
  • Equihash. This function is used for zcash and several other currencies. One issue here
    was that I couldn't find many implementations.
  • Cuckoo Cycle. This one is used by grin, and also in bitcoin's peer-to-peer networking
    layer. It's really simple to implement the algorithm and verification is essentially
    free. One downside with this function is the proof size. Cuckoo cycle proofs are L * 4
    bytes, where L is typically > 20. This makes it unsuitable for inclusion in ENR (see
    below).

Implementation Ideas

In discv5, node IDs are derived from node records using an 'identity scheme'. This is good
because any piece of metadata can be attached to the node and relayed in discovery, or
even across multiple discovery systems. This additional metadata can also be used to
derive the overlay node ID.

There are multiple ways to integrate proof-of-work into ENR, depending on how much
flexibility we want. If the proof-of-work value would influence the node ID directly, we'd
need to define a new identity scheme for it. We could also leave the identity scheme as-is
and add an attribute containing a value related to the ID created by any scheme. I am
strongly leaning toward the latter approach, since it allows us to keep improving the PoW
while keeping the ID stable.

If we were to choose Cuckoo Cycle, an additional complication would be transmitting the
proof value in the discovery protocol. I did research this a bit: to make this work,
we'd add an ENR attribute containing an 8-byte PoW nonce. This attribute would simply
signal support for proof-of-work. To actually create the proof, we'd basically try random
nonces until a cuckoo cycle is found for the sha256(node-id || nonce) input value. The
proof would have to be transmitted in a new protocol message or during the handshake.

@fjl fjl added this to the Discovery v5 milestone Oct 22, 2019
@djrtwo
Copy link
Contributor

djrtwo commented Oct 23, 2019

I'm worried that although you can now model the cost of eclipsing a node as non-zero, this cost will still likely be relatively insubstantial to an attacker if we keep the cost low enough for normal nodes (and even resource constrained devices).

It also seems that we would need to keep the PoW fixed rather than dynamically adjusting (like in a PoW consensus) to ensure that an attacker can't make it costly for honest nodes to participate. [It is also unclear at first glance how you would actually do a dynamic PoW in this context..]

How do you plan on choosing a suitable value for the difficulty of the PoW?

This problem is fundamental to DHTs, right? Are there any other novel solutions that don't use PoW that you have come upon?

@fjl
Copy link
Collaborator Author

fjl commented Oct 23, 2019

It also seems that we would need to keep the PoW fixed rather than dynamically adjusting (like in a PoW consensus) to ensure that an attacker can't make it costly for honest nodes to participate. [It is also unclear at first glance how you would actually do a dynamic PoW in this context..]

How do you plan on choosing a suitable value for the difficulty of the PoW?

Yes, the idea here is to keep a static PoW target. There is no way to attack that in the same way the blockchain 'difficulty' could be attacked. We'd basically decide on the amount of resources creating a node ID should take and put these parameters into the software. Changing it would be a software update.

This problem is fundamental to DHTs, right? Are there any other novel solutions that don't use PoW that you have come upon?

Not really. There was a cool idea in Low-Resource Eclipse Attacks on Ethereum’s Peer-to-Peer Network to make the distance measure private to each node (this is Countermeasure 3 in the paper). It breaks routing in a fundamental way, but might be worth going off on that tangent a little longer to see if it leads anywhere. The advantage we have over traditional DHTs is that we're not concerned with content-adressed data storage. It might actually be OK to break routing or make it less efficient because we really only care about random walks of the DHT.

@fjl
Copy link
Collaborator Author

fjl commented Oct 23, 2019

Actually, let's go on that tangent right now. Most ideas in discv5 (basic wire protocol, ENR, liveness check logic, topic queues) are universal and not specifically tied to the Kademlia structure. Topic ad distribution is, but only in that it builds on the idea of XOR distance, and we're actually still unsure about how to do the distribution right.

If we'd break routing, essentially removing Kademlia from the protocol, we'd need something to replace it. That something needs to have:

  • size-bounded local storage of a subset of the network
  • an algorithm to enumerate the entire network, and defined subsets of the network
  • critically: a structure that doesn't depend on node IDs or IP addresses

@dryajov
Copy link

dryajov commented Dec 4, 2019

cc @dryajov

@tfalencar
Copy link

tfalencar commented Feb 4, 2020

Has there been a decision on this topic?
I think the PoW solution is not very future proof, since as soon as at least one ASIC exists for the algorithm chosen, keys for the network can be sold in bulks cheaply enough that undermines the purpose of this mechanism. So in the end all that is left is a false sense of security, yet with all the downsides of the approach: I.e. low end devices unable to join etc. Or do I miss anything?

@fjl
Copy link
Collaborator Author

fjl commented Feb 4, 2020

No decision has been made so far. I don't like PoW either.

@fjl
Copy link
Collaborator Author

fjl commented Feb 7, 2020

Simple idea to try from @zsfelfoldi:

If we change the node distance calculation to be based on the hash of the node's IP address, it would be much harder to generate chosen IDs. We can actually pull this off with discv5 because
the protocol verifies IP addresses (i.e. they can't be spoofed) and we can detect our own global IP address using the PING/PONG source address field.

There are some downsides to this idea: if routing is completely dependent on IP address distribution, we have to deal with odd cases like many nodes being behind the same NAT address. There would also be logic specific to IPv6 to avoid mining on an address suffix.

@tfalencar
Copy link

tfalencar commented Feb 9, 2020

That's an interesting idea. One concern I have is that it would mean if a client is changing IP's frequently (which is common since most home ISP's do not provide static IP's by default), would that mean severe down-time for home users every time their IP changes (since the client needs to restart the whole peer discovery and reputation building from scratch) ?

One solution would be, after a first connection with a (new) peer, make it so that all his peers would accept subsequent incoming connections from new IP's, as long as the hash of that IP is signed with the same formerly provided public-key (say, during handshake). Makes sense?

So:
First connection - > new peer registered if hash(IP) matches -> store peer's PK locally and do normal accounting of peer quality.
Subsequent connections from new IP but same peer PK -> verify that hash(IP) matches and PK signature matches -> update IP/node ID locally to new IP (or better, only remove an IP association after X days of dormant time, and include the new one as also valid for that peer PK).
This would make it so that a lost connection due IP changes can be quickly recovered with its existing peers.

Another thing to think about is if there is any privacy implications of such scheme.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

6 participants
@fjl @dryajov @djrtwo @tfalencar and others