Peer ID Calculation History And Resolution #138

Open
Stebalien opened this issue Jan 29, 2019 · 20 comments

@Stebalien
Contributor

commented Jan 29, 2019

Forks off #111 to focus on the backwards compatibility issue instead of the CID/PID design space.

Summary

Peer ID calculation has changed a couple of times over the past year.

  1. Initially, peer IDs were always sha2-256 multihashes of the public key.
  2. From go-ipfs 0.4.16-0.4.18, we used the identity multihash to "inline" small keys (specifically, ed25519 keys) into the peer ID under the assumption that nobody was using them.
  3. We reverted (2) in 0.4.19 dev as OpenBazaar was using ed25519 keys before 0.4.16.
  4. We are now planning on re-introducing (2) as:
    a. Textile is also using ed25519 keys and is relying on 0.4.18 behavior.
    b. OpenBazaar is using a forked go-ipfs network. We are adding a package-level flag so they can restore pre-0.4.16 behavior in their forked build.
  5. Finally, we need to agree on a migration path such that all IPFS nodes use the same method to calculate peer IDs.

PR: libp2p/go-libp2p-peer#42

Goals

This all grew out of several requirements and wants:

Requirements

  • We need to provide a way for users to select a hash function other than sha256 when computing their peer ID.
  • Given a public key, peer ID calculation must be deterministic.
  • There must be a case-insensitive way to express peer IDs appearing in IPNS paths (for browsers). Currently, peer IDs are always base58 encoded, which is case-sensitive.
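
To make the case-insensitivity requirement concrete, here is a minimal sketch of one possible encoding: the multihash bytes rendered as lowercase, unpadded base32 with the multibase prefix "b". This is an illustration only, not something this issue has settled on; the helper names (sha256Multihash, toBase32Multibase) and the placeholder key bytes are made up for the example.

```go
package main

import (
	"crypto/sha256"
	"encoding/base32"
	"fmt"
	"strings"
)

// sha256Multihash wraps a sha2-256 digest in multihash framing:
// 0x12 (sha2-256 code), 0x20 (digest length), then the 32-byte digest.
func sha256Multihash(data []byte) []byte {
	sum := sha256.Sum256(data)
	return append([]byte{0x12, 0x20}, sum[:]...)
}

// toBase32Multibase renders bytes as lowercase, unpadded RFC 4648 base32
// with the multibase prefix "b". (Hypothetical helper for this sketch.)
func toBase32Multibase(raw []byte) string {
	enc := base32.StdEncoding.WithPadding(base32.NoPadding)
	return "b" + strings.ToLower(enc.EncodeToString(raw))
}

func main() {
	// Placeholder bytes standing in for a serialized public key.
	id := toBase32Multibase(sha256Multihash([]byte("placeholder public key bytes")))
	fmt.Println(id)
	// Lowercasing the string (as a browser may do to a hostname) changes nothing.
	fmt.Println(id == strings.ToLower(id))
}
```

A base58 string does not survive that kind of case folding, because its alphabet uses both upper- and lowercase letters as distinct digits.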

Wants

  • We'd like a way to "inline" public keys into peer IDs (if the public key is small enough to fit comfortably). This will allow encrypting a message to a peer without needing to look up their key first.
  • We'd like to be able to fetch keys with bitswap.

Definitions

  1. Inlining peer ID (working title...): A peer ID from which the associated
    public key can be extracted.
  2. sha256 peer ID: A peer ID created using the sha256 hash function.

Note: "inlining" peer IDs look like 1..., sha256 peer IDs look like Qm....
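
As a rough sketch of the two definitions above (standard library only, not the go-libp2p-peer code), the following derives both kinds of peer ID from the same serialized ed25519 key. The function names are hypothetical, the key material is a placeholder, and the exact comparison against the 42-byte limit is an assumption based on the "shorter than 42 bytes" wording used in this issue.

```go
package main

import (
	"crypto/sha256"
	"fmt"
	"math/big"
)

const b58Alphabet = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"

// Assumption: keys shorter than this are inlined.
const maxInlineKeyLength = 42

// base58Encode implements Bitcoin-style base58, as used for peer IDs.
func base58Encode(in []byte) string {
	zeros := 0
	for zeros < len(in) && in[zeros] == 0 {
		zeros++
	}
	n := new(big.Int).SetBytes(in)
	radix, mod := big.NewInt(58), new(big.Int)
	var out []byte
	for n.Sign() > 0 {
		n.DivMod(n, radix, mod)
		out = append(out, b58Alphabet[mod.Int64()])
	}
	for i := 0; i < zeros; i++ {
		out = append(out, '1') // each leading zero byte becomes '1'
	}
	for i, j := 0, len(out)-1; i < j; i, j = i+1, j-1 {
		out[i], out[j] = out[j], out[i]
	}
	return string(out)
}

// marshalEd25519PublicKey mimics the protobuf framing of a libp2p public key:
// field 1 (key type, Ed25519 = 1), then field 2 (32 bytes of key data).
func marshalEd25519PublicKey(raw []byte) []byte {
	return append([]byte{0x08, 0x01, 0x12, 0x20}, raw...)
}

// sha256ID is the "sha256 peer ID": a sha2-256 multihash of the key.
func sha256ID(key []byte) []byte {
	sum := sha256.Sum256(key)
	return append([]byte{0x12, 0x20}, sum[:]...)
}

// identityID is the "inlining peer ID": an identity multihash (code 0x00)
// that carries the key itself. Single-byte length, so key must be < 128 bytes.
func identityID(key []byte) []byte {
	return append([]byte{0x00, byte(len(key))}, key...)
}

// peerID inlines small keys and hashes everything else.
func peerID(key []byte) []byte {
	if len(key) < maxInlineKeyLength {
		return identityID(key)
	}
	return sha256ID(key)
}

func main() {
	raw := make([]byte, 32) // placeholder key material, not a real key
	for i := range raw {
		raw[i] = byte(i)
	}
	key := marshalEd25519PublicKey(raw)
	fmt.Println(base58Encode(sha256ID(key))) // starts with "Qm"
	fmt.Println(base58Encode(peerID(key)))   // starts with "1"
}
```

The same 36-byte serialized key therefore has two different names depending on which rule a node applies, which is the root of the compatibility problem described here.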

Events

Below is an exhaustive history of this issue:

  1. In the beginning, peer IDs were sha256 multihashes of public keys.
  2. At some point, we added support for ed25519 keys in go-libp2p. The plan was
    to embed them in peer IDs
    (libp2p/go-libp2p-crypto#5). However, we punted on
    that. We never added support to go-ipfs
    (ipfs/go-ipfs#3625).
  3. OpenBazaar started using ed25519 keys anyways.
  4. We (go) added a separate function for calculating inlining peer IDs (in
    libp2p/go-libp2p-peer#15) that embedded ed25519 keys. Unfortunately, that
    wasn't usable because everyone needs to compute the same peer ID.
  5. In libp2p/go-libp2p-peer#30, we removed this separate function and
    switched to automatically embedding keys shorter than 42 bytes into peer IDs.
    This way, everyone would deterministically calculate the same peer ID. We did
    this by using the "identity" hash function instead of sha256.
  6. Textile started using go-libp2p and go-ipfs (using ed25519 keys).
  7. OpenBazaar tried to rebase onto go-ipfs 0.4.18 and discovered that peer ID
    calculation had changed.
  8. After discussing the issue in libp2p/specs#111 and on a
    call, we decided to revert 5. We were under the
    impression that nobody else was using ed25519 keys given that go-ipfs doesn't
    provide a way to generate ed25519 keys.
  9. Textile reached out to @whyrusleeping about a weird bug they were seeing when
    trying to connect two nodes. @whyrusleeping tried reproducing it and got the
    error "dial attempt failed: <peer.ID Qm*yNGz7a> --> <peer.ID 12*FrJvar> dial
    attempt failed: connected to wrong peer". This is the inverse of the issue
    OpenBazaar was having.
  10. At the moment, it doesn't actually look like this is the issue Textile was
    having. They're seeing timeouts: "dial attempt failed: <peer.ID 12*xWYT4W> --> <peer.ID 12*YsNMRE> dial attempt failed: context deadline exceeded" and
    "dial attempt failed: <peer.ID 12*xWYT4W> --> <peer.ID 12*YsNMRE> dial attempt failed: dial tcp4 13.57.23.210:4001: i/o timeout". The latter looks
    like it comes from libp2p/go-tcp-transport#24.

This issue covers the peer ID issue, not the timeout issues.

Current State

  1. Most of the (go) network is using the inlining from libp2p/go-libp2p-peer#30.
  2. The latest go-ipfs and go-libp2p masters were, for a while, not inlining keys into peer IDs. They are now (again) inlining keys.
  3. OpenBazaar is using a forked (separate) network so, for now at least, they
    don't actually need to interoperate with the rest of the network.
  4. Textile is not using a forked network so they do need to interoperate. Textile is using peer IDs for identity but these could (potentially) be migrated. They are not using IPNS.
  5. IPLD-DID (decentralized identity) is using the new inlined keys for IPNS. However, it's unclear if they have many/any users who would be affected.

Affected Subsystems

This covers the affected subsystems and some hacky fixes that I don't recommend.
They're only there to illustrate the issue.

Outbound Connection

When establishing an outbound connection, go-libp2p will:

  1. Perform a secio handshake.
  2. Derive the remote peer's ID from their public key.
  3. Check if this derived peer ID is the target peer ID.

Textile is seeing this fail after updating go-libp2p because they're passing a
peer ID created using the identity hash function into go-libp2p while
go-libp2p-peer is calculating it using the sha256 multihash.

Example Fix: This could be fixed by a hack to convert a peer ID created using
the identity hash function to a sha256 one.
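
A minimal sketch of what such a hack could look like, operating on raw multihash bytes with the standard library only. The names toSha256ID and matchesTarget are hypothetical, not go-libp2p functions, and the key is a placeholder.

```go
package main

import (
	"bytes"
	"crypto/sha256"
	"fmt"
)

// toSha256ID converts an identity-multihash peer ID (0x00, length, key bytes)
// into the sha256 form. Any other ID is returned unchanged.
func toSha256ID(id []byte) []byte {
	if len(id) >= 2 && id[0] == 0x00 && id[1] < 0x80 && int(id[1]) == len(id)-2 {
		sum := sha256.Sum256(id[2:])
		return append([]byte{0x12, 0x20}, sum[:]...)
	}
	return id
}

// matchesTarget reports whether the ID derived during the handshake names the
// same key as the dial target, whichever of the two forms each side used.
func matchesTarget(target, derived []byte) bool {
	return bytes.Equal(toSha256ID(target), toSha256ID(derived))
}

func main() {
	key := []byte("placeholder serialized public key")
	inline := append([]byte{0x00, byte(len(key))}, key...)
	hashed := toSha256ID(inline)

	fmt.Println(bytes.Equal(inline, hashed))   // false: a strict comparison fails
	fmt.Println(matchesTarget(inline, hashed)) // true: both normalize to the same ID
}
```

Normalizing toward the sha256 form is the only direction that works, since the key cannot be recovered from a hash.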

Inbound Connection

When receiving an inbound connection, we compute the peer's ID from their public key. That means:

  1. If we're running a 0.4.18 go-ipfs node, we'll compute inlining peer IDs for ed25519 keys.
  2. If we're running a 0.4.19-dev go-ipfs node, we'll compute sha256 peer IDs for ed25519 keys.

Example Fix: In practice, this "just works". That is, we use our version of the
peer's ID internally and don't really care what the other side thinks its ID
is.

DHT

The DHT is the first place where we really care about calculating the same peer
IDs. If we don't, FindPeer breaks.

Let's say there's some peer A that wants to connect to a peer B. Let's assume
that peer A and B have this peer ID inlining enabled but the rest of the network
doesn't.

When peer A tries to connect to peer B, it'll walk the DHT, asking nodes if they either:

  1. Know of nodes closer to peer B.
  2. Know how to connect to peer B. Currently (in go-ipfs), only nodes that are
    actually connected to peer B will respond.

In this case,

  1. Peer A will be able to walk the DHT all the way to a node directly connected
    to peer B. That is, (1) will work.
  2. That node will know peer B by a different name (ID) so it won't be able to do part (2).

Example Fix: So, if we have an inlining peer ID, we can extract the key and
compute the sha256 peer ID and try to look that up in the DHT. Unfortunately,
we can't go the other way. This fix would also be really hacky.
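
The failure and the one-directional workaround can be sketched with a toy model. This is a simulation under simplifying assumptions (a single node, IDs as raw multihash strings, a hypothetical findPeerWithFallback helper), not the go-libp2p-kad-dht API.

```go
package main

import (
	"crypto/sha256"
	"fmt"
)

// sha256ID and inlineID return the raw multihash bytes of each ID form.
func sha256ID(key []byte) string {
	sum := sha256.Sum256(key)
	return string(append([]byte{0x12, 0x20}, sum[:]...))
}

func inlineID(key []byte) string {
	return string(append([]byte{0x00, byte(len(key))}, key...))
}

// node models a non-inlining DHT node: it indexes its connected peers by the
// sha256 ID it derived from their public key.
type node struct {
	peers map[string]string // peer ID -> address
}

func (n *node) connect(key []byte, addr string) {
	n.peers[sha256ID(key)] = addr
}

func (n *node) findPeer(id string) (string, bool) {
	addr, ok := n.peers[id]
	return addr, ok
}

// findPeerWithFallback retries with the sha256 form when the requested ID is
// an inlining one. The reverse is impossible: a hash does not reveal the key.
func findPeerWithFallback(n *node, id string) (string, bool) {
	if addr, ok := n.findPeer(id); ok {
		return addr, true
	}
	if len(id) >= 2 && id[0] == 0x00 {
		return n.findPeer(sha256ID([]byte(id[2:])))
	}
	return "", false
}

func main() {
	keyB := []byte("placeholder key for peer B")
	n := &node{peers: map[string]string{}}
	n.connect(keyB, "/ip4/192.0.2.1/tcp/4001") // documentation-range address

	_, ok := n.findPeer(inlineID(keyB))
	fmt.Println(ok) // false: the node knows peer B under a different ID

	addr, ok := findPeerWithFallback(n, inlineID(keyB))
	fmt.Println(addr, ok)
}
```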

IPNS

I don't believe IPNS is affected but I haven't thought through it thoroughly. At
worst, we'd have to apply a fix similar to the outbound connection fix.

PubSub

Same as IPNS.

Multibase

Unfortunately, this whole issue also relates to the ask from IPFS In Web
Browsers to make IPNS (and peer IDs) use multibase. If we just have Qm...
IDs, we can avoid allocating Q as a multibase prefix and we'll be fine.
However, inlining peer IDs start with 12 and 1 is already a valid (albeit
useless) multibase prefix for unary.

The real worry is that if we allow arbitrary multihash functions, we need to
tackle the multibase issue before we start running into collisions. That is,
hash codes such that Base58Encode([hash code]) maps to some useful multibase
prefix.

@vyzo

Contributor

commented Jan 29, 2019

  1. If we're running a 0.4.18 go-ipfs node, we'll compute inlining peer IDs for ed25519 keys.
  2. If we're running a 0.4.19-dev go-ipfs node, we'll compute sha256 peer IDs for ed25519 keys.

Shouldn't these two be reversed? .18 would use sha256 and .19-dev would use inline.

@Stebalien

Contributor Author

commented Jan 30, 2019

No. At the moment, 0.4.18 will use inlining, 0.4.16 and 0.4.19-dev won't. That's the issue.

@Zolmeister

commented Jan 30, 2019

These are incompatible (without embedding hash function in the protobuf):

  • We need to provide a way for users to select a hash function other than sha256 when computing their peer ID.
  • Given a public key, peer ID calculation must be deterministic.

Edit: moved to #139

@Stebalien

Contributor Author

commented Jan 31, 2019

These are incompatible (without embedding hash function in the protobuf):

Hence #111.

@t-bast

commented Feb 1, 2019

We'd like a way to "inline" public keys into peer IDs (if the public key is small enough to fit comfortably). This will allow encrypting a message to a peer without needing to look up their key first.

While I agree with the goal (I struggled with encryption key management in one of my experiments), I'm not sure it can be done securely.
I think the public key used to derive your peer ID should be a signing key.
Using a signing key for encryption is usually not secure.

For Ed25519 to Curve25519 the community currently thinks it's likely to be secure but I haven't found a formal proof of that yet, which is why the golang library doesn't even offer to convert an Ed25519 key to a Curve25519 point.

And even if that turns out to be secure for the specific case of Ed25519 keys, it's likely to be insecure for other types of keys so we shouldn't encourage users to use the peer ID key for encryption. I think some kind of handshake is preferable, as it also provides nice features like perfect forward secrecy (at the expense of performance unfortunately).

@Zolmeister

commented Feb 1, 2019

@t-bast What is an Ed25519 signing key?

@t-bast

commented Feb 1, 2019

Ed25519 is only a signing scheme (see here), not an encryption scheme.
So the keys you get when generating an Ed25519 key-pair simply can't be used for encryption, only for signatures.
It's possible to convert those keys to encryption keys (Curve25519 points) but I'm afraid it's not considered secure and not recommended.

@Zolmeister

commented Feb 1, 2019

I see, so to rephrase what you said before: the public key (Ed25519) in the PeerID should only be used as a signing key
I agree

@t-bast

commented Feb 1, 2019

That's exactly it. We'll need another mechanism to fetch the user's encryption key (in my small experiment users publish their encryption keys on the DHT and sign it with their peer key).

@Stebalien

Contributor Author

commented Feb 1, 2019

@t-bast

I agree. We'll have to enable this on a per-key (type) basis. We don't currently support encryption with these keys for precisely this reason.

@Stebalien

Contributor Author

commented Feb 1, 2019

@t-bast

If you ever do find a satisfactory solution, I'd love to hear about it.

@Zolmeister

commented Feb 1, 2019

@t-bast @Stebalien
In-lining public-keys would allow for anonymous DHT lookups, via Tor's rend-spec-v3
That is, querying the DHT for PeerID -> metadata without disclosing PeerID or metadata (to parties without the originating PeerID).

This is not possible without in-lining.

@Stebalien

Contributor Author

commented Feb 1, 2019

@Zolmeister

Thanks for pointing out that document! We've been discussing doing something like it in ipfs/notes#291 (comment). Note, that doesn't actually require (as far as I can tell) embedding the public key in the peer ID. Instead of deriving the encryption key from the public key, we'd derive it from the peer ID.

@Zolmeister

commented Feb 1, 2019

The reason I believe in-lining is required is because Put(Hash(PeerID), Encrypt(DeriveKey(PeerID), metadata)) cannot be authenticated without revealing PeerID.
This would allow for DOS (ipfs/notes#291 (comment)) on the DHT (without having to maintain a connection to the network for each PeerID).

With in-line keys this becomes Put(BlindKey, Encrypt(DeriveKey(PeerID), metadata)), where BlindKey ownership is proven with the PeerID private key.

@t-bast

commented Feb 2, 2019

Regarding encryption, I think there's no free lunch. If you want to have long-term encryption keys, you lose forward secrecy. That's the reason most protocols don't have long term encryption keys or rotate them regularly (tor's hidden services' master key used to derive the address is just a signing key, not an encryption key).

I think forward secrecy is important, which means there will always be a handshake/roundtrips to retrieve an encryption key for a given peer. We'll just have to live with it and make it efficient enough. TLS 1.3's 0-RTT is a good way to mitigate the performance issue (for actions that are not vulnerable to replay attacks).

Having a look at Signal's double ratchet could be an inspiration. I also remembered that Signal seems to have a solution that can provide signature and encryption with a single key, which they call XEdDSA, but I need to re-read the protocol because I remember that there are some caveats.

Otherwise the hidden service v3 spec is a very solid protocol (but not necessarily very performant) so it can also be a good starting point.

@cpacia

commented Feb 5, 2019

@Stebalien so here's our plan for ob... As part of the rebase to 0.4.18 I've put patches in our vendor directory in the various locations to get it working with hashed keys. Everything seems OK now.

However, I would like to switch to inline keys in the future, as not having to look up the public key simplifies our code quite a bit in various places. If you guys did absolutely nothing I would probably be OK with that, as once enough of our peers upgrade to our latest release I can migrate everyone to inline keys and stop patching go-ipfs in vendor.

However, I do think it makes sense from an architecture standpoint to support both formats, but it would seem like there might be easier solutions. For example you could patch Matches to test both styles.

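func (id ID) MatchesPublicKey(pk ic.PubKey) bool {
	// Accept the hashed (sha256) form of the ID first.
	if oid, err := IDFromPublicKey(pk); err == nil && oid == id {
		return true
	}
	// Otherwise fall back to the inlined form. A failure to compute one
	// form no longer hides a match on the other.
	iid, err := InlineIDFromPublicKey(pk)
	return err == nil && iid == id
}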

Also, the way I patched 0.4.18 connections was to change the runhandshake() function so that, if remotePeer != actualRemotePeer, it creates an ID from the public key using the hashed form and checks equality again.

But overall from our perspective we'd like to make sure go-ipfs remains compatible with inline keys in the future as it would greatly improve our stuff to make the switch.

@cpacia cpacia referenced this issue Feb 5, 2019

Merged

IPFS rebase to v0.4.18 #1425

2 of 2 tasks complete
@Stebalien

Contributor Author

commented Feb 6, 2019

However, I would like to switch to inline keys in the future, as not having to look up the public key simplifies our code quite a bit in various places.

Note: IPNS records now include the key if it isn't inlined into the ID.

But overall from our perspective we'd like to make sure go-ipfs remains compatible with inline keys in the future as it would greatly improve our stuff to make the switch.

This is still the goal and we'll make sure to support it no matter what.

For example you could patch Matches to test both styles.

Take a look at the "Affected Subsystems" section above. The real issue happens when accepting an inbound connection. In that case, the "acceptor" doesn't know how the dialer calculated their peer ID so they may end up calculating it differently. This can affect things like DHT lookups.

@cpacia

commented Feb 7, 2019

The real issue happens when accepting an inbound connection.

@Stebalien can you go into more detail on this, or maybe link to the affected part of the code?

I'm asking because in my code I've only patched runhandshake() and it appears that everything is working properly. Upgraded nodes aren't having any issue making IPFS/IPNS queries on the network (which consists almost entirely of non-upgraded nodes).

Stebalien added a commit to ipfs/go-ipfs that referenced this issue Feb 8, 2019

gx: update go-libp2p-peer
Switch _back_ to the 0.4.18 style of peer IDs while we figure things out. See
libp2p/specs#138.

License: MIT
Signed-off-by: Steven Allen <steven@stebalien.com>

magik6k pushed a commit to ipfs/interface-go-ipfs-core that referenced this issue Feb 9, 2019

gx: update go-libp2p-peer
Switch _back_ to the 0.4.18 style of peer IDs while we figure things out. See
libp2p/specs#138.

License: MIT
Signed-off-by: Steven Allen <steven@stebalien.com>
@parkan

commented Feb 22, 2019

@Stebalien could you link out to the affected "accepter" code? OB are looking to converge their code base with upstream and this is a blocker

@Stebalien

Contributor Author

commented Mar 15, 2019

@parkan this isn't about any specific code, just how DHTs work.

For example:

  1. New node (new peer ID calculation method) joins the DHT. The DHT nodes it connects to will calculate the peer ID using the old system.
  2. New node adds content to ipfs, and tries to announce the content using a provide record. The DHT will ignore these provide records because they won't appear to be coming from the right peer. We could fix that by upgrading DHT nodes to handle both peer IDs but, well, that requires an upgrade.
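
To illustrate step 2, here is a toy model of a DHT node that only accepts provider records naming the sender itself, where "the sender" is whatever ID the receiving node derives from the connection's public key. The types and method names are hypothetical and heavily simplified; this is not the go-libp2p-kad-dht code.

```go
package main

import (
	"crypto/sha256"
	"fmt"
)

// Old-style ID: sha256 multihash of the key (raw bytes as a string).
func sha256ID(key []byte) string {
	sum := sha256.Sum256(key)
	return string(append([]byte{0x12, 0x20}, sum[:]...))
}

// New-style ID: identity multihash carrying the key itself.
func inlineID(key []byte) string {
	return string(append([]byte{0x00, byte(len(key))}, key...))
}

type providerRecord struct {
	ContentKey string
	Provider   string // peer ID as the announcing node computes it
}

// dhtNode derives sender IDs with its own rule, old or new.
type dhtNode struct {
	deriveID  func(pubKey []byte) string
	providers map[string][]string
}

// handleAddProvider stores the record only if the provider ID matches the ID
// this node derived for the sender; otherwise the record is ignored.
func (d *dhtNode) handleAddProvider(senderKey []byte, rec providerRecord) bool {
	if rec.Provider != d.deriveID(senderKey) {
		return false
	}
	d.providers[rec.ContentKey] = append(d.providers[rec.ContentKey], rec.Provider)
	return true
}

func main() {
	key := []byte("placeholder key of the new node")
	rec := providerRecord{ContentKey: "example-content", Provider: inlineID(key)}

	old := &dhtNode{deriveID: sha256ID, providers: map[string][]string{}}
	fmt.Println(old.handleAddProvider(key, rec)) // false: record silently dropped

	upgraded := &dhtNode{deriveID: inlineID, providers: map[string][]string{}}
	fmt.Println(upgraded.handleAddProvider(key, rec)) // true
}
```

Nothing fails loudly on the new node's side: the announcement is simply ignored by every node that still uses the old rule.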

magik6k pushed a commit to ipfs/go-filestore that referenced this issue Jul 15, 2019

gx: update go-libp2p-peer
Switch _back_ to the 0.4.18 style of peer IDs while we figure things out. See
libp2p/specs#138.

License: MIT
Signed-off-by: Steven Allen <steven@stebalien.com>