docs: add peer id spec #100

ghost · 2018-10-10T16:12:51Z

This updates the peer ID spec to explain what keypairs are supported and how peer IDs are encoded for each key type. Thanks to @Stebalien for figuring this out with me.

ghost · 2018-10-10T16:13:34Z

In the spirit of learning and spreading knowledge, I have chosen @jhiesey and @raulk as reviewers of this PR.

peer-ids/peer-ids.md

Stebalien · 2018-10-11T08:45:02Z

We should probably say: Implementations SHOULD support RSA and Ed25519. Implementations MAY support Secp256k1 and ECDSA but nodes using those keys may not be able to connect to all nodes.

marten-seemann · 2018-10-11T08:48:10Z

+1 for using normative language

whyrusleeping · 2019-01-11T18:20:13Z

Any update here? I'd love to have some specs around here to link to ;)

marten-seemann

In the interest of getting our specs PRs merged, I did a review of this one. Overall, this LGTM, I just think we can delete a few sentences that specify implementation details.

marten-seemann · 2019-03-11T23:57:09Z

peer-ids/peer-ids.md

+## Keys
+
+
+Our key pairs are stored on disk using a simple protobuf defined in [libp2p/go-libp2p-crypto/pb/crypto.proto#L5](https://github.com/libp2p/go-libp2p-crypto/blob/master/pb/crypto.proto#L5):


Specs shouldn't link to code; it should be the other way round.

How keys are stored on disk doesn't need to be specified; it's an implementation decision. We only need to specify things that affect the interoperability of implementations.

marten-seemann · 2019-03-12T00:06:11Z

peer-ids/peer-ids.md

+  3.  If the length of the serialized bytes <= 42, then we compute the "identity" multihash of the serialized bytes.  In other words, no hashing is performed, but the [multihash format is still followed](https://github.com/multiformats/multihash) (byte plus varint plus serialized bytes).  The idea here is that if the serialized byte array is short enough, we can fit it in a multihash proto without having to condense it using a hash function.
+  4. If the length is >42, then we hash it using it using the SHA256 multihash.
+
+For more information, refer to this block in [libp2p/go-libp2p-peer/peer.go](https://github.com/libp2p/go-libp2p-peer/blob/master/peer.go):


Same here. I think the text already describes the logic pretty well, so we don't need to cite this comment.

marten-seemann · 2019-03-12T00:07:00Z

peer-ids/peer-ids.md

+
+Implementations SHOULD support RSA and Ed25519. Implementations MAY support Secp256k1 and ECDSA, but nodes using those keys may not be able to connect to all other nodes.
+
+Keys are passed around in code as byte arrays.  Keys are encoded within these arrays differently depending on the type of key.  


That seems like an implementation decision. Remove this sentence?

marten-seemann · 2019-03-12T00:07:20Z

peer-ids/peer-ids.md

+
+To sign a message, we first hash it with SHA-256 and then sign it using the RSASSA-PKCS1-V1.5-SIGN from RSA PKCS#1 v1.5.
+
+See [libp2p/go-libp2p-crypto/rsa.go](https://github.com/libp2p/go-libp2p-crypto/blob/master/rsa.go) for details


marten-seemann · 2019-03-12T00:07:27Z

peer-ids/peer-ids.md

+
+Ed25519 signatures follow the normal Ed25519 standard.
+
+See [libp2p/go-libp2p-crypto/ed25519.go](https://github.com/libp2p/go-libp2p-crypto/blob/master/ed25519.go) for details


marten-seemann · 2019-03-12T00:07:34Z

peer-ids/peer-ids.md

+
+To sign a message, we hash the message with SHA 256, and then sign it with the ECDSA standard algorithm, then we encode it using DER-encoded ASN.1.
+
+See [libp2p/go-libp2p-crypto/ecdsa.go](https://github.com/libp2p/go-libp2p-crypto/blob/master/ecdsa.go) for details.


marten-seemann · 2019-03-12T00:07:41Z

peer-ids/peer-ids.md

+
+To sign a message, we hash the message with SHA 256, then sign it using the standard Bitcoin EC signature algorithm (BIP0062), and then use standard Bitcoin encoding.
+
+See [libp2p/go-libp2p-crypto/secp256k1.go](https://github.com/libp2p/go-libp2p-crypto/blob/master/secp256k1.go) for details.


marten-seemann · 2019-03-12T00:12:20Z

peer-ids/peer-ids.md

+
+We do not do any special additional encoding for Ed25519 public keys.
+
+The encoding for Ed25519 private keys is a little unusual. There are two formats that we encourage implementors to support:


This seems like an implementation decision, so we probably don't need to specify it.

Not entirely. We do want users to be able to port keys from one implementation to another.

marten-seemann · 2019-03-12T00:13:22Z

peer-ids/peer-ids.md

+}
+```
+
+As should be apparent from the above code block, this proto simply encodes for transmission a public/private key pair along with an enum specifying the type of keypair.


Is there any situation where we want to transmit the PrivateKey? That seems... dangerous. If not, we don't need to specify the PrivateKey here at all.

Yeah, storage of private keys is implementation-specific, so there's no need to cover them in this doc, I think.

Unfortunately, users do need to be able to take their private keys with them (especially because we use these for things like IPNS).

It's true that removing the private key format from this doc leaves a gap. We still need to specify somewhere how we handle them.

We could bring back the private key references and add a call-out at the top of the doc that they're not related to peer-id calculation and are shown for reference.

Really, we should probably rename this doc to the "libp2p key spec" and make peer ID calculation a part of that.

👍 for that

raulk · 2019-03-13T15:56:13Z

peer-ids/peer-ids.md

+  1. Encode the public key into the protobuf.
+  2. Serialize the protobuf containing the public key into bytes using the [canonical protobuf encoding](https://developers.google.com/protocol-buffers/docs/encoding).
+  3.  If the length of the serialized bytes <= 42, then we compute the "identity" multihash of the serialized bytes.  In other words, no hashing is performed, but the [multihash format is still followed](https://github.com/multiformats/multihash) (byte plus varint plus serialized bytes).  The idea here is that if the serialized byte array is short enough, we can fit it in a multihash proto without having to condense it using a hash function.
+  4. If the length is >42, then we hash it using it using the SHA256 multihash.


We should say something about how these are commonly represented as strings: base58btc encoding raw, without using multibase.

I added a bit about base58btc, but didn't mention multibase, since we hadn't defined it yet in the doc. Should I bring it up? If people are likely to expect peer IDs to use multibase, I think we should clarify.
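The four steps quoted above, plus the raw base58btc string form, can be sketched roughly as follows. This is a minimal illustration and not the reference implementation: the function names are hypothetical, the multihash codes (`0x00` for identity, `0x12` for sha2-256) come from the multihash table, and the input is assumed to be the already-serialized `PublicKey` protobuf.

```python
import hashlib

MAX_INLINE_KEY_LENGTH = 42  # threshold from the steps quoted above
IDENTITY_CODE = 0x00        # multihash code for "identity" (no hashing)
SHA2_256_CODE = 0x12        # multihash code for sha2-256
B58_ALPHABET = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"


def encode_varint(n: int) -> bytes:
    """Encode an unsigned integer as a varint (7 bits per byte, low bits first)."""
    out = bytearray()
    while True:
        byte = n & 0x7F
        n >>= 7
        if n:
            out.append(byte | 0x80)  # more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def peer_id_bytes(serialized_public_key: bytes) -> bytes:
    """Wrap the serialized key in a multihash: identity if short, sha2-256 otherwise."""
    if len(serialized_public_key) <= MAX_INLINE_KEY_LENGTH:
        code, digest = IDENTITY_CODE, serialized_public_key
    else:
        code, digest = SHA2_256_CODE, hashlib.sha256(serialized_public_key).digest()
    return encode_varint(code) + encode_varint(len(digest)) + digest


def base58btc(data: bytes) -> str:
    """Raw base58btc encoding (Bitcoin alphabet), with no multibase prefix."""
    n = int.from_bytes(data, "big")
    chars = []
    while n:
        n, rem = divmod(n, 58)
        chars.append(B58_ALPHABET[rem])
    # Each leading zero byte is represented by a leading "1".
    leading_zeros = len(data) - len(data.lstrip(b"\x00"))
    return "1" * leading_zeros + "".join(reversed(chars))
```

Calling `base58btc(peer_id_bytes(serialized_key))` would then give the string form discussed in this thread.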

raulk · 2019-03-13T15:56:40Z

peer-ids/peer-ids.md

+  3.  If the length of the serialized bytes <= 42, then we compute the "identity" multihash of the serialized bytes.  In other words, no hashing is performed, but the [multihash format is still followed](https://github.com/multiformats/multihash) (byte plus varint plus serialized bytes).  The idea here is that if the serialized byte array is short enough, we can fit it in a multihash proto without having to condense it using a hash function.
+  4. If the length is >42, then we hash it using it using the SHA256 multihash.
+
+For more information, refer to this block in [libp2p/go-libp2p-peer/peer.go](https://github.com/libp2p/go-libp2p-peer/blob/master/peer.go):


raulk · 2019-03-13T17:26:16Z

peer-ids/peer-ids.md

+}
+```
+
+As should be apparent from the above code block, this proto simply encodes for transmission a public/private key pair along with an enum specifying the type of keypair.


Yeah, storage of private keys is implementation-specific, so there's no need to cover them in this doc, I think.

raulk · 2019-03-13T17:27:20Z

@yusefnapora if you wanna do some spec herding, this is a quick win I think. Pretty good consensus.

yusefnapora · 2019-03-14T18:30:18Z

I did a quick edit to remove references to private keys and serialization on disk. I also removed links to go code and added some links to e.g. the RSA signing spec, etc.

@mgoelzer do you mind if I push changes to this branch? I put up a new branch with the edits (https://github.com/libp2p/specs/blob/edit/peer-ids/peer-ids/peer-ids.md), but it might be easier to discuss if I drop the commits here.

raulk · 2019-03-14T18:41:35Z

@yusefnapora gonna jump in and say yes. In the interest of moving forward, push to the branch in this PR. Thanks!

yusefnapora · 2019-05-08T19:21:46Z

Sorry for sleeping on this for a while everyone :)

I added a few commits to address some feedback. I think the most significant is the note about deterministic protobuf encoding. It basically says determinism is "desirable" and you should try to make it happen, but doesn't call it out as a MUST. Without requiring a bunch of changes to the protobuf spec, that might be the best we can do, but if anyone has a better way to put this, I'm all ears.

yusefnapora · 2019-05-20T19:28:51Z

@Stebalien @arnetheduck @raulk - could you guys help me figure out the resolution to the deterministic encoding problem?

If we definitely want to require a consistent / canonical encoding for peer ids, then I think I should write up a precise spec that requires a certain field ordering, etc. And we can have some tests that ensure your encoder handles edge cases well.

But @Stebalien mentions that, because we're also not guaranteeing a canonical encoding for the key Data field, it's not really relevant at this layer. And that it's fine to have multiple valid peer ids for the same key, which seems contentious.

I can make up some arguments in favor of this view; for one, we can extend the PublicKey message in the future without having to decide now how to encode unknown fields in our "not-quite-protobuf" format. It could also be up to an application to decide whether to reject multiple peer ids that all derive from the same public key. It's also much simpler, of course, since the easiest problem to solve is the one you don't have.

@Stebalien could you elaborate a bit on your view? I think we should figure out if this is a blocker or not to merging the spec.
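To make the concern concrete, here is a small sketch of how two byte-level encodings of the same `PublicKey` message end up with different peer IDs. The field numbers (Type = 1, Data = 2), the key type value `0`, and the 300-byte placeholder key are assumptions for illustration, not details taken from this thread.

```python
import hashlib


def encode_varint(n: int) -> bytes:
    """Encode an unsigned integer as a protobuf varint."""
    out = bytearray()
    while n >= 0x80:
        out.append((n & 0x7F) | 0x80)
        n >>= 7
    out.append(n)
    return bytes(out)


def sha256_multihash(data: bytes) -> bytes:
    """sha2-256 multihash: code 0x12, digest length 0x20, then the digest."""
    return b"\x12\x20" + hashlib.sha256(data).digest()


# Hypothetical stand-in for an encoded key longer than 42 bytes.
key_data = bytes(300)

# Field 1 (Type), wire type 0 (varint); the value 0 is an assumed key type.
type_field = b"\x08" + encode_varint(0)
# Field 2 (Data), wire type 2 (length-delimited).
data_field = b"\x12" + encode_varint(len(key_data)) + key_data

# A protobuf decoder accepts fields in either order, so both byte strings
# decode to the same message, but they hash to different peer IDs.
in_order = type_field + data_field
swapped = data_field + type_field

peer_id_a = sha256_multihash(in_order)
peer_id_b = sha256_multihash(swapped)
```

Here `peer_id_a` and `peer_id_b` differ even though the key is identical, which is the "multiple valid peer ids for the same key" situation described above.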

raulk · 2019-05-20T22:18:41Z

On mobile.

Re: deterministic key serde. I suggest we specify the format as proto3 plus the extra requirements needed to reach a deterministic result (field ordering, no unrecognised fields, no duplicate fields instead of last-wins, etc.). We should add an implementer's note in the form of a SHOULD recommendation to use an OOTB protobuf encoder where possible, and provide test vectors.

For cases where that’s unfeasible, we should provide a boiled down serde spec in BNF or similar form. It’ll be super simple, the schema is so short and constrained we can express the serialised form manually without alluding to proto3 at all.

Re: using unrecognised fields for peer ID calculation, I’d like to hear what use cases you had in mind @Stebalien. In my view, the peer ID should be derived from the pubkey modulo metadata, if any. I don’t think user-defined metadata should yield a different identity. Seems like opening a trivial attack vector for sybils.
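A boiled-down serialization of the kind suggested above might look like the sketch below. It is an assumption-laden illustration, not the format the spec settled on: it assumes a two-field message (Type = 1 as a varint, Data = 2 as bytes), and the function names are hypothetical. The strict decoder enforces field order, rejects unknown or duplicate fields, and rejects non-minimal varints by re-encoding and comparing.

```python
def encode_varint(n: int) -> bytes:
    """Encode an unsigned integer as a minimal-length protobuf varint."""
    if n < 0:
        raise ValueError("varint must be non-negative")
    out = bytearray()
    while n >= 0x80:
        out.append((n & 0x7F) | 0x80)
        n >>= 7
    out.append(n)
    return bytes(out)


def decode_varint(buf: bytes, pos: int):
    """Decode a varint starting at pos; return (value, next position)."""
    result = shift = 0
    while True:
        if pos >= len(buf):
            raise ValueError("truncated varint")
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, pos
        shift += 7


def encode_public_key(key_type: int, data: bytes) -> bytes:
    """Write field 1 (Type) then field 2 (Data), always in that order."""
    return b"\x08" + encode_varint(key_type) + b"\x12" + encode_varint(len(data)) + data


def decode_public_key_strict(buf: bytes):
    """Accept only the exact bytes encode_public_key would have produced."""
    if not buf or buf[0] != 0x08:
        raise ValueError("expected Type field first")
    key_type, pos = decode_varint(buf, 1)
    if pos >= len(buf) or buf[pos] != 0x12:
        raise ValueError("expected Data field second")
    length, pos = decode_varint(buf, pos + 1)
    data = buf[pos:pos + length]
    if len(data) != length or pos + length != len(buf):
        raise ValueError("truncated data or trailing bytes")
    # Re-encoding catches non-minimal varints that parsed successfully.
    if encode_public_key(key_type, data) != buf:
        raise ValueError("non-canonical encoding")
    return key_type, data
```

Test vectors could then be plain (type, data, expected bytes) triples that any implementation checks against, with or without a protobuf library.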

Stebalien · 2019-05-20T23:10:55Z

> In my view, the peer ID should be derived from the pubkey modulo metadata, if any. I don’t think user-defined metadata should yield a different identity.

Note: I'm really not expecting much if any user defined metadata. However, we may want to add new fields in the future and it's hard to do this if we don't include them in the hash.

We (IPFS/IPLD) would like to be able to convert peer IDs to CIDs (and, ideally, fetch keys over bitswap).
Currently, peer IDs always map to exactly one public key. If we both allow other metadata and don't hash it, we'll need to handle merging metadata.
The un-hashed metadata won't be authenticated. When we connect to a peer, they send us their key. We then compute the peer ID of that key and check if it matches the expected one. If we ignore metadata, we could end up with inconsistent metadata views.

> I don’t think user-defined metadata should yield a different identity. Seems like opening a trivial attack vector for sybils.

There are two cases:

1. The key's owner generates multiple identities. This is a non-issue; they can already generate more keys.
2. An attacker generates multiple identities for a victim. We'll detect this on connect because we'll compute the victim's peer ID from the key they give us. Ideally, keys would be self-signed, but our use case for these keys is simple enough that this probably isn't that big of an issue (we should keep using this system for identifying libp2p peers but should come up with something better for identity).
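The on-connect check described above can be sketched as follows. This is a simplified illustration with hypothetical function names; it assumes the remote peer sends its serialized `PublicKey` bytes and that the expected peer ID is available as raw multihash bytes.

```python
import hashlib


def peer_id_from_key(serialized_key: bytes) -> bytes:
    """Derive the peer ID multihash from the exact bytes the peer sent."""
    if len(serialized_key) <= 42:
        # Identity multihash: code 0x00, then a one-byte length (<= 42 fits in 7 bits).
        return bytes([0x00, len(serialized_key)]) + serialized_key
    # sha2-256 multihash: code 0x12, digest length 0x20.
    return bytes([0x12, 0x20]) + hashlib.sha256(serialized_key).digest()


def remote_key_matches(expected_peer_id: bytes, received_key: bytes) -> bool:
    """Recompute the peer ID from the received key and compare it to the one we dialed."""
    return peer_id_from_key(received_key) == expected_peer_id
```

Because the ID is computed over the received bytes without re-serializing, any change to those bytes, including added metadata, yields a different peer ID and the check fails.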

Stebalien · 2019-05-20T23:26:46Z

> But @Stebalien mentions that, because we're also not guaranteeing a canonical encoding for the key Data field, it's not really relevant at this layer. And that it's fine to have multiple valid peer ids for the same key, which seems contentious.

Same Key

IMO, "same key" should mean "same bytes". That is, if I change anything about the bytes of the serialized key, I get a new key.

I'm concerned that not all key formats will have a "canonical" encoding, some libraries may strip certain metadata while others preserve it, etc. This will lead to hard-to-track-down bugs. This has already been a real pain for us in IPLD and our solution there is to avoid re-serializing unless we change something.

I'm unconcerned about having multiple valid peer IDs for the same underlying cryptographic key because there's likely nothing we can do to stop it. I haven't audited our key formats/algorithms, but it's likely possible to make some small changes to a public key and have it continue to work with the private key.

yusefnapora · 2019-05-23T16:10:46Z

@raulk, what do you think of the header in 6c4a587? I think the markdown table is a decent way to present the main status information.

I figure we can hammer out details here and I'll write up a little spec for the header format once we like it.

yusefnapora · 2019-05-23T16:12:16Z

Also, @raulk, @vyzo & @Stebalien I nominated you guys as the Interest Group for this one 😄

Others are welcome to join in if they like

Zolmeister · 2019-05-25T22:19:15Z

I would like to reiterate the value of migrating to an embedded-key, base32-encoded canonical representation of peer IDs.
Ref. #139

yusefnapora · 2019-06-11T15:29:33Z

How do we feel about merging this?

870b71a kicks the deterministic encoding issue down the road by saying a future version of the spec may require more strict encoding than the protobuf spec, but until then, don't extend the PublicKey message and use a protobuf encoder that writes fields in order.

@mgoelzer @Stebalien @raulk @vyzo

folex · 2023-05-04T22:37:58Z

Sorry for commenting on an old PR, but I have not found an answer anywhere else.

What's the motivation for the number 42 here?

> keys that serialize to at most 42 bytes must be hashed using the "identity" multihash codec.

When would one expect keys to be longer than 42 bytes?

Thank you!

Winterhuman · 2023-05-05T13:24:05Z

I believe it's the size of Ed25519 public keys inside the protobuf encoding. Ed25519 keys are supposed to be inlined into peer IDs, so any keypair algorithm with public keys larger than 32 bytes would be encoded as more than 42 bytes and therefore not be inlined.

folex · 2023-05-05T15:38:29Z

uh, so it's 32 + 10 bytes.

30% for serialization is such a big number; I didn't even think about that.

Thank you, that makes it clearer 🙏

mxinden · 2023-05-08T04:58:24Z

And I vaguely remember that 42 would be small enough to then fit into a single DNS segment. Though I might be misremembering that.
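For reference, a rough size check is sketched below. It assumes the `PublicKey` message has just two fields (a varint Type in field 1 and a bytes Data in field 2) and that each field tag takes one byte; the helper names are hypothetical. Under these assumptions the protobuf framing for a short key is 4 bytes, so a 32-byte key would serialize to 36 bytes and 42 would leave a few bytes of headroom.

```python
def varint_len(n: int) -> int:
    """Number of bytes needed to encode n as a varint."""
    length = 1
    while n >= 0x80:
        n >>= 7
        length += 1
    return length


def serialized_len(key_len: int) -> int:
    """Assumed layout: tag + type varint (1 byte each), tag + length varint + key bytes."""
    type_field = 1 + 1
    data_field = 1 + varint_len(key_len) + key_len
    return type_field + data_field


# Largest raw key length that still fits under the 42-byte inline threshold.
max_inline_key_len = max(n for n in range(64) if serialized_len(n) <= 42)
```

With this layout, `serialized_len(32)` is 36 and `max_inline_key_len` is 38, so the exact origin of 42 is not settled by the arithmetic alone.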

docs: add peer id spec

714a6c7

ghost self-assigned this Oct 10, 2018

ghost added the in progress label Oct 10, 2018

ghost requested review from jhiesey and raulk October 10, 2018 16:13

docs: clean up writing

565767a

vyzo reviewed Oct 10, 2018

This was referenced Oct 10, 2018

Add ECDSA public key format libp2p/rust-libp2p#556

Closed

If peer id length <= 42, use identity hashing libp2p/rust-libp2p#555

Closed

Mike Goelzer added 2 commits October 10, 2018 17:40

docs: fix @vyzo comment

902fbfe

docs: syntax highlighting

95c2354

Mike Goelzer added 2 commits October 11, 2018 23:18

Key types should/may

d8459bc

clarify 42 byte rule

e2dfbe2

Stebalien mentioned this pull request Oct 24, 2018

SECIO spec #106

Merged

ghost mentioned this pull request Nov 19, 2018

Specs 2.0 & libp2p book #110

Open

16 tasks

tomaka mentioned this pull request Feb 28, 2019

libp2p TLS 1.3 Handshake #151

Merged

marten-seemann reviewed Mar 12, 2019

raulk reviewed Mar 13, 2019

raulk mentioned this pull request Mar 13, 2019

Phase 0 Networking Specifications ethereum/consensus-specs#763

Merged

yusefnapora added 2 commits March 14, 2019 14:24

remove references to private keys & storage formats

6c318c9

remove links to go impl, add links to specs

878f2fa

ghost assigned yusefnapora Mar 14, 2019

peer ids: language nit

eda2295

Co-Authored-By: Stebalien <steven@stebalien.com>

yusefnapora added 2 commits May 8, 2019 15:10

tweak the description of peer id generation

a7de2f6

Adds "encode to byte array according to rules below" as first step, and makes explicit that we only use the public part of the keypair.

add note about deterministic encoding of PublicKey protobuf

1237100

raulk mentioned this pull request May 9, 2019

[Contest] libp2p+noise: Win a Data Terra Nemo’19 conference ticket! libp2p/go-libp2p#631

Closed

bigs approved these changes May 21, 2019

yusefnapora mentioned this pull request May 22, 2019

libp2p specification framework – lifecycle: maturity level and status #169

Merged

yusefnapora added 4 commits May 22, 2019 14:03

revise note about deterministic encoding

870b71a

update status & generate TOC

d14a44d

fix TOC

10043ec

update status header

6c4a587

use shortcut reference links for authors in header

2ec0867

yusefnapora mentioned this pull request May 27, 2019

libp2p specs framework: document header format #171

Merged

yusefnapora added 2 commits June 19, 2019 09:51

Merge master into feat/peer-ids

5173834

add peer id spec to index

ed01eb1

yusefnapora merged commit 52ef330 into master Jun 20, 2019

yusefnapora deleted the feat/peer-ids branch June 20, 2019 14:14

jacobcoro mentioned this pull request Dec 15, 2020

Support more Identity key pairs and signing algorithms textileio/go-threads#474

Open

Karrenbelt mentioned this pull request Mar 15, 2022

ACN keys valory-xyz/open-aea#85

Merged
