Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

docs: add peer id spec #100

Merged
merged 24 commits into from Jun 20, 2019

Conversation

@mgoelzer
Copy link
Contributor

commented Oct 10, 2018

This updates the peer ID spec to explain what keypairs are supported and how peer IDs are encoded for each key type. Thanks to @Stebalien for figuring this out with me.

@ghost ghost assigned mgoelzer Oct 10, 2018

@ghost ghost added the in progress label Oct 10, 2018

@mgoelzer mgoelzer requested review from jhiesey and raulk Oct 10, 2018

@mgoelzer

This comment has been minimized.

Copy link
Contributor Author

commented Oct 10, 2018

In the spirit of learning and spreading knowledge, I have chosen @jhiesey and @raulk as reviewers of this PR.

peer-ids/peer-ids.md Outdated Show resolved Hide resolved
peer-ids/peer-ids.md Outdated Show resolved Hide resolved

mgoelzer added some commits Oct 10, 2018

@Stebalien

This comment has been minimized.

Copy link
Contributor

commented Oct 11, 2018

We should probably say: Implementations SHOULD support RSA and Ed25519. Implementations MAY support Secp256k1 and ECDSA but nodes using those keys may not be able to connect to all nodes.

@marten-seemann

This comment has been minimized.

Copy link
Contributor

commented Oct 11, 2018

+1 for using normative language

mgoelzer added some commits Oct 11, 2018

@Stebalien Stebalien referenced this pull request Oct 24, 2018
@mgoelzer mgoelzer referenced this pull request Nov 19, 2018
2 of 16 tasks complete
@whyrusleeping

This comment has been minimized.

Copy link
Contributor

commented Jan 11, 2019

Any update here? I'd love to have some specs around here to link to ;)

@tomaka tomaka referenced this pull request Feb 28, 2019
@marten-seemann
Copy link
Contributor

left a comment

In the interest of getting our specs PRs merged, I did a review of this one. Overall, this LGTM, I just think we can delete a few sentences that specify implementation details.

## Keys


Our key pairs are stored on disk using a simple protobuf defined in [libp2p/go-libp2p-crypto/pb/crypto.proto#L5](https://github.com/libp2p/go-libp2p-crypto/blob/master/pb/crypto.proto#L5):

This comment has been minimized.

Copy link
@marten-seemann

marten-seemann Mar 12, 2019

Contributor
  1. The specs shouldn't link to code, it should be the other way round.
  2. How keys are stored on discs doesn't need to be specified, it's an implementation decision. We only need to specify things that effect the interoperability of implements.
3. If the length of the serialized bytes <= 42, then we compute the "identity" multihash of the serialized bytes. In other words, no hashing is performed, but the [multihash format is still followed](https://github.com/multiformats/multihash) (byte plus varint plus serialized bytes). The idea here is that if the serialized byte array is short enough, we can fit it in a multihash proto without having to condense it using a hash function.
4. If the length is >42, then we hash it using it using the SHA256 multihash.

For more information, refer to this block in [libp2p/go-libp2p-peer/peer.go](https://github.com/libp2p/go-libp2p-peer/blob/master/peer.go):

This comment has been minimized.

Copy link
@marten-seemann

marten-seemann Mar 12, 2019

Contributor

Same here. I think the text already describes the logic pretty well, so we don't need to cite this comment.

This comment has been minimized.

Copy link
@raulk

raulk Mar 13, 2019

Member

Agree.


Implementations SHOULD support RSA and Ed25519. Implementations MAY support Secp256k1 and ECDSA, but nodes using those keys may not be able to connect to all other nodes.

Keys are passed around in code as byte arrays. Keys are encoded within these arrays differently depending on the type of key.

This comment has been minimized.

Copy link
@marten-seemann

marten-seemann Mar 12, 2019

Contributor

That seems like an implementation decision. Remove this sentence?


To sign a message, we first hash it with SHA-256 and then sign it using the RSASSA-PKCS1-V1.5-SIGN from RSA PKCS#1 v1.5.

See [libp2p/go-libp2p-crypto/rsa.go](https://github.com/libp2p/go-libp2p-crypto/blob/master/rsa.go) for details

This comment has been minimized.

Copy link
@marten-seemann

marten-seemann Mar 12, 2019

Contributor

Same here.


Ed25519 signatures follow the normal Ed25519 standard.

See [libp2p/go-libp2p-crypto/ed25519.go](https://github.com/libp2p/go-libp2p-crypto/blob/master/ed25519.go) for details

This comment has been minimized.

Copy link
@marten-seemann

marten-seemann Mar 12, 2019

Contributor

And here.


To sign a message, we hash the message with SHA 256, and then sign it with the ECDSA standard algorithm, then we encode it using DER-encoded ASN.1.

See [libp2p/go-libp2p-crypto/ecdsa.go](https://github.com/libp2p/go-libp2p-crypto/blob/master/ecdsa.go) for details.

This comment has been minimized.

Copy link
@marten-seemann

marten-seemann Mar 12, 2019

Contributor

And here.


To sign a message, we hash the message with SHA 256, then sign it using the standard Bitcoin EC signature algorithm (BIP0062), and then use standard Bitcoin encoding.

See [libp2p/go-libp2p-crypto/secp256k1.go](https://github.com/libp2p/go-libp2p-crypto/blob/master/secp256k1.go) for details.

This comment has been minimized.

Copy link
@marten-seemann

marten-seemann Mar 12, 2019

Contributor

And here.

peer-ids/peer-ids.md Show resolved Hide resolved

We do not do any special additional encoding for Ed25519 public keys.

The encoding for Ed25519 private keys is a little unusual. There are two formats that we encourage implementors to support:

This comment has been minimized.

Copy link
@marten-seemann

marten-seemann Mar 12, 2019

Contributor

This seems like an implementation decision, so we probably don't need to specify it.

This comment has been minimized.

Copy link
@Stebalien

Stebalien Mar 15, 2019

Contributor

Not entirely. We do want users to be able to port keys from one implementation to another.

}
```

As should be apparent from the above code block, this proto simply encodes for transmission a public/private key pair along with an enum specifying the type of keypair.

This comment has been minimized.

Copy link
@marten-seemann

marten-seemann Mar 12, 2019

Contributor

Is there any situation where we want to transmit the PrivateKey? That seems... dangerous. If not, we don't need to specify the PrivateKey here at all.

This comment has been minimized.

Copy link
@raulk

raulk Mar 13, 2019

Member

Yeah, storage of private key is implementation specific, so no need to cover them in this doc I think.

This comment has been minimized.

Copy link
@Stebalien

Stebalien Mar 15, 2019

Contributor

Unfortunately, users do need to be able to take their private keys with them (especially because we use these for things like IPNS).

This comment has been minimized.

Copy link
@yusefnapora

yusefnapora Mar 15, 2019

Contributor

It's true that removing the private key format from this doc leaves a gap. We still need to specify somewhere how we handle them.

We could bring back the private key references and add a call-out at the top of the doc that they're not related to peer-id calculation and are shown for reference.

This comment has been minimized.

Copy link
@Stebalien

Stebalien Mar 15, 2019

Contributor

Really, we should probably rename this doc to the "libp2p key spec" and make peer ID calculation a part of that.

This comment has been minimized.

Copy link
@yusefnapora

yusefnapora Mar 15, 2019

Contributor

👍 for that

peer-ids/peer-ids.md Outdated Show resolved Hide resolved
1. Encode the public key into the protobuf.
2. Serialize the protobuf containing the public key into bytes using the [canonical protobuf encoding](https://developers.google.com/protocol-buffers/docs/encoding).
3. If the length of the serialized bytes <= 42, then we compute the "identity" multihash of the serialized bytes. In other words, no hashing is performed, but the [multihash format is still followed](https://github.com/multiformats/multihash) (byte plus varint plus serialized bytes). The idea here is that if the serialized byte array is short enough, we can fit it in a multihash proto without having to condense it using a hash function.
4. If the length is >42, then we hash it using it using the SHA256 multihash.

This comment has been minimized.

Copy link
@raulk

raulk Mar 13, 2019

Member

We should say something about how these are commonly represented as strings: base58btc encoding raw, without using multibase.

This comment has been minimized.

Copy link
@yusefnapora

yusefnapora Mar 15, 2019

Contributor

I added a bit about base58btc, but didn't mention multibase, since we hadn't defined it yet in the doc. Should I bring it up? I think if people are likely to expect Peer Ids to use multibase we should clarify.

3. If the length of the serialized bytes <= 42, then we compute the "identity" multihash of the serialized bytes. In other words, no hashing is performed, but the [multihash format is still followed](https://github.com/multiformats/multihash) (byte plus varint plus serialized bytes). The idea here is that if the serialized byte array is short enough, we can fit it in a multihash proto without having to condense it using a hash function.
4. If the length is >42, then we hash it using it using the SHA256 multihash.

For more information, refer to this block in [libp2p/go-libp2p-peer/peer.go](https://github.com/libp2p/go-libp2p-peer/blob/master/peer.go):

This comment has been minimized.

Copy link
@raulk

raulk Mar 13, 2019

Member

Agree.

peer-ids/peer-ids.md Outdated Show resolved Hide resolved
}
```

As should be apparent from the above code block, this proto simply encodes for transmission a public/private key pair along with an enum specifying the type of keypair.

This comment has been minimized.

Copy link
@raulk

raulk Mar 13, 2019

Member

Yeah, storage of private key is implementation specific, so no need to cover them in this doc I think.

@raulk

This comment has been minimized.

Copy link
Member

commented Mar 13, 2019

@yusefnapora if you wanna do some spec herding, this is a quick win I think. Pretty good consensus.

@yusefnapora

This comment has been minimized.

Copy link
Contributor

commented Mar 14, 2019

I did a quick edit to remove references to private keys and serialization on disk. I also removed links to go code and added some links to e.g. the RSA signing spec, etc.

@mgoelzer do you mind if I push changes to this branch? I put up a new one here with the edits: https://github.com/libp2p/specs/blob/edit/peer-ids/peer-ids/peer-ids.md

but it might be easier to discuss if I drop the commits here.

@raulk

This comment has been minimized.

Copy link
Member

commented Mar 14, 2019

@yusefnapora gonna jump in and say yes. In the interest of moving forward, push to the branch in this PR. Thanks!

@ghost ghost assigned yusefnapora Mar 14, 2019

peer ids: language nit
Co-Authored-By: Stebalien <steven@stebalien.com>

@ghost ghost assigned raulk Mar 27, 2019

@yusefnapora

This comment has been minimized.

Copy link
Contributor

commented May 8, 2019

Sorry for sleeping on this for a while everyone :)

I added a few commits to address some feedback. I think the most significant is the note about deterministic protobuf encoding. It basically says determinism is "desirable" and you should try to make it happen, but doesn't call it out as a MUST. Without requiring a bunch of changes to the protobuf spec, that might be the best we can do, but if anyone has a better way to put this, I'm all ears.

@yusefnapora

This comment has been minimized.

Copy link
Contributor

commented May 20, 2019

@Stebalien @arnetheduck @raulk - could you guys help me figure out the resolution to the deterministic encoding problem?

If we definitely want to require a consistent / canonical encoding for peer ids, then I think I should write up a precise spec that requires a certain field ordering, etc. And we can have some tests that ensure your encoder handles edge cases well.

But @Stebalien mentions that, because we're also not guaranteeing a canonical encoding for the key Data field, it's not really relevant at this layer. And that it's fine to have multiple valid peer ids for the same key, which seems contentious.

I can make up some arguments in favor of this view; for one, we can extend the PublicKey message in the future without having to decide now how to encode unknown fields in our "not-quite-protobuf" format. It could also be up to an application to decide whether to reject multiple peer ids that all derive from the same public key. It's also much simpler, of course, since the easiest problem to solve is the one you don't have.

@Stebalien could you elaborate a bit on your view? I think we should figure out if this is a blocker or not to merging the spec.

@raulk

This comment has been minimized.

Copy link
Member

commented May 20, 2019

On mobile.

Re: deterministic key serde. I suggest we specify the format as proto3 + the extra requirements to reach a deterministic result (ordering, no unrecognised fields, no duplicates last wins, etc.) We should add an implementers note in the form of a SHOULD recommendation to use an OOTB protobuf encoder, where possible, and provide test vectors.

For cases where that’s unfeasible, we should provide a boiled down serde spec in BNF or similar form. It’ll be super simple, the schema is so short and constrained we can express the serialised form manually without alluding to proto3 at all.

Re: using unrecognised fields for peer ID calculation, I’d like to hear what use cases you had in mind @Stebalien. In my view, the peer ID should be derived from the pubkey modulo metadata, if any. I don’t think user-defined metadata should yield a different identity. Seems like opening a trivial attack vector for sybils.

@Stebalien

This comment has been minimized.

Copy link
Contributor

commented May 20, 2019

In my view, the peer ID should be derived from the pubkey modulo metadata, if any. I don’t think user-defined metadata should yield a different identity.

Note: I'm really not expecting much if any user defined metadata. However, we may want to add new fields in the future and it's hard to do this if we don't include them in the hash.

  1. We (IPFS/IPLD) would like to to be able to convert peer IDs to CIDs (and, ideally, fetch keys over bitswap).
  2. Currently, peer IDs always map to exactly one public key. If we both allow other metadata and don't hash it, we'll need to handle merging metadata.
  3. The un-hashed metadata won't be authenticated. When we connect to a peer, they send us their key. We then compute the peer ID of that key and check if it matches the expected one. If we ignore metadata, we could end up with inconsistent metadata views.

I don’t think user-defined metadata should yield a different identity. Seems like opening a trivial attack vector for sybils.

There are two cases:

  • The key's owner generates multiple identities. This is a non-issue, they can already generate more keys.
  • An attacker generates multiple identities for a victim: We'll detect this on connect because we'll compute the victim's peer ID from the key they give us. Ideally, keys would be self signed but our use-case for these keys is simple enough that this probably isn't that big of an issue (we should keep using this system for identifying libp2p peers but should come up with something better for identity).
@Stebalien

This comment has been minimized.

Copy link
Contributor

commented May 20, 2019

But @Stebalien mentions that, because we're also not guaranteeing a canonical encoding for the key Data field, it's not really relevant at this layer. And that it's fine to have multiple valid peer ids for the same key, which seems contentious.

Same Key

IMO, "same key" should mean "same bytes". That is, if I change anything about the bytes of the serialized key, I get a new key.

I'm concerned that not all key formats will have a "canonical" encoding, some libraries may strip certain metadata while others preserve it, etc. This will lead to hard-to-track-down bugs. This has already been a real pain for us in IPLD and our solution there is to avoid re-serializing unless we change something.

My unconcern with having multiple valid peer IDs for the same underlying cryptographic key is that there's likely nothing we can do to stop this. I haven't audited our key formats/algorithms but it's likely possible to make some small changes to a public key and have it continue to work with the private key.

@bigs

bigs approved these changes May 21, 2019

@yusefnapora

This comment has been minimized.

Copy link
Contributor

commented May 23, 2019

@raulk, what do you think of the header in 6c4a587 - I think the markdown table is a decent way to present the main status information.

I figure we can hammer out details here and I'll write up a little spec for the header format once we like it.

@yusefnapora

This comment has been minimized.

Copy link
Contributor

commented May 23, 2019

Also, @raulk, @vyzo & @Stebalien I nominated you guys as the Interest Group for this one 😄

Others are welcome to join in if they like

@Zolmeister

This comment has been minimized.

Copy link

commented May 25, 2019

I would like to reiterate the value in migrating to embedded-key base32 encoded canonical representation of PeerIDs
Ref. #139

@yusefnapora

This comment has been minimized.

Copy link
Contributor

commented Jun 11, 2019

How do we feel about merging this?

870b71a kicks the deterministic encoding issue down the road by saying a future version of the spec may require more strict encoding than the protobuf spec, but until then, don't extend the PublicKey message and use a protobuf encoder that writes fields in order.

@mgoelzer @Stebalien @raulk @vyzo

@yusefnapora yusefnapora merged commit 52ef330 into master Jun 20, 2019

@yusefnapora yusefnapora deleted the feat/peer-ids branch Jun 20, 2019

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.