add a WebTransport spec #404

Merged
merged 25 commits into from
Oct 12, 2022

Conversation

@marten-seemann (Contributor) commented Apr 4, 2022

This PR adds a basic spec for WebTransport, based on https://pl-strflt.notion.site/Enabling-WebTransport-across-the-libp2p-network-c6849c7252a1469a828a6c3e4b6abcad.

Most important things to watch out for when reviewing:

  1. The multiaddr format: The multiaddr contains the hashes of the certificates used, e.g. /ip4/1.2.3.4/udp/443/quic/webtransport/<hash1><hash2>. Is this a sane construction?
  2. We need to run a libp2p handshake on the WebTransport connection to verify peer IDs.

UPDATE 2022-07-26: Updated the link to the Notion page. I was mistaken in believing that the previous link was already public.


## Securing Streams

All streams other than the stream used for the security handshake are protected using Salsa20. Two (symmetric) keys are derived from the master secret established during the handshake, one for sending and one for receiving on the stream.

It's unfortunate that there's no concept of the client identity at the transport level, so that we don't need an additional wrapping of encryption.

We should see if that's an API that can be proposed at the spec level.

Member

We don't need that and we don't need double encryption. We can sufficiently identify the connection with the server certificate. Basically:

  1. The client knows it's talking to a server with a key that matches the certificate hash.
  2. The client then sends a challenge of (cert_hash, client_pid, server_pid, client_salt)
  3. The server responds with server_salt, SIG(HASH(cert_hash || client_pid || server_pid || client_salt || server_salt)) to authenticate their end.
  4. The client verifies this, then signs the same thing (maybe it signs the signature too?).
  • In step 3, the server ties their peer ID to the connection.
  • In step 4, the client ties their peer ID to the connection.
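As a rough illustration of steps 2 to 4, the sketch below computes the value that both sides would sign. It is only a sketch under assumptions: SHA-256 as the hash, a 4-byte length prefix per field so the concatenation is unambiguous, and placeholder byte strings for the peer IDs. None of these choices come from the proposal itself, and the signing step with the peers' identity keys is left out.

```python
import hashlib
import os


def transcript_digest(cert_hash: bytes, client_pid: bytes, server_pid: bytes,
                      client_salt: bytes, server_salt: bytes) -> bytes:
    """HASH(cert_hash || client_pid || server_pid || client_salt || server_salt)."""
    h = hashlib.sha256()
    for field in (cert_hash, client_pid, server_pid, client_salt, server_salt):
        # Assumption of this sketch: length-prefix every field so that
        # different field splits can never produce the same byte string.
        h.update(len(field).to_bytes(4, "big"))
        h.update(field)
    return h.digest()


# Hypothetical example values; real peer IDs and certificates look different.
cert_hash = hashlib.sha256(b"server certificate (DER)").digest()
client_pid = b"client-peer-id"
server_pid = b"server-peer-id"
client_salt = os.urandom(16)
server_salt = os.urandom(16)

# Both ends derive the same digest when they agree on every input.
server_view = transcript_digest(cert_hash, client_pid, server_pid,
                                client_salt, server_salt)
client_view = transcript_digest(cert_hash, client_pid, server_pid,
                                client_salt, server_salt)

# A party terminating TLS with another certificate has a different cert_hash,
# so a signature over its digest would not verify on the client.
other_cert_hash = hashlib.sha256(b"attacker certificate (DER)").digest()
mitm_view = transcript_digest(other_cert_hash, client_pid, server_pid,
                              client_salt, server_salt)
```

The server would sign `server_view` with its identity key in step 3, and the client would sign the same digest in step 4.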

Contributor Author

  1. The client knows it's talking to a server with a key that matches the certificate hash.

That is correct iff it received the certificate hash in a signed peer record. Otherwise, the MITM could have generated a multiaddr containing its own cert hash.
Is that something we can require? Does js-libp2p support signed peer records?

  1. The client then sends a challenge of (cert_hash, client_pid, server_pid, client_salt)

Note: Unfortunately, the client doesn't know which cert_hash was used by WebTransport. It only knows that one of the hashes was considered valid, so it would have to send all of the cert hashes to the server.

  1. The server responds with server_salt, SIG(HASH(cert_hash || client_pid || server_pid || client_salt || server_salt)) to authenticate their end.

It is crucial that the server validates that it owns all cert hashes, otherwise a MITM attack is trivial. Furthermore (unless the certificate hash was signed in step 1), the client has to trust that the server actually performs this verification. Note that this kind of trust is not necessary when using TLS or Noise.

Here's a diagram of that handshake:
(image: handshake diagram)

Contributor Author

@Kubuxu Would you mind taking a look at this proposal (the handshake that @Stebalien suggested in particular)?

Member

That is correct iff it received the certificate hash in a signed peer record. Otherwise, the MITM could have generated a multiaddr containing its own cert hash.

It doesn't need to be in a signed peer record, the server signs (and, as you noted, validates) this in step 2.

Note: Unfortunately, the client doesn't know which cert_hash was used by WebTransport. It only knows that one of the hashes was considered valid, so it would have to send all of the cert hashes to the server.

Makes sense.

It is crucial that the server validates that it owns all cert hashes, otherwise a MITM attack is trivial. Furthermore (unless the certificate hash was signed in step 1), the client has to trust that the server actually performs this verification. Note that this kind of trust is not necessary when using TLS or Noise.

Yes, the server needs to check it. But it's definitely not the client's problem if the server doesn't check. The server could just as easily leak their key.

Contributor Author

It doesn't need to be in a signed peer record, the server signs (and, as you noted, validates) this in step 2.

Fair enough. As long as the client trusts the server to perform this check, we're good. What makes me a little bit nervous is that the client on its own can't verify that the connection isn't MITM'ed, which is different from the security guarantees you get from a regular TLS handshake. We probably can live with it (and as you say, the server could leak the key anyway), but it is a somewhat weaker security guarantee.

@Kubuxu (Member) commented Apr 13, 2022

Let's set goals for this protocol: it needs to bind the current channel and both PeerIDs.
TLS provides facilities for this by means of ExportKeyingMaterial. Its result is unique per established channel, so if someone were to try to MitM the session, the ExportKeyingMaterial result would be different on either side.

I think, but I need someone else to check it, we can just do:
Both sides compute: ekm = ExportKeyingMaterial("libp2p-auth"), ekm is now the same shared secret on both sides, if they are using the same channel.
Then the Client sends the signature Signature(ekm || ServerPeerID, ClientPrivKey) and its pubkey. Server does the same, sends Signature(ekm || ClientPeerID, ServerPrivKey) and its pubkey.

To verify, both sides check that the pubkeys match the PeerIDs and that the received signature covers ekm || MyPeerID. The ekm is never transmitted over the wire, and MyPeerID is a local value that is not transferred either.

If either side uncovers an invalid signature, they terminate the connection.
If signature checks pass, the connection is assumed secure.

This binds the WebTransport TLS channel to both PeerIDs.
Binding to ekm prevents MitM and binding to other side PeerID prevents impersonation.
The local side PeerID binding is not necessary as that is achieved by the signature.


The question is whether something like ExportKeyingMaterial is available on the browser end. I think the answer is yes.

EDIT: add other side PeerID binding.

Member

PubKeys can also be exchanged out of band, IIRC libp2p has some other facilities for it.

@Kubuxu (Member) commented Apr 13, 2022

ExportKeyingMaterial is unavailable on the browser side making my proposal moot.

Member

I think @Stebalien's proposal can work. It's just that there are a lot of details to work out that I don't feel comfortable deciding on (in essence we are rolling our own secure channel protocol, just without encryption).

An alternative to that proposal could be to reuse the existing implementation of the Noise secure channel to establish a secure session within WebTransport, for the purpose of co-authentication and authentication of the cert_hash.
The proposal would be to start a libp2p-noise channel over WebTransport and add cert_hash into the Data field of HandshakePayload. libp2p-noise would need a way to accept a predicate that is asserted on the Data field.

I think this is the lowest friction and risk solution.


@marten-seemann (Contributor Author)

I updated the spec, incorporating the ideas of @Stebalien and @Kubuxu for securing the connection and avoiding double-encryption:

  1. I introduced a /certhash multiaddr component, which can be repeated to transmit multiple certificate hashes.
  2. We now use Noise to perform the libp2p handshake, and transmit the list of certificate hashes the client was willing to accept as (unencrypted, but integrity-protected) payload of the first handshake message.
  3. The server verifies that it possesses certificates matching ALL of the certificate hashes.
  4. This also means that we won't be able to accept CA-signed certificates any more, as the Noise handshake described above is needed to tie the outer (WebTransport, TLS) connection to the Noise handshake. This simplifies the spec.
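To make the repeated `/certhash` component concrete, here is a minimal sketch that splits the string form of such a multiaddr into its base address and the list of certificate hashes. The function name `split_certhashes` is hypothetical, and a real implementation would work on a proper multiaddr parser rather than on string splitting.

```python
from typing import List, Tuple


def split_certhashes(multiaddr: str) -> Tuple[str, List[str]]:
    """Return (address without certhash components, list of certhash values)."""
    parts = multiaddr.strip("/").split("/")
    base: List[str] = []
    hashes: List[str] = []
    i = 0
    while i < len(parts):
        if parts[i] == "certhash":
            # Every /certhash component must be followed by its value.
            if i + 1 >= len(parts):
                raise ValueError("certhash component without a value")
            hashes.append(parts[i + 1])
            i += 2
        else:
            base.append(parts[i])
            i += 1
    return "/" + "/".join(base), hashes


# Placeholder hashes, as in the examples used in this PR.
addr = "/ip4/1.2.3.4/udp/443/quic/webtransport/certhash/<hash1>/certhash/<hash2>"
base, hashes = split_certhashes(addr)
```

The client would hand `hashes` to the WebTransport API as the set of acceptable certificates and later send the same list in the Noise handshake.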

@Kubuxu (Member) commented Apr 22, 2022

If we really wanted we could accept CA-signed certs but as self-signed certs are the primary use case, we might as well use just self-signed.


On receipt of the `e` message, the server MUST verify the list of certificate hashes. If the list is empty, it MUST fail the handshake. For every certificate in the list, it checks if it possesses a certificate with the corresponding hash. If so, it continues with the handshake. However, if there is even a single certificate hash in the list that it cannot associate with a certificate, it MUST abort the handshake.
Member

Thinking about it more, there is a small issue with it.
It would be trivial for me to make any node impossible to connect to via WebTransport by, as a 3rd party, announcing an additional multiaddr with some random certhash.

I don't have a good solution for this, especially if WebTransport browser APIs don't give us any feedback on which certhash was used.

Contributor Author

Not sure if I understand the attack you're suggesting. You're saying that an attacker can add a /certhash/<attacker controlled hash> to a WebTransport multiaddr, right? How is it different from an attacker modifying a TCP or QUIC address, changing the port number? That would prevent the victim from establishing a TCP / QUIC connection to that node as well.

I guess the answer to that is: the only multiaddr you can actually trust is an address transferred in a signed peer record. For all other addresses, the best you can do is try to connect and see what happens.

Member

I guess it is no different than specifying a bad port on the same IP address for a given peer id.

In the face of this, we have to ensure that multiaddrs won't be merged on the client side.
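For reference, the server-side rule quoted at the top of this thread (fail on an empty list, fail on any hash the server cannot associate with a certificate) could look roughly like the sketch below. The function name `server_accepts` and the string hashes are hypothetical. The last call shows the concern raised above: one foreign hash mixed into the list is enough to make the handshake fail.

```python
from typing import Iterable, Set


def server_accepts(received_hashes: Iterable[str], owned_hashes: Set[str]) -> bool:
    """Apply the server-side check on the client's certificate hash list."""
    received = list(received_hashes)
    if not received:
        # An empty list MUST fail the handshake.
        return False
    # Every single hash must belong to a certificate the server possesses.
    return all(h in owned_hashes for h in received)


# Hypothetical hashes of the server's current and next certificate.
owned = {"hash-current", "hash-next"}

ok = server_accepts(["hash-current", "hash-next"], owned)
empty = server_accepts([], owned)
# A third party announced an address with a random extra certhash and the
# client merged it into the same list: the server has to abort.
poisoned = server_accepts(["hash-current", "hash-random"], owned)
```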

@SgtPooki (Member)

I introduced a /certhash multiaddr component, which can be repeated to transmit multiple certificate hashes.

Can we convert the '/certhash' pairs into an IPLD structure and do something like '/certhashs/<ipld_cid>' or something similar?

@MarcoPolo (Contributor)

I introduced a /certhash multiaddr component, which can be repeated to transmit multiple certificate hashes.

Can we convert the '/certhash' pairs into an IPLD structure and do something like '/certhashs/<ipld_cid>' or something similar?

I don't think we'd want to do that since then we'd have to fetch the certhash some way (since we only have the CID). Also I'm not sure what we would gain from the extra overhead of having a struct here. We also don't want to introduce an IPLD dependency for every implementation that wants to add this.

Is there something I'm missing?

@SgtPooki (Member) commented May 9, 2022

Is there something I'm missing?

The CID shouldn't require lookup if it's an identity CID, other than that, probably not? ¯\_(ツ)_/¯

@marten-seemann (Contributor Author)

Thank you everyone for the great discussions on this PR!

We've shipped (experimental) WebTransport support based on this specification in go-libp2p v0.23.

I've resolved all remaining nits in the specification. Would be grateful for another round of review and many approvals :)


## Certificates

Since most libp2p nodes don't possess a TLS certificate signed by a Certificate Authority, servers use self-signed certificates. According to the [W3C WebTransport specification](https://www.w3.org/TR/webtransport/), the validity of the certificate MUST be at most 14 days, and must not use an RSA key. Nodes then include the hash of one (or more) certificates in their multiaddr (see [Addressing](#addressing)).
Contributor

MUST, it's part of the webtransport spec: https://www.w3.org/TR/webtransport/#web-transport-configuration

Contributor Author

It's "MUST NOT", not "MUST not" ;)

This entire section needs a rewrite though. It is possible to use a CA-signed certificate.

Contributor Author

I rewrote this section. Please take another look.
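The two constraints discussed in this thread for certificates that are verified by hash (validity of at most 14 days, no RSA key) can be expressed as a small check. This is only a sketch: the function name `certificate_ok` and the plain-string key type are assumptions, and a real implementation would read these fields from the parsed X.509 certificate.

```python
from datetime import datetime, timedelta, timezone

# Maximum validity period named in the quoted spec text.
MAX_VALIDITY = timedelta(days=14)


def certificate_ok(not_before: datetime, not_after: datetime, key_type: str) -> bool:
    """Check the validity window and key type of a hash-verified certificate."""
    if not_after <= not_before:
        return False
    if not_after - not_before > MAX_VALIDITY:
        return False
    # RSA keys are not allowed for certificates verified by hash.
    return key_type.lower() != "rsa"


start = datetime(2022, 10, 1, tzinfo=timezone.utc)
ok = certificate_ok(start, start + timedelta(days=14), "ecdsa")
too_long = certificate_ok(start, start + timedelta(days=15), "ecdsa")
rsa_key = certificate_ok(start, start + timedelta(days=7), "RSA")
```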

Comment on lines 16 to 18
WebTransport is a way for browsers to establish a stream-multiplexed and bidirectional connection to servers using QUIC.

The WebTransport protocol is currently under development at the IETF. Chrome has implemented and shipped support for [draft-02](https://datatracker.ietf.org/doc/draft-ietf-webtrans-http3/), and Firefox [is working](https://bugzilla.mozilla.org/show_bug.cgi?id=1709355) on WebTransport support.
Contributor

If I understand it correctly, when we talk about the WebTransport protocol in this doc, we actually mean the WebTransport over HTTP-3 protocol as described in draft-ietf-webtrans-http3.
WebTransport as specified in draft-ietf-webtrans-overview is a protocol framework and not a single protocol. I think we should add a note/ sentence on that. At least for me it was a bit confusing.

Contributor Author

Yeah. Not the best naming from the IETF's side here...
Everybody thinks of WebTransport over HTTP/3 when they hear WebTransport, but then there's the HTTP/2 fallback that not a lot of people care about to begin with...

I've added some text (and links) to clarify things. Let me know if that's less confusing now.

Comment on lines 33 to 34
* `/ip4/1.2.3.4/udp/443/quic/webtransport/certhash/<hash1>`
* `/ip6/fe80::1ff:fe23:4567:890a/udp/1234/quic/webtransport/certhash/<hash1>/certhash/<hash2>/certhash/<hash3>`
@elenaf9 (Contributor) commented Sep 29, 2022

Why do we call it /certhash, instead of just adding a more general /multihash? The latter could then also be used in the future by other protocols that need to include a multihash in the address for whatever reason.

Contributor

There's some discussion here: multiformats/multiaddr#130. But I think the important part is that we can be a bit more specific here to highlight that this is a certificate hash rather than a hash for some other (?) thing. We also don't have a use case for a generic hash, but do have this use case for the certificate hash.

Contributor

Vote for a more specific field name too. It consumes less encoding space and allows easier extension.

Contributor Author

I think that being specific pays off here, by making us more future-proof.
Imagine we want to add another hash-like component to a multiaddr in the future. We'd now use /hash for the certificate hash and for that new thing. How would we know which of these is a certificate hash and which of them the new thing?


@julian88110 (Contributor) left a comment

Good write-up. Some comments and questions in line.


In order to verify end-to-end encryption of the connection, the peers need to establish that no MITM intercepted the connection. To do so, the server MUST include the certificate hash of the currently used certificate as well as the certificate hashes of all future certificates it has already advertised to the network in the `webtransport_certhashes` Noise extension (see Noise Extension section of the [Noise spec](/noise/README.md)). The hash of recently used, but expired certificates SHOULD also be included.

On receipt of the `webtransport_certhashes` extension, the client MUST verify that the certificate hash of the certificate that was used on the connection is contained in the server's list. If the client was willing to accept multiple certificate hashes, but cannot determine which certificate was actually used to establish the connection (this will commonly be the case for browser clients), it MUST verify that all certificate hashes are contained in the server's list. If verification fails, it MUST abort the handshake.
Contributor

Is it necessary to specify how the peer should detect and retire expired certificates? Or are WebTransport connections short-lived enough that expiration is usually not an issue?
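The client-side rule in the quoted paragraph has two cases, depending on whether the client knows which certificate was used. A minimal sketch, with a hypothetical function name `client_accepts` and string hashes standing in for the real encoded values:

```python
from typing import Iterable, Optional, Set


def client_accepts(server_list: Set[str], accepted: Iterable[str],
                   used: Optional[str] = None) -> bool:
    """Verify the hashes from the webtransport_certhashes extension."""
    if used is not None:
        # The client knows which certificate secured the connection.
        return used in server_list
    # Browser case: the used certificate is unknown, so every hash the
    # client was willing to accept must appear in the server's list.
    accepted = list(accepted)
    return bool(accepted) and all(h in server_list for h in accepted)


# Hypothetical list sent by the server: current, future and recently expired.
server_list = {"hash-current", "hash-next", "hash-expired"}

known_used = client_accepts(server_list, ["hash-current"], used="hash-current")
browser_ok = client_accepts(server_list, ["hash-current", "hash-next"])
browser_bad = client_accepts(server_list, ["hash-current", "hash-foreign"])
```

Including recently expired hashes in `server_list`, as the quoted text suggests with SHOULD, keeps clients with slightly stale addresses from failing this check.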

## Security Handshake

Unfortunately, the self-signed certificate doesn't allow the nodes to authenticate each others' peer IDs. It is therefore necessary to run an additional libp2p handshake on a newly established WebTransport connection.
The first stream that the client opens on a new WebTransport session is used to perform a libp2p handshake using Noise (https://github.com/libp2p/specs/tree/master/noise). The client SHOULD start the handshake right after sending the CONNECT request, without waiting for the server's response.
Contributor

This is just a curious question regarding handshake starting right after Connect. Is connect referring to establishing a socket level connection?

Contributor Author

This is the WebTransport CONNECT request as described in https://datatracker.ietf.org/doc/html/draft-ietf-webtrans-http3-03#section-3.2.

@libp2p libp2p deleted a comment from elenaf9 Sep 30, 2022
marten-seemann and others added 4 commits September 30, 2022 01:49
Co-authored-by: Elena Frank <elena.frank@protonmail.com>
Co-authored-by: Elena Frank <elena.frank@protonmail.com>
@marten-seemann (Contributor Author)

Thank you for this in-depth review, @elenaf9!

I rewrote the "Certificates" section to allow the use of CA-signed certificates.

@marten-seemann (Contributor Author)

Thank you to everyone involved here! This was a big effort, and we now have a decent spec, implemented and released by go-libp2p, with another implementation in js-libp2p demonstrating interoperability.

Given that this PR has received 2 approvals from members of the libp2p team at PL, I'm going to merge it now.
This does NOT mean that this spec is perfect - both editorial as well as protocol changes are still possible via PRs to this repo.

@marten-seemann marten-seemann merged commit 12f9a31 into master Oct 12, 2022
@marten-seemann marten-seemann deleted the webtransport branch October 12, 2022 19:21
@mxinden (Member) commented Oct 12, 2022

🚀 big step for the project!

Small nit, can you add an entry to https://github.com/libp2p/specs#protocols?
