Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Engine API: proposal for authentication #162

Closed
holiman opened this issue Jan 14, 2022 · 33 comments
Closed

Engine API: proposal for authentication #162

holiman opened this issue Jan 14, 2022 · 33 comments

Comments

@holiman
Copy link
Contributor

holiman commented Jan 14, 2022

This is an RFC for authenticated JSON rpc API

Authentication

The engine JSON-RPC interface, exposed by EL and consumed by CL, needs to be authenticated. The authentication scheme chosen for thus purpose is JWT.

The type of attacks that this authentication scheme attempts to protect against are the following:

  • RPC port exposed towards the internet, allowing attackers to exchange messages with EL engine api.
  • RPC port exposed towards the browser, allowing malicious webpages to submit messages to the EL engine api.

The authentication scheme is not designed to

  • Prevent attackers with capability to read ('sniff') network traffic from reading the traffic,
  • Prevent attackers with capability to read ('sniff') network traffic from performing replay-attacks of earlier messages.

Authentication is performed as follows:

  • For HTTP dialogue, each jsonrpc request is individually authenticated by supplying JWT token in the HTTP header.
  • For a WebSocket dialogue, only the initial handshake is authenticated, after which the message dialogue proceeds without further use of JWT.
  • For inproc, a.k.a raw ipc communication, no authentication is required, under the assumption that a process able to access ipc channels for the process, which usually means local file access, is already sufficiently permissioned that further authentication requirements do not add security.

JWT specifications

  • The EL MUST support the at least the following alg HMAC + SHA256 (HS256)
  • The EL MUST reject the alg none.

The HMAC algorithm means that symmetric encryption is used, thus several CL's will be able to use the same key, and, from an authentication perspective be able to impersonate each other. From a deployment perspective, it means that an EL does not need to be provisioned with individual keys for each caller.

Key distribution

The EL and CL clients MUST accept a cli/config parameter: jwt-secret, a 256 bit key, to be used for verifying/generating jwt tokens.
If such a parameter is not given, the client SHOULD generate such a token, valid for the duration of the execution, and show the token in the output, which the user can then use to provision the counterpart client with.

@mkalinin
Copy link
Collaborator

Would you mind to provide more details and examples of exact payloads during the handshake for the WebSocket case? It seems like we need to add a new error code that would signal the caller that it's not authenticated.

@holiman
Copy link
Contributor Author

holiman commented Jan 14, 2022

The websocket handshake starts with the client performing a websocket upgrade request. This must be a regular http GET request, and the actual
parameters for the WS-handshake is carried in the http headers.

Example:

GET / HTTP/1.1
Host: localhost:1337
User-Agent: Mozilla/5.0..
Origin: ..
Connection: keep-alive, Upgrade
Upgrade: websocket
Sec-WebSocket-Version: 13
Sec-WebSocket-Protocol: echo-protocol
Sec-WebSocket-Extensions: permessage-deflate
Sec-WebSocket-Key: VOPn9g/2WnfiJbxBw4aZkQ==
Sec-Fetch-Dest: websocket
Sec-Fetch-Mode: websocket
Sec-Fetch-Site: cross-site

The response, if accepted, is

HTTP/1.1 101 Switching Protocols
Upgrade: websocket
Connection: Upgrade
...

After that, the connection is kept open, and is now full duplex.
Since the handshake is a regular HTTP request, it is easy to integrate with any previously existing HTTP authentication scheme, in our case: add jwt authentication to the request. One thing I'm not sure about, though, is exactly what to sign. I guess nothing (empty payload), since the initial GET request does not have a json payload.

@holiman
Copy link
Contributor Author

holiman commented Jan 17, 2022

After having investigated/tested this a bit, I've come across a couple of problems.

From https://testdriven.io/courses/taxi-react/websockets-part-one/#H-3-authenticating-socket-connections :

Anything that can be sent with an HTTP request can be sent with the handshake -- i.e., headers and cookies, query string parameters, and request bodies. Unfortunately, the JavaScript WebSocket API does not support custom headers. That means we need to find a different way to authenticate our WebSocket connection than an authorization header.

So. the Authorization: Bearer <token> technically does work, but only with custom clients and/or curl examples, not with browser communication. I'm not sure if this is a blocker -- do we want to expose the engine api for browser interaction?

The suggested solution is to pass the token in url query parameters. This, however, means that the token will/can be exposed in proxy caches and/or server logs -- which further means that we can definitely not use the empty payload for initial connect, but rather something based on a nonce or a timestap.

I'm not sure what the best solution is here.

cc @fjl

@mkalinin
Copy link
Collaborator

I'm not sure if this is a blocker -- do we want to expose the engine api for browser interaction?

Engine API is planned to be used by CL client to interact with EL one. I don't see the case when browser plays a role of CL client

@djrtwo
Copy link
Contributor

djrtwo commented Jan 17, 2022

Looks good!

Regarding "key distribution" -- is the common flow to start one side (either EL or CL), grab the key, then start the other side with that key?

And advanced usage is to generate in some independent flow and provide the key on both sides?

@djrtwo
Copy link
Contributor

djrtwo commented Jan 17, 2022

Engine API is planned to be used by CL client to interact with EL one. I don't see the case when browser plays a role of CL client

There is a desire to see CL light clients in the browser but (1) I'm not sure if/when this will happen and (2) it's unclear if it would drive an external EL client in such a case

@holiman
Copy link
Contributor Author

holiman commented Jan 17, 2022

Regarding "key distribution" -- is the common flow to start one side (either EL or CL), grab the key, then start the other side with that key?
And advanced usage is to generate in some independent flow and provide the key on both sides?

That was my idea, yes.

The main point I'm not happy about right now is the empty payload at ws connect.

@holiman
Copy link
Contributor Author

holiman commented Jan 18, 2022

Perhaps it would be good to also include the iat claim in the JWT: https://datatracker.ietf.org/doc/html/rfc7519#section-4.1.6 . This means that the websocket connect token will not be identical each time, and the EL can verify the freshness of each token, and only allow access if the issuance time is within a certain time window, maybe 5 seconds in either direction.

@djrtwo
Copy link
Contributor

djrtwo commented Jan 20, 2022

Perhaps it would be good to also include the iat claim in the JWT

This seems like a reasonable approach to me

@MicahZoltu
Copy link
Contributor

Is there any limits to what is a valid jwt-secret, or is any 256-bit value acceptable?

@MicahZoltu
Copy link
Contributor

Is there any expectation to include exp (expiration) on the tokens? Feels like we should, and it would be good to have some "general advice" for what a reasonable expiration time is.

@holiman
Copy link
Contributor Author

holiman commented Jan 21, 2022

Is there any limits to what is a valid jwt-secret, or is any 256-bit value acceptable?

Well, we can prohibit 0x00, but not sure what else we can do.

Is there any expectation to include exp (expiration) on the tokens

The idea with iat would be to make it so that there's a 5 (?) - second window where it is valid. So it would implicitly carry the exp. Or rather, the expiry would be enforced on the EL, not settable by the CL.

The canonical usecase for exp is when a federated server issues "this is John, he's an admin, token is valid for one hour". It's not quite our usecase

@djrtwo
Copy link
Contributor

djrtwo commented Jan 22, 2022

Is there anything else here to discuss?

We're looking to get a release out early this week to keep engineering moving. Shall we write up a PR?

@holiman
Copy link
Contributor Author

holiman commented Jan 22, 2022 via email

@jflo
Copy link

jflo commented Jan 25, 2022

is there a conversation happening about different key sharing formats between CL and EL for the JWT signing key? For instance, Besu could trivially support loading it from a java keystore on the filesystem, but thats really only friendly to Teku. Is there a more agnostic format we want to support across combinations?

Besu does this today https://besu.hyperledger.org/en/stable/HowTo/Interact/APIs/Authentication/#jwt-public-key-authentication

@mratsim
Copy link

mratsim commented Feb 27, 2022

I have concerns about JWT, it seems like a complex protocol with many options and so pitfalls.

For instance:

Excerpts:

So, as someone who does some work in crypto engineering, arguments about JWT being problematic only if implementations are "bungled" or developers are "incompetent" are sort of an obvious "tell" that the people behind those arguments aren't really crypto people. In crypto, this debate is over.

I know a lot of crypto people who do not like JWT. I don't know one who does. Here are some general JWT concerns:

  • It's kitchen-sink complicated and designed without a single clear use case. The track record of cryptosystems with this property is very poor. Resilient cryptosystems tend to be simple and optimized for a specific use case.

  • It's designed by a committee and, as far as anyone I know can tell, that committee doesn't include any serious cryptographers. I joked about this on Twitter after the last JWT disaster, saying that JWT's support for static-ephemeral P-curve ECDH was the cryptographic engineering equivalent of a "kick me" sign on the standard. You could look at JWT, see that it supported both RSA and P-curve ECDH, and immediately conclude that crypto experts hadn't had a guiding hand in the standard.

  • Flaws in crypto protocols aren't exclusive to, but tend to occur mostly in, the joinery of the protocol. So crypto protocol designers are moving away from algorithm and "cipher suite" negotiation towards other mechanisms. Trevor Perrin's Noise framework is a great example: rather than negotiating, it defines a family of protocols and applications can adopt one or the other without committing themselves to supporting different ones dynamically. Not only does JWT do a form of negotiation, but it actually allows implementations to negotiate NO cryptography. That's a disqualifying own-goal.

  • JWT's defaults are incoherent. For instance: non-replayability, one of the most basic questions to answer about a cryptographic token, is optional. Someone downthread made a weird comparison between JWT and Nacl (weird because Nacl is a library of primitives, not a protocol) based on forward-security. But for a token, replayability is a much more urgent concern.

  • The protocol mixes metadata and application data in two different bag-of-attributes structures and generally does its best to maximize all the concerns you'd have doing cryptography with a format as malleable as JSON. Seemingly the only reason it does that is because it's "layered" on JOSE, leaving the impression that making a pretty lego diagram is more important to its designers than coming up with a simple, secure standard.

  • It's 2017 and the standard still includes X.509, via JWK, which also includes indirected key lookups.

  • The standard supports, and some implementations even default to, compressed plaintext. It feels like 2012 never happened for this project.

For almost every use I've seen in the real world, JWT is drastic overkill; often it's just an gussied-up means of expressing a trivial bearer token, the kind that could be expressed securely with virtually no risk of implementation flaws simply by hexifying 20 bytes of urandom. For the rare instances that actually benefit from public key cryptography, JWT makes a hard task even harder. I don't believe anyone is ever better off using JWT. Avoid it.

And another one

The problems, for people who don't want to read articles from comment links, are:

  • JSON Web Signing
    • alg headers
    • "This Header Parameter MUST be present and MUST be understood and processed by implementations."
  • JSON Web Encryption
    • RSA with PKCS1v1.5 padding (power word: Bleichenbacher 1998)
    • ECDH over NIST curves (and, in practice, invalid-curve attacks)
    • AES-GCM included in a list of asymmetric algorithm choices, for added confusion
    • AES-GCM for shared-key encryption, without guidance over nonces or key rotation

A better solution from JOSE would only give developers two options:

  • Version (v1, v2, v3, etc. which hard-coded the algorithm choices)
  • Operation
    • enc -> crypto_secretbox()
    • auth -> crypto_auth()
    • pub-enc -> crypto_box_seal()
    • pub-sign -> crypto_sign_detached()

image

@MicahZoltu
Copy link
Contributor

@mratsim Do you have a recommendation for a good alternative solution? One of the things I like about JWT is there is a library for every language (often multiple libraries). While I agree with some of the arguments made in the links you provided about JWTs being overkill (especially since we aren't leveraging capabilities), I would like to see a concrete proposal for an alternative that is either so trivially simple that a library isn't necessary, or has a large selection of libraries that could be used.

@tersec
Copy link
Contributor

tersec commented Feb 28, 2022

TOTP?

@mkalinin
Copy link
Collaborator

AFAICS, this articles are explaining why JWT is bad for user authentication and in-browser use cases. I haven't found any issues in application to our particular use case, please, correct me if it's been overlooked. Our use case seems very simple and I don't think that if we were designing auth by ourselves without relying on already existing schemas we wouldn't end up with a pretty similar scheme that we have with JWT utilization.

@MicahZoltu
Copy link
Contributor

The argument is basically that JWT is extremely unnecessary for our needs. We literally just need a preshared key and a signature over the payload (or similar). JWT provides things like capabilities, permissions, etc., All features we aren't going to use.

@MicahZoltu
Copy link
Contributor

I think a very reasonable question to ask would be, "what features exactly do we need from the authentication?". I haven't seen that listed anywhere, is it?

@mratsim
Copy link

mratsim commented Feb 28, 2022

@mkalinin

AFAICS, this articles are explaining why JWT is bad for user authentication and in-browser use cases. I haven't found any issues in application to our particular use case, please, correct me if it's been overlooked. Our use case seems very simple and I don't think that if we were designing auth by ourselves without relying on already existing schemas we wouldn't end up with a pretty similar scheme that we have with JWT utilization.

That's exactly why I think JWT is overkill and exposing us to many implementation bugs in third-party libraries and/or supply chain attacks while our use-case and threat model is very simple.

@MicahZoltu

Do you have a recommendation for a good alternative solution? One of the things I like about JWT is there is a library for every language (often multiple libraries). While I agree with some of the arguments made in the links you provided about JWTs being overkill (especially since we aren't leveraging capabilities), I would like to see a concrete proposal for an alternative that is either so trivially simple that a library isn't necessary, or has a large selection of libraries that could be used.

I've been looking into very simple schemes that would be misuse-resistant and came across Branca: https://github.com/tuupola/branca-spec

It only needs XChaCha20-Poly1305, has test vectors, has 3 libraries in Go, 2 in Java, 1 Javascript, 1 Python and 1 Rust among other languages.

The Python impl by the spec for reference is just 50 lines of code, assuming XChaCha20-Poly1305 provided by a crypto lib.

Tagging @asanso since he is mentioned by name in JWT best practices at https://tools.ietf.org/id/draft-ietf-oauth-jwt-bcp-02.html#insecure-use-of-elliptic-curve-encryption

https://github.com/tuupola/pybranca/blob/79366150353727c0410e111eef7dbaf2ed97ec8e/branca.py

"""
Branca
Authenticated and encrypted API tokens using modern crypto.
"""

import base62
import calendar
import ctypes
import struct
from binascii import unhexlify
from datetime import datetime
from xchacha20poly1305 import generate_nonce
from xchacha20poly1305 import crypto_aead_xchacha20poly1305_ietf_encrypt
from xchacha20poly1305 import crypto_aead_xchacha20poly1305_ietf_decrypt
from xchacha20poly1305 import CRYPTO_AEAD_XHCACHA20POLY1305_IETF_NPUBBYTES
from xchacha20poly1305 import CRYPTO_AEAD_XHCACHA20POLY1305_IETF_KEYBYTES

class Branca:
    VERSION = 0xBA

    def __init__(self, key):
        if isinstance(key, bytes):
            self._key = key
        else:
            self._key = unhexlify(key)

        if len(self._key) is not CRYPTO_AEAD_XHCACHA20POLY1305_IETF_KEYBYTES:
            raise ValueError(
                "Secrect key should be {} bytes long".format(
                    CRYPTO_AEAD_XHCACHA20POLY1305_IETF_KEYBYTES
                )
            )

        self._nonce = None # Used only for unit testing!

    def encode(self, payload, timestamp=None):

        if not isinstance(payload, bytes):
            payload = payload.encode()

        if timestamp is None:
            timestamp = calendar.timegm(datetime.utcnow().timetuple())

        version = struct.pack("B", self.VERSION)
        time = struct.pack(">L", timestamp)

        if self._nonce is None:
            nonce = generate_nonce()
        else:
            nonce = self._nonce

        header = version + time + nonce
        ciphertext = crypto_aead_xchacha20poly1305_ietf_encrypt(payload, header, nonce, self._key)

        return base62.encodebytes(header + ciphertext)

    def decode(self, token, ttl=None):
        token = base62.decodebytes(token)
        header = token[0:CRYPTO_AEAD_XHCACHA20POLY1305_IETF_NPUBBYTES + 5]
        nonce = header[5:CRYPTO_AEAD_XHCACHA20POLY1305_IETF_NPUBBYTES + 5]
        ciphertext = token[CRYPTO_AEAD_XHCACHA20POLY1305_IETF_NPUBBYTES + 5:]

        version, time = struct.unpack(">BL", bytes(header[0:5]))

        # Implementation should accept only current version.
        if version is not self.VERSION:
            raise RuntimeError("Invalid token version")

        payload = crypto_aead_xchacha20poly1305_ietf_decrypt(ciphertext, header, nonce, self._key)

        if ttl is not None:
            future = time + ttl
            timestamp = calendar.timegm(datetime.utcnow().timetuple())
            if future < timestamp:
                raise RuntimeError("Token is expired")

        return payload

    def timestamp(self, token):
        token = base62.decodebytes(token)
        version, time = struct.unpack(">BL", bytes(token[0:5]))

        return time

@mratsim
Copy link

mratsim commented Feb 28, 2022

Note compared to the desired properties, Branca is also overkill in the following way:

  • Payload is encrypted. Depending on the volume of data between CL and EL this might be too much latency? But it might opens up new use-cases when CL and EL are on dedicated VM/machines and XChaCha20-Poly1305 is likely to become heavily optimized in the future as it's easier to make it fast compared to AES (which requires hardware acceleration to reach GB throughput https://www.bearssl.org/speed.html).

@mkalinin
Copy link
Collaborator

That's exactly why I think JWT is overkill and exposing us to many implementation bugs in third-party libraries and/or supply chain attacks while our use-case and threat model is very simple.

Vulnerabilities in a 3rd party libraries is a good point. The spec is currently based on HMAC + SHA256, vulnerabilities in JWT libraries are related to EC based algorithms. I am not against using custom auth scheme, using any other 3rd party solution is prone to vulnerabilities, though, may be less prone if the solution is tiny. Just want to be sure that JWT is not an option.

what features exactly do we need from the authentication?

Currently it is 1) be able to prove that CL knows a secret 2) basic replay protection for auth tokens (via "iat") 3) optional: versioning. I believe we don't need anything on top of that. cc @holiman

@mkalinin
Copy link
Collaborator

Btw, could we implement current auth scheme avoiding usage of 3rd party JWT libs? We have crypto libs and HMAC + SHA256 is widely supported, we have JSON libs and Base64 codecs -- all these dependencies should already be used by client implementations.

@asanso
Copy link

asanso commented Feb 28, 2022

Tagging @asanso since he is mentioned by name in JWT best practices at https://tools.ietf.org/id/draft-ietf-oauth-jwt-bcp-02.html#insecure-use-of-elliptic-curve-encryption

Since I have been mentioned by @mratsim here my 2 cents. I actually already gave a look to this proposal (thanks @djrtwo for pointing this out) and while I share some of the concerns about JWT as a whole (as you can also see from some of my previous research) the subset used in this proposal seem to be immune to those vulnerabilities.

So it might really just a quick win to just use JWT (specifically HMAC + SHA256) for this use case rather than invent yet another solution.

@mratsim
Copy link

mratsim commented Feb 28, 2022

Btw, could we implement current auth scheme avoiding usage of 3rd party JWT libs? We have crypto libs and HMAC + SHA256 is widely supported, we have JSON libs and Base64 codecs -- all these dependencies should already be used by client implementations.

I think this is reasonable, though for censorship resistance and also to enable more distributed use-cases in the future I do think adding encryption would be interesting (and leaving key negotiation out of scope).

@mkalinin
Copy link
Collaborator

I do think adding encryption would be interesting (and leaving key negotiation out of scope)

We have been discussing this and ended up with TLS as a solution for encrypting communication channel between CL and EL. In this case extra auth scheme becomes redundant.

@mratsim
Copy link

mratsim commented Feb 28, 2022

I do think adding encryption would be interesting (and leaving key negotiation out of scope)

We have been discussing this and ended up with TLS as a solution for encrypting communication channel between CL and EL. In this case extra auth scheme becomes redundant.

What about the certificates and the certificate authorities though? Or do we use self-signing certificates? In that case a pre-shared key is likely better because it allows end-users negotiating them with whatever they want, cli-config/copy-pasting as suggested in this RFC or Diffie-Hellman in more complex infra.

@mkalinin
Copy link
Collaborator

What about the certificates and the certificate authorities though? Or do we use self-signing certificates? In that case a pre-shared key is likely better because it allows end-users negotiating them with whatever they want, cli-config/copy-pasting as suggested in this RFC or Diffie-Hellman in more complex infra.

I think there are options here and TLS with self-signed certificates is one of them. IMHO, more sophisticated encryption and auth schemas shouldn't be a part of client implementations and rather be set up as additional infra alongside with clients. For instance, it could be an HTTPS/WSS server in front of EL and a corresponding client on the other side that CL has access through

@holiman
Copy link
Contributor Author

holiman commented Feb 28, 2022

We discussed a few different schemes. One was to just go with TLS, but that would put tls endpoints into the client - making it difficult to integrate with a production TLS terminator, in case someone wants to put the api behind an nginx or cloudflare terminator. Not to mention certificate handling, which is a bit cumbersome.

Btw, could we implement current auth scheme avoiding usage of 3rd party JWT libs?

Definitely. This EIP mandates HMAC+SHA256, and iat property. It would be a pretty small thing to write a custom JWT handler which does only that. Whether client implementors choose to do that or go with a third party library is an implementation choice. The current go-ethereum PR uses a third party library, but explicitly only allows the HMAC+SHA256 scheme.

I've said from the start that I'm not hell-bent on JWT, but I still think it's suitable for this purpose -- given the constraints put forth in this EIP. There is no ambiguity about cryptography schema used, and a 10-second validity window. It can easily be integrated with a regular off-the-shelf TLS-terminator/load balancer.

I'm still open to being convinced otherwise, if there are any concrete arguments about why our use of it is insufficient/insecure.

@rauljordan
Copy link

Although the concerns of JWT being a "kitchen-sink" approach that has valid criticisms, I do think the way we use it in the scope of the engine API authentication scheme is decent. It is extremely commonplace, has tons of library support, and it is also fairly simple to write a custom JWT handler that meets the needs of the spec here. The auth constraints are simple, and we are not building this for web. Upon first learning the spec was suggesting JWT, we were able to immediately add support for it in Prysm as almost every developer that has worked on http auth has heard of JWT. Branca, although simpler in scope, feels more niche.

TL;DR: based on the constraints of the spec, JWT works fine as there is no serious downside that necessitates using a more niche schema.

@holiman
Copy link
Contributor Author

holiman commented Mar 7, 2022

Btw -- I guess this ticket can be closed now, since the corresponding PR has been merged. Any further discussions on this should rather go to either a new issue or a PR replacing it with something else.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

9 participants