Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support exposure of hash algorithm digest to handle OIDC at_hash, potentially other spec extensions #314

Closed
sirosen opened this issue Nov 27, 2017 · 3 comments · Fixed by #775
Labels
stale Issues without activity for more than 60 days

Comments

@sirosen
Copy link
Contributor

sirosen commented Nov 27, 2017

#296 was closed with the suggestion that rather than special-case support for OIDC, PyJWT could support pluggable verifiers which handle additional, specialized claim validation behavior.

If there's a general market for verifiers -- or if they're going to be prepackaged and provided in this library as optional extensions -- that's a good avenue, but at present #295 is the only such issue.
Additionally, #258 runs directly counter to any extensible family of verifiers being hooked into PyJWT.

"Pluggable verifiers" seem therefore, at first blush, to be overwrought as a solution. I want PyJWT to support external verification of a claim based on the following information:

  • content of an arbitrary claim in the payload (trivially easy)
  • a digest from the hash algorithm specified in the JWT header (right now, hard, should be easy)

Client code should be able to extract this info without having to worry about how PyJWT behaves internally when cryptography is present.

The heart of the #296 implementation is really quite short:

        alg_obj = self.get_algo_by_name(algorithm)  # self is a PyJWT object
        hash_alg = alg_obj.hash_alg

        def get_digest(bytestr):
            if has_crypto and (
                    isinstance(hash_alg, type) and
                    issubclass(hash_alg, hashes.HashAlgorithm)):
                digest = hashes.Hash(hash_alg(), backend=default_backend())
                digest.update(bytestr)
                return digest.finalize()
            else:
                return hash_alg(bytestr).digest()

where get_algo_by_name is this segment abstracted into a function.

Having this hooked into the PyJWT.decode() call is not in any way necessary -- an external method like PyJWT.compute_hash_digest(<string>) would satisfy this just fine.

OIDC Client code would then look like this:

def verify_at_hash(decoded_id_token, access_token):
    digest = decoded_id_token.compute_hash_digest(access_token.encode('utf-8'))
    computed_at_hash = base64.urlsafe_b64encode(digest[:(len(digest) // 2)]).rstrip('=').decode('utf-8')
    assert decoded_id_token['at_hash'] == computed_at_hash

decoded = jwt.decode(...)
verify_at_hash(decoded, access_token)

However, there's a significant, nasty wrinkle: PyJWT.decode() only returns the payload and discards the header.

One of my initial inclinations is to make the payload a subclass of dict with additional attributes for signing_input, header, signature -- the other things returned from PyJWT._load() -- and any useful helper methods (in this case, compute_hash_digest).
This solution fixes the decision by PyJWT.decode() to discard potentially useful information. In fact, this is the critical information which makes it presently impossible to verify the OIDC at_hash with PyJWT.

If that's unattractive, the big alternative is to add jwt.compute_hash_digest(header, <input_string>), and add an argument to PyJWT.decode like verification_callbacks=... which is an iterable of callables which consume payload, header, signing_input, signature as inputs.

@mark-adams, which of these directions for PyJWT is more appealing to you? Do you see issues with exposing compute_hash_digest in one form or another?

Subclassing dict may sound messier at first, but it results in less code for PyJWT to maintain and doesn't expand the public API surface significantly vs verification_callbacks.
It also has a nice benefit over verification_callbacks in that adding attributes to an object if necessary in the future won't break most sane usage, but if people write their verification_callbacks without accepting **kwargs (which would be documented as incorrect, but people would still do it) then adding more information into those callbacks would break those clients' usage.

sirosen added a commit to sirosen/pyjwt that referenced this issue Jan 10, 2018
The JWTPayload class allows PyJWT.decode() to expose header, signature,
signing_input, and compute_hash_digest() (based on header) without
changing the pyjwt API in a breaking way.
Merely making this info accessible to the client (without specify an
additional verification callback scheme) is simpler for everyone.

Include doc on why JWTPayload is a good idea (in a docstring), since
it's a little unusual to subclass `dict`. The intent is to make the JWT
payload change as little as possible while still making it easy to add
more verification after the fact.

Add a simple test for `JWTPayload.compute_hash_digest()`

Closes jpadilla#314, jpadilla#295
sirosen added a commit to sirosen/pyjwt that referenced this issue Jan 10, 2018
The JWTPayload class allows PyJWT.decode() to expose header, signature,
signing_input, and compute_hash_digest() (based on header) without
changing the pyjwt API in a breaking way.
Merely making this info accessible to the client (without specify an
additional verification callback scheme) is simpler for everyone.

Include doc on why JWTPayload is a good idea (in a docstring), since
it's a little unusual to subclass `dict`. The intent is to make the JWT
payload change as little as possible while still making it easy to add
more verification after the fact.

Add a simple test for `JWTPayload.compute_hash_digest()`

Closes jpadilla#314, jpadilla#295
sirosen added a commit to sirosen/pyjwt that referenced this issue Jan 10, 2018
The JWTPayload class allows PyJWT.decode() to expose header, signature,
signing_input, and compute_hash_digest() (based on header) without
changing the pyjwt API in a breaking way.
Merely making this info accessible to the client (without specify an
additional verification callback scheme) is simpler for everyone.

Include doc on why JWTPayload is a good idea (in a docstring), since
it's a little unusual to subclass `dict`. The intent is to make the JWT
payload change as little as possible while still making it easy to add
more verification after the fact.

Add a simple test for `JWTPayload.compute_hash_digest()`

Closes jpadilla#314, jpadilla#295
sirosen added a commit to sirosen/pyjwt that referenced this issue May 14, 2020
The JWTPayload class allows PyJWT.decode() to expose header, signature,
signing_input, and compute_hash_digest() (based on header) without
changing the pyjwt API in a breaking way.
Merely making this info accessible to the client without specifying an
additional verification callback scheme is simpler for everyone.

Include doc on why JWTPayload is a good idea in a module docstring,
since it's a little unusual to subclass `dict`. The intent is to make
the JWT payload change as little as possible while still making it easy
to add more verification after the fact.

Add a simple test for `JWTPayload.compute_hash_digest()` and a test
for compute_hash_digest with cryptography (which is compared against a
manual hashlib usage).

Closes jpadilla#314, jpadilla#295
@github-actions
Copy link

This issue is stale because it has been open 60 days with no activity. Remove stale label or comment or this will be closed in 7 days

@github-actions github-actions bot added the stale Issues without activity for more than 60 days label Jun 20, 2022
@sirosen
Copy link
Contributor Author

sirosen commented Jun 20, 2022

I may have approached this the wrong way years back when I filed this issue. Perhaps it's too verbose.
But I still think that OIDC is a valid use-case to consider.

I've been happy to contribute ever since I opened this and the two related (closed) PRs, but I'm not willing to open another one without some buy-in from maintainers.

I'd appreciate this not being closed just because it's old. It's still valid.

@github-actions github-actions bot removed the stale Issues without activity for more than 60 days label Jun 22, 2022
sirosen added a commit to sirosen/pyjwt that referenced this issue Jun 29, 2022
Looking up an algorithm by name is used internally for signature
generation. This encapsulates that functionality in a dedicated method
and adds it to the public API. No new tests are needed to exercise the
functionality.

Rationale:

1. Inside of PyJWS, this improves the code. The KeyError handler is
   better scoped and the signing code reads more directly.

2. This is part of the path to supporting OIDC at_hash validation as a
   use-case (see: jpadilla#295, jpadilla#296, jpadilla#314).

This is arguably sufficient to consider that use-case supported and
close it. However, it is an improvement and step in the right
direction in either case.
sirosen added a commit to sirosen/pyjwt that referenced this issue Jun 29, 2022
Looking up an algorithm by name is used internally for signature
generation. This encapsulates that functionality in a dedicated method
and adds it to the public API. No new tests are needed to exercise the
functionality.

Rationale:

1. Inside of PyJWS, this improves the code. The KeyError handler is
   better scoped and the signing code reads more directly.

2. This is part of the path to supporting OIDC at_hash validation as a
   use-case (see: jpadilla#295, jpadilla#296, jpadilla#314).

This is arguably sufficient to consider that use-case supported and
close it. However, it is an improvement and step in the right
direction in either case.
sirosen added a commit to sirosen/pyjwt that referenced this issue Jun 29, 2022
Looking up an algorithm by name is used internally for signature
generation. This encapsulates that functionality in a dedicated method
and adds it to the public API. No new tests are needed to exercise the
functionality.

Rationale:

1. Inside of PyJWS, this improves the code. The KeyError handler is
   better scoped and the signing code reads more directly.

2. This is part of the path to supporting OIDC at_hash validation as a
   use-case (see: jpadilla#295, jpadilla#296, jpadilla#314).

This is arguably sufficient to consider that use-case supported and
close it. However, it is an improvement and step in the right
direction in either case.

A minor change was needed to satisfy mypy, as a union-typed variable
does not narrow its type based on assignments. The easiest resolution
is to use a new name, in this case, simply `algorithm -> algorithm_`.
jpadilla pushed a commit that referenced this issue Jul 3, 2022
* Expose get_algorithm_by_name as new method

Looking up an algorithm by name is used internally for signature
generation. This encapsulates that functionality in a dedicated method
and adds it to the public API. No new tests are needed to exercise the
functionality.

Rationale:

1. Inside of PyJWS, this improves the code. The KeyError handler is
   better scoped and the signing code reads more directly.

2. This is part of the path to supporting OIDC at_hash validation as a
   use-case (see: #295, #296, #314).

This is arguably sufficient to consider that use-case supported and
close it. However, it is an improvement and step in the right
direction in either case.

A minor change was needed to satisfy mypy, as a union-typed variable
does not narrow its type based on assignments. The easiest resolution
is to use a new name, in this case, simply `algorithm -> algorithm_`.

* Use get_algorithm_by_name in _verify_signature

Rather than catching the KeyError from a dict lookup, catch the
NotImplementedError raised by get_algorithm_by_name. This changes the
exception seen in the cause under exception chaining but otherwise has
no public-facing impact.
sirosen added a commit to sirosen/pyjwt that referenced this issue Jul 3, 2022
The goal of this doc example is to demonstrate usage of
`get_algorithm_by_name` and `compute_hash_digest` for the purpose of
`at_hash` validation. It is not meant to be a "guaranteed correct" and
spec-compliant example.

closes jpadilla#314
@github-actions
Copy link

This issue is stale because it has been open 60 days with no activity. Remove stale label or comment or this will be closed in 7 days

@github-actions github-actions bot added the stale Issues without activity for more than 60 days label Aug 21, 2022
sirosen added a commit to sirosen/pyjwt that referenced this issue Nov 1, 2022
The goal of this doc example is to demonstrate usage of
`get_algorithm_by_name` and `compute_hash_digest` for the purpose of
`at_hash` validation. It is not meant to be a "guaranteed correct" and
spec-compliant example.

closes jpadilla#314
auvipy pushed a commit that referenced this issue Nov 2, 2022
…alidation example (#775)

* Add compute_hash_digest to Algorithm objects

`Algorithm.compute_hash_digest` is defined as a method which inspects
the object to see that it has the requisite attributes, `hash_alg`.

If `hash_alg` is not set, then the method raises a
NotImplementedError. This applies to classes like NoneAlgorithm.

If `hash_alg` is set, then it is checked for
```
has_crypto  # is cryptography available?
and isinstance(hash_alg, type)
and issubclass(hash_alg, hashes.HashAlgorithm)
```
to see which API for computing a digest is appropriate --
`hashlib` vs `cryptography.hazmat.primitives.hashes`.

These checks could be avoided at runtime if it were necessary to
optimize further (e.g. attach compute_hash_digest methods to classes
with a class decorator) but this is not clearly a worthwhile
optimization. Such perf tuning is intentionally omitted for now.

* Add doc example of OIDC login flow

The goal of this doc example is to demonstrate usage of
`get_algorithm_by_name` and `compute_hash_digest` for the purpose of
`at_hash` validation. It is not meant to be a "guaranteed correct" and
spec-compliant example.

closes #314
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
stale Issues without activity for more than 60 days
Projects
None yet
1 participant