Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refactor the verification API #299

Merged
merged 35 commits into from
Nov 21, 2022
Merged

Refactor the verification API #299

merged 35 commits into from
Nov 21, 2022

Conversation

woodruffw
Copy link
Member

@woodruffw woodruffw commented Nov 14, 2022

TODO:

  • Refactor offline Rekor entry "sanity checking"
  • Unit tests

Closes #250.

@woodruffw woodruffw added component:verification Core verification functionality refactoring Refactoring tasks. labels Nov 14, 2022
@woodruffw woodruffw self-assigned this Nov 14, 2022
@woodruffw
Copy link
Member Author

I still need to think a little more about the "policy" side of this API. We probably want to support verification schemes like "all of these identities" or "one of these identities", so we probably want something like:

  • An abstract VerificationPolicy with a single API like: verify(self, cert: Certificate)
  • Concrete Email, GitHubWorkflow, etc. policies
  • "Composed" policies like AnyOf, AllOf

AllOf in particular will be a little tricky, since it changes the VerificationMaterials relationship from the current:

@dataclass(init=False)
class VerificationMaterials:
    """
    Represents the materials needed to perform a Sigstore verification.
    """

    input_: bytes
    """
    The input that was signed for.
    """

    artifact_hash: str
    """
    The hex-encoded SHA256 hash of `input_`.
    """

    certificate: Certificate
    """
    The certificate that attests to and contains the public signing key.
    """

    signature: bytes
    """
    The raw signature.
    """

    offline_rekor_entry: Optional[RekorEntry]
    """
    An optional offline Rekor entry.

    If supplied an offline Rekor entry is supplied, verification will be done
    against this entry rather than the against the online transparency log.

    Offline Rekor entries do not carry their Merkle inclusion
    proofs, and as such are verified only against their Signed Entry Timestamps.
    This is a slightly weaker verification verification mode, as it does not
    demonstrate inclusion in the log.
    """

to something more like a list[tuple[hash, cert, sig, rekor?]], so that we can test against multiple signatures at once. I'm not sure if that's worth directly accommodating, though; maybe it's something we can leave to API consumers.

Make _verify into a subdirectory.

Signed-off-by: William Woodruff <william@trailofbits.com>
Signed-off-by: William Woodruff <william@trailofbits.com>
Signed-off-by: William Woodruff <william@trailofbits.com>
Signed-off-by: William Woodruff <william@trailofbits.com>
Signed-off-by: William Woodruff <william@trailofbits.com>
Signed-off-by: William Woodruff <william@trailofbits.com>
Signed-off-by: William Woodruff <william@trailofbits.com>
Signed-off-by: William Woodruff <william@trailofbits.com>
Signed-off-by: William Woodruff <william@trailofbits.com>
sigstore/_verify/policy.py Show resolved Hide resolved
sigstore/_verify/policy.py Outdated Show resolved Hide resolved
@tetsuo-cpp
Copy link
Collaborator

  • Concrete Email, GitHubWorkflow, etc. policies

Doesn't the Identity policy already cover this? Or were you thinking to split that check out into multiple policies and then wrap them in an AnyOf for the CLI workflow.

@woodruffw
Copy link
Member Author

Doesn't the Identity policy already cover this? Or were you thinking to split that check out into multiple policies and then wrap them in an AnyOf for the CLI workflow.

Sorry, this was a little jumbled of me: Identity does indeed cover Email, since Identity is a generic catch-all for "any X.509 SAN type and the OIDC issuer".

What I meant there was other, related policies, like verifying GitHub workflow-specific claims (per #157). For those, we'll probably need separate policy verification logic, since they aren't just a tuple of (identity, oidc_provider).

@woodruffw
Copy link
Member Author

(I was originally thinking that it might make sense to have different policy classes for Email, URI, etc., but I realized that doing so would introduce impedance problems with the CLI and is outside of the threat model we discussed with the Fulcio folks.)

@jku
Copy link
Member

jku commented Nov 15, 2022

Will you be documenting the potential public API somewhere externally before this is ready or should we assume _verify/__init__.py will be roughly the public module?

@woodruffw
Copy link
Member Author

Will you be documenting the potential public API somewhere externally before this is ready or should we assume _verify/__init__.py will be roughly the public module?

Yeah, I'm going to post a summary comment on this PR once it's fleshed out a bit more.

It still won't be officially public, but it'll probably be the same shape once we make it public. So any feedback on it will go directly towards the public API!

Signed-off-by: William Woodruff <william@trailofbits.com>
Signed-off-by: William Woodruff <william@trailofbits.com>
Signed-off-by: William Woodruff <william@trailofbits.com>
Signed-off-by: William Woodruff <william@trailofbits.com>
This currently isn't useful with just identities, but it will
be useful once we support the other extensions used by Fulcio,
e.g. for GitHub workflows.

Signed-off-by: William Woodruff <william@trailofbits.com>
@woodruffw
Copy link
Member Author

NB: I've also added a policy.AllOf policy, which isn't currently useful but will be once we add more policies for the X.509v3 extensions that Fulcio uses for other claims (e.g. GitHub workflow-specific claims).

Signed-off-by: William Woodruff <william@trailofbits.com>
…licies

Signed-off-by: William Woodruff <william@trailofbits.com>
@woodruffw
Copy link
Member Author

I'm also trying something a little magical with 034253a -- any policy that boils down to a check on a single X.509v3 extension can now be auto-generated with a decorator, e.g.:

@_single_x509v3_extension(oid=_OIDC_ISSUER_OID)
class Issuer:
    """
    Verifies the certificate's OIDC issuer, identified by
    an X.509v3 extension tagged with `1.3.6.1.4.1.57264.1.1`.

    See: <https://github.com/sigstore/fulcio/blob/main/docs/oid-info.md#1361415726411--issuer>
    """

This keeps our policy definitions DRY and eliminates logic errors during duplication, but it also confuses the heck out of mypy (since it involves dynamic class modification in the decorator).

cc @di and @tetsuo-cpp for thoughts on this design specifically -- it's an internal implementation detail, but I can remove it if you think it's too magical.

Signed-off-by: William Woodruff <william@trailofbits.com>
Signed-off-by: William Woodruff <william@trailofbits.com>
Signed-off-by: William Woodruff <william@trailofbits.com>
@woodruffw
Copy link
Member Author

Decided that 034253a was too clever, so I replaced it in 58a2492 with a simpler ABC + subclass approach.

(h/t @jacobian)

Just to limit potential confusion with X.509's issuer.

Signed-off-by: William Woodruff <william@trailofbits.com>
Copy link
Collaborator

@tetsuo-cpp tetsuo-cpp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice, this is looking really good. 👍

sigstore/_verify/verifier.py Show resolved Hide resolved
# limitations under the License.

"""
APIs for describing identity verification "policies", which describe how the identities
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we need to expand on this a bit. I understand what the API is getting at since I'm reviewing the code. But if I was just looking at docs and saw a function that takes a "verification policy", I'd be pretty confused.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

100% agree. I've been thinking more generally about a mental model for Sigstore verifications, with three moving parts:

  • "Verification context": the immutable and invariant things needed to perform a verification (the root cert(s), cert chain(s), various instance URLs, etc.). These are composed in the Verifier instance itself.
  • "Verification materials": the principals being verified: input, signature, cert, hash, and (maybe) an offline Rekor entry. These are composed in the VerificationMaterials model.
  • "Verification policy": how the verification materials are verified, beyond their basic cryptographic authenticity/integrity: the identity/context they relate to, etc. These are composed via the "policies" API, and can be mixed together via AnyOf and AllOf.

Does that general categorization make sense to you? If so I can put that into the source, in each of the relevant modules 🙂

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, that makes a lot of sense to me.

@jku
Copy link
Member

jku commented Nov 17, 2022

Could you provide a description for the Pull Request? This feels a lot broader than what the issue describes. Or is this comment the PR description : #299 (comment) ?

Copy link
Member

@jku jku left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll have to get back to this for another look tomorrow, but I'll leave these initial comments:

  • Generally I like it, nice work
  • I'm not yet sure if the "higher level" policies (AnyOf, AllOf) belong in the API (or at least in this PR)... but I'll not judge yet, maybe I've not seen the big picture yet
  • if it's easy to do the big verifier refactor (the move to _verify/) as a separate PR before this, I think that would help review-ability: Github just shows verifier.py as a completely new file now
  • Is policy.py API or not? It's not "made public" in init.py but I suppose it has to be, otherwise people can't use the policies... but then are all the policies defined there public, like the Github specific ones?

sigstore/_verify/policy.py Show resolved Hide resolved
Comment on lines +154 to +155
[--rekor-bundle FILE] --cert-identity IDENTITY
--cert-oidc-issuer URL [--require-rekor-offline]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree identity and issuer should be required in CLI: anything else is unsafe by default.

docstrings seem to still claim "default: None" which maybe does not make sense.

raise NotImplementedError # pragma: no cover


class AnyOf:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On the subject of Protocol vs ABC I think there is a third option: keep using Protocol but still explicitly declare the subclass.

Currently it's not immediately obvious to reader that there are ~10 VerificationPolicy implementations in this file... I suppose more explicit naming might be another solution to that ( some policy classes are called *Policy, some derive from a *Policy, some do neither)

@woodruffw
Copy link
Member Author

  • I'm not yet sure if the "higher level" policies (AnyOf, AllOf) belong in the API (or at least in this PR)... but I'll not judge yet, maybe I've not seen the big picture yet

Yeah, I threw these in as a "proof of concept" for some of the more powerful things someone could do as a consumer of this new policy API. IMO including them makes sense, since they're relatively simple and reflect use cases we'd expect users to have (identity claims + workflow-specific claims, for example).

  • if it's easy to do the big verifier refactor (the move to _verify/) as a separate PR before this, I think that would help review-ability: Github just shows verifier.py as a completely new file now

I can look into this, but the change history might make it a little tricky at this point. I agree that would have been ideal...

Is policy.py API or not? It's not "made public" in init.py but I suppose it has to be, otherwise people can't use the policies... but then are all the policies defined there public, like the Github specific ones?

It's part of the private API for the time being -- the stabilization plan it to make all of sigstore._verify public (by making it sigstore.verify), which will in turn include sigstore.verify.policy once that happens.

Signed-off-by: William Woodruff <william@trailofbits.com>
Signed-off-by: William Woodruff <william@trailofbits.com>
@jku
Copy link
Member

jku commented Nov 18, 2022

  • I'm not yet sure if the "higher level" policies (AnyOf, AllOf) belong in the API (or at least in this PR)... but I'll not judge yet, maybe I've not seen the big picture yet

Yeah, I threw these in as a "proof of concept" for some of the more powerful things someone could do as a consumer of this new policy API. IMO including them makes sense, since they're relatively simple and reflect use cases we'd expect users to have (identity claims + workflow-specific claims, for example).

Got it. Just to confirm, is this roughly how you might e.g. verify a sigstore-python release:

class GitHubReleasePolicy(AllOf):
    def __init__(self, sha: str, tag: str, repo: str, workflow: str):
        super().__init__([
            Identity(
                identity=f"{workflow}@refs/tags/{tag}
                issuer="https://token.actions.githubusercontent.com",
            ),
            GitHubWorkflowSHA(sha),
            GitHubWorkflowRepository(repo),
        ])

# release policy example for sigstore-python 0.7.0
rel_policy = GitHubReleasePolicy(
    "5410427e89653fe4344f72506d057f2872683f4c"
    "v0.7.0"
    "sigstore/sigstore-python"
    "https://github.com/sigstore/sigstore-python/.github/workflows/release.yml"
)

@woodruffw
Copy link
Member Author

Got it. Just to confirm, is this roughly how you might e.g. verify a sigstore-python release:

Yeah, something like that! I was imagining people would use AllOf(...) explicitly, so something like a DSL:

policy = AllOf([
    Identity(...),
    GitHubWorkflowSHA(...),
    GitHubWorkflowRepository(...),
])

verify(..., policy)

or (and maybe this is too magical) having default __and__ and __or__ implementations on VerificationPolicy that compose policies into AllOf and AnyOf implicitly. Then, someone could do something like this:

verify(
    ...,
    Identity(...) & GitHubWorkflowSHA(...) & GitHubWorkflowRepository(...),
)

Signed-off-by: William Woodruff <william@trailofbits.com>
@jku
Copy link
Member

jku commented Nov 20, 2022

Yeah, something like that! I was imagining people would use AllOf(...) explicitly, so something like a DSL

That's how I wrote the example code originally, and it ended up looking complicated and easy to mess up (because forming the identity string with code is non-trivial from the actual inputs ). "identity" is a nicely defined concept in sigstore/OIDC, but not in the world of "I'm just trying to verify a release X of project Y".

PR looks good me though, thanks.

Copy link
Collaborator

@tetsuo-cpp tetsuo-cpp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

@woodruffw
Copy link
Member Author

That's how I wrote the example code originally, and it ended up looking complicated and easy to mess up (because forming the identity string with code is non-trivial from the actual inputs ). "identity" is a nicely defined concept in sigstore/OIDC, but not in the world of "I'm just trying to verify a release X of project Y".

Yeah, this is a great point! IMO this is a good reason to pursue a "maximalist" approach to our built-in policy support -- I'd be very okay with us including a GitHubReleasePolicy similar to the one you specified.

@woodruffw woodruffw merged commit 7c95e8f into main Nov 21, 2022
@woodruffw woodruffw deleted the ww/verify-api-refactor branch November 21, 2022 17:39
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
component:verification Core verification functionality refactoring Refactoring tasks.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Chore: Refactor how we handle verification materials
3 participants