Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: Add GitLab CI provenance #6375

Merged
merged 12 commits into from May 12, 2023
Merged

Conversation

wlynch
Copy link
Contributor

@wlynch wlynch commented Apr 19, 2023

This is a first pass at provenance generation for GitLab CI.

Note: I'm a noob when it comes to javascript, so feel free to call out any beginner mistakes / common practices 😅

This is based loosely off of existing GitLab provenance documents, with some differences:

https://about.gitlab.com/blog/2022/11/30/achieve-slsa-level-2-compliance-with-gitlab/ https://gist.github.com/wlynch/c7fd8f53adc77d3c0ec82356e4d43cb5

Currently this pulls values from environment variables. I'm aware we want to pull most data from authenticated JWTs for GitHub provenance, but I don't know what is in-flight so I am starting here for now, and marking as draft to get early feedback. @bdehamer happy to sync up and coordinate on changes you're making on the GitHub side.

We're doing things slightly out of order while we're waiting for things to land on the Sigstore side to recognize GitLab issued JWTs, so this has not been tested e2e in a real environment. Marking this as v1alpha1 until we have more confidence in the provenance spec + generation. If there's a better way to denote potential instability / experimental behavior let me know.
Worst case this should just error out when trying to fetch a Sigstore cert.

References

Fixes #6373

@wraithgar wraithgar self-assigned this Apr 19, 2023
Copy link
Contributor Author

@wlynch wlynch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the feedback! Updated.

Will move out of draft once I hear more about plans for using JWTs in provenance.

workspaces/libnpmpublish/lib/provenance.js Outdated Show resolved Hide resolved
workspaces/libnpmpublish/lib/provenance.js Outdated Show resolved Hide resolved
workspaces/libnpmpublish/lib/publish.js Outdated Show resolved Hide resolved
@wlynch wlynch requested review from ljharb and wraithgar April 20, 2023 22:00
predicateType: SLSA_PREDICATE_TYPE,
predicate: {
buildType: `${GITLAB_BUILD_TYPE_PREFIX}/${GITLAB_BUILD_TYPE_VERSION}`,
builder: { id: `${env.CI_PROJECT_URL}/-/runners/${env.CI_RUNNER_ID}` },
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking at the claims mapped here, do you have access to the CI_RUNNER_ENVIRONMENT here so we could use that instead of the id? Or is id the env? This would make it easier to write policies trust only the gitlab-hosted runner for example

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also should this be per project or just set to the server url runner to make it easier to have some kind of stable identifier for all gitlab.com hosted runners for example?

? env.CI_SERVER_URL/-/runners/${env. CI_RUNNER_ENVIRONMENT}

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking at https://github.com/in-toto/attestation/blob/v0.1.0/spec/predicates/provenance.md#fields, I don't think that the runner id is an appropriate value to use here.

The identity MUST reflect the trust base that consumers care about. How detailed to be is a judgement call. For example, GitHub Actions supports both GitHub-hosted runners and self-hosted runners. The GitHub-hosted runner might be a single identity because, it's all GitHub from the consumer's perspective. Meanwhile, each self-hosted runner might have its own identity because not all runners are trusted by all consumers.

The runner id identifies the specific machine, and we cycle through machines quite often. For gitlab-hosted runners, we probably want this to be a generic value that denotes the runner is hosted by GitLab. For self-hosted runners, we probably want some way to map this back to the instance, group, or project that is hosting the runner.

Copy link
Contributor Author

@wlynch wlynch Apr 24, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm generally in favor, but I think there's some desire to have this match the existing GitLab provenance document, which uses CI_RUNNER_ID (see https://gist.github.com/wlynch/c7fd8f53adc77d3c0ec82356e4d43cb5 for an example provenance doc I pulled that I used as the basis for this).

This is a common theme for why fields are the way they are for this provenance document.

I think my preference would to start with parity with the existing GitLab SLSA v0.2 spec, and I can sync with @Brcrwilliams @marshall007 and other GitLab folks to how we want this to change this moving forward.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The docs say builder.id defines the "trust base that consumers care about". I can't say too much about GitLab without more knowledge of that platform, but generally I think this should identify something like "GitLab hosted runner" or "GitLab custom runner". I don't think the runner ID will identify a significant difference in the "trust base".

/cc @MarkLodato @kpk47

However, wrt the format overall maybe this is best taken up with the GitLab Supply Chain Security working group of which I think @kpk47 is a member.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Keeping parity with the existing GitLab SLSA v0.2 spec makes sense to me. Sounds like it would also be useful to start an effort to figure out how the 1.0 spec should look for GitLab and get this included in the slsa org.

Before you can actually test this against the npm registry, we'll need to update the server side checks to accept gitlab provenance statements. There's currently a simple allow list of issuers to allow us to gradually roll out support, as well as checks to make sure the extensions in the signing cert matches what's in the provenance statement.

@wlynch to aid with registry integration, could you include a complete .sigstore bundle files for GitLab? It's a bit annoying getting the sigstore bundle at the moment, but you could modify this line and write the file to disc during a pipeline run:

const serializedBundle = JSON.stringify(provenanceBundle)

Ideally we would also have the fulcio certificate updated to include all the new gitlab claims but can start without this.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

However, wrt the format overall maybe this is best taken up with the GitLab Supply Chain Security working group

I already brought it up in the WG. 🙂 There's general interest in working towards improving the existing SLSA v0.2 spec + moving to v1.0. We can follow up there.

@wlynch to aid with registry integration, could you include a complete .sigstore bundle files for GitLab?

I'll do this as soon as sigstore/fulcio#983 goes live!

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One additional problem of using builder: { id: ${env.CI_PROJECT_URL}/-/runners/${env.CI_RUNNER_ID}} as builder.id is that every project has a different builder. This kinda defeats the purpose of SLSA which scales by having a small set of builders, which you can then use for policy enforcement during verification. The builder.id, as written in this PR, would need to be generated dynamically for verification, even though the builders are technically the same (for GitLab hosted, I mean). See a longer discussion in slsa-framework/slsa#655.

@feelepxyz
Copy link
Member

@wlynch this is very exciting! Finally had some time to review this. I think we should also give the SLSA team some time to chime in here.

Part of me also wonders if we should just got to SLSA 1.0 for GitLab from the get go and write up a spec for this in the official repo like was done for GHA. This would simplify our lives a bit on the npm registry as we would only need support 3 different predicate types for GitLab when rendering the UI (1.0) + GitHub (SLSA 0.2 + 1.0).

Going to figure out if we can add support for SLSA 1.0 in the next few weeks, maybe up to a month.

@wraithgar wraithgar changed the base branch from latest to provenance April 24, 2023 15:45
@wraithgar
Copy link
Member

wraithgar commented Apr 24, 2023

I updated this PR to be a merge request against the provenance branch. This is what was used during beta testing of the GitHub provenance generation and I am hoping it can prove useful to GitLab's testing.

Once this PR lands folks can npm install -g npm/cli#provenance and get the code as y'all are helping to refine it. This is a way folks can test e2e in a real environment.

This will also lower the barrier to what is "acceptable" to land. We don't have to land the 100% finished product all at once, and we can have a little more wiggle room in case we need to make "breaking" changes in the branch before then.

@wraithgar wraithgar changed the title libnpmpublish: Add GitLab CI provenance. feat: Add GitLab CI provenance. Apr 24, 2023
@wlynch
Copy link
Contributor Author

wlynch commented Apr 24, 2023

SGTM! Happy to move this out of draft then. 🙂

@wlynch wlynch marked this pull request as ready for review April 24, 2023 16:26
@wlynch wlynch requested a review from a team as a code owner April 24, 2023 16:26
@wraithgar wraithgar changed the title feat: Add GitLab CI provenance. feat: Add GitLab CI provenance Apr 24, 2023
@wraithgar wraithgar changed the base branch from provenance to latest April 24, 2023 19:57
@wraithgar wraithgar changed the base branch from latest to provenance April 24, 2023 19:57
CI_PIPELINE_ID: env.CI_PIPELINE_ID,
CI_PIPELINE_IID: env.CI_PIPELINE_IID,
CI_PIPELINE_SOURCE: env.CI_PIPELINE_SOURCE,
CI_PIPELINE_URL: env.CI_PIPELINE_URL,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@wlynch could we add pipeline_ref and pipeline_sha that are now in the claims? Or are these in here but with different names?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah I see these are still in progress: https://gitlab.com/gitlab-org/gitlab/-/merge_requests/117923

Copy link
Contributor Author

@wlynch wlynch May 8, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

pipeline_ref is populated in environent.pipeline.ref via CI_CONFIG_PATH.

pipeline_sha is WIP on the GitLab side and may not always be populated (i.e. if being fetched from remote sources) - I've raised with GitLab folks. The safer thing to key off of for provenance checking is pipeline id / job id. I don't think this is blocking for this PR.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@wlynch pipeline_ref is actually a URI that links to the pipeline configuration. The name is easy to confuse with a git ref.

@marshall007 @aladh Maybe we should call it something else like pipeline_config_uri and pipeline_config_sha.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@wlynch pipeline_ref is actually a URI that links to the pipeline configuration. The name is easy to confuse with a git ref.

@marshall007 @aladh Maybe we should call it something else like pipeline_config_uri and pipeline_config_sha.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's what we want. 🙂

What we're ultimately looking for is a hook to the Fulcio cert Build Config URI to match for verification.

No objections to renaming though.

Copy link
Member

@feelepxyz feelepxyz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is looking good 👍 would be good to get the final changes from the claims to add values for pipeline_ref and pipeline_sha before we merge this.

@wlynch wlynch requested a review from feelepxyz May 8, 2023 17:05
CI_BUILD_REF doesn't actually exist, CI_REPOSITORY_URL contains an
access token.
These values are a bit too much PII, so remove for now and only include
GITLAB_USER_ID and GITLAB_USER_LOGIN.
@wlynch
Copy link
Contributor Author

wlynch commented May 11, 2023

Sample provenance from Sigstore staging: https://gist.github.com/wlynch/42e89527d51bc72a61279f0c7f3be1cd

GitLab support landed in Fulcio prod today, so you should be able to test this out in prod as well!

@wlynch wlynch requested a review from feelepxyz May 12, 2023 14:59
@wlynch
Copy link
Contributor Author

wlynch commented May 12, 2023

@wlynch wlynch requested a review from feelepxyz May 12, 2023 15:16
Copy link
Member

@feelepxyz feelepxyz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! I'm 👍 on merging this into the provenance branch.

@wraithgar would you like to do the honours?

@wraithgar wraithgar merged commit c85352e into npm:provenance May 12, 2023
20 checks passed
@wraithgar
Copy link
Member

Yay! Thanks @wlynch for seeing this through. Thanks @feelepxyz for reviewing it.

Installing this version of npm should be as simple as: npm install -g npm/cli#provenance

@wraithgar
Copy link
Member

@wlynch Can we get a PR started to the docs to update https://docs.npmjs.com/generating-provenance-statements to include what the gitlab instructions should be? We can wait to land it till we also land the provenance branch but let's not forget the docs.

https://github.com/npm/documentation/blob/e52916f65fc640287f269d0268b8f77c5e41b33d/content/packages-and-modules/securing-your-code/generating-provenance-statements.mdx

@SiaraMist
Copy link

SiaraMist commented May 12, 2023

👋 If it would be helpful for a tech writer to work on the docs for this, I wrote the existing npm build provenance docs and am happy to add GitLab content too if there is detailed source material. If you would like me to add the updates, just let me know here or in Slack and I can get that process started. (No worries if you just want to make the updates yourself though.)

wraithgar pushed a commit that referenced this pull request May 18, 2023
This is a first pass at provenance generation for GitLab CI.

This is based loosely off of existing GitLab provenance documents:
https://about.gitlab.com/blog/2022/11/30/achieve-slsa-level-2-compliance-with-gitlab/
https://gist.github.com/wlynch/c7fd8f53adc77d3c0ec82356e4d43cb5
wraithgar pushed a commit that referenced this pull request Jun 6, 2023
This is a first pass at provenance generation for GitLab CI.

This is based loosely off of existing GitLab provenance documents:
https://about.gitlab.com/blog/2022/11/30/achieve-slsa-level-2-compliance-with-gitlab/
https://gist.github.com/wlynch/c7fd8f53adc77d3c0ec82356e4d43cb5
wraithgar pushed a commit that referenced this pull request Jun 6, 2023
This is a first pass at provenance generation for GitLab CI.

This is based loosely off of existing GitLab provenance documents:
https://about.gitlab.com/blog/2022/11/30/achieve-slsa-level-2-compliance-with-gitlab/
https://gist.github.com/wlynch/c7fd8f53adc77d3c0ec82356e4d43cb5
wraithgar added a commit that referenced this pull request Jun 13, 2023
This is a first pass at provenance generation for GitLab CI.

This is based loosely off of existing GitLab provenance documents:
https://about.gitlab.com/blog/2022/11/30/achieve-slsa-level-2-compliance-with-gitlab/
https://gist.github.com/wlynch/c7fd8f53adc77d3c0ec82356e4d43cb5

Co-authored-by: Billy Lynch <1844673+wlynch@users.noreply.github.com>
@github-actions github-actions bot mentioned this pull request Jun 7, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

(libnpmpublish) GitLab CI provenance
8 participants