fix(ci): keep generated SBOMs out of the crate/PyPI publish path (+ CI pin)#50

Merged
project-navi-bot merged 11 commits into main from fix/sbom-excluded-from-pypi-upload on May 25, 2026

@Fieldnote-Echo
Owner

Follow-up to the supply-chain hardening (#42/#43): a Codex stop-gate caught that the SBOM step pollutes the publish path — and verification showed it breaks both release pipelines. Because the release workflows are workflow_dispatch-only, this never ran in normal CI and would only have surfaced at a manual release ("silent failure past the CI gates").

The bug (both publish paths)

  • PyPI (release-python.yml): the publish job's download-artifact step (merge-multiple) pulls every artifact into dist/, including sbom-python (.cdx.json). pypa/gh-action-pypi-publish then feeds dist/ to twine, which rejects the stray .cdx.json.
  • crate (release-crate.yml): cargo cyclonedx writes ordvec.cdx.json at the repo root. Verified locally — that untracked file makes cargo publish refuse the dirty tree (error: ... not yet committed), and would otherwise bundle the SBOM into the .crate.

Fixes

  • PyPI: find dist -name *.cdx.json -delete after attestation, before the pypa upload (the SBOM stays the sbom-python build artifact).
  • crate: gitignore *.cdx.json so that cargo publish sees a clean tree and never bundles it. Re-verified: cargo publish --dry-run now exits 0 with the SBOM present.
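
As a rough illustration of the PyPI-side cleanup, the sketch below wraps the delete in a shell function. The name `strip_sboms` is hypothetical and not taken from the workflow; the glob is quoted here on the assumption that `find`, not the shell, should expand it.

```shell
# Sketch only: strip_sboms is a hypothetical helper, not the actual workflow step.
strip_sboms() {
  local dir="$1"
  # Quote the pattern so the shell hands it to find unexpanded;
  # wheels and sdists in the directory are left untouched.
  find "$dir" -name '*.cdx.json' -delete
}
```

Called as `strip_sboms dist` after the artifacts are downloaded, this would leave only the distributions for the upload step to see.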

Pinned in CI (so it can't recur silently)

New tests/release_publish_invariants.sh, run by a release-guard job on every push/PR (the release workflows themselves are dispatch-only):

  1. both *.cdx.json SBOM paths are gitignored (catches the crate regression);
  2. the PyPI publish job strips *.cdx.json from dist/ before the pypa upload (catches the PyPI regression).
    Fast (git + grep, no toolchain). Verified it passes with the fixes and fails without them.
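
Invariant (1) might look roughly like the following. This is a sketch, not the contents of tests/release_publish_invariants.sh: the function name and failure message are assumptions, and the two paths are the SBOM outputs named in the review summary.

```shell
# Sketch only: check_sboms_ignored and its messages are illustrative.
check_sboms_ignored() {
  local repo="$1" path rc=0
  for path in ordvec.cdx.json ordvec-python/ordvec-python.cdx.json; do
    # git check-ignore -q exits 0 when an ignore rule matches the path;
    # the file does not need to exist for the check to work.
    if ! git -C "$repo" check-ignore -q "$path"; then
      echo "FAIL: $path is not gitignored" >&2
      rc=1
    fi
  done
  return "$rc"
}
```

Because it relies only on `git check-ignore`, a check of this shape needs no Rust or Python toolchain on the runner.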

Workflow-only + a test; publish stays HELD. Merges under strict protection (CI + 1 non-self code-owner approval).

Codex stop-gate: release-python's publish job download-artifact (merge-multiple) pulls sbom-python (.cdx.json) into dist/, which PyPI/twine reject — add 'find dist -name *.cdx.json -delete' after attestation, before the pypa publish (the SBOM stays the sbom-python build artifact).

Verified the analogous crate break locally: the generated, untracked ordvec.cdx.json makes 'cargo publish' refuse a dirty tree (error: 'not yet committed'). gitignore '*.cdx.json' so cargo sees a clean tree and never bundles the SBOM into the .crate (re-verified: dry-run now exits 0, no dirty error). SBOMs remain build artifacts + covered by the attestations.

The release-*.yml workflows are workflow_dispatch-only, so the SBOM-then-publish interaction (which broke both publish paths) never ran in push/PR CI — it would only have surfaced at manual release ('silent failure past the CI gates'). Add tests/release_publish_invariants.sh, run by a new ci.yml 'release-guard' job, asserting: (1) *.cdx.json SBOMs are gitignored (so cargo publish stays clean and never bundles them); (2) the PyPI publish job deletes *.cdx.json from dist/ BEFORE the pypa upload. A regression now fails on every push/PR. Verified: passes with the fixes in.

@qodo-code-review

Review Summary by Qodo

Fix SBOM artifacts breaking release pipelines + add CI guard

🐞 Bug fix 🧪 Tests

Walkthroughs

Description
• Fix SBOM artifacts breaking both PyPI and crate release pipelines
  - PyPI: delete *.cdx.json from dist/ before twine upload
  - Crate: gitignore *.cdx.json to keep tree clean for cargo publish
• Add CI guard job to pin SBOM publish invariants on every push/PR
  - Verify SBOMs are gitignored (prevents cargo dirty tree)
  - Verify PyPI cleanup runs before pypa upload step
• Prevent silent failures that only surfaced at manual release
Diagram
flowchart LR
  A["Generated SBOM artifacts"] --> B["Crate publish path"]
  A --> C["PyPI publish path"]
  B --> D["Gitignore *.cdx.json"]
  C --> E["Delete *.cdx.json from dist/"]
  D --> F["Clean tree for cargo publish"]
  E --> G["Twine accepts wheels + sdist"]
  H["release-guard CI job"] --> I["Verify gitignore invariant"]
  H --> J["Verify cleanup before upload"]
  I --> K["Catch regressions on every push/PR"]
  J --> K

File Changes

1. .github/workflows/ci.yml 🧪 Tests +18/-0

Add release-guard CI job for SBOM invariants

• Add new release-guard job to run SBOM publish invariant tests
• Job executes tests/release_publish_invariants.sh on every push/PR
• Uses hardened runner with audit egress policy
• Runs after existing CI checks to catch regressions early

.github/workflows/ci.yml


2. .github/workflows/release-python.yml 🐞 Bug fix +6/-0

Delete SBOM from PyPI upload directory

• Add cleanup step to remove *.cdx.json from dist/ directory
• Runs after attestation but before pypa publish upload
• Prevents twine rejection of stray SBOM files
• SBOM remains available as separate sbom-python artifact

.github/workflows/release-python.yml


3. tests/release_publish_invariants.sh 🧪 Tests +37/-0

Test script for SBOM publish invariants

• New test script pinning SBOM publish invariants in CI
• Verifies both ordvec.cdx.json and ordvec-python/ordvec-python.cdx.json are gitignored
• Confirms PyPI publish job deletes *.cdx.json before pypa upload step
• Ensures cleanup step runs before publish step (line number validation)

tests/release_publish_invariants.sh


@qodo-code-review

qodo-code-review Bot commented May 25, 2026

Code Review by Qodo

🐞 Bugs (0) 📘 Rule violations (0) 📎 Requirement gaps (0)


Remediation recommended

1. Guard not publish-scoped ✓ Resolved 🐞 Bug ≡ Correctness
Description
The invariant script searches the entire release workflow and uses the first match for each step, so
adding another actions/download-artifact earlier in the file could make the ordering check pass even
if the publish job regresses.
Code

tests/release_publish_invariants.sh[R38-46]

Evidence
step_line() returns only the first match in the entire workflow file via head -1, and the
subsequent dl_line/clean_line/pub_line variables rely on that value without restricting to the
publish job block.

tests/release_publish_invariants.sh[38-46]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
`tests/release_publish_invariants.sh` uses `head -1` over a whole-file grep, so it may select a `uses: actions/download-artifact` occurrence from a different job than `publish`, weakening the guard (it can become a false pass after workflow refactors).

## Issue Context
The intent is to pin invariants specifically for the PyPI **publish** job ordering: download -> delete SBOM -> publish.

## Fix Focus Areas
- tests/release_publish_invariants.sh[38-55]

### Suggested approach
1. Compute a line range for the `publish:` job in `.github/workflows/release-python.yml` (start at the `^  publish:` line; end at the next `^  [a-zA-Z0-9_-]+:` job key at the same indentation, or EOF).
2. Run all three `step_line` searches within that bounded range (e.g., using `sed -n "${start},${end}p"` piped to `grep -nE`, or `awk` to emit only that block).
3. Keep the existing comment-skipping behavior, but apply it within the extracted block.
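
One way to realise step 1 of this approach is a short awk filter. The sketch below is an assumption about how such scoping could be written, not the script's actual code; `publish_job_block` is a hypothetical name, and it presumes job keys sit at exactly two spaces of indentation under `jobs:`.

```shell
# Sketch only: publish_job_block is a hypothetical helper name.
publish_job_block() {
  # Print the publish job: from its key up to (but not including) the next
  # job key at the same two-space indentation, or to EOF.
  awk '
    /^  publish:/ { in_job = 1; print; next }
    in_job && /^  [A-Za-z0-9_-]+:/ { exit }
    in_job { print }
  ' "$1"
}
```

Piping the step searches through `publish_job_block .github/workflows/release-python.yml` would keep a download-artifact step in another job from satisfying the ordering check.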



2. Overbroad workflow permissions ✓ Resolved 🐞 Bug ⛨ Security
Description
The new release-guard job does not set an explicit permissions block, so it may run with broader
GITHUB_TOKEN permissions than necessary for a read-only git check-ignore/grep guard. This
increases blast radius if any step in the job is ever compromised.
Code

.github/workflows/ci.yml[R157-166]

Evidence
release-guard lacks any permissions declaration, while release-python.yml demonstrates
explicit permission scoping for its publish job, showing least-privilege is already practiced
in-repo.

.github/workflows/ci.yml[157-166]
.github/workflows/release-python.yml[174-185]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

### Issue description
`release-guard` runs a read-only shell script but does not declare minimal `permissions`, potentially granting more token access than required.

### Issue Context
Other workflows in the repo already scope permissions explicitly for sensitive jobs.

### Fix Focus Areas
- .github/workflows/ci.yml[157-166]

### Suggested fix
Add a job-level permissions block, e.g.:
```yaml
permissions:
  contents: read
```
(Adjust only if the script later needs additional scopes.)



3. Unpinned checkout action ✓ Resolved 🐞 Bug ⛨ Security
Description
The new release-guard job uses actions/checkout@v6 (a floating tag), which can change without
review and weakens the repo’s supply-chain pinning posture. This is inconsistent with other release
workflows that pin actions/checkout to an immutable commit SHA.
Code

.github/workflows/ci.yml[164]

Evidence
The new CI job uses a floating checkout tag, while an existing release workflow pins checkout to a
commit SHA, demonstrating the repo’s intended hardening approach.

.github/workflows/ci.yml[157-166]
.github/workflows/release-crate.yml[101-105]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

### Issue description
`.github/workflows/ci.yml` introduces a new job that runs `actions/checkout@v6` via a floating tag. This undermines supply-chain hardening because the executed code can change without a PR.

### Issue Context
Other workflows in this repo pin `actions/checkout` to a specific commit SHA, indicating a preferred standard.

### Fix Focus Areas
- .github/workflows/ci.yml[157-166]

### Suggested fix
Replace `actions/checkout@v6` with a commit-SHA-pinned reference (matching the version you intend to run), e.g. `actions/checkout@<sha> # vX.Y.Z`.




Advisory comments

4. Cleanup check is format-brittle ✓ Resolved 🐞 Bug ⚙ Maintainability
Description
The guard only recognizes SBOM deletion when the delete command appears on the same line as the YAML
run: key, so changing the cleanup step to a common run: | block would incorrectly fail CI
despite still deleting the SBOM.
Code

tests/release_publish_invariants.sh[R41-50]

Evidence
The cleanup detector regex anchors on run: and requires the delete command text on the same line,
while the workflow already demonstrates multi-line run: | usage as a standard style.

tests/release_publish_invariants.sh[41-50]
.github/workflows/release-python.yml[69-73]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
The regex for detecting the SBOM cleanup step only matches single-line `run: ... -delete` / `run: rm ...` forms. A common YAML refactor to `run: |` (multi-line) would break the guard even if the cleanup still happens.

## Issue Context
`release-python.yml` already uses multi-line `run: |` in other steps, so contributors may naturally apply the same formatting to the cleanup step.

## Fix Focus Areas
- tests/release_publish_invariants.sh[41-50]

### Suggested approach
Extend detection to accept either:
- single-line: `run: ...find...-delete` / `run: ...rm ...cdx.json`
- multi-line:
 - match `run:[[:space:]]*[|>]` and then search the following indented lines (until the next step key like `- name:` / `- uses:`) for `find ... cdx.json ... -delete` or `rm ... cdx.json`.

This can be done with a small `awk` state machine over the publish-step block (especially if you implement the publish-job scoping fix above).



@gemini-code-assist gemini-code-assist Bot left a comment

Code Review

This pull request adds *.cdx.json to .gitignore and introduces a new test script, tests/release_publish_invariants.sh, to verify that SBOM files are ignored and properly cleaned up before PyPI publication. Feedback highlights that the CI workflow changes required to execute this script are missing. Additionally, improvements were suggested for the script's robustness, including a file existence check and a more specific grep pattern for identifying the PyPI publish step.

Comment thread tests/release_publish_invariants.sh
Comment thread tests/release_publish_invariants.sh Outdated

Copilot AI left a comment

Pull request overview

Prevents CycloneDX SBOM generation/artifacts from breaking the manual release publish pipelines (crates.io + PyPI) by keeping *.cdx.json out of the publish inputs, and adds an always-on CI guard so the release-only regression can’t recur silently.

Changes:

  • Add *.cdx.json to .gitignore to keep cargo publish from failing due to a dirty tree / bundling SBOMs into the crate.
  • Delete *.cdx.json from dist/ in the PyPI publish job before pypa/gh-action-pypi-publish runs.
  • Add tests/release_publish_invariants.sh and a new release-guard CI job to enforce the above invariants on every push/PR.

Reviewed changes

Copilot reviewed 3 out of 4 changed files in this pull request and generated 1 comment.

| File | Description |
| --- | --- |
| tests/release_publish_invariants.sh | New invariant test to pin SBOM-related release constraints in push/PR CI. |
| .gitignore | Ignore generated CycloneDX SBOM outputs (*.cdx.json) so they don’t dirty the publish tree. |
| .github/workflows/release-python.yml | Remove SBOM JSON files from dist/ prior to the PyPI upload step. |
| .github/workflows/ci.yml | Add a release-guard job to run the invariant test on every push/PR. |

Comment thread tests/release_publish_invariants.sh Outdated

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: fad8038158

Comment thread tests/release_publish_invariants.sh Outdated
project-navi-bot and others added 5 commits May 25, 2026 11:23
…wnload->clean->publish order)

Part (2) of release_publish_invariants.sh used broad greps that could give a
false PASS. Tightened to address the PR review:
- key on the pinned pypa/gh-action-pypi-publish action, not the bare string
  pypi-publish, which could match a job name (gemini);
- anchor on uses:/run: step keys and skip YAML comment lines, so a stale comment
  cannot satisfy the check after the real step is removed (copilot/gemini);
- also assert the cleanup runs AFTER actions/download-artifact, not only before
  the upload — otherwise moving it ahead of the download makes it a no-op for the
  downloaded SBOM (codex).
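
The three tightenings above can be pictured with a sketch like the one below. The function names and regexes are assumptions rather than the script's actual text, and the cleanup pattern already requires a real delete. Like the script at this stage of the PR, the sketch searches the whole file and only recognises a single-line `run:` cleanup.

```shell
# Sketch only: names and patterns are illustrative, not copied from the script.
first_step_line() {
  # Line number of the first non-comment line matching the ERE in $2, or empty.
  grep -nE -e "$2" "$1" | grep -vE '^[0-9]+:[[:space:]]*#' | head -n 1 | cut -d: -f1
}

check_publish_order() {
  local wf="$1" dl clean pub
  dl="$(first_step_line "$wf" 'uses:[[:space:]]*actions/download-artifact')"
  clean="$(first_step_line "$wf" 'run:.*(find.*cdx\.json.*-delete|rm .*cdx\.json)')"
  pub="$(first_step_line "$wf" 'uses:[[:space:]]*pypa/gh-action-pypi-publish')"
  # All three steps must exist, in the order download < clean < publish.
  [ -n "$dl" ] && [ -n "$clean" ] && [ -n "$pub" ] || return 1
  [ "$dl" -lt "$clean" ] && [ "$clean" -lt "$pub" ]
}
```

Comparing line numbers is a cheap proxy for step order; it holds only while all three steps live in the same job.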

Verified: passes on the current workflow (download 190 < clean 209 < publish 211)
and fails on each regression (cleanup removed / reordered before download /
replaced by a comment).

The hardened part (2) pattern run:.*(find|rm).*cdx.json dropped the original's
-delete requirement, so a no-op 'find ... *.cdx.json' (no -delete, lists only)
would satisfy the cleanup check while leaving the SBOM in dist/ — reintroducing
the PyPI upload failure. Restore the deletion proof: require
'find ... *.cdx.json ... -delete' or 'rm ... *.cdx.json' (the original
semantics), keeping the comment-skip and uses:/run: step-key anchoring.

Verified: passes on the current workflow and now FAILS on a non-deleting find
(Codex stop-review case), alongside removed / reordered-before-download /
comment-only regressions.

Pin actions/checkout in the new release-guard job to
de0fac2e4500dabe0009e67214ff5f5447ce83dd (v6.0.2) — the immutable SHA already
used by the release-crate/release-python/codeql/audit workflows — instead of the
floating @v6 tag, matching the repo's pinned-action posture for this new job.

(The other 8 actions/checkout@v6 in ci.yml and 2 in python.yml are pre-existing
and out of scope for this SBOM PR; a full ci.yml/python.yml pin can follow.)

@Fieldnote-Echo
Owner Author

Re: Qodo review — release-guard job

1. Unpinned checkout (ci.yml:164) — fixed in 1e44a80. actions/checkout in the release-guard job is now pinned to de0fac2e4500dabe0009e67214ff5f5447ce83dd (v6.0.2), the same immutable SHA already used by the release-crate / release-python / codeql / audit workflows. (The other 8 actions/checkout@v6 in ci.yml and 2 in python.yml are pre-existing and not part of this PR diff; a full pin of those can follow separately.)

2. Overbroad permissions — ci.yml already sets a workflow-level permissions: contents: read (lines 23-24) that applies to every job, including release-guard. No job in ci.yml sets per-job permissions, so release-guard already runs with a read-only GITHUB_TOKEN — the minimal scope its git check-ignore / grep guard requires, and consistent with the rest of the file. Blast radius is already constrained to contents: read; an explicit per-job block would equal the inherited default. Can add one for explicitness if preferred.

@Fieldnote-Echo
Owner Author

/agentic_review

@qodo-code-review

qodo-code-review Bot commented May 25, 2026

Persistent review updated to latest commit 1e44a80

… (qodo)

Two robustness fixes to part (2) of release_publish_invariants.sh:

- Publish-scoped (qodo #1): step_line grepped the whole workflow with head -1,
  so a download-artifact in another job could satisfy the ordering even if the
  publish job regressed. Now extract the 'publish:' job body (its key to the next
  2-space-indented job key or EOF) and search only within it.

- Multi-line aware (qodo #4): the cleanup was anchored on a single-line 'run:',
  so a 'run: |' block would false-fail. Now match the delete command on its own
  line, so both 'run: ... -delete' and a multi-line 'run: |' block work — still
  requiring a real delete ('find ... -delete' or 'rm ... *.cdx.json').

Verified A-G: passes on the real workflow and multi-line run:| (F); fails on
removed / reordered-in-publish / comment-only / non-deleting-find; and is no
longer fooled by a decoy download-artifact in another job (G).

Add an explicit job-level 'permissions: contents: read' to release-guard. The
workflow already defaults to contents: read, but an explicit per-job block is
the repo convention for scoped jobs (release-python.yml's publish job does the
same) and constrains blast radius if the top-level default is ever widened. The
guard only runs git check-ignore + grep, so read-only is sufficient.

@Fieldnote-Echo
Owner Author

Re: Qodo review (round 2) — release-guard

Guard verified across A-G: correct workflow passes; removed / reordered / comment-only / non-deleting-find fail; multi-line run: | passes; a decoy download-artifact in another job does not fool the ordering.

Round-2 matched the delete command on ANY line in the publish job (to support
multi-line run:), which also accepted NON-executing text: a step name:, an
env:/with: value, or prose mentioning 'find ... -delete' would satisfy the guard
while nothing actually deletes the SBOM. Replace the bare match with a small awk
state machine that counts a delete only as a single-line 'run: ... -delete' or a
line inside that step's 'run: |'/'run: >' block.

Still publish-scoped, deletion-proof, and ordering-checked. Verified A-I: single-
line and multi-line run: pass; removed / reordered / comment-only / non-deleting
find / name-text / non-run-field-text all fail.

…s (Codex)

The run-context awk treated any line indented inside a run: |/> block as a
comment (via !is_comment) and exited the block early, so a shell comment before
the real delete (e.g. '# remove the SBOM' then 'find ... -delete') made the guard
miss a valid cleanup and false-fail. A # line inside a block scalar is literal
shell, not a YAML comment: block membership is now by indentation alone, and the
!is_comment gate applies only to whether a line counts as the delete command.

Verified A-K: a comment before the delete inside a block now passes (J); delete
text appearing only in a block comment still fails (K); all prior cases hold.
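
The run-context rules described in the last two commits could be sketched as a small awk state machine. This is an illustration under assumptions, not the merged script: `find_delete_line` is a hypothetical name, the delete patterns are simplified, and block membership is decided purely by indentation relative to the `run` key.

```shell
# Sketch only: find_delete_line and its patterns are illustrative.
# Prints the line number of the first executing delete; exits 1 if none.
find_delete_line() {
  awk '
    function is_delete(s) {
      return s ~ /find .*cdx\.json.* -delete/ || s ~ /rm .*cdx\.json/
    }
    {
      match($0, /^ */); ind = RLENGTH
      blank = ($0 ~ /^[[:space:]]*$/)
      comment = ($0 ~ /^[[:space:]]*#/)
      if (in_block) {
        # Inside a block scalar: membership is by indentation alone.
        if (blank || ind > run_col) {
          # A shell comment stays in the block but never counts as the delete.
          if (!blank && !comment && is_delete($0)) { print NR; found = 1; exit }
          next
        }
        in_block = 0
      }
      if (comment) next
      # "run: |" or "run: >" opens a block; remember the column of the key.
      if ($0 ~ /^[[:space:]]*(- )?run:[[:space:]]*[|>][+-]?[[:space:]]*$/) {
        match($0, /run:/); run_col = RSTART - 1; in_block = 1; next
      }
      # Single-line form: the delete must sit on the run: line itself.
      if ($0 ~ /^[[:space:]]*(- )?run:/ && is_delete($0)) { print NR; found = 1; exit }
    }
    END { exit !found }
  ' "$1"
}
```

Text in a `name:`, `env:` or `with:` field never reaches `is_delete` in this sketch, because only `run:` lines and the indented body of a `run:` block are examined.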
@project-navi-bot project-navi-bot merged commit b8e828e into main May 25, 2026
16 checks passed
@project-navi-bot project-navi-bot deleted the fix/sbom-excluded-from-pypi-upload branch May 25, 2026 17:44