fix(ci): keep generated SBOMs out of the crate/PyPI publish path (+ CI pin)#50
Conversation
Codex stop-gate: release-python's publish job download-artifact (merge-multiple) pulls sbom-python (.cdx.json) into dist/, which PyPI/twine reject — add 'find dist -name *.cdx.json -delete' after attestation, before the pypa publish (the SBOM stays the sbom-python build artifact). Verified the analogous crate break locally: the generated, untracked ordvec.cdx.json makes 'cargo publish' refuse a dirty tree (error: 'not yet committed'). gitignore '*.cdx.json' so cargo sees a clean tree and never bundles the SBOM into the .crate (re-verified: dry-run now exits 0, no dirty error). SBOMs remain build artifacts + covered by the attestations.
The release-*.yml workflows are workflow_dispatch-only, so the SBOM-then-publish interaction (which broke both publish paths) never ran in push/PR CI — it would only have surfaced at manual release ('silent failure past the CI gates'). Add tests/release_publish_invariants.sh, run by a new ci.yml 'release-guard' job, asserting: (1) *.cdx.json SBOMs are gitignored (so cargo publish stays clean and never bundles them); (2) the PyPI publish job deletes *.cdx.json from dist/ BEFORE the pypa upload. A regression now fails on every push/PR. Verified: passes with the fixes in.
Review Summary by QodoFix SBOM artifacts breaking release pipelines + add CI guard
WalkthroughsDescription• Fix SBOM artifacts breaking both PyPI and crate release pipelines - PyPI: delete *.cdx.json from dist/ before twine upload - Crate: gitignore *.cdx.json to keep tree clean for cargo publish • Add CI guard job to pin SBOM publish invariants on every push/PR - Verify SBOMs are gitignored (prevents cargo dirty tree) - Verify PyPI cleanup runs before pypa upload step • Prevent silent failures that only surfaced at manual release Diagramflowchart LR
A["Generated SBOM artifacts"] --> B["Crate publish path"]
A --> C["PyPI publish path"]
B --> D["Gitignore *.cdx.json"]
C --> E["Delete *.cdx.json from dist/"]
D --> F["Clean tree for cargo publish"]
E --> G["Twine accepts wheels + sdist"]
H["release-guard CI job"] --> I["Verify gitignore invariant"]
H --> J["Verify cleanup before upload"]
I --> K["Catch regressions on every push/PR"]
J --> K
File Changes1. .github/workflows/ci.yml
|
Code Review by Qodo
1.
|
There was a problem hiding this comment.
Code Review
This pull request adds *.cdx.json to .gitignore and introduces a new test script, tests/release_publish_invariants.sh, to verify that SBOM files are ignored and properly cleaned up before PyPI publication. Feedback highlights that the CI workflow changes required to execute this script are missing. Additionally, improvements were suggested for the script's robustness, including a file existence check and a more specific grep pattern for identifying the PyPI publish step.
There was a problem hiding this comment.
Pull request overview
Prevents CycloneDX SBOM generation/artifacts from breaking the manual release publish pipelines (crates.io + PyPI) by keeping *.cdx.json out of the publish inputs, and adds an always-on CI guard so the release-only regression can’t recur silently.
Changes:
- Add
*.cdx.jsonto.gitignoreto keepcargo publishfrom failing due to a dirty tree / bundling SBOMs into the crate. - Delete
*.cdx.jsonfromdist/in the PyPI publish job beforepypa/gh-action-pypi-publishruns. - Add
tests/release_publish_invariants.shand a newrelease-guardCI job to enforce the above invariants on every push/PR.
Reviewed changes
Copilot reviewed 3 out of 4 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
tests/release_publish_invariants.sh |
New invariant test to pin SBOM-related release constraints in push/PR CI. |
.gitignore |
Ignore generated CycloneDX SBOM outputs (*.cdx.json) so they don’t dirty the publish tree. |
.github/workflows/release-python.yml |
Remove SBOM JSON files from dist/ prior to the PyPI upload step. |
.github/workflows/ci.yml |
Add a release-guard job to run the invariant test on every push/PR. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: fad8038158
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
…wnload->clean->publish order) Part (2) of release_publish_invariants.sh used broad greps that could give a false PASS. Tightened to address the PR review: - key on the pinned pypa/gh-action-pypi-publish action, not the bare string pypi-publish, which could match a job name (gemini); - anchor on uses:/run: step keys and skip YAML comment lines, so a stale comment cannot satisfy the check after the real step is removed (copilot/gemini); - also assert the cleanup runs AFTER actions/download-artifact, not only before the upload — otherwise moving it ahead of the download makes it a no-op for the downloaded SBOM (codex). Verified: passes on the current workflow (download 190 < clean 209 < publish 211) and fails on each regression (cleanup removed / reordered before download / replaced by a comment).
The hardened part (2) pattern run:.*(find|rm).*cdx.json dropped the original's -delete requirement, so a no-op 'find ... *.cdx.json' (no -delete, lists only) would satisfy the cleanup check while leaving the SBOM in dist/ — reintroducing the PyPI upload failure. Restore the deletion proof: require 'find ... *.cdx.json ... -delete' or 'rm ... *.cdx.json' (the original semantics), keeping the comment-skip and uses:/run: step-key anchoring. Verified: passes on the current workflow and now FAILS on a non-deleting find (Codex stop-review case), alongside removed / reordered-before-download / comment-only regressions.
Pin actions/checkout in the new release-guard job to de0fac2e4500dabe0009e67214ff5f5447ce83dd (v6.0.2) — the immutable SHA already used by the release-crate/release-python/codeql/audit workflows — instead of the floating @v6 tag, matching the repo's pinned-action posture for this new job. (The other 8 actions/checkout@v6 in ci.yml and 2 in python.yml are pre-existing and out of scope for this SBOM PR; a full ci.yml/python.yml pin can follow.)
Re: Qodo review — release-guard job1. Unpinned checkout (ci.yml:164) — fixed in 1e44a80. 2. Overbroad permissions — ci.yml already sets a workflow-level |
|
/agentic_review |
|
Persistent review updated to latest commit 1e44a80 |
… (qodo) Two robustness fixes to part (2) of release_publish_invariants.sh: - Publish-scoped (qodo #1): step_line grepped the whole workflow with head -1, so a download-artifact in another job could satisfy the ordering even if the publish job regressed. Now extract the 'publish:' job body (its key to the next 2-space-indented job key or EOF) and search only within it. - Multi-line aware (qodo #4): the cleanup was anchored on a single-line 'run:', so a 'run: |' block would false-fail. Now match the delete command on its own line, so both 'run: ... -delete' and a multi-line 'run: |' block work — still requiring a real delete ('find ... -delete' or 'rm ... *.cdx.json'). Verified A-G: passes on the real workflow and multi-line run:| (F); fails on removed / reordered-in-publish / comment-only / non-deleting-find; and is no longer fooled by a decoy download-artifact in another job (G).
Add an explicit job-level 'permissions: contents: read' to release-guard. The workflow already defaults to contents: read, but an explicit per-job block is the repo convention for scoped jobs (release-python.yml's publish job does the same) and constrains blast radius if the top-level default is ever widened. The guard only runs git check-ignore + grep, so read-only is sufficient.
Re: Qodo review (round 2) — release-guard
Guard verified across A-G: correct workflow passes; removed / reordered / comment-only / non-deleting-find fail; multi-line |
Round-2 matched the delete command on ANY line in the publish job (to support multi-line run:), which also accepted NON-executing text: a step name:, an env:/with: value, or prose mentioning 'find ... -delete' would satisfy the guard while nothing actually deletes the SBOM. Replace the bare match with a small awk state machine that counts a delete only as a single-line 'run: ... -delete' or a line inside that step's 'run: |'/'run: >' block. Still publish-scoped, deletion-proof, and ordering-checked. Verified A-I: single- line and multi-line run: pass; removed / reordered / comment-only / non-deleting find / name-text / non-run-field-text all fail.
…s (Codex) The run-context awk treated any line indented inside a run: |/> block as a comment (via !is_comment) and exited the block early, so a shell comment before the real delete (e.g. '# remove the SBOM' then 'find ... -delete') made the guard miss a valid cleanup and false-fail. A # line inside a block scalar is literal shell, not a YAML comment: block membership is now by indentation alone, and the !is_comment gate applies only to whether a line counts as the delete command. Verified A-K: a comment before the delete inside a block now passes (J); delete text appearing only in a block comment still fails (K); all prior cases hold.
Follow-up to the supply-chain hardening (#42/#43): a Codex stop-gate caught that the SBOM step pollutes the publish path — and verification showed it breaks both release pipelines. Because the release workflows are
workflow_dispatch-only, this never ran in normal CI and would only have surfaced at a manual release ("silent failure past the CI gates").The bug (both publish paths)
release-python.yml): the publish jobsdownload-artifact(merge-multiple) pulls every artifact intodist/— includingsbom-python(.cdx.json).pypa/gh-action-pypi-publishthen feedsdist/to twine, which rejects a stray.cdx.json.release-crate.yml):cargo cyclonedxwritesordvec.cdx.jsonat the repo root. Verified locally — that untracked file makescargo publishrefuse the dirty tree (error: ... not yet committed), and would otherwise bundle the SBOM into the.crate.Fixes
find dist -name *.cdx.json -deleteafter attestation, before the pypa upload (the SBOM stays thesbom-pythonbuild artifact).*.cdx.json→cargo publishsees a clean tree and never bundles it. Re-verified:cargo publish --dry-runnow exits 0 with the SBOM present.Pinned in CI (so it cant recur silently)
New
tests/release_publish_invariants.sh, run by arelease-guardjob on every push/PR (the release workflows themselves are dispatch-only):*.cdx.jsonSBOM paths are gitignored (catches the crate regression);*.cdx.jsonfromdist/before the pypa upload (catches the PyPI regression).Fast (git + grep, no toolchain). Verified it passes with the fixes and fails without them.
Workflow-only + a test; publish stays HELD. Merges under strict protection (CI + 1 non-self code-owner approval).