Fix(ci): Surface Grype findings on audit failure #259
Pull request overview
Updates the CI SBOM vulnerability auditing so Grype failures surface actionable CVE/package details directly in failing logs and in the job step summary, improving debuggability for both PR CI and release CI workflows.
Changes:
- Split SBOM generation from Grype auditing into separate `sbom` and `grype` jobs, with artifact handoff between them.
- Replace multiple `anchore/scan-action` runs with a single Grype invocation that emits SARIF/JSON/table outputs at once and produces a single annotation.
- Render Grype findings in `$GITHUB_STEP_SUMMARY` as a Markdown table derived from `grype-results.json` via `jq`.
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| .github/workflows/build-test.yaml | Splits SBOM vs Grype into separate jobs and adds JSON-driven Markdown summary + consolidated Grype run. |
| .github/workflows/build-test-release.yaml | Mirrors the PR workflow improvements for releases and adds grype as a publishing gate. |
ed174af to 7deca0d
Pull request overview
This PR improves CI ergonomics for SBOM vulnerability auditing by making Grype failures actionable (CVE/package details visible in the failing step log) and by rendering Grype findings as a readable Markdown table in the job step summary. It also clarifies responsibility for failures by separating SBOM generation from the Grype audit into distinct jobs.
Changes:
- Split SBOM generation and Grype auditing into separate `sbom` and `grype` jobs so failures are attributed correctly.
- Replace multiple `anchore/scan-action` invocations with a single Grype run emitting SARIF/JSON/table outputs, with one authoritative annotation.
- Render Grype findings into `$GITHUB_STEP_SUMMARY` as a Markdown table derived from JSON via `jq` (with pipe escaping and numeric formatting), and warn on missing expected artifacts.
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated no comments.
| File | Description |
|---|---|
| .github/workflows/build-test.yaml | Separates SBOM + Grype into dedicated jobs; runs Grype once and surfaces findings in logs + Markdown step summary; uploads Grype outputs (SARIF/JSON/table). |
| .github/workflows/build-test-release.yaml | Mirrors the PR-CI workflow improvements for release CI; ensures publishing is gated on the Grype audit job and outputs are surfaced consistently. |
Separate 'Generate SBOM' and 'Grype Audit SBOM' into two jobs so that a failing vulnerability scan is attributed to the audit job rather than to SBOM generation. The new 'grype' job depends on 'sbom' and consumes the uploaded sbom-files artefact.
Reorder the Grype steps within the audit job so the Text/Table scan runs last and owns the pass/fail verdict. Its step log contains the human-readable CVE list, so when the step turns red in the PR Checks UI, following the link to the workflow surfaces the offending packages and CVEs directly. The SARIF scan still runs (with fail-build disabled) purely to produce the artefact uploaded alongside the text results.
The NO_BLOCK_AUDIT_FAIL repository variable override is preserved by computing fail-build dynamically on the Table step, removing the need for a separate 'Check Grype scan results' gate.
Co-authored-by: Claude <claude@anthropic.com>
Signed-off-by: Matthew Watkins <mwatkins@linuxfoundation.org>
Add a JSON-format Grype scan alongside the existing SARIF and Table scans, include grype-results.json in the uploaded grype-scan-results artefact, and use the JSON output to render a proper Markdown table in the job step summary.
Parsing structured JSON with jq avoids the alignment and cell delimiter hazards that a naive split of the pre-formatted Grype table text would introduce; any literal pipe characters in cell content are escaped. EPSS probabilities are formatted as percentages to two decimal places (with a '<0.01%' threshold for very small non-zero values) and risk scores are rounded to two decimals to avoid IEEE-754 float representation artefacts.
Co-authored-by: Claude <claude@anthropic.com>
Signed-off-by: Matthew Watkins <mwatkins@linuxfoundation.org>
Round 1 fixes:
- round2: use jq built-in 'round' for true half-up rounding instead of truncation-via-floor, matching the PR description and the function's documented behaviour. (ref: #1 discussion r3111955943)
- Upload Grype scan results: switch if-no-files-found from 'ignore' to 'warn' so missing expected outputs (e.g. from an earlier scan-action failure) surface in the logs instead of being silently tolerated. (ref: #1 discussion r3111955992)
- Grype Table step comment: reword 'PR Checks UI' to the generic 'workflow run's Actions/Checks UI' since this workflow is triggered by tag pushes rather than pull requests. (ref: #1 discussion r3111956019)
Round 2 fixes:
- Grype SARIF step comment: drop the misleading 'code scanning ingestion' phrasing. The SARIF artefact is uploaded but not consumed by github/codeql-action/upload-sarif in this workflow, so the comment was implying behaviour that does not exist. (ref: #1 discussion r3112034310)
- Download SBOM artefact: standardise on actions/download-artifact v8.0.1 (SHA 3e5f45b...) to match the other invocation later in the same workflow, avoiding mixed major versions. (ref: #1 discussion r3112034212)
- test-pypi: add 'grype' to the needs list so the Grype audit gates publishing. Previously publishing could run in parallel with the audit, undermining the 'block releases on vulnerabilities unless NO_BLOCK_AUDIT_FAIL=true' behaviour. Note: the audit job's own fail-build propagation already honours the override, so test-pypi will still run when NO_BLOCK_AUDIT_FAIL is 'true'. (ref: #1 discussion r3112034266)
Co-authored-by: Claude <claude@anthropic.com>
Signed-off-by: Matthew Watkins <mwatkins@linuxfoundation.org>
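The difference behind the round2 fix can be illustrated with a small Python sketch. This is not the workflow's jq code; both function names are hypothetical, and the sketch assumes non-negative risk scores.

```python
import math


def round2_truncating(value: float) -> float:
    """Truncation-via-floor: scale, floor, unscale (drops the remainder)."""
    return math.floor(value * 100) / 100


def round2_half_up(value: float) -> float:
    """Half-up rounding for non-negative inputs: add 0.5 before flooring."""
    return math.floor(value * 100 + 0.5) / 100


if __name__ == "__main__":
    score = 0.129  # hypothetical risk score
    print(round2_truncating(score), round2_half_up(score))  # 0.12 0.13
```

A value such as 0.129 shows the gap: truncation reports 0.12, while half-up rounding reports 0.13.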
Mirror the refactor from build-test-release.yaml in the PR build-test workflow: separate 'Generate SBOM' and 'Grype Audit SBOM' into two jobs so that a failing vulnerability scan is attributed to the audit job rather than to SBOM generation. The new 'grype' job depends on 'sbom' and consumes the uploaded sbom-files artefact.
Reorder the Grype steps within the audit job so the Text/Table scan runs last and owns the pass/fail verdict. Its step log contains the human-readable CVE list, so when the step turns red in the PR Checks UI, following the link to the workflow surfaces the offending packages and CVEs directly. The SARIF scan still runs (with fail-build disabled) purely to produce the artefact uploaded alongside the text results.
The NO_BLOCK_AUDIT_FAIL repository variable override is preserved by computing fail-build dynamically on the Table step, removing the need for a separate 'Check Grype scan results' gate.
Co-authored-by: Claude <claude@anthropic.com>
Signed-off-by: Matthew Watkins <mwatkins@linuxfoundation.org>
Mirror the summary improvement from build-test-release.yaml in the PR build-test workflow: add a JSON-format Grype scan alongside the existing SARIF and Table scans, include grype-results.json in the uploaded grype-scan-results artefact, and use the JSON output to render a proper Markdown table in the job step summary.
Parsing structured JSON with jq avoids the alignment and cell delimiter hazards that a naive split of the pre-formatted Grype table text would introduce; any literal pipe characters in cell content are escaped. EPSS probabilities are formatted as percentages to two decimal places (with a '<0.01%' threshold for very small non-zero values) and risk scores are rounded to two decimals to avoid IEEE-754 float representation artefacts.
Co-authored-by: Claude <claude@anthropic.com>
Signed-off-by: Matthew Watkins <mwatkins@linuxfoundation.org>
Wrap the SARIF and JSON Grype scan steps (both of which run with fail-build: false purely to produce artefacts) in a ::stop-commands:: / ::<token>:: pair so that the runner does not lift their '::warning::Failed minimum severity level...' output into job annotations. The final Text/Table scan remains outside the pair and is therefore the sole source of the job's single annotation; its step log also contains the human-readable CVE table, which is the output we actually want surfaced.
The stop-token is generated per run with 16 random bytes of hex to prevent a (malicious) transitive dependency from re-enabling workflow-command parsing early via a crafted log line. The resume step is marked 'if: always()' so a failure in either artefact-producing scan does not leave later steps in this or subsequent jobs running with workflow commands suppressed.
Applied to both build-test.yaml and build-test-release.yaml so the PR CI and the release CI behave consistently.
Co-authored-by: Claude <claude@anthropic.com>
Signed-off-by: Matthew Watkins <mwatkins@linuxfoundation.org>
The previous implementation generated a per-run stop-commands
token and passed it between steps via a $GITHUB_OUTPUT value.
The runner auto-masks values read from step outputs as '***' in
the log stream, so the actually-emitted directive was
'::stop-commands::***' rather than '::stop-commands::<token>'.
This both registered '***' as the end-token (the generic
redaction placeholder, which appears frequently elsewhere in
workflow logs) and left the downstream ::warning:: lines from
the SARIF and JSON scan steps unmuted, so the duplicate
annotations still surfaced on the job (verified on workflow run
24680188551).
Switch to a fixed literal token ('grype-quiet-annotations')
which avoids the masking entirely. The previously-cited threat
model (a hostile transitive dependency prematurely re-enabling
workflow commands by echoing the token) is negligible for this
workflow, and is explicitly called out in the inline comment so
a future reader understands why randomisation was dropped.
Co-authored-by: Claude <claude@anthropic.com>
Signed-off-by: Matthew Watkins <mwatkins@linuxfoundation.org>
7deca0d to e95a6a9
Replace the three scan-action invocations (SARIF / JSON / Table) with a single Grype run that emits all three output formats at once, driven by a shell step we fully own.
Why: each GitHub Actions step gets a fresh ActionCommandManager (see actions/runner Handler.cs using CreateService per handler instance), so the ::stop-commands:: / ::<token>:: pair we used to mute annotations from the artefact-only scan steps could not possibly persist across step boundaries. That was confirmed empirically on workflow run 24680548929: the stop-commands directive was emitted in the 'Mute' step's log but had no effect on the subsequent SARIF and JSON scan-action steps, which each continued to surface a '::warning::Failed minimum severity level...' duplicate annotation.
The single-step approach:
- Uses anchore/scan-action/download-grype to install Grype and (optionally) cache its vulnerability DB.
- Runs grype once with '-o sarif=... -o json=... -o table=...', so SARIF, JSON, and human-readable outputs are all produced in one invocation. No scan-action-emitted workflow commands.
- Cats grype-results.txt into the step log so the CVE table is the first thing visible when someone clicks through to a failing run.
- Evaluates grype's exit code (2 == vulnerabilities at or above --fail-on threshold) and emits exactly one ::error:: or ::warning:: based on the NO_BLOCK_AUDIT_FAIL override. That's the only annotation the job can produce now.
Grype's '--fail-on medium' preserves the previous severity cutoff (scan-action's default).
Co-authored-by: Claude <claude@anthropic.com>
Signed-off-by: Matthew Watkins <mwatkins@linuxfoundation.org>
e95a6a9 to 1b9f9ad
Pull request overview
This PR improves how Grype SBOM audit failures are surfaced in GitHub Actions by ensuring the failing check logs and step summaries include actionable CVE/package details, and by making failure attribution clearer in both PR CI and release CI workflows.
Changes:
- Split SBOM generation and Grype auditing into separate jobs so failures are attributed to the correct job.
- Run Grype once to emit SARIF/JSON/table outputs, printing the table directly into the failing step’s log and emitting a single annotation.
- Render Grype findings into a Markdown table in the job step summary using structured JSON + `jq`, and warn (not ignore) when expected artifacts are missing.
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated no comments.
| File | Description |
|---|---|
| .github/workflows/build-test.yaml | Separates SBOM generation from Grype audit and improves audit logging + Markdown summary output for PR CI. |
| .github/workflows/build-test-release.yaml | Mirrors the CI improvements for release CI and gates publishing on the Grype audit job. |
Summary
Improves the SBOM/Grype audit workflow so that, when a vulnerability
scan fails, the failing check surfaces the actual CVE/package
information directly in its log, and the job step summary presents the
findings as a proper Markdown table.
Applies to both `build-test.yaml` (PR CI) and `build-test-release.yaml` (tag-push release CI).
Problem
When the Grype audit of the SBOM failed on a PR, the step that produced useful console output (`Security scan with Grype (Text/Table)`) was marked as successful, while the other (`Security scan with Grype (SARIF)`) owned the failure but its log only contained the terse `Failed minimum severity level` message with no CVE context.
Additionally:
- The audit ran inside the job named `Generate SBOM`, so failures looked like the SBOM could not be generated, when in reality the generation succeeded and it was the audit of the SBOM that failed.
- Grype's pre-formatted text output was written (losing its column alignment) into `$GITHUB_STEP_SUMMARY`, producing a hard-to-read wall of text.
- The three `anchore/scan-action` invocations (SARIF, JSON, Text/Table) produced three duplicate `Failed minimum severity level…` job annotations, one per invocation, with none of them carrying the CVE context.
Changes
Split `sbom` into `sbom` + `grype` jobs
- The work is split into separate `Generate SBOM` and `Grype Audit SBOM` jobs, with the `grype` job depending on `sbom` and consuming the uploaded `sbom-files` artefact. Failures are now clearly attributed to whichever job actually failed.
- `grype` is added to the `test-pypi` job's `needs` list so vulnerability findings gate publishing (and transitively, the production `pypi` publish). The existing `NO_BLOCK_AUDIT_FAIL` repository-variable override still allows releases to proceed when blocked by newly discovered CVEs in transitive dependencies.
Single-step Grype invocation
- Replace the three `anchore/scan-action` invocations with a single Grype run driven by `anchore/scan-action/download-grype` plus a shell step we fully control. Grype is executed once with `-o sarif=… -o json=… -o table=…`, producing all three output formats at once.
- The table output (`grype-results.txt`) is printed into the step log, so clicking through to a failing check in the Actions UI immediately shows the offending packages and CVEs.
- The step emits exactly one `::error::` (or `::warning::` when `NO_BLOCK_AUDIT_FAIL='true'`) with the reason for the failure. No duplicate annotations.
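The verdict logic described above can be sketched in Python as follows. This is only an illustration, not the shell step from the workflow: the function name, the message wording, and the handling of unexpected exit codes are assumptions. The exit code 2 convention is taken from the commit message.

```python
from typing import Tuple

# Per the commit message: exit code 2 means findings at or above --fail-on.
GRYPE_FAIL_ON_EXIT_CODE = 2


def audit_verdict(grype_exit_code: int, no_block_audit_fail: str) -> Tuple[str, int]:
    """Return (workflow command to print, exit code for the step).

    Hypothetical helper: at most one annotation is produced per run.
    """
    if grype_exit_code == 0:
        return "", 0
    if grype_exit_code != GRYPE_FAIL_ON_EXIT_CODE:
        # Assumption: any other code is a scanner error and always fails.
        return (
            f"::error::Grype exited with unexpected code {grype_exit_code}",
            grype_exit_code,
        )
    message = "Grype found vulnerabilities at or above the fail-on severity"
    if no_block_audit_fail == "true":
        # Override set: downgrade to a warning and let the job pass.
        return f"::warning::{message} (NO_BLOCK_AUDIT_FAIL is set)", 0
    return f"::error::{message}", 1


if __name__ == "__main__":
    command, step_exit = audit_verdict(2, "false")
    print(command)
    print(step_exit)
```

Keeping the decision in one place is what guarantees a single annotation: every path returns one command string or none.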
Markdown-table job step summary
- The summary table is built from `grype-results.json`, processed with `jq`. Parsing structured JSON avoids the alignment/cell-delimiter hazards that a naive split of the pre-formatted Grype text would introduce; any literal pipe characters in cell content are escaped as `\|`.
- EPSS probabilities are formatted as percentages to two decimal places (with a `<0.01%` threshold for very small non-zero values), and risk scores are rounded to two decimals to avoid IEEE-754 float representation artefacts.
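A minimal Python sketch of the same rendering rules, for illustration only (the workflow uses `jq`). The column set, the function names, and the JSON field names (`matches`, `artifact.name`, `vulnerability.epss[0].epss`, `vulnerability.risk`) are assumptions about Grype's JSON layout rather than something this PR confirms.

```python
import json


def escape_cell(value) -> str:
    """Escape literal pipes so they cannot end a Markdown table cell early."""
    return str(value).replace("|", "\\|")


def format_epss(probability) -> str:
    """Format an EPSS probability (0..1) as a percentage with two decimals."""
    if probability is None:
        return "n/a"
    percent = probability * 100
    if 0 < percent < 0.01:
        return "<0.01%"  # very small but non-zero values
    return f"{percent:.2f}%"


def render_summary(grype_json: str) -> str:
    """Render Grype matches as a Markdown table (hypothetical column set)."""
    lines = [
        "| Package | Version | Vulnerability | Severity | EPSS | Risk |",
        "|---|---|---|---|---|---|",
    ]
    for match in json.loads(grype_json).get("matches", []):
        artifact = match.get("artifact", {})
        vuln = match.get("vulnerability", {})
        epss_entries = vuln.get("epss") or []
        epss = epss_entries[0].get("epss") if epss_entries else None
        risk = vuln.get("risk")
        cells = [
            artifact.get("name", ""),
            artifact.get("version", ""),
            vuln.get("id", ""),
            vuln.get("severity", ""),
            format_epss(epss),
            "n/a" if risk is None else f"{risk:.2f}",
        ]
        lines.append("| " + " | ".join(escape_cell(c) for c in cells) + " |")
    return "\n".join(lines)
```

In a workflow step, the returned text would be appended to the file named by `$GITHUB_STEP_SUMMARY`.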
Other workflow-quality fixes
- The Grype SARIF step comment drops the misleading "code scanning ingestion" phrasing, since this workflow doesn't invoke `github/codeql-action/upload-sarif`.
- `if-no-files-found` on the Grype artefact upload changed from `ignore` to `warn` so missing expected outputs surface in logs instead of being silently tolerated.
Verification
The new behaviour was verified end to end on the fork PR (`modeseven-lfit/dependamerge#1`) by temporarily forcing the transitive `h11` dependency to `0.12.0` via a `[tool.uv]` `override-dependencies` pin, which triggered `GHSA-vqfr-h8mv-ghfj` (Critical) in the Grype audit. The verification commits (the `h11` pin and a temporary `pull_request` trigger relaxation) have been removed; this branch contains only the actual fix commits, rebased onto `upstream/main`.
During verification:
- The `Generate SBOM` job succeeded.
- The `Grype Audit SBOM` job failed with one informative failure annotation naming the CVE, plus the runner's generic `Process completed with exit code 1` status annotation (which is always emitted for a failing step and cannot be suppressed without also suppressing job failure).
- The `grype-scan-results` artefact contained SARIF, JSON, and table-text files.
The `jq` pipeline was also verified locally against a Grype JSON scan of an intentionally vulnerable environment (`cryptography==41.0.0` + `pytest==7.0.0`, 11 matches).
The pipe-escape path also produces the Markdown-correct single-backslash escape (e.g. a hypothetical package name `pytest|weird` renders as `pytest\|weird`).