
Fix(ci): Surface Grype findings on audit failure #259

Merged
ModeSevenIndustrialSolutions merged 8 commits into lfit:main from modeseven-lfit:fix-sbom-fail-output on Apr 22, 2026

ModeSevenIndustrialSolutions commented Apr 21, 2026

Summary

Improves the SBOM/Grype audit workflow so that, when a vulnerability
scan fails, the failing check surfaces the actual CVE/package
information directly in its log, and the job step summary presents the
findings as a proper Markdown table.

Applies to both build-test.yaml (PR CI) and
build-test-release.yaml (tag-push release CI).

Problem

When the Grype audit of the SBOM failed on a PR, the step that produced
useful console output (Security scan with Grype (Text/Table)) was
marked as successful, while the other (Security scan with Grype (SARIF))
owned the failure, but its log contained only the terse
Failed minimum severity level message with no CVE context.

Additionally:

  • The job was named Generate SBOM, so failures looked like the SBOM
    could not be generated, when in reality the generation succeeded and
    it was the audit of the SBOM that failed.
  • The step summary dumped raw Grype table text (with whitespace-only
    column alignment) into $GITHUB_STEP_SUMMARY, producing a
    hard-to-read wall of text.
  • Running three separate anchore/scan-action invocations (SARIF,
    JSON, Text/Table) produced three duplicate
    Failed minimum severity level… job annotations, one per
    invocation, with none of them carrying the CVE context.

Changes

Split sbom into sbom + grype jobs

  • The workflow now has separate Generate SBOM and Grype Audit SBOM
    jobs, with the grype job depending on sbom and consuming the
    uploaded sbom-files artefact. Failures are now clearly attributed
    to whichever job actually failed.
  • grype is added to the test-pypi job's needs list so
    vulnerability findings gate publishing (and transitively, the
    production pypi publish). The existing NO_BLOCK_AUDIT_FAIL
    repository-variable override still allows releases to proceed when
    blocked by newly discovered CVEs in transitive dependencies.

Single-step Grype invocation

  • Replaced the three anchore/scan-action invocations with a single
    Grype run driven by anchore/scan-action/download-grype plus a
    shell step we fully control. Grype is executed once with
    -o sarif=… -o json=… -o table=…, producing all three output
    formats at once.
  • The step prints the human-readable table into its own log, so
    clicking through to a failing check in the Actions UI immediately
    shows the offending packages and CVEs.
  • The step then evaluates Grype's exit code and emits exactly one
    ::error:: (or ::warning:: when NO_BLOCK_AUDIT_FAIL='true')
    with the reason for the failure. No duplicate annotations.
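
The verdict logic described above can be sketched in Python. This is an illustration only: the workflow itself uses a shell step, and the function names, message wording, SBOM path, and SARIF file name are hypothetical. The `--fail-on medium` cutoff, the meaning of exit code 2, and the `grype-results.json` / `grype-results.txt` names come from the commit messages in this PR; treating any other non-zero exit code as a hard error is an assumption.

```python
def grype_command(sbom_path):
    """Build one Grype invocation that writes all three formats at once."""
    return [
        "grype", f"sbom:{sbom_path}",
        "--fail-on", "medium",
        "-o", "sarif=grype-results.sarif",  # hypothetical file name
        "-o", "json=grype-results.json",
        "-o", "table=grype-results.txt",
    ]


def audit_verdict(grype_exit_code, no_block_audit_fail):
    """Map Grype's exit code to (annotation, step_exit_code).

    At most one workflow command is returned, so the job carries at most
    one annotation from this step.
    """
    if grype_exit_code == 0:
        return (None, 0)
    if grype_exit_code == 2:
        # Per the commit message: 2 means findings at or above --fail-on.
        message = "Grype found vulnerabilities at or above the medium threshold"
        if no_block_audit_fail == "true":
            return (f"::warning::{message} (NO_BLOCK_AUDIT_FAIL is 'true')", 0)
        return (f"::error::{message}", 1)
    # Assumption: any other non-zero code is a scanner failure, never overridden.
    return (f"::error::Grype exited unexpectedly with code {grype_exit_code}", 1)
```

Keeping the decision in one place is what makes the single-annotation guarantee easy to reason about: the table is printed first, then exactly one branch runs.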

Markdown-table job step summary

  • The step summary renders a proper Markdown table sourced from
    grype-results.json and processed with jq. Parsing structured
    JSON avoids the alignment/cell-delimiter hazards that a naive split
    of the pre-formatted Grype text would introduce; any literal pipe
    characters in cell content are escaped as \|.
  • EPSS probabilities are formatted as percentages to two decimal
    places (with a <0.01% threshold for very small non-zero values),
    and risk scores are rounded to two decimals to avoid IEEE-754 float
    representation artefacts.
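
As a rough illustration of the formatting rules above, here is a Python sketch. The workflow uses jq, not Python, and the row keys below are hypothetical flattened fields rather than Grype's JSON schema. Rendering an EPSS value of exactly zero as `0.00%` is also an assumption.

```python
import math

HEADERS = ["Package", "Installed", "Fixed In", "Type",
           "Vulnerability", "Severity", "EPSS", "Risk"]


def escape_cell(value):
    """Escape literal pipes so they cannot split a Markdown table cell."""
    return str(value).replace("|", "\\|")


def format_epss(probability):
    """Format an EPSS probability (0..1) as a percentage with two decimals."""
    percent = probability * 100
    if 0 < percent < 0.01:
        return "<0.01%"  # very small but non-zero values
    return f"{percent:.2f}%"


def round2(value):
    """Round half-up to two decimals and drop a trailing '.0'."""
    rounded = math.floor(value * 100 + 0.5) / 100
    return str(int(rounded)) if rounded == int(rounded) else str(rounded)


def render_table(rows):
    """Render rows (dicts with hypothetical keys) as a Markdown table."""
    lines = ["| " + " | ".join(HEADERS) + " |",
             "|" + "|".join([" --- "] * len(HEADERS)) + "|"]
    for row in rows:
        cells = [row["package"], row["installed"], row["fixed_in"],
                 row["type"], row["vulnerability"], row["severity"],
                 format_epss(row["epss"]), round2(row["risk"])]
        lines.append("| " + " | ".join(escape_cell(c) for c in cells) + " |")
    return "\n".join(lines)
```

Escaping happens once, at the point where cells are joined, so no field can introduce an extra column regardless of where its value came from.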

Other workflow-quality fixes

  • Reworded the SARIF-output comment to drop the misleading
    "code scanning ingestion" phrasing, since this workflow doesn't
    invoke github/codeql-action/upload-sarif.
  • if-no-files-found on the Grype artefact upload changed from
    ignore to warn so missing expected outputs surface in logs
    instead of being silently tolerated.

Verification

The new behaviour was verified end to end on the fork PR
(modeseven-lfit/dependamerge#1) by temporarily forcing the
transitive h11 dependency to 0.12.0 via a [tool.uv]
override-dependencies pin, which triggered
GHSA-vqfr-h8mv-ghfj
(Critical) in the Grype audit. The verification commits (the h11
pin and a temporary pull_request trigger relaxation) have been
removed; this branch contains only the actual fix commits, rebased
onto upstream/main.

During verification:

  • The Generate SBOM job succeeded.
  • The Grype Audit SBOM job failed with one informative failure
    annotation naming the CVE, plus the runner's generic
    Process completed with exit code 1 status annotation (which is
    always emitted for a failing step and cannot be suppressed without
    also suppressing job failure).
  • The job step summary rendered the CVE as a Markdown table.
  • The grype-scan-results artefact contained SARIF, JSON, and
    table-text files.

The jq pipeline was also verified locally against a Grype JSON scan
of an intentionally vulnerable environment (cryptography==41.0.0 +
pytest==7.0.0, 11 matches); sample output:

| Package | Installed | Fixed In | Type | Vulnerability | Severity | EPSS | Risk |
| --- | --- | --- | --- | --- | --- | --- | --- |
| cryptography | 41.0.0 | 41.0.2 | python | GHSA-cf7p-gm2m-833m | High | 1.09% | 0.85 |
| cryptography | 41.0.0 | 42.0.4 | python | GHSA-6vqw-3v5j-54x4 | High | 0.42% | 0.31 |
| cryptography | 41.0.0 | 46.0.5 | python | GHSA-r6ph-v2qm-q3c2 | High | <0.01% | 0 |
| pytest | 7.0.0 | 9.0.3 | python | GHSA-6w46-j5rx-g56g | Medium | <0.01% | 0 |

The pipe-escape path also produces the Markdown-correct
single-backslash escape (e.g. a hypothetical package name
pytest|weird renders as pytest\|weird).

Copilot AI left a comment

Pull request overview

Updates the CI SBOM vulnerability auditing so Grype failures surface actionable CVE/package details directly in failing logs and in the job step summary, improving debuggability for both PR CI and release CI workflows.

Changes:

  • Split SBOM generation from Grype auditing into separate sbom and grype jobs, with artifact handoff between them.
  • Replace multiple anchore/scan-action runs with a single Grype invocation that emits SARIF/JSON/table outputs at once and produces a single annotation.
  • Render Grype findings in $GITHUB_STEP_SUMMARY as a Markdown table derived from grype-results.json via jq.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.

| File | Description |
| --- | --- |
| .github/workflows/build-test.yaml | Splits SBOM vs Grype into separate jobs and adds JSON-driven Markdown summary + consolidated Grype run. |
| .github/workflows/build-test-release.yaml | Mirrors the PR workflow improvements for releases and adds grype as a publishing gate. |

Comment thread .github/workflows/build-test-release.yaml Outdated

Copilot AI left a comment

Pull request overview

This PR improves CI ergonomics for SBOM vulnerability auditing by making Grype failures actionable (CVE/package details visible in the failing step log) and by rendering Grype findings as a readable Markdown table in the job step summary. It also clarifies responsibility for failures by separating SBOM generation from the Grype audit into distinct jobs.

Changes:

  • Split SBOM generation and Grype auditing into separate sbom and grype jobs so failures are attributed correctly.
  • Replace multiple anchore/scan-action invocations with a single Grype run emitting SARIF/JSON/table outputs, with one authoritative annotation.
  • Render Grype findings into $GITHUB_STEP_SUMMARY as a Markdown table derived from JSON via jq (with pipe escaping and numeric formatting), and warn on missing expected artifacts.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated no comments.

| File | Description |
| --- | --- |
| .github/workflows/build-test.yaml | Separates SBOM + Grype into dedicated jobs; runs Grype once and surfaces findings in logs + Markdown step summary; uploads Grype outputs (SARIF/JSON/table). |
| .github/workflows/build-test-release.yaml | Mirrors the PR-CI workflow improvements for release CI; ensures publishing is gated on the Grype audit job and outputs are surfaced consistently. |

Separate 'Generate SBOM' and 'Grype Audit SBOM' into two jobs so
that a failing vulnerability scan is attributed to the audit job
rather than to SBOM generation. The new 'grype' job depends on
'sbom' and consumes the uploaded sbom-files artefact.

Reorder the Grype steps within the audit job so the Text/Table
scan runs last and owns the pass/fail verdict. Its step log
contains the human-readable CVE list, so when the step turns red
in the PR Checks UI, following the link to the workflow surfaces
the offending packages and CVEs directly. The SARIF scan still
runs (with fail-build disabled) purely to produce the artefact
uploaded alongside the text results.

The NO_BLOCK_AUDIT_FAIL repository variable override is preserved
by computing fail-build dynamically on the Table step, removing
the need for a separate 'Check Grype scan results' gate.

Co-authored-by: Claude <claude@anthropic.com>
Signed-off-by: Matthew Watkins <mwatkins@linuxfoundation.org>

Add a JSON-format Grype scan alongside the existing SARIF and
Table scans, include grype-results.json in the uploaded
grype-scan-results artefact, and use the JSON output to render a
proper Markdown table in the job step summary.

Parsing structured JSON with jq avoids the alignment and cell
delimiter hazards that a naive split of the pre-formatted Grype
table text would introduce; any literal pipe characters in cell
content are escaped. EPSS probabilities are formatted as
percentages to two decimal places (with a '<0.01%' threshold for
very small non-zero values) and risk scores are rounded to two
decimals to avoid IEEE-754 float representation artefacts.

Co-authored-by: Claude <claude@anthropic.com>
Signed-off-by: Matthew Watkins <mwatkins@linuxfoundation.org>

Round 1 fixes:

- round2: use jq built-in 'round' for true half-up rounding
  instead of truncation-via-floor, matching the PR description
  and the function's documented behaviour.
  (ref: #1 discussion r3111955943)

- Upload Grype scan results: switch if-no-files-found from
  'ignore' to 'warn' so missing expected outputs (e.g. from an
  earlier scan-action failure) surface in the logs instead of
  being silently tolerated.
  (ref: #1 discussion r3111955992)

- Grype Table step comment: reword 'PR Checks UI' to the
  generic 'workflow run's Actions/Checks UI' since this
  workflow is triggered by tag pushes rather than pull
  requests.
  (ref: #1 discussion r3111956019)

Round 2 fixes:

- Grype SARIF step comment: drop the misleading 'code scanning
  ingestion' phrasing. The SARIF artefact is uploaded but not
  consumed by github/codeql-action/upload-sarif in this
  workflow, so the comment was implying behaviour that does
  not exist.
  (ref: #1 discussion r3112034310)

- Download SBOM artefact: standardise on
  actions/download-artifact v8.0.1 (SHA 3e5f45b...) to match
  the other invocation later in the same workflow, avoiding
  mixed major versions.
  (ref: #1 discussion r3112034212)

- test-pypi: add 'grype' to the needs list so the Grype audit
  gates publishing. Previously publishing could run in parallel
  with the audit, undermining the 'block releases on
  vulnerabilities unless NO_BLOCK_AUDIT_FAIL=true' behaviour.
  Note: the audit job's own fail-build propagation already
  honours the override, so test-pypi will still run when
  NO_BLOCK_AUDIT_FAIL is 'true'.
  (ref: #1 discussion r3112034266)

Co-authored-by: Claude <claude@anthropic.com>
Signed-off-by: Matthew Watkins <mwatkins@linuxfoundation.org>

Mirror the refactor from build-test-release.yaml in the PR
build-test workflow: separate 'Generate SBOM' and 'Grype Audit
SBOM' into two jobs so that a failing vulnerability scan is
attributed to the audit job rather than to SBOM generation. The
new 'grype' job depends on 'sbom' and consumes the uploaded
sbom-files artefact.

Reorder the Grype steps within the audit job so the Text/Table
scan runs last and owns the pass/fail verdict. Its step log
contains the human-readable CVE list, so when the step turns red
in the PR Checks UI, following the link to the workflow surfaces
the offending packages and CVEs directly. The SARIF scan still
runs (with fail-build disabled) purely to produce the artefact
uploaded alongside the text results.

The NO_BLOCK_AUDIT_FAIL repository variable override is preserved
by computing fail-build dynamically on the Table step, removing
the need for a separate 'Check Grype scan results' gate.

Co-authored-by: Claude <claude@anthropic.com>
Signed-off-by: Matthew Watkins <mwatkins@linuxfoundation.org>

Mirror the summary improvement from build-test-release.yaml in
the PR build-test workflow: add a JSON-format Grype scan
alongside the existing SARIF and Table scans, include
grype-results.json in the uploaded grype-scan-results artefact,
and use the JSON output to render a proper Markdown table in the
job step summary.

Parsing structured JSON with jq avoids the alignment and cell
delimiter hazards that a naive split of the pre-formatted Grype
table text would introduce; any literal pipe characters in cell
content are escaped. EPSS probabilities are formatted as
percentages to two decimal places (with a '<0.01%' threshold for
very small non-zero values) and risk scores are rounded to two
decimals to avoid IEEE-754 float representation artefacts.

Co-authored-by: Claude <claude@anthropic.com>
Signed-off-by: Matthew Watkins <mwatkins@linuxfoundation.org>

Wrap the SARIF and JSON Grype scan steps (both of which run with
fail-build: false purely to produce artefacts) in a
::stop-commands:: / ::<token>:: pair so that the runner does not
lift their '::warning::Failed minimum severity level...' output
into job annotations. The final Text/Table scan remains outside
the pair and is therefore the sole source of the job's single
annotation — its step log also contains the human-readable CVE
table, which is the output we actually want surfaced.

The stop-token is generated per run with 16 random bytes of hex
to prevent a (malicious) transitive dependency from re-enabling
workflow-command parsing early via a crafted log line. The
resume step is marked 'if: always()' so a failure in either
artefact-producing scan does not leave later steps in this or
subsequent jobs running with workflow commands suppressed.

Applied to both build-test.yaml and build-test-release.yaml so
the PR CI and the release CI behave consistently.

Co-authored-by: Claude <claude@anthropic.com>
Signed-off-by: Matthew Watkins <mwatkins@linuxfoundation.org>

The previous implementation generated a per-run stop-commands
token and passed it between steps via a $GITHUB_OUTPUT value.
The runner auto-masks values read from step outputs as '***' in
the log stream, so the actually-emitted directive was
'::stop-commands::***' rather than '::stop-commands::<token>'.
This both registered '***' as the end-token (the generic
redaction placeholder, which appears frequently elsewhere in
workflow logs) and left the downstream ::warning:: lines from
the SARIF and JSON scan steps unmuted, so the duplicate
annotations still surfaced on the job (verified on workflow run
24680188551).

Switch to a fixed literal token ('grype-quiet-annotations')
which avoids the masking entirely. The previously-cited threat
model (a hostile transitive dependency prematurely re-enabling
workflow commands by echoing the token) is negligible for this
workflow, and is explicitly called out in the inline comment so
a future reader understands why randomisation was dropped.

Co-authored-by: Claude <claude@anthropic.com>
Signed-off-by: Matthew Watkins <mwatkins@linuxfoundation.org>

Replace the three scan-action invocations (SARIF / JSON / Table)
with a single Grype run that emits all three output formats at
once, driven by a shell step we fully own.

Why: each GitHub Actions step gets a fresh ActionCommandManager
(see actions/runner Handler.cs using CreateService per handler
instance), so the ::stop-commands:: / ::<token>:: pair we used
to mute annotations from the artefact-only scan steps could not
possibly persist across step boundaries. That was confirmed
empirically on workflow run 24680548929: the stop-commands
directive was emitted in the 'Mute' step's log but had no effect
on the subsequent SARIF and JSON scan-action steps, which each
continued to surface a '::warning::Failed minimum severity
level...' duplicate annotation.

The single-step approach:

- Uses anchore/scan-action/download-grype to install Grype and
  (optionally) cache its vulnerability DB.
- Runs grype once with '-o sarif=... -o json=... -o table=...',
  so SARIF, JSON, and human-readable outputs are all produced in
  one invocation. No scan-action-emitted workflow commands.
- Cats grype-results.txt into the step log so the CVE table is
  the first thing visible when someone clicks through to a failing
  run.
- Evaluates grype's exit code (2 == vulnerabilities at or above
  --fail-on threshold) and emits exactly one ::error:: or
  ::warning:: based on the NO_BLOCK_AUDIT_FAIL override. That's
  the only annotation the job can produce now.

Grype's '--fail-on medium' preserves the previous severity
cutoff (scan-action's default).

Co-authored-by: Claude <claude@anthropic.com>
Signed-off-by: Matthew Watkins <mwatkins@linuxfoundation.org>

Copilot AI left a comment

Pull request overview

This PR improves how Grype SBOM audit failures are surfaced in GitHub Actions by ensuring the failing check logs and step summaries include actionable CVE/package details, and by making failure attribution clearer in both PR CI and release CI workflows.

Changes:

  • Split SBOM generation and Grype auditing into separate jobs so failures are attributed to the correct job.
  • Run Grype once to emit SARIF/JSON/table outputs, printing the table directly into the failing step’s log and emitting a single annotation.
  • Render Grype findings into a Markdown table in the job step summary using structured JSON + jq, and warn (not ignore) when expected artifacts are missing.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated no comments.

| File | Description |
| --- | --- |
| .github/workflows/build-test.yaml | Separates SBOM generation from Grype audit and improves audit logging + Markdown summary output for PR CI. |
| .github/workflows/build-test-release.yaml | Mirrors the CI improvements for release CI and gates publishing on the Grype audit job. |

ModeSevenIndustrialSolutions merged commit a0f834b into lfit:main on Apr 22, 2026
8 checks passed

Labels

bug Something isn't working

3 participants