relay: CI image scanning — Trivy or Grype, fail on CRITICAL CVEs #40

@ilmoniemi

Why

Once #32 ships a Dockerfile, the built image enters the supply chain. Base-image CVEs accumulate over time, even when the base image is pinned by digest. The first signal that a dependency has a known critical vulnerability should come from CI at PR time, not from an attacker exploiting it.

This ticket is the runtime-image scan. The Go-source-side complement is govulncheck — see the separate ticket for that. Both layers are needed because they cover different artifacts (your Go code + reachable call paths vs. base-image system libs).

docs/threat-model.md § Supply chain — Go dependencies covers source-side dep audit posture.

What

Two CI runs, one config:

  1. PR-time scan in .github/workflows/ci.yml after the Docker build:

       - name: Scan image for CVEs
         uses: aquasecurity/trivy-action@<pinned-version>
         with:
           image-ref: ghcr.io/pyrycode/pyrycode-relay:${{ github.sha }}
           severity: CRITICAL,HIGH
           exit-code: '1'
           ignore-unfixed: true

  2. Periodic cron in a separate workflow (.github/workflows/security-scan.yml or similar) that re-runs both this image scan and govulncheck against the latest main SHA, on a daily or weekly schedule. This matters because new CVEs surface after your last PR: without a periodic re-scan, a vulnerability disclosed today against a dependency you haven't touched in 3 months stays invisible until some later PR happens to trigger a scan.

ignore-unfixed: true — don't fail on CVEs that have no fix yet; we can't act on those. We do fail on fixable CRITICAL/HIGH.
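
The periodic re-scan described in item 2 could look roughly like the sketch below. It is not a tested config: the daily cadence, the job names, the Dockerfile living at the repository root, and rebuilding the image in the job (rather than pulling a published one) are all assumptions, and `<pinned-version>` is left as a placeholder exactly as in the PR-time snippet.

```yaml
# Sketch of .github/workflows/security-scan.yml; names and cadence are assumptions.
name: security-scan

on:
  schedule:
    - cron: '23 5 * * *'   # daily at 05:23 UTC (assumed; weekly also fits the ticket)
  workflow_dispatch:        # allow manual re-runs

permissions:
  contents: read

jobs:
  image-scan:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@<pinned-version>
      # Assumes the Dockerfile from #32 sits at the repository root.
      - name: Build image from latest main
        run: docker build -t ghcr.io/pyrycode/pyrycode-relay:${{ github.sha }} .
      - name: Scan image for CVEs
        uses: aquasecurity/trivy-action@<pinned-version>
        with:
          image-ref: ghcr.io/pyrycode/pyrycode-relay:${{ github.sha }}
          severity: CRITICAL,HIGH
          exit-code: '1'
          ignore-unfixed: true

  govulncheck:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@<pinned-version>
      - uses: actions/setup-go@<pinned-version>
        with:
          go-version-file: go.mod
      - name: Run govulncheck
        # @latest is used for brevity here; a real workflow would pin a version.
        run: go run golang.org/x/vuln/cmd/govulncheck@latest ./...
```

Scheduled workflows run against the default branch, so `github.sha` here resolves to the latest commit on main and the same image tag shape as the PR-time scan can be reused.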

Implementation notes

  • Pin the action version (third-party — supply-chain consideration in itself). Renovate/Dependabot can keep the pin fresh.
  • Workflow permissions: set permissions: contents: read and nothing else. Scanning needs no write access on the token.
  • .trivyignore allow-listing is OK but each entry needs a code comment explaining WHY it's ignored + a TODO with revisit date. Otherwise the file becomes a graveyard.
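  • Periodic scan failure → file an issue automatically. Run gh issue create from the cron job, with the CVE details in the body. Otherwise the cron failure scrolls past in Actions logs. Note that gh issue create needs issues: write on the token, so grant that to the cron job only.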
  • Grype is the alternative scanner if Trivy proves noisy. Same workflow shape.
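
As a rough illustration of the auto-filed issue from the notes above, the fragment below extends the scheduled image-scan job. It is a sketch under assumptions: the report file name `trivy-report.txt` and the issue title are hypothetical, and the checkout and build steps are elided. The `format` and `output` inputs write the Trivy report to a file so it can be used as the issue body.

```yaml
# Fragment of the scheduled workflow's image-scan job; a sketch, not a tested config.
jobs:
  image-scan:
    runs-on: ubuntu-latest
    permissions:
      contents: read
      issues: write   # required by gh issue create; scoped to this job only
    steps:
      # ... checkout and image build steps omitted ...
      - name: Scan image for CVEs
        uses: aquasecurity/trivy-action@<pinned-version>
        with:
          image-ref: ghcr.io/pyrycode/pyrycode-relay:${{ github.sha }}
          severity: CRITICAL,HIGH
          exit-code: '1'
          ignore-unfixed: true
          format: table
          output: trivy-report.txt   # hypothetical file name
      - name: File an issue when the scan fails
        if: failure()
        env:
          GH_TOKEN: ${{ github.token }}
        run: |
          # Fall back to a link to this run if Trivy failed before writing a report.
          [ -s trivy-report.txt ] || echo "Scan failed without a report; see $GITHUB_SERVER_URL/$GITHUB_REPOSITORY/actions/runs/$GITHUB_RUN_ID" > trivy-report.txt
          gh issue create \
            --repo "$GITHUB_REPOSITORY" \
            --title "Periodic image scan failed on $(date -u +%F)" \
            --body-file trivy-report.txt
```

One design point worth deciding up front: as written, every failing run opens a new issue, so a long-lived unfixed finding would produce one issue per scheduled run. Searching for an existing open issue before creating another is a likely refinement.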

Out of scope

  • SLSA provenance / cosign signing (overkill for v1; revisit when relay is consumed downstream)
  • Source-side dep scanning — covered by govulncheck ticket + Dependabot (already enabled)
  • Runtime self-check (separate ticket — verifies deployment posture at boot)
