fix: trivy cache isolation + 0.70.0 bump + multi-image report heading#223
Merged
…e report headings

Three small fixes addressing PAY-SPACE feedback from the first prod rollout:

1. trivy cache lock timeout (pay_space_wallet, crypto-tools runs): Trivy uses file locks on its cache directory to synchronize DB access. When the cache is shared across runs (Blacksmith persistent cache) or a prior trivy crashed, stale locks cause: 'unable to initialize fs cache: cache may be in use by another process: timeout'. Switch to a per-scan ephemeral cache directory via os.MkdirTemp. This eliminates lock contention entirely at the cost of re-downloading the ~100MB DB per scan, which is acceptable for CI. Added Scan() defer cleanup so dirs don't leak.

2. trivy version bump 0.69.3 → 0.70.0: 0.70.0 is the latest upstream release; its availability surfaces as a warning in 0.69 logs.

3. Multi-image report disambiguation: Stacks with multiple images (e.g. pay_space_wallet builds web+worker) produce one security-report local.Command per image. Each wrote an identical '## Security Pipeline Summary' heading to $GITHUB_STEP_SUMMARY, making two distinct reports look like a duplicate. Suffix the heading with imageName so web/worker/etc render as visibly separate sections.

Verified: go build ./... clean, go test ./pkg/security/scan/... and ./pkg/clouds/pulumi/docker/... all green (including new multi-scan thread-safety test for ensureTrivyCacheDir).
smecsia
approved these changes
Apr 19, 2026
Cre-eD
added a commit
that referenced
this pull request
Apr 19, 2026
PR #222 added the sc symlink to this Dockerfile on the staging branch, but push.yaml (triggered on push to main) also builds and pushes simplecontainer/github-actions:staging using MAIN's copy of this file. When any main push triggers push.yaml, it overwrites the :staging image with a build that doesn't include the symlink — reintroducing 'sc: not found' on downstream deploys (e.g. PAY-SPACE/crypto-tools at 2026-04-19T20:02, right after PR #223 merged to main and re-triggered push.yaml). Root cause: two workflows publish the same tag from two branches. Fix keeps the Dockerfiles in sync by applying the same symlink+verify lines to main's copy. Mirror of #222 exactly.
3 tasks
Cre-eD
added a commit
that referenced
this pull request
Apr 19, 2026
…image

Brings into staging:
- PR #223 (trivy cache isolation, 0.70.0 bump, multi-image report heading)
- AWS tags, cloudflare worker fix, Caddy preStop drain, etc.

Conflicts (4 files):
- trivy.go, trivy_test.go, security_report.go: take main (new features)
- simple_container.go: KEEP staging's serviceSpec() helper + serviceTypeStr (main regresses here: inline ServiceSpecArgs lacks ClusterIP guard that codex flagged as a P1 during PR #221 review)

Triggers build-staging.yml to publish a :staging image with staging's fixed Dockerfile (sc symlink from PR #222) AND main's improvements.

Note: this is a TEMPORARY fix. PR #224 (still in review) fixes the root cause: main's github-actions-staging.Dockerfile. Until that merges, every push to main re-overwrites :staging with a broken image and this merge must be repeated.
Merged
3 tasks
Cre-eD
added a commit
that referenced
this pull request
Apr 19, 2026
…staging with sc symlink (#225)

## Why now

`:staging` image is currently broken (`/bin/sh: sc: not found` on PAY-SPACE/crypto-tools). Root cause in #224 (main's Dockerfile also needs the symlink). Until that merges, we can unblock PAY-SPACE by re-triggering `build-staging.yml`, which builds from **staging's** Dockerfile (already fixed by #222).

This PR takes the opportunity to also pull main's improvements into staging so the rebuilt image includes:

- PR #223: trivy cache isolation, 0.70.0 bump, multi-image report heading
- Misc main commits (AWS tags, Caddy preStop drain, cloudflare worker fix, etc.)

## Conflict resolution (4 files)

| File | Taken | Why |
|---|---|---|
| `pkg/security/scan/trivy.go` | main | New cache-isolation + 0.70.0 version bump |
| `pkg/security/scan/trivy_test.go` | main | Updated tests for per-invocation cache dir |
| `pkg/clouds/pulumi/docker/security_report.go` | main | Multi-image heading fix |
| `pkg/clouds/pulumi/kubernetes/simple_container.go` | **staging** | Keep `serviceSpec()` helper + `serviceTypeStr`; main inline regresses on ClusterIP guard (was P1 in #221 codex review) |

## Verification

- `go build ./...` clean
- `go test ./pkg/security/scan/... ./pkg/clouds/pulumi/kubernetes/... ./pkg/clouds/pulumi/docker/...` all green

## Test plan

- [ ] Merge triggers `build-staging.yml`
- [ ] `docker pull simplecontainer/github-actions:staging && docker run --rm --entrypoint sh ... -c 'which sc'` returns `/usr/local/bin/sc`
- [ ] Re-run PAY-SPACE/crypto-tools deploy: passes

## Follow-up

Still need **PR #224** (main's Dockerfile fix) to merge; otherwise the next push to main will overwrite `:staging` with a broken image again.

---------

Co-authored-by: universe-ops <177390656+universe-ops@users.noreply.github.com>
Co-authored-by: Universe Ops <universe-ops@github.com>
Co-authored-by: simple-container-forge[bot] <257785999+simple-container-forge[bot]@users.noreply.github.com>
Co-authored-by: GitHub Action <action@github.com>
Co-authored-by: Ilya <smecsia@gmail.com>
Co-authored-by: Bao Tran <baotn166@users.noreply.github.com>
Cre-eD
added a commit
that referenced
this pull request
Apr 20, 2026
…le (root cause of repeated :staging regressions) (#224)

## Symptom

After #221 merged main→staging and #222 added the sc symlink to staging's Dockerfile, PAY-SPACE deploys briefly worked. Then #223 merged to main and immediately after, PAY-SPACE/crypto-tools started failing again with:

```
/bin/sh: sc: not found
error: exit status 127: running "... 'sc' 'sbom' 'generate' ..."
```

## Root cause

**Two separate workflows build and push `simplecontainer/github-actions:staging`:**

| Workflow | Trigger | Dockerfile (from which branch) |
|---|---|---|
| `build-staging.yml` | push to `staging` | `github-actions-staging.Dockerfile` on `staging` branch |
| `push.yaml` | push to `main` | `github-actions-staging.Dockerfile` on `main` branch |

PR #222 only fixed the Dockerfile on **staging**. The main branch copy never got the `ln -s /root/github-actions /usr/local/bin/sc` line. So every time anything merges to main, `push.yaml` runs, builds without the symlink, and **overwrites** the good `:staging` image that was pushed by `build-staging.yml`.

Verified by `docker pull simplecontainer/github-actions:staging && docker run --rm --entrypoint sh .../github-actions:staging -c 'ls -la /usr/local/bin/sc'`:

```
ls: /usr/local/bin/sc: No such file or directory
```

Image created timestamp matches `push.yaml` run at 19:47 (after PR #223 merged at 19:47), not `build-staging.yml` run at 17:53 (after #222).

## Fix

Apply the identical symlink + verify lines to main's copy of `github-actions-staging.Dockerfile`. Both workflows now produce an image with `sc` in PATH. This keeps the two Dockerfiles in sync going forward.

## Long-term

Two workflows publishing the same tag from two branches is fragile: whichever runs last wins. Consider consolidating: either `build-staging.yml` is the sole publisher of `:staging`, or `push.yaml` drops the staging tag. Out of scope for this PR.

## Test plan

- [ ] Merge triggers `push.yaml` → new `:staging` image pushed
- [ ] `docker pull simplecontainer/github-actions:staging && docker run --rm --entrypoint sh ... -c 'sc --help | head -1'` succeeds
- [ ] Re-run PAY-SPACE/crypto-tools deploy: passes end-to-end
Three small fixes surfaced from the first PAY-SPACE prod rollout:
1. Trivy cache lock timeout
Symptom (pay_space_wallet, crypto-tools runs): `unable to initialize fs cache: cache may be in use by another process: timeout`
Cause: Trivy uses file locks on its cache directory to synchronize DB access. When the cache persists across runs (Blacksmith persistent cache) or a prior trivy crashed, stale locks cause the next scan to time out waiting for them.
Fix: Use a per-scan ephemeral cache directory via `os.MkdirTemp("<userCache>/trivy", "scan-*")`, with `defer cleanup` in `Scan()`. Eliminates lock contention entirely. Cost: re-downloads the ~100MB vulnerability DB per scan (a few seconds), acceptable for CI. Codex flagged the earlier "just delete the lock files" approach as unsafe under real concurrency; this version sidesteps the race entirely.
Updated `TestEnsureTrivyCacheDir` to assert the per-invocation `scan-*` suffix and that two calls return different directories.
2. Trivy version bump 0.69.3 → 0.70.0
Scan logs from the 0.69.3 runs surface the upstream notice:
3. Multi-image report heading disambiguation
Symptom (PAY-SPACE/pay_space_wallet): the step summary showed what looked like the same Security Pipeline Summary report twice.
Cause: The `pay_space_wallet` stack builds two images (web + worker), so the pipeline creates one `security-report-<image>` local.Command per image. Each one writes `## Security Pipeline Summary` to `$GITHUB_STEP_SUMMARY`. Both reports have identical scan counts (same codebase, different entrypoints), so two distinct reports look like a duplicate.
Fix: Suffix the heading with the image name:
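The exact heading format used in the PR is not shown in this text, so the following is only a minimal sketch of the idea. The function name `summaryHeading` and the ` - <image>` separator are assumptions made for illustration, not the code in `security_report.go`.

```go
package main

import (
	"fmt"
	"strings"
)

// summaryHeading returns the step-summary heading for one image's report.
// Hypothetical helper: the name and the " - <image>" format are assumptions.
func summaryHeading(imageName string) string {
	const base = "## Security Pipeline Summary"
	name := strings.TrimSpace(imageName)
	if name == "" {
		// No image name available: fall back to the original shared heading.
		return base
	}
	return fmt.Sprintf("%s - %s", base, name)
}

func main() {
	// One heading per image, so web and worker render as separate sections.
	for _, image := range []string{"web", "worker"} {
		fmt.Println(summaryHeading(image))
	}
}
```

With the two images from the example, the sketch emits distinct headings for `web` and `worker`, which is the property the fix relies on.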
Verification
- `go build ./...` clean
- `go test ./pkg/security/scan/...` green (incl. updated cache-dir tests)
- `go test ./pkg/clouds/pulumi/docker/...` green