Skip to content

Phase 6: promote/rollback automation, freshness alarm, prune (#144)#209

Merged
cbeaulieu-gt merged 18 commits intomainfrom
phase-6-promotion
May 6, 2026
Merged

Phase 6: promote/rollback automation, freshness alarm, prune (#144)#209
cbeaulieu-gt merged 18 commits intomainfrom
phase-6-promotion

Conversation

@cbeaulieu-gt
Copy link
Copy Markdown
Member

@cbeaulieu-gt cbeaulieu-gt commented May 6, 2026

Summary

Phase 6 of the CI Claude Runtime epic (#130, milestone "Containerized Claude Code CI runtime"). Adds promotion automation, rollback workflow, drift alarm, and orphan-tag pruning so the runtime image set has a complete lifecycle: build → promote → rollback → cleanup, with both proactive (freshness) and reactive (rollback) safety paths.

This PR builds on Phase 5 (#188) which shipped digest-pinned consumer workflows; Phase 6 makes the digest-pin lifecycle automatic + auditable.

Closes #144

What's in the PR

STAGE 5 promote (runtime-build.yml)

  • New stage-5-promote job appended after STAGE 4-overlay green
  • Atomic per-overlay regex substitution across 5 reusable workflows × 7 occurrences (review×2, fix×4, explain×1)
  • Count-asserted at exactly 7 — cross-overlay digest contamination structurally impossible
  • OCI label capture (private_sha, marketplace_sha, cli_version) for each promoted image, surfaced in the auto-PR body
  • App-token authenticated push so the PR shows under claude-action-runner[bot] identity AND merging re-triggers downstream consumer workflows (synchronize events are suppressed for GITHUB_TOKEN pushes)
  • if: github.event_name != 'pull_request' so PR validation never cascades into a promote
  • Concurrency cancel-in-progress: false — in-flight promote completes before next starts (cancellation mid-sed would leave half-applied substitution)

Rollback (runtime-rollback.yml)

  • workflow_dispatch with target_pubsha (40-hex required) + dry_run (default true)
  • Validation chain: shape → :tag-existsorg.opencontainers.image.revision label match → inventory-match.shsmoke-test.shsed substitution count assertion. Each step exits 1 with a specific error message
  • Catches tag retag drift — GHCR has tag immutability disabled (Spec defect: GHCR does not support tag immutability — pivot to digest-pin-only reproducibility #173), so a :<pubsha> tag could be retagged or deleted; the revision-label check refuses to roll back to a corrupted target
  • inputs.dry_run normalized via shell env to literal true/false before peter-evans's draft: input — avoids the GHA false || 'true' expression trap

Freshness alarm (runtime-check-private-freshness.yml)

  • Weekly cron (Mon 08:00 UTC) + manual workflow_dispatch with simulate_stale rehearsal input
  • Path-scoped denominator per spec §13 Q7 — drift on non-imported paths in claude-configs doesn't page
  • Idempotent issue dedupe — same-titled open issue gets a re-fire comment instead of new issue
  • Functional gate: requires claude-action-runner App installed on glitchwerks/claude-configs (App claude-action-runner not installed on glitchwerks/claude-configs — blocks Task 6.5 freshness alarm #206). Step exits cleanly with explicit error until that install lands

Pruning (runtime-prune-pending.yml)

  • Weekly cron (Mon 09:00 UTC) + manual workflow_dispatch
  • Targets :pending-* tags older than 14 days across the 4 runtime packages
  • Protection invariant: any version that ALSO carries a 40-hex pubsha tag is a promoted, immutable version (spec §11.4) and is NEVER pruned, even when also tagged pending-*. The pubsha tag is the protection signal
  • Both triggers default dry_run=true for first deployment — flip to live after several clean weekly reports prove the invariant filter

STAGE 4 label-shape assertion (runtime/scripts/smoke-test.sh §d.4)

  • Asserts org.opencontainers.image.revision is 40-hex at build time
  • Catches future Dockerfile refactors that set the label to something truthy-but-malformed (e.g. ${{ github.run_id }} instead of ${{ github.sha }}), which would silently poison rollback's digest resolution

Auxiliary

  • verify-app-secrets.yml extended with App-identity dump + workflow-write probe (commits d0a72e5, 71bc01a, fe8f832) — permanent diagnostic for App-perm debugging
  • runtime/ci-manifest.yaml — comment block near sources.marketplace.ref codifying the manual-bump policy resolved in docs(spec): resolve §13 Q5 — marketplace SHA bump cadence #208

Verifications

STAGE 5 dry-run Verified via PR #207 (closed without merge). Run 25411378215 — atomic 7-line digest bump across 5 workflow files, OCI labels in body, bot identity confirmed
App workflows: write perm Granted by org admin on 2026-05-05 (#205). Verified via probe-write run 25411361926
actionlint -shellcheck="-S info" Clean on all four new workflows + the modified runtime-build.yml + verify-app-secrets.yml
Inquisitor coverage Plan committed at a7cc548 was reviewed via two adversarial inquisitor passes; both Pass 1 fixes were re-reviewed in Pass 2 (caught two Critical regressions Pass 1 introduced — fabricated dev.glitchwerks.ci.pubsha label and the GHA false || 'true' expression trap)
Spec §13 Q5 Resolved in standalone PR #208 (merged 2026-05-06); manifest comment in this PR cross-references it

Path deviation from master plan

Master plan placed three files under runtime/: runtime/rollback.yml, runtime/check-private-freshness.yml, runtime/prune-pending.yml. GitHub Actions only auto-discovers workflows under .github/workflows/ — those master-plan paths would not have been triggerable. This PR puts all three under .github/workflows/ with the runtime- prefix to make the relationship visually obvious. Symlinks were considered and rejected (Windows checkout inconsistency under actions/checkout@v5). Plan committed at a7cc548 documents the deviation in detail.

Task 6.4 deviation: live dispatch deferred to Task 6.9

gh workflow run runtime-rollback.yml --ref phase-6-promotion returns HTTP 404: workflow not found on default branch. GitHub indexes workflow_dispatch workflows from the default branch only — a workflow that exists on a feature branch but not on main is invisible to the dispatch API. Pre-merge, Task 6.4's value reduces to static validation only (actionlint clean + inquisitor coverage). The live empty-diff dry-run + deliberate-failure rehearsal happen as part of Task 6.9 post-merge — see plan docs/superpowers/plans/phase-6-promotion.md Task 6.9 for the rehearsal sequence.

§13 Q4 — GHCR push from forked PR (deferred)

Builds are triggered exclusively from main (via push after merge) or via workflow_dispatch from authorized maintainers. Forked PRs cannot trigger runtime-build.yml to push to GHCR — there is no v1 use case requiring it. Deferred until the first external fork attempts to contribute runtime content. Tracked as a Phase 7 wrap-up checklist item. No workflow changes required for Phase 6.

§13 Q7 — path-scoped denominator (implemented)

runtime-check-private-freshness.yml reads imports_from_private from runtime/ci-manifest.yaml (skills, agents, claude_md, standards) plus overlay-level imports_from_private.agents, deduplicates the path list, and runs git log <pinned-sha>..<main-head> -- <paths> against the cloned claude-configs worktree. The 14-day calendar-gap threshold is preserved per §11.3 but only fires when the path-scoped commit count is non-zero — drift on non-imported paths in main doesn't page.

Linked issues

Test plan


Task 6.9 Rehearsal — Outcome (2026-05-06)

The post-merge live rehearsal of STAGE 5 + runtime-rollback.yml ran end-to-end. The rehearsal surfaced a real defect (filed and fixed mid-flight) and validated the rollback path on real production digests — both outcomes are exactly what the rehearsal was designed to produce.

Rehearsal sequence

Step Action Run / PR Outcome
6.9.4 (planned) Rollback to Set A pubsha 3bb6a22e... run 25435260430 Rejected by promote-history check — Set A predates STAGE 5; structurally not rollback-valid. Memory captured: feedback_stage5_only_rollback_targets.md. Rehearsal pivoted to produce two STAGE-5-promoted sets so a real rollback could be exercised.
6.9.2′ Re-dispatch runtime-build.yml against main run 25437028239 All 13 stages green; STAGE 5 opened PR #211 (Set C, pubsha a985e4d...).
6.9.3′ Merge Set C promote PR #211 (merge-commit) Main on Set C; commit be6f72f carries promote: runtime images @a985e4d... subject.
6.9.4′ attempt 1 Rollback to Set B pubsha b080630... run 25438451947 FAILED at Step 6.3.5binventory-match.sh received an image URL where it expected a local enumeration JSON. Filed as #213.
Detour Fix #213 PR #214 (squash, 4d33fbf) Step 6.3.5b now calls enumerate-persona.sh first, mirrors overlay-smoke.sh pattern. All four CI checks green including claude-pr-review/quality-gate.
6.9.4′ attempt 2 Re-dispatch rollback (workflow now has #214's fix on main) run 25440709021 All steps green including 6.3.5b/c. STAGE 5-style auto-PR opened: PR #215. First-ever successful end-to-end run of runtime-rollback.yml.
6.9.5′ Merge rollback PR #215 (merge-commit, 9240ab4) Main back on Set B; rollback commit 42c3063 recorded.
6.9.7 Deliberate-failure dispatch (target_pubsha=000…0001 dry_run=true) run 25441662792 Rejected at promote-history check, as expected (different layer than the original session-summary prediction — see subsection below).

Step 6.9.7 — deliberate-failure result

The original Phase 6 plan expected the bad-pubsha rejection to fire at the "Resolve target digests" step (GHCR-side, where the tag wouldn't exist). After the #209 review #6 promote-history grep landed, that check now fires earlier in the workflow — so 0000000000000000000000000000000000000001 was rejected at "Verify target_pubsha appears in promote history" instead, with the workflow's own diagnostic:

target_pubsha 0000000000000000000000000000000000000001 does not appear in promote history on origin/main (no merge commit subject matches 'promote: runtime images @0000000000000000000000000000000000000001')

Same intent (reject invented pubshas), different layer catches it. This is the correct behaviour: defense-in-depth ordering means the cheapest, most-deterministic check runs first. No PR opened, no GHCR mutation attempted, exit code 1 at the expected step.

Findings landed as durable artifacts

Risk reassessment vs original session summary

What this rehearsal validated

🤖 Generated by Claude Code on behalf of @cbeaulieu-gt

Claude Auto-Fix and others added 15 commits May 5, 2026 20:35
…es (refs #144)

Plan covers Phase 6 of the CI Claude Runtime epic (#130, milestone #7):

- Task 6.1: STAGE 5 promote job (atomic per-overlay digest substitution; 7
  occurrences across 5 reusable workflow files; concurrency-guarded)
- Task 6.3: runtime-rollback.yml (workflow_dispatch + target_pubsha;
  pre-PR validation chain — pubsha-revision-label assertion, inventory-match,
  overlay-smoke; Option A authorization model documented)
- Task 6.5: runtime-check-private-freshness.yml (Mondays cron, path-scoped
  denominator per spec §13 Q7; heartbeat infra deferred to #204 with hard SLA)
- Task 6.6: runtime-prune-pending.yml (daily cron, schedule-source branching
  + dry_run input normalization)
- Task 6.7: §13 Q5 marketplace SHA cadence — split into standalone spec PR
  + Phase 6 manifest comment referencing it
- Task 6.8: §13 Q4 forked-PR auth deferral text in Phase 6 PR body
- Task 6.9: post-merge rollback rehearsal (happy path + broken-target path)
- Task 6.10: PR opening with Closes #144

Plan went through baseline refresh (post-#143/#200/#202/v2.2.0) plus two
inquisitor passes:

- Pass 1 (6 charges, 5 folded, 1 deferred to #203 Option C auto-merge)
- Pass 2 (5 charges, 5 folded — including 2 Critical regressions
  introduced by Pass 1 fixes that Pass 2 caught: bogus
  dev.glitchwerks.ci.pubsha label + GHA expression || semantics)

Plan refresh log at end of file documents every shift since 2026-05-02.

Follow-up issues filed:
- #203 — Option C auto-merge with quality-gate self-validation
- #204 — heartbeat infra for runtime-check-private-freshness.yml (blocks Phase 7)

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Per spec §6.2 STAGE 5 + §9.3. After STAGE 4 succeeds (base + all 3
overlays smoke green), opens a single PR rewriting all 7 @sha256:
occurrences across 5 reusable workflows in one commit.

- Per-overlay-name regex substitution (review/fix/explain digests
  cannot cross-contaminate); count-asserted at exactly 7
- PR body lists OCI labels (private_sha, marketplace_sha, cli_version)
  for each promoted image, plus rollback instructions
- peter-evans/create-pull-request@5f6978fa SHA-pinned (v8.1.1)
- actions/create-github-app-token@1b10c78c SHA-pinned (v3.1.1)
- App-token authenticated so the PR shows under bot identity and
  re-triggers downstream consumer workflows when merged
- Concurrency group serializes promotes (cancel-in-progress: false)
- if: github.event_name != 'pull_request' so PR validation never
  cascades into a promote

Also adds STAGE 4 label-shape assertion (smoke-test.sh §d.4): the
revision label must be 40-hex, catching a future Dockerfile refactor
that sets the label to something truthy-but-malformed (which would
poison runtime-rollback.yml's digest resolution).

Refs #144

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Diagnostic addition to surface the App's slug, name, and permission
grants when run. Needed to confirm whether the App has the 'workflows'
permission, which STAGE 5 (and Task 6.3 runtime-rollback.yml) require
to push edits to .github/workflows/**.

Background: STAGE 5 dry-run (run 25410793517) failed at the final push
with "refusing to allow a GitHub App to create or update workflow ...
without 'workflows' permission". This step makes the gap visible so
the operator can grant the missing scope or pivot auth.

Refs #144

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…144)

Per spec §9.3 + Phase 6 plan Task 6.3. workflow_dispatch with target_pubsha
input opens a PR reverting all 7 @sha256: references to the digest set
tagged :<target_pubsha> in GHCR. Mirrors STAGE 5's PR mechanism in reverse;
merging the PR is the rollback.

Path: lives at .github/workflows/runtime-rollback.yml (master plan said
runtime/rollback.yml, but GitHub Actions only auto-discovers workflows
under .github/workflows/). Documented as a path deviation in the Phase 6
plan and PR body.

Authorization model (plan §"Authorization model — Option A"):
  1. workflow_dispatch permission gates invocation
  2. OCI org.opencontainers.image.revision label assertion (Step 6.3.5a)
     — refuses to roll back if a :<pubsha> tag has been retagged
  3. inventory-match + smoke validation against target digests
     (Steps 6.3.5b/c) — same battery STAGE 4 runs at build time
  4. Human merge approval (final gate)

Validations before opening the PR:
  - target_pubsha shape: 40-char hex
  - GHCR :<pubsha> tag exists for all 3 overlays
  - org.opencontainers.image.revision label on resolved digest == target
  - expected.yaml inventory contract satisfied
  - smoke-test.sh passes (CLI binary works, persona files mounted, etc.)
  - sed substitution lands exactly 7 occurrences (assertion in workflow)

dry_run normalization (Step 6.3.6b, inquisitor pass 2 charge 3 fix):
  GHA `false || 'true'` evaluates to the string 'true' in expression
  context — would silently invert operator intent. Raw input is read
  through env and normalized to a literal true/false in a shell step
  before passing to peter-evans/create-pull-request's draft input.

Action SHAs (resolved at PE-1/PE-2):
  - peter-evans/create-pull-request@5f6978fa (v8.1.1)
  - actions/create-github-app-token@1b10c78c (v3.1.1)

Note (#206): smoke-test step requires CLAUDE_CODE_OAUTH_TOKEN secret;
already in scope on this repo. Step 6.3.5b/c require runner UID and yq
(installed inline; same pattern STAGE 4-overlay uses).

Refs #144

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Per spec §11.3 + §13 Q7. Weekly schedule (Mon 08:00 UTC) reads the
manifest's pinned ci-v* tag, clones glitchwerks/claude-configs, and
computes git log <pinned>..<main> scoped to imports_from_private paths.
Opens a deduped issue when (path-scoped commits > 0) AND (calendar gap > 14d).

- Path-scoped denominator (Q7) — drift on non-imported paths does not page
- Idempotent dedupe via 'gh issue list --search "<title> in:title"'
- workflow_dispatch with simulate_stale input for rehearsal
- 14-day threshold matches spec §11.3; revisit after one cycle
- App token scoped to claude-configs repo (repositories: claude-configs)
- App install on glitchwerks/claude-configs tracked in #206

Refs #144

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Two real bugs in the subagent-authored workflows:

1. runtime-check-private-freshness.yml issue body: the heredoc had 10 spaces
   of indent on every line, which Markdown renders as indented code blocks.
   The table headers, bullet lists, and remediation steps would all render
   as code instead of formatted content. Switched to printf with explicit
   '\n' — whitespace-precise.

2. runtime-check-private-freshness.yml labels: used 'runtime' and
   'stale-private-ref' which don't exist in this repo's label set
   (verified 2026-05-05; same lookup that informed #205/#206).
   gh issue create would have failed at runtime with 'label not found'.
   Switched to the existing set: ci, infrastructure, epic-130.

3. runtime-prune-pending.yml: the subagent invented a behavior the plan
   didn't ask for — 'Schedule trigger always runs live, overrides dry_run'.
   For first-time deployment with zero production-test data on the
   protection-invariant filter, scheduled live deletes are too risky vs.
   scheduled report-only. Both triggers now honor dry_run with default
   'true'. After several clean weekly dry-run reports prove the filter
   correct, revisit this default in a follow-up — tracked informally.

Refs #144

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Comment block in runtime/ci-manifest.yaml documents the manual-bump
policy inline, referencing the standalone spec PR #208 that resolved
§13 Q5. Per Phase 6 plan Task 6.7.1b/6.7.2.

Refs #144

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@github-actions
Copy link
Copy Markdown

github-actions Bot commented May 6, 2026

Claude finished @cbeaulieu-gt's task in 4m 48s —— View job


PR Review: Phase 6 Promotion/Rollback Automation ✅

I've completed a comprehensive review of the Phase 6 changes. This is an ambitious and well-structured PR that adds critical lifecycle automation. The code quality is generally excellent with thoughtful error handling and validation. However, I've identified several security and reliability concerns that should be addressed before merge.


🔴 Critical Issues

1. Unverified Binary Download (Security)

File: .github/workflows/runtime-check-private-freshness.yml:41-43

The yq binary is downloaded from GitHub Releases without SHA256 verification:

sudo curl -fsSL -o /usr/local/bin/yq \
  https://github.com/mikefarah/yq/releases/download/v4.44.3/yq_linux_amd64
sudo chmod 0755 /usr/local/bin/yq

Risk: A compromised GitHub account or MitM attack could serve a malicious binary. This workflow has access to App tokens for the private claude-configs repo.

Recommendation: Add SHA256 verification:

YQ_SHA256="e28e61a33d96d3e06346dd7d37a1f5e6c3bd3dc7c642d8fbb641574da33ec817"  # v4.44.3
sudo curl -fsSL -o /usr/local/bin/yq \
  https://github.com/mikefarah/yq/releases/download/v4.44.3/yq_linux_amd64
echo "${YQ_SHA256}  /usr/local/bin/yq" | sha256sum -c -
sudo chmod 0755 /usr/local/bin/yq

The same pattern appears in runtime-rollback.yml:132-136. Both should be fixed.


🟡 High-Priority Issues

2. Potential Token Leakage in Error Messages

File: .github/workflows/runtime-check-private-freshness.yml:100

: "${APP_TOKEN:?App token not set — check APP_ID/APP_PRIVATE_KEY secrets and #206 install status}"

The :? parameter expansion will output APP_TOKEN: <value> if the variable is set but empty. While unlikely, if steps.token.outputs.token returns an empty string (rather than being unset), the full token could leak to workflow logs.

Recommendation: Use explicit conditional:

if [ -z "${APP_TOKEN:-}" ]; then
  echo "::error::App token not set — check APP_ID/APP_PRIVATE_KEY secrets and #206 install status"
  exit 1
fi

3. Hardcoded Branch Name in Diagnostic Probe

File: .github/workflows/verify-app-secrets.yml:33,45

The probe workflow writes to phase-6-promotion branch:

-f branch="phase-6-promotion"

Issue: This will fail when run on any other branch, and the diagnostic will incorrectly report missing permissions.

Recommendation: Use ${{ github.ref_name }} or make the branch name configurable via workflow_dispatch input.

4. Missing Upper Bound on Prune Deletions

File: .github/workflows/runtime-prune-pending.yml:147-151

The deletion loop has no safety limit:

jq -r '.[].id' /tmp/candidates.json | while read -r vid; do
  gh api -X DELETE \
    "/orgs/glitchwerks/packages/container/${{ matrix.package }}/versions/$vid"
done

Risk: If the jq filter malfunctions (e.g., negation logic inverts), it could delete all versions including promoted ones. While the 40-hex protection is correctly implemented, defense-in-depth suggests adding a sanity check.

Recommendation: Add a count threshold:

count=$(jq 'length' /tmp/candidates.json)
if [ "$count" -gt 50 ]; then
  echo "::error::Prune candidate count ($count) exceeds safety threshold (50) — refusing to delete"
  exit 1
fi

🟢 Medium-Priority / Code Quality

5. Typo in Rollback PR Body

File: .github/workflows/runtime-rollback.yml:226

- [ ] Confirm the rollback target predates a known regression and the 'differen-eyes' invariants are not weakened by reverting.

Should be: different-eyes (missing 't')

6. No Automated Verification of Rollback Target Provenance

File: .github/workflows/runtime-rollback.yml:225-226

The rollback workflow asks reviewers to manually verify the target pubsha appears in promotion history:

- [ ] Confirm `${TARGET}` appears in `git log --grep='auto/runtime-promote-' main`

Observation: This could be automated as a validation step before opening the PR. The workflow already has checkout access and could run:

if ! git log --grep="auto/runtime-promote-${TARGET}" main --format=%H | grep -q "$TARGET"; then
  echo "::error::target_pubsha $TARGET does not appear in promotion history"
  exit 1
fi

Trade-off: The manual review provides a human gate, but automation would catch operator error (e.g., typo'd pubsha) earlier.

7. Negative Time Gap Not Handled

File: .github/workflows/runtime-check-private-freshness.yml:138

DAYS_GAP=$(( ( $(date -d "$MAIN_DATE" +%s) - $(date -d "$PINNED_DATE" +%s) ) / 86400 ))

If MAIN_DATE is somehow older than PINNED_DATE (e.g., force-push that rewinds main in the private repo), DAYS_GAP will be negative. The staleness check [ "$DAYS_GAP" -gt 14 ] would correctly not fire, but the issue title would be confusing: "… -5 days behind main".

Recommendation: Add a sanity check:

if [ "$DAYS_GAP" -lt 0 ]; then
  echo "::warning::Pinned ref ($PINNED_DATE) is NEWER than main ($MAIN_DATE) — possible force-push in private repo"
  exit 0
fi

8. Concurrent Promote PRs

File: .github/workflows/runtime-build.yml:26-28

concurrency:
  group: stage-5-promote
  cancel-in-progress: false

Observation: The concurrency group serializes promote jobs, but if two commits land on main in rapid succession, the second promote PR will open against the first's PR branch (not yet merged), leading to a conflict or double-promotion.

Current behavior: The second promote will likely fail when it tries to create a PR with the same branch name, or succeed with a no-op diff if the first PR already merged.

Assessment: Acceptable for current usage, but worth monitoring. If this becomes a problem, the promote could check for an existing open runtime-promote-* PR and skip or auto-merge it first.


✅ Strengths

  1. Comprehensive validation chains — rollback runs inventory-match + smoke-test before opening PR
  2. Defense-in-depth — revision label check catches tag retag drift (Spec defect: GHCR does not support tag immutability — pivot to digest-pin-only reproducibility #173 GHCR immutability gap)
  3. Excellent error messages — every failure path has a clear, actionable message
  4. Shell best practices — consistent set -euo pipefail, proper quoting, shellcheck compliance
  5. Safe defaults — both prune and rollback default to dry_run=true
  6. Idempotent operations — freshness check dedupe prevents issue spam
  7. Well-documented trade-offs — inline comments explain decisions (e.g., cancel-in-progress: false rationale)
  8. OCI label assertion — smoke-test.sh now validates revision label is 40-hex (catches future Dockerfile errors)

📋 Additional Observations

Documentation

  • The PR body is exceptional — clear summary, verification table, path deviations documented
  • Inline workflow comments are thorough and reference spec sections
  • The runtime/ci-manifest.yaml comment block (lines 11-17) clearly documents the manual marketplace SHA bump policy

Testing

  • PR references actionlint clean and inquisitor pass 2 coverage
  • STAGE 5 dry-run verified in run 25411378215
  • Post-merge Task 6.9 rollback rehearsal deferred (reasonable given workflow_dispatch limitation on non-default branch)

Architecture

  • Atomic 7-occurrence substitution with count assertion is solid
  • Path-scoped freshness denominator (§13 Q7) correctly implemented
  • Protection invariant in prune (40-hex tag = skip) is correctly expressed in jq

🎯 Recommendation

Merge after addressing Critical (#1) and High-Priority (#2, #3, #4) issues. The medium-priority items can be addressed in follow-up PRs if time is constrained, though #5 (typo) is trivial to fix now.

The code quality and architectural decisions are excellent. The security issues are fixable with small, localized changes. Once those are resolved, this PR will be a robust foundation for Phase 6.


Summary of Required Changes

# Priority Issue File(s) Estimated Fix Time
1 🔴 Critical Add SHA256 verification for yq download runtime-check-private-freshness.yml, runtime-rollback.yml 10 min
2 🟡 High Fix token leakage risk in error message runtime-check-private-freshness.yml 5 min
3 🟡 High Remove hardcoded branch in probe verify-app-secrets.yml 5 min
4 🟡 High Add safety threshold to prune deletion runtime-prune-pending.yml 5 min
5 🟢 Medium Fix typo: differen-eyes → different-eyes runtime-rollback.yml 1 min

Total estimated time to address blocking issues: ~25 minutes


Review Complete

I've thoroughly reviewed all 8 files (2342 additions, 0 deletions) across code quality, security, performance, test coverage, and documentation dimensions. The PR demonstrates excellent engineering practices and thoughtful design. Addressing the identified security issues will make this ready to merge.


CI's raven-actions/actionlint@v2 runs shellcheck at -S style (stricter than
my local default) and surfaced 20 errors my local actionlint missed.
Combined with the bot's PR review (1 Critical + 3 High + advisory items),
this fixup addresses everything blocking the merge.

Actionlint fixes:
- runtime-build.yml SC2126: grep | wc -l → grep -cE | awk -F: sum (multi-file
  grep -c outputs file:count, must sum)
- runtime-rollback.yml SC2018/2019 (×5): tr a-z A-Z → tr '[:lower:]' '[:upper:]'
- runtime-rollback.yml SC2126: same grep -c+awk pattern
- runtime-check-private-freshness.yml SC2129: combine 4 echo>>$GITHUB_ENV
  redirects into a single brace block (×2 places)
- runtime-check-private-freshness.yml SC2016: relocated `# shellcheck disable`
  inside the bash script body (was on outer YAML, not seen by shellcheck)

Bot review #209:
- Critical #1: yq SHA256 verification on runtime-rollback.yml + runtime-check-
  private-freshness.yml. Hash a2c097180dd884a8d50c956ee16a9cec070f30a7947cf4ebf
  87d5f36213e9ed7 verified against the official upstream checksums file
  (https://github.com/mikefarah/yq/releases/download/v4.44.3/checksums, SHA-256
  column per checksums_hashes_order)
- High #2: token leakage via :? expansion → explicit if [ -z ] check (the :?
  pattern echoes the variable's value to stderr if set-but-empty)
- High #3: hardcoded `phase-6-promotion` branch in verify-app-secrets probe
  → ${{ github.ref_name }} (probe now works on any branch)
- High #4: prune safety threshold — refuse to delete if candidate count > 50
  for a single package (defense-in-depth for protection-invariant filter bugs)
- Medium #5: typo 'differen-eyes' → 'different-eyes'
- Medium #6: rollback verifies target_pubsha appears in promote history
  (`git log --grep` against main); fetch-depth bumped to 0 to enable the
  walk. Catches operator typo before the rollback PR opens
- Medium #7: negative-gap guard in freshness alarm — exit 0 with warning
  if pinned-ref is somehow newer than main (force-push / main rewind)

Refs #144

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@github-actions
Copy link
Copy Markdown

github-actions Bot commented May 6, 2026

Claude finished @cbeaulieu-gt's task in 2m 36s —— View job


Incremental Review: Excellent Response to Feedback ✅

I've reviewed the changes from commit b92f5e58395871 (addressing #209 review feedback).


Summary

All 7 issues from the previous review have been fixed, and the fixes exceed expectations. This is an exemplary response to code review feedback with thorough security hardening, improved error handling, and comprehensive inline documentation.


✅ Issues Addressed

🔴 Critical Issue #1: Unverified Binary Download → FIXED

Files: runtime-check-private-freshness.yml, runtime-rollback.yml

Added SHA256 verification for yq binary downloads with proper error handling:

YQ_SHA256="a2c097180dd884a8d50c956ee16a9cec070f30a7947cf4ebf87d5f36213e9ed7"
sudo curl -fsSL -o /usr/local/bin/yq \
  https://github.com/mikefarah/yq/releases/download/v4.44.3/yq_linux_amd64
echo "${YQ_SHA256}  /usr/local/bin/yq" | sha256sum -c -

Verified: Hash matches official checksums file. Includes detailed comment explaining verification source.


🟡 High-Priority Issue #2: Token Leakage Risk → FIXED

File: runtime-check-private-freshness.yml:105-112

Replaced risky :? parameter expansion with explicit conditional:

if [ -z "${APP_TOKEN:-}" ]; then
  echo "::error::App token not set — check APP_ID/APP_PRIVATE_KEY secrets and #206 install status" >&2
  exit 1
fi

Verified: No risk of token exposure. Comment explains the security rationale.


🟡 High-Priority Issue #3: Hardcoded Branch Name → FIXED

File: verify-app-secrets.yml:66,78

Changed from hardcoded "phase-6-promotion" to dynamic "${{ github.ref_name }}":

-f branch="${{ github.ref_name }}"

Verified: Probe will work on any branch.


🟡 High-Priority Issue #4: Missing Prune Safety Threshold → FIXED

File: runtime-prune-pending.yml:142-152

Added defense-in-depth threshold with detailed error messaging:

if [ "$count" -gt 50 ]; then
  echo "::error::Prune candidate count ($count) exceeds safety threshold (50) for ${{ matrix.package }} — refusing to delete. Investigate via workflow_dispatch with dry_run=true and reduce candidates manually if intentional." >&2
  jq -r '.[] | "  id=\(.id) created=\(.created_at) tags=\(.tags | join(","))"' /tmp/candidates.json | head -20 >&2
  exit 1
fi

Verified: Threshold is generous (50) and provides actionable diagnostics on failure.


🟢 Medium-Priority Issue #5: Typo → FIXED

File: runtime-rollback.yml:256

Changed differen-eyes to different-eyes in reviewer checklist.

Verified: Typo corrected.


🟢 Medium-Priority Issue #6: Automated Rollback Provenance → FIXED (Exceeded Expectations!)

File: runtime-rollback.yml:76-91

Added entirely new validation step that automates the manual checklist verification:

if ! git -C "$GITHUB_WORKSPACE" log --grep="auto/runtime-promote-${TARGET}" main --format=%H | head -n1 | grep -q .; then
  echo "::error::target_pubsha $TARGET does not appear in promote history (no commit on main matches 'auto/runtime-promote-${TARGET}'). Either the pubsha is invalid or the promote was reverted before merging." >&2
  exit 1
fi

Also updated fetch-depth: 0 (from 1) to support the git log check.

Verified: This goes beyond the suggestion — the check is now automated rather than just better documented. Catches operator typos before the PR opens.


🟢 Medium-Priority Issue #7: Negative Time Gap → FIXED

File: runtime-check-private-freshness.yml:149-155

Added sanity check with clear warning:

if [ "$DAYS_GAP" -lt 0 ]; then
  echo "::warning::Pinned ref ($PINNED_DATE) is NEWER than main ($MAIN_DATE) — possible force-push or main rewind in private repo. Skipping staleness check."
  exit 0
fi

Verified: Handles edge case gracefully with actionable diagnostic.


🎯 Additional Improvements

Beyond the review items, the commit also addresses actionlint and shellcheck issues:

  1. SC2126 (grep|wc -l): Fixed in both runtime-build.yml and runtime-rollback.yml

    • Changed to grep -c with awk summation for multi-file counts
    • Includes comment explaining why grep -c alone doesn't work
  2. SC2129 (multiple >> redirects): Fixed in runtime-check-private-freshness.yml

    • Combined into single brace block
  3. POSIX compliance: Improved tr usage throughout

    • Changed tr a-z A-Z to tr '[:lower:]' '[:upper:]'
  4. Git fetch reliability: Simplified fetch logic in freshness check

    • Removed redundant fetch-and-fallback pattern

📊 Review Across 5 Dimensions

Dimension Rating Notes
Code Quality ✅ Excellent All shellcheck issues fixed, POSIX-compliant, clear inline docs
Security ✅ Excellent SHA256 verification added, token leakage eliminated, defense-in-depth threshold
Performance ✅ No concerns Changes don't introduce performance issues; awk summation is efficient
Test Coverage ✅ N/A Workflow files; defensive checks act as runtime validation
Documentation ✅ Excellent Every fix includes detailed comment with review item reference

✅ Recommendation

Approve and merge. All blocking issues have been resolved. The fixes are well-implemented, thoroughly documented, and in several cases exceed the original recommendations.

This is a model response to code review — every issue addressed, additional improvements made proactively, and comprehensive inline documentation explaining the "why" behind each change.


🎖️ Strengths of This Response

  1. 100% issue coverage — all 7 review items addressed
  2. Defense-in-depth — multiple layers of validation (SHA256, thresholds, provenance)
  3. Excellent error messages — every failure path has clear, actionable diagnostics
  4. Comprehensive inline docs — each fix references the review item number and explains rationale
  5. Proactive improvements — addressed shellcheck/actionlint issues beyond review scope
  6. Exceeded expectations — Issue feat: automate ci-failure → apply-fix pipeline on high-confidence diagnoses #6 is now fully automated, not just improved

Two follow-up fixes after CI revealed:

1. SC2016 (9 errors) wasn't silenced by file-top disable directive. The
   directive applies file-wide ONLY when shellcheck recognizes the script
   as a 'file'; CI's raven-actions/actionlint passes each run-block as
   isolated content. Solution: put the disable directly above the brace
   block (compound command) and the COMMIT_LOG/PATH_LIST commands —
   shellcheck applies the disable to the whole compound command.
   The backticks in printf format strings are intentional Markdown
   syntax, not command substitution.

2. The #6 promote-history check grepped for 'auto/runtime-promote-<TARGET>'
   which only exists as a branch name, not in any commit message — the
   check would have failed every time. Squash-merged STAGE 5 PRs produce
   commits with the PR title as subject: 'promote: runtime images @<sha>'.
   Fixed grep pattern + switched to origin/main since 'main' alone may
   not be available in shallow checkouts.

Refs #144, #209
@github-actions
Copy link
Copy Markdown

github-actions Bot commented May 6, 2026

Claude finished @cbeaulieu-gt's task in 1m 33s —— View job


Incremental Review: Excellent Fixup ✅

Reviewed commit 2360bff (fixup: SC2016 brace-block disable + correct promote-history grep).


Summary

This fixup addresses two items:

  1. SC2016 shellcheck suppressions — properly documented disables for intentional backticks in Markdown formatting strings
  2. Promote-history grep fix — corrects a logic bug in rollback validation that would have caused false negatives

Verdict: ✅ Approve. Both changes are correct improvements with excellent documentation.


Changes Analyzed

1. SC2016 Suppressions (runtime-check-private-freshness.yml)

Added three shellcheck disable directives with detailed justification comments:

# SC2016 disabled: the git --pretty format string contains literal
# backticks intended for Markdown rendering, not shell command
# substitution. Single quotes preserve them as-is.
# shellcheck disable=SC2016
COMMIT_LOG=$(git -C /tmp/cc log --no-merges \
  --pretty=format:'- `%h` %s' \
  ...

Assessment: ✅ Correct. SC2016 flags literal $var in single-quoted strings as "potentially unintended non-expansion." In these cases, the backticks and format specifiers (%h, %s) are intentional for git pretty-printing and Markdown rendering. The disable comments clearly explain this.

Best practice followed: Each disable includes a detailed comment explaining why the suppression is safe, referencing the specific pattern being suppressed.


2. Promote-History Grep Fix (runtime-rollback.yml)

Before (broken):

git log --grep="auto/runtime-promote-${TARGET}" main

After (correct):

git log --grep="promote: runtime images @${TARGET}" origin/main

Why this is a bug fix:

  • The original grep searched for auto/runtime-promote-<sha>, which is the branch name created by STAGE 5
  • Branch names don't appear in commit messages — only the PR title gets preserved as the merge commit subject
  • The PR title format is "promote: runtime images @<github.sha>" (per runtime-build.yml STAGE 5 step "Open digest-bump PR")
  • The fixed grep correctly matches the merge commit subject

Additional improvements:

  • Changed mainorigin/main for explicit remote reference
  • Updated error message to reflect the actual search pattern
  • Added comprehensive comment explaining what gets matched and why

Impact: This fix makes the rollback provenance validation actually work. The original version would have always failed to find promoted versions, blocking legitimate rollbacks.


Review Across 5 Dimensions

Dimension Rating Notes
Code Quality ✅ Excellent Clear comments, correct logic, follows shellcheck best practices
Security ✅ No concerns Changes don't introduce security risks; grep fix improves validation reliability
Performance ✅ No impact Suppressions and grep pattern are equivalent in performance
Test Coverage ✅ N/A Workflow files; runtime validation is the test
Documentation ✅ Excellent Each disable has detailed justification; promote-history comment is thorough

✅ Recommendation

Approve and merge. This fixup:

  • Properly suppresses false-positive shellcheck warnings with clear documentation
  • Fixes a real bug in rollback validation logic
  • Improves inline documentation

No issues found. The promote-history grep fix is particularly important — without it, rollback provenance validation would silently fail.


Todo

  • Examine incremental diff (83958712360bff)
  • Assess if foundational code is touched
  • Review changes across 5 dimensions
  • Post findings

…nt (closes #184)

The GitHub REST API's /issues/{n}/comments endpoint does NOT support
sort/direction query params — those are silently ignored and results
come back in created_at ASC order. The action's previous logic:

  gh api ".../comments?sort=updated&direction=desc" --jq '... | first | ...'

picked the OLDEST bot comment, not the latest. On any PR with >1 bot
review (every PR with a fixup), the gate would grep stale review text
and fire false-positive failures (or miss real findings).

Fix: drop the bogus sort params (the API ignores them anyway) and sort
in jq:

  --jq 'map(...) | sort_by(.updated_at) | last | .body // ""'

This was demonstrated empirically on PR #209 round 3:
- API returns oldest first regardless of sort=updated&direction=desc
- 'first' picked the round-1 review with `Critical Issue #1` headers
  in the 'Issues Addressed' section (those are described fixes, not
  findings — but the gate's grep can't tell the difference)
- Latest review (the 'Excellent Fixup' approval) had 0 markers

Note (dogfooding limitation): claude-pr-review.yml references
pr-review@v2, not the local branch, so this fix only takes effect
after #209 merges and v2 is retagged. PR #209's own quality-gate
remains red until then; admin-merge with reference to #184.

Closes #184
Refs #144, #209

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@github-actions
Copy link
Copy Markdown

github-actions Bot commented May 6, 2026

Claude finished @cbeaulieu-gt's task in 1m 36s —— View job


Incremental Review: Quality-Gate Comment Selection Fix ✅

Reviewed commit ec024cc (fix(pr-review): quality-gate now picks the genuinely-latest bot comment).


Summary

This is a well-executed bug fix for issue #184. The change corrects how the quality-gate status check identifies the latest PR review comment from github-actions[bot].

Verdict: ✅ Approve. No blocking issues found. The fix is correct, well-documented, and addresses a real bug.


What Changed

The code that fetches the latest bot comment for quality-gate severity scanning was relying on GitHub API query parameters (sort=updated&direction=desc) that are silently ignored by the /issues/{n}/comments endpoint.

Before (broken):

BODY=$(gh api "repos/$REPO/issues/$PR_NUMBER/comments?per_page=100&sort=updated&direction=desc" \
  --jq 'map(select(.user.login == "github-actions[bot]")) | first | .body // ""' \

After (correct):

BODY=$(gh api "repos/$REPO/issues/$PR_NUMBER/comments?per_page=100" \
  --jq 'map(select(.user.login == "github-actions[bot]")) | sort_by(.updated_at) | last | .body // ""' \

Why this matters: Without manual sorting, the code was selecting the first bot comment (oldest by created_at) instead of the most recently updated one. On PRs with multiple review runs, this meant the quality gate could be checking severity markers against an outdated review, causing incorrect pass/fail signals.


Review Across 5 Dimensions

Dimension Rating Notes
Code Quality ✅ Excellent Clear fix with comprehensive inline documentation explaining the API limitation and solution
Security ✅ No concerns No security implications; proper auth maintained via GH_TOKEN
Performance ✅ No concerns Minimal impact; jq sorting on ≤100 items is negligible
Test Coverage ⚠️ Limited Workflow file — no unit tests available; fix was presumably validated via observed behavior
Documentation ✅ Excellent Inline comment explains the problem, solution, and pagination trade-off

✅ Strengths

  1. Root cause clearly identified: The comment explicitly states that /issues/{n}/comments doesn't support sort/direction params
  2. Correct fix: Manual sort_by(.updated_at) | last ensures genuinely-latest selection
  3. Honest trade-off documentation: The 100-comment pagination limit is acknowledged inline
  4. Proper fallback: || echo "" and subsequent null-check handle the no-comments edge case
  5. References issue Quality-gate selects wrong comment when multiple github-actions[bot] workflows post on a PR #184: Traceable back to the original bug report

📊 Observations

Pagination Trade-Off (Acceptable)

The comment notes: "100 comments is enough for any realistic PR; if a PR ever exceeds, the gate misses newer pages."

Assessment: This is a reasonable pragmatic trade-off:

  • Most PRs have far fewer than 100 comments
  • The failure mode (using an older bot comment) is not catastrophic — it might miss severity markers from the latest review, but won't crash the workflow
  • Full pagination would add significant complexity (looping through pages, managing rate limits)
  • The 100-comment limit is explicitly documented for future maintainers

If this becomes a problem in practice, it can be addressed with pagination logic later. For now, this is the right balance of simplicity vs. completeness.


🎯 Recommendation

Approve and merge. This fix:

  • Correctly addresses the root cause (API limitation)
  • Improves reliability of the quality-gate status check
  • Is well-documented with clear inline comments
  • Has no security, performance, or correctness issues

No changes required.


Todo

  • Examine incremental diff (2360bffec024cc)
  • Assess if foundational code is touched
  • Review changes across 5 dimensions
  • Post findings

@cbeaulieu-gt cbeaulieu-gt merged commit b080630 into main May 6, 2026
17 of 18 checks passed
@cbeaulieu-gt cbeaulieu-gt deleted the phase-6-promotion branch May 6, 2026 02:23
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment