fix(ci): fail-fast pre-flight + per-job permissions on release publish by githubrobbi · Pull Request #255 · skyllc-ai/UltraFastFileSearch

githubrobbi · 2026-05-15T18:35:36Z

What

Release pipeline #98 (v0.5.99) built binaries on all three platforms for ~32 minutes and then died in <1 second at the 📦 Create GitHub Release step with:

```
HTTP 403 — Resource not accessible by integration
```

Root cause was a repo-level setting flip: `Settings → Actions → General → Workflow permissions` had been switched from "Read and write" to "Read-only". That toggle clamps every job's `GITHUB_TOKEN` to read scope at runtime regardless of what the workflow file declares — so the workflow's top-level `permissions: contents: write` was silently downgraded to read. v0.5.96 (the previous successful release ~16 h earlier) ran with the same workflow file and succeeded, confirming the file was never the problem.

The org's audit log is not retrievable (Free-tier org — `gh api /orgs/skyllc-ai/audit-log` returns 404), so the actor/timestamp of the flip is unrecoverable.

Why this PR is needed even though the immediate symptom can be fixed by re-flipping the toggle

A repo-level toggle that bypasses every safeguard in YAML is a perfect silent-rot vector: the next time it's flipped (intentionally or accidentally — by you, a co-maintainer, an org admin, or a GitHub-side policy nudge), the next release will silently burn ~30 minutes of build time before dying at the publish step. This PR makes that scenario fail in ~1 second with a precise error message.

The two belt-and-suspenders changes

1. Pre-flight permissions probe in `release-preparation`

Creates a draft release with a throwaway tag (`preflight-permcheck-run<run_id>-attempt<run_attempt>`) and immediately deletes it on success. Exercises the EXACT REST path that `softprops/action-gh-release` uses 30+ minutes later in `create-github-release`, so a permissions clamp surfaces here in ~1 s with an actionable error message pointing at the toggle that needs flipping (repo level, with an org-level fallback note for free-tier orgs where the cascade can come from the org).

```yaml
- name: Pre-flight — verify Actions token can create releases
  shell: bash
  env:
    GH_TOKEN: ${{ secrets.GITHUB_TOKEN }}
  run: |
    set -euo pipefail
    TEST_TAG="preflight-permcheck-run${{ github.run_id }}-attempt${{ github.run_attempt }}"
    if ! gh release create "$TEST_TAG" --repo "${{ github.repository }}" --draft …; then
      # ::error annotation + heredoc with full remediation steps
      exit 1
    fi
    gh release delete "$TEST_TAG" --repo "${{ github.repository }}" --yes || \
      echo "::warning::orphan is harmless"
```
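For readers who want to dry-run the probe logic outside of Actions, the step can be sketched as a standalone bash script. This is a minimal sketch under stated assumptions, not the PR's actual script: the function names, the `GH_CMD` override, the `--title`/`--notes` flags, and the wording of the `::error` annotation are all hypothetical, since the PR elides the real flags and remediation text.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Build the throwaway tag; run id plus attempt keeps retries from colliding.
preflight_tag() {
  printf 'preflight-permcheck-run%s-attempt%s\n' "$1" "$2"
}

# ASSUMPTION: the annotation wording is illustrative, not the PR's real text.
emit_clamp_error() {
  local repo="$1"
  echo "::error title=Release pre-flight failed::GITHUB_TOKEN cannot create a release in ${repo}. Check Settings > Actions > General > Workflow permissions."
}

# Create a draft release, then delete it again.
# GH_CMD is a hypothetical override (defaults to the real gh CLI) so the
# flow can be exercised with a stub instead of a live repository.
probe_release_permission() {
  local repo="$1" tag="$2" gh_cmd="${GH_CMD:-gh}"
  # ASSUMPTION: --title/--notes are hypothetical; the PR elides its real flags.
  if ! "$gh_cmd" release create "$tag" --repo "$repo" --draft \
      --title "$tag" --notes "pre-flight permission probe"; then
    emit_clamp_error "$repo"
    return 1
  fi
  # A failed delete only leaves a draft behind, so warn instead of failing.
  "$gh_cmd" release delete "$tag" --repo "$repo" --yes \
    || echo "::warning::could not delete draft ${tag}; orphan is harmless"
}
```

Inside a workflow this might be called as `probe_release_permission "$GITHUB_REPOSITORY" "$(preflight_tag "$GITHUB_RUN_ID" "$GITHUB_RUN_ATTEMPT")"`. Setting `GH_CMD` to a stub function allows both branches to be checked locally without touching a real repository.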

Why a synthetic-release probe rather than a direct `/repos/.../actions/permissions/workflow` API call: that endpoint requires the `administration: read` scope, which neither this workflow nor the default `GITHUB_TOKEN` carries. Widening to add it would expand the permission surface. The create-draft probe stays inside `contents: write` which the job already declares (no new scope grants).

Cleanup safety: Draft releases do not create the underlying git tag (only published releases do), so a failed cleanup leaks at worst an invisible draft row — never an orphan tag. The throwaway tag name includes both `run_id` and `run_attempt` so concurrent or retried invocations cannot collide.
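If leftover probe drafts ever need tidying, a sweep along the following lines could work. It is a sketch only and not part of this PR: the function names and the `GH_CMD` override are hypothetical, and it assumes an installed `gh` whose `release list` supports `--json` and `--jq`.

```shell
#!/usr/bin/env bash
set -euo pipefail

# List draft releases whose tag carries the probe prefix.
# ASSUMPTION: the installed gh supports `release list --json ... --jq ...`.
list_orphan_probe_drafts() {
  local repo="$1" gh_cmd="${GH_CMD:-gh}"
  "$gh_cmd" release list --repo "$repo" --json tagName,isDraft \
    --jq '.[] | select(.isDraft and (.tagName | startswith("preflight-permcheck-"))) | .tagName'
}

# Delete each leftover draft; a failed delete is reported but does not
# abort the sweep. GH_CMD is a hypothetical override for dry runs.
sweep_orphan_probe_drafts() {
  local repo="$1" gh_cmd="${GH_CMD:-gh}" tag
  list_orphan_probe_drafts "$repo" | while IFS= read -r tag; do
    # stdin is redirected so the delete call cannot consume the tag list.
    "$gh_cmd" release delete "$tag" --repo "$repo" --yes </dev/null \
      || echo "::warning::could not delete ${tag}"
  done
}
```

Because the filter matches only drafts with the `preflight-permcheck-` prefix, published releases and unrelated drafts are left alone.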

2. Explicit per-job `permissions:` block on `create-github-release`

Pins `contents: write`, `id-token: write`, `attestations: write` at the point of use rather than relying on inheritance from the workflow-level block ~520 lines up.

Does NOT change runtime behaviour by itself — the repo-level clamp still wins. But it pairs with the pre-flight to make the failure mode self-documenting from the YAML alone: a reader doesn't have to scroll 500 lines to learn which scopes this job needs.

```yaml
create-github-release:
  name: 📦 Create GitHub Release
  …
  permissions:
    contents: write      # action-gh-release: create v* tag + release row
    id-token: write      # SLSA build-provenance: OIDC for Sigstore Fulcio
    attestations: write  # SLSA build-provenance: post attestation to repo
```

Verification

`actionlint .github/workflows/release.yml .github/workflows/release-cache-warm.yml` — clean.
Local `lint-pre-push` gate (22 stages, including `workflow-drift`) — ✅ all green in 51s.
The probe itself will be exercised on the next release dispatch. On the current repo state (workflow permissions = "read"), it will fail with the precise error message it's designed to surface; once the repo-level toggle is flipped back to "write", subsequent runs will pass the probe in ~1 s.

What it does NOT do

It does not flip the repo-level workflow-permissions toggle (that's a repo-admin action that should be done deliberately, not as a code change).
It does not change behaviour on the existing failing run — that one is permanently failed and needs a fresh dispatch.
It does not widen the workflow's permission surface: no new `administration: read` scope, no new tokens.

Recommended next steps after merge

Flip the toggle: `Settings → Actions → General → Workflow permissions → Read and write permissions → Save` (or via API as documented in the probe's failure message).
Re-dispatch `🚀 UFFS Release Pipeline` for `v0.5.99` (no version bump needed — Cargo.toml is already at 0.5.99, no `v0.5.99` tag exists yet, and the release workflow creates the tag atomically).
The pre-flight probe will pass in <2 s; the rest of the pipeline proceeds.
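The API route for the first step can be sketched with `gh api` against GitHub's "default workflow permissions" REST endpoint. This is a hedged sketch: the function names and the `GH_CMD` override are hypothetical, and whether these commands match the ones printed in the probe's failure message is an assumption. The calls need a token with admin rights on the repository; the workflow's own `GITHUB_TOKEN` cannot make them.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Read the repository's current default; the API reports "read" or "write".
# GH_CMD is a hypothetical override (defaults to the real gh CLI).
get_default_workflow_permissions() {
  local repo="$1" gh_cmd="${GH_CMD:-gh}"
  "$gh_cmd" api "repos/${repo}/actions/permissions/workflow" \
    --jq '.default_workflow_permissions'
}

# Set the default back to write.
# Requires an admin-capable token (for example a maintainer's own gh login).
set_default_workflow_permissions_write() {
  local repo="$1" gh_cmd="${GH_CMD:-gh}"
  "$gh_cmd" api --method PUT "repos/${repo}/actions/permissions/workflow" \
    -f default_workflow_permissions=write
}
```

A maintainer might run `set_default_workflow_permissions_write skyllc-ai/UltraFastFileSearch` and then confirm the result with `get_default_workflow_permissions` before re-dispatching the pipeline.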

Regression history: pipeline #98 / v0.5.99. Related prior PRs in the same release-stability sweep: #251 (show-binary-sizes shell bug), #254 (macOS rustup proxy post-cache-restore).

Release pipeline #98 (v0.5.99) burned ~32 min building binaries on all three platforms, then died in <1 s at the 'Create GitHub Release' step with the cryptic:

    HTTP 403 — Resource not accessible by integration

Root cause was a repo-level setting flip: 'Settings → Actions → General → Workflow permissions' had been switched from 'Read and write' to 'Read-only', which clamps every job's GITHUB_TOKEN to read scope at runtime regardless of what the workflow file declares. v0.5.96 (the previous successful release ~16 h earlier) ran with the same workflow file and succeeded, confirming the file was never the problem.

The audit log is not retrievable (free-tier org), so the actor/timestamp of the flip is unrecoverable.

This commit prevents the next ~30 min of silent build time:

1. Pre-flight permissions probe in 'release-preparation'. Creates a draft release with a throwaway tag (run_id+attempt) and immediately deletes it on success. Exercises the EXACT REST path that softprops/action-gh-release uses 30+ min later, so a permissions clamp surfaces in ~1 s with a precise error message pointing at the toggle that needs flipping (repo level, with an org-level fallback note). Draft releases do not create git tags, so failed cleanup leaks at worst an invisible draft row, never an orphan tag.

2. Explicit per-job 'permissions:' block on 'create-github-release' pinning 'contents: write', 'id-token: write', 'attestations: write'. Documents the scope needs at the point of use rather than relying on inheritance from the top-level block ~520 lines up. Does NOT change runtime behaviour by itself (the repo-level clamp still wins), but pairs with the pre-flight to make the failure mode self-documenting from the YAML alone.

Why a synthetic-release probe rather than a direct '/repos/.../actions/permissions/workflow' API call: that endpoint requires the 'administration: read' scope, which neither this workflow nor the default GITHUB_TOKEN carries. Widening to add it would expand the permission surface; the create-draft probe stays inside 'contents: write' which the job already declares.

Local validation: actionlint clean on both touched workflow files; lint-fast + lint-pre-push will gate the push.

githubrobbi enabled auto-merge (squash) May 15, 2026 18:36

githubrobbi merged commit 971bf7c into main May 15, 2026
18 checks passed

githubrobbi deleted the fix/ci-release-preflight-token-permissions branch May 15, 2026 18:48

githubrobbi mentioned this pull request May 16, 2026

🔴 Release Pipeline Failed — 9ed7487 #250

Closed

githubrobbi merged 1 commit into main from fix/ci-release-preflight-token-permissions
