fix(ci): repair macOS rustup proxy in release workflows + workspace-wide actionlint cleanup#245
Merged
Conversation
…ide actionlint cleanup
Root cause: `macos-latest` runner images ship with an occasionally-stale `~/.cargo/bin/cargo` symlink pointing at `rustup-init` (the installer) instead of the proxy that forwards to the active toolchain's real cargo. `rustup show` succeeded (it invokes rustup directly), but the next step's `cargo build` hit the broken proxy and fell through to the installer's argument parser with `error: unexpected argument 'build' found`. ~50% of `release-cache-warm.yml`'s macOS jobs were failing this way over the past week.
Three changes to release-cache-warm.yml::warm (the failing job):
* Install-step now appends 'rustup default "$(rustup show active-toolchain | awk '{print $1}')"' after 'rustup show'. Rewrites the proxy binaries in ~/.cargo/bin so symlinks resolve to the active toolchain's real cargo. Fail-fast smoke check (cargo --version / rustc --version) surfaces any residual breakage in ~1s instead of letting Swatinem/rust-cache + cargo build burn 30+ min on the same root cause.
* Single auto-retry on the cargo build step (2 attempts, 10-s sleep). Absorbs transient network/sccache flakes without manual re-runs.
* 'continue-on-error: true' at the warm-job level. The workflow's own header comment documents that cache-warming is best-effort by design; this matches that intent and stops noisy ❌ in PR checks for a single-platform image flake. Real regressions still surface via run history + per-job summary.
Mirror of fix #1 applied to release.yml::build-release-binaries — that workflow's tag-dispatched runs would hit the same broken proxy on a freshly-flaky runner image, and on the production release pipeline a 30+ min late failure delays the release by a full re-run cycle. Fixes #2 and #3 are NOT mirrored: release.yml is production critical-path and should fail loudly so notify-failure fires.
Adjacent actionlint cleanup in the same PR (workspace-wide, zero new warnings):
* release.yml — 5 clusters: SC2086 (unquoted $GITHUB_OUTPUT / $GITHUB_STEP_SUMMARY writes), SC2129 (group individual redirects with { ... } >> "$out"), SC2010 ('ls | grep' replaced with explicit glob loop over shipping-set names), SC2035 ('sha256sum *' / 'shasum -a 256 *' switched to './*' form to guard against future filenames starting with '-').
* auto-rerun-transient.yml — 3x SC2016 false positives where literal backticks inside single-quoted printf format strings looked like command-substitution markers. Rewritten as 'echo' with backslash-escaped backticks (idiomatic for 'print this template string'). No per-line suppression directive needed.
* cargo-vet-refresh.yml — 1x SC2129 (grouped redirect).
* dependabot-review.yml — 3x SC2129 (grouped redirects in delta-summary table, newly-resolved-crates block, dropped-crates block).
Verification: 'actionlint .github/workflows/*.yml' is now clean across every workflow (zero warnings, zero errors). All script behavior is byte-identical to before; bash test runs of the rewritten echo / grouped-redirect blocks produce the same output as the originals.
githubrobbi
added a commit
that referenced
this pull request
May 15, 2026
…lds (#254) * chore: development v0.5.99 - comprehensive testing complete [auto-commit] * fix(ci): repair rustup proxy AFTER cache restore on macOS release builds Release pipeline #97 (v0.5.98) failed on the aarch64-apple-darwin build with: error: unexpected argument 'build' found Usage: rustup-init[EXE] [OPTIONS] Stack backtrace: rustup_init::run_rustup_inner ... PR #245 already added a proxy-repair line ('rustup default ...') to the 'Install pinned nightly' step, and the in-step 'cargo --version' smoke check still passes — but the very next step, 'Swatinem/rust-cache', restores '~/.cargo/bin/' from a prior run's cache. On macos-latest that cache holds a poisoned 'cargo' symlink pointing at 'rustup-init' (the installer) rather than the toolchain proxy. The restore overwrites the freshly-repaired '~/.cargo/bin/cargo', and the subsequent 'cargo build' picks up the poisoned proxy and exits with the rustup-init backtrace before reaching any cargo subcommand. Empirical proof from the failing job (76219153458): 16:36:34Z cargo --version -> cargo 1.97.0-nightly (proxy healthy) 16:36:35Z Swatinem/rust-cache: restoring ~/.cargo/bin/ ... 16:36:54Z cargo build -> rustup-init backtrace (proxy poisoned) Fix: add a second proxy-repair step *after* the cache-restore step and *before* the build step. Re-running 'rustup default "$(rustup show active-toolchain | awk '{print $1}')"' rewrites the proxy binaries in '~/.cargo/bin' so the build resolves to the active toolchain's cargo even when the restored cache was poisoned. On the next successful run, 'Swatinem/rust-cache' saves a clean cache and the poison clears itself. Both 'release.yml::build-release-binaries' and 'release-cache-warm.yml::warm' get the same step (with matching commentary) — without the fix in the warm workflow, every macOS warm run would re-save the poisoned cache and perpetuate the regression. Step is gated on 'runner.os == macOS' so Linux and Windows legs (which do not exhibit this symptom) are unaffected. Pre-cache repair line is intentionally kept so the in-step 'cargo --version' smoke check still fails fast on a broken runner-image proxy.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Fixes the ~50% failure rate observed on
release-cache-warm.yml's macOS leg over the past week, plus a workspace-wide actionlint cleanup of every other workflow.Root cause
On
macos-latestrunner images the~/.cargo/bin/cargoproxy is occasionally a stale symlink pointing atrustup-init(the installer binary) instead of the proxy that forwards to the active toolchain's real cargo.rustup showsucceeded because it invoked rustup directly; the very next step'scargo buildhit the broken proxy and fell through to the installer's argument parser:Linux + Windows runners were unaffected.
Changes
release-cache-warm.yml::warm(the failing job) — three fixesrelease.yml::build-release-binaries— Fix A mirroredSame toolchain-install repair applied to the production release pipeline, because tag-dispatched runs would hit the same broken proxy on a freshly-flaky runner image — and on the production critical path a 30+ min late failure delays the release by a full re-run cycle.
Fixes B and C are NOT mirrored — `release.yml` is production critical-path; it should fail loudly so the `notify-failure` issue-opener fires.
Workspace-wide actionlint cleanup (no new warnings remain)
While auditing the two release workflows, every other workflow in `.github/workflows/` was actionlint-scanned and the pre-existing shellcheck warnings (all info / style level — none of them latent bugs) were resolved:
Verification
Discipline notes (per request)