Disable cache-on-failure (save-always: false on julia-actions/cache@v3) #119
Merged
`julia-actions/cache` v3.0.0 introduced a breaking change: caches are now saved on job failure by default. v2 only saved on success. The override `save-always: false` restores the v2 behavior.

The new default is reasonable for the common case (a flaky test failure happens after an expensive depot install; keeping that install cached makes the retry fast). But it's actively harmful when the failure is in the setup itself (a half-installed depot, an aborted `Pkg` precompile, etc.). In that case the broken state gets cached, the restore-key prefix matches subsequent runs, and every retry restores the broken state and fails identically. Reruns alone can't recover; the cache has to be manually evicted or expire.

We hit this on `ITensor/ITensorNetworks.jl#373`: a fresh Windows run failed in `Pkg.test` precompilation (`ChainRulesCore is required but does not seem to be installed`), the broken state was cached, and two reruns reproduced the same failure verbatim by restoring from that cache.

Set `save-always: false` on every `julia-actions/cache@v3` invocation in this repo's reusable workflows (Tests, CheckCompatBounds, Documentation, FormatCheck, FormatPullRequest, Registrator). One explanatory comment lives in `Tests.yml`; the others reference back to it.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Summary
`julia-actions/cache@v3` (introduced via #118) saves caches on job failure by default, where v2 only saved on success. The escape hatch documented in the v3 release notes is `save-always: false`. This PR sets that on every `julia-actions/cache@v3` invocation in the reusable workflows here.

The new v3 default is reasonable for the common case (test-failure retries reuse the expensive depot install). But when the failure is in the setup itself (a half-installed depot, an aborted `Pkg` precompile), the broken state is cached, the restore-key prefix matches subsequent runs, and every retry restores the broken state and fails identically. Reruns alone can't recover; the cache has to be manually evicted or expire.

This was hit on `ITensor/ITensorNetworks.jl#373`: a fresh Windows run failed in `Pkg.test` precompilation (`ChainRulesCore is required but does not seem to be installed`), the broken state was cached, and two reruns reproduced the failure verbatim by restoring from that cache.
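
For illustration, the override might look roughly like the following in one of the reusable workflows. This is a minimal sketch, not the actual contents of `Tests.yml`: the job name, runner, and surrounding steps are assumptions. Only the `save-always: false` input on `julia-actions/cache@v3` comes from the description above.

```yaml
# Hypothetical excerpt of a reusable workflow; everything except the
# `save-always: false` input is an illustrative assumption.
on:
  workflow_call:

jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: julia-actions/setup-julia@v2
      # julia-actions/cache@v3 saves the cache even when the job fails.
      # If the failure is in setup (half-installed depot, aborted precompile),
      # that broken state would be restored on every rerun, so restore the
      # v2 behavior of saving only on success.
      - uses: julia-actions/cache@v3
        with:
          save-always: false
      - uses: julia-actions/julia-buildpkg@v1
      - uses: julia-actions/julia-runtest@v1
```

With this setting, a job that fails during setup leaves no new cache entry behind, so a rerun starts from the last successfully saved cache (or from scratch) instead of restoring the broken depot. The trade-off is that a late test failure no longer caches the expensive install for the retry.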