Cache Stuck #658
Comments
I've run into the same issue. Each time it starts with:
I'm getting this too, on a GitHub-hosted runner. Logs are here: https://github.com/rcowsill/NodeGoat/runs/1504051837

In my case it's happening when a cache with the matching key already exists. Normally that would fail with the message "Cache already exists." instead of "Unable to reserve cache with key [...]".

I'm using satackey/action-docker-layer-caching@v0.0.8, which imports @actions/cache@1.0.4. In case it's important, that action runs up to four cache uploads in parallel, but all caches have unique keys within a single build, and I was only running that one build with those cache keys at the time.

EDIT: I tested with parallel uploads switched off and got the same result. I'm wondering if the server-side response for the "cache exists" case has been changed, to avoid retries which will never succeed.
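(Not from this thread — just a defensive sketch using the public @actions/cache API. The idea is to treat a failed reservation as non-fatal so the job doesn't break while the underlying issue exists; `trySaveCache` is an invented helper name, not part of the library.)

```js
// A defensive sketch, assuming @actions/cache and @actions/core are available.
// Treat "Unable to reserve cache..." style failures as a warning, not a job failure.
import * as cache from "@actions/cache";
import * as core from "@actions/core";

async function trySaveCache(paths, key) {
  try {
    return await cache.saveCache(paths, key);
  } catch (error) {
    // Reservation failures land here; warn and move on rather than failing the build.
    core.warning(`Skipping cache save for key "${key}": ${error.message}`);
    return -1;
  }
}

await trySaveCache(["node_modules"], `${process.env.RUNNER_OS}-npm-example-key`);
```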
I've seen this happen again. This might be the initial error:
I've also noticed this is extremely reproducible for some reason. This failure happens a lot for my Docker runners when things run in parallel, especially now that I have 8 runners.
This is now blocking the build system. I'm seeing this build error constantly:
I'm going to remove caching to fix it.
Same issue on a macOS runner. It works on a GitHub-hosted runner, but not locally. Our key is:
I think we encountered this problem by manually cancelling a run that had begun caching with a certain key before it completed the caching step. The key never seems to have been released, and subsequent runs that try to use the same key fail to reserve it:
...
Our only workaround was to change the key.
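(Not part of the original comment — a minimal sketch of that "change the key" workaround, assuming @actions/cache and @actions/glob are used directly. `CACHE_VERSION` is an invented knob: bumping it abandons the stuck key and starts a fresh entry.)

```js
// Minimal sketch of manually versioning the cache key to abandon a stuck entry.
import * as cache from "@actions/cache";
import * as glob from "@actions/glob";

const CACHE_VERSION = "v2"; // bump when a key gets stuck
const paths = ["node_modules"];
const hash = await glob.hashFiles("**/package-lock.json");
const prefix = `${process.env.RUNNER_OS}-${CACHE_VERSION}-npm-`;
const key = `${prefix}${hash}`;

// Entries under the same version can still seed a partial restore via the prefix.
await cache.restoreCache(paths, key, [prefix]);
// ... install / build steps ...
await cache.saveCache(paths, key);
```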
This is to work around actions/toolkit#658
I'm also experiencing this issue with my custom GitHub Action. It seems that it really insists on the cache key being unique? Otherwise it keeps failing with:
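(Not from this comment — one way to guarantee the saved key is unique per run while still reusing earlier caches through a prefix restore key. The key layout here is an assumption, not how the commenter's action actually builds its keys.)

```js
// Sketch: use the workflow run ID so the saved key never collides with an existing one,
// and rely on a prefix restore key for reuse across runs.
import * as cache from "@actions/cache";
import * as github from "@actions/github";

const paths = ["node_modules"];
const prefix = `${process.env.RUNNER_OS}-deps-`;
// context.runId is unique per workflow run.
const key = `${prefix}${github.context.runId}`;

await cache.restoreCache(paths, key, [prefix]); // falls back to the newest prefix match
// ... build steps ...
await cache.saveCache(paths, key);
```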
Same here, or something similar.
After rerunning the workflow the error is gone.
I had a relative path in the list of files to cache. But it seems that relative paths are not supported. Removing it fixed the error. |
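(A small sketch of the fix described above: resolve any relative cache paths against the workspace before handing them to @actions/cache. The path list is just an example.)

```js
// Resolve relative paths against the workspace before caching.
import * as path from "path";
import * as cache from "@actions/cache";

const workspace = process.env.GITHUB_WORKSPACE || process.cwd();
const rawPaths = ["node_modules", ".cache/build"]; // possibly relative
const cachePaths = rawPaths.map((p) => path.resolve(workspace, p));

const key = `${process.env.RUNNER_OS}-build-cache`;
await cache.restoreCache(cachePaths, key);
// ... build steps ...
await cache.saveCache(cachePaths, key);
```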
I think I am hitting this on two workflows, as rust-cache uses this underneath. Manually overriding cache keys and doing a commit feels like a bit of a whack-a-mole approach. Being able to manually delete the cache from a UI (or API) would be a cleaner workaround until the root cause is fixed.
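(Not something the commenter had at the time — a sketch of deleting a stuck cache entry by key, assuming the GitHub Actions cache-management REST API is available for the repository. The owner, repo, key, and token values are placeholders.)

```js
// Delete a cache entry by key via the Actions cache REST endpoint.
// Uses the global fetch available in Node 18+.
async function deleteCacheByKey(owner, repo, key, token) {
  const url = `https://api.github.com/repos/${owner}/${repo}/actions/caches?key=${encodeURIComponent(key)}`;
  const res = await fetch(url, {
    method: "DELETE",
    headers: {
      Authorization: `Bearer ${token}`,
      Accept: "application/vnd.github+json",
    },
  });
  if (!res.ok) {
    throw new Error(`Failed to delete cache "${key}": ${res.status}`);
  }
}

// Example: await deleteCacheByKey("my-org", "my-repo", "Linux-v1-npm-abc123", process.env.GITHUB_TOKEN);
```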
I also experienced: Warning: uploadChunk (start: 67108864, end: 100663295) failed: Cache service responded with 503
Faced the same on
Though everything worked on the very next run. |
I ran into this issue and think I figured out the most common cause. It seems like there are two causes for this:
If you're running into this problem and don't have any errors uploading, I'm guessing the root cause is the last one. Basically, I compared what this cache library was doing with how the official actions/cache action behaves. The official action does not try to save the cache if there was previously an exact key match on a cache hit. If you just naively do a restore cache + save cache (like I tried), you'll run into this error every time there's a cache hit (meaning you're trying to save a cache key which is already cached). So unfortunately the solution is to replicate the logic in https://github.com/actions/cache/blob/611465405cc89ab98caca2c42504d9289fb5f22e/src/save.ts.

Here's a utility function to wrap a cacheable function (like calling npm install):

// Imports from the Actions toolkit (the original snippet omitted them).
import * as cache from "@actions/cache";
import * as core from "@actions/core";
import * as exec from "@actions/exec";
import * as glob from "@actions/glob";

// Local stand-in for the helper the official actions/cache action uses:
// a hit only counts as "exact" when the restored key matches the primary key.
function isExactCacheKeyMatch(key, cacheKey) {
  return !!cacheKey && cacheKey.toLowerCase() === key.toLowerCase();
}

async function withCache(cacheable, paths, baseKey, hashPattern) {
  const keyPrefix = `${process.env.RUNNER_OS}-${baseKey}-`;
  const hash = await glob.hashFiles(hashPattern);
  const primaryKey = `${keyPrefix}${hash}`;
  const restoreKeys = [keyPrefix];

  const cacheKey = await cache.restoreCache(paths, primaryKey, restoreKeys);
  if (!cacheKey) {
    core.info(`Cache not found for keys: ${[primaryKey, ...restoreKeys].join(", ")}`);
  } else {
    core.info(`Cache restored from key: ${cacheKey}`);
  }

  await cacheable();

  if (isExactCacheKeyMatch(primaryKey, cacheKey)) {
    core.info(`Cache hit occurred on the primary key ${primaryKey}, not saving cache.`);
    return;
  }
  await cache.saveCache(paths, primaryKey);
}

await withCache(async () => {
  await exec.exec("npm install");
}, ["node_modules"], "npm", "**/package.json");

You might want to customize the arguments and how the keys are built (maybe accept a list of restore keys too).
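(Not part of the original comment — a hypothetical variant sketching that last suggestion, accepting an explicit list of restore keys. It reuses the `isExactCacheKeyMatch` stand-in defined above, and `LOCK_HASH` is a placeholder for however you hash your lockfile.)

```js
// Variant that takes the primary key and restore keys directly.
import * as cache from "@actions/cache";
import * as exec from "@actions/exec";

async function withCacheKeys(cacheable, paths, primaryKey, restoreKeys = []) {
  const cacheKey = await cache.restoreCache(paths, primaryKey, restoreKeys);
  await cacheable();
  if (isExactCacheKeyMatch(primaryKey, cacheKey)) {
    return; // exact hit: saving again would only fail to reserve the key
  }
  await cache.saveCache(paths, primaryKey);
}

await withCacheKeys(async () => {
  await exec.exec("npm ci");
}, ["node_modules"], `${process.env.RUNNER_OS}-npm-${process.env.LOCK_HASH}`, [
  `${process.env.RUNNER_OS}-npm-`,
]);
```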
Current SSCACHE is stuck due to: actions/toolkit#658 The simplest solution is to invalidate it by switching the start of the cache key to v1. Signed-off-by: Jakub Sztandera <kubuxu@protocol.ai>
A build fails with:
The issue is still reproducible; there's no workaround at the moment.
Had a similar issue with ruff:

ruff failed
Cause: Failed to create cache file '/home/runner/work/aoc_2023/aoc_2023/.ruff_cache/0.1.12/1323824952410372998'
Cause: No such file or directory (os error 2)

Emptying the workflow file, committing, then pasting it back in fixed the issue.
### Description
- Uses derived hook and ISM config and dispatchTx of message to implement metadata fetching

### Drive-by changes
- Change yarn cache key to work around actions/toolkit#658
- Make `hyperlane message send` use `HyperlaneCore.sendMessage`

### Related issues
- Fixes #3450

### Backward compatibility
Yes

### Testing
E2E testing BaseMetadataBuilder
Describe the bug
All my actions are running, but are refusing to cache with the same error:
There must be some sort of race condition, and the cache is stuck, because they all have the same precache log:
I imagine that if I changed my yarn lock file, which the cache is based on, this issue would resolve itself.
Expected behavior
That the cache always works, and that when it gets stuck like this, it automatically fixes itself.
Additional context