Feature request: option to update cache #342

Open
fsimonis opened this issue Jun 2, 2020 · 27 comments · May be fixed by #498
Labels: area:granular-control (issues related to granular control in saving or restoring cache), feature request

Comments

fsimonis commented Jun 2, 2020

Problem Description
Currently, the cache action either restores an existing cache on cache-hit, or generates a missing cache on cache-miss. It does not update the cache on cache-hit.

This works well for caching static dependencies, but not for caching build artefacts.

Proposed Solution
Add an option allowing the user to enable cache updates.
This should be false by default to retain backwards-compatibility.

uses: actions/cache@v2
with:
  path: ccache
  key: ${{ matrix.CONFIG }}-${{ matrix.CXX }}-${{ matrix.TYPE }}
  update: true   # <~~ explicitly request an update

Motivation
Some programming languages benefit greatly from build caching. C++ in conjunction with ccache is the prime example. Using caching commonly decreases compilation times by at least 70%. Medium-sized projects easily take 20 minutes to compile.
ccache also manages the cache size itself and automatically removes obsolete entries, thus the cache won't explode with continuous updates.

It also saves time and money for both user and provider. The environment will be happy too.

eine commented Jun 4, 2020

I'm having this issue when trying to use @actions/cache to reduce the update time of the built-in MSYS2 installation on windows-latest virtual environments. The virtual environments become outdated quite fast, and currently 5503.95 MiB need to be downloaded and installed on each job, which takes 8-10 min.

As commented in msys2/setup-msys2#23, I'm trying to save /var/cache/pacman/pkg/. However, actions/cache@v2 only allows saving it once. Later executions skip the save: https://github.com/eine/setup-msys2/runs/737062486?check_suite_focus=true#step:12:2

Post job cleanup.
Cache hit occurred on the primary key msys, not saving cache.

Then, I tried using the npm package: https://www.npmjs.com/package/@actions/cache. Unfortunately, it fails: https://github.com/eine/setup-msys2/runs/738298072?check_suite_focus=true#step:4:116

##[error]reserveCache failed: Cache already exists. Scope: refs/heads/tool-cache, Key: msys2-nokey, Version: 000d31344dacf74d63d9e122f85409f68c5697c2aa32c5626452e8301c5d0c66

As an alternative to updating an existing key, it would be feasible to remove it explicitly (ref #340).

@davidsbond

This would also be super handy for persisting the test cache when using GitHub Actions in Go projects. For example, I could download the latest test cache, run my tests, and then update the existing cache with the results of the tests in the current run.

This way, I can have a global test cache across all my workflow runs.

zen0wu commented Jun 21, 2020

This also applies to things like the webpack loader cache: such caches are small, have their own key management, and need to be updated on each run.

Mordil commented Aug 7, 2020

This would be super valuable for monorepos where each subproject has its own dependencies that it wants to cache and build, so that upstream projects have faster build times.

I ran into the expectation that this was already the default behavior - so I wrote #392 under that assumption

potaito commented Aug 13, 2020

Is there any workaround for forcing the update of the cache even on a hit? I can't think of any way...
In my case, it would reduce the compilation time from 20 minutes to 3 minutes if I could use ccache for Qt/C++.

Vampire commented Aug 13, 2020

Maybe save as ccache-${{ github.run_id }} and restore with the restore key ccache-.
github.run_id is a unique ID for the workflow run, so a new cache is saved every time.
When restoring, you will never have an exact match, but the ccache- restore key will restore the latest cache whose key starts with that string, and at the end of the job a new cache is created with the current state.
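
The suggestion above might look roughly like the following in a workflow. This is only a sketch: the `ccache` path is taken from the original request, while the build step and its `make` command are hypothetical placeholders.

```yaml
# Sketch of the run_id workaround: the primary key never matches an
# existing entry, so the post-job step always saves a fresh cache.
steps:
  - uses: actions/cache@v2
    with:
      path: ccache
      # Unique per workflow run, so there is never an exact hit.
      key: ccache-${{ github.run_id }}
      # Prefix match: falls back to the most recently created cache
      # whose key starts with "ccache-".
      restore-keys: |
        ccache-
  - name: Build (hypothetical placeholder)
    run: make
```

Because every run stores a new entry, older entries are only removed by the cache's normal eviction.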

potaito commented Aug 13, 2020

@Vampire that's brilliant, thanks mate! You are right, I forgot about the prefix matching that the cache lookup does. Perhaps this is then a non-issue and your solution is the intended way of doing things?

Vampire commented Aug 13, 2020

Nah, that's merely a workaround.
It will fill up your 5 GiB of cache and then evict things that might not have been evicted if the cache were updatable.

@PathogenDavid

@HebaruSan It won't save anything at all.

https://github.com/actions/toolkit/blob/73d5917a6b5ea646ac3173cfceb727ee914ff6ed/packages/cache/src/cache.ts#L166-L175

@HebaruSan

Duplicate of #171

autarch added a commit to houseabsolute/local-covid-tracker that referenced this issue Dec 25, 2020
If a given cache key already exists, the cache action will not update the
cache, even if the cached files have changed. See
actions/cache#342 for discussion.

The workaround is to change the cache key every day but to have a restore key
which is not date-specific. See
actions/cache#342 (comment).
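
A sketch of what such a date-based key could look like. The step id `date`, the output name `today`, the `build-` key prefix, and the cached path are all assumptions for illustration, not taken from the referenced commit.

```yaml
# Sketch: the key changes once per day, so the cache is re-saved daily,
# while the date-free restore key still finds the latest older entry.
steps:
  - name: Get current date
    id: date
    run: echo "today=$(date -u +%Y-%m-%d)" >> "$GITHUB_OUTPUT"
  - uses: actions/cache@v2
    with:
      path: ccache   # hypothetical path
      key: build-${{ steps.date.outputs.today }}
      restore-keys: |
        build-
```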
eyal0 added a commit to eyal0/cache that referenced this issue Jan 4, 2021
@eyal0 eyal0 linked a pull request Jan 4, 2021 that will close this issue
yureiita added a commit to yureiita/Actions-OpenWrt that referenced this issue Oct 2, 2023

yureiita added a commit to yureiita/Actions-OpenWrt that referenced this issue Oct 5, 2023

GuyAv46 commented Nov 13, 2023

For anyone performing the gh workaround who doesn't want to check out the repository in the job running it: you just need to provide a GitHub token and the repository name:

cache-sha:
  # Caches the SHA of the last successful build
  runs-on: ubuntu-latest
  steps:
    - name: Clear cache
      continue-on-error: true # Don't fail if the cache doesn't exist
      env:
        GH_TOKEN: ${{ github.token }} # required by gh
      run: |
        gh extension install actions/gh-actions-cache
        gh actions-cache delete ${{ env.CACHE_NAME }} --confirm -R ${{ github.repository }}

grosser commented May 19, 2024

FYI: when using the cache deletion trick and getting "Error: Resource not accessible by integration", you have to enable "Read and write permissions" in the repo settings under Actions -> General -> Workflow permissions.
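
As a possible alternative to changing the repository-wide setting, the token's permissions can be raised for a single job with the `permissions` key. The sketch below assumes that `actions: write` is the scope that cache deletion needs, and that `CACHE_NAME` is defined elsewhere in the workflow, as in the snippet above; the job name is hypothetical.

```yaml
jobs:
  clear-cache:   # hypothetical job name
    runs-on: ubuntu-latest
    permissions:
      actions: write   # assumption: scope required to delete cache entries
    steps:
      - name: Clear cache
        continue-on-error: true   # don't fail if the cache doesn't exist
        env:
          GH_TOKEN: ${{ github.token }}   # required by gh
        run: |
          gh extension install actions/gh-actions-cache
          gh actions-cache delete ${{ env.CACHE_NAME }} --confirm -R ${{ github.repository }}
```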

lcswillems commented Jun 22, 2024

Hey @bethanyj28, what's the reason such a feature has not been shipped yet? It is built into GitLab, etc. It is the second most requested feature. And it seems doable (there are workarounds, but they are cumbersome).

captn3m0 added a commit to blr-today/ingest that referenced this issue Jun 28, 2024
Github does not override our cache, so we instead
provide a restore-key prefix, which lets us
pick the latest of the caches that were stored

Ref: actions/cache#342 (comment)
slaff pushed a commit to SmingHub/Sming that referenced this issue Jul 17, 2024

Build times are increasing faster than I expected, so it's clear that to be effective the ccache needs to be kept up to date.
This needs to happen automatically so we can forget about it.

GitHub's caching behaviour is to fail the save operation if a hit occurred on the primary key.
In other words, a cache entry will never be overwritten (updated).
There is no provision for a cache update as such: actions/cache#342

The workaround is fairly easy though: break into discrete restore/delete/save steps.

Only the cache for the current branch is deleted, using the `--branch` option to `gh actions-cache delete`.
This prevents pull requests from contributors with repo write access from deleting the main `develop` cache.

This table shows the intended progression of cache updates, starting from an empty cache:

| Event   | Branch       | Restore      | Delete  | Save         |
| ------- | ------------ | --------     | ------- | ------------ |
| PR      | feature/new  | -            | -       | feature/new  |
| PR      | feature/new  | feature/new  | -       | -            |
| Merge   | develop      | -            | -       | develop      |
| PR      | fix/lwip     | develop      | -       | -            |
| Merge   | develop      | develop      | develop | develop      |

So always save if (event.branch == develop) or no cache hit
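
The discrete restore/delete/save steps described above could be sketched as follows. This is not the referenced commit's actual workflow: the key name, the `ccache` path, the build command, and the use of the `actions/cache/restore` and `actions/cache/save` sub-actions are assumptions, and the delete step presumes the token is allowed to delete caches.

```yaml
steps:
  - name: Restore ccache
    id: ccache
    uses: actions/cache/restore@v4
    with:
      path: ccache
      key: ccache-${{ runner.os }}   # hypothetical key; caches are scoped per branch
  - name: Build (hypothetical placeholder)
    run: make
  - name: Delete stale cache on develop
    # Only the develop entry is ever replaced, and only if one existed.
    if: github.ref == 'refs/heads/develop' && steps.ccache.outputs.cache-hit == 'true'
    continue-on-error: true
    env:
      GH_TOKEN: ${{ github.token }}   # required by gh
    run: |
      gh extension install actions/gh-actions-cache
      gh actions-cache delete "ccache-${{ runner.os }}" --branch "${{ github.ref }}" --confirm -R "${{ github.repository }}"
  - name: Save ccache
    # Always save on develop; elsewhere save only when nothing was restored.
    if: github.ref == 'refs/heads/develop' || steps.ccache.outputs.cache-hit != 'true'
    uses: actions/cache/save@v4
    with:
      path: ccache
      key: ccache-${{ runner.os }}
```

The `if` condition on the save step mirrors the rule stated above: save if the branch is develop or there was no cache hit.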
CDAGaming added a commit to CDAGaming/UniLib-Mirror that referenced this issue Aug 20, 2024

- Implement workarounds from actions/cache#342 that involve overwriting GitHub Actions cache data
  - This way, we still have one cache, but the cache data of all the versions can now be appended to it
- Added a concurrency group, serializing the main workflow rather than running parallel jobs; this will lead to faster times long-term thanks to a working cache
- Fixed a potential ordering issue by stopping the Gradle Daemon before uploading artifacts
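
A concurrency group of the kind mentioned in this commit message might be declared as in the sketch below. The group name is an assumption; the referenced commit may use a different one.

```yaml
# Workflow-level setting: runs sharing the same group execute one at a
# time, so a later run can see the cache written by the earlier one.
concurrency:
  group: ${{ github.workflow }}-${{ github.ref }}   # hypothetical group name
  cancel-in-progress: false   # queue runs instead of cancelling them
```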
romani commented Sep 30, 2024

A mutable cache that exists only during a single run of a workflow would be an awesome option.
It should not be the default behavior, but the possibility would be convenient: it would let us check out sources once and keep them from job to job (within a single workflow), so that each job can mutate/update the file system as required and the following job can reuse it.

Our workaround was to create one more cache file.

@assignUser

> from job to job (of single workflow) to let each job mutate/update file system as required and help following job to reuse it.

@romani you can use artifacts for that too, just overwrite the existing one and cache it in the last job to make it permanent across workflows (if desired).
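
A rough sketch of the artifact approach, assuming the `overwrite` input of `actions/upload-artifact@v4`. The job names, the artifact name `workspace`, the `work/` directory, and the shell commands are hypothetical.

```yaml
jobs:
  prepare:
    runs-on: ubuntu-latest
    steps:
      - name: Produce files (hypothetical placeholder)
        run: mkdir -p work && echo "initial" > work/state.txt
      - uses: actions/upload-artifact@v4
        with:
          name: workspace
          path: work/
  build:
    needs: prepare
    runs-on: ubuntu-latest
    steps:
      - uses: actions/download-artifact@v4
        with:
          name: workspace
          path: work
      - name: Mutate files (hypothetical placeholder)
        run: echo "updated" >> work/state.txt
      - uses: actions/upload-artifact@v4
        with:
          name: workspace
          path: work/
          overwrite: true   # replace the artifact uploaded by the previous job
```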

romani commented Sep 30, 2024

https://docs.github.com/en/actions/writing-workflows/choosing-what-your-workflow-does/storing-and-sharing-data-from-a-workflow#comparing-artifacts-and-dependency-caching

> Use artifacts when you want to save files produced by a job to view after a workflow run has ended, such as built binaries or build logs.

The nuance is that in our case we do not want to share these files after the workflow has ended.
We can hack everything to do what we need, but we try not to pollute anything outside of the workflow.
