
Enable always writing cache to support hermetic build systems #109

Open
wchargin opened this issue Nov 21, 2019 · 36 comments

Labels: enhancement (New feature or request), stale

@wchargin

I’d like to use actions/cache to cache my Bazel build state, which
includes dependencies that have been fetched, binaries and generated
code that have been built, and results for tests that have run. Bazel is
a hermetic build system, so the standard Bazel pattern is to always use
a single cache. Bazel will take care of invalidation at a fine-grained
level: if you only change one source file, it will only re-build and
re-test targets that depend on that source file.

Thus, the pattern that makes sense to me for Bazel projects is to always
fetch the cache and always store the cache. We can always fetch the
cache by using a constant cache key, but then the cache will never be
stored. Bazel doesn’t have a single package-lock.json-style file that
can be used as a cache key; it’s the combination of all build and source
files in the whole repository. We could use the Git tree (or commit)
hash as a cache key, but this would lead to storing a mountain of
caches, too, which seems wasteful.

Ideally, the fetched cache would be taken from origin/master, but
really taking it from any recent commit should be fine, even if that
commit was in a broken or failing state.

On my repository, it takes 33 seconds to save the Bazel cache after a
successful job, but on a clean cache it takes 2 minutes to fetch remote
dependencies and 26 minutes to build all targets. I would be more than
happy to pay those 33 seconds every time if it would save half an hour
in the rest of the build!

For comparison, on Travis we achieve this by simply pointing to the
Bazel cache directory:
https://github.com/tensorflow/tensorboard/blob/1d1bd9a237fe23a3f2c31282ab44e7dfbcac717c/.travis.yml#L30-L32
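
For context, the referenced Travis configuration amounts to something like the following sketch (the cache path is an assumption about Bazel's default output root; Travis simply re-uploads any listed directory after each build):

```yaml
# Sketch of the referenced .travis.yml lines (path is an assumption):
cache:
  directories:
    - $HOME/.cache/bazel
```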

@chrispat
Member

@wchargin this is an interesting topic; thanks for bringing it up.

In this example, if we had a way to skip storing the cache unless the run was on master, you could use the git commit as part of your key and get the desired behavior without writing a new cache for each run of a pull request.

Do you think that would work for you?
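
The suggested policy can be sketched with today's split sub-actions (actions/cache/restore and actions/cache/save, which did not exist when this comment was written); the path and key prefix below are placeholders:

```yaml
# Restore from any recent commit's cache, but only write a new
# cache on pushes to master (path and key prefix are placeholders).
- uses: actions/cache/restore@v4
  with:
    path: ~/.cache/bazel
    key: bazel-${{ github.sha }}
    restore-keys: bazel-
- run: bazel test //...
- if: github.ref == 'refs/heads/master'
  uses: actions/cache/save@v4
  with:
    path: ~/.cache/bazel
    key: bazel-${{ github.sha }}
```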

wchargin added a commit to wchargin/tensorboard that referenced this issue Nov 21, 2019
Summary:
GitHub Actions is a new first-party CI service offered by GitHub. It
requires no extra permissions. Its concurrency limits are appealing,
at 20 workflows per repo (1 workflow ≈ 1 commit) and concurrent jobs
ranging from 20 (free tier) to 180 (enterprise tier), with the option to
run on your own servers if this isn’t enough.

This commit adds a workflow definition for our CI. It’s similar to our
existing Travis workflow, except that it only runs on Python 3.6 for now
due to a bug in the Python 2.7 runtime that has been fixed on GitHub’s
end but not yet deployed (see note inline). I also added a run of our
self-diagnosis script for good measure. (The diagnosis script always
exits successfully, and runs in about 4 seconds.)

The high job concurrency limits let us save some time by running the
lint steps in parallel and just once rather than sequentially and in
every cell of the build matrix. The GitHub Actions VMs appear to have
very little overhead: the entire elapsed time for the `lint-yaml` job is
12 seconds, of which 6 seconds is checking out the repo. Empirically,
there is very little latency (order of seconds) between pushing a commit
and seeing real work being done on the VMs.

GitHub Actions offers caching. From what I can glean, each cache
directory (e.g., “the Bazel cache” or “the Node cache”) is tarred and
gzipped; each such archive must not exceed 400 MB. This is enough space
to cache our `node_modules` and our Bazel state.\* But I haven’t done
so, pending (a) clarity on the recommended way to cache Node modules and
(b) better support for Bazel-style unicaches (see notes inline). Even
without any caching, the total workflow time is still about the same as
the best-case Travis build time because of the improved concurrency.

\* Sometimes: in my tests, sometimes Bazel could be cached successfully,
and other times it was well over the limit (595 MB out of 400 MB). I’m
not quite sure what that’s about.

[nm]: actions/cache#67
[bzl]: actions/cache#109

Test Plan:
Note that this commit triggers a GitHub Actions workflow that succeeds.

wchargin-branch: gh-actions
@wchargin
Author

wchargin commented Nov 22, 2019

@chrispat: Yeah, that sounds reasonable! At a glance, I don’t see a way
to save the cache only if it’s running on master… but perhaps I could
hack something together that restores the cache to its initial state at
the end of the job for builds that aren’t running on master—just as
a proof of concept to see how this strategy works.

If I understand correctly, we’d still be proliferating caches with each
commit to master, right? I understand that cache eviction kicks in, but
it still seems unfortunate, especially if I have to worry about other
caches (e.g., node_modules) being evicted prematurely.

@hvr

hvr commented Nov 22, 2019

For the record, cabal's Nix-style store/cache also falls into this category; see my comment at #38 (comment)

@chrispat
Member

@wchargin given that the version of the sources is part of the Bazel caching algorithm, what key do you think should be used to prevent a huge number of updates? My assumption is that Travis is uploading a new cache essentially every build if they are just looking at changes to the cache directory.

@wchargin
Author

Yes, Travis uploads new caches every build. And you’re right that this
is a performance problem: Travis re-uploads the entire cache directory
from scratch every build, which can take minutes. (Also, the build
doesn’t report success until this upload has completed, and this
upload can cause an otherwise successful build to time out and fail,
which is super frustrating…)

We do want to update the cache on every build, but it should be cheap to
perform a partial update of only the files that changed, rsync-style.
The action cache will be updated on basically every commit, but is small
(~500K). The fetch caches will be updated very rarely, and can be large
(hundreds of MB). And the build cache for any given target will be
updated whenever that target changes, but not if only unrelated targets
change, and can be of varying sizes (typically fairly small, but there
are lots of them).

I see that actions/cache currently tars and gzips everything into a
single bundle, but it would be much more effective for caches in the
style of Bazel/Nix/Cabal to support incremental updates, perhaps by
using a content-addressable store like that of Git itself. What do
you think?

@chrispat
Member

For something like Bazel, I wonder if having a truly remote cache is actually a better option: https://github.com/buchgr/bazel-remote. This is not something we are going to get around to implementing anytime soon, but it is something we can consider for the future.

The model we have for caching enables the user to control the key and also requires that all caches are immutable by key. While that is not ideal for all scenarios, it generally works well for a large number of different technology stacks. This immutable nature makes incremental updates untenable and likely not possible. Even if we could incrementally update the cache, the download on the next run would still have to be the entire cache, since we have to provision a fresh VM for each job.

@mborgerson

mborgerson commented Nov 26, 2019

I believe I have a similar use case to the issue described here, and ideally would like to see an update-cache option added to the action, but I've worked around the issue by leveraging the restore-keys option.

A project of mine consists largely of C files, and naturally a significant portion of my CI cycle time is spent in compilation. To speed things up, I've employed ccache, which will opportunistically recycle previously built object files when it detects that the compilation would be the same for the current build. This has a dramatic performance improvement on CI times. In order to do this though, I need some persistence of storage between workflow runs in order to save and restore ccache's cache directory. Of course, as the code base evolves, the cache of object files will change too.

I was pleased to discover actions/cache, as it fits my use case very nicely; but I was surprised to find that when a cache hit occurs, actions/cache will not attempt to update the cache at all, and there's no option to request such an update.

To work around this, I do the following:

    - name: Initialize Compiler Cache
      id: cache
      uses: actions/cache@v1
      with:
        path: /tmp/xqemu-ccache
        key: cache-${{ runner.os }}-${{ matrix.configuration }}-${{ github.sha }}
        restore-keys: cache-${{ runner.os }}-${{ matrix.configuration }}-

It works like this: when the cache is loaded for a workflow, there will be an initial cache miss, because the cache key contains the current commit SHA. actions/cache will fall back to the most recently added cache via the restore-keys prefix-matching policy, and then, after the build has completed, create a new cache entry to satisfy the initial cache miss.

This solution seems to work very well for me, and hopefully it will be useful to others with a similar use case. Ideally, though, I think actions/cache should just support updating the cache, perhaps to a new immutable revision, as I have done above.

@wchargin
Author

Having the caches be immutable makes a lot of sense. Immutable caches
seem perfectly compatible with incremental updates—in fact, this is a
strong point of Git. If your repository has 100 top-level directories
each with 100 files, then you have 101 trees and 10000 blobs; if you
change just one of those files, then you have 103 trees and 10001 blobs,
not 202 trees and 20000 blobs. Does this make sense, or am I missing
something?
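
The object arithmetic above can be sketched with a toy Git-style content-addressed store (illustrative only; this is not how actions/cache stores anything today):

```python
import hashlib

def blob_id(data: bytes) -> str:
    """Content-address a file's bytes, git-blob style."""
    return hashlib.sha1(b"blob:" + data).hexdigest()

def tree_id(entries: dict) -> str:
    """Content-address a directory listing of {name: object id}."""
    listing = "".join(f"{name} {oid}\n" for name, oid in sorted(entries.items()))
    return hashlib.sha1(b"tree:" + listing.encode()).hexdigest()

def snapshot(repo: dict, store: set) -> str:
    """Add a {dir: {file: bytes}} snapshot to the store. Deduplication is
    free: identical content hashes to the same object id."""
    top = {}
    for d, files in repo.items():
        entries = {name: blob_id(data) for name, data in files.items()}
        store.update(entries.values())   # one blob per distinct file
        top[d] = tree_id(entries)
        store.add(top[d])                # one subtree per directory
    root = tree_id(top)
    store.add(root)                      # the root tree
    return root

# 100 top-level directories, each with 100 distinct files:
repo = {f"d{i}": {f"f{j}": f"{i}/{j}".encode() for j in range(100)}
        for i in range(100)}
store = set()
snapshot(repo, store)
before = len(store)                      # 10000 blobs + 100 subtrees + 1 root
repo["d0"]["f0"] = b"edited"             # change a single file
snapshot(repo, store)
print(before, len(store) - before)       # 10101 3
```

Changing one file adds exactly three objects (the new blob, its directory's subtree, and the root), matching the 10101 → 10104 counting in the comment above.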

A truly remote cache is an appealing option, but comes with a lot more
operational overhead for the user. Storing files is much easier than
running a server.

Downloading the full latest cache on each run may not be perfect, but
it’s still an improvement over rebuilding all the artifacts, faster by
about 20 minutes in my case.

@mborgerson

mborgerson commented Nov 26, 2019

@wchargin I agree that immutability is acceptable on the condition that we can restore and create a new cache as I have described above (though it can be quite wasteful, as you mentioned). My guess is that this particular use case will be desirable for many projects. Perhaps the documentation could simply be updated to demonstrate this type of use case? To me, it wasn't immediately obvious. My suggestion would be to mention using ${{ github.sha }} in the key.

@wchargin
Author

Right; immutability is space-wasteful if the caches are stored
independently (which will happen if you use ${{ github.sha }}) but not
if they’re stored as part of one content-addressed store (which would
require changes to the actions/cache implementation).

@chrispat
Member

A truly remote cache is an appealing option, but comes with a lot more
operational overhead for the user. Storing files is much easier than
running a server.

I was thinking we would run that server on behalf of the user, so the operational overhead should be essentially the same as it would be for the existing cache action. I am not 100% sure that is the best option, but it seems like it might be a really good one for build systems that support it.

@wchargin
Author

A truly remote cache is an appealing option, but comes with a lot more
operational overhead for the user. Storing files is much easier than
running a server.

I was thinking we would run that server on behalf of the user so the
operational overhead should be essentially the same as it would be for
the existing cache action.

Oh, that would be fantastic! Being able to just point Bazel to a remote
cache provided by a GitHub-managed action would be a huge value-add
for us compared to other CI services.

@dsilva

dsilva commented Dec 28, 2019

Similar use case: ~/.cache/sccache for sccache works like the bazel cache. For now it's probably easier to point sccache at S3 or GCS to avoid the issues described above, and it would be nice if GitHub ran an sccache store as well.
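
For anyone taking the S3 route, pointing sccache at a bucket from a workflow looks roughly like this (SCCACHE_BUCKET and SCCACHE_REGION are documented sccache environment variables; the bucket name and secrets are placeholders):

```yaml
env:
  RUSTC_WRAPPER: sccache               # route rustc through sccache
  SCCACHE_BUCKET: my-ci-cache          # placeholder bucket name
  SCCACHE_REGION: us-east-1
  AWS_ACCESS_KEY_ID: ${{ secrets.AWS_ACCESS_KEY_ID }}
  AWS_SECRET_ACCESS_KEY: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
steps:
  - run: cargo build --release
```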

@jacquesbh

Hi!

I have the same issue I think with composer cache, in PHP.

Composer saves every download it makes in a cache, and most of the time this cache is shared globally.
For example, on a developer machine the cache grows all the time as new releases come out.

I've been using github.sha in my cache keys, since it allows me to re-save the cache and avoid the case where the key hits but new versions of dependencies exist, so they're downloaded on every run because they're not in the cache.

jacquesbh added a commit to monsieurbiz/SyliusAlertMessagePlugin that referenced this issue Mar 24, 2020
@jvolkman

Specifically for Bazel: the cache protocol is pretty simple. I wonder if it would be feasible to write a service that simply proxies to GitHub's own artifact-cache service and stand it up locally? Not sure how many cache keys GitHub allows.

nanddalal added a commit to nanddalal/webapp that referenced this issue May 23, 2022
GitHub Actions caching will never update the cache if there was a cache
hit, but for Bazel we want to do this, since Bazel guarantees hermetic
builds and will update the cache if needed. See
actions/cache#109 for more context. As such,
we adjust the cache-key logic to work better with GitHub Actions, as per
the documentation:
https://docs.github.com/en/actions/using-workflows/caching-dependencies-to-speed-up-workflows#matching-a-cache-key
Now we will always have a cache miss, load the latest restore key, and
then upload to a new cache key. This adds a couple of minutes for saving
the cache, but building from scratch is 12+ min, so it is worth it.
@pramodka-revefi

pramodka-revefi commented Jul 25, 2022

Since this issue is still open, what's the current best practice for Bazel caching + GitHub Actions? Does someone have a snippet of their GitHub workflow they can share?

Update: Just sharing the CI pipeline YAML with caching that we went with; hopefully it'll help the next person who lands on this issue. (A slightly more permissive approach than @nanddalal's above.)

@github-actions

This issue is stale because it has been open for 200 days with no activity. Leave a comment to avoid closing this issue in 5 days.

@github-actions github-actions bot added the stale label Feb 13, 2023
@mihaimaruseac

I still think this is useful to have, and the issue should not be closed.

@yongtang

yongtang commented Feb 13, 2023

I think this will also help reduce the overall cost of compute resources on GitHub actions, as many open source projects can minimize the GitHub actions minutes they use for every run.

@github-actions github-actions bot removed the stale label Feb 14, 2023
@jsoref
Contributor

jsoref commented Mar 27, 2023

So... if you're willing to use an A/B system, you could probably do something like:

- uses: actions/cache/restore@v3
  with:
    key: preferred
    restore-keys: fallback
- run: do-work
- if: no-cache
  uses: actions/cache/save@v3
  with:
    key: fallback
- if: no-cache
  uses: actions/cache/save@v3
  with:
    key: preferred
- if: used-preferred-cache
  uses: ./delete-cache
  with:
    key: fallback
- if: used-preferred-cache
  uses: actions/cache/save@v3
  with:
    key: fallback
- if: used-fallback-cache
  uses: actions/cache/save@v3
  with:
    key: preferred
- if: used-preferred-cache
  uses: ./delete-cache
  with:
    key: preferred
- if: used-preferred-cache
  uses: actions/cache/save@v3
  with:
    key: preferred

Notes:

  • used-fallback-cache, used-preferred-cache, and no-cache aren't technical things, but actions/cache has outputs that you can use to construct the concepts
  • You might be able to condense the various stages, but you definitely want to ensure that at least one of the caches is available, and given how simple the with:'s will be, it might be simpler just to have lots of steps than to try to be incredibly fancy about it.
  • in addition to wrapping delete-cache into an action, you could wrap the entire delete+save pattern into an action

./delete-cache can be implemented using the APIs that were made available around June 27, 2022:
https://github.blog/changelog/2022-06-27-list-and-delete-caches-in-your-actions-workflows/
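
A minimal sketch of such a delete-cache step, using the caches REST endpoint from the linked changelog (the key name is a placeholder; `gh` is preinstalled on the hosted runners):

```yaml
- name: Delete the fallback cache
  env:
    GH_TOKEN: ${{ github.token }}
  run: |
    gh api --method DELETE \
      "repos/${GITHUB_REPOSITORY}/actions/caches?key=fallback"
```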

@github-actions

This issue is stale because it has been open for 200 days with no activity. Leave a comment to avoid closing this issue in 5 days.

@github-actions github-actions bot added the stale label Oct 13, 2023
@Frenzie

Frenzie commented Oct 13, 2023

Bots suck.

@github-actions github-actions bot removed the stale label Oct 14, 2023
gallais added a commit to msp-strath/MSPweb that referenced this issue Oct 17, 2023
gallais added a commit to msp-strath/MSPweb that referenced this issue Oct 17, 2023
@IanButterworth

I think this can be closed as it's now released in v4

@jsoref
Contributor

jsoref commented Jan 18, 2024

I don't see how v4 changes anything. Either it was already possible (and I think my suggestions and others show that there are ways to do something) or it might still not be possible.

If it's now possible as of v4, it'd be nice if someone put together an actual example of how to do it.

@IanButterworth

My bad. v4 has a save-always option. But this would be more like a save-overwrite option?

@jsoref
Contributor

jsoref commented Jan 18, 2024

I mean, I'd probably just use an epoch time value with a fallback of none:

        key: cache-${{ steps.time.outputs.epoch }}
        restore-keys: cache-

That'd result in it always writing one. Older caches will get wiped out as they become least recently used. Sure, you pay a bit to store a duplicate of the cache (or you could use actions/cache/restore and actions/cache/save and only conditionally call actions/cache/save if you made any changes...), but, so what?
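
For completeness, the `time` step referenced in the snippet could be produced like this (a sketch; the step id, cache path, and key prefix are arbitrary):

```yaml
- id: time
  run: echo "epoch=$(date +%s)" >> "$GITHUB_OUTPUT"
- uses: actions/cache@v4
  with:
    path: ~/.cache/bazel        # assumed cache path
    key: cache-${{ steps.time.outputs.epoch }}
    restore-keys: cache-
```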

@ephemient

but, so what?

That excess space usage causes other caches to get dropped too.

@jsoref
Contributor

jsoref commented Jan 19, 2024

Then use restore & save separately and use an if: to only run save when you have changes.

If you're being really aggressive, you might be able to portion the cache into lots of pieces and have steps to calculate and retrieve/save them.

There will be a trade-off between how many steps you need to run and how big your cache pieces are.

gallais added a commit to msp-strath/MSPweb that referenced this issue Feb 5, 2024
@github-actions

github-actions bot commented Aug 6, 2024

This issue is stale because it has been open for 200 days with no activity. Leave a comment to avoid closing this issue in 5 days.

@github-actions github-actions bot added the stale label Aug 6, 2024
@Frenzie

Frenzie commented Aug 6, 2024

@mihaimaruseac

This is still an issue worth resolving

@TheGiraffe3

Why wasn't the stale label removed?
