Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Enable ccache for CI builds #277

Merged
merged 1 commit into from Jan 2, 2023
Merged

Enable ccache for CI builds #277

merged 1 commit into from Jan 2, 2023

Conversation

TerrorJack
Copy link
Contributor

This PR enables ccache for LLVM in CI, and shall speed up building processes for subsequent PRs in this repo.

Copy link
Member

@sbc100 sbc100 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great! Did you measure the speedup? Presumably subsequent builds with the same llvm version should have a very high hit rate?

Makefile Outdated
@@ -62,6 +62,7 @@ build/llvm.BUILT:
-DLLVM_STATIC_LINK_CXX_STDLIB=ON \
-DLLVM_HAVE_LIBXAR=OFF \
-DCMAKE_OSX_ARCHITECTURES="arm64;x86_64" \
$(LLVM_CMAKE_FLAGS) \
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe put this add the of the command line (line 77)?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you mean the bottom of the command line? Sure, will do after the current pipeline finishes.

.github/workflows/main.yml Show resolved Hide resolved
- uses: actions/cache@v3
with:
path: ~/AppData/Local/ccache
key: 0-${{ format( 'cache-windows-latest-{0}', matrix.arch) }}-${{ github.run_id }}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is run_id? Wouldn't we want the cache to be shared between different runs of the workflow?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi, see https://github.com/actions/cache/blob/main/tips-and-workarounds.md#update-a-cache for details. Adding a run_id suffix ensures that when we bump LLVM revision in the future, the cache can get updated accordingly; and the fallback key here ensures the workflow can receive cache from previous workflows.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you explain a little more.. how does this ensure that "when we bump LLVM revision in the future, the cache can get updated accordingly"? Where is the llvm revision encoded here?

From that link the "Please note that this will create a new cache on every run and hence will consume the cache quota" is a little worrying.. do we really need to store a separate copy of the ccache data for each run?

Sorry, this is my first time the cache action in github so I'm probably misunderstanding.

This change lgtm assuming it actually speeds up builds..

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah never mind! I'll try to explain it as a bullet-point list, and do let me know if further clarification is needed:

  • Each GitHub cache is an immutable blob indexed by a key
  • The actions/cache step will attempt to fetch the blob via that key, if not present, the cache is created and uploaded (first run)
  • If key doesn't have a unique suffix, subsequent runs have cache hit, but doesn't update the cache, because it's immutable by design
  • So in the future when we bump LLVM, we end up with stale cache in each run
  • So we append run_id to the cache key. Now, we ensure the cache key is always unique, and the cache is always created and uploaded, so no stale cache problem
  • Additionally, we have restore-keys which is just the key without the unique suffix. actions/cache will 100% miss the unique key, but then it'll look up restore-keys and it'll see some previously created caches with that prefix, and the most recent one is restored
  • The 0- prefix on all cache keys is a precautionary prefix; in the future if the cache become corrupted and break builds, and we don't want to eliminate caching logic at all, simply bump this epoch number will allow ignoring all previous caches altogether
  • Once the cache quota begins to be reached, the older ones get garbage collected
  • No need to encode LLVM revision anywhere

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see. That makes sense. Might be worth a comment regarding how the epoch thing works. Something like "bump this to to avoid using cached object files from previous builds, for example, if we update host version of clang"

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will add a comment. Though ccache is smart enough to notice host compiler updates, so epoch bumps is very unlikely going to be useful anyway.

- uses: actions/cache@v3
with:
path: ~/AppData/Local/ccache
key: 0-${{ format( 'cache-windows-latest-{0}', matrix.arch) }}-${{ github.run_id }}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you explain a little more.. how does this ensure that "when we bump LLVM revision in the future, the cache can get updated accordingly"? Where is the llvm revision encoded here?

From that link the "Please note that this will create a new cache on every run and hence will consume the cache quota" is a little worrying.. do we really need to store a separate copy of the ccache data for each run?

Sorry, this is my first time the cache action in github so I'm probably misunderstanding.

This change lgtm assuming it actually speeds up builds..

@TerrorJack TerrorJack force-pushed the ccache branch 2 times, most recently from f78fe9b to 09703bc Compare January 2, 2023 15:42
@TerrorJack TerrorJack marked this pull request as ready for review January 2, 2023 19:18
@TerrorJack
Copy link
Contributor Author

@sbc100 Ready for merge, all jobs are green and confirmed to be cached now. The macos and docker jobs just uploaded their first cache so there's no comparison; meanwhile, for other cached jobs, the speedup is very promising (comparing with latest master pipeline):

Job Before After
ubuntu-latest 1h 46m 10s 11m 11s
windows x64 2h 28m 43s 25m 4s
windows x86 2h 4m 19s 24m 35s

@sbc100 sbc100 merged commit 388a7ca into WebAssembly:main Jan 2, 2023
@TerrorJack TerrorJack deleted the ccache branch January 13, 2023 16:59
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants