Enable read-only configuration cache #29467

Open
mikejuyoon opened this issue Jun 7, 2024 · 3 comments
Labels
a:feature A new functionality in:configuration-cache Configuration Caching

Comments

@mikejuyoon

Expected Behavior

There should be a setting that enables loading the configuration cache but skips the store step at the end of the build, which adds potentially unnecessary time to the build.

Current Behavior (optional)

When the configuration cache is enabled, both the load and store steps are enabled, with no way to opt out of either step.

Context

We want to enable the configuration cache in CI, where the majority of builds will consume a configuration cache that was produced by another CI job. Most builds won't need to store the cache at the end of the build, as it won't be reused. However, the time to store this cache can make the performance gain from CC negligible, or even negative, especially when there's a cache miss.
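A rough way to reason about this trade-off is an expected-time model. The sketch below is illustrative only: the function name, its parameters, and the numbers are assumptions rather than measurements from any real build, and it takes the view described above that the store step is pure extra cost on a cache miss.

```python
def expected_overhead(hit_rate, configure_s, load_s, store_s, read_only=False):
    """Expected seconds spent before task execution starts, per build.

    Hypothetical model: on a cache hit we pay only load_s; on a miss we pay
    configure_s, plus store_s unless the cache is treated as read-only.
    """
    if not 0.0 <= hit_rate <= 1.0:
        raise ValueError("hit_rate must be between 0 and 1")
    miss_cost = configure_s + (0.0 if read_only else store_s)
    return hit_rate * load_s + (1.0 - hit_rate) * miss_cost


# Made-up numbers: 60 s configuration, 10 s load, 60 s store, 50% hit rate.
no_cache = 60.0
read_write = expected_overhead(0.5, 60.0, 10.0, 60.0)
read_only = expected_overhead(0.5, 60.0, 10.0, 60.0, read_only=True)
```

With these invented inputs the read-write variant costs more on average than running without the cache, while the read-only variant costs less. Real builds would need measured values for each parameter.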

@mikejuyoon added the a:feature (A new functionality) and to-triage labels on Jun 7, 2024
@mikejuyoon changed the title from "Add setting to allow read-only configuration cache" to "Enable read-only configuration cache" on Jun 7, 2024
@bamboo added the in:configuration-cache (Configuration Caching) label on Jun 7, 2024
@mlopatkin
Member

Can you tell us more about what you perceive as a source of the overhead? We know about sequential dependency resolution, and will address this soon. Is the I/O in general a problem? Do you expect the machine to have enough memory to temporarily hold the cached state if needed?

The store operation is an essential step to prepare for the execution phase, to allow parallel task execution, for example. It doesn't really happen at the end of the build. Our end goal is to get rid of non-configuration-cached execution, so falling back to it won't be a long-term solution.

An alternative solution could be to fail the build if the cached state cannot be reused to indicate that cache-priming build has to be re-run. Does this behavior fit your use case?

@joshfriend

> Can you tell us more about what you perceive as a source of the overhead?

At one point, we had enabled the configuration cache in our CI as a way to validate that our build was compatible with CC when updating Gradle, but we found that this came at a ~10% performance penalty. For larger builds we would sometimes observe storing the configuration cache take more than a minute.

We are rolling out configuration caching to CI builds where we produce the cache on the main branch, and PR builds restore the cache from the nearest commit ancestor on main that has a cache available. In some cases, a developer has made a CC-invalidating change and the cache we restore is not reusable. In these cases we would like to continue as if CC were disabled and not incur the cost of storing the new configuration to the cache.
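The "nearest ancestor with a cache" lookup described here could look roughly like the sketch below. It is a guess at the shape of such a step, not our actual CI code: the function name and the idea of passing in a precomputed set of cached commits are assumptions made for illustration.

```python
from typing import Iterable, Optional, Set


def nearest_cached_ancestor(ancestors: Iterable[str], cached: Set[str]) -> Optional[str]:
    """Return the first commit in `ancestors` that has a cache entry.

    `ancestors` must be ordered nearest first; `cached` holds the commit
    hashes for which a configuration cache was uploaded (hypothetical input).
    Returns None when no ancestor has a cache.
    """
    for commit in ancestors:
        if commit in cached:
            return commit
    return None


# Placeholder hashes: c3 is the nearest ancestor but has no cache entry.
chosen = nearest_cached_ancestor(["c3", "c2", "c1"], {"c1", "c2"})
```

The ancestor list could come from something like the output of `git rev-list` on the main branch. When the function returns None, or when the restored entry turns out not to be reusable, the build falls into the miss case discussed above.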

> Do you expect the machine to have enough memory to temporarily hold the cached state if needed?

Generally yes. We had one or two CI jobs in the build where the configuration cache had to remain disabled because it caused OOMs, but we have been able to hold the state in memory for everything else.

> The store operation is an essential step to prepare for the execution phase, to allow parallel task execution, for example

We are able to run task execution in parallel with CC turned off, so I don't understand why this is a requirement when CC is enabled. I think I am missing some bit of knowledge here that would help this requirement make sense.

> An alternative solution could be to fail the build if the cached state cannot be reused to indicate that cache-priming build has to be re-run. Does this behavior fit your use case?

Potentially. We would have to check whether running the initialization twice with different settings would be faster than writing the configuration cache and discarding it. That doesn't seem great in general from a usability standpoint, though.

@bamboo
Member

bamboo commented Jun 17, 2024

Thanks, @joshfriend, we'll get back to this issue soon. I just wanted to clarify one point:

> We are able to run task execution in parallel with CC turned off, I don't understand why this is a requirement when CC is enabled.

By isolating tasks, CC can enable intra-project parallelism while --parallel only enables inter-project parallelism.
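To make the distinction concrete, here is a deliberately simplified model, not a description of Gradle's scheduler. The task representation and the function name are hypothetical, and the model ignores task dependencies and worker limits: it only counts how many tasks could be eligible to run at the same moment.

```python
from typing import List, Tuple

# (project path, task name): a hypothetical representation for illustration.
Task = Tuple[str, str]


def max_concurrent(tasks: List[Task], intra_project: bool) -> int:
    """Upper bound on tasks eligible to run at once, assuming no dependencies.

    With inter-project parallelism only, at most one task per project runs
    at a time. With intra-project parallelism, tasks in the same project
    may also run side by side.
    """
    if intra_project:
        return len(tasks)
    return len({project for project, _ in tasks})


tasks = [(":app", "compileKotlin"), (":app", "lint"), (":lib", "compileKotlin")]
inter_only = max_concurrent(tasks, intra_project=False)
with_intra = max_concurrent(tasks, intra_project=True)
```

In this toy example the two `:app` tasks have to take turns when only inter-project parallelism is available, but could overlap once tasks are isolated from each other.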


4 participants