Provide a mechanism to have a per file cache eviction/retention #788
Comments
@devminded can you please help me understand the use case better?
If I am not wrong, the dependencies get updated on every build, which means the old files go away, so the cache will also only contain the latest files. Is this more of a problem with runtimes and tooling, where multiple versions may exist side by side? If so, the impact may be much smaller, since such version changes would not happen very often. Am I reading this wrong? |
Not sure if I have misunderstood something in how the cache-mechanism works. As far as I understand, a cache-hit is simply that we found a cache with a matching key, that is then restored. How we calculate this key each build will affect if we restore the cache or not. The issue is that for example maven/gradle saves all the dependencies, toolchains, wrappers, etc. in a directory. Gradle for example has a default 30 day eviction from some of these directories, but (AFAIK) it's based on "last accessed time" which seems to break when using GitHub caches, so after a while every new cache-file becomes larger. Some things can be managed by being picky how we generate the cache key (like hashing the gradle-wrapper file), but that has two issues:
What I feel is missing is some kind of middle ground where we can evict content based on some rule (so it's excluded when packing the cache). Perhaps I'm missing something obvious here. |
@devminded looks like your ask is to be able to update a cache. Something similar to #342 ? Essentially you want to:
There are two parts to it which are not possible today:
|
I understand that it goes into one large tar that gets packed at the end of the build. The problem is just that the source for the tar is a bunch of directories that our build tools fill with stuff but are unable to clean, because their cleanup is based on timestamps and the cache pack-unpack mechanism seems to do something with the timestamps. I guess I will do what I wrote in my original post and base the cache key on the week number or something. With that said, I would then like to propose the following: |
This is a pretty common issue with maven caches. If you have a dependency on foo-1.0.0.jar and then upgrade to foo-1.0.1.jar, the original foo-1.0.0.jar will stay in the cache forever. I have a step in my builds that removes those dirs from .m2 at the end of the build if their last accessed time is older than that of a dir that I created.
Something built in to delete the dir for the maven dep if not accessed in X days would be nice and would reduce the cache size for a lot of people significantly. |
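The marker-based cleanup described above could look roughly like the following. This is only a sketch under assumptions, not the actual step from the comment: the function name `evict_not_accessed_since` and the marker file are hypothetical, and it relies on the filesystem updating access times during the build, which depends on mount options and on how the cache was unpacked.

```python
import os
from pathlib import Path


def evict_not_accessed_since(root: Path, marker: Path) -> list[Path]:
    """Delete files under `root` whose last access time is older than the
    modification time of `marker` (a file created when the build started).

    Hypothetical helper for illustration; returns the removed paths.
    """
    cutoff = marker.stat().st_mtime
    removed = []
    # Walk bottom-up so directories emptied by the cleanup can be dropped.
    for dirpath, _dirnames, filenames in os.walk(root, topdown=False):
        for name in filenames:
            path = Path(dirpath) / name
            if path.stat().st_atime < cutoff:
                path.unlink()
                removed.append(path)
        directory = Path(dirpath)
        if directory != root and not any(directory.iterdir()):
            directory.rmdir()
    return removed
```

A build would create the marker before resolving dependencies and call the function on the local repository (for example `~/.m2/repository`) as the last step before the cache is saved.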
This issue is stale because it has been open for 200 days with no activity. Leave a comment to avoid closing this issue in 5 days. |
🏓 |
I'd like this feature too. There are some caches that I'm OK with letting expire after the default 7 days. But there are larger caches that I'd like to keep for only about 1-2 days, and there's no action input to specify that. |
This issue is stale because it has been open for 200 days with no activity. Leave a comment to avoid closing this issue in 5 days. |
You shall not close |
@bishal-pdMSFT, I think the ask here is simply to delete stale files during cache restore (or save), based on configurable name patterns and maximum age. @devminded, you may be able to do a version of this yourself with an additional workflow step at the end of your job:
Ideally this would use the last access time, but this cache action doesn't appear to preserve If this action could preserve |
This is related to issue actions/setup-java#269
The problem is that caches fill up over time as dependencies, runtimes, and tooling are upgraded. Old files are never evicted and the cache grows. The current solution is to recalculate the cache-key at every build (base it on the week number or such) and throw it all away, but that works against the purpose of a cache to begin with.
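For illustration, the week-number workaround amounts to deriving a key like the one below. This is a minimal sketch: the function name `weekly_cache_key`, the key layout, and the idea of mixing in a hash of the wrapper file are assumptions, not something the cache action provides.

```python
import datetime
import hashlib


def weekly_cache_key(prefix: str, wrapper_bytes: bytes, today: datetime.date) -> str:
    """Build a cache key that changes once per ISO week and whenever the
    wrapper file content changes (hypothetical key layout)."""
    iso_year, iso_week, _weekday = today.isocalendar()
    digest = hashlib.sha256(wrapper_bytes).hexdigest()[:12]
    return f"{prefix}-{iso_year}-W{iso_week:02d}-{digest}"
```

Every new week produces a new key, so the first build of the week misses the cache and starts from scratch, which is exactly the cost described above.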
I suggest that when saving the caches we should be able to evict files older than a configurable number of days. That way old dependencies will be removed over time and we can have the best of both worlds.
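To make the idea concrete, here is a rough sketch of what such an eviction pass could do before the cache is packed. It is not part of the cache action; the function name `evict_older_than` and its parameters are hypothetical, and it uses the modification time as the age, on the assumption that the access time may not survive a cache restore.

```python
import fnmatch
import os
import time
from pathlib import Path


def evict_older_than(root, max_age_days, patterns=("*",), now=None):
    """Delete files under `root` that match any of `patterns` and whose
    modification time is more than `max_age_days` days before `now`.

    Hypothetical helper for illustration; returns the removed paths.
    """
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400
    removed = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            # Only consider files whose name matches a configured pattern.
            if not any(fnmatch.fnmatch(name, p) for p in patterns):
                continue
            path = Path(dirpath) / name
            if path.stat().st_mtime < cutoff:
                path.unlink()
                removed.append(path)
    return removed
```

For example, `evict_older_than(Path.home() / ".m2" / "repository", 30, patterns=("*.jar", "*.pom"))` would drop matching files not modified in the last 30 days and leave everything else in place.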
PS. I'm not sure how the cache-hit logic works in this scenario.
Something like this: