remove unsafe use of maven dependency caching, and cache deps explicitly #15106
xvrl wants to merge 11 commits into apache:master
@xvrl - The static checks are failing.
kgyrtkirk left a comment
could you please reference a build that failed because of that caching?
since the cache is labeled by the pom hashes I don't think it's a showstopper to have installed artifacts from the project being built - but yeah; it could be left out
my problem with using random maven commands to download artifacts is that they don't always do everything - for example: druid has some extra maven plugin stuff which downloads a nodejs.tgz into the local m2 repo - we were hitting some issues because of rate limitations...
I have experienced this locally where I had artifacts from a different build causing integration tests to fail. This change removes that possibility. See also further below in my comment for an example that happened as part of this change.
The pom hash is not sufficient: different code could have the same pom hash, causing incorrect artifacts in the maven repository to be used at runtime. Some tests may also accidentally pass due to artifacts being present instead of failing if they are not rebuilt. Only those cases where we explicitly want to reuse artifacts from a previous build step for the same commit should be restoring maven install artifacts. Those are already marked as such https://github.com/apache/druid/blob/master/.github/workflows/unit-and-integration-tests-unified.yml#L72-L80
fair, although the goal of caching is to download as many dependencies as possible without affecting the integrity of the build. If we miss a few it should not be a dealbreaker. I'm trying to achieve this with as few commands as possible. We may need to resolve additional dependencies explicitly if they are only called during some specific phases of the build. It looks like
@abhishekagarwal87 yes, see my comment about dependency:resolve failing on druid artifacts, I'll have to look for an alternative way to pull maven dependencies.
I don't understand how these caches affect your local builds - if you can't point to a build which has failed/misbehaved then I don't really understand why we need to change this. those artifacts could be there and cached and reloaded - but they will not affect things, as they will get overwritten by the local build because incremental compile cannot kick in for a fresh checkout - so for the install target it will need to compile and create the artifacts from scratch.
I don't think it should be a best-effort thing...we should try to avoid as many external (re)downloads as possible... I think as an alternative we could probably purge the local druid artifacts (remove
That depends on the usecase:
Could you describe the scenario you have in mind?
In any phase where we restore the maven repository without explicitly restoring from a specific git commit sha, it should be safe to assume that the maven repository only contains publicly available dependencies. We cannot assume that every phase will always call install first, since that is already not the case today, e.g. our web console step does not https://github.com/apache/druid/blob/master/.github/workflows/static-checks.yml#L173-L177 The cache might be deleted for any reason, and in cases where we fall back to using the setup-java maven cache such as here https://github.com/apache/druid/blob/master/.github/workflows/unit-and-integration-tests-unified.yml#L80 it's possible the maven repo would contain artifacts that do not get built again. For example, if a PR removes a submodule, but some code depending on that submodule still exists, it could pass if the cache contains that artifact, but fail if the cache does not.
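To illustrate the distinction being drawn here, a rough sketch of the two kinds of restore follows. The step names, the `-install-` key segment, and the paths are assumptions made for illustration and are not taken from the Druid workflows; only the `maven-${{ runner.os }}-depsonly-` prefix appears in this PR's diff.

```yaml
# Hypothetical steps; key names and paths are illustrative assumptions.
steps:
  # Shareable across commits: third-party dependencies only,
  # keyed on the contents of the pom files.
  - name: restore dependency-only cache
    uses: actions/cache/restore@v4
    with:
      path: ~/.m2/repository
      key: maven-${{ runner.os }}-depsonly-${{ hashFiles('**/pom.xml') }}

  # Valid only for the commit that produced it: artifacts written by
  # `mvn install`, keyed on the commit sha rather than the pom hash.
  - name: restore artifacts installed for this commit
    uses: actions/cache/restore@v4
    with:
      path: ~/.m2/repository/org/apache/druid
      key: maven-${{ runner.os }}-install-${{ github.sha }}
```

With this split, a step that only restores the first cache never sees artifacts built from another commit, which is the failure mode described in the submodule example.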
@kgyrtkirk to answer your question in the other thread
package works in some cases, e.g. unit tests, but for integration tests the artifacts need to be in the local maven repo.
that's pretty interesting :D a cache should never be taken for granted...
I agree that we should prefer not to keep artifacts produced from the current sources in the cache. I think that we would need to cache more than what the proposed PR tries to (include the node stuff etc.) and possibly remove the attempts to avoid compilation in some jobs - because they use up too much cache space and cause churn.
I'm right now biased toward the
some notes on current state:
2e5c736 to baba1b6
restore-keys: |
  maven-${{ runner.os }}-depsonly-
I think this should be removed as it could result in dragging outdated artifacts forward:
- project declares artifact#1.0
- cache stores artifact#1.0
- pom changes to use artifact#2.0
- because of fallback next cache will still contain artifact#1.0
I'm following the approach suggested in https://github.com/actions/cache/blob/main/examples.md#java---maven.
Even if this includes older artifacts I believe it would still save us time compared to downloading all artifacts again on every change to the pom. This is no different than what happens when developing locally. I rarely have to clean my local .m2 cache, and it does not grow that fast.
GitHub also limits the size of cached artifacts, so I expect the size to stay bounded. If this becomes a problem we can revisit.
it can be in an example; but I disagree with that: right now we load the repo cache from zero 4 times for every PR; switching that to open a new one if the pom.xml-s change is a huge step forward already...I think it's safer to not pick up "some" earlier cache when we will be saving it
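For reference, the exact-match variant argued for here might look roughly like the step below. This is only a sketch: the step name and the full key layout are assumptions built on the `depsonly-` prefix shown in the diff.

```yaml
# Hypothetical step; the key layout is an assumption.
- name: restore maven dependencies (exact match only)
  uses: actions/cache@v4
  with:
    path: ~/.m2/repository
    # No restore-keys: when any pom.xml changes, the key changes and the
    # repository starts empty, so artifact#1.0 from an older cache entry
    # is never carried into the newly saved cache.
    key: maven-${{ runner.os }}-depsonly-${{ hashFiles('**/pom.xml') }}
```

The trade-off is the one debated in this thread: a pom change costs one full download, in exchange for a cache whose contents match the current poms.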
if we add a mvn dependency:purge-local-repository step before compile, would that make you feel better?
I'm not sure how purge-local-repository would help here; wouldn't that still remove the local repo in case there is a cache hit?
I wonder if the problem is with the opportunistic caching of project artifacts; can't we somehow avoid them getting into the cache:
- exclude them from the cache
  - I did experiment with this - and it's a bit convoluted due to the fact that excludes work by filtering the included list see here
- remove them with an always() step
I think this second approach is better - because if someone decides to add another exclude - it will be less likely to trigger the exclusion issue differently
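The second option could be sketched as a cleanup step like the one below. It assumes the project's own artifacts all live under `~/.m2/repository/org/apache/druid` and that the cache is saved by the post-job phase of `actions/cache`, which runs after all regular steps; the step names and the build command are placeholders.

```yaml
# Hypothetical steps; the build command is a placeholder.
- name: build
  run: mvn -B install -DskipTests

- name: remove project artifacts before the cache is saved
  # always() makes the cleanup run even when an earlier step failed
  if: always()
  run: rm -rf ~/.m2/repository/org/apache/druid
```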
- we can use purge-local-repository to do what you suggest, it has lots of options to include/exclude various artifacts / purge only snapshot, and doesn't make any assumptions where the maven repo is stored.
- I fail to see how your suggestion addresses your earlier point that the cache would just keep growing though.
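As a rough illustration of the first point, a purge step might look like the following. `manualInclude` and `reResolve` are parameters of the `dependency:purge-local-repository` goal, but the exact invocation here is an untested assumption and worth checking against the plugin documentation before use.

```yaml
# Hypothetical step; the flag combination is an assumption.
- name: purge project artifacts from the local maven repository
  # manualInclude purges everything under the given groupId;
  # reResolve=false stops the plugin from downloading the artifacts again
  run: mvn -B dependency:purge-local-repository -DmanualInclude=org.apache.druid -DreResolve=false
```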
- name: compile and cache maven dependencies
  # run maven compile step to cache all shared maven dependencies
  # maven install will be run in a subsequent step to avoid polluting the global maven cache
I wonder why you haven't replied back to my comments;
why can't we just exclude the artifacts built by the project from the cache?
excluding artifacts is brittle since it requires knowing every artifact that the project might write to the local maven repo, and knowing what might or might not be safe to cache.
Instead I'd rather we err on the safe side and only add things we know are safe to cache.
I believe we are developing the project ...and it should only produce org.apache.druid artifacts - if it doesn't we should fix that...
but all of them seem to be under .m2/repository/org/apache/druid - do you know about any which is not installed there?
I don't feel that using a crafted maven command to try to warm up the cache would be a better approach
if we also want to be able to cache things like npm for the web console like you suggested, then we will need to rely on a compile/test-compile step to get all the cache entries warmed up. Without that we cannot guarantee that all plugins will be downloaded or invoked to pull their dependencies
Any idea if this patch would help with #15276 (comment)? I just noticed some weird behavior where an IT repeatedly fails on an
@gianm I don't think so, I would imagine the npm caching is separate, and we don't restore any of that today. This might be a different issue.
This pull request has been marked as stale due to 60 days of inactivity.
@kgyrtkirk I no longer have time to work on this, so if you or someone else would like to push it over the hill feel free. In the meantime I will close it.
Fix unsafe usage of maven dependency caching in actions/setup-java, and replace with explicit save/restore of maven dependencies only without relying on mvn install.

Using the setup-java maven cache is not safe for steps calling mvn install, since setup-java cache keys are only based on the pom hashfiles whereas mvn install causes artifacts specific to that commit to end up in the local maven repository.
This disables the use of the setup-java maven cache for all steps.
This also removes references to polluted setup-java cache keys, which we were referencing as fall-back cache using restore-keys: in some integration and unit tests.

Instead, to avoid pulling dependencies all the time, maven dependencies are first resolved explicitly using maven compile / test-compile, and then cached/restored explicitly or used as fallback for cache misses with explicit commit tags.
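The flow described above could be sketched roughly as follows. This is not the workflow from the PR: the step ids, the Java distribution and version, the Maven flags, and the key layout are assumptions, and the sketch leaves out the `restore-keys` fallback that the actual diff includes.

```yaml
# Hypothetical job fragment; names, versions and flags are assumptions.
steps:
  - uses: actions/checkout@v4

  - uses: actions/setup-java@v4
    with:
      distribution: zulu
      java-version: '17'
      # no `cache: maven` here, so setup-java does not manage ~/.m2

  - name: restore maven dependencies
    id: maven-deps
    uses: actions/cache/restore@v4
    with:
      path: ~/.m2/repository
      key: maven-${{ runner.os }}-depsonly-${{ hashFiles('**/pom.xml') }}

  - name: compile and cache maven dependencies
    # warm the repository without running install, so that no
    # commit-specific project artifacts are written into it
    if: steps.maven-deps.outputs.cache-hit != 'true'
    run: mvn -B test-compile

  - name: save maven dependencies
    if: steps.maven-deps.outputs.cache-hit != 'true'
    uses: actions/cache/save@v4
    with:
      path: ~/.m2/repository
      key: ${{ steps.maven-deps.outputs.cache-primary-key }}

  - name: install project artifacts
    # runs after the save, so installed artifacts never reach the shared cache
    run: mvn -B install -DskipTests
```

Saving the cache before `mvn install` is the key design choice: the shared entry only ever contains what `test-compile` resolved.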