CI dependency wheel caching #6287
Remove cache dir, re-trigger cache Only pip archives Not sudo when pip
Remove no-cache-dir instruction Remove last sudo occurrences v0.3
Codecov Report
@@ Coverage Diff @@
## master #6287 +/- ##
==========================================
+ Coverage 79.68% 80.45% +0.77%
==========================================
Files 146 146
Lines 26595 26595
==========================================
+ Hits 21192 21397 +205
+ Misses 5403 5198 -205
Is there any notion of routines/functions in circleci? Might make it easier to reuse some of this logic without typos. Otherwise this looks great and thanks for the informative PR description!
Thanks a lot for digging into this, those errors are so annoying!
If the model is missing on S3, it should be reported as a failure, shouldn't it? I wouldn't include |
@julien-c These flaky test failures report that the model is missing on S3 when it isn't. The model is available, but because CircleCI has connection issues, the download fails and the test reports an error. I'll link such a failing test here when there is one.
In the past week there has been a drastic increase in CircleCI test failures caused by hash mismatches for large dependencies (Torch and TensorFlow), example here:
The issue stems from incomplete downloads when CircleCI fetches the wheels in order to install the dependencies. The download is interrupted, leaving an incomplete file whose hash differs from the expected one.
With this PR, the CircleCI `~/.cache/pip` directory, which contains the downloaded wheels, is cached between runs. The files won't be re-downloaded as long as the latest version is available in the cache. A cache is created for each workflow.
Unfortunately, CircleCI does not provide a way to update or delete an existing cache. If a new version of either Torch or TensorFlow is released, the cache won't be updated until we change either `setup.py` or the cache version, or until the cache expires 15 days after its creation.
If the cache works well enough, I'll also include `~/.cache/torch` in the cache, so that all the downloads done during the tests are saved as well. This should prevent other flaky tests caused by models reported as missing on S3.
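For readers unfamiliar with CircleCI caching, the approach described above could look roughly like the following config. This is a minimal sketch, not the exact configuration in this PR: the job name `run_tests`, the workflow name `build_and_test`, the Docker image, the install command, and the cache key layout (`v0.3-pip-...`) are all assumptions made for illustration. Only the `restore_cache` / `save_cache` steps and the `{{ checksum "setup.py" }}` key template are standard CircleCI features.

```yaml
version: 2
jobs:
  run_tests:                      # hypothetical job name
    docker:
      - image: circleci/python:3.6   # assumed image
    steps:
      - checkout
      # Restore previously downloaded wheels, if a cache exists for this key.
      # The key changes whenever setup.py changes or the version prefix is bumped.
      - restore_cache:
          keys:
            - v0.3-pip-{{ checksum "setup.py" }}
      # Assumed install command; pip reuses the archives found in ~/.cache/pip.
      - run: pip install --user .
      # Save the pip download cache. CircleCI does not overwrite an existing key,
      # so this only writes on the first run for a given key.
      - save_cache:
          key: v0.3-pip-{{ checksum "setup.py" }}
          paths:
            - '~/.cache/pip'
workflows:
  version: 2
  build_and_test:                 # hypothetical workflow name
    jobs:
      - run_tests
```

Because an existing cache cannot be updated or deleted, bumping the version prefix in the key (here `v0.3`) is the way to force a fresh cache, for example after a new Torch or TensorFlow release. Adding `~/.cache/torch` later would mean appending it to `paths` and bumping the prefix as well.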