Switch to pytorch sccache #4489
@yeounoh Can you take this one?
@huydhn will verify, thank you!
# if image layers are not present in the repo.
docker tag ${GCR_DOCKER_IMAGE} ${ECR_DOCKER_IMAGE_BASE}:v0.6 >/dev/null
docker push ${ECR_DOCKER_IMAGE_BASE}:v0.6 >/dev/null
docker tag ${GCR_DOCKER_IMAGE} ${ECR_DOCKER_IMAGE_BASE}:v0.7 >/dev/null
Could we use v0.8? v0.7 was already taken, so I pushed under v0.8 while testing/verifying your image in our CI.
Actually, I can merge and bump up the version separately.
Oh, I think we also need to rebase before merging 🙏 looking at the test failures.
@yeounoh Thank you for doing the rebase for me. I'll update the XLA Docker image on PyTorch CI to v0.8 accordingly.
Given the context in pytorch/xla#4489, we now have a new XLA Docker image `v0.8`. This should fix the flaky sccache initialization failures with XLA.
Pull Request resolved: #93041
Approved by: https://github.com/malfet
* Use pytorch/sccache
* Up the image version
Recently, there have been some flaky sccache start-up failures in PyTorch CI when building XLA, for example:
The full list can be found here. The error, strangely, comes only from XLA.
It turns out that XLA `pytorch/xla_base:v0.6` uses upstream sccache from `https://github.com/mozilla/sccache.git`, while the rest of PyTorch CI uses a custom fork from `https://github.com/pytorch/sccache.git`, as defined in https://github.com/pytorch/pytorch/blob/master/.circleci/docker/common/install_cache.sh#L12.

IMO, it's easier to stick to `https://github.com/pytorch/sccache.git`, as the rest of the CI does, until we have the capacity to switch back to upstream sccache.

AFAIK, `https://github.com/pytorch/sccache.git` has some fixes to work with nvcc.
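
For illustration, here is a minimal sketch of how an image build script might select between the two sccache sources and build from source. It is not the contents of `install_cache.sh` or of this PR's diff: the function names `sccache_repo_url` and `install_sccache`, the `pytorch`/`upstream` labels, and the destination directory argument are all hypothetical. Only the two repository URLs come from the description above.

```shell
#!/usr/bin/env bash
# Sketch only: function names and arguments are assumptions, not from the PR.

# Map a source label to the repository URL named in the PR description.
sccache_repo_url() {
  case "$1" in
    pytorch)  echo "https://github.com/pytorch/sccache.git" ;;
    upstream) echo "https://github.com/mozilla/sccache.git" ;;
    *) echo "unknown sccache source: $1" >&2; return 1 ;;
  esac
}

# Clone the chosen repository, build it with cargo, and copy the binary
# into the destination directory (hypothetical layout; needs git and cargo).
install_sccache() {
  local source_label="$1"
  local dest_dir="$2"
  local url workdir
  url="$(sccache_repo_url "$source_label")" || return 1
  workdir="$(mktemp -d)"
  git clone --depth 1 "$url" "$workdir/sccache" || return 1
  (cd "$workdir/sccache" && cargo build --release) || return 1
  mkdir -p "$dest_dir"
  install -m 0755 "$workdir/sccache/target/release/sccache" "$dest_dir/sccache"
}

# Example (not run here, since it needs network access and a Rust toolchain):
#   install_sccache pytorch /opt/cache/bin
```

Keeping the URL choice in one small function would make the eventual switch back to upstream sccache a one-word change at the call site.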