Clean up Docker images on HPU and CPU benchmarks #106
Conversation
Signed-off-by: Huy Do <huydhn@gmail.com>
```yaml
uses: aws-actions/amazon-ecr-login@062b18b96a7aff071d4dc91bc00c4c1a7945b076 # v2.0.1
with:
  registry-type: public
```
Why are they downloaded in the first place if those Docker images are unused at this point? Does it just download all of them and only use some? Curious about this, but will approve.
They are from previous jobs running on the same runner. This is a common issue for non-ephemeral runners, where docker pull keeps piling up images over time. On our pet instances like H100, we have a daemon running on the server to clean up these images, so you don't see this step on CI. In this case, the HPU servers are from Intel, so they are not set up in the same way.
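For illustration, here is a minimal sketch of the kind of cleanup step this involves, assuming a plain `docker system prune` run on the runner. The step name and placement are only examples, not the exact change in this PR:

```yaml
# Sketch only: reclaim space on a non-ephemeral runner by removing images
# left behind by earlier jobs. The actual step added in this PR may differ.
- name: Clean up old Docker images
  shell: bash
  run: |
    # Show how much space images and build cache are using before cleanup
    docker system df
    # Remove all unused images (not just dangling ones) and build cache
    docker system prune -af
    docker system df
```

On pet instances, the same prune can instead run periodically from a cron job or systemd timer, which is presumably the role the cleanup daemon mentioned above plays.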
Thanks so much for the PR. I just acknowledged the issue this morning and have cleaned up HPU now.
HPU runners ran out of space because of all the Docker images they downloaded: https://github.com/pytorch/pytorch-integration-testing/actions/runs/19330939614/job/55293605542
Testing
https://github.com/pytorch/pytorch-integration-testing/actions/runs/19347827069
cc @louie-tsai @xuechendi