Clean up Docker images on HPU and CPU benchmarks #106
Conversation
Signed-off-by: Huy Do <huydhn@gmail.com>
```yaml
uses: aws-actions/amazon-ecr-login@062b18b96a7aff071d4dc91bc00c4c1a7945b076 # v2.0.1
with:
  registry-type: public
```
Why are they downloaded in the first place if those Docker images are unused at this point? Does it just download all of them and only use some? Curious about this, but will approve.
They are from previous jobs running on the same runner. This is a common issue for non-ephemeral runners, where docker pull keeps piling up images over time. On our pet instances like H100, we have a daemon running on the server to clean up these images, so you don't see this step on CI. In this case, the HPU servers are from Intel, so they are not set up in the same way.
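For illustration, here is a minimal sketch of the kind of cleanup step this involves, assuming a plain `docker system prune` run on the runner. The step name and placement are only examples, not the exact change in this PR:

```yaml
# Sketch only: reclaim space on a non-ephemeral runner by removing images
# left behind by earlier jobs. The actual step added in this PR may differ.
- name: Clean up old Docker images
  shell: bash
  run: |
    # Show how much space images and build cache are using before cleanup
    docker system df
    # Remove all unused images (not just dangling ones) and build cache
    docker system prune -af
    docker system df
```

On pet instances, the same prune can instead run periodically from a cron job or systemd timer, which is presumably the role the cleanup daemon mentioned above plays.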
Thanks so much for the PR. I just acknowledged the issue this morning and have cleaned up HPU now.
HPU runners ran out of space because of all the Docker images they downloaded: https://github.com/pytorch/pytorch-integration-testing/actions/runs/19330939614/job/55293605542
Testing
https://github.com/pytorch/pytorch-integration-testing/actions/runs/19347827069
cc @louie-tsai @xuechendi