Binary docker builds - use image tagged with folder sha #150558
🔗 Helpful Links
🧪 See artifacts and rendered test results at hud.pytorch.org/pr/150558
Note: Links to docs will display an error until the docs builds have been completed.
❌ 71 New Failures as of commit 71992a5 with merge base 91923f0
This comment was automatically generated by Dr. CI and updates every 15 minutes.
How do we achieve this for regular pull request builds nowadays? I feel like they'd need to have this in order to actually be able to test pull requests
Currently s390x doesn't push the image for PRs, just builds it
Building the image every time wouldn't be terrible as long as we had upstream cache
I'm going to try splitting this up into smaller PRs
This is part of splitting up #150558 into smaller chunks, please see that for more context
Uses calculate docker image with the new custom tag prefix, so the naming convention of the docker images is slightly different for images built on PR
based off of https://github.com/pytorch/pytorch/blob/a582f046084d1ea49b2a253ece15a4d6157f2579/.github/workflows/build-manywheel-images.yml#L101
Also moves the push of the docker images from inside the build scripts to inside the workflow
Currently not used anywhere, but the binary docker builds are very similar so I'm going to change them to use this instead
Pull Request resolved: #151471
Approved by: https://github.com/malfet, https://github.com/seemethere, https://github.com/ZainRizvi
#151483)
This is part of splitting up #150558 into smaller chunks, please see that for more context
Use the binary docker build action from #151471
Change the workflow trigger to be all of .ci/docker so it will make a new image + tag whenever it changes.
build script:
* change to be independent of the CUDA_VERSION env var, since all the info should be in the imagename:tag
* remove docker push parts since that will happen during the workflow
* clean up a bit
* make the build script more like the CI build script (use a temp image name)
I don't think this image is actually used anywhere
Also push docker image to imagename:tag, I got rid of it in the PR making the reusable workflow since I thought it was not in the original scripts but it actually is there
Pull Request resolved: #151483
Approved by: https://github.com/ZainRizvi
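The build-script change above says all the needed information should come from the imagename:tag argument rather than from an environment variable. A minimal sketch of that idea in Python, assuming a plain `name:tag` string; the helper name `parse_image_spec` and the example values are hypothetical and do not come from the actual build scripts:

```python
from typing import NamedTuple


class ImageSpec(NamedTuple):
    name: str  # image name, e.g. "manylinux2_28-builder" (illustrative value)
    tag: str   # tag, e.g. "cuda12.6" (illustrative value)


def parse_image_spec(spec: str) -> ImageSpec:
    """Split 'imagename:tag' into its two parts (hypothetical helper)."""
    # rpartition splits on the LAST colon, so a registry host with a port
    # such as "localhost:5000/img:tag" still yields the right tag.
    name, sep, tag = spec.rpartition(":")
    # A "/" in the tag means the last colon belonged to a host:port, i.e.
    # the spec had no tag at all.
    if not sep or not name or not tag or "/" in tag:
        raise ValueError(f"expected 'imagename:tag', got {spec!r}")
    return ImageSpec(name, tag)
```

A build script written this way can derive everything from its single argument, for example `parse_image_spec("manylinux2_28-builder:cuda12.6").tag`.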
This is part of splitting up #150558 into smaller chunks, please see that for more context
Similar to #151483 but for libtorch
Changed the job name
Testing: Can't really test since PRs don't have the credentials to push to docker io, which is the image used for everything, including PRs right now
Pull Request resolved: #151488
Approved by: https://github.com/atalman
This is part of splitting up #150558 into smaller chunks, please see that for more context
Similar to #151483 but for manywheel
Changed the job name
s390x doesn't have access to aws ecr so it doesn't use the action. manylinuxs390x-builder ecr repo doesn't exist in docker hub so idk why the image name is that
Testing: Can't really test since PRs don't have the credentials to push to docker io, which is the image used for everything, including PRs right now
Pull Request resolved: #151489
Approved by: https://github.com/seemethere
Should be the last part of #150558, except for maybe s390x stuff, which I'm still not sure what's going on there
For binary builds, do the thing like we do in CI where we tag each image with a hash of the .ci/docker folder to ensure a docker image built from that commit gets used. Previously it would use imagename:arch-main, which could be a version of the image based on an older commit
After this, changing a docker image and then tagging with ciflow/binaries on the same PR should use the new docker images
Release and main builds should still pull from docker io
Cons:
* if someone rebuilds the image from main or a PR where the hash is the same (ex folder is unchanged, but retrigger docker build for some reason), the release would use that image instead of one built on the release branch
* spin wait for docker build to finish
Pull Request resolved: #151706
Approved by: https://github.com/atalman
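The "spin wait for docker build to finish" con can be pictured as a polling loop that checks the registry until the hash-tagged image appears. This is only a sketch under assumptions, not the workflow's actual implementation: the function names, the 60 second interval, and the 90 minute timeout are all made up for illustration. The existence check shells out to `docker manifest inspect`, which exits non-zero when the registry has no manifest for the reference.

```python
import subprocess
import time
from typing import Callable


def image_exists(image_ref: str) -> bool:
    """Return True if the registry already has a manifest for image_ref."""
    result = subprocess.run(
        ["docker", "manifest", "inspect", image_ref],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


def wait_for_image(
    image_ref: str,
    exists: Callable[[str], bool] = image_exists,
    interval_s: float = 60.0,    # assumed polling interval
    timeout_s: float = 5400.0,   # assumed upper bound (90 minutes)
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll until the image shows up; return False if the timeout passes."""
    deadline = time.monotonic() + timeout_s
    while True:
        if exists(image_ref):
            return True
        if time.monotonic() >= deadline:
            return False
        sleep(interval_s)
```

Passing `exists` and `sleep` as parameters keeps the loop easy to exercise without a registry or real delays.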
It is hard to test the docker images that are built for binaries because the binary workflows are hardcoded to run on an image from docker io with the main tag. To test, you have to change generate_binary_build_matrix to fetch the correct tag from AWS ECR and open a separate PR to test
insert example pr here from when i tested nccl
The main idea is to make the binary docker builds more similar to the CI docker builds: if the .ci/docker folder changes, a new docker image gets built and is identified by the hash of the .ci/docker folder. CI jobs then pull the docker image identified by that hash. For the binary docker images, this includes pushing docker images to ECR in addition to docker.io.
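One way to picture a tag identified by the folder hash: git already stores a tree hash for every directory, and `git rev-parse <rev>:<path>` prints it, so the value changes only when the contents of .ci/docker change. The sketch below is an illustration under assumptions, not the code of the calculate docker image action; the function names, the `prefix-hash` tag layout, and the registry host in the usage note are hypothetical.

```python
import subprocess


def docker_folder_hash(
    repo_root: str = ".", folder: str = ".ci/docker", rev: str = "HEAD"
) -> str:
    """Git tree hash of `folder` at `rev`."""
    out = subprocess.run(
        ["git", "rev-parse", f"{rev}:{folder}"],
        cwd=repo_root,
        check=True,
        capture_output=True,
        text=True,
    )
    return out.stdout.strip()


def image_ref(registry: str, image_name: str, tag_prefix: str, folder_hash: str) -> str:
    """Compose a full image reference; the 'prefix-hash' layout is an assumption."""
    return f"{registry}/{image_name}:{tag_prefix}-{folder_hash}"
```

For example, `image_ref("registry.example", "manylinux-builder", "cuda12.6", docker_folder_hash())` would name the image built from the current checkout, and two commits with an identical .ci/docker folder would resolve to the same reference.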
The main change is using calculate docker image everywhere and renaming things to follow the new convention, where the docker tag prefix is kept separate from the docker image name.
Overview:
Cons
Notes: