py-tensorflow: remove patch file for 2.16-rocm-enhanced #44783
Confirmed patch URLs exist. Just have a quick question about one of the two affected patches.
I had to add an additional patch because some of the paths still assumed that all the ROCm components were in a single ROCM_PATH. I'll try to push the changes in the new patch to rocm/tensorflow-upstream eventually.
Please remove the unused patch file from the package repository.
@spackbot rerun pipeline
I've started that pipeline for you!
Thanks.
LGTM. @adamjstewart @aweits Do either of you want to comment on this PR before it is merged?
I will defer to @adamjstewart on this PR.
If I'm understanding correctly, these patches are being removed because they were merged into the AMD release branches. In other words, the existing versions were working correctly until the patches were merged, then all builds started failing, and this PR is required to fix them. This lack of stability is the main reason we avoid using branches in Spack. It would be great if we could use stable, checksummable release tarballs or commit hashes in the future. It would be even better if we didn't need to use a fork for TF+ROCm at all, but I think that's already in the works. I'll try to restrain my complaints because I think you've heard them plenty of times before, but it would be really nice to get TF+ROCm in CI so we can detect and fix breakages like this.
The changes were merged only into the develop branch of https://github.com/ROCm/tensorflow-upstream. The rocm-enhanced branches were unaffected and py-tensorflow+rocm was still building successfully, but I ran into some other problems when testing that required some additional changes. Also, I didn't like the idea of having patch files that are almost 400 lines long, which is why I wanted to remove them once the changes were committed. I've brought up using release tarballs instead of branches with the folks who maintain the ROCm TF fork, so hopefully we'll be using tarballs for future releases. Since TF+ROCm is working with Spack-built ROCm binaries now, I think we can try adding that to the CI as well.
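To illustrate the branch-versus-tarball point in the two comments above, here is a minimal sketch in plain Python. It does not use Spack itself. The dictionary keys mirror keyword arguments that Spack's `version()` directive accepts (`branch`, `commit`, `sha256`), but the helper names, the sample byte strings, and the placeholder commit hash are all hypothetical.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Return the hex SHA-256 digest of an archive's bytes."""
    return hashlib.sha256(data).hexdigest()


def is_pinned(source: dict) -> bool:
    """Treat a source as reproducible if it names a commit or a checksum.

    A bare branch name is not pinned: its contents can change upstream
    without the package recipe changing.
    """
    return "commit" in source or "sha256" in source


def verify(data: bytes, expected_sha256: str) -> bool:
    """Check fetched bytes against the checksum recorded in the recipe."""
    return sha256_hex(data) == expected_sha256


# Hypothetical stand-ins for a release tarball before and after an
# upstream change; real archives would be read from disk.
tarball_at_release = b"release contents"
tarball_after_merge = b"release contents plus a later merge"

# The checksum recorded when the version was added to the recipe.
recorded = sha256_hex(tarball_at_release)

# Hypothetical source declarations (the commit hash is a placeholder).
branch_source = {"branch": "develop"}
commit_source = {"commit": "0" * 40}
tarball_source = {"sha256": recorded}
```

With a recorded checksum, `verify(tarball_after_merge, recorded)` fails at fetch time, so an upstream change surfaces as a checksum mismatch instead of as a patch that no longer applies partway through a build.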
* remove patch file for py-tensorflow@2.16-rocm-enhanced
* add changes
* fix style errors
* remove 2.14-rocm-enhanced version and add patch file
* fix stlye error
* remove jit patch
* add 2.14-rocm-enhanced version
The changes to make tensorflow with rocm support compatible with spack have been pushed:
ROCm/tensorflow-upstream@c467913