Add AOTriton install step for ROCm manylinux images #1862

Merged: 2 commits into pytorch:main on Jun 11, 2024

Conversation

jithunnair-amd
Collaborator

@jithunnair-amd jithunnair-amd commented Jun 10, 2024

Fixes pytorch/pytorch#128420

Logic borrowed from pytorch/pytorch#124885

  • Installs AOTriton from a release tarball in the manylinux image
  • Avoids building AOTriton from source during the PyTorch build
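
A minimal sketch of what a tarball-based install step could look like, not the script added by this PR. The function names, the download URL passed by the caller, and the assumption that the tarball contains a single top-level directory are all hypothetical and chosen only for illustration.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical helper: download a release tarball to a local path.
# $1 = URL of the tarball (supplied by the caller), $2 = destination file.
fetch_tarball() {
    local url="$1"
    local dest="$2"
    # -f: fail on HTTP errors, -L: follow redirects, -sS: quiet but show errors
    curl -fsSL -o "${dest}" "${url}"
}

# Hypothetical helper: unpack a prebuilt AOTriton tarball into an install prefix.
# $1 = path to a local tarball, $2 = install prefix (created if missing).
install_aotriton_from_tarball() {
    local tarball="$1"
    local prefix="$2"
    mkdir -p "${prefix}"
    # Assumption: the tarball has one top-level directory, which is stripped
    # so that lib/ and include/ land directly under the prefix.
    tar -xf "${tarball}" -C "${prefix}" --strip-components=1
}
```

In a Docker image build, these would be called once with the release URL and a prefix such as `/aotriton`; the later PyTorch build would then need to be pointed at that prefix (for example through an environment variable) so that it skips the source build. The exact URL, prefix, and variable name are not stated in this conversation.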

Pre-this-PR:

Post-this-PR:
TBD

@jithunnair-amd
Collaborator Author

jithunnair-amd commented Jun 10, 2024

cc @atalman Please approve the workflows for this PR so we can test the AOTriton install in the manywheel Docker images.

@jithunnair-amd jithunnair-amd marked this pull request as ready for review June 11, 2024 15:01

pytorch-bot bot commented Jun 11, 2024

No ciflow labels are configured for this repo.
For information on how to enable CIFlow bot see this wiki

Contributor

@atalman atalman left a comment

lgtm

@atalman atalman merged commit 178b624 into pytorch:main Jun 11, 2024
26 checks passed
atalman pushed a commit to atalman/builder that referenced this pull request Jun 11, 2024
* Add AOTriton install step for ROCm

* No common_utils.sh needed
atalman added a commit that referenced this pull request Jun 12, 2024
* Add AOTriton install step for ROCm

* No common_utils.sh needed

Co-authored-by: Jithun Nair <37884920+jithunnair-amd@users.noreply.github.com>
atalman added a commit that referenced this pull request Jun 17, 2024
* Remove triton constraint for py312 (#1846)

* Cache OpenBLAS to docker image for SBSA builds (#1842)

* apply openblas cache for cpu-aarch64

* reapply for cuda-aarch64

* [MacOS] Don't build wheel while building libtorch

Not sure why this was ever done twice

* Allow validate docker images to be called from different workflow (#1850)

* Allow validate docker images to be called from different workflow

* Revert "[MacOS] Don't build wheel while building libtorch"

This reverts commit d88495a.

* [MacOS] Don't build libtorch twice (take 2)

By not invoking `tools/build_libtorch.py`, as it's not done on Linux

* [MacOS][LibTorch] Copy libomp.dylib into libtorch package

* Update cudnn from v8 to v9 across CUDA versions and x86/arm (#1847)

* Update cudnn to v9.1.0.70 for cuda11.8, cuda12.1, and cuda12.4

* Add CUDNN_VERSION variable

* Remove 2 spaces for install_cu124

* trivial fix

* Fix DEPS_LIST and DEPS_SONAME for x86
Update cudnn to v9 for arm cuda binary as well

* libcudnn_adv_infer/libcudnn_adv_train becomes libcudnn_adv

* Change DEPS due to cudnn v9 libraries name changes (and additions)

* Fix lint

* Add missing changes to cu121/cu124

* Change OpenSSL URL (#1854)

* Change OpenSSL URL

* Change to use openssl URL (but no longer ftp!)

* Update build-manywheel-images.yml - Add a note about manylinux_2_28 state

* Revert "Update cudnn from v8 to v9 across CUDA versions and x86/arm" (#1855)

This reverts commit 5783bcc.

* Don't run torch.compile on runtime images in docker validations (#1858)

* Don't run torch.compile on runtime images

* test

* Don't run torch.compile on runtime images in docker validations

* Update cudnn from v8 to v9 across CUDA versions and x86/arm (#1857)

* Update cudnn to v9.1.0.70 for cuda11.8, cuda12.1, and cuda12.4

* Add CUDNN_VERSION variable

* Remove 2 spaces for install_cu124

* trivial fix

* Fix DEPS_LIST and DEPS_SONAME for x86
Update cudnn to v9 for arm cuda binary as well

* libcudnn_adv_infer/libcudnn_adv_train becomes libcudnn_adv

* Change DEPS due to cudnn v9 libraries name changes (and additions)

* Fix lint

* Add missing changes to cu121/cu124

* Fix aarch64 cuda typos

* Update validate-docker-images.yml - disable runtime error check for now

* Update validate-docker-images.yml - use validation_runner rather than hardcoded one

* Update validate-docker-images.yml - fix MATRIX_GPU_ARCH_TYPE setting for cpu only workflows

* [aarch64 cuda cudnn] Add RUNPATH to libcudnn_graph.so.9 (#1859)

* Add executorch to pypi prep, promotion and validation scripts (#1860)

* Add AOTriton install step for ROCm manylinux images (#1862)

* Add AOTriton install step for ROCm

* No common_utils.sh needed

* temporarily disable runtime error check

* Add python 3.13 builder (#1845)

---------

Co-authored-by: Ting Lu <92425201+tinglvv@users.noreply.github.com>
Co-authored-by: Nikita Shulga <nikita.shulga@gmail.com>
Co-authored-by: Wei Wang <143543872+nWEIdia@users.noreply.github.com>
Co-authored-by: Jithun Nair <37884920+jithunnair-amd@users.noreply.github.com>
PaliC pushed a commit that referenced this pull request Jun 18, 2024
* Add AOTriton install step for ROCm

* No common_utils.sh needed
Development

Successfully merging this pull request may close these issues.

AOTriton Cmake error breaking PyTorch nightly binary builds for ROCm
3 participants