[rocm7.1_internal_testing] Fix issues with merge conflicts #2461
Conversation
- Fix the tensorpipe branch for ROCm
- External/aotriton.cmake: remove use of __AOTRITON_VER_WITH_COMMIT
- macros/Export.h: remove TORCH_HIP_CPP_API/TORCH_HIP_API, since the CUDA macros get hipified and converted correctly (needs to be upstreamed); see the sketch after this list
- CUDALoops.cuh: fix a bad merge
- Blas.cpp: remove the MX patch
- cuda_vectorized_test.cu: remove the ROCm-specific loops test, which was already removed in the rocm7.0_internal_testing branch; I had incorrectly addressed the merge conflicts when merging with upstream
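To illustrate the Export.h point above: hipify rewrites the CUDA-named export macros into their HIP counterparts when it generates HIP sources, so a hand-maintained TORCH_HIP_* block in Export.h becomes redundant. A minimal sketch of the pattern, assuming the usual C10_EXPORT/C10_IMPORT helpers; the build-flag name and layout here are illustrative, not the actual file contents:
// Illustrative sketch only -- not the actual torch/headeronly/macros/Export.h.
// Visibility helpers (the real ones live in the c10 macro headers):
#ifdef _WIN32
#define C10_EXPORT __declspec(dllexport)
#define C10_IMPORT __declspec(dllimport)
#else
#define C10_EXPORT __attribute__((__visibility__("default")))
#define C10_IMPORT
#endif
// Only the CUDA-named macro is maintained by hand; TORCH_CUDA_BUILD_MAIN_LIB
// is an assumed/illustrative build flag here.
#if defined(TORCH_CUDA_BUILD_MAIN_LIB)
#define TORCH_CUDA_CPP_API C10_EXPORT
#else
#define TORCH_CUDA_CPP_API C10_IMPORT
#endif
// hipify rewrites TORCH_CUDA_CPP_API to TORCH_HIP_CPP_API in the generated HIP
// sources, which is why separately maintained TORCH_HIP_* definitions can be dropped.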
Jenkins build for 13e472a3bca6121446e9911e3b6f1970f9a64f90 commit finished as FAILURE
- Update triton commit to point to ToT of release/internal/3.4.x
- Update related_commits file with ToT of respective repos
Jenkins build for 7bd7047ad708ae240f538c3159fe2d5a5ab49a60 commit finished as FAILURE
This reverts commit 550bc77.
Jenkins build for 9025071507f9ca4ba1f9bfecf9c7630e0100930b commit finished as FAILURE
#else
  TORCH_CHECK(false, "Block-wise scaling for Float8_e8m0fnu requires ROCm 7.0 or later");
#endif
  "hipblaslt rowwise _scaled_mm only supports BFloat16 output but got ", out.scalar_type());
@pragupta Sorry, it's not clear to me why we are removing these lines... Is the gfx950-specific code not needed?
torch/headeronly/macros/Export.h
  // Enums only need to be exported on windows for non-CUDA files
- #if defined(_WIN32) && defined(__CUDACC__)
+ #if defined(_WIN32) && defined(__HIPCC__)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@pragupta The changes to the comments in this file and the change in this line seem to contradict the PR description: "macros/Export.h: remove TORCH_HIP_CPP_API/TORCH_HIP_API as CUDA ones get hipified and converted correctly (need to upstream this)"?
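Background for the question above, on the standard compiler-defined macros involved (this reflects general CUDA/HIP convention, not this PR's diff): nvcc defines __CUDACC__, the HIP compiler defines __HIPCC__, and the hipify flow normally rewrites __CUDACC__ to __HIPCC__ in generated sources, which is why hand-editing the guard appears to contradict the "CUDA ones get hipified" rationale. A minimal detection sketch:
// Sketch of the compiler-detection macros referenced in the guard above.
// __CUDACC__ is defined by nvcc; __HIPCC__ by hipcc/amdclang++ when compiling HIP code.
#if defined(__CUDACC__)
// Compiling device code with nvcc.
#elif defined(__HIPCC__)
// Compiling device code with the HIP compiler.
#else
// Plain host compiler.
#endif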
jithunnair-amd left a comment
Most changes look good; I left questions on a couple of changes that weren't clear.
Jenkins build for 8ba5af94c7932107119bb841a3d1ad4060ddbeef commit finished as FAILURE
Force-pushed from 8ba5af9 to 9ce6b1a
Jenkins build for 1ee30ac39628b611d22a8fdf331e07b4a697ec5a commit finished as FAILURE
I tested this with the following docker image:
registry-sc-harbor.amd.com/framework/compute-rocm-rel-7.0:24_ubuntu24.04_py3.12_pytorch_lw_release-2.7_faae1f39
and ran all the "core" UTs:
export TESTS_TO_INCLUDE="test_nn test_torch test_cuda test_ops test_unary_ufuncs test_binary_ufuncs test_autograd inductor/test_torchinductor"
export TESTS_TO_INCLUDE="distributed/test_c10d_common distributed/test_c10d_nccl distributed/test_distributed_spawn"
Only the following UTs failed with accuracy issues:
Fixes #ISSUE_NUMBER