[rocm7.1_internal_testing] Fix issues with merge conflicts #2461

pragupta · 2025-08-05T20:43:08Z

Remove building tensorpipe for ROCm by reverting 550bc77 as this support is going to get dropped upstream as well.
External/aotriton.cmake: remove use of __AOTRITON_VER_WITH_COMMIT
macros/Export.h: remove TORCH_HIP_CPP_API/TORCH_HIP_API and other hipified instances as CUDA ones get hipified and converted correctly (need to upstream this)
CUDALoops.cuh: fix a bad merge
Blas.cpp: remove MX patch (Blockwise support is not upstreamed)
cuda_vectorized_test.cu: remove the ROCmloops-specific test, which was removed in the rocm7.0_internal_testing branch; I had resolved the merge conflicts incorrectly when merging with upstream
Update requirements-ci.txt to reflect both upstream and rocm/release/2.8 changes.

I tested this with the following docker image: registry-sc-harbor.amd.com/framework/compute-rocm-rel-7.0:24_ubuntu24.04_py3.12_pytorch_lw_release-2.7_faae1f39 and ran all the "core" UTs.

export TESTS_TO_INCLUDE="test_nn test_torch test_cuda test_ops test_unary_ufuncs test_binary_ufuncs test_autograd inductor/test_torchinductor"
export TESTS_TO_INCLUDE="distributed/test_c10d_common distributed/test_c10d_nccl distributed/test_distributed_spawn"
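
Note that within one shell session the second `export` overwrites the first, so the two lists above presumably correspond to two separate runs (core, then distributed). A minimal sketch of building one command per set, assuming `TESTS_TO_INCLUDE` ends up as the `--include` argument of `test/run_test.py`; the helper name `build_commands` is hypothetical and not part of this PR:

```python
import shlex

# The two test sets from the description above. A second
# `export TESTS_TO_INCLUDE=...` in the same shell replaces the first,
# so each set needs its own run.
TEST_SETS = {
    "core": "test_nn test_torch test_cuda test_ops test_unary_ufuncs "
            "test_binary_ufuncs test_autograd inductor/test_torchinductor",
    "distributed": "distributed/test_c10d_common distributed/test_c10d_nccl "
                   "distributed/test_distributed_spawn",
}


def build_commands(test_sets):
    """Return one command line per test set (hypothetical helper)."""
    commands = {}
    for name, tests in test_sets.items():
        # Assumption: the runner is test/run_test.py and accepts --include.
        argv = ["python", "test/run_test.py", "--include", *tests.split()]
        commands[name] = shlex.join(argv)
    return commands


if __name__ == "__main__":
    for name, cmd in build_commands(TEST_SETS).items():
        print(f"[{name}] {cmd}")
```

Keeping the sets in a dictionary avoids relying on a single environment variable that can only hold one list at a time.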

Only the following UTs failed with accuracy issues:

test/test_nn.py::TestNN::test_Transformer_multilayer_coder_cuda_tf32
test/test_cuda.py::TestCudaMallocAsync::test_memory_snapshot
test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_profiling_execution_trace
test/distributed/test_c10d_nccl.py::CommTest::test_intra_node_comm_all_reduce
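
The entries above are pytest node IDs (`file::Class::test`), so each failure can likely be rerun on its own. A minimal sketch that splits the IDs and builds one rerun command per failure, assuming it is run from the repository root with pytest installed; `split_node_id` and `rerun_command` are hypothetical helpers, and the distributed tests may still need a multi-GPU environment:

```python
# Node IDs copied from the failure list above.
FAILED = [
    "test/test_nn.py::TestNN::test_Transformer_multilayer_coder_cuda_tf32",
    "test/test_cuda.py::TestCudaMallocAsync::test_memory_snapshot",
    "test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn"
    "::test_ddp_profiling_execution_trace",
    "test/distributed/test_c10d_nccl.py::CommTest::test_intra_node_comm_all_reduce",
]


def split_node_id(node_id):
    """Split 'path::Class::test' into its three parts."""
    path, cls, test = node_id.split("::")
    return path, cls, test


def rerun_command(node_id):
    """Build a pytest command line that runs exactly one test."""
    return f"python -m pytest -v {node_id}"


if __name__ == "__main__":
    for node_id in FAILED:
        path, cls, test = split_node_id(node_id)
        print(f"{path}: {cls}.{test}")
        print(f"  {rerun_command(node_id)}")
```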

- Fix the tensorpipe branch for ROCm
- External/aotriton.cmake: remove use of __AOTRITON_VER_WITH_COMMIT
- macros/Export.h: remove TORCH_HIP_CPP_API/TORCH_HIP_API as CUDA ones get hipified and converted correctly (need to upstream this)
- CUDALoops.cuh: Bad merge
- Blas.cpp: remove MX patch
- cuda_vectorized_test.cu: remove ROCmloops specific test, this was removed in rocm7.0_internal_testing branch. I had incorrectly addressed the merge conflicts when merging with upstream

rocm-repo-management-api · 2025-08-05T20:46:53Z

Jenkins build for 13e472a3bca6121446e9911e3b6f1970f9a64f90 commit finished as FAILURE
Links: Blue Ocean view / Build artifacts

.gitmodules

rocm-repo-management-api · 2025-08-06T20:49:30Z

Jenkins build for 7bd7047ad708ae240f538c3159fe2d5a5ab49a60 commit finished as FAILURE
Links: Blue Ocean view / Build artifacts

rocm-repo-management-api · 2025-08-06T21:49:30Z

Jenkins build for 9025071507f9ca4ba1f9bfecf9c7630e0100930b commit finished as FAILURE
Links: Blue Ocean view / Build artifacts

jithunnair-amd · 2025-08-08T02:42:02Z

aten/src/ATen/native/cuda/Blas.cpp

-#else
-    TORCH_CHECK(false, "Block-wise scaling for Float8_e8m0fnu requires ROCm 7.0 or later");
-#endif
+         "hipblaslt rowwise _scaled_mm only supports BFloat16 output but got ", out.scalar_type());


@pragupta Sorry, it's not clear to me why we are removing these lines... Is the gfx950-specific code not needed?

jithunnair-amd · 2025-08-08T03:11:13Z

torch/headeronly/macros/Export.h


 // Enums only need to be exported on windows for non-CUDA files
-#if defined(_WIN32) && defined(__CUDACC__)
+#if defined(_WIN32) && defined(__HIPCC__)


@pragupta The changes to the comments in this file and the change in this line seem to contradict the PR description: "macros/Export.h: remove TORCH_HIP_CPP_API/TORCH_HIP_API as CUDA ones get hipified and converted correctly (need to upstream this)"?

jithunnair-amd

Most changes look good; I left questions on a couple of changes that weren't clear.

rocm-repo-management-api · 2025-08-08T16:56:55Z

Jenkins build for 8ba5af94c7932107119bb841a3d1ad4060ddbeef commit finished as FAILURE
Links: Blue Ocean view / Build artifacts

rocm-repo-management-api · 2025-08-08T17:11:14Z

Jenkins build for 8ba5af94c7932107119bb841a3d1ad4060ddbeef commit finished as FAILURE
Links: Blue Ocean view / Build artifacts

rocm-repo-management-api · 2025-08-08T21:13:34Z

Jenkins build for 1ee30ac39628b611d22a8fdf331e07b4a697ec5a commit finished as FAILURE
Links: Blue Ocean view / Build artifacts

pragupta requested review from jithunnair-amd and pruthvistony and removed request for pruthvistony August 5, 2025 20:43

pragupta changed the title ~~Fix issues with merge conflicts~~ [rocm7.1_internal_testing] Fix issues with merge conflicts Aug 6, 2025

Update triton commit and related_commits files

7bd7047

- Update triton commit to point to ToT of release/internal/3.4.x
- Update related_commits file with ToT of respective repos

pragupta requested review from jataylo and jeffdaily as code owners August 6, 2025 20:29

jithunnair-amd reviewed Aug 6, 2025

.gitmodules (outdated, resolved)

Revert "CONSOLIDATED COMMITS: Enable tensorpipe with hip_basic backend"

9025071

This reverts commit 550bc77.

jithunnair-amd reviewed Aug 8, 2025

Export.h: dehipify this file

301dc46

Update requirements-ci with upstream and rocm/release/2.8

9ce6b1a

pragupta force-pushed the pg_rocm7.1_internal_testing_IFU_08052025 branch from 8ba5af9 to 9ce6b1a August 8, 2025 21:04

pragupta and others added 3 commits August 8, 2025 21:16

Use newer versions from upstream

45f8d8e

Fix protobuf version

44451cd

typo

1ee30ac

jithunnair-amd merged commit 36e36c8 into ROCm:rocm7.1_internal_testing Aug 8, 2025
