@xinyazhang xinyazhang commented Jul 2, 2025

Tested build locally using: registry-sc-harbor.amd.com/rocm-ci-images/compute-rocm-dkms-component-staging-hip:2277-ubunt6u-22.04

@xinyazhang xinyazhang changed the title Enable FA/ME UT on gfx950 for ROCM >= 6.5 [release/2.5] Enable FA/ME UT on gfx950 for ROCM >= 6.5 Jul 2, 2025
@xinyazhang xinyazhang marked this pull request as ready for review July 2, 2025 18:52
rocm-repo-management-api bot commented Jul 2, 2025

Jenkins build for 3c1c9e87f8386753925a7f67dbcc852071f8aa93 commit finished as ABORTED
Links: Blue Ocean view / Build artifacts

@xinyazhang xinyazhang marked this pull request as draft July 2, 2025 19:37
xinyazhang and others added 3 commits July 2, 2025 21:08
…:warp_size() (#2293)

Fixes SWDEV-540240, SWDEV-540309, SWDEV-539989

```
...
```

Commit 80cca70 created a static global variable that used
`at::cuda::warp_size()` to initialize its value. That call queries device
properties, so it requires GPUs to be visible. However, GPUs are not
present on CPU-only build systems.

Convert static variable into a static function, thus preventing static
initialization.
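
A minimal sketch of the pattern described above, not the actual patch. `query_warp_size()`, `g_device_visible`, `g_query_count`, and `warp_size_lazy()` are hypothetical names standing in for `at::cuda::warp_size()` and its caller, and the value 64 is a placeholder. It only illustrates why a function defers the device query until first use, while a static variable would run it at load time:

```cpp
#include <stdexcept>

// Hypothetical stand-ins (not PyTorch APIs): model a device query that
// fails when no GPU is visible, as on a CPU-only build machine.
static bool g_device_visible = false;
static int g_query_count = 0;

int query_warp_size() {
  ++g_query_count;
  if (!g_device_visible) {
    throw std::runtime_error("no GPU visible");
  }
  return 64;  // placeholder value for illustration only
}

// Problematic pattern: the initializer runs during static initialization,
// before main(), so it would throw on a machine without GPUs.
// static const int kWarpSize = query_warp_size();

// Pattern from the commit message: a static function instead of a static
// variable. Nothing is queried until a caller actually needs the value.
static int warp_size_lazy() {
  return query_warp_size();
}
```

With this shape, loading the translation unit performs no device query. The query happens only when `warp_size_lazy()` is called, by which point a GPU is expected to be visible.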

http://rocm-ci.amd.com/job/pyt_whl_docker_mainline/1461/artifact/build_artifacts.txt/*view*/

Ran microbenchmark to confirm basic functionality:
```
root@ubb4-rack-22:/var/lib/jenkins/pytorch-micro-benchmarking# python3 micro_benchmarking_pytorch.py --network resnet50
INFO: running forward and backward for warmup.
INFO: running the benchmark..
OK: finished running benchmark..
--------------------SUMMARY--------------------------
Microbenchmark for network : resnet50
Num devices: 1
Dtype: FP32
Mini batch size [img] : 64
Time per mini-batch : 0.10158218145370483
Throughput [img/sec] : 630.0317544289736
```
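
As a sanity check on the summary above, the reported throughput is consistent with mini-batch size divided by time per mini-batch on a single device. This relation is an assumption inferred from the numbers, not taken from the benchmark source, and `throughput_img_per_sec` is a hypothetical helper:

```cpp
// Assumption: throughput = images per mini-batch / seconds per mini-batch
// (single device). Hypothetical helper, not part of the benchmark script.
double throughput_img_per_sec(int mini_batch_size, double seconds_per_batch) {
  return static_cast<double>(mini_batch_size) / seconds_per_batch;
}
```

Calling `throughput_img_per_sec(64, 0.10158218145370483)` gives roughly 630.03 img/sec, matching the logged figure.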
@xinyazhang xinyazhang force-pushed the xinyazhang/rocm7.0torch2.5-enable_mi350_testing branch from 14abff4 to 3c1c9e8 Compare July 2, 2025 21:48
@xinyazhang xinyazhang changed the title [release/2.5] Enable FA/ME UT on gfx950 for ROCM >= 6.5 [release/2.5] Fix the Build and Enable FA/ME UT on gfx950 for ROCM >= 6.5 Jul 2, 2025
@jithunnair-amd jithunnair-amd merged commit a1ad153 into release/2.5 Jul 2, 2025
0 of 2 checks passed
@jithunnair-amd jithunnair-amd deleted the xinyazhang/rocm7.0torch2.5-enable_mi350_testing branch July 2, 2025 23:35
@xinyazhang xinyazhang restored the xinyazhang/rocm7.0torch2.5-enable_mi350_testing branch July 7, 2025 23:06
pruthvistony pushed a commit that referenced this pull request Jul 8, 2025
Backport #2311

---------

Co-authored-by: Jeff Daily <jeff.daily@amd.com>
Co-authored-by: Ethan Wee <Ethan.Wee@amd.com>