
Conversation


@xinyazhang commented on Jul 8, 2025

Backport of #2311

jeffdaily and others added 2 commits July 8, 2025 10:41
…:warp_size() (#2293)

Fixes SWDEV-540240, SWDEV-540309, SWDEV-539989

```
...
```

Commit 80cca70 created a static global variable that used `at::cuda::warp_size()` to initialize its value, which requires GPUs to be visible in order to query device properties. However, GPUs are not present on CPU-only build systems.

Convert the static variable into a static function, so the device query no longer runs during static initialization. A sketch of the pattern follows.
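
A minimal C++ sketch of the before/after pattern described above, assuming `at::cuda::warp_size()` is declared in `ATen/cuda/CUDAContext.h`; the names used here (`cached_warp_size`, `kWarpSize`) are illustrative and not taken from the actual patch:

```cpp
// Illustrative sketch only; identifiers are hypothetical and do not match
// the real PyTorch source. Assumes at::cuda::warp_size() is available via
// ATen/cuda/CUDAContext.h.
#include <ATen/cuda/CUDAContext.h>

// Before: a namespace-scope static is initialized when the library loads,
// so device properties are queried before any GPU is guaranteed to be
// visible (e.g. on a CPU-only build machine):
//
//   static const int kWarpSize = at::cuda::warp_size();

// After: wrapping the query in a function defers it to the first call,
// by which point the caller is actually running on a machine with a GPU.
static int cached_warp_size() {
  static const int kWarpSize = at::cuda::warp_size();  // lazily initialized on first use
  return kWarpSize;
}
```

Function-local statics are initialized on first use (and thread-safely since C++11), so the device query happens at most once and only when the code path is actually exercised.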

http://rocm-ci.amd.com/job/pyt_whl_docker_mainline/1461/artifact/build_artifacts.txt/*view*/

Ran microbenchmark to confirm basic functionality:
```
root@ubb4-rack-22:/var/lib/jenkins/pytorch-micro-benchmarking# python3 micro_benchmarking_pytorch.py --network resnet50
INFO: running forward and backward for warmup.
INFO: running the benchmark..
OK: finished running benchmark..
--------------------SUMMARY--------------------------
Microbenchmark for network : resnet50
Num devices: 1
Dtype: FP32
Mini batch size [img] : 64
Time per mini-batch : 0.10158218145370483
Throughput [img/sec] : 630.0317544289736
```
@xinyazhang changed the title from "Xinyazhang/rocm7.0torch2.4 enable mi350 testing" to "[release/2.5] Fix the Build on ROCM 7.0" on Jul 8, 2025
@xinyazhang marked this pull request as ready for review on July 8, 2025 15:44
@xinyazhang changed the title from "[release/2.5] Fix the Build on ROCM 7.0" to "[release/2.4] Fix the Build on ROCM 7.0" on Jul 8, 2025
@jithunnair-amd changed the title from "[release/2.4] Fix the Build on ROCM 7.0" to "[release/2.4] Fix the PyTorch build on ROCM 7.0" on Jul 8, 2025
@pruthvistony merged commit a0e5785 into release/2.4 on Jul 8, 2025
0 of 2 checks passed
@pruthvistony deleted the xinyazhang/rocm7.0torch2.4-enable_mi350_testing branch on July 8, 2025 21:47