Device Attributes are wrong for AMD GPUs #7676

Open
Electricks94 opened this issue Jun 30, 2023 · 1 comment

Description

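Fetching the hardware properties of AMD GPUs via cupy.cuda.Device.attributes does not work: the returned dict is incomplete, and most of its values are wrong.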

To Reproduce

import cupy as cp

device = cp.cuda.Device(0)  # first visible GPU
da = device.attributes      # dict mapping attribute name -> value

The code runs without errors; however, the dict da is incomplete (for example, MaxBlockDimX and MaxBlockDimY are missing), and most of the entries it does contain are wrong:

{'ClockRate': 1,
 'ConcurrentManagedAccess': 16384,
 'CooperativeLaunch': 1024,
 'CooperativeMultiDeviceLaunch': 1024,
 'GlobalMemoryBusWidth': 0,
 'IsMultiGpuBoard': 1,
 'KernelExecTimeout': 16384,
 'L2CacheSize': 0,
 'MaxBlockDimZ': 1,
 'MaxGridDimY': 1700000,
 'MaxGridDimZ': 0,
 'MaxRegistersPerBlock': 1,
 'MaxSharedMemoryPerMultiprocessor': 9,
 'MaxTexture1DWidth': 1024,
 'MaxTexture2DHeight': 2147483647,
 'MaxTexture2DWidth': 2147483647,
 'MaxTexture3DWidth': 2147483647,
 'MaxThreadsPerBlock': 0,
 'MaxThreadsPerMultiProcessor': 0,
 'MemoryPoolsSupported': 0,
 'PageableMemoryAccess': 16384,
 'PageableMemoryAccessUsesHostPageTables': 8192,
 'TotalConstantMemory': 1,
 'WarpSize': 1}
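
To make the problem easier to spot, a dict like the one above can be checked against a short list of expected names and plausible values. The following is only a minimal sketch and not part of CuPy: the helper check_attributes and the expected values are assumptions for illustration (a wavefront size of 64 and a limit of 1024 threads per block are what MI100/MI210-class GPUs normally report).

```python
# Hypothetical helper, not a CuPy API: flags missing or implausible entries
# in the dict returned by cupy.cuda.Device(...).attributes.

# Assumed expectations for an MI100/MI210-class GPU (illustrative only).
EXPECTED_KEYS = ("MaxBlockDimX", "MaxBlockDimY", "MaxBlockDimZ", "WarpSize")
EXPECTED_VALUES = {"WarpSize": 64, "MaxThreadsPerBlock": 1024}


def check_attributes(attrs):
    """Return a list of human-readable problems found in ``attrs``."""
    problems = []
    # Report attribute names that should be present but are not.
    for key in EXPECTED_KEYS:
        if key not in attrs:
            problems.append(f"missing: {key}")
    # Report present attributes whose value differs from the assumed one.
    for key, expected in EXPECTED_VALUES.items():
        if key in attrs and attrs[key] != expected:
            problems.append(f"unexpected: {key}={attrs[key]} (expected {expected})")
    return problems


# A few entries copied from the output reported above.
reported = {"MaxBlockDimZ": 1, "MaxThreadsPerBlock": 0, "WarpSize": 1}
for line in check_attributes(reported):
    print(line)
```

On the affected machine, passing da directly to check_attributes would give the same kind of report for the full dict.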

Installation

None

Environment

OS                        : Linux-5.15.0-75-generic-x86_64-with-glibc2.31
Python Version            : 3.10.9
CuPy Version              : 13.0.0a1
CuPy Platform             : AMD ROCm
NumPy Version             : 1.23.5
SciPy Version             : 1.10.0
Cython Build Version      : 0.29.35
Cython Runtime Version    : None
CUDA Root                 : /opt/rocm-5.4.3
hipcc PATH                : /opt/rocm-5.4.3/bin/hipcc
CUDA Build Version        : 50422804
CUDA Driver Version       : 50422804
CUDA Runtime Version      : 50422804
cuBLAS Version            : (available)
cuFFT Version             : 10021
cuRAND Version            : 201009
cuSOLVER Version          : (3, 20, 0)
cuSPARSE Version          : (available)
NVRTC Version             : (9, 0)
Thrust Version            : 101600
CUB Build Version         : 201012
Jitify Build Version      : None
cuDNN Build Version       : None
cuDNN Version             : None
NCCL Build Version        : 21304
NCCL Runtime Version      : 21304
cuTENSOR Version          : None
cuSPARSELt Build Version  : None
Device 0 Name             : AMD Instinct MI210
Device 0 Arch             : gfx90a:sramecc+:xnack-
Device 0 PCI Bus ID       : 0000:27:00.0
Device 1 Name             : AMD Instinct MI100
Device 1 Arch             : gfx908:sramecc+:xnack-
Device 1 PCI Bus ID       : 0000:e4:00.0

Additional Information

This was tested with ROCm 5.4.3 on a machine containing an MI100 and an MI210. The hardware information is wrong for both GPUs. Tests with ROCm 5.5.0 showed similar issues.
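
One possible way to narrow this down is to compare Device.attributes with the dict returned by cupy.cuda.runtime.getDeviceProperties, which queries the device through a different runtime call. The sketch below is an assumption-laden illustration, not something taken from this report: the helper names diff_attributes and diff_for_device are hypothetical, and the mapping assumes the property dict uses keys such as warpSize, maxThreadsPerBlock, clockRate, and totalConstMem.

```python
# Assumed mapping from Device.attributes names to getDeviceProperties keys.
ATTR_TO_PROP = {
    "WarpSize": "warpSize",
    "MaxThreadsPerBlock": "maxThreadsPerBlock",
    "ClockRate": "clockRate",
    "TotalConstantMemory": "totalConstMem",
}


def diff_attributes(attrs, props):
    """Return {name: (attribute value, property value)} where the two differ."""
    mismatches = {}
    for attr_name, prop_name in ATTR_TO_PROP.items():
        attr_value = attrs.get(attr_name)  # None if the entry is missing
        prop_value = props.get(prop_name)  # None if the key is missing
        if attr_value != prop_value:
            mismatches[attr_name] = (attr_value, prop_value)
    return mismatches


def diff_for_device(device_id=0):
    """Collect both views of one device; requires CuPy and a visible GPU."""
    # Imported lazily so diff_attributes can be used without a GPU.
    import cupy as cp

    attrs = cp.cuda.Device(device_id).attributes
    props = cp.cuda.runtime.getDeviceProperties(device_id)
    return diff_attributes(attrs, props)
```

Calling diff_for_device(0) on the affected machine would list the entries that disagree; an empty dict would mean the two sources agree for the mapped names.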

Electricks94 added the cat:bug (Bugs) label Jun 30, 2023

takagi (Member) commented Jul 3, 2023

@AdrianAbeyta Would you check this?
