ROCm 5.xx ever planning to include gfx90c GPUs? #1743
Comments
Hi, @shridharkini6! Thanks for your request. Since I am not an employee at AMD, I have no insight into what is planned there internally. However, at least some amount of library coverage seems to be a prerequisite for extending the Docker images to this class of GPUs, which are integrated into the CPU (or an "APU" in AMD's lingo). However, I do not see any support for the gfx90c as a This aligns with what can be gathered from public sources, namely that AMD is focussing on the products which the hyperscalers or supercomputer customers are currently buying. I personally think this is fair enough, as those customers seem to be rather feature-sensitive. Starting from those high-profile customers, consider the following leaky pipe of support:
Things might change a bit with the Ryzen 7000 line of desktop processors, which are announced to include a chiplet-ish GPU in the IO die. Such an arrangement does not currently fit into this leaky support pipe, but I would also not hold my breath for any kind of revolution. My bet would be on support gradually improving, as it has (not without setbacks) in the past. |
I do not think supporting an APU is AMD's top priority when even the Navi 22 and Navi 23 are not supported. Also, AMD pulled the plug on APU support a long time ago. So, quite frankly, to answer your question: never.
@ffleader1 that's not such a clever move from AMD, because they have nothing positioned against Nvidia Jetson type of hardware. So we buy Nvidia APUs even though they're not very FOSS friendly.
Here is a workaround to run pytorch on gfx90c.
Note: |
You can also use the PyTorch Docker image on gfx90c. Just run it like this. @shridharkini6
Note: |
Maybe you have not tried it, but do you at least think your methods will work with unsupported GPUs, like gfx1031 for example?
You can try. Run it like this: $ HSA_OVERRIDE_GFX_VERSION=10.3.0 python3 main.py
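The value 10.3.0 in the command above looks like a major.minor.stepping rendering of the target name gfx1030. Below is a minimal sketch of that mapping in Python, assuming the last two characters of a gfx name are hexadecimal minor and stepping digits. The helper names `gfx_to_override` and `apply_override` are hypothetical and not part of ROCm or PyTorch.

```python
import os


def gfx_to_override(gfx_name: str) -> str:
    """Convert a target name such as 'gfx1030' into a 'major.minor.stepping'
    string for HSA_OVERRIDE_GFX_VERSION.

    Assumption: the last two characters are hexadecimal minor and stepping
    digits, and everything between 'gfx' and them is the decimal major.
    """
    if not gfx_name.startswith("gfx") or len(gfx_name) < 6:
        raise ValueError(f"unexpected target name: {gfx_name!r}")
    digits = gfx_name[3:]
    major = int(digits[:-2])
    minor = int(digits[-2], 16)
    stepping = int(digits[-1], 16)
    return f"{major}.{minor}.{stepping}"


def apply_override(gfx_name: str) -> str:
    """Set the override for the current process and return the value used.

    Assumption: this has to happen before the ROCm runtime initializes,
    so in practice before `import torch`.
    """
    value = gfx_to_override(gfx_name)
    os.environ["HSA_OVERRIDE_GFX_VERSION"] = value
    return value
```

Calling `apply_override("gfx1030")` at the very top of `main.py` would be roughly equivalent to prefixing the command line with the variable, under the assumptions stated above.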
wait I am a bit confused. |
1, Docker with PyTorch and ROCm installed |
I have not tried Docker, but for ROCm, I am pretty sure the install will only succeed if your GPU is supported, i.e. the ROCm installation will not work on a gfx1031 or lower.
@xfyucg I followed your methods, but it looks to me like training is using only the CPU, not the GPU.
It throws an error like:
Thanks |
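One quick way to tell whether training falls back to the CPU is to ask PyTorch which device it can see. This is a minimal sketch, assuming a ROCm build of PyTorch, which exposes AMD GPUs through the `torch.cuda` API. The function name `describe_torch_device` is hypothetical.

```python
def describe_torch_device() -> str:
    """Report which device PyTorch would use, or why no GPU is available."""
    try:
        import torch
    except ImportError:
        return "torch is not installed"
    # torch.version.hip is a version string on ROCm builds and None otherwise.
    hip_version = getattr(torch.version, "hip", None)
    if not torch.cuda.is_available():
        return f"cpu only: no GPU visible to PyTorch (hip={hip_version})"
    return f"gpu: {torch.cuda.get_device_name(0)} (hip={hip_version})"


print(describe_torch_device())
```

If this reports "cpu only" even with the override variable set, the problem is likely below PyTorch, in the driver or runtime.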
Run it like this; that works well on my Cezanne platform.
|
Tried this as well, but ended up with the same error.
Can you put the output of $ rocminfo here? |
ROCk module is loaded
HSA System Attributes
Runtime Version: 1.1
==========
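The part of the `rocminfo` output that matters for this thread is the gfx target name of each agent. Here is a minimal sketch that pulls those names out of the text, assuming they appear as tokens like `gfx90c`. The helper names are hypothetical.

```python
import re
import subprocess
from typing import List


def find_gfx_targets(rocminfo_text: str) -> List[str]:
    """Return the unique gfx target names found in the text, in first-seen order."""
    seen: List[str] = []
    for name in re.findall(r"\bgfx[0-9a-f]+\b", rocminfo_text):
        if name not in seen:
            seen.append(name)
    return seen


def run_rocminfo() -> str:
    """Run rocminfo and return its stdout, or '' if the tool is not installed."""
    try:
        result = subprocess.run(
            ["rocminfo"], capture_output=True, text=True, check=False
        )
    except FileNotFoundError:
        return ""
    return result.stdout


print(find_gfx_targets(run_rocminfo()))
```

An empty list means either `rocminfo` is missing or no GPU agent was reported.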
|
@shridharkini6 Are you using docker?
|
I have tried the same, using the rocm/pytorch:latest-base Docker image.
According to https://docs.amd.com/bundle/AMD-Deep-Learning-Guide-v5.1.3/page/Deep_Learning_Frameworks.html:
docker pull rocm/pytorch:latest-base
NOTE: This will download the base container, which does not contain PyTorch.
So please use rocm/pytorch:latest
|
His hardware is not supported, and neither is yours, I think. APUs in general do not work. Docker won't change an unsatisfied hardware prerequisite.
No, gfx90c uses the same ISA as gfx900. So for gfx90c, just override it to gfx900. That actually works.
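The exact commands posted in this thread did not survive, so the following is only a sketch of how the gfx900 override could be passed into the rocm/pytorch:latest container. The device and group flags are an assumption based on the usual ROCm container instructions, and `docker_run_command` is a hypothetical helper that only builds the argument list.

```python
import shlex
from typing import List


def docker_run_command(image: str = "rocm/pytorch:latest",
                       override: str = "9.0.0") -> List[str]:
    """Build a `docker run` argument list that exposes the GPU device nodes
    and sets HSA_OVERRIDE_GFX_VERSION inside the container."""
    return [
        "docker", "run", "-it", "--rm",
        "--device=/dev/kfd",      # ROCm compute interface (assumed required)
        "--device=/dev/dri",      # render nodes (assumed required)
        "--group-add", "video",   # group that typically owns the device nodes
        "-e", f"HSA_OVERRIDE_GFX_VERSION={override}",
        image,
    ]


# Print the command instead of executing it, so it can be reviewed first.
print(shlex.join(docker_run_command()))
```

The default of 9.0.0 corresponds to treating the gfx90c APU as gfx900, as suggested above.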
@xfyucg I have followed all the procedures you suggested, i.e. used rocm/pytorch:latest-base and compiled PyTorch from source, but I get the same error.
|
Maybe there are some environment issues; it's hard to debug. Building PyTorch yourself is error-prone.
@xfyucg Yes, I tried with rocm/pytorch:latest also. It throws similar errors. I guess it could be an issue with the base libraries, as @Bengt mentioned.
No. If you install and start Docker (rocm/pytorch:latest) correctly, you will get an error like the following.
After overriding gfx90c to gfx900:
|
Make sure the amdgpu kernel mode driver is installed. If you use a generic kernel on Ubuntu 20.04, install the amdgpu kernel mode driver as follows.
|
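Before reinstalling anything, it may help to check whether the amdgpu driver is already loaded. This is a minimal sketch, assuming a Linux system where a loaded module shows up under `/sys/module/amdgpu` and the ROCm compute interface appears as `/dev/kfd`. It does not reproduce the install commands from the comment above, and `amdgpu_status` is a hypothetical helper.

```python
import os
from typing import Dict


def amdgpu_status(sys_module_dir: str = "/sys/module/amdgpu",
                  kfd_node: str = "/dev/kfd") -> Dict[str, bool]:
    """Check for the loaded amdgpu module and the KFD device node.

    The default paths are assumptions about a typical Linux setup.
    """
    return {
        "amdgpu_module_loaded": os.path.isdir(sys_module_dir),
        "kfd_device_present": os.path.exists(kfd_node),
    }


print(amdgpu_status())
```

If either value is False, the container has no GPU to work with, regardless of which image or override is used.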
Try updating your system's kernel to a version newer than 6.0 and run the commands with the following environment variable set:
You can use I tested this on NixOS, branch 22.11, kernel 6.0.13 and latest rocm/pytorch container with a Ryzen 5600G. |
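To check the kernel requirement mentioned above, the running release can be compared against 6.0. This is a minimal sketch that treats "newer than 6.0" as "6.0 or later", which is an assumption based on the 6.0.13 kernel reported as working. The helper names are hypothetical.

```python
import platform
import re
from typing import Tuple


def kernel_major_minor(release: str) -> Tuple[int, int]:
    """Extract (major, minor) from a release string such as '6.0.13'."""
    match = re.match(r"(\d+)\.(\d+)", release)
    if match is None:
        raise ValueError(f"cannot parse kernel release: {release!r}")
    return int(match.group(1)), int(match.group(2))


def kernel_is_at_least(release: str,
                       minimum: Tuple[int, int] = (6, 0)) -> bool:
    """Return True if the release is at or above the minimum (major, minor)."""
    return kernel_major_minor(release) >= minimum


if platform.system() == "Linux":
    release = platform.release()
    print(release, kernel_is_at_least(release))
```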
CC @hongxiayang |
@shridharkini6 Hi, is your issue resolved on the latest ROCm? If so can we close this ticket? |
Is this still applicable to latest ROCm? |
@shridharkini6 Unfortunately your APU (gfx90c) is not currently supported in the latest ROCm. Thanks! |
Hi
The official Docker images for PyTorch and TensorFlow are available only for gfx900 (Vega10-type GPU - MI25, Vega56, Vega64), gfx906 (Vega20-type GPU - MI50, MI60), gfx908 (MI100), gfx90a (MI200) and gfx1030 (Navi21).
When is gfx90c support expected?
Thanks