
Which devices are even supported? (HIP/ROCm) #1714

Closed

samuelpmish opened this issue Mar 25, 2022 · 82 comments
@samuelpmish

I'm a long-time CUDA developer looking to explore ROCm and HIP development, but finding out which hardware even supports these tools is harder than it needs to be.

Let's see... this repo's readme has a section on "Supported GPUs":

[screenshot: the readme's "Supported GPUs" section]

Okay, "extends" implies it supports other GPUs too-- which ones? Maybe the FAQ has more info:

[screenshot: the FAQ's list of supported NVIDIA cards]

Nope, it'll tell me all of the NVIDIA cards that work, but none of the AMD ones, apparently. Okay, I guess I'll look at their HIP Programming Guide PDF. Skimming the table of contents, there's no indication of "supported GPUs". It's a 100-page document; surely they don't expect a user to read all of that just to see if a card works or not? Let's try searching instead:

CTRL+F "supported GPU": zero results
CTRL+F "supported platform": zero results
CTRL+F "supported device": zero results

okay..

CTRL+F "supported": 87 results, great. Going through them one by one, I guess. First 76 results unrelated, 77 is the closest thing I can find:

[screenshot: search result 77, a deprecated compiler option taking a gpu_arch value]

This sounds sort of related to what I'm looking for, although it's deprecated, so the options for gpu_arch are probably out of date. I would like to know what HIP currently supports, so let's look at the documentation for the option --offload-arch=<target>:

[screenshot: the --offload-arch=<target> documentation]

Okay, the documentation doesn't actually explain anything at all, it just links to something. I might have wasted a lot of my time getting here, but finally, a link with an answer to my simple question:

https://clang.llvm.org/docs/ClangOffloadBundlerFileFormat.html#target-id

Ah, of course-- the link is also broken. Maybe try:

https://clang.llvm.org/docs/ClangOffloadBundlerFileFormat.html

No, also broken.

Forgive the sarcastic tone of this issue, but am I an idiot or is this documentation just abysmal?

If I want to know which NVIDIA GPUs support CUDA, and which features, all of that information is readily available in many places, e.g.

https://developer.nvidia.com/cuda-gpus

I've been looking for an hour and found nothing official about AMD support for HIP, so I quit. Hopefully creating a GitHub issue will lead to an answer to this trivial question.

@Rmalavally
Contributor

@samuelpmish We are sorry you were unable to find the information you need on the documentation portal. Please refer to the ROCm Installation Guide and the latest version of the ROCm Release Notes (v5.0), and let us know if they were helpful.

If there's specific information you need, please let me know, and I am happy to help.

AMD ROCm Documentation Team

@samuelpmish
Author

> Please refer to the ROCm Installation Guide ...

https://docs.amd.com/bundle/ROCm_Installation_Guidev5.0/page/Overview_of_ROCm_Installation_Methods.html

This does not contain any information about which devices support ROCm or HIP.

> and the latest version of the ROCm Release Notes (v5.0)

https://docs.amd.com/bundle/ROCm_Release_Notes_v5.0/page/About_This_Document.html

Thank you, this document does indicate that there are seven GPUs that support ROCm: Instinct (MI50, MI60, MI100, MI200) and Pro (VII, W6800, V620).

Does this imply that all other AMD GPUs do not support ROCm? All of the products indicated above have multi-thousand-dollar price tags and/or are not even being manufactured.

> If there's specific information you need, please let me know, and I am happy to help.

The original question was specific: which AMD GPUs support ROCm and/or HIP?

@ffleader1

ffleader1 commented Mar 25, 2022

> Thank you, this document does indicate that there are seven GPUs that support ROCm: Instinct (MI50, MI60, MI100, MI200) and Pro (VII, W6800, V620). [...] The original question was specific: which AMD GPUs support ROCm and/or HIP?

I tried an AMD Vega 64 and it works, so at least there is that. What I do want to figure out: if Navi 21 is supported, then what prevents Navi 22 from being supported? Is something like the 6700 XT supported, even unofficially?

@mark-decker

mark-decker commented Mar 28, 2022

The list of supported GPUs is also found here, in the prerequisite actions document. Even there, though, it does not specify whether other GPUs based on the same architecture are supported.

@ffleader1

ffleader1 commented Mar 28, 2022

> The list of supported GPUs is also found here, in the prerequisite actions document. [...]

It does not even list all supported GPUs. I have a Vega 64 and I can confirm it works.

@bernharl

It works on my RX 6800 XT now. AMD should really add an "unsupported but works" category to their list of supported devices.

@ffleader1

ffleader1 commented Mar 28, 2022

> It works on my RX 6800 XT now. AMD should really add an "unsupported but works" category to their list of supported devices.

Yeah, I think so too. The point of a document is to make things clear. It seems AMD is trying hard to do the exact opposite; it seems the company really does not want "casual" Radeon users to know that their card can work, for some reason.

Anyway, does that mean a 6800 non-XT should work too? Because I am thinking of getting one.

@ye-luo

ye-luo commented Mar 28, 2022

Here is my understanding. ROCm is a software suite with compilers, runtime libraries, accelerated numerical libraries, AI-related libraries and more. "Support" simply means the given hardware is validated at AMD with the whole ROCm stack.

a) Technically, the compiler likely works for all the GPUs listed at https://llvm.org/docs/AMDGPUUsage.html. This means compiling/linking, not necessarily running the code.
b) The runtime library depends on the GPU driver and hardware compatibility.
c) The accelerated numerical libraries and AI-related libraries depend on whether the binaries shipped with ROCm contain the needed GPU architecture. It is very likely that nothing beyond the "support" list works, but you are still free to compile from source for the needed architecture.

Users may need just a subset of the stack for their purposes. That is why some ROCm-"unsupported" hardware works in limited scopes. Since the scope is on a per-user basis, it is not meaningful to list "unsupported but works".
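As a rough way to check point (c) yourself, you can look at which gfx targets are embedded in a shipped library binary. A heuristic sketch only, assuming a default /opt/rocm install (the path varies across ROCm versions); device code objects usually carry their target names as plain strings:

```sh
# Heuristic: list the gfx architectures embedded in a shipped ROCm library.
# Adjust the path for your ROCm version/layout.
strings /opt/rocm/lib/librocblas.so | grep -oE 'gfx[0-9a-f]+' | sort -u
```

If your GPU's architecture does not show up, the prebuilt kernels will not run on it, regardless of what the compiler itself can target.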

@samuelpmish
Author

Thanks to @mark-decker and @ye-luo for linking some relevant documentation to shed light on this issue.

I still wish someone official would weigh in, rather than having us speculate about what actually works and what doesn't. I agree that "unsupported but works" is sort of a meaningless idea; perhaps "untested" would be more accurate. If it is the case that some of the libraries in the stack do not support certain cards, then AMD should at least communicate that, rather than being ambiguous about it.

e.g. (NOTE: this table is for illustration only, it does not reflect what actually works and what doesn't)

| GPU     | hipSparse | hipSolver | rocFFT | rocBLAS | rocThrust |
| ------- | --------- | --------- | ------ | ------- | --------- |
| gfx801  | ✔️        | ❌        | ✔️     | ❌      | ❌        |
| gfx802  | ❌        | ❌        | ❔     | ✔️      | ❌        |
| gfx803  | ❔        | ❔        | ✔️     | ❔      | ❌        |
| gfx1010 | ✔️        | ✔️        | ❔     | ❔      | ✔️        |
| gfx1030 | ❌        | ❔        | ✔️     | ❔      | ✔️        |

✔️ : confirmed to work
❔ : untested
❌ : not working

Something like the above needs to be front and center on the documentation, if it is the case that the library support is so limited.

@ffleader1

> I still wish someone official would weigh in [...] Something like the above needs to be front and center on the documentation, if it is the case that the library support is so limited.

What is more interesting to me is why gfx1030 works but gfx1031 does not. It was not the case with Polaris. It was not the case with Vega. The cut-down version works just fine.

It seems to me that AMD is trying hard to limit the ROCm tooling to high-end/professional-grade products. Meanwhile, Nvidia has a 3060 with 12GB of VRAM, bringing ML to everyone.

It is a shame really.

@Bengt

Bengt commented Mar 29, 2022

I think there is a distinction to be made between "working" and "supported". That is, a GPU might seemingly work but have subtle bugs (e.g. correctness issues). AMD might choose not to be bothered with bug reports about older cards in this state (e.g. gfx803). I would suggest considering these cards working with known issues, yet unsupported. On the other hand, as a prospective buyer I want to know which cards AMD commits some amount of attention to. For example, the W6800 is currently supported, so if one buys that card today, one should reasonably expect any reported issues with it to be honored on this issue tracker within its useful lifetime.

This consideration necessitates a fourth category:

✔️ supported: issues being honored
⚙️ working: maybe with known issues
❔ untested: or not tested rigorously
❌ dysfunctional: tested and found broken

@Bengt

Bengt commented Mar 29, 2022

Adding to the list of unhelpful information, there is also this two-year-old - ehm - gem of an outdated document to add confusion:

https://github.com/ROCm/ROCm.github.io/blob/master/hardware.md

@FCLC

FCLC commented Mar 31, 2022

It's unfortunate, but official replies can be hard to come by at times, especially regarding support for hardware.

A small subset of the issues that received either vague or no official answers: #1706 #1694 #1683 #1676 #1617 #1623 #1631 #1595 #1592 #1544 #1547 #1539

When timelines have been given/set, they've been missed every time that I'm aware of.

RDNA1 is nearly 3 years in market (launch was July 7, 2019), but the workstation card still has no support in the stack.

With the Frontier supercomputer now behind schedule with its software stack, I'm expecting engineering resources that would have been allocated to RDNA1/RDNA2 support to be redirected towards improving CDNA2.

See https://insidehpc.com/2022/03/oak-ridge-frontier-exascale-to-deliver-full-user-operations-on-jan-1-2023-crusher-test-system-now-running-code/ for more information on the Frontier delay.

@642258387b

My graphics card is a 6800 XT. I tried to install ROCm 5.1 and PyTorch; torch.cuda.is_available() returns true, but an error about HIP is reported when running. However, there is no problem when I run the training in the packaged Docker image. I don't know how to solve the problem or how to configure PyTorch in my local environment.

@ffleader1

> My graphics card is a 6800 XT. I tried to install ROCm 5.1 and PyTorch; torch.cuda.is_available() returns true, but an error about HIP is reported when running. [...]

What is the error? Nothing was shown to you?

@littlewu2508

Actually, the HIP/clang compiler supports many GPUs. When ROCm 4.3 was released, I added gfx1031 to the source code of Tensile, rocBLAS, rocFFT, MIOpen, etc. Although there are test failures (in particular, rocPRIM cannot compile its test suite), PyTorch and TensorFlow successfully run on the RX 6700 XT. So I suggest the supported range is actually not restricted to the officially supported chips.

With help from ROCm developers, Navi 22-enabled rocBLAS is distributed on Gentoo, and I expect gfx1031 in other packages can be enabled more easily.
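For reference, rebuilding one of the math libraries for an extra architecture looks roughly like the following. This is a sketch under the assumption that a working ROCm compiler is installed; the -a/--architecture flag of rocBLAS's install.sh may differ between releases:

```sh
# Sketch: rebuild rocBLAS with gfx1031 (Navi 22) added to the target list.
git clone https://github.com/ROCmSoftwarePlatform/rocBLAS
cd rocBLAS
# -d installs build dependencies; -a selects the GPU architecture(s) to build for.
./install.sh -d -a gfx1031
```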

@ffleader1

ffleader1 commented Apr 1, 2022

> Actually, the HIP/clang compiler supports many GPUs. [...] With help from ROCm developers, Navi 22-enabled rocBLAS is distributed on Gentoo, and I expect gfx1031 in other packages can be enabled more easily.

Well, yes, but the problem is that the amount of tinkering required to make, say, a 6700 XT work may be a lot. Assume I am a casual student who games on Windows but wants to dabble in ML. Not only do I have to install a completely new OS, I need to figure out the many tricks of Ubuntu/Linux to install the 6700 XT and make it run PyTorch... Or I can just get an Nvidia card, and it "just works" on Windows. If you think about it, ROCm user-friendliness is like 10 steps behind Nvidia.

Is there any "it just works" guide for installing ROCm to run TF/PyTorch on a 6700 XT? If not, that is a huge problem.

@wsippel

wsippel commented Apr 3, 2022

Yeah, ROCm absolutely needs a proper support matrix and a strong public commitment from AMD to get as many GPUs supported as possible, as quickly as possible. According to two AMD engineers, ROCm actually supports pretty much every GPU since Polaris to varying degrees. rocm-opencl, for example, should work on everything since Vega, while HIP should work on every GPU since Polaris (but has apparently seen very little testing on older chips). It's also a chicken-and-egg problem: there's really not much software to test with in the first place, and the limited official support makes ROCm not very attractive to developers. Looking at the seven officially supported cards would do little to convince most devs to target ROCm.

@ffleader1

ffleader1 commented Apr 3, 2022

> Yeah, ROCm absolutely needs a proper support matrix and a strong public commitment from AMD to get as many GPUs supported as possible, as quickly as possible. [...]

Well, if someone has to take a bet, it has to be AMD. You can't win a war if you do not burn some money. As a programmer myself, I would say AMD is hesitant to burn more R&D budget on ROCm than they already have, thus creating this unfinished product called ROCm that works with every card, except 50% of the cards, and every time, except 50% of the time.

Go big or go home does apply here, and I believe Intel is very much willing to chew away this market from Nvidia as well.

My opinion means shit of course, but maybe expand the ROCm budget, both technical and marketing. Hire more programmers, sure, but also give out free/discounted AMD GPUs to academic institutions, and create competitions like "beta ML with AMD" or something to both hunt bugs and make progress with ROCm. More people in, more data for devs to work with across GPUs, a more polished product, and so on... And also, please, freaking make ROCm work on Windows. Treat ROCm as a product, not a tool.

Well, just my two cents of BSing. I do want to support AMD/ROCm, but I would love not to pay scalper money for a lackluster ML GPU that is not even "officially" supported on paper.

@Niko-1118

> > My graphics card is a 6800 XT. I tried to install ROCm 5.1 and PyTorch [...]
>
> What is the error? Nothing was shown to you?

After exploring for a few days, I think I know the reason. According to the official website documentation, I need to download the source code of torch and compile a version suitable for my hardware in my local environment. I failed at this step because I am a Linux novice, but it doesn't matter; it's more convenient to use the Docker images, and local deployment was just my obsessive-compulsiveness. Finally, thank you.

@dbenedb

dbenedb commented Apr 24, 2022

> (hardware/software table)
>
> Something like the above needs to be front and center on the documentation, if it is the case that the library support is so limited.

Couldn't agree more. Also: clear categories for HPC, workstation/prosumer and consumer hardware.

@emirkmo

emirkmo commented May 6, 2022

> It works on my RX 6800 XT now. AMD should really add an "unsupported but works" category to their list of supported devices.

The box of the RX 6800 XT literally advertises something that's not officially supported. Why is there no word about whether it's officially supported?

@saadrahim
Member

saadrahim commented May 17, 2022

Navi1x GPU support will not be available in ROCm. My apologies for the delays in confirming this.

AMD GPU support is based on ISA architectures. We officially support two Navi 21 GPUs that use the gfx1030 architecture: the Radeon Pro V620 and Radeon Pro W6800. However, if you look at https://github.com/ROCmSoftwarePlatform/rocBLAS/blob/be030feb91fff8d6d2b4409153fe549b81237580/CMakeLists.txt#L113-L118, our code only incorporates GPU support based on the ISA architecture. The model name only impacts official support. As a result, you can be confident that the Radeon RX 6800, Radeon RX 6800 XT and Radeon RX 6900 XT run on a stack that has undergone full QA verification of the ISA code generated for this GPU architecture. Of course, at the moment no official support is promised for the consumer GPUs. And performance optimizations for the supported GPUs may not carry over to the unsupported gfx1030 GPUs due to minor hardware differences.
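To check which ISA your own card reports, rocminfo from the ROCm runtime prints the gfx name of each GPU agent; a quick sanity-check sketch (the exact output formatting varies by release):

```sh
# Print the gfx ISA names of the GPU agents visible to the ROCm runtime.
rocminfo | grep -E 'Name:\s+gfx'
```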

Going forward, the lack of clarity on GPU support will be addressed. Please be patient and continue to report issues.

Bengt added a commit to Bengt/ROCm that referenced this issue May 17, 2022
There was a discussion about better documenting the GPU support status:

ROCm#1714

This pull request makes an attempt on documenting the latest official statement on the matter by @saadrahim:

ROCm#1714 (comment)
@Bengt

Bengt commented May 17, 2022

@saadrahim, thanks for clarifying the matter. I created a pull request documenting the current state of unofficial support in the README. Would you please extend your statement to the recently released "50" variants of cards? The AMD Radeon RX 6950 XT also uses the gfx1030 ISA and should therefore also be unofficially supported, right?

@wsippel

wsippel commented May 17, 2022

I successfully use HIP and rocm-opencl on a 5700 XT, so RDNA1 evidently works, even if it's not officially supported. AMD's own recently released HIP-RT officially supports Vega 1, Vega 2, RDNA1 and RDNA2, and runs on ROCm, which officially supports only one of those GPU generations. There appears to be a lot of confusion on AMD's side about what "supported" means and what ROCm even is in the first place.

@cgmb
Collaborator

cgmb commented Aug 23, 2023

It's somewhat off-topic, but folks may also be interested in Debian's Supported GPU List for their ROCm packages.

@FCLC

FCLC commented Aug 23, 2023

Briefly jumping in:

A few factors will dictate whether you can run a model:

  1. Software support (a binary yes or no)
  2. Quantization level (is the model quantized? If so, to what level, and does the HW/SW stack in question support that level?)
  3. Parameter count

Assuming you run a 3B model at int8 quantization, that's 3 GB of model data in VRAM. Add some margin for pointers, math and so on (context/tokens for a 1k length can be ~500 MB) and you're at 3.5 GB.

Don't forget you also have a desktop environment to run.

In essence, a 3B model can barely fit on a 4 GB card, but it will fit (depending on your setup). Navi 10 GPUs (5600 XT, 5700, 5700 XT) all ship with 6 to 8 GB of VRAM.
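The arithmetic above, spelled out as a quick sketch (the 500 MB overhead is the rough margin mentioned above, not a measured figure):

```sh
# Back-of-the-envelope VRAM estimate: 3B parameters at int8 (1 byte each).
params=3000000000        # 3B parameters
bytes_per_param=1        # int8 quantization
overhead_mb=500          # rough margin: ~1k tokens of context, pointers, etc.
model_mb=$(( params * bytes_per_param / 1000000 ))
echo "total: $(( model_mb + overhead_mb )) MB"   # ~3500 MB, before the desktop's own VRAM use
```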

@fredi-python

> It's somewhat off-topic, but folks may also be interested in Debian's Supported GPU List for their ROCm packages.

What architecture does the RX 5700 XT use?

@cgmb
Collaborator

cgmb commented Aug 24, 2023

The RX 5700 XT is gfx1010.

@fredi-python

Thanks for the info!

@grigio

grigio commented Aug 28, 2023

> It's somewhat off-topic, but folks may also be interested in Debian's Supported GPU List for their ROCm packages.

I don't understand why Debian has to list the AMD GPUs supported by ROCm, and not AMD officially.

@FCLC

FCLC commented Aug 28, 2023

Because when a HW vendor says something is supported, they can be taken to task for it when it breaks.

When open source gets something sort of working/running, there's a broader understanding of what that does and doesn't mean.

It's the same reason you'll always see the WS/enterprise cards supported "first" by a vendor: the support surface area for specific applications being problematic is much smaller.

@yacc143

yacc143 commented Aug 28, 2023 via email

@fredi-python

So, I got a used RX 5700 XT from eBay and want to get things running.
I have Arch Linux and Debian testing.
I guess Debian works better than Arch Linux in the case of ROCm.
So I want to run inference on some LLMs with the transformers library; what are the first steps to take?

@cgmb
Collaborator

cgmb commented Sep 23, 2023

> So I want to run inference on some LLMs with the transformers library; what are the first steps to take?

You mean huggingface/transformers? It seems to depend on PyTorch, TensorFlow or JAX.

The RX 5700 XT is gfx1010. It is not officially supported by ROCm. To my knowledge, only Debian is building the ROCm math libraries for that architecture. However, Debian has not yet packaged miopen or pytorch-rocm.

You can use the Debian packages for most of the ROCm libraries, but would need to extend MIOpen and PyTorch with support for gfx1010, then build them from source.
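Building PyTorch for an extra architecture is roughly the following sketch. It assumes the PYTORCH_ROCM_ARCH variable and the hipify step from PyTorch's own ROCm build instructions, which may change between releases; MIOpen would need a similar source build first:

```sh
# Sketch: build PyTorch from source with gfx1010 kernels (unofficial, untested here).
git clone --recursive https://github.com/pytorch/pytorch
cd pytorch
python3 tools/amd_build/build_amd.py          # "hipify" the CUDA sources for ROCm
PYTORCH_ROCM_ARCH=gfx1010 python3 setup.py install
```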

@fredi-python

> You mean huggingface/transformers? It seems to depend on PyTorch, TensorFlow or JAX. [...] You can use the Debian packages for most of the ROCm libraries, but would need to extend MIOpen and PyTorch with support for gfx1010, then build them from source.

Yes, the huggingface transformers library, exactly.
If Ubuntu works better with ROCm I could also install that; it seems to be quite tricky on Debian.

And another question:
How well does the RX 6650 XT perform against the RX 5700 XT in ML tasks?
And is the RX 6650 XT easier to set up with ROCm?
I am a bit confused about which gfx the 6650 has, as I can't find it in https://llvm.org/docs/AMDGPUUsage.html#processors

@cgmb
Collaborator

cgmb commented Sep 23, 2023

> How well does the RX 6650 XT perform against the RX 5700 XT in ML tasks?

I don't know.

> I am a bit confused about which gfx the 6650 has

It is Navi 23 and is therefore gfx1032.

> And is the RX 6650 XT easier to set up with ROCm?

Neither is officially supported, but the gfx1032 ISA is identical to the gfx1030 ISA. It can probably be made to work by setting the environment variable HSA_OVERRIDE_GFX_VERSION=10.3.0 and using the official binaries. Navi 21 GPUs [RX 6800 / RX 6800 XT / RX 6900 XT / RX 6950 XT] require less fiddling as they are already gfx1030.

As one of the members of the Debian AI team working on packaging this stuff, I think you can expect improvements for all RDNA cards over the next year as we've nearly finished packaging the ROCm math libraries and are moving on to packaging the AI libraries. For the most part, it has not been very difficult to extend basic functionality to all discrete AMD GPUs as we've prepared the packages.
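As a concrete sketch of the HSA_OVERRIDE_GFX_VERSION approach mentioned above (an unsupported configuration, so your mileage may vary):

```sh
# Present a gfx1032 card to the ROCm runtime as gfx1030 (identical ISA).
export HSA_OVERRIDE_GFX_VERSION=10.3.0
# Quick check that PyTorch's ROCm build now sees the device:
python3 -c "import torch; print(torch.cuda.is_available(), torch.cuda.get_device_name(0))"
```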

@fredi-python

fredi-python commented Sep 24, 2023

For AI stuff and some gaming, would the RTX 3060 be the best price-to-performance option?
I have a feeling there is no real mid-range card that works very well with ROCm. If there is, please tell me.

@muziqaz

muziqaz commented Sep 24, 2023

> For AI stuff and some gaming, would the RTX 3060 be the best price-to-performance option? I have a feeling there is no real mid-range card that works very well with ROCm. If there is, please tell me.

For AI, yes; for gaming, no. But we are getting off topic here.
It is safe to assume that a generation which has GPUs supported by ROCm (RDNA3) will work throughout all GPUs from that generation, so if the 6900 XT is supported, it is safe to assume the RX 6600 will also work. The reason we are not seeing all GPUs marked as supported is that those who are deploying ROCm to the desktop space do not have the time (or means) to run all the GPUs on the market to tick the box that they are supported. We managed to run HIP on Windows on Ryzen 7000 iGPUs and the RX 550, as well as the 7900 XTX, 6900 XT, RX 6600, Radeon VII, 5700 XT, etc.

@littlewu2508

> For AI stuff and some gaming, would the RTX 3060 be the best price-to-performance option? [...]

For RDNA3, optimization is still ongoing; e.g. ROCm/rocBLAS@247d4a9 is still in the develop branch and has not entered any release yet. Without that optimization you will get poor FP32 performance (ROCm/Tensile#1715). However, its FP16 and FP32+=FP16*FP16 mixed-precision performance already looks good. So wait for optimizations for RDNA3.

@fredi-python

Now with the release of ROCm 5.7.1, does only the RX 7900 XTX work, or do GPUs like the RX 7600 work too?

@muziqaz

muziqaz commented Oct 20, 2023

> Now with the release of ROCm 5.7.1, does only the RX 7900 XTX work, or do GPUs like the RX 7600 work too?

I am very confident that AMD did not remove any previously supported GPUs, nor those we managed to get working. It would be extremely counterproductive to wipe all the previous support and start from zero.
This "support", by the way, is just validation.

@tperka

tperka commented Dec 22, 2023

I think this thread is a good place to ask: I'm a daily Linux user and an ML student. I'd like to buy myself an RX 6700 XT for Christmas. Has anyone made it work with PyTorch on Linux with ROCm 5.6/5.7? Is the performance of this GPU better than an RTX 3060, or does the lack of official support for Linux slow it down in any way?

I'm fine with tinkering, just curious if it's even possible before buying.

@littlewu2508

> I think this thread is a good place to ask: I'm a daily Linux user and an ML student. I'd like to buy myself an RX 6700 XT for Christmas. Has anyone made it work with PyTorch on Linux with ROCm 5.6/5.7? [...]

On Linux, beyond the rocr-runtime/HSA level, you can get nearly the same level of support as the Pro W6800 (gfx1030), which is on the official support list, via the environment variable HSA_OVERRIDE_GFX_VERSION=10.3.0.

References:
#1756
ROCm/rocBLAS#1251
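To see whether the override is actually honored on a given system, comparing what the runtime reports with and without it is a quick sanity check (a sketch; rocminfo output formatting varies by release):

```sh
# Native ISA as reported by the runtime (e.g. gfx1031 on an RX 6700 XT):
rocminfo | grep -m1 gfx
# With the override, the same query should report gfx1030:
HSA_OVERRIDE_GFX_VERSION=10.3.0 rocminfo | grep -m1 gfx
```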

@nix-wolf

nix-wolf commented Jan 9, 2024

> I think this thread is a good place to ask: I'm a daily Linux user and an ML student. I'd like to buy myself an RX 6700 XT for Christmas. Has anyone made it work with PyTorch on Linux with ROCm 5.6/5.7? [...]

I have two 6750 XT 12 GB cards, and they work pretty well. If you're having trouble once you've picked up a card, I have a list of maybe 25 commands that goes from a minimal RHEL install to running ML; pytorch/diffusers/transformers and such do work as well, nearly out of the box, just a little memory management needs to be done.

@capsicumw

> Neither is officially supported, but the gfx1032 ISA is identical to the gfx1030 ISA. [...] As one of the members of the Debian AI team working on packaging this stuff, I think you can expect improvements for all RDNA cards over the next year [...]

Is Debian going to be an officially supported HIP/ROCm distro?
It would be an amazing step up, since we can't expect long-term support for CentOS, and Debian is upstream for so many other distros. (And Debian stable has been my default distro for 6 years.)
The two Enterprise Linuxes aren't products well suited to single desktop users, and I refuse to use Ubuntu (too much forced "not invented here" breaking compatibility with the rest of Linux). openSUSE Leap would be a decent option too, but it is not on the official support list.

@capsicumw

Has AMD made any firm support-date commitments for officially supported cards?

I mean, Nvidia has demonstrated continued CUDA support for many cards that are nearly 10 years old, so their actions are proof enough.
Microsoft provides formal end-of-support dates for its various software, e.g. "Product ZYX will be supported at least through 2025 May 31st"; the same goes for most major Linux distributions.

I would like to avoid the proprietary CUDA garden and maybe program with OpenSYCL. But why should I gamble thousands of USD assembling a new machine when HIP/ROCm/Pro-driver support can be haphazardly pulled out from under me next month? And AMD has a bad habit of eliminating access to older versions of software/firmware that could be used on older systems. (My current system doesn't support PCIe atomics, so a new AMD card would mean a fresh build.)

On the CPU side AMD has shown excellent long-term support, but my experience on the GPGPU side has burned me twice due to poor/misleading marketing of features and compatibility.
One was a high-end workstation card (a W7000, way back in the Southern Islands era) and the other was a more cautious purchase of a consumer Polaris card where I only considered the implied GPGPU compute features as a side bonus.

@darkshvein

> The list of supported GPUs is also found here

404 - Page Not Found
Return home or use the sidebar navigation to get back on track.
))))

@darkshvein

> but am I an idiot or is this documentation just abysmal?

The same issue for me. I was searching for the support list for Blender and my RX 480, but the documentation is very ugly.

@cgmb
Collaborator

cgmb commented Feb 13, 2024

> The list of supported GPUs is also found here
>
> 404 - Page Not Found. Return home or use the sidebar navigation to get back on track. ))))

The list for ROCm 6.0.2 can be found at https://rocm.docs.amd.com/projects/install-on-linux/en/docs-6.0.2/reference/system-requirements.html#supported-gpus
