Is there any support for AMD GPU (ROCM) #2540

Closed
mdrokz opened this issue Aug 7, 2023 · 28 comments · Fixed by #1087

@mdrokz
Contributor

mdrokz commented Aug 7, 2023

Hi, I was wondering if there is any support for using llama.cpp with an AMD GPU. Is there a ROCm implementation?

@MichaelDays

There’s a ROCm branch that hasn’t been merged yet, but is being maintained by the author.

https://github.com/ggerganov/llama.cpp/pull/1087/commits

@aiaicode

aiaicode commented Aug 8, 2023

Have you tried this? https://github.com/ggerganov/llama.cpp#clblast

To check whether you have CUDA support via ROCm, do the following:

$ python
>>> import torch
>>> torch.cuda.is_available()

Output: True or False

If it's True, then you have the right ROCm and PyTorch installed and things should work. At least for Stable Diffusion, that's how you check and make it work.

If it's False, then you need to check whether your GPU has CUDA support or not. You can see the AMD OpenCL supported devices list here: https://en.wikipedia.org/wiki/OpenCL#Devices

@mdrokz
Contributor Author

mdrokz commented Aug 8, 2023

There’s a ROCm branch that hasn’t been merged yet, but is being maintained by the author.

https://github.com/ggerganov/llama.cpp/pull/1087/commits

Oh, any idea why it hasn't been merged yet?

@mdrokz
Contributor Author

mdrokz commented Aug 8, 2023

Have you tried this? https://github.com/ggerganov/llama.cpp#clblast

To check whether you have CUDA support via ROCm, do the following:

$ python
>>> import torch
>>> torch.cuda.is_available()

Output: True or False

If it's True, then you have the right ROCm and PyTorch installed and things should work. At least for Stable Diffusion, that's how you check and make it work.

If it's False, then you need to check whether your GPU has CUDA support or not. You can see the AMD OpenCL supported devices list here: https://en.wikipedia.org/wiki/OpenCL#Devices

llama.cpp doesn't use torch, since it's a custom implementation, so that won't work. Stable Diffusion uses torch by default, and torch supports ROCm.

@SlyEcho
Sponsor Collaborator

SlyEcho commented Aug 8, 2023

Oh, any idea why it hasn't been merged yet?

Soon, I just haven't had time recently to work on it.

@SlyEcho SlyEcho linked a pull request Aug 8, 2023 that will close this issue
@ghost

ghost commented Aug 8, 2023

Oh, any idea why it hasn't been merged yet?

Soon, I just haven't had time recently to work on it.

What work still needs to be done (except for Windows support, maybe)? Your version is working fine here with my 6650 XT on Linux.

@SlyEcho
Sponsor Collaborator

SlyEcho commented Aug 8, 2023

I want to add some CI checks to see that it compiles, so that if the CUDA code is updated it would not break (well, at least not break the build).

Then some small tweaks, like having the UI say "ROCm" or "HIP" instead of "CUDA".
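
For context, such a compile check mostly just needs to run the HIP build on a ROCm toolchain. A minimal sketch of the build step, assuming the LLAMA_HIPBLAS make flag from #1087 and an environment where hipcc and hipBLAS/rocBLAS are already installed:

# compile-only check for the ROCm/HIP backend (sketch, not the actual CI config)
make clean
make LLAMA_HIPBLAS=1 -j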

@SlyEcho SlyEcho self-assigned this Aug 9, 2023
@shibe2
Collaborator

shibe2 commented Aug 14, 2023

AMD GPUs are supported through CLBlast. In my experience, ROCm is much more problematic than OpenCL. I recommend going with CLBlast, unless you get better performance with another option or need it for some specific reason.
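
For reference, building with the CLBlast backend is described in the README section linked above; roughly like this (CLBlast and an OpenCL ICD loader need to be installed first, package names vary by distro):

# sketch: build llama.cpp with CLBlast
make LLAMA_CLBLAST=1
# or with CMake:
# cmake -B build -DLLAMA_CLBLAST=ON && cmake --build build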

@mdrokz
Contributor Author

mdrokz commented Aug 14, 2023

AMD GPUs are supported through CLBlast. In my experience, ROCm is much more problematic than OpenCL. I recommend going with CLBlast, unless you get better performance with another option or need it for some specific reason.

Oh wow, I didn't know that. I will try CLBlast. Do you know the performance difference between CLBlast and ROCm?

@ghost

ghost commented Aug 14, 2023

For me, ROCm is much faster compared to CLBlast, and I don't see any reason not to use ROCm (at least when we speak about Linux; ROCm for Windows is still really new), IF the hardware/OS is supported, which is the only downside right now. There is a comparison between ROCm and CLBlast here, but I think it is a bit outdated:

https://github.com/YellowRoseCx/koboldcpp-rocm/

It's for koboldcpp, but this uses llama.cpp.

@SlyEcho
Sponsor Collaborator

SlyEcho commented Aug 14, 2023

The prompt evaluation is much faster with ROCm. If this could be optimized, I'm sure they could be similar. The only downside to OpenCL is that the memory management is not as advanced as it is in ROCm/CUDA.

I would keep an eye on the Vulkan version in #2059; it has a lot of promise to support a much wider set of devices.

@mdrokz
Contributor Author

mdrokz commented Aug 16, 2023

For me, ROCm is much faster compared to CLBlast, and I don't see any reason not to use ROCm (at least when we speak about Linux; ROCm for Windows is still really new), IF the hardware/OS is supported, which is the only downside right now. There is a comparison between ROCm and CLBlast here, but I think it is a bit outdated:

https://github.com/YellowRoseCx/koboldcpp-rocm/

It's for koboldcpp, but this uses llama.cpp.

Which GPU did you use? I can't get CLBlast to run on my RX 6700 XT. Also, is OpenCL through ROCm supported?

@shibe2
Collaborator

shibe2 commented Aug 16, 2023

Although OpenCL and ROCm are different APIs, the OpenCL driver for Radeon RX 6xxx is based on ROCm code (see AMD CLR). CLBlast supports the Radeon RX 6700 XT out of the box with the default driver on Linux.

@mdrokz You need to make sure that OpenCL is working properly on your system. Try clinfo and other software that uses OpenCL. Also make sure that llama.cpp is compiled with CLBlast.

@SlyEcho
Sponsor Collaborator

SlyEcho commented Aug 16, 2023

If you use Linux, you have to install AMD's OpenCL platform. The open source Mesa project has two OpenCL platforms: the old Clover, which may work or may crash your whole PC, and the new Rusticl, which is probably the future but is not as fast right now.
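
If you want to experiment with Rusticl anyway, it is opt-in; enabling it for AMD looks roughly like this (a sketch, the exact driver name to enable may vary):

# Rusticl is disabled by default in Mesa; enable it per run for the radeonsi driver
RUSTICL_ENABLE=radeonsi clinfo
RUSTICL_ENABLE=radeonsi ./main -m <model> -ngl 32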

It should be working on Windows; there are CLBlast binaries available on our releases page.

@efschu

efschu commented Aug 17, 2023

How do you split over multiple ROCm devices?
If so, is it possible to mix AMD and NVIDIA cards then?

@mdrokz
Contributor Author

mdrokz commented Aug 17, 2023

Although OpenCL and ROCm are different APIs, the OpenCL driver for Radeon RX 6xxx is based on ROCm code (see AMD CLR). CLBlast supports the Radeon RX 6700 XT out of the box with the default driver on Linux.

@mdrokz You need to make sure that OpenCL is working properly on your system. Try clinfo and other software that uses OpenCL. Also make sure that llama.cpp is compiled with CLBlast.

Well, I use the open source Mesa drivers, so I installed the mesa-opencl package. clinfo works, but when I run the program I get this error:

ggml_opencl: selecting platform: 'Clover'
ggml_opencl: selecting device: 'AMD Radeon RX 6700 XT (navi22, LLVM 15.0.7, DRM 3.52, 6.3.4-201.fsync.fc37.x86_64)'
ggml_opencl: device FP16 support: false
ggml_opencl: kernel compile error:

fatal error: cannot open file '/usr/lib64/clc/gfx1031-amdgcn-mesa-mesa3d.bc': No such file or directory

@mdrokz
Contributor Author

mdrokz commented Aug 17, 2023

If you use Linux, you have to install AMD's OpenCL platform. The open source Mesa project has two OpenCL platforms: the old Clover, which may work or may crash your whole PC, and the new Rusticl, which is probably the future but is not as fast right now.

It should be working on Windows; there are CLBlast binaries available on our releases page.

Yeah, I realized I have to use the AMDGPU driver to run it. I found a Docker image for that, so I will try running it through that, because I use the open source Mesa drivers. And I don't use Windows :(

@shibe2
Collaborator

shibe2 commented Aug 17, 2023

@mdrokz You can compile or find packages for the open source ROCm-based OpenCL driver. Where did you get ROCm from?

@SlyEcho
Sponsor Collaborator

SlyEcho commented Aug 17, 2023

How do you split over multiple ROCm devices?
If so, is it possible to mix AMD and NVIDIA cards then?

Absolute heresy.

Jokes aside, it wouldn't work like that; they are using completely different architectures and are compiled separately. There are some libraries like Orochi that promise to do this, but right now we are using CUDA for Nvidia and HIP for AMD.

Maybe the MPI stuff would be useful in this case?


Well, I use the open source Mesa drivers, so I installed the mesa-opencl package. clinfo works, but when I run the program I get this error

It depends on the distro; for Arch, there are packages for the AMD OpenCL driver. On others, maybe AMD's installer can help you.

It's possible to have multiple platforms installed; clinfo will show them all. llama.cpp has environment flags to help the program choose the right one if it happens to load something that is not desired.
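
Concretely, the flags in question should be the GGML_OPENCL_PLATFORM and GGML_OPENCL_DEVICE environment variables, e.g.:

# pick the AMD platform (and a specific device) when several OpenCL platforms are installed
GGML_OPENCL_PLATFORM=AMD GGML_OPENCL_DEVICE=0 ./main -m <model> -ngl 32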

@mdrokz
Contributor Author

mdrokz commented Aug 17, 2023

@mdrokz You can compile or find packages for the open source ROCm-based OpenCL driver. Where did you get ROCm from?

I'm using Fedora 37. I got ROCm from here: http://repo.radeon.com/rocm/yum/5.2.3/main/ and I installed rocm-opencl.

@efschu

efschu commented Aug 17, 2023

How do you split over multiple ROCm devices?
If so, is it possible to mix AMD and NVIDIA cards then?

Absolute heresy.

OK, no split between NVIDIA and AMD.

But is it possible to use multiple ROCm devices like I can with CUDA?

Loading models in OpenCL consumes less VRAM than loading them in CUDA. The problem is, big models still need multiple GPUs.

@SlyEcho
Sponsor Collaborator

SlyEcho commented Aug 17, 2023

Should be possible to use multiple AMD cards, but I haven't tested it myself.

The OpenCL code uses somewhat different memory management, and it seems to be more efficient, but this is a known issue.

@shibe2
Collaborator

shibe2 commented Aug 17, 2023

@mdrokz I know that version 5.6 works. Sometimes it may be necessary to set some environment variables to enable/disable OpenCL drivers, for example, OCL_ICD_VENDORS. clinfo should have "AMD-APP" in Platform Version and "HSA" in Driver Version. If clinfo shows multiple devices, you can use GGML_OPENCL_PLATFORM to select the correct driver.
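
A quick way to check that, and to force the ICD loader to a single driver if needed (paths and the .icd filename are illustrative and depend on how the driver was packaged):

# see which OpenCL platform/driver clinfo reports
clinfo | grep -E 'Platform Version|Driver Version'
# restrict the ICD loader to one driver by pointing OCL_ICD_VENDORS at a
# directory containing only the AMD .icd file
mkdir -p "$HOME/amd-icd" && cp /etc/OpenCL/vendors/amdocl64.icd "$HOME/amd-icd/"
OCL_ICD_VENDORS="$HOME/amd-icd" clinfo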

@efschu

efschu commented Aug 17, 2023

Should be possible to use multiple AMD cards, but I haven't tested it myself.

The OpenCL code uses somewhat different memory management, and it seems to be more efficient, but this is a known issue.

Well, how do I split across multiple ROCm devices?

The way I do it with CUDA is not working; it loads only on the first card.
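
To clarify what I mean, with CUDA a split looks roughly like this, and I would expect the ROCm build to take the same flags, just with HIP_VISIBLE_DEVICES instead of CUDA_VISIBLE_DEVICES (flag values here are only illustrative):

# illustrative multi-GPU invocation: offload layers and split tensors across two cards
HIP_VISIBLE_DEVICES=0,1 ./main -m <model> -ngl 40 --tensor-split 1,1 --main-gpu 0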

@arch-btw
Contributor

@mdrokz do you have libclc installed?

I'm not familiar with how to use yum, but here's the project's website: https://libclc.llvm.org

Maybe this more specifically: https://packages.fedoraproject.org/pkgs/libclc/libclc/
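
Going by that package page, on Fedora the install should just be:

# libclc provides the gfx*-amdgcn-mesa-mesa3d.bc files Clover was complaining about
sudo dnf install libclc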

@mdrokz
Contributor Author

mdrokz commented Aug 18, 2023

@mdrokz I know that version 5.6 works. Sometimes it may be necessary to set some environment variables to enable/disable OpenCL drivers, for example, OCL_ICD_VENDORS. clinfo should have "AMD-APP" in Platform Version and "HSA" in Driver Version. If clinfo shows multiple devices, you can use GGML_OPENCL_PLATFORM to select the correct driver.

I don't have much idea about that. Can you check this gist: https://gist.github.com/mdrokz/303ca842dcf63df733b3ab27b6f1dd14 ? Which platform should I use? It's showing 3 currently.

@mdrokz
Contributor Author

mdrokz commented Aug 18, 2023

@mdrokz do you have libclc installed?

I'm not familiar with how to use yum, but here's the project's website: https://libclc.llvm.org

Maybe this more specifically: https://packages.fedoraproject.org/pkgs/libclc/libclc/

Yes, I have libclc installed.

@mdrokz
Contributor Author

mdrokz commented Aug 22, 2023

I ended up using the amdgpu driver in a Docker container, like this: https://github.com/mdrokz/rust-llama.cpp/blob/implement_blas_support/examples/opencl/Dockerfile. This Dockerfile works on my GPU (RX 6700 XT).
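
For anyone trying the same route: the container needs the GPU device nodes passed through; roughly something like this (image name and paths are placeholders, not taken from the linked Dockerfile):

# pass the AMD GPU into the container via the standard ROCm/amdgpu device nodes
docker run --rm -it \
  --device=/dev/kfd --device=/dev/dri \
  --security-opt seccomp=unconfined \
  -v "$PWD/models:/models" \
  llama-cpp-opencl:latest ./main -m /models/<model> -ngl 32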
