Alpaca uses my CPU instead of my GPU (AMD) #139
Comments
Hi, yes, this is a problem with ROCm and Flatpaks; I believe this is also a problem with Blender. While any Flatpak can detect and use the GPU, for some reason ROCm doesn't work out of the box. There must be a way, but I haven't figured it out, and it's a bit hard to test since I have an incompatible GPU. For now I suggest you host an Ollama instance using Docker and connect it to Alpaca using the remote connection option.
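For anyone who wants to try that route in the meantime, here is a minimal sketch based on Ollama's published Docker instructions for ROCm (container name and volume name are arbitrary; Ollama's default port 11434 is assumed):

```
# Run the ROCm build of Ollama in Docker, passing the AMD GPU devices through
docker run -d \
  --device /dev/kfd --device /dev/dri \
  -v ollama:/root/.ollama \
  -p 11434:11434 \
  --name ollama \
  ollama/ollama:rocm
```

Alpaca's remote connection can then point at http://localhost:11434.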
There's no hurry; I use it sparingly and can live with it using the CPU for the time being. Is there any way I can help test a possible fix? Is my GPU supposed to be compatible?
Alpaca is based on Ollama. Ollama automatically detects CPU and GPU, but when it's executed through Flatpak, Ollama is containerized (idk if this word exists lol) and doesn't have enough privileges to check whether the GPU can be used. That's what I understand!!
Yeah, that word does exist, though the problem isn't exactly that it runs inside a container; the problem is that ROCm doesn't work out of the box.
I think ROCm needs to be loaded separately.
Adding ROCm as-is would make Alpaca about 4 times heavier, and not everybody even needs it. The problem here is that either the Freedesktop runtime or the GNOME runtime should include ROCm. That, or there's a better solution I just don't know about yet, since I'm still new to Flatpak packaging.
I might finally have a solution where the Flatpak accesses the ROCm libraries from the system itself.
You could always package it as an extension in that case.
Yeah, the problem with that is that I would need to make a different package for Flathub.
Any progress on this? Anything you need help with to get this done?
Do you have ROCm installed on your system? I think I can make Ollama use the system installation.
If someone has ROCm installed and wants to test this, run these commands. This gives the Flatpak access to the system's ROCm installation.
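The exact commands didn't survive in this copy of the thread; a plausible sketch of what they would look like, assuming ROCm lives under /opt/rocm and Alpaca's Flatpak ID is com.jeffser.Alpaca, is:

```
# Let the Alpaca sandbox read the host's ROCm installation (path is an assumption)
flatpak override --user --filesystem=/opt/rocm:ro com.jeffser.Alpaca
# Make sure the sandbox can reach the GPU devices
flatpak override --user --device=dri com.jeffser.Alpaca
```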
How can I install ROCm on my Silverblue machine? I tried to run "rpm-ostree install rocm" but I get a "packages not found" error.
I think this should be it: https://copr.fedorainfracloud.org/coprs/cosmicfusion/ROCm-GFX8P/
Ask on https://discussion.fedoraproject.org/ for help; they actually help with this kind of thing.
I was looking around at what Flatpaks include, and they have all the stuff needed to run an app with OpenCL (a Mesa alternative to ROCm, as far as I'm aware), but Ollama can't use it. My recommendation for now is to run Ollama separately from Alpaca and just connect to it as a remote connection.
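For reference, the non-Flatpak route is roughly the following; the install script is the one documented at ollama.com, but treat this as a sketch and review the script before piping it to a shell:

```
# Install Ollama natively on the host (official install script)
curl -fsSL https://ollama.com/install.sh | sh
# Start the server; it listens on localhost:11434 by default
ollama serve
```

Alpaca can then connect to it the same way it would to the Docker instance mentioned earlier.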
Could you use ...? Genuine question.
I don't have ROCm on my system, since it's kind of a headache to install on openSUSE Tumbleweed.
As far as I know, the backend (Ollama) uses ROCm instead of Vulkan, so this isn't easy to implement from the front end.
GPT4All uses a llama.cpp backend, while this app uses Ollama.
Yes, I don't know much about llama.cpp.
Ahhh I see, sorry for the confusion. If anyone wants to track Vulkan support in Ollama:
Yes, if that gets merged I hope it will bring Vulkan to this one as well.
I know, it's a headache everywhere, including the Flatpak sandbox.
I installed ROCm on Fedora using this tutorial.
Ollama says that AMD users should try the proprietary driver though: https://github.com/ollama/ollama/blob/main/docs/linux.md#amd-radeon-gpu-support
Nice find, I think that's exactly what we need to make this work. For now I'll use ...
By the way, I just pushed an update to the extension that adds support for GFX1010 cards (mine is included hehe). Afaik it covers the RX 5600 XT and RX 5700 XT. If someone has one of those cards you'll need to set the override.
This might be a basic/stupid question, but how do I update the extension? Do I have to do it manually via the terminal, or will it get picked up as a software update? I'm pretty new to Linux/Flatpaks.
Don't worry, it's not a stupid question; there aren't a lot of extensions on Flathub anyway. It should appear in the updates section of your software center; I believe this is the case with both GNOME Software and KDE Discover. It sometimes takes a couple of minutes to get picked up by your Flatpak installation; if you want to force an update, use the command below.
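The exact command was cut off in this copy; presumably it is the standard Flatpak update, which also pulls in updates for an app's extensions:

```
# Update Alpaca and its installed extensions from Flathub
flatpak update com.jeffser.Alpaca
# or simply update everything
flatpak update
```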
It seems like Alpaca now runs fine on dedicated GPUs. However, due to Ollama limitations, it doesn't yet run well on integrated GPUs, as it does not request more VRAM (GTT memory) and simply falls back to the CPU. For future readers using an AMD iGPU (APU), see the threads here: ollama/ollama#6282, ROCm/ROCm#2014, ollama/ollama#2637
Should it work with an RX 6650 XT card? Because it's still using my CPU instead.
RX 6700 XT also not working.
As far as I know those cards don't need an override; they should just be supported out of the box.
Maybe Alpaca being installed per-user is the problem. I'll test again in the evening.
No no, I just found it, they are not supported. Could you guys give me the output of ...
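The command being asked for was lost here. For anyone wanting to report which ROCm target their card exposes (which is what determines whether it's supported), one common way, assuming the ROCm tools are installed, is:

```
# List the GPU agents ROCm sees and their gfx targets (e.g. gfx1031 for an RX 6700 XT)
rocminfo | grep -i gfx
```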
I have a 6800 XT and am experiencing the same issue; I hope I can be useful:
Works for me with an RX 6700 XT after installing the extension and setting the override.
How do I set this override? (env variable?)
In the Alpaca settings, second tab.
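For anyone who prefers the command line, the same kind of override can presumably be applied to the Flatpak itself as an environment variable. The value shown here (spoofing a gfx1030 target) is the one commonly used with ROCm for otherwise unsupported RDNA cards and is an assumption, not something confirmed in this thread:

```
# Set the ROCm GFX version override on the Alpaca Flatpak (value is an assumption)
flatpak override --user --env=HSA_OVERRIDE_GFX_VERSION=10.3.0 com.jeffser.Alpaca
```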
I'm not sure if this is related, but Llama 3.1 (8B) works great, yet when I try to run Mistral Nemo (12B) or Gemma 2 (27B), Ollama just crashes:
I did manage to get Gemma 2 (27B) working once and it was really slow (as expected), but I can't get it or Nemo to work at all now, while Llama 3.1 (8B) still works fine. I'm not sure if it's VRAM-related, but I've also noticed in the resource monitor that the model stays in VRAM for quite some time before it gets flushed: if I'm using Llama 3.1 (8B), the VRAM stays at around 6 GB even if I close Alpaca.
The models are kept alive for 5 minutes by default; you can change that in the preferences.
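If you run Ollama yourself rather than through Alpaca, the same knob is exposed as the OLLAMA_KEEP_ALIVE environment variable (documented in Ollama's FAQ); for example:

```
# Keep models loaded for only one minute after the last request
OLLAMA_KEEP_ALIVE=1m ollama serve
```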
If Ollama crashes though, the VRAM doesn't go back down unless I shut down or restart my PC.
I just tested it now: I ran Mistral Nemo and Ollama crashed as I mentioned above. I then waited 10 minutes and the VRAM still hadn't gone down. I then tried to close Alpaca, and after a few seconds I got the option to Force Quit since it wasn't responding. 10 minutes after that, the VRAM still hasn't gone down.
Well, same issue here: while the RX 7900 XTX is listed as supported, I just cannot get Alpaca/Llama to use my GPU :(
It seems like you don't have the extension installed (or it might be outdated). You can check all the installed apps / extensions with the command below.
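The command was cut off here; checking usually comes down to listing the installed Flatpaks and looking for the AMD plugin next to Alpaca:

```
# Show Alpaca and any of its installed extensions, with their versions
flatpak list | grep -i alpaca
```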
I also seem to still have issues with the extension: alpaca-debug.txt. The most relevant line seems to be this one:
I added myself to the ... Edit: after rebooting, I'm happy to report that it's working for me ^^
If your model is big enough that the amount of VRAM required exceeds your GPU's VRAM, Ollama will use the CPU instead of the GPU.
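For a rough sense of the numbers (an estimate, not an exact rule): a 12B-parameter model quantized to 4 bits needs about 12e9 × 0.5 bytes ≈ 6 GB for the weights alone, and a 27B model about 13.5 GB, before counting the KV cache and runtime overhead, so on an 8 GB or 12 GB card Ollama will offload some or all of the work to the CPU.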
Hi, I have a small update on AMD support: I added an indicator for this to the preferences dialog. Also, if you run a model that's too big for your VRAM (or RAM if you are using the CPU), instead of just giving a generic crash notification it will now say so explicitly. Also, happy 100 comments to this issue 🎉
This has fixed all issues at once, you are amazing! For everyone still having issues: check whether you have the Alpaca AMD Support Flatpak extension installed.
Yeah, the same thing happens to me. I have the application and the extension installed, but for some reason Alpaca is still using my CPU instead of my GPU.
Not all GPUs are supported; please check this page.
I have noticed that Alpaca uses my CPU instead of my GPU. Here's a screenshot showing how it's using almost 40% of my CPU and only 1% of my GPU.
I'm using an AMD Radeon RX 6650 XT GPU, which is properly detected by the OS and used by other Flatpak apps like Steam, as you can see in the other screenshot.