Alpaca uses my CPU instead of my GPU (AMD) #139

Open

frandavid100 opened this issue Jul 10, 2024 · 108 comments
Labels: bug (Something isn't working), help wanted (Extra attention is needed)

Comments

@frandavid100

I have noticed that Alpaca uses my CPU instead of my GPU. Here's a screenshot showing how it's using almost 40% of my CPU, and only 1% of my GPU.

Captura desde 2024-07-10 06-51-39

I'm using an AMD Radeon RX 6650 XT GPU, which is properly detected by the OS and used by other Flatpak apps like Steam. As you can see in this other screenshot:

Captura desde 2024-07-10 06-54-34

frandavid100 added the bug label Jul 10, 2024
@Jeffser (Owner) commented Jul 10, 2024

Hi, yes, this is a problem with ROCm and Flatpaks; I believe Blender has the same issue.

Whilst any Flatpak can detect and use the GPU, for some reason ROCm doesn't work out of the box. There must be a way, but I haven't figured it out, and it's a bit hard to test since I have an incompatible GPU.

For now I suggest you host an Ollama instance using Docker and connect it to Alpaca using the remote connection option.
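For reference, a minimal sketch of that workaround, taken from Ollama's Docker documentation (the image tag and port are upstream defaults; adjust as needed):

docker run -d --device /dev/kfd --device /dev/dri -v ollama:/root/.ollama -p 11434:11434 --name ollama ollama/ollama:rocm

Then point Alpaca's remote connection at http://localhost:11434.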

@frandavid100 (Author)

There's no hurry; I use it sparingly and can afford to let it use the CPU for the time being.

Is there any way I can help test a possible fix? Is my GPU supposed to be compatible?

@loulou64490 (Contributor) commented Jul 10, 2024 via email

@Jeffser (Owner) commented Jul 11, 2024

Yeah, that word does exist. Though the problem isn't exactly the fact that it is inside a container; the problem is that ROCm doesn't work out of the box.

@olumolu (Contributor) commented Jul 23, 2024

I think ROCm needs to be loaded separately.
https://github.com/ollama/ollama/releases/download/v0.2.8/ollama-linux-amd64-rocm.tgz
This contains the ROCm driver. This is a real issue that needs to be fixed.

@Jeffser (Owner) commented Jul 24, 2024

Adding that as-is would make Alpaca four times heavier, and not everybody even needs ROCm. The real fix is for either the Freedesktop runtime or the GNOME runtime to include ROCm; that, or there's a better solution I don't know about yet, since I'm still new to Flatpak packaging.

@Jeffser (Owner) commented Jul 24, 2024

I might finally have a solution where the Flatpak accesses the ROCm libraries from the system itself.

@0chroma commented Jul 24, 2024

Adding that as-is would make Alpaca four times heavier, and not everybody even needs ROCm. The real fix is for either the Freedesktop runtime or the GNOME runtime to include ROCm; that, or there's a better solution I don't know about yet, since I'm still new to Flatpak packaging.

You could always package it as an extension in that case

@Jeffser (Owner) commented Jul 24, 2024

Yeah, the problem with that is that I would need to make a separate package for Flathub.

Jeffser changed the title from "Alpaca uses my CPU instead of my GPU" to "Alpaca uses my CPU instead of my GPU (AMD)" on Jul 30, 2024
@TacoCake commented Aug 4, 2024

Any progress on this? Anything you need help with in getting this done?

@Jeffser (Owner) commented Aug 4, 2024

Do you have ROCm installed on your system? I think I can make Ollama use the system installation.

@Jeffser (Owner) commented Aug 4, 2024

If someone has ROCm installed and wants to test this, run these commands:

flatpak override --filesystem=/opt/rocm com.jeffser.Alpaca
flatpak override --env=LD_LIBRARY_PATH=/opt/rocm/lib:/opt/rocm/lib64:/app/lib:/usr/lib/x86_64-linux-gnu/GL/default/lib:/usr/lib/x86_64-linux-gnu/openh264/extra:/usr/lib/sdk/llvm15/lib:/usr/lib/sdk/openjdk11/lib:/usr/lib/sdk/openjdk17/lib:/usr/lib/x86_64-linux-gnu/GL/default/lib com.jeffser.Alpaca

The first command gives the Flatpak access to /opt/rocm; the second adds its lib directories to the library search path. The rest of the entries are just the default Flatpak library paths, so you can ignore those.
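If the overrides cause trouble, they can be cleared again; a one-liner, assuming you want to drop all per-app overrides for Alpaca:

flatpak override --reset com.jeffser.Alpaca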

Jeffser added the help wanted label Aug 4, 2024
@frandavid100 (Author)

How can I install ROCm on my Silverblue machine? I tried to run "rpm-ostree install rocm" but I get a "packages not found" error.

@Jeffser (Owner) commented Aug 5, 2024

@olumolu (Contributor) commented Aug 5, 2024

How can I install ROCm on my Silverblue machine? I tried to run "rpm-ostree install rocm" but I get a "packages not found" error.

Ask on https://discussion.fedoraproject.org/ for help; they actually help with this kind of thing.

@Jeffser (Owner) commented Aug 5, 2024

I was looking around at what Flatpaks include, and they have all the stuff needed to run an app with OpenCL (a Mesa alternative to ROCm, as far as I'm aware), but Ollama can't use it. My recommendation for now is to run Ollama separately from Alpaca and just connect to it as a remote connection.

@TacoCake commented Aug 8, 2024

Could you use Vulkan instead of trying to use ROCm? Kinda how GPT4All does it: https://github.com/nomic-ai/gpt4all

Genuine question

@TacoCake commented Aug 8, 2024

Do you have ROCm installed on your system? I think I can make Ollama use the system installation.

I don't have ROCm on my system, since it's kind of a headache to install on openSUSE Tumbleweed

@olumolu (Contributor) commented Aug 8, 2024

As far as I know, the Ollama backend uses ROCm rather than Vulkan, so this isn't easy to implement from the front end.

@Shished commented Aug 8, 2024

GPT4All uses the llama.cpp backend, while this app uses Ollama.

@olumolu (Contributor) commented Aug 8, 2024

Yes, I don't know much about llama.cpp, but Ollama uses ROCm. It really is an issue that ROCm is not installed on many systems.

@TacoCake commented Aug 8, 2024

As far as I know, the Ollama backend uses ROCm rather than Vulkan, so this isn't easy to implement from the front end.

GPT4All uses the llama.cpp backend, while this app uses Ollama.

Ahhh, I see, sorry for the confusion. If anyone wants to track Vulkan support in Ollama:

@olumolu (Contributor) commented Aug 8, 2024

Yes, if that gets merged I hope it will bring Vulkan to this one as well.
For the time being I don't think we can do much.

@Jeffser (Owner) commented Aug 8, 2024

Do you have ROCm installed on your system? I think I can make Ollama use the system installation.

I don't have ROCm on my system, since it's kind of a headache to install on openSUSE Tumbleweed.

I know, it's a headache everywhere, including in the Flatpak sandbox.

@francus11

flatpak override --filesystem=/opt/rocm com.jeffser.Alpaca
flatpak override --env=LD_LIBRARY_PATH=/opt/rocm/lib:/opt/rocm/lib64:/app/lib:/usr/lib/x86_64-linux-gnu/GL/default/lib:/usr/lib/x86_64-linux-gnu/openh264/extra:/usr/lib/sdk/llvm15/lib:/usr/lib/sdk/openjdk11/lib:/usr/lib/sdk/openjdk17/lib:/usr/lib/x86_64-linux-gnu/GL/default/lib com.jeffser.Alpaca

I installed ROCm on Fedora using this tutorial:
https://fedoraproject.org/wiki/SIGs/HC#Installation
Still, my GPU usage is 0%. Any other suggestions?
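For anyone else going that route, the linked wiki boils down to something like the following sketch (package names are taken from the Fedora HC SIG page and may have changed; on Silverblue, use rpm-ostree install with the same names):

sudo dnf install rocminfo rocm-hip rocm-opencl
rocminfo | grep -i gfx    # should print your GPU's gfx target, e.g. gfx1030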

@Jeffser (Owner) commented Aug 13, 2024

Today I learned that ROCm is actually bundled with the Ollama binary... So I have no idea what to try now lol

image

(third line)

@Jeffser (Owner) commented Aug 13, 2024

Ollama says that AMD users should try the proprietary driver, though:

https://github.com/ollama/ollama/blob/main/docs/linux.md#amd-radeon-gpu-support

@Jeffser (Owner) commented Sep 11, 2024

I tried with every specific device type that Flatpak supports; for some reason it only works when I use all.

Could be related to https://gitlab.com/freedesktop-sdk/freedesktop-sdk/-/issues/1535#note_1310656438

Nice find, I think that's exactly what we need to make this work. For now I'll use all so that it at least works; once that becomes available I'll change it.
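For context, that device permission can also be granted per-user from the command line; a hedged sketch (all is Flatpak's catch-all device class, since at the time of this thread dri alone did not expose /dev/kfd):

flatpak override --device=all com.jeffser.Alpaca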

@Jeffser (Owner) commented Sep 11, 2024

By the way, I just pushed an update to the extension that adds support for GFX1010 cards (mine is included hehe)

AFAIK it covers the RX 5600 XT and RX 5700 XT.

If you have one of those cards you'll need to set HSA_OVERRIDE_GFX_VERSION to 10.1.0.
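A hedged one-liner for setting that variable from the CLI, for anyone who prefers it over Alpaca's preferences dialog (add --user if Alpaca is installed per-user):

flatpak override --env=HSA_OVERRIDE_GFX_VERSION=10.1.0 com.jeffser.Alpaca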

@P-Jay357 commented Sep 11, 2024

This might be a basic/stupid question, but how do I update the extension? Do I have to do it manually via the terminal, or will it get picked up as a software update? I'm pretty new to Linux/Flatpaks

@Jeffser (Owner) commented Sep 11, 2024

This might be a basic/stupid question, but how do I update the extension? Do I have to do it manually via the terminal, or will it get picked up as a software update? I'm pretty new to Linux/Flatpaks

Don't worry, it's not a stupid question; there aren't a lot of extensions on Flathub anyway.

It should appear in the updates section of your software center; I believe this is the case with both GNOME Software and KDE Discover.

It sometimes takes a couple of minutes to get picked up by your Flatpak installation; if you want to force an update, use the flatpak update command.
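For the terminal route, something like this should be enough:

flatpak update                      # updates everything, extensions included
flatpak update com.jeffser.Alpaca   # or target just Alpaca (related extension refs ride along)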

@czhang03 commented Sep 11, 2024

It seems like Alpaca now runs fine on a dedicated GPU. However, due to Ollama limitations, it doesn't yet "run well" on an integrated GPU, as it does not request more VRAM (GTT memory) and simply falls back to the CPU.

For future readers using an AMD iGPU (APU), see the threads here: ollama/ollama#6282, ROCm/ROCm#2014, ollama/ollama#2637

@frandavid100 (Author)

By the way, I just pushed an update to the extension that adds support for GFX1010 cards (mine is included hehe)

AFAIK it covers the RX 5600 XT and RX 5700 XT.

Should it work with an RX 6650 XT card? Because it's still using my CPU instead.

@TheRsKing

RX 6700 XT also not working

@Jeffser (Owner) commented Sep 12, 2024

As far as I know those cards don't need an override; they should just be supported out of the box.

@TheRsKing

Maybe a user-wide Alpaca installation is the problem. I'll test again in the evening.

@Jeffser (Owner) commented Sep 12, 2024

No no, I just found out: they are not supported. Could you guys give me the output of rocminfo? I'll figure out what override you should use.

@frandavid100
@TheRsKing

@daniwhal

Could you guys give me the output of rocminfo? I'll figure out what override you should use.

I have a 6800 XT and am experiencing the same issue; I hope I can be useful:

ROCk module is loaded
=====================    
HSA System Attributes    
=====================    
Runtime Version:         1.1
Runtime Ext Version:     1.4
System Timestamp Freq.:  1000.000000MHz
Sig. Max Wait Duration:  18446744073709551615 (0xFFFFFFFFFFFFFFFF) (timestamp count)
Machine Model:           LARGE                              
System Endianness:       LITTLE                             
Mwaitx:                  DISABLED
DMAbuf Support:          YES

==========               
HSA Agents               
==========               
*******                  
Agent 1                  
*******                  
  Name:                    AMD Ryzen 7 5700X3D 8-Core Processor
  Uuid:                    CPU-XX                             
  Marketing Name:          AMD Ryzen 7 5700X3D 8-Core Processor
  Vendor Name:             CPU                                
  Feature:                 None specified                     
  Profile:                 FULL_PROFILE                       
  Float Round Mode:        NEAR                               
  Max Queue Number:        0(0x0)                             
  Queue Min Size:          0(0x0)                             
  Queue Max Size:          0(0x0)                             
  Queue Type:              MULTI                              
  Node:                    0                                  
  Device Type:             CPU                                
  Cache Info:              
    L1:                      32768(0x8000) KB                   
  Chip ID:                 0(0x0)                             
  ASIC Revision:           0(0x0)                             
  Cacheline Size:          64(0x40)                           
  Max Clock Freq. (MHz):   3000                               
  BDFID:                   0                                  
  Internal Node ID:        0                                  
  Compute Unit:            16                                 
  SIMDs per CU:            0                                  
  Shader Engines:          0                                  
  Shader Arrs. per Eng.:   0                                  
  WatchPts on Addr. Ranges:1                                  
  Features:                None
  Pool Info:               
    Pool 1                   
      Segment:                 GLOBAL; FLAGS: FINE GRAINED        
      Size:                    32753640(0x1f3c7e8) KB             
      Allocatable:             TRUE                               
      Alloc Granule:           4KB                                
      Alloc Recommended Granule:4KB                                
      Alloc Alignment:         4KB                                
      Accessible by all:       TRUE                               
    Pool 2                   
      Segment:                 GLOBAL; FLAGS: KERNARG, FINE GRAINED
      Size:                    32753640(0x1f3c7e8) KB             
      Allocatable:             TRUE                               
      Alloc Granule:           4KB                                
      Alloc Recommended Granule:4KB                                
      Alloc Alignment:         4KB                                
      Accessible by all:       TRUE                               
    Pool 3                   
      Segment:                 GLOBAL; FLAGS: COARSE GRAINED      
      Size:                    32753640(0x1f3c7e8) KB             
      Allocatable:             TRUE                               
      Alloc Granule:           4KB                                
      Alloc Recommended Granule:4KB                                
      Alloc Alignment:         4KB                                
      Accessible by all:       TRUE                               
  ISA Info:                
*******                  
Agent 2                  
*******                  
  Name:                    gfx1030                            
  Uuid:                    GPU-8f89fe13d8ed7bda               
  Marketing Name:          AMD Radeon RX 6800 XT              
  Vendor Name:             AMD                                
  Feature:                 KERNEL_DISPATCH                    
  Profile:                 BASE_PROFILE                       
  Float Round Mode:        NEAR                               
  Max Queue Number:        128(0x80)                          
  Queue Min Size:          64(0x40)                           
  Queue Max Size:          131072(0x20000)                    
  Queue Type:              MULTI                              
  Node:                    1                                  
  Device Type:             GPU                                
  Cache Info:              
    L1:                      16(0x10) KB                        
    L2:                      4096(0x1000) KB                    
    L3:                      131072(0x20000) KB                 
  Chip ID:                 29631(0x73bf)                      
  ASIC Revision:           1(0x1)                             
  Cacheline Size:          128(0x80)                          
  Max Clock Freq. (MHz):   2575                               
  BDFID:                   2048                               
  Internal Node ID:        1                                  
  Compute Unit:            72                                 
  SIMDs per CU:            2                                  
  Shader Engines:          4                                  
  Shader Arrs. per Eng.:   2                                  
  WatchPts on Addr. Ranges:4                                  
  Coherent Host Access:    FALSE                              
  Features:                KERNEL_DISPATCH 
  Fast F16 Operation:      TRUE                               
  Wavefront Size:          32(0x20)                           
  Workgroup Max Size:      1024(0x400)                        
  Workgroup Max Size per Dimension:
    x                        1024(0x400)                        
    y                        1024(0x400)                        
    z                        1024(0x400)                        
  Max Waves Per CU:        32(0x20)                           
  Max Work-item Per CU:    1024(0x400)                        
  Grid Max Size:           4294967295(0xffffffff)             
  Grid Max Size per Dimension:
    x                        4294967295(0xffffffff)             
    y                        4294967295(0xffffffff)             
    z                        4294967295(0xffffffff)             
  Max fbarriers/Workgrp:   32                                 
  Packet Processor uCode:: 118                                
  SDMA engine uCode::      83                                 
  IOMMU Support::          None                               
  Pool Info:               
    Pool 1                   
      Segment:                 GLOBAL; FLAGS: COARSE GRAINED      
      Size:                    16760832(0xffc000) KB              
      Allocatable:             TRUE                               
      Alloc Granule:           4KB                                
      Alloc Recommended Granule:2048KB                             
      Alloc Alignment:         4KB                                
      Accessible by all:       FALSE                              
    Pool 2                   
      Segment:                 GLOBAL; FLAGS: EXTENDED FINE GRAINED
      Size:                    16760832(0xffc000) KB              
      Allocatable:             TRUE                               
      Alloc Granule:           4KB                                
      Alloc Recommended Granule:2048KB                             
      Alloc Alignment:         4KB                                
      Accessible by all:       FALSE                              
    Pool 3                   
      Segment:                 GROUP                              
      Size:                    64(0x40) KB                        
      Allocatable:             FALSE                              
      Alloc Granule:           0KB                                
      Alloc Recommended Granule:0KB                                
      Alloc Alignment:         0KB                                
      Accessible by all:       FALSE                              
  ISA Info:                
    ISA 1                    
      Name:                    amdgcn-amd-amdhsa--gfx1030         
      Machine Models:          HSA_MACHINE_MODEL_LARGE            
      Profiles:                HSA_PROFILE_BASE                   
      Default Rounding Mode:   NEAR                               
      Default Rounding Mode:   NEAR                               
      Fast f16:                TRUE                               
      Workgroup Max Size:      1024(0x400)                        
      Workgroup Max Size per Dimension:
        x                        1024(0x400)                        
        y                        1024(0x400)                        
        z                        1024(0x400)                        
      Grid Max Size:           4294967295(0xffffffff)             
      Grid Max Size per Dimension:
        x                        4294967295(0xffffffff)             
        y                        4294967295(0xffffffff)             
        z                        4294967295(0xffffffff)             
      FBarrier Max Size:       32                                 
*** Done ***

@Shished commented Sep 12, 2024

Works for me with an RX 6700 XT after installing the extension and setting HSA_OVERRIDE_GFX_VERSION="10.3.0".

@TheRsKing commented Sep 12, 2024

Works for me with an RX 6700 XT after installing the extension and setting HSA_OVERRIDE_GFX_VERSION="10.3.0".

How do I set this override? (An environment variable?)

@Shished commented Sep 12, 2024

In the Alpaca settings, second tab.

@P-Jay357

I'm not sure if this is related, but while Llama 3.1 (8B) works great, when I try to run Mistral Nemo (12B) or Gemma 2 (27B) Ollama just crashes:


time=2024-09-12T19:25:21.136+01:00 level=INFO source=server.go:391 msg="starting llama server" cmd="/home/gareth/.var/app/com.jeffser.Alpaca/cache/tmp/ollama/ollama57975444/runners/rocm_v60102/ollama_llama_server --model /home/gareth/.var/app/com.jeffser.Alpaca/data/.ollama/models/blobs/sha256-b559938ab7a0392fc9ea9675b82280f2a15669ec3e0e0fc491c9cb0a7681cf94 --ctx-size 2048 --batch-size 512 --embedding --log-disable --n-gpu-layers 37 --verbose --parallel 1 --port 43643"
time=2024-09-12T19:25:21.136+01:00 level=DEBUG source=server.go:408 msg=subprocess environment="[LD_LIBRARY_PATH=/app/plugins/AMD/lib/ollama:/home/gareth/.var/app/com.jeffser.Alpaca/cache/tmp/ollama/ollama57975444/runners/rocm_v60102:/app/lib:/usr/lib/x86_64-linux-gnu/GL/default/lib:/usr/lib/x86_64-linux-gnu/openh264/extra:/usr/lib/x86_64-linux-gnu/openh264/extra:/usr/lib/sdk/llvm15/lib:/usr/lib/x86_64-linux-gnu/GL/default/lib:/usr/lib/ollama:/app/plugins/AMD/lib/ollama PATH=/app/bin:/usr/bin HIP_VISIBLE_DEVICES=0]"
time=2024-09-12T19:25:21.137+01:00 level=INFO source=sched.go:450 msg="loaded runners" count=1
time=2024-09-12T19:25:21.137+01:00 level=INFO source=server.go:591 msg="waiting for llama runner to start responding"
time=2024-09-12T19:25:21.137+01:00 level=INFO source=server.go:625 msg="waiting for server to become available" status="llm server error"
INFO [main] build info | build=1 commit="1e6f655" tid="140547961254656" timestamp=1726165521
INFO [main] system info | n_threads=8 n_threads_batch=-1 system_info="AVX = 1 | AVX_VNNI = 0 | AVX2 = 0 | AVX512 = 0 | AVX512_VBMI = 0 | AVX512_VNNI = 0 | AVX512_BF16 = 0 | FMA = 0 | NEON = 0 | SVE = 0 | ARM_FMA = 0 | F16C = 0 | FP16_VA = 0 | WASM_SIMD = 0 | BLAS = 1 | SSE3 = 1 | SSSE3 = 1 | VSX = 0 | MATMUL_INT8 = 0 | LLAMAFILE = 1 | " tid="140547961254656" timestamp=1726165521 total_threads=16
INFO [main] HTTP server listening | hostname="127.0.0.1" n_threads_http="15" port="43643" tid="140547961254656" timestamp=1726165521
llama_model_loader: loaded meta data with 35 key-value pairs and 363 tensors from /home/gareth/.var/app/com.jeffser.Alpaca/data/.ollama/models/blobs/sha256-b559938ab7a0392fc9ea9675b82280f2a15669ec3e0e0fc491c9cb0a7681cf94 (version GGUF V3 (latest))
llama_model_loader: Dumping metadata keys/values. Note: KV overrides do not apply in this output.
llama_model_loader: - kv   0:                       general.architecture str              = llama
llama_model_loader: - kv   1:                               general.type str              = model
llama_model_loader: - kv   2:                               general.name str              = Mistral Nemo Instruct 2407
llama_model_loader: - kv   3:                            general.version str              = 2407
llama_model_loader: - kv   4:                           general.finetune str              = Instruct
llama_model_loader: - kv   5:                           general.basename str              = Mistral-Nemo
llama_model_loader: - kv   6:                         general.size_label str              = 12B
llama_model_loader: - kv   7:                            general.license str              = apache-2.0
llama_model_loader: - kv   8:                          general.languages arr[str,9]       = ["en", "fr", "de", "es", "it", "pt", ...
llama_model_loader: - kv   9:                          llama.block_count u32              = 40
llama_model_loader: - kv  10:                       llama.context_length u32              = 1024000
llama_model_loader: - kv  11:                     llama.embedding_length u32              = 5120
llama_model_loader: - kv  12:                  llama.feed_forward_length u32              = 14336
llama_model_loader: - kv  13:                 llama.attention.head_count u32              = 32
llama_model_loader: - kv  14:              llama.attention.head_count_kv u32              = 8
llama_model_loader: - kv  15:                       llama.rope.freq_base f32              = 1000000.000000
llama_model_loader: - kv  16:     llama.attention.layer_norm_rms_epsilon f32              = 0.000010
llama_model_loader: - kv  17:                 llama.attention.key_length u32              = 128
llama_model_loader: - kv  18:               llama.attention.value_length u32              = 128
llama_model_loader: - kv  19:                          general.file_type u32              = 2
llama_model_loader: - kv  20:                           llama.vocab_size u32              = 131072
llama_model_loader: - kv  21:                 llama.rope.dimension_count u32              = 128
llama_model_loader: - kv  22:            tokenizer.ggml.add_space_prefix bool             = false
llama_model_loader: - kv  23:                       tokenizer.ggml.model str              = gpt2
llama_model_loader: - kv  24:                         tokenizer.ggml.pre str              = tekken
llama_model_loader: - kv  25:                      tokenizer.ggml.tokens arr[str,131072]  = ["<unk>", "<s>", "</s>", "[INST]", "[...
llama_model_loader: - kv  26:                  tokenizer.ggml.token_type arr[i32,131072]  = [3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, ...
ERROR	[window.py | connection_error] Connection error
INFO	[connection_handler.py | reset] Resetting Alpaca's Ollama instance
INFO	[connection_handler.py | stop] Stopping Alpaca's Ollama instance
INFO	[connection_handler.py | stop] Stopped Alpaca's Ollama instance
INFO	[connection_handler.py | start] Starting Alpaca's Ollama instance...
INFO	[connection_handler.py | start] Started Alpaca's Ollama instance
2024/09/12 19:25:22 routes.go:1125: INFO server config env="map[CUDA_VISIBLE_DEVICES: GPU_DEVICE_ORDINAL: HIP_VISIBLE_DEVICES: HSA_OVERRIDE_GFX_VERSION: OLLAMA_DEBUG:true OLLAMA_FLASH_ATTENTION:false OLLAMA_HOST:http://127.0.0.1:11435 OLLAMA_INTEL_GPU:false OLLAMA_KEEP_ALIVE:5m0s OLLAMA_LLM_LIBRARY: OLLAMA_MAX_LOADED_MODELS:0 OLLAMA_MAX_QUEUE:512 OLLAMA_MODELS:/home/gareth/.var/app/com.jeffser.Alpaca/data/.ollama/models OLLAMA_NOHISTORY:false OLLAMA_NOPRUNE:false OLLAMA_NUM_PARALLEL:0 OLLAMA_ORIGINS:[http://localhost https://localhost http://localhost:* https://localhost:* http://127.0.0.1 https://127.0.0.1 http://127.0.0.1:* https://127.0.0.1:* http://0.0.0.0 https://0.0.0.0 http://0.0.0.0:* https://0.0.0.0:* app://* file://* tauri://*] OLLAMA_RUNNERS_DIR: OLLAMA_SCHED_SPREAD:false OLLAMA_TMPDIR: ROCR_VISIBLE_DEVICES:]"
time=2024-09-12T19:25:22.415+01:00 level=INFO source=images.go:753 msg="total blobs: 15"
time=2024-09-12T19:25:22.415+01:00 level=INFO source=images.go:760 msg="total unused blobs removed: 0"
time=2024-09-12T19:25:22.415+01:00 level=INFO source=routes.go:1172 msg="Listening on 127.0.0.1:11435 (version 0.3.9)"
INFO	[connection_handler.py | start] client version is 0.3.9
INFO	[window.py | show_toast] There was an error with the local Ollama instance, so it has been reset

@P-Jay357

I did manage to get Gemma 2 (27B) working once and it was really slow (as expected), but I can't get it or Nemo to work at all now, while Llama 3.1 (8B) still works fine. I'm not sure if it's VRAM-related, but I've also noticed in the resource monitor that the model stays in VRAM for quite some time before it gets flushed:

Screenshot from 2024-09-12 19-30-25

If I'm using Llama 3.1 (8B) then the VRAM stays at around 6 GB, even if I close Alpaca.

@Jeffser (Owner) commented Sep 12, 2024

Models are kept alive for 5 minutes by default; you can change that in the preferences.
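(OLLAMA_KEEP_ALIVE:5m0s in the log above is that same default.) A model can also be unloaded on demand through Ollama's API; a sketch against Alpaca's local instance, assuming the port 11435 shown in the log and a hypothetical model name (keep_alive: 0 tells Ollama to release the model immediately):

curl http://127.0.0.1:11435/api/generate -d '{"model": "mistral-nemo", "keep_alive": 0}'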

@P-Jay357

If Ollama crashes though, the VRAM doesn't go back down unless I shut down or restart my PC.

@P-Jay357

I just tested it now: I ran Mistral Nemo and Ollama crashed as I mentioned above. I waited 10 minutes and the VRAM still had not gone down. I then tried to close Alpaca, and after a few seconds I got the option to Force Quit as it wasn't responding. 10 minutes after that, the VRAM still has not gone down.

@ghost commented Sep 12, 2024

@P-Jay357 That's because the llama server is still running after Alpaca crashes. You need to kill its two processes, then you can use Alpaca again. I experienced crashes as described in #298.
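A hedged way to do that from a terminal; ollama_llama_server is the runner name visible in the log earlier in this thread, and flatpak kill tears down the whole sandbox if the app itself is hung:

flatpak kill com.jeffser.Alpaca
pkill -f ollama_llama_server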

@AlgorithmArtist commented Oct 2, 2024

Well, same issue here: while the RX 7900 XTX is listed as supported, I just cannot get Alpaca/Llama to use my GPU :(
alpaca-debug.txt
I have no idea why it wouldn't consider the ROCm library, which is even installed locally, and sometimes the log just straight up tells me that the GPU is not considered.
Maybe I am missing something obvious?
Using Alpaca 2.0.6

@Jeffser (Owner) commented Oct 3, 2024

Well, same issue here: while the RX 7900 XTX is listed as supported, I just cannot get Alpaca/Llama to use my GPU :( alpaca-debug.txt I have no idea why it wouldn't consider the ROCm library, which is even installed locally, and sometimes the log just straight up tells me that the GPU is not considered. Maybe I am missing something obvious? Using Alpaca 2.0.6

It seems like you don't have the extension installed (or it might be outdated).

You can check all the installed apps and extensions using flatpak list.
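E.g. (the AMD extension's exact ID isn't shown in this thread, so the filter below is just a convenient guess; whatever it's called, it should show up next to Alpaca itself):

flatpak list | grep -i alpaca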

@0chroma commented Oct 4, 2024

I also seem to still have issues with the extension: alpaca-debug.txt

The most relevant line seems to be this one:

time=2024-10-04T01:39:47.736-07:00 level=ERROR source=amd_linux.go:364 msg="amdgpu devices detected but permission problems block access" error="permissions not set up properly.  Either run ollama as root, or add you user account to the render group. open /dev/kfd: permission denied"

I added myself to the render group and logged out/in again, but no dice. I run an immutable OS, so may need to reboot.

Edit: after rebooting I'm happy to report that it's working for me ^^
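For anyone else hitting that permission error: the usual fix is the one from the log message, and group membership only takes effect on a fresh login (or, on immutable distros, apparently a reboot):

sudo usermod -aG render $USER
ls -l /dev/kfd    # sanity check: should be owned by group render and group-writable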

@olumolu (Contributor) commented Oct 4, 2024

If your model is big enough that the VRAM required exceeds your GPU's VRAM, Ollama will use the CPU instead of the GPU.

@Jeffser (Owner) commented Oct 13, 2024

Hi, I have a small update on AMD support: I added this indicator to the preferences dialog.

image

8c98be6


Also, if you run a model that's too big for your VRAM (or RAM, if you are using the CPU), instead of just giving a generic crash notification it will say "Model request too large for system".

image

115e22e

Also, happy 100 comments to this issue 🎉

@AlgorithmArtist

This has fixed all issues at once, you are amazing! ♥️

For everyone still having issues: check that you have the Alpaca AMD Support Flatpak extension installed.

@Jeffser (Owner) commented Oct 14, 2024

Final Update (I guess)

I added a link to the repo wiki; I'm working on writing it.

image

@ndonkersloot

I'm not sure if I'm missing something, but it still seems to use the CPU in my case. I'm running Fedora Silverblue 41 on a Lenovo T14 with an AMD GPU.

The application and extension are installed and up-to-date.

Screenshot From 2024-11-04 21-17-01
Screenshot From 2024-11-04 21-16-10
Screenshot From 2024-11-04 21-16-06
Screenshot From 2024-11-04 21-15-07

@frandavid100 (Author)

Yeah, the same thing happens to me. I have the application and extension installed but for some reason Alpaca is still using my CPU instead of my GPU.

@Jeffser (Owner) commented Nov 5, 2024

Not all GPUs are supported; please check this page.
