
Using GGUF models efficiently #473

Answered by N1h1lv5
N1h1lv5 asked this question in Q&A
Sep 13, 2023 · 1 comment · 2 replies

This was the solution for me; I am running Windows with a conda env:

  1. Set the environment variables properly:

$Env:CMAKE_ARGS="-DLLAMA_CUBLAS=on"
$Env:FORCE_CMAKE="1"

  2. Check that the variables are set:

echo $Env:CMAKE_ARGS

  3. Uninstall the previous version of llama-cpp-python:

pip uninstall llama-cpp-python

  4. Install the proper version:

pip install llama-cpp-python==0.1.83 --no-cache-dir

Now GPU + CPU work together. Thanks @PromtEngineer
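
If the build picked up cuBLAS, loading a model should show BLAS = 1 in the system info printed at load time. As a minimal sketch of using the GPU offload from Python (the model path and layer count below are placeholders, not from this thread):

from llama_cpp import Llama

# Load a GGUF model and offload some layers to the GPU; the rest stay on the CPU.
# The model path is a placeholder - point it at your own GGUF file.
llm = Llama(
    model_path="./models/llama-2-7b.Q4_K_M.gguf",
    n_gpu_layers=35,  # 0 = CPU only; raise until you run out of VRAM
    n_ctx=2048,       # context window size
)

output = llm("Q: What is a GGUF file? A:", max_tokens=64)
print(output["choices"][0]["text"])
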

Replies: 1 comment 2 replies

Answer selected by N1h1lv5