Expected behavior
Since cortex-cpp is running a local LLM with the CUDA toolkit, it should do most of its processing on the GPU and keep CPU usage low.
Desktop
OS: Linux
Additional context
The logs indicate that 32 out of 33 layers are offloaded to the GPU, but 1 layer is still processed on the CPU. This behavior will be investigated further.
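To check the offload count without reading the logs by hand, a small script can pull the numbers out of the load message. This is a minimal sketch, not part of cortex-cpp: the log line format is an assumption modeled on llama.cpp-style load output, and `cpu_layers` and `sample` are hypothetical names used only for illustration.

```python
import re

# Assumed log format (llama.cpp-style); the actual cortex-cpp line may differ.
OFFLOAD_RE = re.compile(r"offloaded\s+(\d+)\s*/\s*(\d+)\s+layers\s+to\s+GPU")


def cpu_layers(log_text):
    """Return how many layers were not offloaded, or None if no offload line is found."""
    match = OFFLOAD_RE.search(log_text)
    if match is None:
        return None
    offloaded, total = int(match.group(1)), int(match.group(2))
    return total - offloaded


# Hypothetical sample line matching the counts reported in this issue.
sample = "llm_load_tensors: offloaded 32/33 layers to GPU"
print(cpu_layers(sample))  # prints 1
```

A result greater than zero means at least one layer is still running on the CPU, which matches the 32 of 33 reported here.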
Van-QA
changed the title
bug: Cortex-cpp continues to have 1 layer offload to CPU why using GPU
bug: Cortex-cpp continues to have 1 layer offload to CPU while using GPU
Jun 20, 2024
Describe the bug
When generating responses with a local LLM, cortex-cpp still appears to use the CPU.
https://discord.com/channels/1107178041848909847/1149558035971321886/1253148982188838954
To Reproduce