even after offloading all layers to gpu, is it normal for CPU to use up all threads at 100% utilization? #3210
Replies: 2 comments 2 replies
-
When offloading all layers, you usually want to set threads to 1 or a low value. The way llama.cpp uses threads, they basically spin, eating 100% CPU even when they have no work to do. I think a behavior was recently added to set threads to 1 when all layers are offloaded (not 100% sure it got merged). If so, you could try making sure you're not specifying `-t` yourself.
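As a concrete sketch (the binary name and model path below are placeholders, and exact flag behavior may vary by build), a fully offloaded run with the thread count pinned low would look something like:

```shell
# Offload all layers to the GPU (-ngl with a value >= the model's layer
# count) and set -t 1 so idle CPU worker threads don't spin at 100%.
# ./main and model.gguf are placeholders for your binary and model file.
./main -m model.gguf -ngl 99 -t 1 -p "Hello"
```

With full offload the CPU threads have almost nothing to compute, so any value above 1 mostly just adds busy-waiting.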
-
@KerfuffleV2 thanks for explaining! I didn't specify `-t` and it's using all cores. It should really set t = 1 automatically.
-
I thought I had already offloaded everything to the GPU (which I did), so why is the CPU still being used?