Closed as not planned
Labels
hardware: Hardware specific issue
Description
The current default value is cpu_count/2:
llama-cpp-python/llama_cpp/llama.py
Line 102 in b2a24bd
self.n_threads = n_threads or max(multiprocessing.cpu_count() // 2, 1)
This value does not seem optimal for multicore systems. For example, on an 8-core CPU, 4 cores sit idle. Put simply, we get roughly a twofold slowdown (assuming there are no other nuances in model execution).
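A minimal sketch of the default's arithmetic, to make the complaint concrete. `multiprocessing.cpu_count()` reports *logical* CPUs, so on an SMT/hyperthreaded machine the halving approximates the physical core count, while on a CPU without SMT it leaves half the cores unused. (The helper name `default_n_threads` is mine, not part of the library; only the `max(... // 2, 1)` expression comes from `llama.py`.)

```python
import multiprocessing


def default_n_threads(logical_cpus: int) -> int:
    """Replicates llama-cpp-python's default: half the logical CPUs, at least 1."""
    return max(logical_cpus // 2, 1)


# 8 physical cores with SMT -> 16 logical CPUs: the default lands on the
# 8 physical cores, which is usually what you want for llama.cpp inference.
print(default_n_threads(16))  # 8

# 8 physical cores without SMT -> 8 logical CPUs: the default uses only 4,
# leaving half the machine idle -- the situation this issue describes.
print(default_n_threads(8))  # 4

# Degenerate single-CPU case still yields one worker thread.
print(default_n_threads(1))  # 1

# On the current machine:
print(default_n_threads(multiprocessing.cpu_count()))
```

Whether the default is "wrong" therefore depends on whether the host has SMT, which `cpu_count()` alone cannot tell you; that ambiguity is presumably part of the motivation being asked about here.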
Related issues: #71
In this discussion I would like to understand the motivation for this default value, since it does not seem obvious to most users.