Slightly modify the detection of the number of CPU cores #44973
This might be a bit controversial, but 1. AMD already offers 48/56/64 (true) core systems (Milan, (*)), with up to 96 cores/CPU coming soon (Genoa), 2. Intel sells a few Xeon CPUs with >32 cores (**), and we shouldn't artificially limit core utilization on such systems. (*) https://en.wikipedia.org/wiki/Epyc (**) https://www.intel.com/content/www/us/en/products/details/processors/xeon/scalable/platinum/products.html
@nickitat Do you know if SMP/hyperthreading generally benefits/deteriorates performance? In case of the latter, it will be better to check programmatically if SMP is on and halve the max used cores only in this case.
looks like generally (e.g. if we take the whole set of clickbench queries) we should benefit from hyper threading. it was the case when we brought it back for 16 cores instances. it is because normal queries do have stalls on memory access and branch mispredictions. only heavy number crunching queries without lots of memory accesses (like Q29 from CB) could probably slow down, but it is a fairly marginal case.
Tnx, this makes sense. @alexey-milovidov Feel free to merge or advise further.
imo it is worth doing measurements anyway to be confident in what we are doing. e.g. launch all
Some funny measurements of ClickBench on a 128-vCore / 64 physical core instance (m6i.32xlarge):
(averaging 6 measurements each) So with HyperThreading, CPU consumption is surprisingly 6% worse. Of course, this could be because ClickBench is CPU-intensive. I nevertheless think that the code should be made aware of HT so we don't underutilize >32-core systems with HT disabled. We can keep the existing threshold though.
Hyperthreading cores don't increase performance linearly. If a query has a low number of cache misses and branch mispredictions, hyperthreading makes performance worse. Cache misses mainly appear while doing large GROUP BY, DISTINCT, IN, and JOIN. But on a very large number of cores, the memory becomes the bottleneck earlier than we start using all hyper-threaded cores. It will make sense to allow more cores by default, but queries should automatically stop using so many threads when concurrent queries start running. It can be done with #37285, but this setting is not enabled by default.
PS. SMT, not SMP.
porting ClickHouse/ClickHouse#44973 on Feb 20, 2023
AMD offers 48/56/64 core systems (Milan), and soon CPUs with up to 96 cores (Genoa). Intel also sells Xeon CPUs with >32 cores. If the system has >= 32 logical cores then the current logic is to use only half of them. If SMT (HyperThreading) is on, then ClickHouse will effectively utilize only the physical cores (i.e. "disable" HyperThreading by means of the software).
However, if SMT is disabled (logical == physical core count), which is not uncommon, then ClickHouse uses only half the available physical cores.
With this PR, the 32-core threshold is retained but the number of used cores is only reduced if SMT is on.
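The rule described above could be sketched roughly as follows. This is only an illustration of the decision logic, not the code in this PR: the names `SMT_HALVING_THRESHOLD`, `isSmtEnabled` and `chooseUsedCores` are hypothetical, and the sketch assumes the logical and physical core counts have already been obtained from the platform.

```cpp
// Hypothetical sketch of the core-count rule; names and signatures are
// illustrative assumptions, not ClickHouse's actual API.

// Threshold (in logical cores) at which halving applies, as stated in the PR text.
constexpr unsigned SMT_HALVING_THRESHOLD = 32;

// Assumption: SMT is considered "on" when the OS exposes more logical
// cores (hardware threads) than there are physical cores.
constexpr bool isSmtEnabled(unsigned logical_cores, unsigned physical_cores)
{
    return physical_cores > 0 && logical_cores > physical_cores;
}

// Returns how many cores to use by default.
constexpr unsigned chooseUsedCores(unsigned logical_cores, unsigned physical_cores)
{
    // Halve only on large systems, and only when SMT is actually on.
    if (logical_cores >= SMT_HALVING_THRESHOLD && isSmtEnabled(logical_cores, physical_cores))
        return logical_cores / 2;

    // Small systems, or large systems with SMT disabled: use every core.
    return logical_cores;
}
```

Under these assumptions, a 128-vCore / 64-physical-core machine yields 64 used cores, and a 64-core machine with SMT disabled also yields 64 instead of the previous 32.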
Changelog category (leave one):
Changelog entry (a user-readable short description of the changes that goes to CHANGELOG.md):
Remove the limitation that on systems with >=32 cores and SMT disabled ClickHouse uses only half of the cores