-
Notifications
You must be signed in to change notification settings - Fork 2.7k
Partially improve support for --cpus
from Docker CLI
#23747
Conversation
In the case where `--cpus` is set to a value very close to the smaller integer (ex: 1.499999999), it would previously be rounded down. This would mean that the runtime would only try to take advantage of 1 CPU in this example, leading to underutilization. By rounding it up, we augment the pressure on the OS threads scheduler, but even in the worst case scenario (`--cpus=1.000000001` previously being rounded to 1, now rounded to 2), we do not observe any overutilization of the CPU leading to performance degradation.
By making sure we do take the CPU limits into account when computing the CPU busy time, we ensure we do not have the various heuristic of the threadpool competing with each other: one trying to allocate more threads to increase the CPU busy time, and the other one trying to allocate less threads because there adding more doesn't improve the throughput. Let's take the example of a system with 20 cores, and a docker container with `--cpus=2`. It would mean the total CPU usage of the machine is 2000%, while the CPU limit is 200%. Because the OS scheduler would never allocate more than 200% of its total CPU budget to the docker container, the CPU busy time would never get over 200%. From `PAL_GetCpuBusyTime`, this would indicate that we threadpool threads are mostly doing non-CPU bound work, meaning we could launch more threads.
cpu_count = quota / period; | ||
if (cpu_count < UINT32_MAX) | ||
cpu_count = (double) quota / period; | ||
if (cpu_count < UINT32_MAX - 1) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this be '<=' otherwise if cpu_count is equal to (UINT32_MAX - 1) then this function will return UINT32_MAX instead of (UINT32_MAX - 1), right?
In general, I agree with @tmds here that rounding up before the 'if' statement might make the code simple. Would something like this work or I missed some details?
// Calculate cpu count based on quota and round it up
cpu_count = (double) quota / period + 0.999999999;
*val = (cpu_count < UINT32_MAX) ? (uint32_t)cpu_count : UINT32_MAX;
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's ok to return UINT32_MAX
. I didn't adopt @tmds approach mostly to reduce code change noise.
In which dotnet core runtime version is this PR available? |
It is available in the .NET Core 3.0 preview6. https://dotnet.microsoft.com/download/dotnet-core/3.0 |
…r#23747) * Round up the value of the CPU limit In the case where `--cpus` is set to a value very close to the smaller integer (ex: 1.499999999), it would previously be rounded down. This would mean that the runtime would only try to take advantage of 1 CPU in this example, leading to underutilization. By rounding it up, we augment the pressure on the OS threads scheduler, but even in the worst case scenario (`--cpus=1.000000001` previously being rounded to 1, now rounded to 2), we do not observe any overutilization of the CPU leading to performance degradation. * Teach the ThreadPool of CPU limits By making sure we do take the CPU limits into account when computing the CPU busy time, we ensure we do not have the various heuristic of the threadpool competing with each other: one trying to allocate more threads to increase the CPU busy time, and the other one trying to allocate less threads because there adding more doesn't improve the throughput. Let's take the example of a system with 20 cores, and a docker container with `--cpus=2`. It would mean the total CPU usage of the machine is 2000%, while the CPU limit is 200%. Because the OS scheduler would never allocate more than 200% of its total CPU budget to the docker container, the CPU busy time would never get over 200%. From `PAL_GetCpuBusyTime`, this would indicate that we threadpool threads are mostly doing non-CPU bound work, meaning we could launch more threads. Commit migrated from dotnet/coreclr@aea3b1a
This focuses on better supporting Docker CLI's parameter
--cpus
, which limits the amount of CPU time available to the container (ex: 1.8 means 180% CPU time, ie on 2 cores 90% for each core, on 4 cores 45% on each core, etc.) in the case of the ThreadPool and in the case of calculating the CPU limit.