Partially improve support for `--cpus` from Docker CLI #23747

luhenry · 2019-04-04T22:25:53Z

This PR improves support for the Docker CLI `--cpus` parameter, which limits the amount of CPU time available to the container (ex: 1.8 means 180% CPU time, i.e. 90% of each core on 2 cores, 45% of each core on 4 cores, etc.). It covers two areas: the ThreadPool and the calculation of the CPU limit.

luhenry · 2019-04-04T22:26:38Z

This spins off part of #23398 for the parts which are not controversial and do not change the API behavior.

/cc @tmds

src/gc/unix/cgroup.cpp

When `--cpus` is set to a fractional value (ex: 1.499999999), it was previously rounded down to the smaller integer. In this example, the runtime would only try to take advantage of 1 CPU, leading to underutilization. Rounding up increases the pressure on the OS thread scheduler, but even in the worst-case scenario (`--cpus=1.000000001`, previously rounded to 1, now rounded to 2), we do not observe any overutilization of the CPU leading to performance degradation.
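As a rough illustration of the difference between truncating and rounding up, here is a minimal sketch. The function names are hypothetical and are not the runtime's own; it assumes the limit is expressed as a cgroup-style `quota` and `period` pair, treats a non-positive quota or period as "no limit", and uses `std::ceil` for clarity rather than whatever expression the patch itself uses.

```cpp
#include <cmath>
#include <cstdint>

// Hypothetical helpers for illustration only; not the runtime's functions.

// Previous behavior as described above: integer division drops the fraction,
// so a 1.5 CPU budget is reported as 1 CPU.
uint32_t TruncatedCpuLimit(int64_t quota, int64_t period)
{
    if (quota <= 0 || period <= 0)
        return UINT32_MAX; // assumption: treat as "no limit"
    int64_t cpus = quota / period;
    return cpus < static_cast<int64_t>(UINT32_MAX) ? static_cast<uint32_t>(cpus) : UINT32_MAX;
}

// Rounded-up behavior: any fractional budget counts as one more usable CPU.
uint32_t RoundedUpCpuLimit(int64_t quota, int64_t period)
{
    if (quota <= 0 || period <= 0)
        return UINT32_MAX; // assumption: treat as "no limit"
    double cpus = std::ceil(static_cast<double>(quota) / static_cast<double>(period));
    return cpus < static_cast<double>(UINT32_MAX) ? static_cast<uint32_t>(cpus) : UINT32_MAX;
}
```

With a quota of 150000 and a period of 100000 (a 1.5 CPU budget), the first helper yields 1 and the second yields 2.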

By taking the CPU limit into account when computing the CPU busy time, we ensure that the threadpool's heuristics do not compete with each other: one trying to allocate more threads to increase the CPU busy time, and the other trying to allocate fewer threads because adding more does not improve the throughput. Take the example of a system with 20 cores and a Docker container with `--cpus=2`. The total CPU capacity of the machine is 2000%, while the CPU limit is 200%. Because the OS scheduler would never allocate more than 200% of its total CPU budget to the container, the CPU busy time would never get over 200%. From `PAL_GetCpuBusyTime`, this would indicate that the threadpool threads are mostly doing non-CPU-bound work, meaning we could launch more threads.

sergiy-k · 2019-04-05T16:19:25Z

src/gc/unix/cgroup.cpp

-        cpu_count = quota / period;
-        if (cpu_count < UINT32_MAX)
+        cpu_count = (double) quota / period;
+        if (cpu_count < UINT32_MAX - 1)


Should this be '<='? Otherwise, if cpu_count is equal to (UINT32_MAX - 1), this function will return UINT32_MAX instead of (UINT32_MAX - 1), right?
In general, I agree with @tmds here that rounding up before the 'if' statement might make the code simpler. Would something like this work, or did I miss some details?

        // Calculate cpu count based on quota and round it up
        cpu_count = (double) quota / period + 0.999999999;
        *val = (cpu_count < UINT32_MAX) ? (uint32_t)cpu_count : UINT32_MAX;

luhenry · 2019-04-05

It's ok to return UINT32_MAX. I didn't adopt @tmds's approach, mostly to reduce code change noise.
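For readers who want to try the suggested "round up, then clamp" form, here is a self-contained sketch. The wrapping function, its name, its parameter types, and the zero-period guard are assumptions added for illustration; only the two statements in the middle come from the review comment.

```cpp
#include <cstdint>

// Hypothetical wrapper around the reviewer's suggested statements.
uint32_t RoundUpThenClamp(uint64_t quota, uint64_t period)
{
    if (period == 0)
        return UINT32_MAX; // assumption: guard added for this sketch only

    // Calculate cpu count based on quota and round it up
    double cpu_count = (double) quota / period + 0.999999999;
    return (cpu_count < UINT32_MAX) ? (uint32_t)cpu_count : UINT32_MAX;
}
```

For realistic budgets this gives 2 for a quota of 150000 over a period of 100000, and 1 when quota equals period. Very large ratios saturate at UINT32_MAX, which matches the reply that returning UINT32_MAX is acceptable.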

bourquep · 2019-07-10T19:18:28Z

In which dotnet core runtime version is this PR available?

jkotas · 2019-07-10T19:23:47Z

It is available in the .NET Core 3.0 preview6. https://dotnet.microsoft.com/download/dotnet-core/3.0

…r#23747)

* Round up the value of the CPU limit
* Teach the ThreadPool of CPU limits

Commit migrated from dotnet/coreclr@aea3b1a

luhenry requested review from stephentoub, jkotas, janvorli, Maoni0 and kouvel April 4, 2019 22:25

jkotas approved these changes Apr 4, 2019

janvorli approved these changes Apr 4, 2019

kouvel approved these changes Apr 4, 2019

tmds approved these changes Apr 5, 2019

tmds reviewed Apr 5, 2019

luhenry added 2 commits April 5, 2019 07:35

luhenry force-pushed the fix-gh22302-2 branch from 23eaa2e to fb36270 Compare April 5, 2019 14:36

sergiy-k reviewed Apr 5, 2019

luhenry merged commit aea3b1a into dotnet:master Apr 5, 2019

jkotas mentioned this pull request Jan 13, 2020

Environment.ProcessorCount incorrect reporting in containers in 3.1 dotnet/runtime#622

Closed
