Hysteresis effect on threadpool hill-climbing #51935
Comments
The whole period of the square wave sounds an awful lot like it is around 49.7 days. That is how long it takes a 32-bit millisecond counter such as GetTickCount() to wrap around. In managed code GetTickCount is exposed as Environment.TickCount.

If I am right, the changeover dates would be (give or take a day or so due to timezones, what time of day the data was taken, etc.):

What is unclear to me is where the bug is. All the TickCount usages in the portable thread pool code appear at first glance to be doing the right thing. But if these dates match up, then this sort of thing is pretty clearly the cause. It could potentially be a different API from Environment.TickCount that is also derived from GetTickCount(). Hope this helps.
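As a quick sanity check of the 49.7-day figure, the arithmetic can be sketched as below. This is only an illustration: it assumes a 32-bit counter that ticks once per millisecond, and the variable names are made up for the example. A signed 32-bit view of such a counter changes sign every half period, which is roughly three and a half weeks.

```python
# Wrap-around arithmetic for a 32-bit millisecond tick counter.
# Assumption: one tick per millisecond, counter width of 32 bits.
MS_PER_DAY = 24 * 60 * 60 * 1000

# Time for the full 32-bit range to be exhausted.
wrap_period_days = 2**32 / MS_PER_DAY

# Time a signed 32-bit view of the counter spends in each sign phase.
half_period_days = 2**31 / MS_PER_DAY

print(f"full wrap: {wrap_period_days:.2f} days")       # full wrap: 49.71 days
print(f"one sign phase: {half_period_days:.2f} days")  # one sign phase: 24.86 days
```

A sign phase of about 24.86 days is consistent with the reported switch every 3-4 weeks.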
@KevinCathcart I can confirm the dates match the ones you provided.
Ok, found a bug that could definitely be causing this: runtime/src/libraries/System.Private.CoreLib/src/System/Threading/PortableThreadPool.cs Lines 328 to 335 in 59d9d0d
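currentTimeMs is Environment.TickCount. The if clause controls whether the hill climbing is run at all.
Therefore, while the tick count is negative and the prior and next time fields are still at their initial value of zero, the check fails and hill climbing does not run. The easy fix here is probably to use unsigned arithmetic; alternatively, initializing the two fields to Environment.TickCount probably also works.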
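The failure mode described above can be sketched roughly as follows. This is a simulation under stated assumptions, not the runtime's code: the exact shape of the check is not shown in this thread, so the comparison `now - prior >= next - prior` and all function names here are hypothetical, and Python integers are masked by hand to imitate 32-bit arithmetic.

```python
# Simulated 32-bit arithmetic; Python ints are unbounded, so we mask by hand.
def to_uint32(x):
    """Reinterpret an integer as an unsigned 32-bit value."""
    return x & 0xFFFFFFFF

def to_int32(x):
    """Reinterpret an integer as a signed 32-bit value (two's complement)."""
    x &= 0xFFFFFFFF
    return x - 0x100000000 if x & 0x80000000 else x

# Hypothetical form of the threshold check (an assumption for illustration).
def should_run_signed(now_ms, prior_ms, next_ms):
    """Compare elapsed time with the interval using signed intervals."""
    return to_int32(now_ms - prior_ms) >= to_int32(next_ms - prior_ms)

def should_run_unsigned(now_ms, prior_ms, next_ms):
    """Same check with both intervals treated as unsigned."""
    return to_uint32(now_ms - prior_ms) >= to_uint32(next_ms - prior_ms)

# Fresh process: both time fields start at zero, tick count is negative.
negative_tick = -1_000_000
signed_result = should_run_signed(negative_tick, 0, 0)      # False: check never passes
unsigned_result = should_run_unsigned(negative_tick, 0, 0)  # True: check passes

# With a positive tick count both variants agree.
positive_result = should_run_signed(1_000_000, 0, 0)        # True
```

In this sketch the signed variant stays false for the entire negative phase of the counter because the fields are only updated when the check passes, which matches the reported behavior of flipping once per sign phase.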
cc: @kouvel
Nice find, thanks! I'll put up a PR to fix it.
@KevinCathcart you passed the interview, start on Monday?
Updated the time intervals used in the hill climbing check to be unsigned to prevent wrap-around. This also prevents the check from failing when the current time is negative since the prior and next times are initialized to zero, and matches the previous implementation. Fixes dotnet#51935
In my estimation, essentially all uses of
Is it true that in a freshly started .NET process, |
Yes. It has been like that since .NET Framework 1.0, and the documentation explicitly mentions it. Nothing really broke here. I agree that it is a trap one has to be careful about.
We have noticed a periodic pattern in the threadpool hill-climbing logic, which uses either n-cores or n-cores + 20 threads, with a hysteresis effect that switches every 3-4 weeks.
The main visible impact is on performance results, for example JsonPlatform mean latency, but some scenarios are also impacted in throughput.
This happens independently of the runtime version, meaning that using an older runtime/aspnet/sdk doesn't change the "current" value of the TP threads.
It is also independent of the hardware, and happens on all machines (Linux only) on the same day. These machines have auto-updates disabled. It was observed on ARM64 (32 cores), AMD (48 cores), and Intel (28 cores) machines.
Disabling hill-climbing restores the better perf in this case, so it is believed that fixing this variation will actually have a negative impact on perf for these scenarios.