Description
We have an application running as multiple instances (~40 pods) in Azure Kubernetes Service (AKS) that connects a total of 20,000 IoT devices. The application has a timer (using System.Threading.Timer) that should fire once a minute to update some counters and metrics. The application was recently ported from .NET Framework 4.7.2 to .NET 8; the code containing this timer lives in a .NET Standard 2.0 class library. We observe the results of the timer through the counters and metrics on our self-built dashboard.
While running in AKS, we see that Azure sometimes performs maintenance, in particular a "freeze" of the virtual machine (link). Azure describes this event as: "The Virtual Machine is scheduled to pause for a few seconds. CPU and network connectivity may be suspended, but there's no impact on memory or open files."
Across our ~40 pods, over a span of 6 months, we have seen 5 cases where the System.Threading.Timer seems to have stopped working. From the moment the timer stops, our Prometheus metrics show the pod's thread count and memory usage increasing continuously, as if something is leaking. All 5 cases can be traced back to a moment when a freeze occurred. We suspect that the timer was scheduled to fire during the freeze (which usually takes 7 to 9 seconds).
Is this a known bug in the .NET Standard 2.0 implementation of System.Threading.Timer? Or is there something wrong in our code that could cause this?
Reproduction Steps
I cannot easily reproduce this because it requires an Azure VM freeze. If I could trigger one, I would test it as follows:
1. Create a .NET 8 application that uses a .NET Standard 2.0 class library containing a timer (see TimerBasedActionTrigger below).
2. Run it multiple times on a VM inside AKS.
3. Schedule the timer so that it fires during a freeze of the VM.
4. Trigger a freeze of the VM.
5. Wait until after the freeze.
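The steps above could be instrumented with a small probe along the following lines. This is only a sketch under assumptions: the class name TimerFreezeProbe, the 1-second period, and the "3 missed periods" stall threshold are all hypothetical and not taken from our application. It records when the timer callback last ran and prints the process thread count, so a stalled timer and a growing thread count would both show up in the log. Suspending the process locally (for example with SIGSTOP/SIGCONT) is at best a rough approximation of a VM freeze and may not behave the same way.

```csharp
using System;
using System.Diagnostics;
using System.Threading;

// Hypothetical probe: names, period and threshold are illustrative assumptions.
public static class TimerFreezeProbe
{
    private static long _lastTickTimestamp = Stopwatch.GetTimestamp();
    private static long _tickCount;

    // Number of times the timer callback has run so far.
    public static long TickCount => Interlocked.Read(ref _tickCount);

    // Seconds since the callback last ran, measured with Stopwatch ticks.
    public static double SecondsSinceLastTick()
    {
        long last = Interlocked.Read(ref _lastTickTimestamp);
        return (Stopwatch.GetTimestamp() - last) / (double)Stopwatch.Frequency;
    }

    private static void OnTick(object state)
    {
        Interlocked.Exchange(ref _lastTickTimestamp, Stopwatch.GetTimestamp());
        Interlocked.Increment(ref _tickCount);
    }

    // Starts a periodic System.Threading.Timer; the caller disposes it.
    public static Timer Start(TimeSpan period) =>
        new Timer(OnTick, null, period, period);

    public static void Main(string[] args)
    {
        // Optional first argument bounds the number of log lines (default: run until killed).
        int iterations = args.Length > 0 ? int.Parse(args[0]) : int.MaxValue;
        TimeSpan period = TimeSpan.FromSeconds(1); // once a minute in the real application

        using (Timer timer = Start(period))
        {
            for (int i = 0; i < iterations; i++)
            {
                Thread.Sleep(period);
                double idle = SecondsSinceLastTick();
                int threads;
                using (Process self = Process.GetCurrentProcess())
                {
                    threads = self.Threads.Count;
                }
                // Flag the timer as stalled once three periods pass without a callback.
                string status = idle > 3 * period.TotalSeconds ? "STALLED" : "ok";
                Console.WriteLine(
                    $"{DateTime.UtcNow:O} ticks={TickCount} idle={idle:F1}s threads={threads} {status}");
            }
        }
    }
}
```

Running this next to a freeze and comparing the log timestamps with the freeze window would show whether the callback stops for good or merely fires late.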
Expected behavior
The timer keeps working independent of a CPU freeze.
Actual behavior
The timer seems to have stopped working: our counters and metrics are no longer updated, and the thread count keeps slowly going up.
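One pattern that can produce both symptoms without any runtime bug is a periodic callback that blocks. System.Threading.Timer does not wait for a callback to finish before starting the next one, so if a callback hangs (for example on a call that never returns after network connectivity was suspended), each later period can occupy another thread-pool thread while the counters stop updating. Whether this applies here is unknown. The sketch below is not the TimerBasedActionTrigger from this report; NonOverlappingTimer is a hypothetical wrapper that uses a one-shot timer and re-arms it only after the action returns, so at most one callback runs at a time.

```csharp
using System;
using System.Threading;

// Hypothetical wrapper for illustration; not the code from this report.
public sealed class NonOverlappingTimer : IDisposable
{
    private readonly Action _action;
    private readonly TimeSpan _interval;
    private readonly Timer _timer;
    private int _disposed;

    public NonOverlappingTimer(Action action, TimeSpan interval)
    {
        _action = action ?? throw new ArgumentNullException(nameof(action));
        _interval = interval;
        // Create the timer disarmed so _timer is assigned before the first callback can run.
        _timer = new Timer(OnTick, null, Timeout.InfiniteTimeSpan, Timeout.InfiniteTimeSpan);
        // One-shot: fire once after 'interval', with no automatic period.
        _timer.Change(interval, Timeout.InfiniteTimeSpan);
    }

    private void OnTick(object state)
    {
        try
        {
            _action();
        }
        catch (Exception ex)
        {
            // An unhandled exception on a thread-pool thread would terminate the process.
            Console.Error.WriteLine($"Timer action failed: {ex}");
        }
        finally
        {
            // Re-arm only after the action has finished, so callbacks never overlap.
            if (Volatile.Read(ref _disposed) == 0)
            {
                try { _timer.Change(_interval, Timeout.InfiniteTimeSpan); }
                catch (ObjectDisposedException) { /* disposed concurrently; nothing to re-arm */ }
            }
        }
    }

    public void Dispose()
    {
        if (Interlocked.Exchange(ref _disposed, 1) == 0)
        {
            _timer.Dispose();
        }
    }
}
```

The trade-off is that the interval is measured from the end of one run to the start of the next, so the schedule drifts by the duration of the action. A blocked action still stops the updates with this wrapper, but it holds a single thread instead of one more per period.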
Regression?
No response
Known Workarounds
No response
Configuration
The application runs on .NET 8. The timer is in a class library that targets .NET Standard 2.0. We're using Ubuntu 22.04 on x64.
Other information
Wrapper around our usage of the Timer