System.Threading.Timer in NetStandard2.0 leaks threads after Azure VM Freeze #114991

Open
Kolky opened this issue Apr 24, 2025 · 2 comments
Labels
area-System.Threading untriaged New issue has not been triaged by the area owner

Kolky commented Apr 24, 2025

Description

We run an application as multiple instances (~40 pods) in Azure Kubernetes Service (AKS); together they connect a total of 20,000 IoT devices. The application has a timer (using System.Threading.Timer) that should fire once a minute to update some counters and metrics. The application was recently ported from .NET Framework 4.7.2 to .NET 8; the code that contains this timer lives in a class library targeting .NET Standard 2.0. We observe the timer's results through the counters and metrics on our own dashboard.

While running in AKS we see that Azure sometimes performs maintenance, in particular a "freeze" of the virtual machine (link). Azure describes it as: "The Virtual Machine is scheduled to pause for a few seconds. CPU and network connectivity may be suspended, but there's no impact on memory or open files."

Across our ~40 pods, over a span of 6 months, we have seen 5 cases where the System.Threading.Timer appears to have stopped firing. From the moment the timer stops, our Prometheus metrics show the pod's thread count and memory usage increasing continuously, as if something is leaking. All 5 cases can be traced back to a freeze. We suspect that the timer was due to fire during the freeze (which usually takes 7 to 9 seconds).

Is this a known bug in the NetStandard2.0 implementation of the System.Threading.Timer? Or is there something wrong in our code that can cause this?

Reproduction Steps

I cannot easily reproduce this because it requires an Azure VM freeze. If I could test it, the steps would be:

  1. Create a .NET 8 application that uses a .NET Standard 2.0 class library containing a timer (see TimerBasedActionTrigger below).
  2. Run multiple instances of it on a VM inside AKS.
  3. Schedule the timer so that it fires during a freeze of the VM.
  4. Trigger a freeze of the VM.
  5. Wait until the freeze has ended.
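
As a starting point for steps 1 to 3, here is a minimal sketch of a harness that uses the same one-shot, re-arm-in-finally pattern as the wrapper below and logs every tick together with the process thread count. The class name FreezeReproHarness and all of its members are hypothetical and not part of the original application. Pausing the process from another shell with `kill -STOP <pid>` and resuming it with `kill -CONT <pid>` is only an assumed approximation of a VM freeze and may not reproduce the same clock behaviour.

```csharp
using System;
using System.Diagnostics;
using System.Threading;

// Hypothetical repro harness, not taken from the original application.
public sealed class FreezeReproHarness : IDisposable
{
    private readonly object _gate = new object();
    private readonly TimeSpan _interval;
    private readonly Stopwatch _clock = Stopwatch.StartNew();
    private Timer _timer;
    private long _ticks;

    public FreezeReproHarness(TimeSpan interval)
    {
        _interval = interval;
    }

    // Number of callbacks that have started so far.
    public long Ticks => Interlocked.Read(ref _ticks);

    public void Start()
    {
        lock (_gate)
        {
            if (_timer != null) return;
            // Create the timer disabled, then arm it once, so the callback
            // can never observe a null _timer when it re-arms.
            _timer = new Timer(OnTick, null, Timeout.Infinite, Timeout.Infinite);
            _timer.Change(TimeSpan.Zero, Timeout.InfiniteTimeSpan);
        }
    }

    private void OnTick(object state)
    {
        try
        {
            long n = Interlocked.Increment(ref _ticks);
            int threads;
            using (Process p = Process.GetCurrentProcess())
            {
                threads = p.Threads.Count;
            }
            Console.WriteLine($"{_clock.Elapsed} tick={n} threads={threads}");
        }
        finally
        {
            // One-shot re-arm, as in the reported wrapper.
            lock (_gate)
            {
                _timer?.Change(_interval, Timeout.InfiniteTimeSpan);
            }
        }
    }

    public void Dispose()
    {
        lock (_gate)
        {
            _timer?.Dispose();
            _timer = null;
        }
    }
}
```

Running it with a short interval (for example one second), pausing the process for 7 to 9 seconds, and then checking whether the tick lines resume would show whether the pattern itself survives a pause of that length.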

Expected behavior

The timer keeps working independent of a CPU freeze.

Actual behavior

The timer seems to have stopped working: my counters and metrics are no longer updated, and the thread count keeps slowly going up.

Regression?

No response

Known Workarounds

No response

Configuration

The application runs on .NET 8. The timer is in a class library that targets .NET Standard 2.0. We're running Ubuntu 22.04 on x64.

Other information

Wrapper around our usage of the Timer

using System;
using System.Diagnostics;
using System.Threading;

namespace XXX
{
    // Runs an action repeatedly using a one-shot timer that is re-armed
    // after each run, so two runs never overlap.
    public class TimerBasedActionTrigger
    {
        private readonly object _timerLocker = new();
        private Timer _triggerTimer;
        private readonly Action _action;
        private readonly TimeSpan _triggerInterval;

        public TimerBasedActionTrigger(Action action, TimeSpan interval)
        {
            _action = action;
            _triggerInterval = interval;
        }

        public void Start()
        {
            lock (_timerLocker)
            {
                // Create the timer disabled, then fire it once immediately.
                _triggerTimer = new Timer(TriggerTimerCallback);
                _triggerTimer.Change(0, Timeout.Infinite);
            }
        }

        public void Stop()
        {
            lock (_timerLocker)
            {
                // Disables and drops the timer; note that it is not disposed.
                _triggerTimer?.Change(Timeout.Infinite, Timeout.Infinite);
                _triggerTimer = null;
            }
        }

        private void RestartTriggerTimer()
        {
            lock (_timerLocker)
            {
                // Schedule exactly one more callback after the interval.
                _triggerTimer?.Change(_triggerInterval, Timeout.InfiniteTimeSpan);
            }
        }

        private void TriggerTimerCallback(object state)
        {
            try
            {
                _action();
            }
            finally
            {
                // Re-arm even if _action throws. If _action never returns,
                // this line is never reached and the timer stays disarmed.
                RestartTriggerTimer();
            }
        }
    }
}
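
Because the wrapper only re-arms the timer after `_action` returns, a silent timer can mean either that the callback was never invoked or that an invocation never finished. The following is a minimal sketch of a helper for telling those two cases apart. The class InstrumentedAction and its members are hypothetical and not part of the original code, and the sketch assumes that the wrapped action is the only work done in the callback.

```csharp
using System;
using System.Diagnostics;
using System.Threading;

namespace XXX
{
    // Hypothetical diagnostic helper, not part of the original application.
    // Wraps an Action and records when it last started and how many runs
    // have started and completed.
    public sealed class InstrumentedAction
    {
        private readonly Action _inner;
        private long _started;
        private long _completed;
        private long _lastStartTimestamp;

        public InstrumentedAction(Action inner)
        {
            _inner = inner ?? throw new ArgumentNullException(nameof(inner));
        }

        public long Started => Interlocked.Read(ref _started);

        public long Completed => Interlocked.Read(ref _completed);

        // Greater than zero while a run has started but not yet returned.
        public long InFlight => Started - Completed;

        // Time since the most recent run started, or null if none has started.
        public TimeSpan? TimeSinceLastStart
        {
            get
            {
                if (Started == 0) return null;
                long delta = Stopwatch.GetTimestamp() - Interlocked.Read(ref _lastStartTimestamp);
                return TimeSpan.FromSeconds((double)delta / Stopwatch.Frequency);
            }
        }

        // Pass this method to TimerBasedActionTrigger instead of the raw action.
        public void Invoke()
        {
            Interlocked.Exchange(ref _lastStartTimestamp, Stopwatch.GetTimestamp());
            Interlocked.Increment(ref _started);
            try
            {
                _inner();
            }
            finally
            {
                Interlocked.Increment(ref _completed);
            }
        }
    }
}
```

A possible use is `new TimerBasedActionTrigger(instrumented.Invoke, TimeSpan.FromMinutes(1))`, with `InFlight` and `TimeSinceLastStart` exported as additional metrics. If `InFlight` stays above zero while `TimeSinceLastStart` keeps growing, the action itself is stuck and the timer was never re-armed; if `InFlight` is zero and `TimeSinceLastStart` grows well past the interval, the callback is not being invoked at all.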

dotnet-policy-service bot added the untriaged label Apr 24, 2025

Tagging subscribers to this area: @mangod9
See info in area-owners.md if you want to be subscribed.

Kolky commented May 15, 2025

Any updates?
