Fix CounterGroup timer to use Stopwatch instead of DateTime.UtcNow #127303

Open
unsafePtr wants to merge 2 commits into dotnet:main from unsafePtr:fix/countergroup-stopwatch
Conversation

@unsafePtr
Background

DateTime.UtcNow can jump due to NTP sync, so the elapsed time reported to EventCounter subscribers can be incorrect for that interval. This skews rate calculations such as requests/sec in monitoring dashboards. Stopwatch is monotonic and not subject to clock adjustments.

CounterGroup is only directly referenced from DiagnosticCounter. DiagnosticCounter is the base class of EventCounter, PollingCounter, IncrementingPollingCounter, and IncrementingEventCounter, so all of them are affected.
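
To make the effect on rate calculations concrete, here is a minimal sketch. The `RatePerSecond` helper and the numbers are illustrative assumptions, not code from CounterGroup, and `Stopwatch.GetElapsedTime` requires .NET 7 or later.

```csharp
using System;
using System.Diagnostics;

// Hypothetical helper: the kind of rate a dashboard derives from a counter payload.
static double RatePerSecond(long count, TimeSpan elapsed) => count / elapsed.TotalSeconds;

// 100 requests arrive during a true 1-second interval.
TimeSpan trueElapsed = TimeSpan.FromSeconds(1);
// If the wall clock is stepped forward by 1 second mid-interval,
// a DateTime.UtcNow difference reports 2 seconds instead.
TimeSpan skewedElapsed = trueElapsed + TimeSpan.FromSeconds(1);

double correctRate = RatePerSecond(100, trueElapsed);  // 100 requests/sec
double skewedRate = RatePerSecond(100, skewedElapsed); // 50 requests/sec

// Measuring with Stopwatch timestamps is not affected by clock adjustments.
long start = Stopwatch.GetTimestamp();
TimeSpan monotonicElapsed = Stopwatch.GetElapsedTime(start);

Console.WriteLine($"correct: {correctRate}/s, skewed: {skewedRate}/s");
```

A backwards clock step is worse still: the wall-clock difference can become zero or negative, which makes the derived rate meaningless for that interval.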

dotnet-policy-service bot added the community-contribution label on Apr 22, 2026
- _nextPollingTimeStamp = DateTime.UtcNow + new TimeSpan(0, 0, (int)pollingIntervalInSeconds);
+ long now = Stopwatch.GetTimestamp();
+ _timeStampSinceCollectionStarted = now;
+ _nextPollingTimeStamp = now + (long)(Stopwatch.Frequency * pollingIntervalInSeconds);
Member

@tannergooding, Apr 23, 2026

This is notably a change in behavior: previously it treated 0.9f as 0 (float-to-integer conversions truncate). Now it will instead track the interval at the precision of the value supplied. Is this change correct/expected?

Author

A bit above, the following lines produce the value that is passed to EnableTimer:

if (!e.Arguments.TryGetValue("EventCounterIntervalSec", out string? valueStr)
|| !float.TryParse(valueStr, out float intervalValue))

(int)0.9f == 0, so with EventCounterIntervalSec=0.9 the timer would fire immediately, which produces a garbage first data point when sub-second intervals are used.

The existing test was failing before this change
https://github.com/dotnet/runtime/blob/main/src/libraries/System.Diagnostics.Tracing/tests/BasicEventSourceTest/TestEventCounter.cs#L127
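
The truncation under discussion can be reproduced in isolation. This is a small sketch, not the CounterGroup code itself: the variable names are made up for illustration, and `TimeSpan.FromSeconds` is shown only as one way to keep the fraction.

```csharp
using System;

float intervalValue = 0.9f; // e.g. parsed from EventCounterIntervalSec=0.9

// Old shape: the (int) cast truncates toward zero, so the first deadline is "now".
TimeSpan truncated = new TimeSpan(0, 0, (int)intervalValue);

// Fraction-preserving alternative (an assumption for illustration).
TimeSpan precise = TimeSpan.FromSeconds(intervalValue);

// The same cast turns a 1.5 s request into a 1 s first interval.
TimeSpan truncatedOneAndHalf = new TimeSpan(0, 0, (int)1.5f);

Console.WriteLine($"truncated: {truncated}, precise: {precise}");
Console.WriteLine($"1.5 s truncated: {truncatedOneAndHalf}");
```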

Author

@unsafePtr, Apr 23, 2026

Re-checked the code. There probably won't be values below 1s, but if, say, 1.5 is submitted, the old behaviour truncates the first interval to 1. All subsequent intervals keep the requested interval of 1.5. One data point is not statistically significant, but if we can fix it, why not?

Member

I'm not specifying here whether the change is right or wrong, just that it's something that was likely overlooked and so should be carefully considered and potentially documented as part of the change here.

Even if the first data point was somewhat faulty, changing from "fire immediately" to "wait 900ms to fire" is a substantial change and likely needs input from those on the dotnet/area-system-diagnostics team.

  _timeStampSinceCollectionStarted = now;
- TimeSpan delta = now - _nextPollingTimeStamp;
- delta = _pollingIntervalInMilliseconds > delta.TotalMilliseconds ? TimeSpan.FromMilliseconds(_pollingIntervalInMilliseconds) : delta;
+ long intervalTicks = (long)((double)Stopwatch.Frequency * _pollingIntervalInMilliseconds / 1000);
Member

This can potentially have quirks for high frequency rates (although such frequencies are unexpected/unlikely) and the conversion to double in particular seems unnecessary here since double->long just truncates, which is how integer division already works.

I'm not sure whether you want an explicit Round operation (0.6 -> 1 instead of 0.6 -> 0) or whether this should just be Frequency * (_pollingIntervalInMilliseconds / 1000), which avoids the extra cost/complexity.
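
The variants discussed here can be compared side by side. The sketch below uses a made-up frequency of 1 GHz (the real `Stopwatch.Frequency` varies by platform) and assumes the interval is stored as an integer millisecond count. Under that assumption the order of operations matters: dividing first discards the sub-second part.

```csharp
using System;

long frequency = 1_000_000_000; // hypothetical ticks per second
int intervalMs = 1500;          // assumed integer field, 1.5 s

// Shape from the diff: go through double, then truncate back to long.
long viaDouble = (long)((double)frequency * intervalMs / 1000);

// Pure integer math, multiply first: same result without the double round-trip.
long multiplyThenDivide = frequency * intervalMs / 1000;

// Pure integer math, divide first: 1500 / 1000 == 1, so the 500 ms is lost.
long divideThenMultiply = frequency * (intervalMs / 1000);

Console.WriteLine($"{viaDouble} {multiplyThenDivide} {divideThenMultiply}");
```

Multiplying first keeps the fraction but moves closer to overflow for very large frequencies or intervals, which is one of the quirks a TimeSpan-based representation sidesteps.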

Author

I probably overlooked this, since in other places I am not casting Stopwatch.Frequency to double.

- DateTime now = DateTime.UtcNow;
- if (counterGroup._nextPollingTimeStamp < now + new TimeSpan(0, 0, 0, 0, 1))
+ long now = Stopwatch.GetTimestamp();
+ if (counterGroup._nextPollingTimeStamp < now + Stopwatch.Frequency / 1000)
Member

It feels like a lot of this complexity could be avoided if this kept using TimeSpan, which normalizes to 100ns units and makes it easier to work with concepts like seconds.

Then you really only need to get the start/now timestamps from Stopwatch and use Stopwatch.GetElapsedTime when comparing that to the tracked intervals.
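
One way this suggestion could look is sketched below. It is not the code that was pushed to the PR: the `IsDue` helper and the variable names are hypothetical, the 1 ms tolerance mirrors the slack in the diff above, and `Stopwatch.GetElapsedTime` requires .NET 7 or later.

```csharp
using System;
using System.Diagnostics;

// Hypothetical helper: a poll is due once the elapsed time is within
// `tolerance` of the interval.
static bool IsDue(TimeSpan elapsed, TimeSpan interval, TimeSpan tolerance)
    => elapsed + tolerance > interval;

TimeSpan pollingInterval = TimeSpan.FromSeconds(1.5); // fractional seconds survive
TimeSpan tolerance = TimeSpan.FromMilliseconds(1);

// Only the raw timestamp comes from Stopwatch; everything else stays in TimeSpan.
long lastPoll = Stopwatch.GetTimestamp();
TimeSpan elapsed = Stopwatch.GetElapsedTime(lastPoll);

Console.WriteLine(IsDue(elapsed, pollingInterval, tolerance));
```

Keeping the interval as a TimeSpan means no manual conversion between Stopwatch ticks and milliseconds is needed, so the truncation and double-cast questions raised earlier in the review do not come up.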

Author

Pushed update

Labels

area-System.Diagnostics.Tracing, community-contribution

2 participants