
Microsoft.AspNetCore.Hosting perf counters are slow #50412

Open · EgorBo opened this issue Aug 29, 2023 · 17 comments

Labels: area-networking, design-proposal, Perf

@EgorBo (Member) commented Aug 29, 2023

We're trying to benchmark a web app on a many-core system, and we're mainly interested in RPS metrics at the moment. We don't want to rely on client-side RPS metrics (bombardier or wrk, depending on OS) because there are multiple clients and they don't all start at exactly the same moment. It seems that the Microsoft.AspNetCore.Hosting perf counters can be used to calculate RPS on the ASP.NET side, which is what we need. Unfortunately, these counters appear to make everything roughly twice as slow when enabled.
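
For context, here is a minimal sketch of how such a server-side RPS reading could be taken in-process with the standard System.Diagnostics.Tracing.EventListener API. The "requests-per-second" counter is published by the hosting EventSource, and the payload keys ("Name", "Increment") follow the usual EventCounter conventions; treat the details as an assumption rather than the harness actually used here:

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics.Tracing;

// Sketch: subscribe to the Microsoft.AspNetCore.Hosting EventCounters
// in-process and print the per-second request rate.
sealed class HostingRpsListener : EventListener
{
    protected override void OnEventSourceCreated(EventSource source)
    {
        if (source.Name == "Microsoft.AspNetCore.Hosting")
        {
            // Ask the EventSource to publish counter snapshots once per second.
            EnableEvents(source, EventLevel.Informational, EventKeywords.None,
                new Dictionary<string, string?> { ["EventCounterIntervalSec"] = "1" });
        }
    }

    protected override void OnEventWritten(EventWrittenEventArgs e)
    {
        // EventCounter payloads arrive as a dictionary in Payload[0].
        if (e.EventName == "EventCounters" &&
            e.Payload?[0] is IDictionary<string, object> counter &&
            (string)counter["Name"] == "requests-per-second")
        {
            Console.WriteLine($"server-side RPS: {counter["Increment"]}");
        }
    }
}
```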

Basically, it can be simulated on our PerfLab with crank:

crank 
--config https://raw.githubusercontent.com/aspnet/Benchmarks/main/scenarios/json.benchmarks.yml 
--config https://raw.githubusercontent.com/aspnet/Benchmarks/main/build/azure.profile.yml 
--config https://raw.githubusercontent.com/aspnet/Benchmarks/main/scenarios/steadystate.profile.yml 
--config https://raw.githubusercontent.com/aspnet/Benchmarks/main/build/ci.profile.yml 
--scenario json 
--profile arm-lin-28-app 
--profile intel-load2-load 
--application.framework net8.0 
--application.collectDependencies true 
--application.options.collectCounters true 
--application.options.counterProviders "Microsoft.AspNetCore.Hosting"

The last line is the culprit. Is it possible to somehow achieve an overhead-free server-side RPS metric? The current suspect seems to be EventPipe:

[screenshot: trace showing EventPipe overhead]

Tried on Linux-x64, Linux-arm64, and Windows-x64 machines.

UPD: current suspects:

| Mode | RPS |
| --- | --- |
| No counters | 1M |
| Collect "System.Runtime" counters | 1M |
| Collect "Microsoft.AspNetCore.Hosting" counters | 0.48M |
| Collect "Microsoft.AspNetCore.Hosting" counters, EventPipeEnableStackwalk=0 | 0.65M |
| Collect "Microsoft.AspNetCore.Hosting" counters, EventPipeEnableStackwalk=0, without Interlocked | 0.75M |
| Collect "Microsoft.AspNetCore.Hosting" counters, EventPipeEnableStackwalk=0, without Interlocked, without lock in CounterAggregator | 0.8M |

@EgorBo (Member Author) commented Aug 29, 2023

also, cc @dotnet/dotnet-diag and @sebastienros

@EgorBo (Member Author) commented Aug 29, 2023

Also, could CPU cache contention be an issue here? https://github.com/dotnet/aspnetcore/blob/main/src/Hosting/Hosting/src/Internal/HostingEventSource.cs#L55-L64

We perform three interlocked operations for each request.
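
For reference, the hot path in the linked file boils down to this shape (a simplified paraphrase of HostingEventSource, not a verbatim copy):

```csharp
using System.Threading;

// Simplified paraphrase of the hot path in HostingEventSource: three
// interlocked operations per request, all hitting fields that every
// core mutates.
internal sealed class RequestCounters
{
    private long _totalRequests;
    private long _currentRequests;

    public void RequestStart()
    {
        Interlocked.Increment(ref _totalRequests);   // 1: contended cache line
        Interlocked.Increment(ref _currentRequests); // 2: same cache line
    }

    public void RequestStop()
    {
        Interlocked.Decrement(ref _currentRequests); // 3: same cache line
    }
}
```

On a machine with many cores, every increment has to bounce the cache line holding these fields between cores, so the cost scales with contention rather than with the work done.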

@davmason (Member):

Can you set DOTNET_EventPipeEnableStackwalk=0 and try again? That will stop EventPipe from ever trying to collect managed stacks for events and should greatly improve performance. It means you won't get stacks for any events, but from what I see, that's OK for your scenario.
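
For anyone reproducing this, the variable can presumably be passed through crank's environment-variable job option (an assumption; this switch is not part of the original command above):

crank ... --application.environmentVariables DOTNET_EventPipeEnableStackwalk=0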

@EgorBo (Member Author) commented Aug 29, 2023

> Can you set DOTNET_EventPipeEnableStackwalk=0 and try again? […]

Thanks! It does improve things, but it's still slower than with just the System.Runtime counters (which don't affect RPS):

| Mode | RPS |
| --- | --- |
| Default | 1M |
| Collect counters, System.Runtime | 1M |
| Collect counters, Microsoft.AspNetCore.Hosting | 0.48M |
| Collect counters, Microsoft.AspNetCore.Hosting, EventPipeEnableStackwalk=0 | 0.65M |
| Collect counters, Microsoft.AspNetCore.Hosting, EventPipeEnableStackwalk=0, without Interlocked | 0.75M |
| Collect counters, Microsoft.AspNetCore.Hosting, EventPipeEnableStackwalk=0, without Interlocked, without lock in CounterAggregator | 0.8M |

@sebastienros (Member):

> We perform three interlocked operations for each request.

At the level of concurrency we have on these systems (many concurrent requests, and 80 cores on some of the ARM64 machines), that's probably the reason.

@EgorBo (Member Author) commented Aug 30, 2023

So, it seems the Interlocked counters do contribute, but there is still something else slowing us down here. I compiled Microsoft.AspNetCore.Hosting.dll locally with the counters removed and got +0.1M RPS (resulting in 0.75M RPS).

@EgorBo (Member Author) commented Aug 30, 2023

Windows-x64, with the Interlocked.* operations commented out (so the provider reports 0 RPS) and DOTNET_EventPipeEnableStackwalk=0:

[screenshot: benchmark output]

Traces for a random thread pool thread:

[screenshot: thread pool thread stack traces]

So ~20% of performance is still being lost somewhere.

@davidfowl (Member):

Are we fixing this for .NET 8?

@noahfalk (Member) commented Sep 6, 2023

What EventPipe providers do you have enabled in the Collect "Microsoft.AspNetCore.Hosting" counters case? There is both an EventSource named Microsoft.AspNetCore.Hosting and a Meter named "Microsoft.AspNetCore.Hosting". All Meters get reported in aggregate via the MetricsEventSource.

It's possible to enable both, but there is likely no reason to do so. Looking at the code in ASP.NET Core, you may also be paying the cost to track both sets of metrics even if you only enabled one of them:
https://source.dot.net/#Microsoft.AspNetCore.Hosting/Internal/HostingApplicationDiagnostics.cs,57 https://source.dot.net/#Microsoft.AspNetCore.Hosting/Internal/HostingApplicationDiagnostics.cs,353
It's possible to check whether they are enabled individually, but currently the code doesn't do that.

Given that you saw improvements from removing the locks in CounterAggregator, I assume you enabled the MetricsEventSource. Do you know what filter arguments were passed to select which Meters and Instruments to listen to? For example, if you enabled all instruments on the Microsoft.AspNetCore.Hosting Meter, then you also enabled the request-duration histogram, and each histogram update hits a different lock here: https://github.com/dotnet/runtime/blob/9c3f8b3727d9be4de483a1d725c2bda22f956688/src/libraries/System.Diagnostics.DiagnosticSource/src/System/Diagnostics/Metrics/ExponentialHistogramAggregator.cs#L176
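
As a concrete illustration of enabling only what's needed: when measuring in-process, a MeterListener can subscribe to the hosting Meter's counters while never enabling the duration histogram, so its lock is never taken. This is only a sketch; the Meter name comes from this thread, and filtering out histograms by instrument type is just one possible criterion:

```csharp
using System;
using System.Diagnostics.Metrics;

// Sketch: listen to counter instruments on the hosting Meter, but never
// enable the request-duration histogram, so its aggregation lock is
// never touched on the hot path.
var listener = new MeterListener();
listener.InstrumentPublished = (instrument, l) =>
{
    if (instrument.Meter.Name == "Microsoft.AspNetCore.Hosting" &&
        instrument is not Histogram<double>) // skip the duration histogram
    {
        l.EnableMeasurementEvents(instrument);
    }
};
listener.SetMeasurementEventCallback<long>(
    (instrument, measurement, tags, state) =>
        Console.WriteLine($"{instrument.Name}: {measurement}"));
listener.Start();
```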

> So ~20% of performance is still being lost somewhere.

If something not yet accounted for is taking 20% of the time, that should make the ETW traces look fairly different. Is the difference proving elusive, or does it just take some time to peel the next layer of the onion?

> Is it possible to somehow achieve an overhead-free server-side RPS metric?

Nothing is totally free, but I think we could get a metric with low enough overhead that it wouldn't be observable at the scale of this measurement. We can do some combination of:

  • turning off things not necessary for the measurement
  • better optimizing what is needed
  • as a last resort, creating a custom metric specifically tuned for perf

@JamesNK (Member) commented Sep 6, 2023

> It's possible to enable both, but there is likely no reason to do so. Looking at the code in ASP.NET Core, you may also be paying the cost to track both sets of metrics even if you only enabled one of them: source.dot.net/#Microsoft.AspNetCore.Hosting/Internal/HostingApplicationDiagnostics.cs,57 source.dot.net/#Microsoft.AspNetCore.Hosting/Internal/HostingApplicationDiagnostics.cs,353
> It's possible to check whether they are enabled individually, but currently the code doesn't do that.

Adding separate checks for metrics vs event counters in ASP.NET Core hosting isn't difficult. I'll do that and aim for .NET 8.
Edit: PR #50412

However, it won't help if someone enables both metrics and event counters, and I don't see why someone would want both. I think there is a task here to look at what .NET telemetry libraries and tooling do when they ask to listen for counters. I know dotnet-counters listens to both, but people aren't going to use that in production, so it doesn't matter. What about the .NET OpenTelemetry SDK? Or other counter-gathering tooling?

I don't think it's worth investing in optimizing the HostingEventSource's Interlocked.Increment calls. We should encourage people to move towards metrics.

@JamesNK (Member) commented Sep 7, 2023

My PR to enable metrics and event counters independently didn't have any impact on the benchmark: RPS is approximately 500K both before and after. That means application.options.counterProviders "Microsoft.AspNetCore.Hosting" is capturing both event counters and metrics counters. Supporting enabling them independently is a Good Thing and the PR should go into the product, but it has no impact here.

The performance of metrics incrementing values is out of our hands in aspnetcore; the perf-sensitive code is in System.Diagnostics.Metrics. There is a PR looking at it here: dotnet/runtime#91566

Something we can impact in aspnetcore is the event counters' performance. We could update the hosting event counters to use something other than Interlocked: either implement something similar to what dotnet/runtime#91566 is doing, or use ThreadLocal<long>. @sebastienros, I have a vague memory of you talking about a library that did this to improve ARM perf. Can you share details, or did I imagine that conversation?
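
For illustration, the usual shape of such a counter is striping across cache-line-padded per-processor slots. This is a sketch of the general technique only, not what dotnet/runtime#91566 or any particular library ships:

```csharp
using System;
using System.Runtime.InteropServices;
using System.Threading;

// Sketch: a striped counter. Each core mostly touches its own padded
// slot, so increments rarely contend; reads sum the slots.
internal sealed class StripedCounter
{
    [StructLayout(LayoutKind.Explicit, Size = 64)] // one cache line per slot
    private struct PaddedLong
    {
        [FieldOffset(0)] public long Value;
    }

    private readonly PaddedLong[] _slots = new PaddedLong[Environment.ProcessorCount];

    public void Increment()
    {
        // GetCurrentProcessorId can exceed ProcessorCount on some systems,
        // so clamp with a modulo.
        int i = Thread.GetCurrentProcessorId() % _slots.Length;
        Interlocked.Increment(ref _slots[i].Value);
    }

    public long Read()
    {
        long sum = 0;
        for (int i = 0; i < _slots.Length; i++)
            sum += Volatile.Read(ref _slots[i].Value);
        return sum;
    }
}
```

Each increment still lands atomically in exactly one slot, so the total stays exact; a read taken while requests are in flight merely sums the slots at slightly different instants, and at quiescence it is exact, which matters for the active-request discussion below.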

@danmoseley (Member) commented Sep 7, 2023

Are all of our counters required to be exact? Where a nearly accurate value is acceptable, we could use an algorithm similar to the one used here to avoid interlocked operations:

dotnet/runtime#84427
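
To make the trade-off concrete, the cheapest "nearly accurate" variant is a plain racy increment; this is only an illustration of the general idea, not necessarily what that PR implements:

```csharp
using System.Threading;

// Illustration of the "nearly accurate" idea: a plain non-atomic increment
// that can lose an occasional update under contention, in exchange for
// generating no interlocked cache-line traffic on the hot path.
internal sealed class ApproximateCounter
{
    private long _value;

    public void Increment() => _value++; // racy read-modify-write

    public long ReadApproximate() => Volatile.Read(ref _value);
}
```

The next two comments explain why that trade-off doesn't fit these particular counters.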

@JamesNK (Member) commented Sep 7, 2023

There are two counters on the hot path: total requests and active requests.

It would be surprising if the total request count weren't exact, and it would be a regression from current behavior.

The active request count is incremented when a request starts and decremented when it ends. It definitely needs to be exact; otherwise you could end up with an idle server reporting active requests, or even a negative number of active requests.

@noahfalk (Member) commented Sep 7, 2023

> Are all of our counters required to be exact? Where a nearly accurate value is acceptable, we could use an algorithm similar to the one used here to avoid interlocked operations: dotnet/runtime#84427

I think it depends on what people want to use them for. In the scenario here a small error might be acceptable; in others, people probably expect counts to be exact. I could easily imagine this being something tools opt into when it's appropriate for the scenario at hand, but I think it would confuse people if we introduced error bars by default.

@EgorBo @sebastienros, can you help me answer some of the questions in #50412 (comment)? It sounds like @JamesNK answered one of them: crank is enabling both the HostingEventSource EventCounters and the new Meters via the MetricsEventSource. Presumably you don't need both, and when you change the tool to disable one, the overhead will go down. If the goal is to get a good RPS measurement right now, the EventCounters probably have lower overhead at the moment. If the goal is to improve the perf of Meters as run by customers, then let's disable the EventCounters so we can isolate and work on just the Meter overhead.

@noahfalk (Member) commented Sep 7, 2023

> There is a PR looking at it here: #50412

Did you mean dotnet/runtime#91566?

@davidfowl (Member) commented Sep 7, 2023

Let's make sure our counters are exact. There are several options here to improve the performance of how we track them (@stephentoub mentioned these techniques in the other issue on runtime):

  • This is an option
  • Or something like this

We don't want to use the approximate counting algorithm here.
