Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Exception middlewares introduce duplicates of error.type diagnostic tags in some situations #55159

Closed
1 task done
hacst opened this issue Apr 16, 2024 · 2 comments · Fixed by #55211
Closed
1 task done
Labels
area-middleware Includes: URL rewrite, redirect, response cache/compression, session, and other general middlesware area-networking Includes servers, yarp, json patch, bedrock, websockets, http client factory, and http abstractions

Comments

@hacst
Copy link

hacst commented Apr 16, 2024

Is there an existing issue for this?

  • I have searched the existing issues

Describe the bug

In some specific situations exceptions can cause the Microsoft.AspNetCore.Hosting,http.server.request.duration metric to duplicate error.type tags. One such situation is throwing an exception in a handler after a websocket connection is closed, but there might be others.

This is devastating if downstream systems cannot handle tag duplication. E.g. with prometheus once tags are duplicated scraping fails completely, marking the service as down in prometheus. The only way to recover from this is restarting the service (until the issue reoccurs).

I found open-telemetry/opentelemetry-dotnet#5199 reported against OTEL which sounds like it might describe the same problem caused in other ways.

Expected Behavior

Regardless of in which situation an exception is thrown, the tags on the metrics should not be duplicated. Looking at the code this seems to be the intent of the implementation to begin with.

Steps To Reproduce

  1. Run the example application below using dotnet run
var builder = WebApplication.CreateBuilder(args);
var app = builder.Build();
app.UseExceptionHandler(options => {}); // Exception handling middleware introduces duplicate tag
app.UseWebSockets();

app.MapGet("/", async (HttpContext context, CancellationToken ct) =>
{
    if (context.WebSockets.IsWebSocketRequest)
    {
        using var ws = await context.WebSockets.AcceptWebSocketAsync();
        await ws.CloseAsync(System.Net.WebSockets.WebSocketCloseStatus.NormalClosure, "done", ct);
        throw new InvalidOperationException("Throw after websocket request completion to produce the bug");
    }
});
app.Run();
  1. Attach using dotnet-counters collect -n DuplicateTagRepro Microsoft.AspNetCore.Hosting (or any other tooling that allows seing the Microsot.AspNetCore.Hosting metrics.
  2. Use a websocket client to request the endpoint. E.g. wscat --connect localhost:5221/
  3. Observe that metrics now show Microsoft.AspNetCore.Hosting,http.server.request.duration metrics with duplicated error.type=System.InvalidOperationException tags

E.g. Microsoft.AspNetCore.Hosting,http.server.request.duration (s)[error.type=System.InvalidOperationException;error.type=System.InvalidOperationException;error.type=System.InvalidOperationException;http.request.method=GET;http.response.status_code=101;http.route=/;network.protocol.version=1.1;url.scheme=http;Percentile=50]

You will see one duplication per exception middleware. E.g. when running in Development mode the automatically added developer exception middleware will introduce another duplication.

Exceptions (if any)

As expected (and correct), triggering the exception causes one or more exceptions related to unhandled exceptions:

fail: Microsoft.AspNetCore.Diagnostics.DeveloperExceptionPageMiddleware[1]
      An unhandled exception has occurred while executing the request.
      System.InvalidOperationException: Throw after websocket request completion
         at Program.<>c.<<<Main>$>b__0_1>d.MoveNext() in C:\dev\experiments\DuplicateTagRepro\Program.cs:line 20
      --- End of stack trace from previous location ---
         at Microsoft.AspNetCore.Diagnostics.ExceptionHandlerMiddlewareImpl.<Invoke>g__Awaited|10_0(ExceptionHandlerMiddlewareImpl middleware, HttpContext context, Task task)
         at Microsoft.AspNetCore.Diagnostics.ExceptionHandlerMiddlewareImpl.HandleException(HttpContext context, ExceptionDispatchInfo edi)
         at Microsoft.AspNetCore.Diagnostics.ExceptionHandlerMiddlewareImpl.<Invoke>g__Awaited|10_0(ExceptionHandlerMiddlewareImpl middleware, HttpContext context, Task task)
         at Microsoft.AspNetCore.Diagnostics.DeveloperExceptionPageMiddlewareImpl.Invoke(HttpContext context)
fail: Microsoft.AspNetCore.Server.Kestrel[13]
      Connection id "0HN2U17KD956D", Request id "0HN2U17KD956D:00000001": An unhandled exception was thrown by the application.
      System.InvalidOperationException: Throw after websocket request completion
         at Program.<>c.<<<Main>$>b__0_1>d.MoveNext() in C:\dev\experiments\DuplicateTagRepro\Program.cs:line 20
      --- End of stack trace from previous location ---
         at Microsoft.AspNetCore.Diagnostics.ExceptionHandlerMiddlewareImpl.<Invoke>g__Awaited|10_0(ExceptionHandlerMiddlewareImpl middleware, HttpContext context, Task task)
         at Microsoft.AspNetCore.Diagnostics.ExceptionHandlerMiddlewareImpl.HandleException(HttpContext context, ExceptionDispatchInfo edi)
         at Microsoft.AspNetCore.Diagnostics.ExceptionHandlerMiddlewareImpl.<Invoke>g__Awaited|10_0(ExceptionHandlerMiddlewareImpl middleware, HttpContext context, Task task)
         at Microsoft.AspNetCore.Diagnostics.DeveloperExceptionPageMiddlewareImpl.Invoke(HttpContext context)
         at Microsoft.AspNetCore.Diagnostics.DeveloperExceptionPageMiddlewareImpl.Invoke(HttpContext context)
         at Microsoft.AspNetCore.Server.Kestrel.Core.Internal.Http.HttpProtocol.ProcessRequests[TContext](IHttpApplication`1 application)

.NET Version

8.0.200 or 9.0.100-preview.3.24204.13

Anything else?

I reproduced this in dotnet 8.0.200 and the current preview 9.0.100-preview.3.24204.13.

At least one part in this duplication seems to be the unexpected presence of the exception in HostingMetrics.RequestEnd
https://github.com/dotnet/aspnetcore/blob/8486d31e24f30e3fa1809a95699a0adc16f448d7/src/Hosting/Hosting/src/Internal/HostingMetrics.cs#L72C1-L77C14
The comment there states, that if ExceptionHandlerMiddleware or DeveloperExceptionMiddleware is used, the exception variable should be null. However this is not the case with the reproducer above.

Other instances of the tag are introduced in the ExceptionHandlerMiddlewareImpl in

DiagnosticsTelemetry.ReportUnhandledException(_logger, context, edi.SourceException);

and DeveloperExceptionPageMiddlewareImpl in
DiagnosticsTelemetry.ReportUnhandledException(_logger, context, ex);

Which call DiagnosticsTelemetry.ReportUnhandledException which adds the error.type tag into the IHttpMetricsTagsFeature Tags list. This is later given to HostingMetrics.RequestEnd as customTag entries.

This way a single record can have three identical error.type entries. One for each middleware and one by the RequestEnd function.

@dotnet-issue-labeler dotnet-issue-labeler bot added the area-networking Includes servers, yarp, json patch, bedrock, websockets, http client factory, and http abstractions label Apr 16, 2024
@amcasey amcasey added the area-middleware Includes: URL rewrite, redirect, response cache/compression, session, and other general middlesware label Apr 18, 2024
@amcasey
Copy link
Member

amcasey commented Apr 18, 2024

That seems bad. @JamesNK may have thoughts on how we expected to handle this.

@JamesNK
Copy link
Member

JamesNK commented Apr 19, 2024

Fix here: #55211

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area-middleware Includes: URL rewrite, redirect, response cache/compression, session, and other general middlesware area-networking Includes servers, yarp, json patch, bedrock, websockets, http client factory, and http abstractions
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants