Microsoft.AspNetCore.RateLimiting metrics #47745

Closed
JamesNK opened this issue Apr 17, 2023 · 13 comments
Labels
api-approved (API was approved in API review, it can be implemented); area-middleware (Includes: URL rewrite, redirect, response cache/compression, session, and other general middleware)

JamesNK commented Apr 17, 2023

Background and Motivation

Addresses the rate limiting part of dotnet/runtime#79459

Microsoft.AspNetCore.RateLimiting is new in .NET 7. It currently doesn't have event counters or metrics counters. This PR adds metrics counters, making rate limiting in ASP.NET Core apps more observable.

Notes:

  • Microsoft.AspNetCore.RateLimiting meter is created by metrics DI integration.
  • These counters are in the middleware layer and focus on how requests are impacted by rate limiting. They don't provide low-level counter information.
  • Rate-limiting middleware supports partitioning. The partition isn't added to rate-limiting counter tags. This is for a couple of reasons:
    • The primary reason is that the partition is defined by the app and could be high-cardinality. For example, a common partition is the authenticated user name. The system could have thousands of users, creating a metrics tag cardinality explosion.
    • It's possible to have a global and endpoint policy for a request. They could have different partition values. Which value to use? Or change counters to measure policies separately? Simpler not to include it.

Proposed API

Microsoft.AspNetCore.RateLimiting

current-lease-requests

| Name | Instrument Type | Unit | Description |
| --- | --- | --- | --- |
| current-lease-requests | UpDownCounter | {request} | Number of HTTP requests that are currently active on the server that hold a rate limiting lease. |

| Attribute | Type | Description | Examples | Presence |
| --- | --- | --- | --- | --- |
| policy | string | Rate limiting policy name for this request. | MyPolicyName | Added if the matched route has a rate limiting policy name. |
| method | string | HTTP request method. | GET; POST; HEAD | Added if route endpoint set |
| route | string | The matched route | {controller}/{action}/{id?} | Added if route endpoint set |

lease-request-duration

| Name | Instrument Type | Unit | Description |
| --- | --- | --- | --- |
| lease-request-duration | Histogram | s | The duration of rate limiting leases held by HTTP requests on the server. |

| Attribute | Type | Description | Examples | Presence |
| --- | --- | --- | --- | --- |
| policy | string | Rate limiting policy name for this request. | MyPolicyName | Added if the matched route has a rate limiting policy name. |
| method | string | HTTP request method. | GET; POST; HEAD | Added if route endpoint set |
| route | string | The matched route | {controller}/{action}/{id?} | Added if route endpoint set |

current-requests-queued

| Name | Instrument Type | Unit | Description |
| --- | --- | --- | --- |
| current-requests-queued | UpDownCounter | {request} | Number of HTTP requests that are currently queued, waiting to acquire a rate limiting lease. |

| Attribute | Type | Description | Examples | Presence |
| --- | --- | --- | --- | --- |
| policy | string | Rate limiting policy name for this request. | MyPolicyName | Added if the matched route has a rate limiting policy name. |
| method | string | HTTP request method. | GET; POST; HEAD | Added if route endpoint set |
| route | string | The matched route | {controller}/{action}/{id?} | Added if route endpoint set |

queued-request-duration

| Name | Instrument Type | Unit | Description |
| --- | --- | --- | --- |
| queued-request-duration | Histogram | s | The duration of requests in a queue, waiting to acquire a rate limiting lease. |

| Attribute | Type | Description | Examples | Presence |
| --- | --- | --- | --- | --- |
| policy | string | Rate limiting policy name for this request. | MyPolicyName | Added if the matched route has a rate limiting policy name. |
| method | string | HTTP request method. | GET; POST; HEAD | Added if route endpoint set |
| route | string | The matched route | {controller}/{action}/{id?} | Added if route endpoint set |

lease-failed-requests

| Name | Instrument Type | Unit | Description |
| --- | --- | --- | --- |
| lease-failed-requests | Counter | {request} | Number of HTTP requests that failed to acquire a rate limiting lease. Requests could be rejected by global or endpoint rate limiting policies. Or the request could be canceled while waiting for the lease. |

| Attribute | Type | Description | Examples | Presence |
| --- | --- | --- | --- | --- |
| reason | string | Reason why acquiring the lease failed. | GlobalLimiter; EndpointLimiter; RequestCanceled | Always. |
| policy | string | Rate limiting policy name for this request. | MyPolicyName | Added if the matched route has a rate limiting policy name. |
| method | string | HTTP request method. | GET; POST; HEAD | Added if route endpoint set |
| route | string | The matched route | {controller}/{action}/{id?} | Added if route endpoint set |

Usage Examples

Alternative Designs

Risks

JamesNK added the area-runtime and api-ready-for-review (API is ready for formal API review - https://github.com/dotnet/apireviews) labels Apr 17, 2023

JamesNK commented Apr 17, 2023

@Tratcher @davidfowl @noahfalk @tarekgh @samsp-msft @joperezr

Metrics counters for Microsoft.AspNetCore.RateLimiting. This is ASP.NET Core middleware that makes it easy to apply rate limiting to HTTP requests.

davidfowl commented

Shouldn't the route template be a dimension?

JamesNK commented Apr 17, 2023

If an endpoint has a policy name then that could be used to identify the endpoint.

But yes, route could be included here. If the rate-limiting middleware is before UseRouting in the pipeline then it will be null (along with policy name).
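To illustrate the ordering point, here is a minimal sketch of a pipeline where the rate limiter runs after routing, so the matched endpoint (and therefore its route and policy name) is available to the middleware. The policy name, limits, and route are illustrative assumptions, not values from this proposal.

```csharp
using System;
using System.Threading.RateLimiting;
using Microsoft.AspNetCore.Builder;
using Microsoft.AspNetCore.Http;
using Microsoft.AspNetCore.RateLimiting;
using Microsoft.Extensions.DependencyInjection;

var builder = WebApplication.CreateBuilder(args);

builder.Services.AddRateLimiter(options =>
{
    // "MyPolicyName" and the limits below are illustrative values.
    options.AddFixedWindowLimiter("MyPolicyName", limiterOptions =>
    {
        limiterOptions.PermitLimit = 10;
        limiterOptions.Window = TimeSpan.FromSeconds(1);
    });
});

var app = builder.Build();

// Routing runs first so the endpoint is already selected
// when the rate limiting middleware executes.
app.UseRouting();
app.UseRateLimiter();

// The endpoint opts in to the named policy.
app.MapGet("/hello/{name}", (string name) => $"Hello {name}")
   .RequireRateLimiting("MyPolicyName");

app.Run();
```

If `UseRateLimiter()` were moved above `UseRouting()`, the middleware would run before an endpoint is selected, which is the case described above where route and policy name are null.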

davidfowl commented

Feels like that would be a useful thing to add for any meter in the pipeline that is endpoint aware.

BrennanConroy commented Apr 19, 2023

If we add the ability for multiple leases to be acquired by a single request, how will that be displayed by the metrics?
Should current-lease-requests be current-acquired-leases instead? And not mention HTTP?
Or would we add another counter?

Similarly with current-requests-queued, that could be current-queued-leases.

JamesNK commented Apr 19, 2023

A single request can already acquire multiple leases because of global + endpoint limiters both being acquired. Also, PartitionedRateLimiter.CreateChained can merge multiple limiters together, but we have no visibility of that. From the middleware's perspective, it's one limiter and lease.
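As a rough sketch of what chaining looks like at the System.Threading.RateLimiting level, the snippet below combines two partitioned limiters with `PartitionedRateLimiter.CreateChained`. The string resource type, the partition keys, and the limits are assumptions made for illustration; the middleware itself works with `HttpContext`.

```csharp
using System;
using System.Threading.RateLimiting;

public static class ChainedLimiterSketch
{
    // Builds one limiter out of two. Callers see a single limiter and a single lease.
    public static PartitionedRateLimiter<string> Create()
    {
        // Hypothetical "global" limiter: every resource shares one partition with 2 permits.
        PartitionedRateLimiter<string> global = PartitionedRateLimiter.Create<string, string>(
            resource => RateLimitPartition.GetConcurrencyLimiter(
                "global",
                partitionKey => new ConcurrencyLimiterOptions
                {
                    PermitLimit = 2,
                    QueueLimit = 0,
                    QueueProcessingOrder = QueueProcessingOrder.OldestFirst
                }));

        // Hypothetical per-resource limiter: each distinct key gets 1 permit.
        PartitionedRateLimiter<string> perKey = PartitionedRateLimiter.Create<string, string>(
            resource => RateLimitPartition.GetConcurrencyLimiter(
                resource,
                partitionKey => new ConcurrencyLimiterOptions
                {
                    PermitLimit = 1,
                    QueueLimit = 0,
                    QueueProcessingOrder = QueueProcessingOrder.OldestFirst
                }));

        // The chained limiter only grants a lease when every inner limiter grants one.
        return PartitionedRateLimiter.CreateChained(global, perKey);
    }
}
```

A caller would use it like any other limiter, for example `using RateLimitLease lease = limiter.AttemptAcquire("a");` followed by a check of `lease.IsAcquired`. Nothing on the returned lease says which inner limiter rejected the request.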

The counters specifically focus on requests rather than leases. Perhaps if lower-level rate limiting adds metrics then it could record that level of information.

One potential area of confusion is current-requests-queued. If a request is waiting on the global limiter then it is in the current-requests-queued. However, it's recorded with the endpoint's policy name. The policy name might make someone think the queue is waiting on the endpoint limiter instead of the global limiter.

Maybe the current-requests-queued could be incremented and decremented for each limiter (global and endpoint) and include a tag to say what the queue reason is (i.e. GlobalLimiter, EndpointLimiter) rather than treating them together as the queue.

BrennanConroy commented

> A single request can already acquire multiple leases because of global + endpoint limiters both being acquired

That's not what I meant. A potential future feature in the middleware could be to add "costs" to endpoints, so instead of AcquireAsync(1) you could call AcquireAsync(5).
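For context, the underlying limiters already accept a permit count, even though the middleware does not expose one per endpoint. The following is a small sketch using `ConcurrencyLimiter` directly; the limit of 5 and the idea of an "expensive" request are assumptions for illustration only.

```csharp
using System.Threading.RateLimiting;
using System.Threading.Tasks;

public static class PermitCostSketch
{
    // Returns whether an "expensive" (5 permit) and a "cheap" (1 permit) request got a lease.
    public static async Task<(bool Expensive, bool Cheap)> RunAsync()
    {
        using var limiter = new ConcurrencyLimiter(new ConcurrencyLimiterOptions
        {
            PermitLimit = 5,
            QueueLimit = 0, // no queueing, so a request that cannot be served fails immediately
            QueueProcessingOrder = QueueProcessingOrder.OldestFirst
        });

        // One request consumes all five permits in a single lease.
        using RateLimitLease expensive = await limiter.AcquireAsync(permitCount: 5);

        // No permits remain, so this request does not acquire a lease.
        using RateLimitLease cheap = await limiter.AcquireAsync(permitCount: 1);

        return (expensive.IsAcquired, cheap.IsAcquired);
    }
}
```

From a request-focused counter's point of view, both calls are still one request each, which is why a permit cost would show up as a tag rather than as a change in the counted value.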

JamesNK commented Apr 20, 2023

Is that permitCount? https://learn.microsoft.com/en-us/dotnet/api/system.threading.ratelimiting.partitionedratelimiter-1.attemptacquire?view=aspnetcore-7.0#definition

Do you imagine that an endpoint could specify a number? e.g. [EnableRateLimiting("myPolicy", PermitCount = 5)]

It could get added as a tag to all the counters like the policy name.

BrennanConroy commented

> Do you imagine that an endpoint could specify a number? e.g. [EnableRateLimiting("myPolicy", PermitCount = 5)]

Yeah, something like that. Although, I think there is someone asking to add additional permit cost after the request is done being processed. Which would be problematic since the tags need to match for start and end?

> It could get added as a tag to all the counters like the policy name.

Would that start getting close to cardinality explosion? Now you can have any combination of policy and permit count cost.

JamesNK commented Apr 21, 2023

We're adding the route as a tag, so there is a separate value per route already. Each route only has one policy and permit count.

"route + policy + permit count" will almost always line up so including the policy and permit count doesn't make it any worse.

halter73 reopened this Apr 27, 2023

halter73 commented

API Review Notes

  • What unit should we use for duration?
    • Seconds is more consistent with OpenTelemetry and other tools in the ecosystem, but Kestrel's previous event counters used milliseconds. Seconds is stored as double.
    • There's no push back on seconds as a unit.
    • Histograms are standard for duration.
  • Does queued-request-duration only count requests that are queued? Would zero be entered for requests that succeed or fail immediately?
    • Only queued requests.
  • Does queued-request-duration need an attribute for whether it succeeded or failed?
    • It seems potentially useful. Let's add the "reason" attribute from lease-failed-requests but only set it for failures instead of always.
  • What do we think about current-lease-requests vs. current-acquired-leases?
    • current-acquired-leases could clarify that it does not include queued lease requests
    • However current-acquired-leases might imply we increment this twice per request. Once for the global, and once for the endpoint-specific limiter.
    • current-requests-with-acquired-leases might be more clear, but we want something shorter that ends with requests.
    • current-acquired-lease-requests?
      • Yes. acquired-request-duration should be updated to acquired-lease-request-duration too then.
  • Is the route pattern filled in? Should the attribute be called "routePattern" instead of "route"?
    • The pattern is not filled in.
    • We chose "route" because it's shorter.
    • Technically, someone could set the IRouteDiagnosticsMetadata.Route manually, and the metadata just uses "Route" instead of "RoutePattern".
  • The following is our current best proposal, but we want to discuss the naming of the "acquired" and "failed" lease counters over email.
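Two of the decisions above (durations in seconds stored as double, and a "reason" attribute that is set only on failure) can be sketched with System.Diagnostics.Metrics as follows. This is not the middleware's implementation: the class, the method names, and the meter name `Sketch.RateLimiting` are hypothetical, and `Stopwatch.GetElapsedTime` assumes .NET 7 or later.

```csharp
using System;
using System.Diagnostics;
using System.Diagnostics.Metrics;

public sealed class QueuedDurationSketch : IDisposable
{
    // Hypothetical meter name, chosen so it cannot be mistaken for the real meter.
    private readonly Meter _meter = new Meter("Sketch.RateLimiting");
    private readonly Histogram<double> _queuedDuration;

    public QueuedDurationSketch()
    {
        // Unit "s": the recorded value is a double number of seconds.
        _queuedDuration = _meter.CreateHistogram<double>(
            "queued-request-duration",
            unit: "s",
            description: "The duration of requests in a queue, waiting to acquire a rate limiting lease.");
    }

    // Builds the tag set. Each tag is added only when its value is known,
    // and "reason" is added only when acquiring the lease failed.
    public static TagList BuildTags(string? policy, string? method, string? route, string? failureReason)
    {
        var tags = new TagList();
        if (policy is not null) tags.Add("policy", policy);
        if (method is not null) tags.Add("method", method);
        if (route is not null) tags.Add("route", route);
        if (failureReason is not null) tags.Add("reason", failureReason);
        return tags;
    }

    // startTimestamp is a value captured earlier with Stopwatch.GetTimestamp().
    public void RecordQueued(long startTimestamp, string? policy, string? method, string? route, string? failureReason)
    {
        TimeSpan elapsed = Stopwatch.GetElapsedTime(startTimestamp);
        _queuedDuration.Record(elapsed.TotalSeconds, BuildTags(policy, method, route, failureReason));
    }

    public void Dispose() => _meter.Dispose();
}
```

A request that waited in the queue and then got its lease would be recorded with a null `failureReason`, so its measurement carries no "reason" tag. A request rejected by the endpoint limiter would pass `"EndpointLimiter"`.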

Microsoft.AspNetCore.RateLimiting

current-acquired-lease-requests

| Name | Instrument Type | Unit | Description |
| --- | --- | --- | --- |
| current-acquired-lease-requests | UpDownCounter | {request} | Number of HTTP requests that are currently active on the server that hold a rate limiting lease. |

| Attribute | Type | Description | Examples | Presence |
| --- | --- | --- | --- | --- |
| policy | string | Rate limiting policy name for this request. | MyPolicyName | Added if the matched route has a rate limiting policy name. |
| method | string | HTTP request method. | GET; POST; HEAD | Added if route endpoint set |
| route | string | The matched route | {controller}/{action}/{id?} | Added if route endpoint set |

acquired-lease-request-duration

| Name | Instrument Type | Unit | Description |
| --- | --- | --- | --- |
| acquired-lease-request-duration | Histogram | s | The duration of rate limiting leases held by HTTP requests on the server. |

| Attribute | Type | Description | Examples | Presence |
| --- | --- | --- | --- | --- |
| policy | string | Rate limiting policy name for this request. | MyPolicyName | Added if the matched route has a rate limiting policy name. |
| method | string | HTTP request method. | GET; POST; HEAD | Added if route endpoint set |
| route | string | The matched route | {controller}/{action}/{id?} | Added if route endpoint set |

current-requests-queued

| Name | Instrument Type | Unit | Description |
| --- | --- | --- | --- |
| current-requests-queued | UpDownCounter | {request} | Number of HTTP requests that are currently queued, waiting to acquire a rate limiting lease. |

| Attribute | Type | Description | Examples | Presence |
| --- | --- | --- | --- | --- |
| policy | string | Rate limiting policy name for this request. | MyPolicyName | Added if the matched route has a rate limiting policy name. |
| method | string | HTTP request method. | GET; POST; HEAD | Added if route endpoint set |
| route | string | The matched route | {controller}/{action}/{id?} | Added if route endpoint set |

queued-request-duration

| Name | Instrument Type | Unit | Description |
| --- | --- | --- | --- |
| queued-request-duration | Histogram | s | The duration of requests in a queue, waiting to acquire a rate limiting lease. |

| Attribute | Type | Description | Examples | Presence |
| --- | --- | --- | --- | --- |
| policy | string | Rate limiting policy name for this request. | MyPolicyName | Added if the matched route has a rate limiting policy name. |
| method | string | HTTP request method. | GET; POST; HEAD | Added if route endpoint set |
| route | string | The matched route | {controller}/{action}/{id?} | Added if route endpoint set |
| reason | string | Reason why acquiring the lease failed. | GlobalLimiter; EndpointLimiter; RequestCanceled | If failed. |

lease-failed-requests

| Name | Instrument Type | Unit | Description |
| --- | --- | --- | --- |
| lease-failed-requests | Counter | {request} | Number of HTTP requests that failed to acquire a rate limiting lease. Requests could be rejected by global or endpoint rate limiting policies. Or the request could be canceled while waiting for the lease. |

| Attribute | Type | Description | Examples | Presence |
| --- | --- | --- | --- | --- |
| reason | string | Reason why acquiring the lease failed. | GlobalLimiter; EndpointLimiter; RequestCanceled | Always. |
| policy | string | Rate limiting policy name for this request. | MyPolicyName | Added if the matched route has a rate limiting policy name. |
| method | string | HTTP request method. | GET; POST; HEAD | Added if route endpoint set |
| route | string | The matched route | {controller}/{action}/{id?} | Added if route endpoint set |

halter73 commented

Here is the final set of names after an email suggestion from @Tratcher. It's a subtle variation on the original proposal which used "lease-requests" instead of "leased-requests". Moving "requests" to the end of "current-queued-requests" also sounds more consistent.

Microsoft.AspNetCore.RateLimiting

current-leased-requests

| Name | Instrument Type | Unit | Description |
| --- | --- | --- | --- |
| current-leased-requests | UpDownCounter | {request} | Number of HTTP requests that are currently active on the server that hold a rate limiting lease. |

| Attribute | Type | Description | Examples | Presence |
| --- | --- | --- | --- | --- |
| policy | string | Rate limiting policy name for this request. | MyPolicyName | Added if the matched route has a rate limiting policy name. |
| method | string | HTTP request method. | GET; POST; HEAD | Added if route endpoint set |
| route | string | The matched route | {controller}/{action}/{id?} | Added if route endpoint set |

leased-request-duration

| Name | Instrument Type | Unit | Description |
| --- | --- | --- | --- |
| leased-request-duration | Histogram | s | The duration of rate limiting leases held by HTTP requests on the server. |

| Attribute | Type | Description | Examples | Presence |
| --- | --- | --- | --- | --- |
| policy | string | Rate limiting policy name for this request. | MyPolicyName | Added if the matched route has a rate limiting policy name. |
| method | string | HTTP request method. | GET; POST; HEAD | Added if route endpoint set |
| route | string | The matched route | {controller}/{action}/{id?} | Added if route endpoint set |

current-queued-requests

| Name | Instrument Type | Unit | Description |
| --- | --- | --- | --- |
| current-queued-requests | UpDownCounter | {request} | Number of HTTP requests that are currently queued, waiting to acquire a rate limiting lease. |

| Attribute | Type | Description | Examples | Presence |
| --- | --- | --- | --- | --- |
| policy | string | Rate limiting policy name for this request. | MyPolicyName | Added if the matched route has a rate limiting policy name. |
| method | string | HTTP request method. | GET; POST; HEAD | Added if route endpoint set |
| route | string | The matched route | {controller}/{action}/{id?} | Added if route endpoint set |

queued-request-duration

| Name | Instrument Type | Unit | Description |
| --- | --- | --- | --- |
| queued-request-duration | Histogram | s | The duration of requests in a queue, waiting to acquire a rate limiting lease. |

| Attribute | Type | Description | Examples | Presence |
| --- | --- | --- | --- | --- |
| policy | string | Rate limiting policy name for this request. | MyPolicyName | Added if the matched route has a rate limiting policy name. |
| method | string | HTTP request method. | GET; POST; HEAD | Added if route endpoint set |
| route | string | The matched route | {controller}/{action}/{id?} | Added if route endpoint set |
| reason | string | Reason why acquiring the lease failed. | GlobalLimiter; EndpointLimiter; RequestCanceled | If failed. |

lease-failed-requests

| Name | Instrument Type | Unit | Description |
| --- | --- | --- | --- |
| lease-failed-requests | Counter | {request} | Number of HTTP requests that failed to acquire a rate limiting lease. Requests could be rejected by global or endpoint rate limiting policies. Or the request could be canceled while waiting for the lease. |

| Attribute | Type | Description | Examples | Presence |
| --- | --- | --- | --- | --- |
| reason | string | Reason why acquiring the lease failed. | GlobalLimiter; EndpointLimiter; RequestCanceled | Always. |
| policy | string | Rate limiting policy name for this request. | MyPolicyName | Added if the matched route has a rate limiting policy name. |
| method | string | HTTP request method. | GET; POST; HEAD | Added if route endpoint set |
| route | string | The matched route | {controller}/{action}/{id?} | Added if route endpoint set |

API approved!
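For readers who want to see how instruments with these names could be observed in-process, here is a hedged sketch using `MeterListener`. It does not touch the real middleware: the `Demo` method emits stand-in measurements on a hypothetical meter named `Sketch.RateLimiting`, and the `long` value type for the counters is an assumption. To listen to the real instruments, one would pass the meter name `Microsoft.AspNetCore.RateLimiting` instead.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics.Metrics;

public static class ListenerSketch
{
    // Runs emit() while listening to the named meter and returns "instrument=value" strings.
    public static List<string> Collect(string meterName, Action emit)
    {
        var seen = new List<string>();
        using var listener = new MeterListener();

        // Subscribe only to instruments that belong to the requested meter.
        listener.InstrumentPublished = (instrument, l) =>
        {
            if (instrument.Meter.Name == meterName)
            {
                l.EnableMeasurementEvents(instrument);
            }
        };

        // Counters in this sketch report long values; duration histograms report double.
        listener.SetMeasurementEventCallback<long>(
            (instrument, value, tags, state) => seen.Add($"{instrument.Name}={value}"));
        listener.SetMeasurementEventCallback<double>(
            (instrument, value, tags, state) => seen.Add($"{instrument.Name}={value}"));

        listener.Start();
        emit();
        return seen;
    }

    // Emits stand-in measurements using the approved instrument names.
    public static List<string> Demo()
    {
        using var meter = new Meter("Sketch.RateLimiting"); // hypothetical meter name
        var leased = meter.CreateUpDownCounter<long>("current-leased-requests", unit: "{request}");
        var failed = meter.CreateCounter<long>("lease-failed-requests", unit: "{request}");

        return Collect("Sketch.RateLimiting", () =>
        {
            leased.Add(1);   // a request acquired a lease
            failed.Add(1, new KeyValuePair<string, object?>("reason", "GlobalLimiter"));
            leased.Add(-1);  // the request finished and released its lease
        });
    }
}
```

The up-down counter pattern (add 1 when the lease is acquired, add -1 when the request ends) is what lets a dashboard show the number of requests currently holding a lease.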

halter73 added the api-approved (API was approved in API review, it can be implemented) label and removed the api-ready-for-review (API is ready for formal API review - https://github.com/dotnet/apireviews) label Apr 28, 2023

JamesNK commented May 31, 2023

Done with #47758

JamesNK closed this as completed May 31, 2023
amcasey added the area-middleware (Includes: URL rewrite, redirect, response cache/compression, session, and other general middleware) label and removed the area-runtime label Jun 2, 2023
dotnet locked as resolved and limited conversation to collaborators Jul 2, 2023