Notes for design: Rate Limiting #59

jamesmunns · 2024-07-29T18:08:10Z

As part of the current milestone, we'd like to add basic rate limiting options.

https://blog.nginx.org/blog/rate-limiting-nginx is a good basic read on how NGINX does this.

There are a couple of concepts we need to define:

The "key" - or what we use to match a request to the rate limiting rule
- Could be source IP, request URI, or other items
- TODO: This is probably another area where we want to define a common "format-like" syntax (also needed for load balancing with hash algorithms), so you can specify things like $src_ip/$uri or something
- TODO: We probably need to define how to handle a request that could match two keys. PROBABLY the answer is "rules are applied in the order they are applicable, first match wins". Note that nginx instead applies all matching rules and takes the most restrictive result!
The "rule" - or what policies we follow to decide what to do with each request. The three outcomes of a rule are:
- Forward immediately - allow the request to continue immediately
- Delay - don't serve the request immediately, but wait for some amount of time (more on this later) before forwarding
- Reject - Immediately respond with a 503 or similar "too much" error response
The "rate" - which actually has multiple components if we implement a leaky-bucket style of rate limiting (which is what NGINX does):
- The "active in-flight" count - how many outstanding forwarded requests can we have at one time?
- The "delayed in-flight" count - how many requests will we hold on to at one time before rejecting?
- The "delay to active promotion rate" - how often do we "pop" a delayed request to the "active" queue?
- NOTE: These are SLIGHTLY different than the rate, burst, and delay terms from nginx!
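
To make the three rate components and the three rule outcomes concrete, here is a minimal sketch of how they might fit together. Everything in it is an assumption for illustration: the `Limiter` type, its field names, and the `on_request` / `on_complete` / `on_tick` methods are hypothetical and not part of pingora or any existing API, and the promotion rate is modelled as an external tick rather than a real timer.

```rust
use std::collections::VecDeque;

/// The three outcomes of a rule, as listed above.
#[derive(Debug, PartialEq, Eq)]
enum Outcome {
    Forward,
    Delay,
    Reject,
}

/// Hypothetical per-key limiter state (all names are illustrative).
struct Limiter {
    /// "Active in-flight" count limit.
    max_active: usize,
    /// "Delayed in-flight" count limit.
    max_delayed: usize,
    /// Requests currently forwarded and not yet completed.
    active: usize,
    /// Ids of requests waiting to be promoted, oldest first.
    delayed: VecDeque<u64>,
}

impl Limiter {
    fn new(max_active: usize, max_delayed: usize) -> Self {
        Limiter { max_active, max_delayed, active: 0, delayed: VecDeque::new() }
    }

    /// Decide what to do with a newly arrived request.
    fn on_request(&mut self, id: u64) -> Outcome {
        if self.active < self.max_active && self.delayed.is_empty() {
            // Free capacity and nobody waiting: forward immediately.
            self.active += 1;
            Outcome::Forward
        } else if self.delayed.len() < self.max_delayed {
            // Hold the request until a promotion tick frees it.
            self.delayed.push_back(id);
            Outcome::Delay
        } else {
            // Both queues are full.
            Outcome::Reject
        }
    }

    /// Called when a forwarded request finishes.
    fn on_complete(&mut self) {
        self.active = self.active.saturating_sub(1);
    }

    /// Called once per "delay to active promotion" interval.
    /// Promotes at most one delayed request if there is active capacity.
    fn on_tick(&mut self) -> Option<u64> {
        if self.active < self.max_active {
            let id = self.delayed.pop_front()?;
            self.active += 1;
            Some(id)
        } else {
            None
        }
    }
}
```

With `Limiter::new(1, 1)`, the first request would be forwarded, the second delayed, and the third rejected; the delayed request is only promoted by a tick that arrives after the first one completes.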

Unlike NGINX, I don't currently think we need to consider the "zone" or "scope" of the rules - I currently intend for rate limiting to be per-service, which means that the "zone" is essentially the tuple of (service, key)
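
As a rough illustration of the (service, key) idea, the sketch below renders a key from a "format-like" template and uses the tuple as a map key. The `Request` struct, `render_key`, and `Zones` are hypothetical names invented for this example, and only the `$src_ip` and `$uri` variables mentioned above are handled; the real syntax is still a TODO.

```rust
use std::collections::HashMap;

/// Hypothetical view of a request (field names are illustrative only).
struct Request<'a> {
    src_ip: &'a str,
    uri: &'a str,
}

/// Expand a template such as "$src_ip/$uri" into a concrete key.
/// Only the two variables named in the notes are supported here.
fn render_key(template: &str, req: &Request) -> String {
    template
        .replace("$src_ip", req.src_ip)
        .replace("$uri", req.uri)
}

/// Per-zone counters, where a zone is the (service, key) tuple.
#[derive(Default)]
struct Zones {
    hits: HashMap<(String, String), u64>,
}

impl Zones {
    /// Record one request and return how many this zone has seen so far.
    fn record(&mut self, service: &str, key: String) -> u64 {
        let count = self.hits.entry((service.to_string(), key)).or_insert(0);
        *count += 1;
        *count
    }
}
```

Because the service name is part of the map key, two services that see the same rendered key keep independent counts, which is what makes a separate "zone" setting unnecessary in this model.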

I have checked with @eaufavor, and the correct way to handle delayed requests in pingora is to have the task yield (for a timer, or activation queue, etc).


git001 · 2024-07-29T19:15:55Z

Maybe you would also be interested in looking into "Introduction to Traffic Shaping Using HAProxy", which describes features similar to nginx's.

Traffic shaping has been available since HAProxy 2.7: https://www.haproxy.com/blog/announcing-haproxy-2-7

jamesmunns · 2024-07-30T09:32:15Z

@git001 thanks for the pointer, I'll check it out!

jamesmunns added this to the Kickstart Spike 2 milestone Jul 29, 2024
