Notes for design: Rate Limiting #59

Open
jamesmunns opened this issue Jul 29, 2024 · 2 comments

Comments

@jamesmunns
Collaborator

As part of the current milestone, we'd like to add basic rate limiting options.

https://blog.nginx.org/blog/rate-limiting-nginx is a good basic read on how NGINX does this.

There are a few concepts we need to define:

  • The "key" - or what we use to match a request to the rate limiting rule
    • Could be source IP, request URI, or other items
    • TODO: This is probably another area where we want to define a common "format-like" syntax (also needed for load balancing with hash algorithms), so you can specify things like $src_ip/$uri or something
    • TODO: We probably need to define what happens when a request matches more than one key. PROBABLY the answer is "rules are evaluated in order, first match wins". Note that nginx instead applies all matching rules and takes the most restrictive result!
  • The "rule" - or what policies we follow to decide what to do with each request. The three outcomes of a rule are:
    • Forward immediately - allow the request to continue immediately
    • Delay - don't serve the request immediately, but wait for some amount of time (more on this later) before forwarding
    • Reject - Immediately respond with a 503 or similar "too much" error response
  • The "rate" - which actually has multiple components if we implement a leaky-bucket style of rate limiting (what NGINX does):
    • The "active in-flight" count - how many outstanding forwarded requests can we have at one time?
    • The "delayed in-flight" count - how many requests will we hold on to at one time before rejecting?
    • The "delay to active promotion rate" - how often do we "pop" a delayed request to the "active" queue?
    • NOTE: These are SLIGHTLY different from the rate, burst, and delay terms in nginx!
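
To make the rule outcomes and the three rate components above a bit more concrete, here is a minimal sketch in Rust. All names (`Outcome`, `Limits`, `Bucket`, `most_restrictive`) are hypothetical and not part of any existing API. It also assumes that a delayed request is only promoted when an active slot is free, which the notes above do not settle; the timer that would drive promotion is left out.

```rust
/// The three outcomes a rule can produce, ordered from least to most restrictive.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub enum Outcome {
    Forward,
    Delay,
    Reject,
}

/// Hypothetical limits for a single rate limiting rule.
#[derive(Debug, Clone, Copy)]
pub struct Limits {
    /// How many forwarded requests may be outstanding at one time.
    pub max_active: usize,
    /// How many requests may be held back before we start rejecting.
    pub max_delayed: usize,
}

/// Hypothetical per-key counters.
#[derive(Debug, Default)]
pub struct Bucket {
    active: usize,
    delayed: usize,
}

impl Bucket {
    /// Decide what to do with a newly arrived request.
    pub fn on_request(&mut self, limits: Limits) -> Outcome {
        if self.active < limits.max_active {
            self.active += 1;
            Outcome::Forward
        } else if self.delayed < limits.max_delayed {
            self.delayed += 1;
            Outcome::Delay
        } else {
            Outcome::Reject
        }
    }

    /// Called once per promotion tick. Moves one delayed request to active
    /// if one is waiting and (assumption) an active slot is free.
    pub fn promote_one(&mut self, limits: Limits) -> bool {
        if self.delayed > 0 && self.active < limits.max_active {
            self.delayed -= 1;
            self.active += 1;
            true
        } else {
            false
        }
    }

    /// Called when a forwarded request finishes.
    pub fn on_complete(&mut self) {
        self.active = self.active.saturating_sub(1);
    }
}

/// The "apply all, take the most restrictive" policy described for nginx above.
/// With no matching rules, the request is forwarded.
pub fn most_restrictive<I: IntoIterator<Item = Outcome>>(outcomes: I) -> Outcome {
    outcomes.into_iter().max().unwrap_or(Outcome::Forward)
}
```

A "first match wins" policy would instead return the outcome of the first applicable rule and never consult the rest, so the choice between the two mostly comes down to whether rule order should matter.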

Unlike NGINX, I don't currently think we need to consider the "zone" or "scope" of the rules - I currently intend for rate limiting to be per-service, which means that the "zone" is essentially the tuple of (service, key)
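
As a rough illustration of the (service, key) idea, the sketch below renders a key from a "format-like" template and uses the resulting tuple as a map key. Everything here is an assumption made for illustration: `RequestInfo`, `render_key`, and `Zones` are hypothetical names, the template syntax is not yet defined, and the plain string replacement is deliberately naive.

```rust
use std::collections::HashMap;

/// Hypothetical view of the request fields a key template may reference.
pub struct RequestInfo<'a> {
    pub src_ip: &'a str,
    pub uri: &'a str,
}

/// Expand `$src_ip` and `$uri` in a template such as "$src_ip/$uri".
/// Naive on purpose: a real syntax would need escaping and validation.
pub fn render_key(template: &str, req: &RequestInfo<'_>) -> String {
    template
        .replace("$src_ip", req.src_ip)
        .replace("$uri", req.uri)
}

/// Per-zone state, where a zone is the tuple (service, key).
/// A plain hit counter stands in for the real rate limiting state.
#[derive(Default)]
pub struct Zones {
    counts: HashMap<(String, String), u64>,
}

impl Zones {
    /// Record one request for the zone and return its running count.
    pub fn record(&mut self, service: &str, key: String) -> u64 {
        let n = self.counts.entry((service.to_string(), key)).or_insert(0);
        *n += 1;
        *n
    }
}
```

Because the service name is part of the map key, two services that happen to render the same key string never share state, which is what makes a separate "zone" setting unnecessary in this sketch.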

I have checked with @eaufavor, and the correct way to handle delayed requests in pingora is to have the task yield (for a timer, or activation queue, etc).

@jamesmunns jamesmunns added this to the Kickstart Spike 2 milestone Jul 29, 2024
@git001

git001 commented Jul 29, 2024

You may also be interested in "Introduction to Traffic Shaping Using HAProxy", which describes features similar to nginx's.

Traffic shaping has been available since HAProxy 2.7: https://www.haproxy.com/blog/announcing-haproxy-2-7

@jamesmunns
Collaborator Author

@git001 thanks for the pointer, I'll check it out!
