Rate Limiting Logic
If the number of requests in the past (burst / rate)
is less than burst
,
then another request is acceptable immediately. Otherwise let start
be the
greater of the current time and the last timestamp in the history; the next
request will be allowed at the time start + (1 / rate)
.
In fact we can do this without storing the history, instead only storing a
single timestamp per client and updating it atomically; the stored time is the
time at which the next request will be allowed. On accepting a request, if the
stored time is nonexistent or more than requestTS - (burst / rate)
, we set it
to requestTS - ((burst - 1) / rate)
. Otherwise, we add (1 / rate)
to its present
value. If the resulting value is in the future then no more requests are acceptable
until the time named by the result; otherwise, another request is acceptable
immediately.
The requestTS
used above should be the received timestamp for accepted
requests, and the delayed timestamp for delayed requests; this ensures that
the timestamp is always moved forward by enough to ensure the rate limit.