Points-based Limiter #5757

kirillplatonov · 2023-01-31T18:28:58Z

The existing leaky bucket limiter supports request-based rate limiting. Some APIs like provide points-based limiters. For example Shopify GraphQL API:
We're given 1000 points max. Every API request has its cost and returns the remaining points in response. The bucket is refilled at a rate of X points per second.

It would be great if Sidekiq Enterprise could support this use case as well.

References:
https://shopify.dev/api/usage/rate-limits#graphql-admin-api-rate-limits

mperham · 2023-01-31T18:34:03Z

If Shopify wanted to contribute an official limiter I would consider it but this is pretty hyper-specialized for me to build and support long-term. I generally don't build things which are only used for one site. Are there any other points-based limiters elsewhere? Can we build a standard limiter that works with all of those sites, esp if those sites add response headers which mutate the limiter logic somehow (for example, a hypothetical X-Limiter-Points-Consumed: 10.75)?

kirillplatonov · 2023-01-31T19:20:41Z

Point-based limits seem to be pretty common for GraphQL-based APIs. It's used to limit the complexity of queries made during a certain period.

I found a few other GraphQL APIs that use a similar approach to rate limits:
https://docs.github.com/en/graphql/overview/resource-limitations#rate-limit
https://developers.linear.app/docs/graphql/working-with-the-graphql-api/rate-limiting#api-request-complexity-limits
https://developer.monday.com/api-reference/docs/rate-limits

What makes it hyper-specialized on Shopify is the combination of point limits with a leaky bucket. I guess having just a standard points-based limiter that limits the number of points for a certain period should be enough for most cases. The logic for fetching remaining points should definitely be part of the application code as some APIs return it in headers and some in the response body. The app could just write back the remaining number to Sidekiq. For example:

limiter = Sidekiq::Limiter.points("shopify-graphql-#{user_id}", 1000, 20) # 1000 points every 20 seconds
limiter.within_limit do
  # make graphql call and fetch remaining points from response
  limiter.remaining(850)
end

Alternatively, we can pass to limiter how many points has been consumed:

limiter = Sidekiq::Limiter.points("shopify-graphql-#{user_id}", 1000, 20) # 1000 points every 20 seconds
limiter.within_limit do
  # make graphql call and fetch consumed points from response
  limiter.points_consumed(150)
end

mperham · 2023-02-01T21:18:08Z

I think the points aspect is perfectly normal, points are just drops in the leaky bucket. You get back X drops per second or minutes.

The novel twist is that each call dynamically consumes M points which you don't know until the call is finished. The tricky aspect is estimating the point usage before the call and how to fetch the actual points consumed after the call that is specialized to each service provider. As far as I can tell, none of the links you gave (thanks!) share much of any logic.

I could provide a generic points limiter which has two additions:

limiter = Sidekiq::Limiter.points("foo", 5000, :minute)

estimate = # you estimate how many points this call will make
limiter.within_limit(estimate: 300) do |handle|
  # do work

  actual = # you calculate or get the actual points value consumed
  handle.finalize(actual)
end

Something like that might work but I'd need to build a prototype. It may not be worth it to implement the finalize bit as there is an open question about points and windows of time. Might be better to just implement an estimate and let them rate limit you if your estimate is off.

kirillplatonov · 2023-02-13T17:43:14Z

I think having estimate with rate limit error handling should be enough for most cases. finalize will add additional precision and will allow making more calls if the actual cost is much less than the estimate. But that will make a difference only at a large scale.

mperham · 2023-02-13T18:24:46Z

Sure. This is low priority for me so don't expect it soon.

th-ad · 2023-02-14T16:46:14Z

I put together a rough implementation based on this conversation and the existing enterprise limiters.

@mperham What would be the best way to send it to you?

mperham · 2023-02-14T16:58:00Z

@th-ad Cool. You can send it to mike@contribsys.com

mperham · 2023-02-15T00:11:24Z

Thanks @th-ad, I have this working now: it calls successfully four times and raises on the fifth call, as expected.

  def test_typical_flow
    limiter = Sidekiq::Limiter.points(:shopify, 1000, 20, wait_timeout: 0)
    count = 0
    e = assert_raises Sidekiq::Limiter::OverLimit do
      5.times do
        limiter.within_limit(estimate: 300) do |helper|
          count += 1
          helper.finalize(200)
        end
      end
    end
    assert_equal 4, count
    assert_equal "shopify: need 300 points, have 200", e.message
  end

Is finalize the right method name to report the actual points used?

kirillplatonov · 2023-02-15T04:29:23Z

points_used might be a good option as it's more clear from the name what it does

mperham · 2023-02-22T00:02:50Z

I'm finalizing the API design and implementation. The happy path is great but there's one big unanswered question: what do we do if the remote side raises an error?

We've already consumed the estimated points locally but we don't know if they've been consumed on the remote side. If I have 1000 points and I make a call with an estimate of 200 points and that call raises an error, do I still have 1000 points, 800 or other?

What if you get a 503? 401?
Which error types should consume the points?
Which error types should reclaim the points?
Should the caller be responsible for calling points_used(0) to reclaim the estimated points?

I'm inclined to say that this "unhappy path" bookkeeping is the responsibility of the caller to handle. What do you think?

kirillplatonov · 2023-02-22T20:58:58Z

I'm inclined to say that this "unhappy path" bookkeeping is the responsibility of the caller to handle. What do you think?

I agree with it. Handling errors on the limiter's level doesn't make much sense. Implementations of Graphql clients vary a lot and some clients don't even raise Ruby exceptions in case of errors so there's nothing to catch.

I think being able to report points_used(0) to reclaim points should be enough. Some types of errors won't consume points and would be great to have a way to return them back.

kirillplatonov mentioned this issue Jan 31, 2023

Leaky Bucket Limiter #4414

Closed

kirillplatonov changed the title ~~Points-based Leaky Bucket Limiter~~ Points-based Limiter Feb 1, 2023

mperham mentioned this issue Feb 15, 2023

Upcoming 7.1 changes #5769

Closed

10 tasks

mperham added this to the 7.1 milestone Feb 21, 2023

mperham added the enterprise label Feb 21, 2023

mperham closed this as completed Mar 14, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Points-based Limiter #5757

Points-based Limiter #5757

kirillplatonov commented Jan 31, 2023

mperham commented Jan 31, 2023

kirillplatonov commented Jan 31, 2023

mperham commented Feb 1, 2023

kirillplatonov commented Feb 13, 2023

mperham commented Feb 13, 2023

th-ad commented Feb 14, 2023

mperham commented Feb 14, 2023

mperham commented Feb 15, 2023

kirillplatonov commented Feb 15, 2023

mperham commented Feb 22, 2023 •

edited

Loading

kirillplatonov commented Feb 22, 2023 •

edited

Loading

Points-based Limiter #5757

Points-based Limiter #5757

Comments

kirillplatonov commented Jan 31, 2023

mperham commented Jan 31, 2023

kirillplatonov commented Jan 31, 2023

mperham commented Feb 1, 2023

kirillplatonov commented Feb 13, 2023

mperham commented Feb 13, 2023

th-ad commented Feb 14, 2023

mperham commented Feb 14, 2023

mperham commented Feb 15, 2023

kirillplatonov commented Feb 15, 2023

mperham commented Feb 22, 2023 • edited Loading

kirillplatonov commented Feb 22, 2023 • edited Loading

mperham commented Feb 22, 2023 •

edited

Loading

kirillplatonov commented Feb 22, 2023 •

edited

Loading