
How many rate limiters are too many? #3981

Closed
dhempy opened this issue Sep 26, 2018 · 6 comments

Comments

@dhempy

dhempy commented Sep 26, 2018

Sidekiq Enterprise version:
sidekiq-ent (1.5.4)

We use Sidekiq::Limiter.concurrent() to good effect in several places. I'm about to implement a single-concurrency worker to ensure no duplicates get imported into a system in a race. (Detecting duplicates is complicated and cannot be implemented in the database). I started out with:

  Sidekiq::Limiter.concurrent('Importer', 1, wait_timeout: 0, lock_timeout: 60)

This is fine, except we're importing a LOT of records, and I'd like to take advantage of the many servers in our farm to process non-duplicate candidates.
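The call above would typically sit inside a job's `perform`, wrapping the import step with `within_limit`. A minimal sketch (`ImportJob` and `import!` are hypothetical names, not from this thread; in a real app the job would also `include Sidekiq::Job`):

```ruby
# Sketch: serialize all imports behind a single cluster-wide concurrent limiter.
class ImportJob
  # include Sidekiq::Job  # enable inside a real Sidekiq app

  def perform(record_id)
    limiter = Sidekiq::Limiter.concurrent(
      'Importer', 1,      # at most one holder at a time, across all servers
      wait_timeout: 0,    # fail fast rather than block the worker thread
      lock_timeout: 60    # auto-release if the holder dies mid-import
    )
    limiter.within_limit do
      import!(record_id)  # hypothetical import step
    end
  end
end
```

With `wait_timeout: 0`, a job that cannot take the lock raises `Sidekiq::Limiter::OverLimit` and is rescheduled rather than tying up a thread.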

There are a dozen fields that contribute to uniqueness, with some fuzzy matches. However, there is a key, "purchase_order_id", which distinguishes 99.9% of the records coming in at any one time. So what I'd like to do is put that in the limiter name, similar to the Stripe user example in the Sidekiq docs:

  Sidekiq::Limiter.concurrent("Importer-#{purchase_order_id}", 1, wait_timeout: 0, lock_timeout: 60)

So, my question is...how many is too many? If we had thousands of these simultaneously processing, would that cause any problem?

If we churned through millions per hour, would that gum up Redis over time?

If either of those is an issue, do you have a recommended threshold?

@mperham
Collaborator

mperham commented Sep 26, 2018

I'm unsure as I don't have a lot of data on limiter scale.

There are a few things you can do to minimize risk:

  1. Use a separate Redis instance for limiters.
  2. Don't visit the Limiter page in the Web UI or things will go badly. I don't believe it implements paging.
  3. Set the TTL to aggressively expire unused rate limiters: ttl: 24.hours
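Point 3 above translates to passing `ttl:` when building each per-name limiter. A sketch (the helper name is hypothetical; `86_400` is 24 hours in plain seconds, since `24.hours` needs ActiveSupport):

```ruby
# Sketch: per-purchase-order limiter whose Redis keys expire after a day
# of disuse, so abandoned limiters don't accumulate.
def importer_limiter(purchase_order_id)
  Sidekiq::Limiter.concurrent(
    "Importer-#{purchase_order_id}", 1,
    wait_timeout: 0,
    lock_timeout: 60,
    ttl: 86_400  # aggressively expire unused limiter keys after 24 hours
  )
end
```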

I would think you can use 1000s of limiters simultaneously. I would try it in staging and let me know what you learn about scale. Heavy usage will not gum anything up over time.

@dhempy
Author

dhempy commented Sep 26, 2018

Thanks, Mike. I missed the TTL option. I'll use ttl: 1.day, for sure.

On further thought, I think we're going to use the last 3 digits of the ID for the rate limiter name. This will avoid races between potentially matching records, and sidestep any limiter-count overload entirely. We're only looking to improve parallelism -- not implement guaranteed uniqueness via the rate limiter. This approach should give us both.
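The last-3-digits idea caps the limiter population at 1000 names while still spreading work across the farm. A sketch of how that name might be derived (the helper name is hypothetical):

```ruby
# Sketch: bucket limiters by the last three digits of the ID, so at most
# 1000 distinct limiter names ever exist, regardless of record volume.
def importer_limiter_name(purchase_order_id)
  "Importer-#{purchase_order_id.to_i % 1000}"
end

importer_limiter_name(4_567_123)  # => "Importer-123"
```

Two records with the same last three digits serialize behind the same limiter even when their full IDs differ, which is the accepted trade-off: this improves parallelism but is not a uniqueness guarantee.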

If we generate any interesting data, I'll share it with you.

@mperham mperham closed this as completed Oct 3, 2018
@raivil

raivil commented Apr 3, 2020

Hey,

Do you have more data on this issue?
I'm facing a similar situation: the Redis instance dedicated to rate limiters is using almost all of its memory.
The app uses a concurrent limiter of size 1 to implement distributed mutexes.

We're considering reducing some of the locks' TTLs to less than 24.hours, but before doing that I'd like to understand why it's not recommended to go below 24 hours, as written at https://github.com/mperham/sidekiq/wiki/Ent-Rate-Limiting#ttl

thanks.

@mperham
Collaborator

mperham commented Apr 3, 2020

I would wonder why you are running out of memory on Redis. You must be running a very small Redis instance or making many, many millions of limiters.

I recommend 24 hours because it should never, ever go lower than the max amount of time that the limiter might be held. If you have a long-running job which takes the limiter, it might be held for hours. If your jobs only hold the limiter for seconds or minutes, you can drop the TTL to a few hours.
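That rule of thumb could be sketched as a helper: the TTL must comfortably exceed the longest time a limiter can be held. The helper name, safety factor, and two-hour floor below are my assumptions, not values from this thread:

```ruby
# Sketch of the TTL rule of thumb: pick a TTL that is a safe multiple of the
# longest time any job holds the limiter, and never drop below a few hours.
def limiter_ttl(max_hold_seconds, safety_factor: 4, floor: 2 * 3600)
  [max_hold_seconds * safety_factor, floor].max
end

limiter_ttl(60)        # jobs hold the lock ~1 minute => 7200 (the floor)
limiter_ttl(4 * 3600)  # jobs can hold it 4 hours     => 57600 (16 hours)
```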

@locofocos

Thanks for the info. I came here from a search after reading https://github.com/mperham/sidekiq/wiki/Ent-Rate-Limiting#ttl which recommended 24 hours.

Just as another example, we're doing exactly what you mentioned: running millions of limiters 😆 We have multi-tenant ecommerce software that uses a concurrent limiter. We use a limiter name based on the full product ID, I think to avoid concurrent updates for that product in other systems. Over the past 90 days, we've used Sidekiq::Limiter.concurrent for about 3.8 million records. Because Sidekiq generates Redis keys for both lmtr-c-<name> and lmtr-cfree-<name> for each limiter, that put us at over 7 million Redis keys!

As someone a little new to Sidekiq, it took me a while to figure out where the Redis keys were coming from. One thing that would have helped me figure this out sooner: if there were Sidekiq documentation to the effect of

If your Redis instance is using too much memory / has too many keys, use an RDB analyzer to grab the unique keys. If you see key names like lmtr-cfree-asdf and lmtr-c-asdf, those are created when you call Sidekiq::Limiter.concurrent('asdf', ...).
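Mapping those key names back to the limiter that created them is a simple string match. A sketch (the helper name is hypothetical; the two prefixes are the ones reported in the comment above):

```ruby
# Sketch: recover the limiter name from a concurrent-limiter Redis key.
# Sidekiq Enterprise's concurrent limiter creates "lmtr-c-<name>" and
# "lmtr-cfree-<name>" key pairs, per the observation above.
def limiter_name_from_key(key)
  case key
  when /\Almtr-cfree-(.+)\z/ then Regexp.last_match(1)
  when /\Almtr-c-(.+)\z/     then Regexp.last_match(1)
  end
end

limiter_name_from_key("lmtr-c-asdf")      # => "asdf"
limiter_name_from_key("lmtr-cfree-asdf")  # => "asdf"
```

Running an RDB analysis and grouping keys through a helper like this quickly shows which limiter names dominate the keyspace.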

Thanks again for the tips!

@mperham
Collaborator

mperham commented Mar 12, 2021

@locofocos The wiki is publicly editable. This is great content that you are welcome to add to the page. 😎
