How many rate limiters are too many? #3981
I'm unsure, as I don't have a lot of data on limiter scale. There are a few things you can do to minimize risk:
I would think you can use thousands of limiters simultaneously. I would try it in staging and let me know what you learn about scale. Heavy usage will not gum anything up over time.
Thanks, Mike. I missed the TTL option. On further thought, I think we're going to use the last 3 digits of the ID for the rate limiter name. This will avoid races between potentially matching records and sidestep any limiter-count overload entirely. We're only looking to improve parallelism, not implement guaranteed uniqueness via the rate limiter. This approach should give us both. If we generate any interesting data, I'll share it with you.
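For anyone following along, the bucketing idea is just deriving the limiter name from the ID modulo 1000, so at most 1000 distinct limiters can ever exist (the `import` prefix and helper name here are illustrative, not our real code):

```ruby
# Derive the limiter name from the last 3 digits of a numeric ID.
# This caps the number of distinct limiters at 1000 while still
# letting records with different suffixes process in parallel.
def limiter_name_for(id)
  "import-#{id.to_i % 1000}"
end

limiter_name_for(4_512_387)  # => "import-387"
limiter_name_for(1_000_387)  # => "import-387" (shares a limiter)
```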
Hey, do you have more data on this issue? We're considering reducing the TTL on some of the locks. Thanks.
I would wonder why you are running out of memory on Redis. You must be running a very small Redis instance or making many, many millions of limiters. I recommend 24 hours because it should never, ever go lower than the max amount of time that the limiter might be held. If you have a long-running job which takes the limiter, it might be held for hours. If your jobs only hold the limiter for seconds or minutes, you can drop the TTL to a few hours.
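In other words, derive the TTL from the worst-case hold time, with headroom. A trivial way to express that rule of thumb (the 4x safety factor and 1-hour floor are my assumptions, not official guidance):

```ruby
# The TTL must never be shorter than the longest time a job might
# hold the limiter; pad it with a safety factor and a sane floor.
def safe_ttl_seconds(max_hold_seconds, safety_factor: 4)
  [max_hold_seconds * safety_factor, 3600].max
end

safe_ttl_seconds(60)        # minute-long jobs => the 1-hour floor
safe_ttl_seconds(6 * 3600)  # 6-hour jobs => a full 24 hours
```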
Thanks for the info. I came here from a search after reading https://github.com/mperham/sidekiq/wiki/Ent-Rate-Limiting#ttl, which recommends 24 hours.

Just as another example, we're doing exactly what you mentioned: running millions of limiters 😆 We have multi-tenant ecommerce software that uses a concurrent limiter, with a limiter name based on the full product ID, I think to avoid concurrent updates for that product in other systems. Over the past 90 days, we've used Sidekiq::Limiter.concurrent for about 3.8 million records. Because Sidekiq is generating Redis keys for both …

As someone a little new to Sidekiq, it took me a while to figure out where the Redis keys were coming from. One thing that would have helped me figure this out sooner: Sidekiq documentation to the effect of …
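For a ballpark on the memory question, even millions of limiter keys are small relative to a modestly sized Redis instance (the per-key overhead below is a rough guess, not a measured figure):

```ruby
# Back-of-envelope Redis footprint for limiter keys.
# 200 bytes per key is an assumed average for key name plus
# limiter metadata; measure your own instance to be sure.
def limiter_memory_mb(key_count, bytes_per_key: 200)
  (key_count * bytes_per_key) / (1024.0 * 1024)
end

limiter_memory_mb(3_800_000).round  # => 725 (MB)
```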
Thanks again for the tips!
@locofocos The wiki is publicly editable. This is great content that you are welcome to add to the page. 😎 |
Ruby version:
sidekiq-ent (1.5.4)
We use Sidekiq::Limiter.concurrent() to good effect in several places. I'm about to implement a single-concurrency worker to ensure no duplicates get imported into a system in a race. (Detecting duplicates is complicated and cannot be implemented in the database). I started out with:
This is fine, except we're importing a LOT of records, and I'd like to take advantage of the many servers in our farm to process non-duplicate candidates.
There are a dozen fields that contribute to uniqueness with some fuzzy matches. However, there is a key, "purchase_order_id" which distinguishes 99.9% of the records coming in at any one time. So, what I'd like to do is put that in the Limiter name, similar to the Stripe user example in the Sidekiq docs:
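In effect, a concurrency-1 limiter keyed by purchase_order_id acts like a mutex per key: records sharing a purchase_order_id serialize, while records with different IDs run in parallel. An in-process analogy in plain Ruby (this is NOT the Sidekiq Enterprise API; all names here are illustrative):

```ruby
# One mutex per purchase_order_id: same-key work serializes,
# different keys proceed concurrently, mimicking a
# concurrency-1 limiter scoped by name.
PO_LOCKS = Hash.new { |h, k| h[k] = Mutex.new }
PO_LOCKS_GUARD = Mutex.new

def with_po_lock(purchase_order_id)
  lock = PO_LOCKS_GUARD.synchronize { PO_LOCKS[purchase_order_id] }
  lock.synchronize { yield }
end

processed = Queue.new  # thread-safe collector
threads = 8.times.map do |i|
  Thread.new { with_po_lock("PO-#{i % 2}") { processed << i } }
end
threads.each(&:join)
processed.size  # => 8
```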
So, my question is...how many is too many? If we had thousands of these simultaneously processing, would that cause any problem?
If we churned through millions per hour, would that gum up Redis over time?
If either of those is an issue, do you have a recommended threshold?