
Several concurrent but rate limited? #9

Closed

ziggamon opened this issue Jan 26, 2016 · 14 comments

@ziggamon

Hi,

Tried implementing this library for the use case of a remote API that has rate limits.

var limiter = new Bottleneck(3, 1000); // 3 per second

This code actually waits a second before firing off request no. 2.

I'd like it to be able to do 3 requests per second and only wait if the queue is full. That is, when run in a loop it would fire the first three requests immediately, request 4 would go off 1 sec after request 1, and so on.

More complex example use case: (remote API has specs: max 3/sec + 200/hr)

var limiter = new Bottleneck(3, 1000); // 3 per second
var hourlyLimit = new Bottleneck(200, 1000*60*60); // 200 per hour
limiter.chain(hourlyLimit);

This currently fires one and then waits an hour for the next call.

Am I missing something here? Thankful for advice.

@SGrondin
Owner

> This code actually waits a second before firing off request no. 2.

This is normal. Basically the first argument is the number of concurrently executing requests and the second is the minimum time to wait between requests. Imagine it as a logical AND between those 2 rules. Requests will be executed as fast as possible in a way that both rules are followed.
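To illustrate the AND, here's a minimal sketch (doRequest is a hypothetical placeholder for your actual request function):

var Bottleneck = require('bottleneck')
var limiter = new Bottleneck(3, 1000)

// At most 3 jobs run at once AND starts are spaced at least 1000ms apart.
// With short requests, the 1000ms spacing is the rule that actually bites,
// which is why request no. 2 above waited a full second.
for (var i = 0; i < 5; i++) {
  limiter.submit(function (cb) {
    doRequest(cb) // hypothetical placeholder
  }, null)
}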

> I'd like it to be able to do 3 requests per second and only wait if the queue is full. That is, when run in a loop it would fire the first three requests immediately, request 4 would go off 1 sec after request 1, and so on.

If you want to do 3 requests per second, the easiest way is something like this:

var limiter = new Bottleneck(0, 0) // All unlimited
limiter.changeReservoir(3) // Needs to be set to 3 immediately, then the interval will reset it to 3 every second
setInterval(function () {
  limiter.changeReservoir(3)
}, 1000)

// Then add all your requests into the limiter and it'll guarantee a 3/sec rate.

That way every second it resets the number of requests it's allowed to run to 3.

> More complex example use case: (remote API has specs: max 3/sec + 200/hr)

Chain 2 limiters: one that calls changeReservoir(3) every second and one that calls changeReservoir(200) every hour. Like this:

var limiterSec = new Bottleneck(0, 0)
limiterSec.changeReservoir(3)
setInterval(function () {
  limiterSec.changeReservoir(3)
}, 1000)

var limiterHour = new Bottleneck(0, 0)
limiterHour.changeReservoir(200)
setInterval(function () {
  limiterHour.changeReservoir(200)
}, 1000*60*60)

limiterSec.chain(limiterHour)

// Then add all your requests to limiterSec

I might add a feature that makes your use case easier, but I need to think about how to make it clean, both in the way to use the library and in the library itself. Bottleneck is meant to give you all the tools to build complex throttling solutions from basic commands. Lemme know if you have suggestions or any other questions.

SGrondin self-assigned this Jan 26, 2016
@ziggamon
Author

Thanks for a speedy and helpful reply, Simon!

I guess by that logic what I really want would be, for each scheduled call, to do a

setTimeout(function(){ limiter.incrementReservoir(1); }, 1000);

This way, if requests don't all get added at once, it can take in another one every time a second has passed since an earlier one started?

Without that it'd do them all in batches, whereas I'd like them to be in a rolling queue, if that makes sense.

Could you explain, for my education, an example use case for the way you originally implemented it?

@SGrondin
Owner

> I guess by that logic what I really want would be, for each scheduled call, to do a setTimeout(function(){ limiter.incrementReservoir(1); }, 1000); This way, if requests don't all get added at once, it can take in another one every time a second has passed since an earlier one started?

Yeah that will work, just don't forget to call limiter.changeReservoir(3) when you create the limiter.

> Without that it'd do them all in batches, whereas I'd like them to be in a rolling queue, if that makes sense. Could you explain, for my education, an example use case for the way you originally implemented it?

Your idea will create a nice rolling average, yeah. The code I suggested will make "batches", but it'll maximize your throughput, so all your requests will be completed as fast as "legally" possible. With your idea, you lose the time that is spent doing a request. If you want to maximize your throughput with your solution, you'd have to measure how long each request took, then call the timeout with 1000 - timeTaken.
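Something like this sketch, for example (doMyRequest is a hypothetical placeholder, and it assumes the limiter was created with new Bottleneck(0, 0) and seeded once with limiter.changeReservoir(3)):

var lambda = function (cb) {
  var start = Date.now()
  doMyRequest(function (err, result) {
    // Give the slot back once a full second has passed since this request started
    var taken = Date.now() - start
    setTimeout(function () {
      limiter.incrementReservoir(1)
    }, Math.max(0, 1000 - taken))
    cb(err, result)
  })
}

limiter.submit(lambda, null)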

@ziggamon
Author

In many cases I guess it's irrelevant how long the request takes; I just want no more than 3 to start in any given second. Thus I could set the timeout at the same time as scheduling the request. Although that doesn't work either, because I'm not sure whether it'll start executing right away or not.

@SGrondin
Owner

You can wrap your request into a function and submit that function:

var lambda = function (cb) {
  setTimeout(function(){ limiter.incrementReservoir(1); }, 1000);
  doMyRequest(cb)
}

limiter.submit(lambda, cb) // or null instead of cb
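For completeness, the setup this snippet assumes (per the reply earlier in the thread):

var limiter = new Bottleneck(0, 0)
limiter.changeReservoir(3) // seed the rolling window once

Because lambda only runs when the limiter executes it, the 1-second timer starts when the request itself starts, which gives the rolling behavior described above.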

@ziggamon
Author

Yeah, I guess.

Do you mind explaining briefly why this setup isn't the default way the library works? I feel like I'm missing something obvious here...

@SGrondin
Owner

The reason is that having specific 'per interval' rate limits is far from the average case. It might sound crazy, but those 2 arguments you pass to the constructor are enough for almost everyone and every use case.

Having 'per interval' rate limits is not easy to keep clean; you would need a lot more code on your side. For example:

// Hypothetical example
var limiter = new Bottleneck([{concurrent: 0, nbRequests: 3, interval: 1000}, {concurrent: 0, nbRequests: 200, interval: 1000*60*60}])

and that would cover just a small percentage of all the use cases of rate limiting/throttling/task scheduling out there. And it doesn't play nice with different priority levels. I regularly do GitHub code searches to look at how people use Bottleneck, and the use cases are very varied; some people build really cool things because it's basic and flexible.

In comparison, something like var limiter = new Bottleneck(2, 250) is very easy to get started with and powerful enough for almost every simple use case out there.

I might add a whole new class that would look like var limiter = new Bottleneck.Interval([{concurrent: 0, nbRequests: 3, interval: 1000}, ...]) to simplify your use case. Obviously it would just be using the basic Bottleneck calls underneath. The problem is that it gets pretty complex just to understand how priorities, promises, callbacks, limits and all that interact with each other from the user's perspective, and that can lead to subtle bugs when users don't exactly understand what they're doing. By keeping the Bottleneck functions simple and composable, it's a lot easier to understand what you're doing and what's going to happen.


By the way, have you thought about simply doing var limiter = new Bottleneck(0, 333) for the simple 3/sec case? All it'll do is guarantee 333ms between each request. It might be enough for what you want to do.

If that works for you, for your more complex example you could then chain it into

var limiterHour = new Bottleneck(0, 0)
limiterHour.changeReservoir(200)
setInterval(function () {
  limiterHour.changeReservoir(200)
}, 1000*60*60)

You won't get "batches" because the first limiter will have smoothed them out.
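Spelled out, the glue between the two limiters (a sketch based on the snippets above):

var limiterSec = new Bottleneck(0, 333)
limiterSec.chain(limiterHour)

// Then add all your requests to limiterSec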

@ziggamon
Author

Fair enough.

Yeah, Bottleneck(1, 334) is how I've implemented it now.

I guess I actually misunderstood what the lib does. I assumed it worked "my way" and only noticed that it didn't when testing. I guess what threw me was the first parameter, the concurrent one: it basically does nothing if the request takes less time to execute than maxConcurrent*minTime (or am I misunderstanding again?).

Would be happy to help out and test if you make an Interval subclass, I think it would be very helpful, especially to fit the double-spec requirement that many APIs use these days (per second and per hour limits).

@SGrondin
Owner

> it basically does nothing if the request takes less time to execute than maxConcurrent*minTime (or am I misunderstanding again?)

Yeah, it really depends on your use case. It's pretty rare that the requests are always shorter than that. 334ms is also pretty slow, so that contributes to that effect in your use case.
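(A quick worked check of that rule of thumb: with new Bottleneck(1, 334), starts are spaced at least 334ms apart, so the concurrency cap of 1 only kicks in when a request runs longer than maxConcurrent * minTime = 1 * 334ms; any request that finishes faster is paced by minTime alone.)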

> I think it would be very helpful, especially to fit the double-spec requirement that many APIs use these days (per second and per hour limits).

Very true.

I'll make an interval branch this weekend and I'll let you know when it's ready for feedback.

@ziggamon
Author

Awesome, thanks! :D

@SGrondin
Owner

SGrondin commented Feb 2, 2016

Really glad I'm tackling this one myself, there are many subtle details to get right. I've made some progress, but nothing ready to test yet. I'll update this issue when I have something ready to try out.

@ziggamon
Author

ziggamon commented Feb 2, 2016

:) I'm sure. Let me know if I can be of help somehow!

Btw, I came across this throttle plugin in Restify; perhaps it's useful? https://github.com/restify/plugins/blob/master/lib/plugins/throttle.js I haven't looked into it too much, though.

@SGrondin
Owner

I'm gonna close this for now and keep it in mind for a possible next version.

@jshakes

jshakes commented Mar 24, 2017

Hi Simon, I've found myself with the same requirements as Sergej. Are you still considering this for version 2? I would be happy to help out wherever possible.

To give some background on my situation: the Twitter API allows a max of 15 requests per endpoint per 15 minutes. I'm currently running 1 request per minute to be safe, but I would rather run up to 15 requests immediately and then wait 15 minutes for the next batch (which in my application is unlikely to be necessary).
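Using the reservoir pattern from earlier in this thread, a sketch for the 15-per-15-minutes case might look like this (submitTwitterRequest is a hypothetical placeholder):

var Bottleneck = require('bottleneck')

var limiter = new Bottleneck(0, 0) // no concurrency or spacing limits
limiter.changeReservoir(15)        // allow 15 requests right away
setInterval(function () {
  limiter.changeReservoir(15)      // refill the window every 15 minutes
}, 15 * 60 * 1000)

// Then submit all endpoint calls through the limiter:
limiter.submit(function (cb) {
  submitTwitterRequest(cb) // hypothetical placeholder
}, null)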
