[Feature]: Rate limit aware batch completion calls #527

Status: Closed
ishaan-jaff opened this issue Oct 4, 2023 · 7 comments

Labels: enhancement (New feature or request)

Comments

@ishaan-jaff (Contributor)

The Feature

Pass in a batch of completion requests along with a rate limit; the requests should be throttled so that none of them fail.

Motivation, pitch

user request


@ishaan-jaff added the enhancement (New feature or request) label on Oct 4, 2023
@krrishdholakia (Contributor)

@ishaan-jaff are the related issues in-scope for this feature?

@ishaan-jaff (Contributor, Author)

As a v0 of this, sticking to just this: https://docs.litellm.ai/docs/rate_limit_manager

@vilmar-hillow

Is there current/planned support for rate limit aware batch completion calls? Or any workaround/tweaks to the current batch_completion method call to slow down/space out the rate of messages being sent?

@krrishdholakia (Contributor)

@vilmar-hillow the recommended approach here is to use the Router for the reliability logic and let users write their own parallel calling logic - https://docs.litellm.ai/docs/routing

Reason: the batch completion logic is just multi-threaded / parallel async calls to completion().

Is there a reason you want litellm to handle this?
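For reference, a rough sketch of that pattern: a Router configured with per-deployment rpm/tpm limits, plus caller-side asyncio fan-out. The model name, key, and limit values below are placeholders.

```python
import asyncio

from litellm import Router

# Placeholder deployment: swap in your own model, key, and rpm/tpm limits.
model_list = [
    {
        "model_name": "gpt-3.5-turbo",
        "litellm_params": {
            "model": "gpt-3.5-turbo",
            "api_key": "sk-...",  # your key
            "rpm": 60,            # requests per minute for this deployment
            "tpm": 100000,        # tokens per minute for this deployment
        },
    }
]

router = Router(model_list=model_list)

async def run_batch(prompts):
    # Fan the requests out ourselves; the Router applies its
    # reliability / rate-limit handling per deployment.
    tasks = [
        router.acompletion(
            model="gpt-3.5-turbo",
            messages=[{"role": "user", "content": p}],
        )
        for p in prompts
    ]
    return await asyncio.gather(*tasks)

responses = asyncio.run(run_batch(["hello", "what's the weather in SF?"]))
```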

@vilmar-hillow

Thanks! I'll explore the Router. The use case is simple: batch processing for experimentation, sending as many requests as possible to a single model within the configured TPM/RPM limits, and batch_completion seems usable only for non-rate-limited cases. I then saw this closed feature request discussing exactly that, so I thought it would be a good place to ask :)

@krrishdholakia (Contributor)

@CLARKBENHAM caught a relevant issue around making the Router better for this.

We've added testing for this and will work on fixing the issues it raises. If something specific comes up, I'll open a separate issue to track it.
