[Feature]: Rate limit aware batch completion calls #527
Comments
@ishaan-jaff are the related issues in scope for this feature?
As a v0 of this, sticking to just this: https://docs.litellm.ai/docs/rate_limit_manager
Is there current or planned support for rate-limit-aware batch completion calls? Or any workaround/tweaks to the current implementation?
@vilmar-hillow the recommended approach here is to use the Router for handling the reliability logic, and let users write their own parallel calling logic - https://docs.litellm.ai/docs/routing. Reason: the batch completion logic seems to be just multi-threaded / parallel async calls to completion(). Is there a reason you want litellm to handle this?
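A minimal sketch of what that combination could look like, assuming a Router configured per the routing docs plus user-written asyncio logic. The model name and tpm/rpm values are placeholders, and the exact placement of the `tpm`/`rpm` keys may vary by litellm version:

```python
# Hedged sketch: Router handles deployment selection/reliability; the
# parallel-calling logic (asyncio.gather) is user-written, as suggested above.
import asyncio
from litellm import Router

# Illustrative deployment config; tpm/rpm values are placeholders.
model_list = [
    {
        "model_name": "gpt-3.5-turbo",
        "litellm_params": {"model": "gpt-3.5-turbo"},
        "tpm": 240000,
        "rpm": 1800,
    }
]
router = Router(model_list=model_list)

async def batch(messages_list):
    # Fire all requests concurrently; the Router routes each call.
    tasks = [
        router.acompletion(model="gpt-3.5-turbo", messages=m)
        for m in messages_list
    ]
    return await asyncio.gather(*tasks, return_exceptions=True)

results = asyncio.run(batch([[{"role": "user", "content": "hi"}]]))
```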
Thanks! I'll explore the router. The use case is simple - batch processing for experimentation, sending as many requests as possible to a single model given the set TPM/RPM.
@CLARKBENHAM caught a relevant issue around making the router better for this. We've added testing for it and will work on fixing the issues it raises. If there's something specific, we'll make it a separate issue for tracking.
The Feature
Pass in a bunch of batch completion calls and a rate limit. The requests should not fail.
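A minimal sketch of the requested behavior, assuming only `litellm.acompletion` and a hypothetical `rate_limited_batch` helper that staggers request starts client-side so the batch never exceeds a given RPM (the helper is illustrative, not a litellm API):

```python
# Hedged sketch: client-side RPM throttling so a batch stays under the
# stated rate limit. rate_limited_batch is a hypothetical helper.
import asyncio
import litellm

async def rate_limited_batch(messages_list, model, rpm):
    interval = 60.0 / rpm  # minimum spacing between request starts

    async def one_call(messages, delay):
        await asyncio.sleep(delay)  # stagger starts to stay under the RPM cap
        return await litellm.acompletion(model=model, messages=messages)

    tasks = [one_call(m, i * interval) for i, m in enumerate(messages_list)]
    return await asyncio.gather(*tasks)

# Usage: a batch of prompts against one model, capped at 60 requests/minute.
# asyncio.run(rate_limited_batch(prompts, "gpt-3.5-turbo", rpm=60))
```

Note this only bounds RPM; a fuller version would also track token usage against the TPM limit.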
Motivation, pitch
User request
Twitter / LinkedIn details
No response