[Feature]: Add support for custom adaptive timeout settings #15171

@jeromeroussin

Description

The Feature

In general, OpenAI models on Azure (and, I suspect, other models elsewhere) get progressively slower as the request size increases. Combine that with how Azure behaves when its backend is overloaded (requests hang), and it is sometimes challenging to choose the right timeout for a model. With a timeout that is too low, callers miss out on responses that were being handled fine but were taking a while. With one that is too high, callers wait unnecessarily on requests that will never come back.

This RFE is about supporting something like this in the proxy:

general_settings:
  custom_timeout_setter: timeout.my_timeout_setter

The custom timeout setter function would take the request about to be sent as input and return a timeout value, or None if the timeouts defined in config.yaml should be used.
This would allow us to set shorter timeouts on small requests and longer timeouts on bigger requests, using our telemetry data to set appropriate thresholds.
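
To make the idea concrete, here is a rough sketch of what `timeout/my_timeout_setter` could look like. None of this exists in LiteLLM today: the function signature (a plain dict in, `Optional[float]` out), the assumption that the request carries OpenAI-style `messages`, and the thresholds in `SIZE_TO_TIMEOUT` are all hypothetical and only illustrate the proposal.

```python
from typing import Any, Dict, List, Optional, Tuple

# Hypothetical thresholds: (max request size in characters, timeout in seconds).
# Real values would be derived from our own latency telemetry.
SIZE_TO_TIMEOUT: List[Tuple[int, float]] = [
    (2_000, 10.0),
    (20_000, 30.0),
    (100_000, 120.0),
]


def request_size(request: Dict[str, Any]) -> int:
    """Approximate the request size as the total length of string message contents."""
    total = 0
    for message in request.get("messages") or []:
        content = message.get("content")
        if isinstance(content, str):
            total += len(content)
    return total


def my_timeout_setter(request: Dict[str, Any]) -> Optional[float]:
    """Return a timeout in seconds, or None to fall back to the config.yaml timeouts.

    Assumed (not existing) hook contract: the proxy would call this with the
    request about to be sent and use the returned value if it is not None.
    """
    size = request_size(request)
    for max_size, timeout in SIZE_TO_TIMEOUT:
        if size <= max_size:
            return timeout
    # Larger than every known bucket: defer to the configured timeouts.
    return None
```

With these made-up thresholds, a short chat message would get a 10 second timeout, a 5,000 character prompt would get 30 seconds, and anything above 100,000 characters would fall through to whatever is set in config.yaml.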

Motivation, pitch

Better handling of current model performance behavior (in Azure)

LiteLLM is hiring a founding backend engineer, are you interested in joining us and shipping to all our users?

No

Twitter / LinkedIn details

https://www.linkedin.com/in/jeromeroussin/
