I would like some separation between provider clients (Azure, Anthropic, OpenAI, etc.) and fallback clients.
I see a fallback client as a strategy: a ruleset for orchestrating various LLM clients. I would prefer that this strategy let me set the order of clients (ideally including which ones can be called in parallel), define how to deal with various errors (which BAML already provides), and override default options (such as request timeouts).
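To make the "strategy as a ruleset" idea concrete, here is a minimal sketch in plain Python. It is not BAML syntax or BAML's implementation; `run_with_fallback` and `AllClientsFailed` are hypothetical names, and each client is assumed to be a simple callable that takes a prompt and returns text. It covers only the ordering and error-handling parts of the request, not parallel calls.

```python
from typing import Callable, List, Sequence


class AllClientsFailed(Exception):
    """Hypothetical error raised when every client in the strategy fails."""


def run_with_fallback(clients: Sequence[Callable[[str], str]], prompt: str) -> str:
    """Try each client in the order given and return the first successful result."""
    errors: List[Exception] = []
    for call in clients:
        try:
            return call(prompt)
        except Exception as exc:
            # A real ruleset would distinguish error types (timeouts, rate limits, ...).
            errors.append(exc)
    raise AllClientsFailed(errors)


# Stand-in clients for illustration only; no real provider is called.
def flaky_client(prompt: str) -> str:
    raise TimeoutError("simulated timeout")


def backup_client(prompt: str) -> str:
    return "backup: " + prompt


result = run_with_fallback([flaky_client, backup_client], "hello")
```

The point of the sketch is that the order and the error policy live in the strategy, while the clients themselves stay unaware of each other.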
Here is a specific example: I have two clients, both responsible for calling the GPT-3.5-Turbo deployment on Azure, given below. The only difference between them is the request timeout.
Notice that all of the options are the same except for the `request_timeout` field. I did this because certain AI functions need to complete quickly, so it is not reasonable to wait for the default 30-second timeout. This leads to a lot of redundancy in my clients (which are really the "strategies").
With the proposed functionality, I could instead do something like
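As a language-neutral illustration of that idea, the following Python sketch defines one client and lets each strategy override only the options it cares about. This is not BAML syntax: `ClientConfig`, `Strategy`, `resolved_clients`, the client name, and the 5-second value are all hypothetical. Only the `request_timeout` field and the 30-second default come from the description above.

```python
from dataclasses import dataclass, field, replace
from typing import Any, Dict, List, Tuple


@dataclass(frozen=True)
class ClientConfig:
    """One provider client, defined once (hypothetical shape)."""
    name: str
    provider: str
    model: str
    request_timeout: float = 30.0  # seconds; the default mentioned in the issue


@dataclass(frozen=True)
class Strategy:
    """An ordered list of clients plus per-strategy option overrides."""
    name: str
    clients: Tuple[ClientConfig, ...]
    overrides: Dict[str, Any] = field(default_factory=dict)

    def resolved_clients(self) -> List[ClientConfig]:
        # Copy each client with the overrides applied; the originals are unchanged.
        return [replace(c, **self.overrides) for c in self.clients]


# A single client definition, reused by both strategies.
azure_gpt35 = ClientConfig(name="AzureGPT35", provider="azure", model="gpt-3.5-turbo")

default_strategy = Strategy("Default", (azure_gpt35,))
# 5.0 is an arbitrary illustrative value for latency-sensitive functions.
fast_strategy = Strategy("Fast", (azure_gpt35,), overrides={"request_timeout": 5.0})
```

With this shape, the short timeout is a property of the strategy rather than of a second, nearly identical client definition.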