
baml-fallback should maybe be a strategy as opposed to another client #226

Closed
villagab4 opened this issue Dec 29, 2023 · 1 comment
Labels
invalid This doesn't seem right

Comments

@villagab4
Contributor

I would like some separation between the azure/anthropic/openai/etc. clients and fallback clients.

I see fallback clients as a strategy that defines a ruleset for orchestrating various LLM clients. I would prefer that this strategy let me set the order of clients (ideally including which ones can be sent in parallel), control how various errors are handled (which BAML already provides), and override default options (such as request timeouts).

Here is a specific example: I have two clients, both responsible for calling the same GPT-3.5-Turbo deployment on Azure, shown below. The only difference between them is the request timeout:

client<llm> AzureGPT35Turbo {
    provider baml-azure-chat
    retry_policy ZenfetchDefaultPolicy
    options {
      api_key env.AZURE_OPENAI_API_KEY
      api_base env.AZURE_OPENAI_BASE
      engine env.AZURE_GPT_35_TURBO_DEPLOYMENT_NAME
      api_version "2023-07-01-preview"
      api_type azure
      request_timeout 30
    }
}

client<llm> AzureGPT35TurboShortTimeout {
    provider baml-azure-chat
    retry_policy ZenfetchDefaultPolicy
    options {
      api_key env.AZURE_OPENAI_API_KEY
      api_base env.AZURE_OPENAI_BASE
      engine env.AZURE_GPT_35_TURBO_DEPLOYMENT_NAME
      api_version "2023-07-01-preview"
      api_type azure
      request_timeout 5
    }
}

// My version of "strategy"
client<llm> GPTFamilyShortTimeout {
  provider baml-fallback
  options {
    strategy [
      AzureGPT35TurboShortTimeout,
      AzureGPT4TurboShortTimeout
    ]
  }
}

Notice that all of the options are the same except for the request_timeout field. I set things up this way because certain AI functions need to complete quickly, so it's not reasonable to wait out the default 30-second timeout. This leads to a lot of redundancy in my clients (really, in the "strategies").

With the proposed functionality, I could instead do something like:

client<llm> AzureGPT35Turbo {
    provider baml-azure-chat
    retry_policy ZenfetchDefaultPolicy
    options {
      api_key env.AZURE_OPENAI_API_KEY
      api_base env.AZURE_OPENAI_BASE
      engine env.AZURE_GPT_35_TURBO_DEPLOYMENT_NAME
      api_version "2023-07-01-preview"
      api_type azure
      request_timeout 30
    }
}

// My version of "strategy"
client<llm> GPTFamilyShortTimeout {
  provider baml-fallback
  options {
    strategy [
      AzureGPT35Turbo.options(request_timeout=5),
    ]
  }
}
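To make the requested behavior concrete: the proposal is that a fallback strategy tries derived clients in order, where a derived client is the base client with only the overridden fields changed. Here is a rough Python sketch of that semantics; `LLMClient`, `FallbackStrategy`, and the `send` callable are hypothetical illustrations, not BAML's actual API.

```python
import copy

class LLMClient:
    """Hypothetical stand-in for a BAML client definition (not real BAML API)."""

    def __init__(self, name, options):
        self.name = name
        self.options = dict(options)

    def with_options(self, **overrides):
        # The analogue of AzureGPT35Turbo.options(request_timeout=5):
        # clone the client, overriding only the named option fields.
        clone = copy.deepcopy(self)
        clone.options.update(overrides)
        return clone

class FallbackStrategy:
    """Try each client in order; return the first successful result."""

    def __init__(self, clients):
        self.clients = clients

    def call(self, prompt, send):
        errors = []
        for client in self.clients:
            try:
                return send(client, prompt)
            except Exception as exc:  # in practice, only retryable errors
                errors.append((client.name, exc))
        raise RuntimeError(f"all clients failed: {errors}")

# One base client; the strategy derives a short-timeout variant from it,
# so only the request_timeout is declared twice, not the whole client.
base = LLMClient("AzureGPT35Turbo", {"request_timeout": 30})
strategy = FallbackStrategy([base.with_options(request_timeout=5), base])
```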
@aaronvg
Contributor

aaronvg commented Dec 30, 2023

It makes sense that you don't want to have to declare a whole new client just to change one property: the timeout.

We will look at how a user could add property-level overrides and/or indicate parallel execution in BAML syntax.
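For illustration, the parallel-execution idea mentioned above could behave roughly as follows: send the prompt to several clients at once and take the first success, falling through only if every client fails. This is a hedged Python sketch, not BAML; the clients and the `send` callable are hypothetical placeholders.

```python
from concurrent.futures import FIRST_COMPLETED, ThreadPoolExecutor, wait

def race(clients, prompt, send):
    """Dispatch prompt to all clients concurrently; return the first success.

    Raises if every client fails. `clients` can be any objects accepted by
    the caller-supplied `send(client, prompt)` callable.
    """
    with ThreadPoolExecutor(max_workers=len(clients)) as pool:
        pending = {pool.submit(send, client, prompt) for client in clients}
        errors = []
        while pending:
            done, pending = wait(pending, return_when=FIRST_COMPLETED)
            for future in done:
                try:
                    return future.result()  # first success wins
                except Exception as exc:
                    errors.append(exc)
        raise RuntimeError(f"all clients failed: {errors}")
```

Note that a real implementation would also want to cancel or ignore the still-running requests once a winner is found; this sketch simply returns the first successful result.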

@hellovai hellovai added the invalid This doesn't seem right label Jul 10, 2024
3 participants