
baml-fallback should maybe be a strategy as opposed to another client #226

Closed
villagab4 opened this issue Dec 29, 2023 · 1 comment
Labels
invalid This doesn't seem right

Comments

@villagab4
Contributor

I would like some separation between the azure/anthropic/openai/etc. clients and fallback clients.

I see fallback clients as a strategy that defines a ruleset for orchestrating various LLM clients. I would prefer that this strategy let me set the order of clients (ideally including which ones can be sent in parallel), control how various errors are handled (which BAML already provides), and override default options (such as request timeouts).

Here is a specific example: I have two clients, both responsible for calling the same GPT-3.5-Turbo deployment on Azure, shown below. The only difference between them is the request timeout:

client<llm> AzureGPT35Turbo {
    provider baml-azure-chat
    retry_policy ZenfetchDefaultPolicy
    options {
      api_key env.AZURE_OPENAI_API_KEY
      api_base env.AZURE_OPENAI_BASE
      engine env.AZURE_GPT_35_TURBO_DEPLOYMENT_NAME
      api_version "2023-07-01-preview"
      api_type azure
      request_timeout 30
    }
}

client<llm> AzureGPT35TurboShortTimeout {
    provider baml-azure-chat
    retry_policy ZenfetchDefaultPolicy
    options {
      api_key env.AZURE_OPENAI_API_KEY
      api_base env.AZURE_OPENAI_BASE
      engine env.AZURE_GPT_35_TURBO_DEPLOYMENT_NAME
      api_version "2023-07-01-preview"
      api_type azure
      request_timeout 5
    }
}

// My version of "strategy"
client<llm> GPTFamilyShortTimeout {
  provider baml-fallback
  options {
    strategy [
      AzureGPT35TurboShortTimeout,
      AzureGPT4TurboShortTimeout
    ]
  }
}

Notice that all of the options are the same except for the request_timeout field. I set things up this way because certain AI functions need to complete quickly, so it's not reasonable to wait out the default 30-second timeout. This leads to a lot of redundancy in my clients (really, in the "strategies").

With the proposed functionality, I could instead do something like:

client<llm> AzureGPT35Turbo {
    provider baml-azure-chat
    retry_policy ZenfetchDefaultPolicy
    options {
      api_key env.AZURE_OPENAI_API_KEY
      api_base env.AZURE_OPENAI_BASE
      engine env.AZURE_GPT_35_TURBO_DEPLOYMENT_NAME
      api_version "2023-07-01-preview"
      api_type azure
      request_timeout 30
    }
}

// My version of "strategy"
client<llm> GPTFamilyShortTimeout {
  provider baml-fallback
  options {
    strategy [
      AzureGPT35Turbo.options(request_timeout=5),
    ]
  }
}
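To make the requested behavior concrete: the proposal is that a fallback strategy tries derived clients in order, where a derived client is the base client with only the overridden fields changed. Here is a rough Python sketch of that semantics; `LLMClient`, `FallbackStrategy`, and the `send` callable are hypothetical illustrations, not BAML's actual API.

```python
import copy

class LLMClient:
    """Hypothetical stand-in for a BAML client definition (not real BAML API)."""

    def __init__(self, name, options):
        self.name = name
        self.options = dict(options)

    def with_options(self, **overrides):
        # The analogue of AzureGPT35Turbo.options(request_timeout=5):
        # clone the client, overriding only the named option fields.
        clone = copy.deepcopy(self)
        clone.options.update(overrides)
        return clone

class FallbackStrategy:
    """Try each client in order; return the first successful result."""

    def __init__(self, clients):
        self.clients = clients

    def call(self, prompt, send):
        errors = []
        for client in self.clients:
            try:
                return send(client, prompt)
            except Exception as exc:  # in practice, only retryable errors
                errors.append((client.name, exc))
        raise RuntimeError(f"all clients failed: {errors}")

# One base client; the strategy derives a short-timeout variant from it,
# so only the request_timeout is declared twice, not the whole client.
base = LLMClient("AzureGPT35Turbo", {"request_timeout": 30})
strategy = FallbackStrategy([base.with_options(request_timeout=5), base])
```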
@aaronvg
Contributor

aaronvg commented Dec 30, 2023

It makes sense that you don't want to have to declare a whole new client just to change one property: the timeout.

We will look at how a user could add property-level overrides and/or indicate parallel execution in BAML syntax.
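For illustration, the parallel-execution idea mentioned above could behave roughly as follows: send the prompt to several clients at once and take the first success, falling through only if every client fails. This is a hedged Python sketch, not BAML; the clients and the `send` callable are hypothetical placeholders.

```python
from concurrent.futures import FIRST_COMPLETED, ThreadPoolExecutor, wait

def race(clients, prompt, send):
    """Dispatch prompt to all clients concurrently; return the first success.

    Raises if every client fails. `clients` can be any objects accepted by
    the caller-supplied `send(client, prompt)` callable.
    """
    with ThreadPoolExecutor(max_workers=len(clients)) as pool:
        pending = {pool.submit(send, client, prompt) for client in clients}
        errors = []
        while pending:
            done, pending = wait(pending, return_when=FIRST_COMPLETED)
            for future in done:
                try:
                    return future.result()  # first success wins
                except Exception as exc:
                    errors.append(exc)
        raise RuntimeError(f"all clients failed: {errors}")
```

Note that a real implementation would also want to cancel or ignore the still-running requests once a winner is found; this sketch simply returns the first successful result.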

@hellovai hellovai added the invalid This doesn't seem right label Jul 10, 2024
3 participants