Adding load balancing for Services

Azure customers can provision many Azure OpenAI endpoints, each with a limited number of tokens per minute (and other vendors may impose similar limits in the future). Because SK is limited to one endpoint at a time, the available token quota of that endpoint can be exhausted.

The idea is to extend the KernelBuilder or the Kernel with an attribute named "EnableLoadBalancing". Currently, the Kernel's GetRequiredService<T> returns only the last registered service. It could easily be modified to return a random registered service of type T instead, which would add load balancing whenever multiple services are registered in the Kernel.
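To make the proposal concrete, here is a minimal, language-neutral sketch in Python. It is not SK's actual API: `ServiceRegistry`, `add_service`, `get_required_service`, `ChatEndpoint`, and the example URLs are all hypothetical names chosen for illustration. Only the `enable_load_balancing` flag mirrors the "EnableLoadBalancing" attribute suggested above. The sketch assumes uniform random selection is good enough and does not track per-endpoint quota.

```python
import random
from collections import defaultdict
from typing import Any, Optional


class ChatEndpoint:
    """Hypothetical service type standing in for one Azure OpenAI endpoint."""

    def __init__(self, url: str) -> None:
        self.url = url


class ServiceRegistry:
    """Hypothetical stand-in for the Kernel's service collection (not SK's API)."""

    def __init__(self, enable_load_balancing: bool = False,
                 rng: Optional[random.Random] = None) -> None:
        self.enable_load_balancing = enable_load_balancing
        # An injectable random source keeps the selection testable.
        self._rng = rng or random.Random()
        self._services = defaultdict(list)

    def add_service(self, service_type: type, service: Any) -> None:
        # Several services may be registered under the same type.
        self._services[service_type].append(service)

    def get_required_service(self, service_type: type) -> Any:
        candidates = self._services.get(service_type)
        if not candidates:
            raise KeyError(f"No service registered for {service_type.__name__}")
        if self.enable_load_balancing:
            # Proposed behavior: pick any registered service at random.
            return self._rng.choice(candidates)
        # Behavior described in this issue: the last registered service wins.
        return candidates[-1]


# Usage: register two endpoints and let the registry spread requests over them.
registry = ServiceRegistry(enable_load_balancing=True)
registry.add_service(ChatEndpoint, ChatEndpoint("https://endpoint-a.example"))
registry.add_service(ChatEndpoint, ChatEndpoint("https://endpoint-b.example"))
chosen = registry.get_required_service(ChatEndpoint)
```

With the flag off, the lookup behaves as before, so existing callers would be unaffected. Random selection only spreads load on average; a quota-aware or round-robin strategy would be a possible refinement.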

I would take on the development myself.


Adding load balancing for Services #4902
