Adding load balancing for Services #4902

@Jenscaasen

Description

Azure customers can provision many Azure OpenAI endpoints, each with its own limit on tokens per minute, and other vendors may work the same way in the future. Because SK is limited to one endpoint at a time, that endpoint's token quota can be exhausted even when other endpoints are available.

The idea is to extend the KernelBuilder or the Kernel with an attribute named "EnableLoadBalancing". Currently, the Kernel's GetRequiredService returns only the last registered service. This could easily be changed to return a random service instead, which would add load balancing when multiple services are registered in the Kernel.
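The selection rule described above can be sketched in a few lines. This is only an illustration of "last registered" versus "random" lookup, not the Semantic Kernel API: `ServiceRegistry`, `enable_load_balancing`, `add`, and the endpoint names are all hypothetical, and the real Kernel resolves services through its own mechanism.

```python
import random
from collections import Counter
from typing import Generic, List, Optional, TypeVar

T = TypeVar("T")


class ServiceRegistry(Generic[T]):
    """Hypothetical stand-in for a kernel's service collection (not SK code)."""

    def __init__(
        self,
        enable_load_balancing: bool = False,
        rng: Optional[random.Random] = None,
    ) -> None:
        self._services: List[T] = []
        self._enable_load_balancing = enable_load_balancing
        # An injectable RNG keeps the behavior reproducible in tests.
        self._rng = rng or random.Random()

    def add(self, service: T) -> None:
        """Register a service; registration order is preserved."""
        self._services.append(service)

    def get_required_service(self) -> T:
        """Return one registered service, or raise if none exists."""
        if not self._services:
            raise LookupError("no service registered")
        if self._enable_load_balancing:
            # Proposed behavior: pick uniformly at random among all services.
            return self._rng.choice(self._services)
        # Behavior described in the issue: the last registered service wins.
        return self._services[-1]


# Usage: three made-up endpoint names, spread over 3000 lookups.
registry: ServiceRegistry[str] = ServiceRegistry(enable_load_balancing=True)
for name in ("endpoint-a", "endpoint-b", "endpoint-c"):
    registry.add(name)

counts = Counter(registry.get_required_service() for _ in range(3000))
print(counts)
```

Uniform random choice spreads requests evenly only on average and does not react to an endpoint that has already hit its quota. A round-robin or quota-aware strategy would be a different design from the one proposed here.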

I would take on the development myself.

Metadata

Labels

question: Further information is requested

Projects

Status

Sprint: Done
