Skip to content

Standardize Inference Providers to Use OpenAIMixin #3387

@mattf

Description

@mattf

🤔 What is the technical debt you think should be addressed?

When the inference providers were originally created, shared mixins like OpenAIMixin and LiteLLMOpenAIMixin did not exist. As a result, many providers implemented their own logic manually and inconsistently.

Now that these mixins are available and some providers have adopted them, we have a fragmented implementation across the codebase. This results in:

  • Duplicated logic (e.g. for streaming, parameter handling, response formatting)
  • Inconsistent behavior across providers
  • Increased maintenance burden
  • Higher likelihood of subtle bugs and divergent implementations

💡 What is the benefit of addressing this technical debt?

  • Consistency: All inference providers follow the same behavior.
  • Maintainability: Changes (e.g. API updates, bug fixes) can be made in one place.
  • Reduced Duplication: Shared logic eliminates repeated code across providers.
  • Scalability: Easier to onboard or implement new providers.
  • Better Testing: Shared mixins can be tested centrally, increasing reliability.

Inference providers

provider chat completions embeddings status notes
anthropic yes yes yes #3366
azure openai yes yes yes #3396
bedrock yes yes no #3410 via OpenAIChatCompletionToLlamaStackMixin and OpenAICompletionToLlamaStackMixin, openai-compat
cerebras yes yes no #3481
databricks yes no no #3500
fireworks yes yes yes #3480
gemini yes yes yes #3351
groq yes yes yes #3348
llama yes yes yes #2835
nvidia yes yes yes #2835
ollama yes yes yes #3395
openai yes yes yes #2835
runpod yes no no TODO vLLM based, uses openai, openai-compat
sambanova yes yes yes #3345
tgi yes yes no #3417
hf::serverless yes yes no TODO BROKEN: #3415
hf::endpoints yes yes no TODO
together yes yes yes #3458
vertexai yes yes no #3377
vllm yes yes no #3404
watsonx yes yes no TODO custom

Metadata

Metadata

Assignees

Labels

Type

No type

Projects

No projects

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions