feat: add OVHcloud AI Endpoints provider support #3343
Add `ovhcloud` as a built-in alias provider for OVHcloud AI Endpoints, following the exact pattern established for Baseten in PR #3341. Users can now write `provider: ovhcloud` (or inline `ovhcloud/<model>`) and get the correct base URL, token env var, and consecutive-system-message merge behaviour automatically, instead of hand-rolling a `providers:` block as shown in #3145.

- Alias registered in aliases.go with APIType openai, BaseURL https://oai.endpoints.kepler.ai.cloud.ovh.net/v1, TokenEnvVar OVH_AI_ENDPOINTS_ACCESS_TOKEN
- Default/featured model: Qwen3-235B-A22B
- shouldMergeConsecutiveMessages extended to include ovhcloud (load-bearing fix for #3145: Qwen3 on OVHcloud returns an empty stream when a request carries more than one system message; docker-agent emits one per toolset)
- Auto-detection added to cloudProviders (after baseten in priority order)
- Provider page, nav, concept/config doc updates, and example YAML included
- delta.reasoning streaming (thinking display from #3145) is already handled generically in the oaistream adapter, so no adapter changes are needed

Closes #3342
Related: #3145
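For readers unfamiliar with the merge behaviour described above, here is a minimal sketch of what coalescing consecutive system messages could look like. The `Message` type, the function name, and the blank-line separator are assumptions made for illustration; they are not taken from this PR's diff, and the real `shouldMergeConsecutiveMessages` path may differ.

```go
package main

import "fmt"

// Message is a hypothetical, simplified chat message. The real type used by
// docker-agent is not shown in this conversation.
type Message struct {
	Role    string
	Content string
}

// mergeConsecutiveSystemMessages collapses each run of adjacent system
// messages into one system message, joining their contents with a blank line
// (the separator is an assumption). Other messages are passed through as-is.
func mergeConsecutiveSystemMessages(msgs []Message) []Message {
	out := make([]Message, 0, len(msgs))
	for _, m := range msgs {
		last := len(out) - 1
		if m.Role == "system" && last >= 0 && out[last].Role == "system" {
			out[last].Content += "\n\n" + m.Content
			continue
		}
		out = append(out, m)
	}
	return out
}

func main() {
	msgs := []Message{
		{Role: "system", Content: "You are a helpful agent."},
		{Role: "system", Content: "Toolset A instructions."},
		{Role: "user", Content: "Hello"},
	}
	for _, m := range mergeConsecutiveSystemMessages(msgs) {
		fmt.Printf("%s: %q\n", m.Role, m.Content)
	}
}
```

With one system message per toolset, a request built this way carries a single system message, which is the condition the PR says Qwen3 on OVHcloud needs in order to return a non-empty stream.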
Sayt-0
left a comment
Summary
The alias plumbing, auto-detection, consecutive system-message merge, docs coverage, and test coverage are all correct and consistent with the Baseten pattern (#3341). One blocking issue: the default model Qwen3-235B-A22B is not present in the current OVHcloud AI Endpoints catalogue, which breaks the out-of-the-box experience the PR advertises.
Model IDs verified against the catalogue
Checked against the live /v1/models endpoint (https://oai.endpoints.kepler.ai.cloud.ovh.net/v1/models) and the official catalogue.
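A check like this can be scripted. The sketch below assumes the endpoint returns the OpenAI-style list shape (`{"data":[{"id":"..."}]}`), which is not confirmed in this thread. The `missingModels` helper and the sample payload are hypothetical, and the payload is illustrative rather than a capture of the live response.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// modelList mirrors the OpenAI-style model list response. That OVHcloud's
// /v1/models uses exactly this shape is an assumption.
type modelList struct {
	Data []struct {
		ID string `json:"id"`
	} `json:"data"`
}

// missingModels returns the wanted IDs that do not appear in the payload,
// preserving the order in which they were requested.
func missingModels(payload []byte, wanted []string) ([]string, error) {
	var list modelList
	if err := json.Unmarshal(payload, &list); err != nil {
		return nil, fmt.Errorf("decode model list: %w", err)
	}
	present := make(map[string]bool, len(list.Data))
	for _, m := range list.Data {
		present[m.ID] = true
	}
	var missing []string
	for _, id := range wanted {
		if !present[id] {
			missing = append(missing, id)
		}
	}
	return missing, nil
}

func main() {
	// Illustrative payload only; fetch the real one from the /v1/models URL.
	payload := []byte(`{"object":"list","data":[{"id":"Qwen3-32B"},{"id":"Qwen3.5-397B-A17B"}]}`)
	missing, err := missingModels(payload, []string{"Qwen3-235B-A22B", "Qwen3-32B"})
	if err != nil {
		panic(err)
	}
	fmt.Println("missing from catalogue:", missing)
}
```

Running the same comparison against the live response before merge would catch a stale default even though `TestParseExamples` skips this provider.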
| Model referenced | In current catalogue |
|---|---|
| Qwen3-235B-A22B (default) | no |
| Qwen3-32B | yes |
| Meta-Llama-3_3-70B-Instruct | yes |
| Mistral-Small-3.2-24B-Instruct-2506 | yes |
| Mixtral-8x7B-Instruct-v0.1 | no |
| DeepSeek-R1-Distill-Llama-70B | no |
The catalogue appears to have rotated its large Qwen3 MoE from Qwen3-235B-A22B to Qwen3.5-397B-A17B. Other current Qwen3 IDs: Qwen3.6-27B, Qwen3.5-9B, Qwen3-Coder-30B-A3B-Instruct.
Impact
- `DefaultModels["ovhcloud"]` is what auto-selection returns when only `OVH_AI_ENDPOINTS_ACCESS_TOKEN` is set, so `docker-agent run` with just the token requests a nonexistent model and fails.
- `examples/ovhcloud.yaml` and the doc snippets do not work as written.
- `TestParseExamples` does not catch this, because `ovhcloud` was added to `modelsDevAbsentProviders`, which disables the model-existence check for this provider.
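To make the token-only failure path concrete, here is a rough sketch of how auto-selection over a priority-ordered provider list might resolve to the pinned default. The struct, the `autoSelect` function, and the Baseten env var name are assumptions for illustration. Only the `ovhcloud` env var, the two default model strings, and the baseten-before-ovhcloud ordering come from this PR.

```go
package main

import "fmt"

// cloudProvider pairs a provider name with the env var whose presence marks
// the provider as configured. This struct is hypothetical.
type cloudProvider struct {
	Name        string
	TokenEnvVar string
}

// Priority order: baseten is checked before ovhcloud, as stated in the PR.
var cloudProviders = []cloudProvider{
	{Name: "baseten", TokenEnvVar: "BASETEN_API_KEY"}, // env var name is an assumption
	{Name: "ovhcloud", TokenEnvVar: "OVH_AI_ENDPOINTS_ACCESS_TOKEN"},
}

// defaultModels holds the values as first submitted in this PR, including
// the ovhcloud ID that the review found missing from the catalogue.
var defaultModels = map[string]string{
	"baseten":  "deepseek-ai/DeepSeek-V3.1",
	"ovhcloud": "Qwen3-235B-A22B",
}

// autoSelect returns "provider/model" for the first provider whose token env
// var is non-empty. It never checks that the model exists upstream.
func autoSelect(getenv func(string) string) (string, bool) {
	for _, p := range cloudProviders {
		if getenv(p.TokenEnvVar) == "" {
			continue
		}
		if model, ok := defaultModels[p.Name]; ok {
			return p.Name + "/" + model, true
		}
	}
	return "", false
}

func main() {
	// Simulate an environment where only the OVHcloud token is set.
	env := map[string]string{"OVH_AI_ENDPOINTS_ACCESS_TOKEN": "dummy"}
	if sel, ok := autoSelect(func(k string) string { return env[k] }); ok {
		fmt.Println(sel)
	}
}
```

Because nothing in this path validates the ID against the catalogue, a stale entry in the defaults map only surfaces at request time.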
Suggested fix
- Point the default at a model that currently exists, e.g. `Qwen3.5-397B-A17B` (current large MoE, confirmed working) or `Qwen3-32B` (lighter, more likely free-tier). Propagate to `DefaultModels`, `examples/ovhcloud.yaml`, and the doc snippets.
- Drop or replace the three stale rows in the model table.
- Optional doc improvement: mention `provider_opts.enable_thinking: false` for Qwen3 to suppress reasoning output, and confirm which model the free tier serves.
Not blocking (correct as-is)
Env var name (OVH_AI_ENDPOINTS_ACCESS_TOKEN), base URL, auto-detection precedence, and the merge path via Provider == "ovhcloud" are all correct. Because the catalogue rotates, pinning any single default is inherently fragile, but the pinned ID should at least exist at merge time.
| "mistral": "mistral-small-latest", | ||
| "openrouter": "meta-llama/llama-3.3-70b-instruct", | ||
| "baseten": "deepseek-ai/DeepSeek-V3.1", | ||
| "ovhcloud": "Qwen3-235B-A22B", |
Qwen3-235B-A22B is not in the current OVHcloud catalogue (not returned by https://oai.endpoints.kepler.ai.cloud.ovh.net/v1/models). This is the model auto-selection returns when only OVH_AI_ENDPOINTS_ACCESS_TOKEN is set, so the token-only path fails at runtime. Suggest an existing ID, e.g. Qwen3.5-397B-A17B (current large MoE) or Qwen3-32B.
| models: | ||
| ovhcloud_model: | ||
| provider: ovhcloud | ||
| model: Qwen3-235B-A22B |
Same nonexistent model, so this example fails as written. Suggest Qwen3.5-397B-A17B or Qwen3-32B.
| ```yaml | ||
| agents: | ||
| root: | ||
| model: ovhcloud/Qwen3-235B-A22B |
Nonexistent model in the inline example. Suggest an existing ID (e.g. ovhcloud/Qwen3.5-397B-A17B). The Named Model snippet at line 45 has the same issue.
| | `Qwen3-235B-A22B` | Large Qwen3 MoE — strong general, coding, and reasoning | | ||
| | `Qwen3-32B` | Mid-size Qwen3 — fast, tool-calling, reasoning | | ||
| | `Meta-Llama-3_3-70B-Instruct` | Llama 3.3 70B — reliable general-purpose chat | | ||
| | `Mistral-Small-3.2-24B-Instruct-2506` | Compact, fast, tool-calling | | ||
| | `Mixtral-8x7B-Instruct-v0.1` | Mixtral MoE — cost-efficient | | ||
| | `DeepSeek-R1-Distill-Llama-70B` | Distilled reasoning model | |
Three of these IDs are absent from the current catalogue: Qwen3-235B-A22B (line 64), Mixtral-8x7B-Instruct-v0.1 (line 68), and DeepSeek-R1-Distill-Llama-70B (line 69). Current options to list instead: Qwen3.5-397B-A17B, Qwen3.6-27B, Qwen3.5-9B, Qwen3-Coder-30B-A3B-Instruct.
Qwen3-235B-A22B is not in the current OVHcloud AI Endpoints catalogue. Replace it with Qwen3.5-397B-A17B (confirmed present) everywhere: DefaultModels, examples/ovhcloud.yaml, and all doc snippets. Also refresh the model table in docs/providers/ovhcloud/index.md: drop the three stale rows (Qwen3-235B-A22B, Mixtral-8x7B-Instruct-v0.1, DeepSeek-R1-Distill-Llama-70B) and add current catalogue IDs (Qwen3.5-397B-A17B, Qwen3.6-27B, Qwen3.5-9B, Qwen3-Coder-30B-A3B-Instruct). Addresses review feedback on PR #3343.
Addressed all review findings in 310a140:
Sayt-0
left a comment
The changes are correct! Approving!
Add `ovhcloud` as a built-in alias provider for OVHcloud AI Endpoints (closes #3342, related #3145).

Users can now write `provider: ovhcloud` (or inline `ovhcloud/Qwen3-235B-A22B`) without hand-rolling a custom `providers:` block.

Highlights

- `Qwen3-235B-A22B` with no billing setup required.
- `shouldMergeConsecutiveMessages` is extended to coalesce consecutive system messages for `ovhcloud`, exactly as PR #3341 (feat: add Baseten provider support) did for `baseten`. Without this, Qwen3 on OVHcloud returns an empty stream when a request carries more than one system message (docker-agent emits one per toolset).
- `main`.

Files changed (15): alias + auto-detection Go code with full test coverage, provider doc page, nav/concept/config doc updates, example YAML.

Note: The provider page describes the API type as `openai_chatcompletions` (matching the Baseten docs convention) even though the internal alias field is `"openai"`. This is consistent with all other OpenAI-compatible alias docs and does not affect runtime behaviour.