feat: add OVHcloud AI Endpoints provider support#3343

Merged
Sayt-0 merged 2 commits into main from feat/ovhcloud-provider on Jul 1, 2026

@aheritier

Contributor

Add ovhcloud as a built-in alias provider for OVHcloud AI Endpoints (closes #3342, related #3145).

Users can now write provider: ovhcloud (or inline ovhcloud/Qwen3-235B-A22B) without hand-rolling a custom providers: block.
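
To illustrate the inline form, here is a minimal sketch of splitting a `provider/model` reference. `splitModelRef` is a hypothetical helper, not docker-agent's actual parser. It assumes the provider is everything before the first slash, so model IDs that contain slashes themselves (such as the openrouter default) stay intact.

```go
package main

import (
	"fmt"
	"strings"
)

// splitModelRef is a hypothetical helper (not taken from docker-agent).
// It splits an inline reference such as "ovhcloud/Qwen3-32B" at the first
// slash and rejects references with an empty provider or model part.
func splitModelRef(ref string) (string, string, bool) {
	provider, model, found := strings.Cut(ref, "/")
	if !found || provider == "" || model == "" {
		return "", "", false
	}
	return provider, model, true
}

func main() {
	provider, model, ok := splitModelRef("ovhcloud/Qwen3-32B")
	fmt.Println(provider, model, ok)
}
```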

Highlights

Files changed (15): alias + auto-detection Go code with full test coverage, provider doc page, nav/concept/config doc updates, example YAML.

Note: The provider page describes the API type as openai_chatcompletions (matching the Baseten docs convention) even though the internal alias field is "openai". This is consistent with all other OpenAI-compatible alias docs and does not affect runtime behaviour.

Add `ovhcloud` as a built-in alias provider for OVHcloud AI Endpoints,
following the exact pattern established for Baseten in PR #3341.

Users can now write `provider: ovhcloud` (or inline `ovhcloud/<model>`)
and get the correct base URL, token env var, and consecutive-system-message
merge behaviour automatically — instead of hand-rolling a `providers:`
block as shown in #3145.
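
As a rough picture of what the alias saves users from writing by hand, the sketch below resolves a provider name to its base URL and token. The `Alias` type, the `aliases` map, and `resolve` are hypothetical stand-ins. Only the three field values come from this PR's description; the real `aliases.go` may be shaped differently.

```go
package main

import (
	"fmt"
	"os"
)

// Alias is a hypothetical stand-in for the alias record this PR describes.
// The field names mirror the description, not necessarily aliases.go.
type Alias struct {
	APIType     string
	BaseURL     string
	TokenEnvVar string
}

// aliases holds only the values quoted in the PR description.
var aliases = map[string]Alias{
	"ovhcloud": {
		APIType:     "openai",
		BaseURL:     "https://oai.endpoints.kepler.ai.cloud.ovh.net/v1",
		TokenEnvVar: "OVH_AI_ENDPOINTS_ACCESS_TOKEN",
	},
}

// resolve looks up the alias for a provider name and reads the token from
// the environment variable the alias names. The token is empty if unset.
func resolve(provider string, getenv func(string) string) (Alias, string, bool) {
	a, ok := aliases[provider]
	if !ok {
		return Alias{}, "", false
	}
	return a, getenv(a.TokenEnvVar), true
}

func main() {
	a, token, ok := resolve("ovhcloud", os.Getenv)
	fmt.Println(ok, a.BaseURL, token != "")
}
```

Passing `getenv` as a parameter keeps the lookup testable without touching the real environment.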

- Alias registered in aliases.go with APIType openai,
  BaseURL https://oai.endpoints.kepler.ai.cloud.ovh.net/v1,
  TokenEnvVar OVH_AI_ENDPOINTS_ACCESS_TOKEN
- Default/featured model: Qwen3-235B-A22B
- shouldMergeConsecutiveMessages extended to include ovhcloud (load-bearing
  fix for #3145: Qwen3 on OVHcloud returns an empty stream when a request
  carries more than one system message; docker-agent emits one per toolset)
- Auto-detection added to cloudProviders (after baseten in priority order)
- Provider page, nav, concept/config doc updates, and example YAML included
- delta.reasoning streaming (thinking display from #3145) is already handled
  generically in the oaistream adapter — no adapter changes needed
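
The merge behaviour mentioned in the list above can be sketched as follows. This is an illustration under stated assumptions, not the code in this PR: the `Message` type, the `"\n\n"` separator, and the body of `shouldMerge` are hypothetical. Only the idea (collapse consecutive system messages for `ovhcloud`) comes from the description.

```go
package main

import "fmt"

// Message is a hypothetical, simplified chat message.
type Message struct {
	Role    string
	Content string
}

// shouldMerge is a stand-in for the provider check the PR extends.
// Only "ovhcloud" is listed here; other providers are omitted.
func shouldMerge(provider string) bool {
	return provider == "ovhcloud"
}

// mergeConsecutiveSystem joins each run of adjacent system messages into a
// single message. The blank-line separator is an assumption. The input
// slice is left unmodified.
func mergeConsecutiveSystem(msgs []Message) []Message {
	out := make([]Message, 0, len(msgs))
	for _, m := range msgs {
		if n := len(out); n > 0 && m.Role == "system" && out[n-1].Role == "system" {
			out[n-1].Content += "\n\n" + m.Content
			continue
		}
		out = append(out, m)
	}
	return out
}

func main() {
	msgs := []Message{
		{Role: "system", Content: "You are an agent."},
		{Role: "system", Content: "Toolset instructions."},
		{Role: "user", Content: "Hello"},
	}
	if shouldMerge("ovhcloud") {
		msgs = mergeConsecutiveSystem(msgs)
	}
	fmt.Println(len(msgs), msgs[0].Role)
}
```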

Closes #3342
Related: #3145
aheritier requested a review from a team as a code owner July 1, 2026 07:48
Sayt-0 self-assigned this Jul 1, 2026
aheritier added labels Jul 1, 2026:

  • area/config: For configuration parsing, YAML, environment variables
  • area/docs: Documentation changes
  • area/providers: For features/issues/fixes related to LLM providers (Bedrock, LiteLLM, Qwen, custom, etc.)
  • area/providers/openai: For features/issues/fixes related to the usage of OpenAI models
  • kind/feat: PR adds a new feature (maps to feat:). Use on PRs only.

Sayt-0 left a comment

Member


Summary

The alias plumbing, auto-detection, consecutive system-message merge, docs coverage, and test coverage are all correct and consistent with the Baseten pattern (#3341). One blocking issue: the default model Qwen3-235B-A22B is not present in the current OVHcloud AI Endpoints catalogue, which breaks the out-of-the-box experience the PR advertises.

Model IDs verified against the catalogue

Checked against the live /v1/models endpoint (https://oai.endpoints.kepler.ai.cloud.ovh.net/v1/models) and the official catalogue.

| Model referenced | In current catalogue |
| --- | --- |
| `Qwen3-235B-A22B` (default) | no |
| `Qwen3-32B` | yes |
| `Meta-Llama-3_3-70B-Instruct` | yes |
| `Mistral-Small-3.2-24B-Instruct-2506` | yes |
| `Mixtral-8x7B-Instruct-v0.1` | no |
| `DeepSeek-R1-Distill-Llama-70B` | no |

The catalogue appears to have rotated its large Qwen3 MoE from Qwen3-235B-A22B to Qwen3.5-397B-A17B. Other current Qwen3 IDs: Qwen3.6-27B, Qwen3.5-9B, Qwen3-Coder-30B-A3B-Instruct.

Impact

  • DefaultModels["ovhcloud"] is what auto-selection returns when only OVH_AI_ENDPOINTS_ACCESS_TOKEN is set, so docker-agent run with just the token requests a nonexistent model and fails.
  • examples/ovhcloud.yaml and the doc snippets do not work as written.
  • TestParseExamples does not catch this, because ovhcloud was added to modelsDevAbsentProviders, which disables the model-existence check for this provider.
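
The token-only path in the first bullet can be pictured with the sketch below. It is an illustration, not the code in `pkg/config/auto.go`: the `cloudProvider` type and `autoSelect` are hypothetical, and the Baseten environment variable name is an assumption. The default model strings are the ones shown in the diff under review.

```go
package main

import (
	"fmt"
	"os"
)

// cloudProvider pairs a provider name with the environment variable whose
// presence triggers auto-detection. The type is hypothetical.
type cloudProvider struct {
	name   string
	envVar string
}

// cloudProviders is checked in priority order. Only two entries are shown.
var cloudProviders = []cloudProvider{
	{"baseten", "BASETEN_API_KEY"}, // env var name is an assumption
	{"ovhcloud", "OVH_AI_ENDPOINTS_ACCESS_TOKEN"},
}

// defaultModels mirrors the two entries visible in the diff under review.
var defaultModels = map[string]string{
	"baseten":  "deepseek-ai/DeepSeek-V3.1",
	"ovhcloud": "Qwen3-235B-A22B", // the value this review flags
}

// autoSelect returns the first provider whose token variable is set,
// together with that provider's default model.
func autoSelect(getenv func(string) string) (string, string, bool) {
	for _, p := range cloudProviders {
		if getenv(p.envVar) != "" {
			return p.name, defaultModels[p.name], true
		}
	}
	return "", "", false
}

func main() {
	provider, model, ok := autoSelect(os.Getenv)
	fmt.Println(provider, model, ok)
}
```

Nothing in this path consults the catalogue, which is why a stale default only surfaces when the request is sent.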

Suggested fix

  • Point the default at a model that currently exists, e.g. Qwen3.5-397B-A17B (current large MoE, confirmed working) or Qwen3-32B (lighter, more likely free-tier). Propagate to DefaultModels, examples/ovhcloud.yaml, and the doc snippets.
  • Drop or replace the three stale rows in the model table.
  • Optional doc improvement: mention provider_opts.enable_thinking: false for Qwen3 to suppress reasoning output, and confirm which model the free tier serves.

Not blocking (correct as-is)

Env var name (OVH_AI_ENDPOINTS_ACCESS_TOKEN), base URL, auto-detection precedence, and the merge path via Provider == "ovhcloud" are all correct. Because the catalogue rotates, pinning any single default is inherently fragile, but the pinned ID should at least exist at merge time.

Comment thread pkg/config/auto.go Outdated
"mistral": "mistral-small-latest",
"openrouter": "meta-llama/llama-3.3-70b-instruct",
"baseten": "deepseek-ai/DeepSeek-V3.1",
"ovhcloud": "Qwen3-235B-A22B",

Member


Qwen3-235B-A22B is not in the current OVHcloud catalogue (not returned by https://oai.endpoints.kepler.ai.cloud.ovh.net/v1/models). This is the model auto-selection returns when only OVH_AI_ENDPOINTS_ACCESS_TOKEN is set, so the token-only path fails at runtime. Suggest an existing ID, e.g. Qwen3.5-397B-A17B (current large MoE) or Qwen3-32B.

Comment thread examples/ovhcloud.yaml Outdated
models:
  ovhcloud_model:
    provider: ovhcloud
    model: Qwen3-235B-A22B

Member


Same nonexistent model, so this example fails as written. Suggest Qwen3.5-397B-A17B or Qwen3-32B.

Comment thread docs/providers/ovhcloud/index.md Outdated
```yaml
agents:
  root:
    model: ovhcloud/Qwen3-235B-A22B
```

Member


Nonexistent model in the inline example. Suggest an existing ID (e.g. ovhcloud/Qwen3.5-397B-A17B). The Named Model snippet at line 45 has the same issue.

Comment thread docs/providers/ovhcloud/index.md Outdated
Comment on lines +64 to +69
| `Qwen3-235B-A22B` | Large Qwen3 MoE — strong general, coding, and reasoning |
| `Qwen3-32B` | Mid-size Qwen3 — fast, tool-calling, reasoning |
| `Meta-Llama-3_3-70B-Instruct` | Llama 3.3 70B — reliable general-purpose chat |
| `Mistral-Small-3.2-24B-Instruct-2506` | Compact, fast, tool-calling |
| `Mixtral-8x7B-Instruct-v0.1` | Mixtral MoE — cost-efficient |
| `DeepSeek-R1-Distill-Llama-70B` | Distilled reasoning model |

Member


Three of these IDs are absent from the current catalogue: Qwen3-235B-A22B (line 64), Mixtral-8x7B-Instruct-v0.1 (line 68), and DeepSeek-R1-Distill-Llama-70B (line 69). Current options to list instead: Qwen3.5-397B-A17B, Qwen3.6-27B, Qwen3.5-9B, Qwen3-Coder-30B-A3B-Instruct.

Qwen3-235B-A22B is not in the current OVHcloud AI Endpoints catalogue.
Replace it with Qwen3.5-397B-A17B (confirmed present) everywhere:
DefaultModels, examples/ovhcloud.yaml, and all doc snippets.

Also refresh the model table in docs/providers/ovhcloud/index.md:
drop the three stale rows (Qwen3-235B-A22B, Mixtral-8x7B-Instruct-v0.1,
DeepSeek-R1-Distill-Llama-70B) and add current catalogue IDs
(Qwen3.5-397B-A17B, Qwen3.6-27B, Qwen3.5-9B, Qwen3-Coder-30B-A3B-Instruct).

Addresses review feedback on PR #3343.
@aheritier

Contributor Author

Addressed all review findings in 310a140:

  • Default model (auto.go, examples/ovhcloud.yaml, all doc snippets): replaced Qwen3-235B-A22B with Qwen3.5-397B-A17B (confirmed current in the catalogue).
  • Model table: dropped the three stale rows (Qwen3-235B-A22B, Mixtral-8x7B-Instruct-v0.1, DeepSeek-R1-Distill-Llama-70B) and added the current catalogue IDs (Qwen3.5-397B-A17B, Qwen3.6-27B, Qwen3.5-9B, Qwen3-Coder-30B-A3B-Instruct).

Sayt-0 left a comment

Member


The changes are correct! Approving!

Sayt-0 merged commit 8a0d88f into main Jul 1, 2026
11 checks passed
Sayt-0 deleted the feat/ovhcloud-provider branch July 1, 2026 08:18


2 participants