@@ -1,7 +1,7 @@
---
title: AI Endpoints - Using Virtual Models
excerpt: Learn how to use OVHcloud AI Endpoints Virtual Models
-updated: 2025-08-18
+updated: 2025-10-12
---

> [!primary]
@@ -42,6 +42,11 @@ Follow the instructions in the [AI Endpoints - Getting Started](/pages/public_cl

## Model DSL

+> [!warning]
+>
+> As our virtual model feature allows dynamic model switching, the model’s characteristics (including pricing or context size) may change when a newer model is selected to handle your query. If you need certain characteristics to remain fixed, you can lock them using the query conditions listed below.
+>

When you request an LLM generation through our unified endpoint, you can provide a model DSL query in the OpenAI-compliant `model` field instead of a hardcoded model name.

These queries are divided into three parts: tag, ranker, and condition:
@@ -54,10 +59,10 @@ Below are some example queries and the models they currently resolve to. Please

| Model Query | Current Target Model | Usage |
|-----------|-----------|-----------|
-| code_chat@latest | Qwen3-32B | The most recently released model optimized for code chat tasks |
-| meta-llama@latest | Llama-3.1-8B-Instruct | The latest Meta-released LLaMA model |
-| mistral@latest?context_size > 100000 | Mistral-Small-3.2-24B-Instruct-2506 | The latest Mistral model with a context window greater than 100k tokens |
-| llama@biggest?input_cost<0.5 | Llama-3.1-8B-Instruct | The largest LLaMA model whose input token cost is under €0.50 per 1M tokens |
+| code_chat@latest | **Example:** Qwen3-32B | The most recently released model optimized for code chat tasks |
+| meta-llama@latest | **Example:** Llama-3.1-8B-Instruct | The latest Meta-released LLaMA model |
+| mistral@latest?context_size > 100000 | **Example:** Mistral-Small-3.2-24B-Instruct-2506 | The latest Mistral model with a context window greater than 100k tokens |
+| llama@biggest?input_cost<0.5 | **Example:** Llama-3.1-8B-Instruct | The largest LLaMA model whose input token cost is under €0.50 per 1M tokens |

You can visit our [catalog](https://endpoints.ai.cloud.ovh.net/catalog) to learn more about the different model specifications.
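
Because the DSL query travels in the standard `model` field, any OpenAI-compatible client can send it unchanged. Below is a minimal sketch using the official `openai` Python package; the base URL and the token environment variable are assumptions for illustration, so substitute the values from the Getting Started guide linked above.

```python
import os

from openai import OpenAI

# Minimal sketch: the base URL and the environment variable name are
# assumptions; use the endpoint and credentials from the Getting Started guide.
client = OpenAI(
    base_url="https://oai.endpoints.kepler.ai.cloud.ovh.net/v1",  # assumed unified endpoint
    api_key=os.environ["OVH_AI_ENDPOINTS_ACCESS_TOKEN"],  # assumed variable name
)

# A model DSL query in the standard `model` field: tag `mistral`,
# ranker `latest`, and a condition locking the minimum context window.
response = client.chat.completions.create(
    model="mistral@latest?context_size>100000",
    messages=[{"role": "user", "content": "Summarize the model DSL in one sentence."}],
)
print(response.choices[0].message.content)
```

Conditions such as `context_size>100000` act as the locks mentioned in the warning above: the resolved model may still change over time, but only among models that satisfy the condition.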

@@ -1,7 +1,7 @@
---
title: AI Endpoints - Virtual Models
excerpt: "Learn how to use AI Endpoints virtual models"
-updated: 2025-08-18
+updated: 2025-10-12
---

> [!primary]
@@ -42,6 +42,11 @@ Follow the instructions in the [AI Endpoints - Getting Started](/pages/public_cl

## Model DSL

+> [!warning]
+>
+> As our virtual model feature allows dynamic model switching, the model’s characteristics (including pricing or context size) may change when a newer model is selected to handle your query. If you need certain characteristics to remain fixed, you can lock them using the query conditions listed below.
+>

When you request an LLM generation through our unified endpoint, you can provide a model DSL query in the OpenAI-compliant `model` field instead of a hardcoded model name.

These queries are divided into three parts: tag, ranker, and condition:
@@ -54,10 +59,10 @@ Below are some example queries and the models they currently resolve to. Please

| Model Query | Current Target Model | Usage |
|-----------|-----------|-----------|
-| code_chat@latest | Qwen3-32B | The most recently released model optimized for code chat tasks |
-| meta-llama@latest | Llama-3.1-8B-Instruct | The latest Meta-released LLaMA model |
-| mistral@latest?context_size > 100000 | Mistral-Small-3.2-24B-Instruct-2506 | The latest Mistral model with a context window greater than 100k tokens |
-| llama@biggest?input_cost<0.5 | Llama-3.1-8B-Instruct | The largest LLaMA model whose input token cost is under €0.50 per 1M tokens |
+| code_chat@latest | **Example:** Qwen3-32B | The most recently released model optimized for code chat tasks |
+| meta-llama@latest | **Example:** Llama-3.1-8B-Instruct | The latest Meta-released LLaMA model |
+| mistral@latest?context_size > 100000 | **Example:** Mistral-Small-3.2-24B-Instruct-2506 | The latest Mistral model with a context window greater than 100k tokens |
+| llama@biggest?input_cost<0.5 | **Example:** Llama-3.1-8B-Instruct | The largest LLaMA model whose input token cost is under €0.50 per 1M tokens |

You can visit our [catalog](https://endpoints.ai.cloud.ovh.net/catalog) to learn more about the different model specifications.

@@ -141,5 +146,4 @@ If you need training or technical assistance to implement our solutions, contact

Please send us your questions, feedback and suggestions to improve the service:

-- On the OVHcloud [Discord server](https://discord.gg/ovhcloud).
-
+- On the OVHcloud [Discord server](https://discord.gg/ovhcloud).