12 changes: 6 additions & 6 deletions pages/managed-inference/reference-content/supported-models.mdx

Scaleway Managed Inference allows you to deploy various AI models, either from:

* [Scaleway model catalog](#scaleway-model-catalog): A curated set of ready-to-deploy models available through the [Scaleway console](https://console.scaleway.com/inference/deployments/) or the [Managed Inference models API](https://www.scaleway.com/en/developers/api/inference/#path-models-list-models)
* [Custom models](#custom-models): Models that you import, typically from sources like Hugging Face.
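As a sketch, catalog models can be listed programmatically with the Managed Inference models API. The endpoint path and version below are assumptions for illustration only; check the API reference linked above for the actual route. Authentication uses Scaleway's standard `X-Auth-Token` header.

```python
import json
import os
import urllib.request

# Hypothetical base URL -- confirm the path and version against the
# Managed Inference API reference before relying on it.
BASE = "https://api.scaleway.com/inference/v1"

def models_url(region: str) -> str:
    """Build the (assumed) list-models URL for a given region."""
    return f"{BASE}/regions/{region}/models"

def list_models(region: str = "fr-par") -> dict:
    """Fetch the model catalog; requires a valid Scaleway secret key."""
    req = urllib.request.Request(
        models_url(region),
        headers={"X-Auth-Token": os.environ["SCW_SECRET_KEY"]},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)
```

Set `SCW_SECRET_KEY` in your environment before calling `list_models()`.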

## Scaleway model catalog
You can find a complete list of all models available in Scaleway's catalog on th
## Custom models

<Message type="note">
Custom model support is currently in **beta**. If you encounter issues or limitations, please report them via our [Slack community channel](https://scaleway-community.slack.com/archives/C01SGLGRLEA) or [customer support](https://console.scaleway.com/support/tickets/create?for=product&productName=inference).
</Message>

### Prerequisites

<Message type="tip">
We recommend starting with a variation of a supported model from the Scaleway catalog.
For example, you can deploy a [quantized (4-bit) version of Llama 3.3](https://huggingface.co/unsloth/Llama-3.3-70B-Instruct-bnb-4bit).
If deploying a fine-tuned version of Llama 3.3, make sure your file structure matches the example linked above.
Examples whose compatibility has been tested are available in [tested models](#known-compatible-models).
</Message>
To deploy a custom model via Hugging Face, ensure the following:

* You must have access to the model using your Hugging Face credentials.
* For gated models, request access through your Hugging Face account.
* Credentials are not stored, but we recommend using [read or fine-grained access tokens](https://huggingface.co/docs/hub/security-tokens).
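In practice, a read or fine-grained token is passed to the Hugging Face Hub as a bearer header. The helpers below are a hypothetical sketch (the function names are ours, not an official API); they assume the Hub's standard `resolve/<revision>/<file>` URL pattern, where gated or private repositories return an HTTP error without valid credentials.

```python
import os
import urllib.error
import urllib.request

def hf_auth_headers(token=None):
    """Authorization header for the Hugging Face Hub (read or fine-grained token)."""
    token = token or os.environ.get("HF_TOKEN", "")
    if not token:
        raise ValueError("set HF_TOKEN to a read or fine-grained access token")
    return {"Authorization": f"Bearer {token}"}

def can_access(repo_id, token=None):
    """Return True if the token can read the repo's config.json.

    Gated/private repos answer with an HTTP error for unauthorized tokens.
    """
    url = f"https://huggingface.co/{repo_id}/resolve/main/config.json"
    req = urllib.request.Request(url, method="HEAD", headers=hf_auth_headers(token))
    try:
        with urllib.request.urlopen(req):
            return True
    except urllib.error.HTTPError:
        return False
```

For example, `can_access("meta-llama/Llama-3.3-70B-Instruct")` would indicate whether your token has been granted access to that gated repository.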

#### Required files

Your model repository must include:
* A `config.json` file containing:
* An `architectures` array (see [supported architectures](#supported-models-architecture) for the exact list of supported values).
* `max_position_embeddings`
* Model weights in the [`.safetensors`](https://huggingface.co/docs/safetensors/index) format
* A `tokenizer.json` file
* If you are fine-tuning an existing model, we recommend using the same `tokenizer.json` file as the base model.
* A chat template included in either:
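The file checks above can be sketched as a small local validator, assuming you have a local clone of the repository. This is a hypothetical helper, not part of any Scaleway tooling, and it leaves out the chat-template check:

```python
import json
from pathlib import Path

def check_model_repo(repo_dir):
    """Return a list of problems found against the required-files checklist."""
    repo = Path(repo_dir)
    problems = []

    config = repo / "config.json"
    if not config.is_file():
        problems.append("missing config.json")
    else:
        cfg = json.loads(config.read_text())
        if not cfg.get("architectures"):
            problems.append("config.json has no architectures array")
        if "max_position_embeddings" not in cfg:
            problems.append("config.json has no max_position_embeddings")

    # Weights must be in the .safetensors format.
    if not list(repo.glob("*.safetensors")):
        problems.append("no .safetensors weight files")

    if not (repo / "tokenizer.json").is_file():
        problems.append("missing tokenizer.json")

    return problems
```

An empty return value means every check passed; otherwise each entry names a missing requirement.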
Your model must be one of the following types:

<Message type="important">
**Security Notice**<br />
Models using formats that allow arbitrary code execution, such as Python [`pickle`](https://docs.python.org/3/library/pickle.html), are **not supported**.
</Message>

## API support