Commit 55262de

Update rate-limits.mdx (#4516)
1 parent 73c1025 commit 55262de

1 file changed: +12 -12 lines changed

pages/generative-apis/reference-content/rate-limits.mdx

Lines changed: 12 additions & 12 deletions
@@ -21,28 +21,28 @@ Any model served through Scaleway Generative APIs gets limited by:
 These limits only apply if you created a Scaleway Account and registered a valid payment method. Otherwise, stricter limits apply to ensure usage stays within Free Tier only.
 </Message>
 
+## How can I increase the rate limits?
+
+We actively monitor usage and will improve rates based on feedback.
+If you need to increase your rate limits, [contact our support team](https://console.scaleway.com/support/create), providing details on the model used and specific use case.
+Note that for increases of up to x5 or x10 volumes, we highly recommend using dedicated deployments with [Managed Inference](https://console.scaleway.com/inference/deployments), which provides exactly the same features and API compatibility.
+
 ### Chat models
 
 | Model string | Requests per minute | Total tokens per minute |
 |-----------------|-----------------|-----------------|
-| `llama-3.1-8b-instruct` | 300 | 100K |
-| `llama-3.1-70b-instruct` | 300 | 100K |
-| `mistral-nemo-instruct-2407`| 300 | 100K |
-| `pixtral-12b-2409`| 300 | 100K |
-| `qwen2.5-32b-instruct`| 300 | 100K |
+| `llama-3.1-8b-instruct` | 300 | 200K |
+| `llama-3.1-70b-instruct` | 300 | 200K |
+| `mistral-nemo-instruct-2407`| 300 | 200K |
+| `pixtral-12b-2409`| 300 | 200K |
+| `qwen2.5-32b-instruct`| 300 | 200K |
 
 ### Embedding models
 
 | Model string | Requests per minute | Input tokens per minute |
 |-----------------|-----------------|-----------------|
-| `sentence-t5-xxl` | 100 | 200K |
-| `bge-multilingual-gemma2` | 100 | 200K |
+| `bge-multilingual-gemma2` | 300 | 400K |
 
 ## Why do we set rate limits?
 
 These limits safeguard against abuse or misuse of Scaleway Generative APIs, helping to ensure fair access to the API with consistent performance.
-
-## How can I increase the rate limits?
-
-We actively monitor usage and will improve rates based on feedback.
-If you need to increase your rate limits, contact us via the support team, providing details on the model used and specific use case.
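
For readers who want to stay under the per-minute limits listed in the tables of this diff, the following is a minimal client-side sketch, not part of the commit and not a Scaleway API. It assumes the limits behave like a sliding 60-second window (the page does not say how the window is measured) and that the caller can estimate the token count of each call in advance. The class name `MinuteWindowLimiter` and its methods are hypothetical; the values 300 and 200_000 are the chat-model figures added by this commit.

```python
import time
from collections import deque


class MinuteWindowLimiter:
    """Hypothetical client-side limiter for requests and tokens per minute.

    Assumption: the server counts usage over a sliding 60-second window.
    """

    def __init__(self, max_requests, max_tokens, window=60.0, clock=time.monotonic):
        if max_requests < 1 or max_tokens < 1:
            raise ValueError("limits must be positive")
        self.max_requests = max_requests
        self.max_tokens = max_tokens
        self.window = window
        self.clock = clock  # injectable so the limiter can be tested without sleeping
        self.events = deque()  # (timestamp, tokens) for calls still inside the window
        self.tokens_in_window = 0

    def _expire(self, now):
        # Drop calls that have left the sliding window.
        while self.events and now - self.events[0][0] >= self.window:
            _, tokens = self.events.popleft()
            self.tokens_in_window -= tokens

    def wait_time(self, tokens):
        """Return the seconds to wait before a call of `tokens` tokens fits both budgets."""
        if tokens > self.max_tokens:
            raise ValueError("a single call exceeds the per-minute token budget")
        now = self.clock()
        self._expire(now)
        count = len(self.events)
        used = self.tokens_in_window
        if count < self.max_requests and used + tokens <= self.max_tokens:
            return 0.0
        # Walk from the oldest call forward until enough budget would be freed.
        for stamp, spent in self.events:
            count -= 1
            used -= spent
            if count < self.max_requests and used + tokens <= self.max_tokens:
                return max(0.0, stamp + self.window - now)
        return 0.0  # not reached: an empty window always fits a valid call

    def record(self, tokens):
        """Register a call that was just sent."""
        now = self.clock()
        self._expire(now)
        self.events.append((now, tokens))
        self.tokens_in_window += tokens


if __name__ == "__main__":
    # Chat-model limits from the updated table: 300 requests and 200K tokens per minute.
    limiter = MinuteWindowLimiter(max_requests=300, max_tokens=200_000)
    estimated_tokens = 1_500  # assumed estimate for one call
    delay = limiter.wait_time(estimated_tokens)
    if delay > 0:
        time.sleep(delay)
    # ... send the request here ...
    limiter.record(estimated_tokens)
```

A limiter like this only reduces the chance of hitting the server-side limit; the server remains the source of truth, so a client would still need to handle rejected requests.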
