From 91330b50819500514c77a54a319cea4b96f6294e Mon Sep 17 00:00:00 2001
From: Thibault Genaitay
Date: Wed, 23 Oct 2024 12:34:47 +0200
Subject: [PATCH] fix(inference): changed to full names everywhere

---
 .../how-to/managed-inference-with-private-network.mdx | 2 +-
 .../reference-content/llama-3-70b-instruct.mdx | 3 +--
 .../reference-content/llama-3-8b-instruct.mdx | 3 +--
 .../reference-content/llama-3.1-70b-instruct.mdx | 5 ++---
 .../reference-content/llama-3.1-8b-instruct.mdx | 3 +--
 .../reference-content/mistral-7b-instruct-v0.3.mdx | 5 ++---
 .../reference-content/mistral-nemo-instruct-2407.mdx | 5 ++---
 .../reference-content/mixtral-8x7b-instruct-v0.1.mdx | 3 +--
 .../managed-inference/reference-content/pixtral-12b-2409.mdx | 1 -
 .../managed-inference/reference-content/sentence-t5-xxl.mdx | 3 +--
 .../reference-content/wizardlm-70b-v1.0.mdx | 3 +--
 11 files changed, 13 insertions(+), 23 deletions(-)

diff --git a/ai-data/managed-inference/how-to/managed-inference-with-private-network.mdx b/ai-data/managed-inference/how-to/managed-inference-with-private-network.mdx
index 8979e9e81d..ca4f2c65da 100644
--- a/ai-data/managed-inference/how-to/managed-inference-with-private-network.mdx
+++ b/ai-data/managed-inference/how-to/managed-inference-with-private-network.mdx
@@ -91,7 +91,7 @@ Using a Private Network for communications between your Instances hosting your a
 import requests

 PAYLOAD = {
-    "model": "", # EXAMPLE= meta/llama-3-8b-instruct:bf16
+    "model": "", # EXAMPLE= meta/llama-3.1-8b-instruct:fp8
     "messages": [
         {"role": "system", "content": "You are a helpful, respectful and honest assistant."},
diff --git a/ai-data/managed-inference/reference-content/llama-3-70b-instruct.mdx b/ai-data/managed-inference/reference-content/llama-3-70b-instruct.mdx
index 909171c226..543a8e0158 100644
--- a/ai-data/managed-inference/reference-content/llama-3-70b-instruct.mdx
+++ b/ai-data/managed-inference/reference-content/llama-3-70b-instruct.mdx
@@ -17,7 +17,6 @@ categories:
 | Attribute | Details |
 |-----------------|------------------------------------|
 | Provider | [Meta](https://llama.meta.com/llama3/) |
-| Model Name | `llama-3-70b-instruct` |
 | Compatible Instances | H100 (FP8) |
 | Context size | 8192 tokens |
@@ -62,7 +61,7 @@ curl -s \
 -H "Content-Type: application/json" \
 --request POST \
 --url "https://<Deployment UUID>.ifr.fr-par.scaleway.com/v1/chat/completions" \
---data '{"model":"llama-3-70b-instruct", "messages":[{"role": "user","content": "Sing me a song about Xavier Niel"}], "max_tokens": 500, "top_p": 1, "temperature": 0.7, "stream": false}'
+--data '{"model":"meta/llama-3-70b-instruct:fp8", "messages":[{"role": "user","content": "Sing me a song about Xavier Niel"}], "max_tokens": 500, "top_p": 1, "temperature": 0.7, "stream": false}'
 ```
 Make sure to replace `<IAM API key>` and `<Deployment UUID>` with your actual [IAM API key](/identity-and-access-management/iam/how-to/create-api-keys/) and the Deployment UUID you are targeting.
diff --git a/ai-data/managed-inference/reference-content/llama-3-8b-instruct.mdx b/ai-data/managed-inference/reference-content/llama-3-8b-instruct.mdx
index cd23ecf682..6970e05524 100644
--- a/ai-data/managed-inference/reference-content/llama-3-8b-instruct.mdx
+++ b/ai-data/managed-inference/reference-content/llama-3-8b-instruct.mdx
@@ -17,7 +17,6 @@ categories:
 | Attribute | Details |
 |-----------------|------------------------------------|
 | Provider | [Meta](https://llama.meta.com/llama3/) |
-| Model Name | `llama-3-8b-instruct` |
 | Compatible Instances | L4, H100 (FP8, BF16) |
 | Context size | 8192 tokens |
@@ -66,7 +65,7 @@ curl -s \
 -H "Content-Type: application/json" \
 --request POST \
 --url "https://<Deployment UUID>.ifr.fr-par.scaleway.com/v1/chat/completions" \
---data '{"model":"llama-3-8b-instruct", "messages":[{"role": "user","content": "There is a llama in my garden, what should I do?"}], "max_tokens": 500, "top_p": 1, "temperature": 0.7, "stream": false}'
+--data '{"model":"meta/llama-3-8b-instruct:fp8", "messages":[{"role": "user","content": "There is a llama in my garden, what should I do?"}], "max_tokens": 500, "top_p": 1, "temperature": 0.7, "stream": false}'
 ```
 Make sure to replace `<IAM API key>` and `<Deployment UUID>` with your actual [IAM API key](/identity-and-access-management/iam/how-to/create-api-keys/) and the Deployment UUID you are targeting.
diff --git a/ai-data/managed-inference/reference-content/llama-3.1-70b-instruct.mdx b/ai-data/managed-inference/reference-content/llama-3.1-70b-instruct.mdx
index 26266b795c..eb6695e46b 100644
--- a/ai-data/managed-inference/reference-content/llama-3.1-70b-instruct.mdx
+++ b/ai-data/managed-inference/reference-content/llama-3.1-70b-instruct.mdx
@@ -17,8 +17,7 @@ categories:
 | Attribute | Details |
 |-----------------|------------------------------------|
 | Provider | [Meta](https://llama.meta.com/llama3/) |
-| License | [Llama 3.1 community](https://llama.meta.com/llama3_1/license/) |
-| Model Name | `llama-3.1-70b-instruct` |
+| License | [Llama 3.1 community](https://llama.meta.com/llama3_1/license/) |
 | Compatible Instances | H100 (FP8), H100-2 (FP8, BF16) |
 | Context Length | up to 128k tokens |
@@ -61,7 +60,7 @@ curl -s \
 -H "Content-Type: application/json" \
 --request POST \
 --url "https://<Deployment UUID>.ifr.fr-par.scaleway.com/v1/chat/completions" \
---data '{"model":"llama-3.1-70b-instruct", "messages":[{"role": "user","content": "There is a llama in my garden, what should I do?"}], "max_tokens": 500, "temperature": 0.7, "stream": false}'
+--data '{"model":"meta/llama-3.1-70b-instruct:fp8", "messages":[{"role": "user","content": "There is a llama in my garden, what should I do?"}], "max_tokens": 500, "temperature": 0.7, "stream": false}'
 ```
 Make sure to replace `<IAM API key>` and `<Deployment UUID>` with your actual [IAM API key](/identity-and-access-management/iam/how-to/create-api-keys/) and the Deployment UUID you are targeting.
diff --git a/ai-data/managed-inference/reference-content/llama-3.1-8b-instruct.mdx b/ai-data/managed-inference/reference-content/llama-3.1-8b-instruct.mdx
index c7f943c185..1e45a0dcdb 100644
--- a/ai-data/managed-inference/reference-content/llama-3.1-8b-instruct.mdx
+++ b/ai-data/managed-inference/reference-content/llama-3.1-8b-instruct.mdx
@@ -18,7 +18,6 @@ categories:
 |-----------------|------------------------------------|
 | Provider | [Meta](https://llama.meta.com/llama3/) |
 | License | [Llama 3.1 community](https://llama.meta.com/llama3_1/license/) |
-| Model Name | `llama-3.1-8b-instruct` |
 | Compatible Instances | L4, H100, H100-2 (FP8, BF16) |
 | Context Length | up to 128k tokens |
@@ -62,7 +61,7 @@ curl -s \
 -H "Content-Type: application/json" \
 --request POST \
 --url "https://<Deployment UUID>.ifr.fr-par.scaleway.com/v1/chat/completions" \
---data '{"model":"llama-3.1-8b-instruct", "messages":[{"role": "user","content": "There is a llama in my garden, what should I do?"}], "max_tokens": 500, "temperature": 0.7, "stream": false}'
+--data '{"model":"meta/llama-3.1-8b-instruct:fp8", "messages":[{"role": "user","content": "There is a llama in my garden, what should I do?"}], "max_tokens": 500, "temperature": 0.7, "stream": false}'
 ```
 Make sure to replace `<IAM API key>` and `<Deployment UUID>` with your actual [IAM API key](/identity-and-access-management/iam/how-to/create-api-keys/) and the Deployment UUID you are targeting.
diff --git a/ai-data/managed-inference/reference-content/mistral-7b-instruct-v0.3.mdx b/ai-data/managed-inference/reference-content/mistral-7b-instruct-v0.3.mdx
index 3b448ef6c6..f4ff7ba4a6 100644
--- a/ai-data/managed-inference/reference-content/mistral-7b-instruct-v0.3.mdx
+++ b/ai-data/managed-inference/reference-content/mistral-7b-instruct-v0.3.mdx
@@ -17,14 +17,13 @@ categories:
 | Attribute | Details |
 |-----------------|------------------------------------|
 | Provider | [Mistral](https://mistral.ai/technology/#models) |
-| Model Name | `mistral-7b-instruct-v0.3` |
 | Compatible Instances | L4 (BF16) |
 | Context size | 32K tokens |

 ## Model name

 ```bash
-mistral-7b-instruct-v0.3:bf16
+mistral/mistral-7b-instruct-v0.3:bf16
 ```

 ## Compatible Instances
@@ -55,7 +54,7 @@ curl -s \
 -H "Content-Type: application/json" \
 --request POST \
 --url "https://<Deployment UUID>.ifr.fr-par.scaleway.com/v1/chat/completions" \
---data '{"model":"mistral-7b-instruct-v0.3", "messages":[{"role": "user","content": "Explain Public Cloud in a nutshell."}], "top_p": 1, "temperature": 0.7, "stream": false}'
+--data '{"model":"mistral/mistral-7b-instruct-v0.3:bf16", "messages":[{"role": "user","content": "Explain Public Cloud in a nutshell."}], "top_p": 1, "temperature": 0.7, "stream": false}'
 ```
 Make sure to replace `<IAM API key>` and `<Deployment UUID>` with your actual [IAM API key](/identity-and-access-management/iam/how-to/create-api-keys/) and the Deployment UUID you are targeting.
diff --git a/ai-data/managed-inference/reference-content/mistral-nemo-instruct-2407.mdx b/ai-data/managed-inference/reference-content/mistral-nemo-instruct-2407.mdx
index 7662863fb3..83c5472988 100644
--- a/ai-data/managed-inference/reference-content/mistral-nemo-instruct-2407.mdx
+++ b/ai-data/managed-inference/reference-content/mistral-nemo-instruct-2407.mdx
@@ -17,14 +17,13 @@ categories:
 | Attribute | Details |
 |-----------------|------------------------------------|
 | Provider | [Mistral](https://mistral.ai/technology/#models) |
-| Model Name | `mistral-nemo-instruct-2407` |
 | Compatible Instances | H100 (FP8) |
 | Context size | 128K tokens |

 ## Model name

 ```bash
-mistral-nemo-instruct-2407:fp8
+mistral/mistral-nemo-instruct-2407:fp8
 ```

 ## Compatible Instances
@@ -61,7 +60,7 @@ curl -s \
 -H "Content-Type: application/json" \
 --request POST \
 --url "https://<Deployment UUID>.ifr.fr-par.scaleway.com/v1/chat/completions" \
---data '{"model":"mistral-nemo-instruct-2407", "messages":[{"role": "user","content": "Sing me a song about Xavier Niel"}], "top_p": 1, "temperature": 0.35, "stream": false}'
+--data '{"model":"mistral/mistral-nemo-instruct-2407:fp8", "messages":[{"role": "user","content": "Sing me a song about Xavier Niel"}], "top_p": 1, "temperature": 0.35, "stream": false}'
 ```
 Make sure to replace `<IAM API key>` and `<Deployment UUID>` with your actual [IAM API key](/identity-and-access-management/iam/how-to/create-api-keys/) and the Deployment UUID you are targeting.
diff --git a/ai-data/managed-inference/reference-content/mixtral-8x7b-instruct-v0.1.mdx b/ai-data/managed-inference/reference-content/mixtral-8x7b-instruct-v0.1.mdx
index f48a258257..4f5fc07f5f 100644
--- a/ai-data/managed-inference/reference-content/mixtral-8x7b-instruct-v0.1.mdx
+++ b/ai-data/managed-inference/reference-content/mixtral-8x7b-instruct-v0.1.mdx
@@ -17,7 +17,6 @@ categories:
 | Attribute | Details |
 |-----------------|------------------------------------|
 | Provider | [Mistral](https://mistral.ai/technology/#models) |
-| Model Name | `mixtral-8x7b-instruct-v0.1` |
 | Compatible Instances | H100 (FP8) - H100-2 (FP16) |
 | Context size | 32k tokens |
@@ -57,7 +56,7 @@ curl -s \
 -H "Content-Type: application/json" \
 --request POST \
 --url "https://<Deployment UUID>.ifr.fr-par.scaleway.com/v1/chat/completions" \
---data '{"model":"mixtral-8x7b-instruct-v0.1", "messages":[{"role": "user","content": "Sing me a song about Scaleway"}], "max_tokens": 200, "top_p": 1, "temperature": 1, "stream": false}'
+--data '{"model":"mistral/mixtral-8x7b-instruct-v0.1:fp8", "messages":[{"role": "user","content": "Sing me a song about Scaleway"}], "max_tokens": 200, "top_p": 1, "temperature": 1, "stream": false}'
 ```
 Make sure to replace `<IAM API key>` and `<Deployment UUID>` with your actual [IAM API key](/identity-and-access-management/iam/how-to/create-api-keys/) and the Deployment UUID you are targeting.
diff --git a/ai-data/managed-inference/reference-content/pixtral-12b-2409.mdx b/ai-data/managed-inference/reference-content/pixtral-12b-2409.mdx
index fd7c14bda1..c8193c38c4 100644
--- a/ai-data/managed-inference/reference-content/pixtral-12b-2409.mdx
+++ b/ai-data/managed-inference/reference-content/pixtral-12b-2409.mdx
@@ -17,7 +17,6 @@ categories:
 | Attribute | Details |
 |-----------------|------------------------------------|
 | Provider | [Mistral](https://mistral.ai/technology/#models) |
-| Model Name | `pixtral-12b-2409` |
 | Compatible Instances | H100, H100-2 (bf16) |
 | Context size | 128k tokens |
diff --git a/ai-data/managed-inference/reference-content/sentence-t5-xxl.mdx b/ai-data/managed-inference/reference-content/sentence-t5-xxl.mdx
index 79015fba5e..c9aefbb111 100644
--- a/ai-data/managed-inference/reference-content/sentence-t5-xxl.mdx
+++ b/ai-data/managed-inference/reference-content/sentence-t5-xxl.mdx
@@ -15,11 +15,10 @@ categories:
 | Attribute | Details |
 |-----------------|------------------------------------|
 | Provider | [sentence-transformers](https://www.sbert.net/) |
-| Model Name | `sentence-t5-xxl` |
 | Compatible Instances | L4 (FP32) |
 | Context size | 512 tokens |

-## Model names
+## Model name

 ```bash
 sentence-transformers/sentence-t5-xxl:fp32
diff --git a/ai-data/managed-inference/reference-content/wizardlm-70b-v1.0.mdx b/ai-data/managed-inference/reference-content/wizardlm-70b-v1.0.mdx
index d9957b5e48..b58bf15854 100644
--- a/ai-data/managed-inference/reference-content/wizardlm-70b-v1.0.mdx
+++ b/ai-data/managed-inference/reference-content/wizardlm-70b-v1.0.mdx
@@ -17,7 +17,6 @@ categories:
 | Attribute | Details |
 |-----------------|------------------------------------|
 | Provider | [WizardLM](https://wizardlm.github.io/) |
-| Model Name | `wizardlm-70B-V1.0` |
 | Compatible Instances | H100 (FP8) - H100-2 (FP16) |
 | Context size | 4,096 tokens |
@@ -55,7 +54,7 @@ curl -s \
 -H "Content-Type: application/json" \
 --request POST \
 --url "https://<Deployment UUID>.ifr.fr-par.scaleway.com/v1/chat/completions" \
---data '{"model":"wizardlm-70B-V1.0", "messages":[{"role": "user","content": "Say hello to Scaleway's Inference"}], "max_tokens": 200, "top_p": 1, "temperature": 1, "stream": false}'
+--data '{"model":"wizardlm/wizardlm-70b-v1.0:fp8", "messages":[{"role": "user","content": "Say hello to Scaleway'\''s Inference"}], "max_tokens": 200, "top_p": 1, "temperature": 1, "stream": false}'
 ```
 Make sure to replace `<IAM API key>` and `<Deployment UUID>` with your actual [IAM API key](/identity-and-access-management/iam/how-to/create-api-keys/) and the Deployment UUID you are targeting.
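For reference, the full model names introduced throughout this patch follow the `provider/model-name:quantization` pattern (for example, `meta/llama-3.1-8b-instruct:fp8` or `mistral/mistral-nemo-instruct-2407:fp8`). The sketch below shows one way such a name could be sent to a deployment's `/v1/chat/completions` endpoint with Python's `requests`, mirroring the payload from `managed-inference-with-private-network.mdx`. It is not part of the patch: the deployment UUID and IAM API key are placeholders, and passing the key as a Bearer token is an assumption based on the docs' IAM API key instructions rather than something shown in this diff.

```python
import requests

# Placeholders: substitute your own deployment UUID and IAM API key.
DEPLOYMENT_UUID = "<Deployment UUID>"
IAM_API_KEY = "<IAM API key>"

# Chat completions endpoint exposed by the Managed Inference deployment.
URL = f"https://{DEPLOYMENT_UUID}.ifr.fr-par.scaleway.com/v1/chat/completions"

PAYLOAD = {
    # Full model name: provider/model-name:quantization
    "model": "meta/llama-3.1-8b-instruct:fp8",
    "messages": [
        {"role": "system", "content": "You are a helpful, respectful and honest assistant."},
        {"role": "user", "content": "There is a llama in my garden, what should I do?"},
    ],
    "max_tokens": 500,
    "temperature": 0.7,
    "stream": False,
}

response = requests.post(
    URL,
    headers={
        "Authorization": f"Bearer {IAM_API_KEY}",  # assumption: key sent as a Bearer token
        "Content-Type": "application/json",
    },
    json=PAYLOAD,
    timeout=60,
)
response.raise_for_status()

# Assumes an OpenAI-style response shape, as suggested by the /v1/chat/completions path.
print(response.json()["choices"][0]["message"]["content"])
```

The same request shape applies to any of the renamed models, as long as the `model` field matches the full name shown on the deployment.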