From d804f7747f9fa5bf55cf051686aa0de2e633407b Mon Sep 17 00:00:00 2001
From: fpagny
Date: Thu, 20 Mar 2025 14:30:22 +0100
Subject: [PATCH 1/3] fix(genapi): supported models

Fix Qwen Coder maximum context size

---
 pages/generative-apis/reference-content/supported-models.mdx | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/pages/generative-apis/reference-content/supported-models.mdx b/pages/generative-apis/reference-content/supported-models.mdx
index 3265f60fd0..4a9ab8da94 100644
--- a/pages/generative-apis/reference-content/supported-models.mdx
+++ b/pages/generative-apis/reference-content/supported-models.mdx
@@ -24,7 +24,7 @@ Our API supports the most popular models for [Chat](/generative-apis/how-to/quer
 | Meta | `llama-3.3-70b-instruct` | 131k | 4096 | [Llama 3.3 Community](https://www.llama.com/llama3_3/license/) | [HF](https://huggingface.co/meta-llama/Llama-3.3-70B-Instruct) |
 | Meta | `llama-3.1-8b-instruct` | 128k | 16384 | [Llama 3.1 Community](https://llama.meta.com/llama3_1/license/) | [HF](https://huggingface.co/meta-llama/Llama-3.1-8B-Instruct) |
 | Mistral | `mistral-nemo-instruct-2407` | 128k | 8192 | [Apache-2.0](https://www.apache.org/licenses/LICENSE-2.0) | [HF](https://huggingface.co/mistralai/Mistral-Nemo-Instruct-2407) |
-| Qwen | `qwen2.5-coder-32b-instruct` | 128k | 8192 | [Apache-2.0](https://www.apache.org/licenses/LICENSE-2.0) | [HF](https://huggingface.co/Qwen/Qwen2.5-Coder-32B-Instruct) |
+| Qwen | `qwen2.5-coder-32b-instruct` | 32k | 8192 | [Apache-2.0](https://www.apache.org/licenses/LICENSE-2.0) | [HF](https://huggingface.co/Qwen/Qwen2.5-Coder-32B-Instruct) |
 | DeepSeek (Preview) | `deepseek-r1` | 20k | 4096 | [MIT](https://huggingface.co/datasets/choosealicense/licenses/blob/main/markdown/mit.md) | [HF](https://huggingface.co/deepseek-ai/DeepSeek-R1) |
 | DeepSeek (Preview) | `deepseek-r1-distill-llama-70b` | 32k | 4096 | [MIT](https://huggingface.co/datasets/choosealicense/licenses/blob/main/markdown/mit.md) | [HF](https://huggingface.co/deepseek-ai/DeepSeek-R1-Distill-Llama-70B) |

From 93f1261983261f02a6c6bacdcc51ac6a97fe6f79 Mon Sep 17 00:00:00 2001
From: fpagny
Date: Thu, 20 Mar 2025 14:32:31 +0100
Subject: [PATCH 2/3] fix(genapi): qwen maximum context size

---
 .../integrating-generative-apis-with-popular-tools.mdx | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/pages/generative-apis/reference-content/integrating-generative-apis-with-popular-tools.mdx b/pages/generative-apis/reference-content/integrating-generative-apis-with-popular-tools.mdx
index 4a2c9d69a7..6044d78f53 100644
--- a/pages/generative-apis/reference-content/integrating-generative-apis-with-popular-tools.mdx
+++ b/pages/generative-apis/reference-content/integrating-generative-apis-with-popular-tools.mdx
@@ -202,7 +202,7 @@ Zed is an IDE (Integrated Development Environment) including AI coding assistanc
       {
         "name": "qwen2.5-coder-32b-instruct",
         "display_name": "Qwen 2.5 Coder 32B",
-        "max_tokens": 128000
+        "max_tokens": 32000
       }
     ],
     "version": "1"

From 6640b36396e668df8b3ae93b81dce9f32f56d2f9 Mon Sep 17 00:00:00 2001
From: fpagny
Date: Thu, 20 Mar 2025 14:33:42 +0100
Subject: [PATCH 3/3] fix(inference): qwen maximum context size

---
 .../reference-content/qwen2.5-coder-32b-instruct.mdx | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/pages/managed-inference/reference-content/qwen2.5-coder-32b-instruct.mdx b/pages/managed-inference/reference-content/qwen2.5-coder-32b-instruct.mdx
index 582ffcfb98..64e943e1cc 100644
--- a/pages/managed-inference/reference-content/qwen2.5-coder-32b-instruct.mdx
+++ b/pages/managed-inference/reference-content/qwen2.5-coder-32b-instruct.mdx
@@ -20,7 +20,7 @@ categories:
 | Provider | [Qwen](https://qwenlm.github.io/) |
 | License | [Apache 2.0](https://huggingface.co/Qwen/Qwen2.5-Coder-32B-Instruct/blob/main/LICENSE) |
 | Compatible Instances | H100, H100-2 (INT8) |
-| Context Length | up to 128k tokens |
+| Context Length | up to 32k tokens |

 ## Model names

@@ -32,8 +32,8 @@ qwen/qwen2.5-coder-32b-instruct:int8
 | Instance type | Max context length |
 | ------------- |-------------|
-| H100 | 128k (INT8)
-| H100-2 | 128k (INT8)
+| H100 | 32k (INT8)
+| H100-2 | 32k (INT8)

 ## Model introduction

@@ -75,4 +75,4 @@ Process the output data according to your application's needs. The response will
 Despite efforts for accuracy, the possibility of generated text containing inaccuracies or [hallucinations](/managed-inference/concepts/#hallucinations) exists. Always verify the content generated independently.
-
\ No newline at end of file
+
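The three patches all encode the same correction: `qwen2.5-coder-32b-instruct` on these deployments has a 32k-token context window, of which a completion may use at most 8192 tokens (per the supported-models table in the first patch). A minimal sketch of the prompt budget this implies — the constant and function names here are illustrative only, not part of any Scaleway SDK:

```python
# Limits taken from the corrected supported-models table above.
QWEN_CODER_CONTEXT = 32_000    # total context window, in tokens
QWEN_CODER_MAX_OUTPUT = 8_192  # maximum completion size, in tokens


def max_prompt_tokens(requested_output: int = QWEN_CODER_MAX_OUTPUT,
                      context: int = QWEN_CODER_CONTEXT) -> int:
    """Tokens left for the prompt once the requested completion
    budget is reserved inside the context window."""
    if not 0 < requested_output <= QWEN_CODER_MAX_OUTPUT:
        raise ValueError("requested_output exceeds the model's output limit")
    return context - requested_output


print(max_prompt_tokens())      # full 8192-token completion -> 23808 prompt tokens
print(max_prompt_tokens(1024))  # small completion -> 30976 prompt tokens
```

This is also why the Zed `max_tokens` setting in the second patch drops from 128000 to 32000: a client that assumed the old 128k window would build prompts the model rejects.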