63 changes: 63 additions & 0 deletions ai-data/generative-apis/how-to/query-code-models.mdx
@@ -0,0 +1,63 @@
---
meta:
title: How to query code models
description: Learn how to interact with powerful language models specialized in code using Scaleway's Generative APIs service.
content:
h1: How to query code models
paragraph: Learn how to interact with powerful language models specialized in code using Scaleway's Generative APIs service.
tags: generative-apis ai-data language-models code-models chat-completions-api
dates:
validation: 2024-12-09
posted: 2024-12-09
---

Scaleway's Generative APIs service allows users to interact with powerful code models hosted on the platform.

Code models are language models specialized in **understanding code**, **generating code**, and **fixing code**.

As such, they are available through the same interfaces as language models:
- The Scaleway [console](https://console.scaleway.com) provides a complete [playground](/ai-data/generative-apis/how-to/query-language-models/#accessing-the-playground), designed to test models, adapt parameters, and observe how these changes affect the output in real time.
- The [Chat API](/ai-data/generative-apis/how-to/query-language-models/#querying-language-models-via-api) lets you query them programmatically.

For more information on how to query language models, read [our dedicated documentation](/ai-data/generative-apis/how-to/query-language-models/).
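As an illustrative sketch of what a Chat API request for a code model looks like (the `build_chat_request` helper is hypothetical; the endpoint and model name follow the Scaleway configuration shown later on this page), a chat completion payload can be built as follows:

```python
import json

# Base URL of Scaleway's OpenAI-compatible API
API_BASE = "https://api.scaleway.ai/v1"

def build_chat_request(prompt: str, model: str = "qwen2.5-coder-32b-instruct") -> dict:
    """Build an OpenAI-compatible chat completion payload for a code model."""
    return {
        "model": model,
        "messages": [
            {"role": "system", "content": "You are a helpful code assistant."},
            {"role": "user", "content": prompt},
        ],
        "max_tokens": 500,
        "temperature": 0.8,
    }

payload = build_chat_request("Write a function that reverses a string.")
print(json.dumps(payload, indent=2))
```

The payload can then be sent as the JSON body of a POST request to `{API_BASE}/chat/completions`, with your API key in the `Authorization: Bearer` header.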

Code models are also ideal AI assistants when added to IDEs (integrated development environments).

<Macro id="requirements" />

- A Scaleway account logged into the [console](https://console.scaleway.com)
- [Owner](/identity-and-access-management/iam/concepts/#owner) status or [IAM permissions](/identity-and-access-management/iam/concepts/#permission) allowing you to perform actions in the intended Organization
- A valid [API key](/identity-and-access-management/iam/how-to/create-api-keys/) for API authentication
- An IDE such as VS Code or JetBrains

## Install Continue in your IDE

[Continue](https://www.continue.dev/) is an [open-source code assistant](https://github.com/continuedev/continue) that connects AI models to your IDE.

To install Continue, click **Install**:
- on the [Continue extension page in Visual Studio Marketplace](https://marketplace.visualstudio.com/items?itemName=Continue.continue)
- or on the [Continue extension page in JetBrains Marketplace](https://plugins.jetbrains.com/plugin/22707-continue)

## Configure Scaleway as an API provider in Continue

Continue's `config.json` file defines the models and providers used for chat, autocompletion, and other features.
Here is an example configuration using Scaleway's OpenAI-compatible provider:

```json
{
  "models": [
    {
      "model": "qwen2.5-coder-32b-instruct",
      "title": "Qwen2.5-coder",
      "apiBase": "https://api.scaleway.ai/v1/",
      "provider": "openai",
      "apiKey": "###SCW SECRET KEY###",
      "useLegacyCompletionsEndpoint": false
    }
  ]
}
```

<Message type="tip">
The `config.json` file is typically stored as `$HOME/.continue/config.json` on Linux/macOS systems, and `%USERPROFILE%\.continue\config.json` on Windows.
</Message>

Read more about how to set up your `config.json` on the [official Continue documentation](https://docs.continue.dev/reference).
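Continue can also use a dedicated model for inline tab autocompletion. As a sketch (the `tabAutocompleteModel` entry follows Continue's configuration reference; verify the exact key names against the documentation linked above), a Qwen2.5-coder autocomplete entry could look like:

```json
"tabAutocompleteModel": {
  "title": "Qwen2.5-coder autocomplete",
  "model": "qwen2.5-coder-32b-instruct",
  "apiBase": "https://api.scaleway.ai/v1/",
  "provider": "openai",
  "apiKey": "###SCW SECRET KEY###"
}
```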

3 changes: 2 additions & 1 deletion ai-data/generative-apis/reference-content/rate-limits.mdx
@@ -7,7 +7,7 @@ content:
paragraph: Find our service limits in tokens per minute and queries per minute
tags: generative-apis ai-data rate-limits
dates:
validation: 2024-10-30
validation: 2024-12-09
posted: 2024-08-27
---

@@ -25,6 +25,7 @@ Any model served through Scaleway Generative APIs gets limited by:
| `llama-3.1-70b-instruct` | 300 | 100K |
| `mistral-nemo-instruct-2407`| 300 | 100K |
| `pixtral-12b-2409`| 300 | 100K |
| `qwen2.5-coder-32b-instruct` | 300 | 100K |

### Embedding models

@@ -25,6 +25,7 @@ Our [Chat API](/ai-data/generative-apis/how-to/query-language-models) has built-
| Meta | `llama-3.1-70b-instruct` | 128k | [Llama 3.1 Community License Agreement](https://llama.meta.com/llama3_1/license/) | [HF](https://huggingface.co/meta-llama/Llama-3.1-70B-Instruct) |
| Mistral | `mistral-nemo-instruct-2407` | 128k | [Apache-2.0](https://www.apache.org/licenses/LICENSE-2.0) | [HF](https://huggingface.co/mistralai/Mistral-Nemo-Instruct-2407) |
| Mistral | `pixtral-12b-2409` | 128k | [Apache-2.0](https://www.apache.org/licenses/LICENSE-2.0) | [HF](https://huggingface.co/mistralai/Pixtral-12B-2409) |
| Qwen | `qwen2.5-coder-32b-instruct` | 128k | [Apache-2.0](https://www.apache.org/licenses/LICENSE-2.0) | [HF](https://huggingface.co/Qwen/Qwen2.5-Coder-32B-Instruct) |


<Message type="tip">
@@ -0,0 +1,78 @@
---
meta:
title: Understanding the Qwen2.5-Coder-32B-Instruct model
description: Deploy your own secure Qwen2.5-Coder-32B-Instruct model with Scaleway Managed Inference. Privacy-focused, fully managed.
content:
h1: Understanding the Qwen2.5-Coder-32B-Instruct model
paragraph: This page provides information on the Qwen2.5-Coder-32B-Instruct model
tags:
dates:
validation: 2024-12-08
posted: 2024-12-08
categories:
- ai-data
---

## Model overview

| Attribute | Details |
|-----------------|------------------------------------|
| Provider | [Qwen](https://qwenlm.github.io/) |
| License | [Apache 2.0](https://huggingface.co/Qwen/Qwen2.5-Coder-32B-Instruct/blob/main/LICENSE) |
| Compatible Instances | H100, H100-2 (INT8) |
| Context Length | up to 128k tokens |

## Model names

```bash
qwen/qwen2.5-coder-32b-instruct:int8
```

## Compatible Instances

| Instance type | Max context length |
| ------------- | ------------------ |
| H100          | 128k (INT8)        |
| H100-2        | 128k (INT8)        |

## Model introduction

Qwen2.5-coder is your intelligent programming assistant familiar with more than 40 programming languages.
With Qwen2.5-coder deployed at Scaleway, your company can benefit from code generation, AI-assisted code repair, and code reasoning.

## Why is it useful?

- Qwen2.5-coder achieved the best performance on multiple popular code generation benchmarks (EvalPlus, LiveCodeBench, BigCodeBench), outperforming many open-source models and performing competitively with GPT-4o.
- This model is versatile. While demonstrating strong and comprehensive coding abilities, it also possesses good general and mathematical skills.

## How to use it

### Sending Managed Inference requests

To perform inference tasks with your Qwen2.5-coder deployed at Scaleway, use the following command:

```bash
curl -s \
-H "Authorization: Bearer <IAM API key>" \
-H "Content-Type: application/json" \
--request POST \
--url "https://<Deployment UUID>.ifr.fr-par.scaleway.com/v1/chat/completions" \
--data '{
  "model": "qwen/qwen2.5-coder-32b-instruct:int8",
  "messages": [
    {"role": "system", "content": "You are Qwen, created by Alibaba Cloud. You are a helpful code assistant."},
    {"role": "user", "content": "Write a quick sort algorithm."}
  ],
  "max_tokens": 1000,
  "temperature": 0.8,
  "stream": false
}'
```

<Message type="tip">
The model name allows Scaleway to put your prompts in the expected format.
</Message>

<Message type="note">
Ensure that the `messages` array is properly formatted with roles (system, user, assistant) and content.
</Message>

### Receiving Inference responses

Upon sending the HTTP request to the public or private endpoints exposed by the server, you will receive inference responses from the Managed Inference server.
Process the output data according to your application's needs. The response will contain the output generated by the model based on the input provided in the request.
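As a sketch of processing such a response (the JSON below is an invented illustration of the OpenAI-compatible chat completion schema), the generated text can be extracted like this:

```python
import json

# Minimal illustration of an OpenAI-compatible chat completion response;
# the field values are invented for this example.
raw_response = json.dumps({
    "id": "chatcmpl-123",
    "object": "chat.completion",
    "model": "qwen/qwen2.5-coder-32b-instruct:int8",
    "choices": [
        {
            "index": 0,
            "message": {"role": "assistant", "content": "def quick_sort(arr): ..."},
            "finish_reason": "stop",
        }
    ],
    "usage": {"prompt_tokens": 42, "completion_tokens": 120, "total_tokens": 162},
})

def extract_completion(body: str) -> str:
    """Return the assistant message from the first choice of a chat completion."""
    data = json.loads(body)
    return data["choices"][0]["message"]["content"]

print(extract_completion(raw_response))  # → def quick_sort(arr): ...
```

Responses following the OpenAI schema also include a `usage` object with token counts, which can help track consumption against rate limits.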

<Message type="note">
Despite efforts to ensure accuracy, generated text may contain inaccuracies or [hallucinations](/ai-data/managed-inference/concepts/#hallucinations). Always verify generated content independently.
</Message>
8 changes: 8 additions & 0 deletions menu/navigation.json
@@ -720,6 +720,10 @@
{
"label": "Moshiko-0.1-8b model",
"slug": "moshiko-0.1-8b"
},
{
"label": "Qwen2.5-coder-32b-instruct model",
"slug": "qwen2.5-coder-32b-instruct"
}
],
"label": "Additional Content",
@@ -757,6 +761,10 @@
"label": "Query embedding models",
"slug": "query-embedding-models"
},
{
"label": "Query code models",
"slug": "query-code-models"
},
{
"label": "Use structured outputs",
"slug": "use-structured-outputs"