From 0dadcceeb2bde8286356cad81363b2c0707f9397 Mon Sep 17 00:00:00 2001
From: Benoit Marzelleau
Date: Wed, 30 Oct 2024 17:50:07 +0100
Subject: [PATCH 1/6] feat(ifr): add moshi related documentation

---
 .../reference-content/moshika-0.1-8b.mdx | 86 +++++++++++++++++++
 .../reference-content/moshiko-0.1-8b.mdx | 86 +++++++++++++++++++
 2 files changed, 172 insertions(+)
 create mode 100644 ai-data/managed-inference/reference-content/moshika-0.1-8b.mdx
 create mode 100644 ai-data/managed-inference/reference-content/moshiko-0.1-8b.mdx

diff --git a/ai-data/managed-inference/reference-content/moshika-0.1-8b.mdx b/ai-data/managed-inference/reference-content/moshika-0.1-8b.mdx
new file mode 100644
index 0000000000..3eaa7a21f6
--- /dev/null
+++ b/ai-data/managed-inference/reference-content/moshika-0.1-8b.mdx
@@ -0,0 +1,86 @@
+---
+meta:
+  title: Understanding the Moshika-0.1-8b model
+  description: Deploy your own secure Moshika-0.1-8b model with Scaleway Managed Inference. Privacy-focused, fully managed.
+content:
+  h1: Understanding the Moshika-0.1-8b model
+  paragraph: This page provides information on the Moshika-0.1-8b model
+tags:
+dates:
+  validation: 2024-10-30
+categories:
+  - ai-data
+---
+
+## Model overview
+
+| Attribute | Details |
+|-----------------|------------------------------------|
+| Provider | [Kyutai](https://github.com/kyutai-labs/moshi) |
+| Compatible Instances | L4, H100 (FP8, BF16) |
+| Context size | 4096 tokens |
+
+## Model names
+
+```bash
+kyutai/moshika-0.1-8b:bf16
+kyutai/moshika-0.1-8b:fp8
+```
+
+## Compatible Instances
+
+| Instance type | Max context length |
+| ------------- |-------------|
+| L4 | 4096 (FP8, BF16) |
+| H100 | 4096 (FP8, BF16)
+
+## Model introduction
+
+Kyutai's Moshi is a speech-text foundation model for real-time dialogue.
+Moshi is an experimental next-generation conversational model, designed to understand and respond fluidly and naturally to complex conversations, while providing unprecedented expressiveness and spontaneity.
+While current systems for spoken dialogue rely on a pipeline of separate components, Moshi is the first real-time full-duplex spoken large language model.
+Moshika is the variant of Moshi with a female voice in English.
+
+## Why is it useful?
+
+Moshi offers seamless real-time dialogue capabilities, enabling users to engage in natural conversations with the model.
+It allows the modeling of arbitrary conversational dynamics, including overlapping speech, interruptions, interjections, and more.
+In particular, this model:
+- Processes 24 kHz audio down to a 12.5 Hz representation with a bandwidth of 1.1 kbps, performing better than existing non-streaming models.
+- Achieves a theoretical latency of 160 ms, with a practical latency of 200 ms, making it suitable for real-time applications.
+
+## How to use it
+
+To perform inference tasks with your Moshi deployed at Scaleway, a WebSocket API is exposed for real-time dialogue and is accessible at the following endpoint:
+
+```bash
+wss://.ifr.fr-par.scaleway.com/api/chat
+```
+
+### Testing the WebSocket endpoint
+
+To test the endpoint, use the following command:
+
+```bash
+curl -i --http1.1 \
+-H "Authorization: Bearer " \
+-H "Connection: Upgrade" \
+-H "Upgrade: websocket" \
+-H "Sec-WebSocket-Key: SGVsbG8sIHdvcmxkIQ==" \
+-H "Sec-WebSocket-Version: 13" \
+--url "https://.ifr.fr-par.scaleway.com/api/chat"
+```
+
+Make sure to replace `` and `` with your actual [IAM API key](/identity-and-access-management/iam/how-to/create-api-keys/) and the Deployment UUID you are targeting.
+
+
+  Authentication can be done using the `token` query parameter, which should be set to your IAM API key, if headers are not supported (e.g., in a browser).
+
+
+The server should respond with a `101 Switching Protocols` status code, indicating that the connection has been successfully upgraded to a WebSocket connection.
+
+### Interacting with the model
+
+We provide code samples in various programming languages (python, rust, typescript) to interact with the model using the WebSocket API as well as a simple web interface.
+Those code samples can be found in our [GitHub repository](https://github.com/scaleway/moshi-client-examples).
+This repository contains instructions on how to run the code samples and interact with the model.
\ No newline at end of file
diff --git a/ai-data/managed-inference/reference-content/moshiko-0.1-8b.mdx b/ai-data/managed-inference/reference-content/moshiko-0.1-8b.mdx
new file mode 100644
index 0000000000..adcd560514
--- /dev/null
+++ b/ai-data/managed-inference/reference-content/moshiko-0.1-8b.mdx
@@ -0,0 +1,86 @@
+---
+meta:
+  title: Understanding the Moshiko-0.1-8b model
+  description: Deploy your own secure Moshiko-0.1-8b model with Scaleway Managed Inference. Privacy-focused, fully managed.
+content:
+  h1: Understanding the Moshiko-0.1-8b model
+  paragraph: This page provides information on the Moshiko-0.1-8b model
+tags:
+dates:
+  validation: 2024-10-30
+categories:
+  - ai-data
+---
+
+## Model overview
+
+| Attribute | Details |
+|-----------------|------------------------------------|
+| Provider | [Kyutai](https://github.com/kyutai-labs/moshi) |
+| Compatible Instances | L4, H100 (FP8, BF16) |
+| Context size | 4096 tokens |
+
+## Model names
+
+```bash
+kyutai/moshiko-0.1-8b:bf16
+kyutai/moshiko-0.1-8b:fp8
+```
+
+## Compatible Instances
+
+| Instance type | Max context length |
+| ------------- |-------------|
+| L4 | 4096 (FP8, BF16) |
+| H100 | 4096 (FP8, BF16)
+
+## Model introduction
+
+Kyutai's Moshi is a speech-text foundation model for real-time dialogue.
+Moshi is an experimental next-generation conversational model, designed to understand and respond fluidly and naturally to complex conversations, while providing unprecedented expressiveness and spontaneity.
+While current systems for spoken dialogue rely on a pipeline of separate components, Moshi is the first real-time full-duplex spoken large language model.
+Moshiko is the variant of Moshi with a male voice in English.
+
+## Why is it useful?
+
+Moshi offers seamless real-time dialogue capabilities, enabling users to engage in natural conversations with the model.
+It allows the modeling of arbitrary conversational dynamics, including overlapping speech, interruptions, interjections, and more.
+In particular, this model:
+- Processes 24 kHz audio down to a 12.5 Hz representation with a bandwidth of 1.1 kbps, performing better than existing non-streaming models.
+- Achieves a theoretical latency of 160 ms, with a practical latency of 200 ms, making it suitable for real-time applications.
+
+## How to use it
+
+To perform inference tasks with your Moshi deployed at Scaleway, a WebSocket API is exposed for real-time dialogue and is accessible at the following endpoint:
+
+```bash
+wss://.ifr.fr-par.scaleway.com/api/chat
+```
+
+### Testing the WebSocket endpoint
+
+To test the endpoint, use the following command:
+
+```bash
+curl -i --http1.1 \
+-H "Authorization: Bearer " \
+-H "Connection: Upgrade" \
+-H "Upgrade: websocket" \
+-H "Sec-WebSocket-Key: SGVsbG8sIHdvcmxkIQ==" \
+-H "Sec-WebSocket-Version: 13" \
+--url "https://.ifr.fr-par.scaleway.com/api/chat"
+```
+
+Make sure to replace `` and `` with your actual [IAM API key](/identity-and-access-management/iam/how-to/create-api-keys/) and the Deployment UUID you are targeting.
+
+
+  Authentication can be done using the `token` query parameter, which should be set to your IAM API key, if headers are not supported (e.g., in a browser).
+
+
+The server should respond with a `101 Switching Protocols` status code, indicating that the connection has been successfully upgraded to a WebSocket connection.
+
+### Interacting with the model
+
+We provide code samples in various programming languages (python, rust, typescript) to interact with the model using the WebSocket API as well as a simple web interface.
+Those code samples can be found in our [GitHub repository](https://github.com/scaleway/moshi-client-examples).
+This repository contains instructions on how to run the code samples and interact with the model.
\ No newline at end of file

From ea358638d4336f260fe8badab1181e8edb509a22 Mon Sep 17 00:00:00 2001
From: bmarzelleau
Date: Thu, 31 Oct 2024 13:26:19 +0100
Subject: [PATCH 2/6] Fix missing `|` in ai-data/managed-inference/reference-content/moshika-0.1-8b.mdx

Co-authored-by: Benedikt Rollik
---
 ai-data/managed-inference/reference-content/moshika-0.1-8b.mdx | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/ai-data/managed-inference/reference-content/moshika-0.1-8b.mdx b/ai-data/managed-inference/reference-content/moshika-0.1-8b.mdx
index 3eaa7a21f6..e3d51cc64f 100644
--- a/ai-data/managed-inference/reference-content/moshika-0.1-8b.mdx
+++ b/ai-data/managed-inference/reference-content/moshika-0.1-8b.mdx
@@ -32,7 +32,7 @@ kyutai/moshika-0.1-8b:fp8
 | Instance type | Max context length |
 | ------------- |-------------|
 | L4 | 4096 (FP8, BF16) |
-| H100 | 4096 (FP8, BF16)
+| H100 | 4096 (FP8, BF16) |
 
 ## Model introduction
 

From 31a9fed4e53b6e56f8ba694e54f859fbfabe174b Mon Sep 17 00:00:00 2001
From: bmarzelleau
Date: Thu, 31 Oct 2024 13:26:49 +0100
Subject: [PATCH 3/6] Add posted date to moshika-0.1-8b.mdx

Co-authored-by: Benedikt Rollik
---
 ai-data/managed-inference/reference-content/moshika-0.1-8b.mdx | 1 +
 1 file changed, 1 insertion(+)

diff --git a/ai-data/managed-inference/reference-content/moshika-0.1-8b.mdx b/ai-data/managed-inference/reference-content/moshika-0.1-8b.mdx
index e3d51cc64f..6201162b11 100644
--- a/ai-data/managed-inference/reference-content/moshika-0.1-8b.mdx
+++ b/ai-data/managed-inference/reference-content/moshika-0.1-8b.mdx
@@ -8,6 +8,7 @@ content:
 tags:
 dates:
   validation: 2024-10-30
+  posted: 2024-10-30
 categories:
   - ai-data
 ---

From a1f06b4391285c2020a157df06ec5d2a5a1cb234 Mon Sep 17 00:00:00 2001
From: bmarzelleau
Date: Thu, 31 Oct 2024 13:27:13 +0100
Subject: [PATCH 4/6] Add missing `|` in moshiko-0.1-8b.mdx

Co-authored-by: Benedikt Rollik
---
 ai-data/managed-inference/reference-content/moshiko-0.1-8b.mdx | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/ai-data/managed-inference/reference-content/moshiko-0.1-8b.mdx b/ai-data/managed-inference/reference-content/moshiko-0.1-8b.mdx
index adcd560514..7f6969c47c 100644
--- a/ai-data/managed-inference/reference-content/moshiko-0.1-8b.mdx
+++ b/ai-data/managed-inference/reference-content/moshiko-0.1-8b.mdx
@@ -32,7 +32,7 @@ kyutai/moshiko-0.1-8b:fp8
 | Instance type | Max context length |
 | ------------- |-------------|
 | L4 | 4096 (FP8, BF16) |
-| H100 | 4096 (FP8, BF16)
+| H100 | 4096 (FP8, BF16) |
 
 ## Model introduction
 

From a2ee16a7719394a9471e836bda1c8af4554675e1 Mon Sep 17 00:00:00 2001
From: Benedikt Rollik
Date: Thu, 31 Oct 2024 14:11:42 +0100
Subject: [PATCH 5/6] Update ai-data/managed-inference/reference-content/moshika-0.1-8b.mdx

Co-authored-by: Rowena Jones <36301604+RoRoJ@users.noreply.github.com>
---
 ai-data/managed-inference/reference-content/moshika-0.1-8b.mdx | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/ai-data/managed-inference/reference-content/moshika-0.1-8b.mdx b/ai-data/managed-inference/reference-content/moshika-0.1-8b.mdx
index 6201162b11..61e29e6efb 100644
--- a/ai-data/managed-inference/reference-content/moshika-0.1-8b.mdx
+++ b/ai-data/managed-inference/reference-content/moshika-0.1-8b.mdx
@@ -82,6 +82,6 @@ The server should respond with a `101 Switching Protocols` status code, indicati
 
 ### Interacting with the model
 
-We provide code samples in various programming languages (python, rust, typescript) to interact with the model using the WebSocket API as well as a simple web interface.
+We provide code samples in various programming languages (Python, Rust, TypeScript) to interact with the model using the WebSocket API as well as a simple web interface.
 Those code samples can be found in our [GitHub repository](https://github.com/scaleway/moshi-client-examples).
 This repository contains instructions on how to run the code samples and interact with the model.
\ No newline at end of file

From 97d3b92a96860295d4aa3a0c483f786d5e38c7fb Mon Sep 17 00:00:00 2001
From: Benedikt Rollik
Date: Thu, 31 Oct 2024 14:11:47 +0100
Subject: [PATCH 6/6] Update ai-data/managed-inference/reference-content/moshiko-0.1-8b.mdx

Co-authored-by: Rowena Jones <36301604+RoRoJ@users.noreply.github.com>
---
 ai-data/managed-inference/reference-content/moshiko-0.1-8b.mdx | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/ai-data/managed-inference/reference-content/moshiko-0.1-8b.mdx b/ai-data/managed-inference/reference-content/moshiko-0.1-8b.mdx
index 7f6969c47c..967d23d885 100644
--- a/ai-data/managed-inference/reference-content/moshiko-0.1-8b.mdx
+++ b/ai-data/managed-inference/reference-content/moshiko-0.1-8b.mdx
@@ -81,6 +81,6 @@ The server should respond with a `101 Switching Protocols` status code, indicati
 
 ### Interacting with the model
 
-We provide code samples in various programming languages (python, rust, typescript) to interact with the model using the WebSocket API as well as a simple web interface.
+We provide code samples in various programming languages (Python, Rust, TypeScript) to interact with the model using the WebSocket API as well as a simple web interface.
 Those code samples can be found in our [GitHub repository](https://github.com/scaleway/moshi-client-examples).
 This repository contains instructions on how to run the code samples and interact with the model.
\ No newline at end of file
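
Reviewer note: the handshake test documented in both pages can be sketched in Python using only the standard library. This is a minimal illustration, not part of the patch and not the code from the linked examples repository. The helper names `chat_url`, `handshake_headers`, and `expected_accept` are hypothetical, the deployment hostname is passed in by the caller rather than assumed, and the `Sec-WebSocket-Accept` check follows the generic RFC 6455 rule rather than anything specific to Scaleway.

```python
import base64
import hashlib
from typing import Dict, Optional
from urllib.parse import urlencode

# Fixed GUID that RFC 6455 appends to the client key before hashing.
WS_GUID = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11"


def chat_url(host: str, token: Optional[str] = None) -> str:
    """Build the wss:// URL for the /api/chat endpoint.

    `host` is the deployment hostname, supplied by the caller.
    `token` is optional: the docs mention a `token` query parameter
    for clients that cannot send headers (e.g. browsers).
    """
    url = f"wss://{host}/api/chat"
    if token is not None:
        url += "?" + urlencode({"token": token})
    return url


def handshake_headers(api_key: str, ws_key: str) -> Dict[str, str]:
    """Return the same upgrade headers the documented curl command sends."""
    return {
        "Authorization": f"Bearer {api_key}",
        "Connection": "Upgrade",
        "Upgrade": "websocket",
        "Sec-WebSocket-Key": ws_key,
        "Sec-WebSocket-Version": "13",
    }


def expected_accept(ws_key: str) -> str:
    """Compute the Sec-WebSocket-Accept value a compliant server returns."""
    digest = hashlib.sha1((ws_key + WS_GUID).encode("ascii")).digest()
    return base64.b64encode(digest).decode("ascii")
```

Assuming the deployment follows RFC 6455, a `101 Switching Protocols` response should carry a `Sec-WebSocket-Accept` header equal to `expected_accept(key)` for whatever `Sec-WebSocket-Key` was sent, which gives a slightly stronger check than the status code alone.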