From 17b2a3328a86ffa24b32d4fd8beac066d01c1746 Mon Sep 17 00:00:00 2001
From: Thibault Genaitay
Date: Wed, 30 Oct 2024 13:44:14 +0100
Subject: [PATCH 1/6] feat(inference): newest embedding

---
 .../bge-multilingual-gemma2.mdx               | 66 +++++++++++++++++++
 1 file changed, 66 insertions(+)
 create mode 100644 ai-data/managed-inference/reference-content/bge-multilingual-gemma2.mdx

diff --git a/ai-data/managed-inference/reference-content/bge-multilingual-gemma2.mdx b/ai-data/managed-inference/reference-content/bge-multilingual-gemma2.mdx
new file mode 100644
index 0000000000..037ccf2e74
--- /dev/null
+++ b/ai-data/managed-inference/reference-content/bge-multilingual-gemma2.mdx
@@ -0,0 +1,66 @@
+---
+meta:
+  title: Understanding the BGE-Multilingual-Gemma2 embedding model
+  description: Deploy your own secure BGE-Multilingual-Gemma2 embedding model with Scaleway Managed Inference. Privacy-focused, fully managed.
+content:
+  h1: Understanding the BGE-Multilingual-Gemma2 embedding model
+  paragraph: This page provides information on the BGE-Multilingual-Gemma2 embedding model
+tags: embedding
+categories:
+  - ai-data
+---
+
+## Model overview
+
+| Attribute            | Details                             |
+|----------------------|-------------------------------------|
+| Provider             | [baai](https://huggingface.co/BAAI) |
+| Compatible Instances | L4 (FP32)                           |
+| Context size         | 4096 tokens                         |
+
+## Model name
+
+```bash
+baai/bge-multilingual-gemma2:fp32
+```
+
+## Compatible Instances
+
+| Instance type | Max context length |
+| ------------- |--------------------|
+| L4            | 4096 (FP32)        |
+
+## Model introduction
+
+BGE is short for BAAI General Embedding. This particular model is an LLM-based embedding model, trained on a diverse range of languages and tasks from the lightweight [google/gemma-2-9b](https://huggingface.co/google/gemma-2-9b).
+As such, it is distributed under the [Gemma terms of use](https://ai.google.dev/gemma/terms).
+
+## Why is it useful?
+
+- BGE-Multilingual-Gemma2 tops the [MTEB leaderboard](https://huggingface.co/spaces/mteb/leaderboard) scoring #1 in french, #1 in polish, #7 in english, as of writing (Q4 2024).
+- As its name suggests, the model's training data spans a broad range of languages, including English, Chinese, Polish, French, and more!
+- It encodes text into 3584-dimensional vectors, providing a very detailed representation of sentence semantics.
+- BGE-Multilingual-Gemma2 in its L4/FP32 configuration boasts a high context length of 4096 tokens, particularly useful for ingesting data and building RAG applications.
+
+## How to use it
+
+### Sending Managed Inference requests
+
+To perform inference tasks with your Embedding model deployed at Scaleway, use the following command:
+
+```bash
+curl https://<Deployment UUID>.ifr.fr-par.scaleway.com/v1/embeddings \
+  -H "Authorization: Bearer <IAM API key>" \
+  -H "Content-Type: application/json" \
+  -d '{
+    "input": "Embeddings can represent text in a numerical format.",
+    "model": "baai/bge-multilingual-gemma2:fp32"
+  }'
+```
+
+Make sure to replace `<IAM API key>` and `<Deployment UUID>` with your actual [IAM API key](/identity-and-access-management/iam/how-to/create-api-keys/) and the Deployment UUID you are targeting.
+
+### Receiving Inference responses
+
+Upon sending the HTTP request to the public or private endpoints exposed by the server, you will receive inference responses from the Managed Inference server.
+Process the output data according to your application's needs. The response will contain the output generated by the embedding model based on the input provided in the request.
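For reference, a minimal sketch of processing such a response from the command line, assuming the endpoint returns the OpenAI-compatible embeddings schema (vector under `.data[0].embedding`) and that `jq` is available on the client; the placeholders are the same hypothetical values as in the page above:

```bash
# Minimal sketch: send the documented request and summarize the returned vector.
# Assumes the OpenAI-compatible embeddings response shape; <Deployment UUID>
# and <IAM API key> are the placeholders from the page above.
curl -s https://<Deployment UUID>.ifr.fr-par.scaleway.com/v1/embeddings \
  -H "Authorization: Bearer <IAM API key>" \
  -H "Content-Type: application/json" \
  -d '{
    "input": "Embeddings can represent text in a numerical format.",
    "model": "baai/bge-multilingual-gemma2:fp32"
  }' \
  | jq '{dimensions: (.data[0].embedding | length), preview: .data[0].embedding[:5]}'
```

If the deployment is reachable and the schema matches, `dimensions` should report 3584, the vector size stated above.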
From c6751aa2f40c2e603437a1c95ee11b0e0fdaf977 Mon Sep 17 00:00:00 2001
From: Thibault Genaitay
Date: Wed, 30 Oct 2024 13:47:38 +0100
Subject: [PATCH 2/6] feat(inference): edited menu

---
 menu/navigation.json | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/menu/navigation.json b/menu/navigation.json
index 6236883e1f..ed80f8b786 100644
--- a/menu/navigation.json
+++ b/menu/navigation.json
@@ -623,14 +623,14 @@
             "label": "Mixtral-8x7b-instruct-v0.1 model",
             "slug": "mixtral-8x7b-instruct-v0.1"
           },
-          {
-            "label": "WizardLM-70b-v1.0 model",
-            "slug": "wizardlm-70b-v1.0"
-          },
           {
             "label": "Sentence-t5-xxl model",
             "slug": "sentence-t5-xxl"
           },
+          {
+            "label": "BGE-Multilingual-Gemma2 model",
+            "slug": "bge-multilingual-gemma2"
+          },
           {
             "label": "Pixtral-12b-2409 model",
             "slug": "pixtral-12b-2409"

From 7ba1d712f210f1890a38d6b44b649a0891cf07cd Mon Sep 17 00:00:00 2001
From: Benedikt Rollik
Date: Wed, 30 Oct 2024 15:16:18 +0100
Subject: [PATCH 3/6] Apply suggestions from code review

Co-authored-by: nerda-codes <87707325+nerda-codes@users.noreply.github.com>
---
 .../reference-content/bge-multilingual-gemma2.mdx | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/ai-data/managed-inference/reference-content/bge-multilingual-gemma2.mdx b/ai-data/managed-inference/reference-content/bge-multilingual-gemma2.mdx
index 037ccf2e74..9de50a13b6 100644
--- a/ai-data/managed-inference/reference-content/bge-multilingual-gemma2.mdx
+++ b/ai-data/managed-inference/reference-content/bge-multilingual-gemma2.mdx
@@ -7,6 +7,9 @@ content:
   paragraph: This page provides information on the BGE-Multilingual-Gemma2 embedding model
 tags: embedding
+dates:
+  validation: 2024-10-30
+  posted: 2024-10-30
 categories:
   - ai-data
 ---
@@ -37,7 +40,7 @@ As such, it is distributed under the [Gemma terms of use](https://ai.google.dev/
 
 ## Why is it useful?
 
-- BGE-Multilingual-Gemma2 tops the [MTEB leaderboard](https://huggingface.co/spaces/mteb/leaderboard) scoring #1 in french, #1 in polish, #7 in english, as of writing (Q4 2024).
+- BGE-Multilingual-Gemma2 tops the [MTEB leaderboard](https://huggingface.co/spaces/mteb/leaderboard) scoring the number one spot in French and Polish, and number seven in English, at the time of writing this page (Q4 2024).
 - As its name suggests, the model's training data spans a broad range of languages, including English, Chinese, Polish, French, and more!
 - It encodes text into 3584-dimensional vectors, providing a very detailed representation of sentence semantics.
 - BGE-Multilingual-Gemma2 in its L4/FP32 configuration boasts a high context length of 4096 tokens, particularly useful for ingesting data and building RAG applications.
From 7e4bffc215ba7c50e294fbdbec586130cb00d3c1 Mon Sep 17 00:00:00 2001
From: Thibault Genaitay
Date: Thu, 31 Oct 2024 10:10:55 +0100
Subject: [PATCH 4/6] Update
 ai-data/managed-inference/reference-content/bge-multilingual-gemma2.mdx

Co-authored-by: Rowena Jones <36301604+RoRoJ@users.noreply.github.com>
---
 .../reference-content/bge-multilingual-gemma2.mdx | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/ai-data/managed-inference/reference-content/bge-multilingual-gemma2.mdx b/ai-data/managed-inference/reference-content/bge-multilingual-gemma2.mdx
index 9de50a13b6..8dd89733b7 100644
--- a/ai-data/managed-inference/reference-content/bge-multilingual-gemma2.mdx
+++ b/ai-data/managed-inference/reference-content/bge-multilingual-gemma2.mdx
@@ -40,7 +40,7 @@ As such, it is distributed under the [Gemma terms of use](https://ai.google.dev/
 
 ## Why is it useful?
 
-- BGE-Multilingual-Gemma2 tops the [MTEB leaderboard](https://huggingface.co/spaces/mteb/leaderboard) scoring the number one spot in French and Polish, and number seven in English, at the time of writing this page (Q4 2024).
+- BGE-Multilingual-Gemma2 tops the [MTEB leaderboard](https://huggingface.co/spaces/mteb/leaderboard), scoring the number one spot in French and Polish, and number seven in English, at the time of writing this page (Q4 2024).
 - As its name suggests, the model's training data spans a broad range of languages, including English, Chinese, Polish, French, and more!
 - It encodes text into 3584-dimensional vectors, providing a very detailed representation of sentence semantics.
 - BGE-Multilingual-Gemma2 in its L4/FP32 configuration boasts a high context length of 4096 tokens, particularly useful for ingesting data and building RAG applications.

From e9278c17dfb783133c736c84df8f2cb3f2e8a02c Mon Sep 17 00:00:00 2001
From: Thibault Genaitay
Date: Thu, 31 Oct 2024 10:11:00 +0100
Subject: [PATCH 5/6] Update
 ai-data/managed-inference/reference-content/bge-multilingual-gemma2.mdx

Co-authored-by: Rowena Jones <36301604+RoRoJ@users.noreply.github.com>
---
 .../reference-content/bge-multilingual-gemma2.mdx | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/ai-data/managed-inference/reference-content/bge-multilingual-gemma2.mdx b/ai-data/managed-inference/reference-content/bge-multilingual-gemma2.mdx
index 8dd89733b7..f3ad79c490 100644
--- a/ai-data/managed-inference/reference-content/bge-multilingual-gemma2.mdx
+++ b/ai-data/managed-inference/reference-content/bge-multilingual-gemma2.mdx
@@ -41,7 +41,7 @@ As such, it is distributed under the [Gemma terms of use](https://ai.google.dev/
 ## Why is it useful?
 
 - BGE-Multilingual-Gemma2 tops the [MTEB leaderboard](https://huggingface.co/spaces/mteb/leaderboard), scoring the number one spot in French and Polish, and number seven in English, at the time of writing this page (Q4 2024).
-- As its name suggests, the model's training data spans a broad range of languages, including English, Chinese, Polish, French, and more!
+- As its name suggests, the model's training data spans a broad range of languages, including English, Chinese, Polish, French, and more.
 - It encodes text into 3584-dimensional vectors, providing a very detailed representation of sentence semantics.
 - BGE-Multilingual-Gemma2 in its L4/FP32 configuration boasts a high context length of 4096 tokens, particularly useful for ingesting data and building RAG applications.
From ea42bbc07d2b61d53cde9722d895836465971044 Mon Sep 17 00:00:00 2001
From: Thibault Genaitay
Date: Thu, 31 Oct 2024 10:11:05 +0100
Subject: [PATCH 6/6] Update
 ai-data/managed-inference/reference-content/bge-multilingual-gemma2.mdx

Co-authored-by: Rowena Jones <36301604+RoRoJ@users.noreply.github.com>
---
 .../reference-content/bge-multilingual-gemma2.mdx | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/ai-data/managed-inference/reference-content/bge-multilingual-gemma2.mdx b/ai-data/managed-inference/reference-content/bge-multilingual-gemma2.mdx
index f3ad79c490..b7cd211ea4 100644
--- a/ai-data/managed-inference/reference-content/bge-multilingual-gemma2.mdx
+++ b/ai-data/managed-inference/reference-content/bge-multilingual-gemma2.mdx
@@ -49,7 +49,7 @@ As such, it is distributed under the [Gemma terms of use](https://ai.google.dev/
 
 ### Sending Managed Inference requests
 
-To perform inference tasks with your Embedding model deployed at Scaleway, use the following command:
+To perform inference tasks with your embedding model deployed at Scaleway, use the following command:
 
 ```bash
 curl https://<Deployment UUID>.ifr.fr-par.scaleway.com/v1/embeddings \
   -H "Authorization: Bearer <IAM API key>" \
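As a usage note on the documented endpoint: OpenAI-compatible `/v1/embeddings` servers generally also accept an array of strings as `input`, which maps naturally onto the RAG ingestion use case mentioned in the page. A hedged sketch follows; batch support should be verified against the actual deployment, and the document chunks are made-up examples:

```bash
# Hypothetical batch request: embed several document chunks in one call and
# check that one 3584-dimensional vector comes back per input. Assumes the
# endpoint accepts an array for "input", as OpenAI-compatible servers usually
# do, and reuses the <Deployment UUID> / <IAM API key> placeholders.
curl -s https://<Deployment UUID>.ifr.fr-par.scaleway.com/v1/embeddings \
  -H "Authorization: Bearer <IAM API key>" \
  -H "Content-Type: application/json" \
  -d '{
    "input": ["First chunk of a document.", "Second chunk of a document."],
    "model": "baai/bge-multilingual-gemma2:fp32"
  }' \
  | jq '[.data[].embedding | length]'
```

Under these assumptions, the expected output is `[3584, 3584]`.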