From c3a12ea2955e67b25ad9cfbe1486e6539d918824 Mon Sep 17 00:00:00 2001
From: fpagny
Date: Thu, 10 Apr 2025 17:38:14 +0200
Subject: [PATCH 1/2] feat(genapi): document embedding dimensions size issues

---
 .../troubleshooting/fixing-common-issues.mdx  | 23 +++++++++++++++++++--
 1 file changed, 21 insertions(+), 2 deletions(-)

diff --git a/pages/generative-apis/troubleshooting/fixing-common-issues.mdx b/pages/generative-apis/troubleshooting/fixing-common-issues.mdx
index 2240ad6c8b..786461de09 100644
--- a/pages/generative-apis/troubleshooting/fixing-common-issues.mdx
+++ b/pages/generative-apis/troubleshooting/fixing-common-issues.mdx
@@ -120,7 +120,26 @@ Below are common issues that you may encounter when using Generative APIs, their
 - When displaying the Cockpit of a specific Project, but waiting for average token consumption to display:
   - Counter for **Tokens Processed** or **API Requests** should display a correct value (different from 0)
   - Graph across time should be empty
-```
+
+## Embedding vectors cannot be stored in a database or used with a third-party library
+
+### Cause
+- The embedding model you are using generates vector representations of fixed dimensions number, which is too high for your database or third-party library.
+  - For example, the embedding model `bge-multilingual-gemma2` generates vector representations with `3584` dimensions. However, when storing vectors using the PostgreSQL `pgvector` extension, indexes (in `hnsw` or `ivfflat` formats) only support up to `2000` dimensions.
+
+### Solution
+- Use a vector store supporting a higher number of dimensions, such as [Qdrant](https://www.scaleway.com/en/docs/tutorials/deploying-qdrant-vectordb-kubernetes/).
+- Do not use indexes for vectors or disable them from your third party library. This may limit performance in vector similarity search for significant volumes.
+  - When using [Langchain PGVector method](https://python.langchain.com/docs/integrations/vectorstores/pgvector/), this method does not create index by default, and should not raise errors.
+  - When using [Mastra](https://mastra.ai/) library with `vectorStoreName: "pgvector"`, specify indexConfig type as `flat` to avoid creating any index on vector dimensions.
+  ```typescript
+  await vectorStore.createIndex({
+    indexName: 'papers',
+    dimension: 3584,
+    indexConfig: { type: "flat" },
+  });
+  ```
+- Use a model with a lower number of dimensions. Using [Managed Inference](https://console.scaleway.com/inference/deployments), you can deploy for instance `sentence-t5-xxl` model which represent vectors with `768` dimensions.
 
 ## Best practices for optimizing model performance
 
@@ -135,4 +154,4 @@ Below are common issues that you may encounter when using Generative APIs, their
 ### Debugging silent errors
 - For cases where no explicit error is returned:
   - Verify all fields in the API request are correctly named and formatted.
-  - Test the request with smaller and simpler inputs to isolate potential issues.
\ No newline at end of file
+  - Test the request with smaller and simpler inputs to isolate potential issues.

From 0a8c5715c0762aa87798ca48e4073242b47439d7 Mon Sep 17 00:00:00 2001
From: Benedikt Rollik
Date: Fri, 11 Apr 2025 10:04:16 +0200
Subject: [PATCH 2/2] Apply suggestions from code review

---
 .../troubleshooting/fixing-common-issues.mdx | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/pages/generative-apis/troubleshooting/fixing-common-issues.mdx b/pages/generative-apis/troubleshooting/fixing-common-issues.mdx
index 786461de09..0b62e9dc69 100644
--- a/pages/generative-apis/troubleshooting/fixing-common-issues.mdx
+++ b/pages/generative-apis/troubleshooting/fixing-common-issues.mdx
@@ -124,14 +124,14 @@ Below are common issues that you may encounter when using Generative APIs, their
 ## Embedding vectors cannot be stored in a database or used with a third-party library
 
 ### Cause
-- The embedding model you are using generates vector representations of fixed dimensions number, which is too high for your database or third-party library.
+The embedding model you are using generates vector representations with a fixed number of dimensions, which is too high for your database or third-party library.
   - For example, the embedding model `bge-multilingual-gemma2` generates vector representations with `3584` dimensions. However, when storing vectors using the PostgreSQL `pgvector` extension, indexes (in `hnsw` or `ivfflat` formats) only support up to `2000` dimensions.
 
 ### Solution
 - Use a vector store supporting a higher number of dimensions, such as [Qdrant](https://www.scaleway.com/en/docs/tutorials/deploying-qdrant-vectordb-kubernetes/).
-- Do not use indexes for vectors or disable them from your third party library. This may limit performance in vector similarity search for significant volumes.
-  - When using [Langchain PGVector method](https://python.langchain.com/docs/integrations/vectorstores/pgvector/), this method does not create index by default, and should not raise errors.
-  - When using [Mastra](https://mastra.ai/) library with `vectorStoreName: "pgvector"`, specify indexConfig type as `flat` to avoid creating any index on vector dimensions.
+- Do not use indexes for vectors, or disable them in your third-party library. This may limit performance in vector similarity search for significant volumes.
+  - When using the [Langchain PGVector method](https://python.langchain.com/docs/integrations/vectorstores/pgvector/), note that it does not create an index by default and should not raise errors.
+  - When using the [Mastra](https://mastra.ai/) library with `vectorStoreName: "pgvector"`, specify the `indexConfig` type as `flat` to avoid creating any index on vector dimensions.
   ```typescript
   await vectorStore.createIndex({
     indexName: 'papers',
@@ -139,7 +139,7 @@ Below are common issues that you may encounter when using Generative APIs, their
     indexConfig: { type: "flat" },
   });
   ```
-- Use a model with a lower number of dimensions. Using [Managed Inference](https://console.scaleway.com/inference/deployments), you can deploy for instance `sentence-t5-xxl` model which represent vectors with `768` dimensions.
+- Use a model with a lower number of dimensions. Using [Managed Inference](https://console.scaleway.com/inference/deployments), you can, for instance, deploy the `sentence-t5-xxl` model, which represents vectors with `768` dimensions.
 
 ## Best practices for optimizing model performance
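The dimension limits documented in the patches above can be checked before creating an index. The following TypeScript sketch is illustrative only: the `2000`-dimension cap for `pgvector` `hnsw`/`ivfflat` indexes and the per-model vector sizes come from the documentation text, while the `MODEL_DIMENSIONS` map and `chooseIndexType` helper are hypothetical names, not part of pgvector, Langchain, or Mastra.

```typescript
// Dimension cap for pgvector hnsw/ivfflat indexes, as stated in the docs above.
const PGVECTOR_INDEX_MAX_DIMENSIONS = 2000;

// Vector sizes quoted in the documentation for the two example models.
const MODEL_DIMENSIONS: Record<string, number> = {
  "bge-multilingual-gemma2": 3584, // exceeds the pgvector index cap
  "sentence-t5-xxl": 768,          // fits under the pgvector index cap
};

// Pick "flat" (no index on vector dimensions) when a model's vectors exceed
// what hnsw/ivfflat can index; otherwise an hnsw index is possible.
function chooseIndexType(model: string): "flat" | "hnsw" {
  const dims = MODEL_DIMENSIONS[model];
  if (dims === undefined) {
    throw new Error(`unknown embedding model: ${model}`);
  }
  return dims > PGVECTOR_INDEX_MAX_DIMENSIONS ? "flat" : "hnsw";
}
```

Under these assumptions, `chooseIndexType("bge-multilingual-gemma2")` yields `"flat"`, matching the `indexConfig` value shown in the Mastra example of the patch.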