diff --git a/snippets/general-shared-text/pinecone-api-placeholders.mdx b/snippets/general-shared-text/pinecone-api-placeholders.mdx
index 9438dace..488b5984 100644
--- a/snippets/general-shared-text/pinecone-api-placeholders.mdx
+++ b/snippets/general-shared-text/pinecone-api-placeholders.mdx
@@ -1,4 +1,4 @@
 - `` (required) - A unique name for this connector.
-- `` (required) - The name of the index in the Pinecone database.
+- `` - The name of the index in the Pinecone database. If no value is provided, see the beginning of this article for the behavior at run time.
 - `` (required) - The Pinecone API key.
 - `` - The maximum number of records to transmit in a single batch. The default is `50` unless otherwise specified.
diff --git a/snippets/general-shared-text/pinecone-cli-api.mdx b/snippets/general-shared-text/pinecone-cli-api.mdx
index 924b2cae..246e560b 100644
--- a/snippets/general-shared-text/pinecone-cli-api.mdx
+++ b/snippets/general-shared-text/pinecone-cli-api.mdx
@@ -11,4 +11,4 @@ import AdditionalIngestDependencies from '/snippets/general-shared-text/ingest-d
 The following environment variables:
 
 - `PINECONE_API_KEY` - The Pinecone API, represented by `--api-key` (CLI) or `api_key` (Python, in the `PineconeAccessConfig` object).
-- `PINECONE_INDEX_NAME` - The Pinecone serverless index name, represented by `--index-name` (CLI) or `index_name` (Python).
\ No newline at end of file
+- `PINECONE_INDEX_NAME` - The Pinecone serverless index name, represented by `--index-name` (CLI) or `index_name` (Python). If no value is provided, see the beginning of this article for the behavior at run time.
\ No newline at end of file
diff --git a/snippets/general-shared-text/pinecone-platform.mdx b/snippets/general-shared-text/pinecone-platform.mdx
index 2a7852de..0d211095 100644
--- a/snippets/general-shared-text/pinecone-platform.mdx
+++ b/snippets/general-shared-text/pinecone-platform.mdx
@@ -1,6 +1,6 @@
 Fill in the following fields:
 
 - **Name** (_required_): A unique name for this connector.
-- **Index Name** (_required_): The name of the index in the Pinecone database.
+- **Index Name**: The name of the index in the Pinecone database. If no value is provided, see the beginning of this article for the behavior at run time.
 - **Batch Size**: The number of records to use in a single batch. The default is `50` if not otherwise specified.
 - **API Key** (_required_): The Pinecone API key.
\ No newline at end of file
diff --git a/snippets/general-shared-text/pinecone.mdx b/snippets/general-shared-text/pinecone.mdx
index ed0b4753..c637ea6f 100644
--- a/snippets/general-shared-text/pinecone.mdx
+++ b/snippets/general-shared-text/pinecone.mdx
@@ -13,8 +13,26 @@
 - A Pinecone API key. [Get an API key](https://docs.pinecone.io/guides/get-started/authentication#find-your-pinecone-api-key).
 - A Pinecone serverless index. [Create a serverless index](https://docs.pinecone.io/guides/indexes/create-an-index).
+ An existing index is not required. At runtime, the index behavior is as follows:
+
+ For the [Unstructured Platform](/platform/overview):
+
+ - If an existing index name is specified, and Unstructured generates embeddings,
+ but the number of dimensions that are generated does not match the existing index's embedding settings, the run will fail.
+ You must change your Unstructured embedding settings or your existing index's embedding settings to match, and try the run again.
+ - If an index name is not specified, Unstructured creates a new index in your Pinecone account. If Unstructured generates embeddings,
+ the new index's name will be `u--`.
+ If Unstructured does not generate embeddings, the new index's name will be `u
- Unstructured recommends that all records in the target index have a field
+ If you create a new index or use an existing one, Unstructured recommends that all records in the target index have a field
 named `record_id` with a string data type. Unstructured can use this field to do intelligent document overwrites. Without this field, duplicate documents might be written to the index or, in some cases, the operation could fail altogether.
diff --git a/snippets/general-shared-text/weaviate.mdx b/snippets/general-shared-text/weaviate.mdx
index 6783fe68..0cc391de 100644
--- a/snippets/general-shared-text/weaviate.mdx
+++ b/snippets/general-shared-text/weaviate.mdx
@@ -37,7 +37,7 @@
 - If an existing collection name is specified, and Unstructured generates embeddings,
 but the number of dimensions that are generated does not match the existing collection's embedding settings, the run will fail.
 You must change your Unstructured embedding settings or your existing collection's embedding settings to match, and try the run again.
- - If a collection name is not specified, Unstructured creates a new collection in your Weaviate cluster. The new collection's name will be `Elements`.
+ - If a collection name is not specified, Unstructured creates a new collection in your Weaviate cluster. The new collection's name will be `Unstructuredautocreated`.
 If Unstructured creates a new collection and generates embeddings, you will not see an embeddings property in tools such as the Weaviate Cloud **Collections** user interface. To view the generated embeddings, you can run a Weaviate GraphQL query such as the following. In this query, replace `` with