Platform: Pinecone index automatic management behavior (#489)

Paul-Cornell · web-flow · commit 39923ad53657 · 2025-04-04T07:54:31.000-07:00
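
The run-time index naming that this change documents for Pinecone can be sketched roughly as follows. This is an illustrative sketch only, not Unstructured's actual implementation: `resolve_index_name`, `check_dimensions`, and all of their parameters are hypothetical names. How the workflow ID and embedding model name are shortened is not specified in the docs, so the sketch takes the already-shortened values as inputs.

```python
from typing import Optional

# Default name documented for Unstructured Ingest when no index name is given.
INGEST_DEFAULT_INDEX = "unstructuredautocreated"


def resolve_index_name(
    index_name: Optional[str],
    *,
    product: str,  # hypothetical switch: "platform" or "ingest"
    short_workflow_id: Optional[str] = None,
    short_model_name: Optional[str] = None,
    dimensions: Optional[int] = None,
) -> str:
    """Return the index name a run would target (hypothetical helper)."""
    if index_name:
        # An explicitly specified index name is always used as-is.
        return index_name
    if product == "ingest":
        return INGEST_DEFAULT_INDEX
    if product == "platform":
        if not short_workflow_id:
            raise ValueError("short_workflow_id is required for the Platform")
        if short_model_name and dimensions:
            # Embeddings are generated: the name encodes model and dimensions.
            return f"u{short_workflow_id}-{short_model_name}-{dimensions}"
        # No embeddings are generated: only the workflow ID is used.
        return f"u{short_workflow_id}"
    raise ValueError(f"unknown product: {product!r}")


def check_dimensions(index_dimensions: int, embedding_dimensions: Optional[int]) -> None:
    """Raise if generated embeddings do not fit an existing index (hypothetical helper)."""
    if embedding_dimensions is not None and embedding_dimensions != index_dimensions:
        raise ValueError(
            f"embedding dimensions ({embedding_dimensions}) do not match "
            f"the existing index ({index_dimensions}); the run would fail"
        )
```

For example, under these assumptions `resolve_index_name(None, product="ingest")` yields the documented Ingest default, while a Platform run that generates embeddings gets a name built from the workflow ID, model name, and dimension count.
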
diff --git a/snippets/general-shared-text/pinecone-api-placeholders.mdx b/snippets/general-shared-text/pinecone-api-placeholders.mdx
@@ -1,4 +1,4 @@
 - `<name>` (required) - A unique name for this connector.
-- `<index-name>` (required) - The name of the index in the Pinecone database.
+- `<index-name>` - The name of the index in the Pinecone database. If no value is provided, see the beginning of this article for the behavior at run time.
 - `<api-key>` (required) - The Pinecone API key.
 - `<batch-size>` - The maximum number of records to transmit in a single batch. The default is `50` unless otherwise specified.
diff --git a/snippets/general-shared-text/pinecone-cli-api.mdx b/snippets/general-shared-text/pinecone-cli-api.mdx
@@ -11,4 +11,4 @@ import AdditionalIngestDependencies from '/snippets/general-shared-text/ingest-d
 The following environment variables:
 
 - `PINECONE_API_KEY` - The Pinecone API key, represented by `--api-key` (CLI) or `api_key` (Python, in the `PineconeAccessConfig` object).
-- `PINECONE_INDEX_NAME` - The Pinecone serverless index name, represented by `--index-name` (CLI) or `index_name` (Python).
+- `PINECONE_INDEX_NAME` - The Pinecone serverless index name, represented by `--index-name` (CLI) or `index_name` (Python). If no value is provided, see the beginning of this article for the behavior at run time.
diff --git a/snippets/general-shared-text/pinecone-platform.mdx b/snippets/general-shared-text/pinecone-platform.mdx
@@ -1,6 +1,6 @@
 Fill in the following fields:
 
 - **Name** (_required_): A unique name for this connector.
-- **Index Name** (_required_): The name of the index in the Pinecone database.
+- **Index Name**: The name of the index in the Pinecone database. If no value is provided, see the beginning of this article for the behavior at run time.
 - **Batch Size**: The number of records to use in a single batch. The default is `50` if not otherwise specified.
 - **API Key** (_required_): The Pinecone API key.
diff --git a/snippets/general-shared-text/pinecone.mdx b/snippets/general-shared-text/pinecone.mdx
@@ -13,8 +13,26 @@
 - A Pinecone API key. [Get an API key](https://docs.pinecone.io/guides/get-started/authentication#find-your-pinecone-api-key).
 - A Pinecone serverless index. [Create a serverless index](https://docs.pinecone.io/guides/indexes/create-an-index).
 
+  An existing index is not required. At run time, the index behavior is as follows:
+
+  For the [Unstructured Platform](/platform/overview):
+    
+  - If an existing index name is specified, and Unstructured generates embeddings, 
+    but the number of dimensions that are generated does not match the existing index's embedding settings, the run will fail. 
+    You must change your Unstructured embedding settings or your existing index's embedding settings to match, and try the run again.
+  - If an index name is not specified, Unstructured creates a new index in your Pinecone account. If Unstructured generates embeddings, 
+    the new index's name will be `u<short-workflow-id>-<short-embedding-model-name>-<number-of-dimensions>`. 
+    If Unstructured does not generate embeddings, the new index's name will be `u<short-workflow-id>`.
+
+  For [Unstructured Ingest](/ingestion/overview):
+
+  - If an existing index name is specified, and Unstructured generates embeddings, 
+    but the number of dimensions that are generated does not match the existing index's embedding settings, the run will fail. 
+    You must change your Unstructured embedding settings or your existing index's embedding settings to match, and try the run again. 
+  - If an index name is not specified, Unstructured creates a new index in your Pinecone account. The new index's name will be `unstructuredautocreated`.
+
   <Note>
-      Unstructured recommends that all records in the target index have a field 
+      Whether you create a new index or use an existing one, Unstructured recommends that all records in the target index have a field 
       named `record_id` with a string data type. 
       Unstructured can use this field to do intelligent document overwrites. Without this field, duplicate documents 
       might be written to the index or, in some cases, the operation could fail altogether.
diff --git a/snippets/general-shared-text/weaviate.mdx b/snippets/general-shared-text/weaviate.mdx
@@ -37,7 +37,7 @@
     - If an existing collection name is specified, and Unstructured generates embeddings, 
       but the number of dimensions that are generated does not match the existing collection's embedding settings, the run will fail. 
       You must change your Unstructured embedding settings or your existing collection's embedding settings to match, and try the run again. 
-    - If a collection name is not specified, Unstructured creates a new collection in your Weaviate cluster. The new collection's name will be `Elements`.
+    - If a collection name is not specified, Unstructured creates a new collection in your Weaviate cluster. The new collection's name will be `Unstructuredautocreated`.
 
     If Unstructured creates a new collection and generates embeddings, you will not see an embeddings property in tools such as the Weaviate Cloud 
     **Collections** user interface. To view the generated embeddings, you can run a Weaviate GraphQL query such as the following. In this query, replace `<collection-name>` with