Commit 39923ad

Platform: Pinecone index automatic management behavior (#489)
1 parent d03de99 commit 39923ad

5 files changed: 23 additions, 5 deletions

Lines changed: 1 addition & 1 deletion
@@ -1,4 +1,4 @@
 - `<name>` (required) - A unique name for this connector.
-- `<index-name>` (required) - The name of the index in the Pinecone database.
+- `<index-name>` - The name of the index in the Pinecone database. If no value is provided, see the beginning of this article for the behavior at run time.
 - `<api-key>` (required) - The Pinecone API key.
 - `<batch-size>` - The maximum number of records to transmit in a single batch. The default is `50` unless otherwise specified.

snippets/general-shared-text/pinecone-cli-api.mdx

Lines changed: 1 addition & 1 deletion
@@ -11,4 +11,4 @@ import AdditionalIngestDependencies from '/snippets/general-shared-text/ingest-d
 The following environment variables:

 - `PINECONE_API_KEY` - The Pinecone API, represented by `--api-key` (CLI) or `api_key` (Python, in the `PineconeAccessConfig` object).
-- `PINECONE_INDEX_NAME` - The Pinecone serverless index name, represented by `--index-name` (CLI) or `index_name` (Python).
+- `PINECONE_INDEX_NAME` - The Pinecone serverless index name, represented by `--index-name` (CLI) or `index_name` (Python). If no value is provided, see the beginning of this article for the behavior at run time.
Lines changed: 1 addition & 1 deletion
@@ -1,6 +1,6 @@
 Fill in the following fields:

 - **Name** (_required_): A unique name for this connector.
-- **Index Name** (_required_): The name of the index in the Pinecone database.
+- **Index Name**: The name of the index in the Pinecone database. If no value is provided, see the beginning of this article for the behavior at run time.
 - **Batch Size**: The number of records to use in a single batch. The default is `50` if not otherwise specified.
 - **API Key** (_required_): The Pinecone API key.

snippets/general-shared-text/pinecone.mdx

Lines changed: 19 additions & 1 deletion
@@ -13,8 +13,26 @@
 - A Pinecone API key. [Get an API key](https://docs.pinecone.io/guides/get-started/authentication#find-your-pinecone-api-key).
 - A Pinecone serverless index. [Create a serverless index](https://docs.pinecone.io/guides/indexes/create-an-index).

+An existing index is not required. At runtime, the index behavior is as follows:
+
+For the [Unstructured Platform](/platform/overview):
+
+- If an existing index name is specified, and Unstructured generates embeddings,
+but the number of dimensions that are generated does not match the existing index's embedding settings, the run will fail.
+You must change your Unstructured embedding settings or your existing index's embedding settings to match, and try the run again.
+- If an index name is not specified, Unstructured creates a new index in your Pinecone account. If Unstructured generates embeddings,
+the new index's name will be `u<short-workflow-id>-<short-embedding-model-name>-<number-of-dimensions>`.
+If Unstructured does not generate embeddings, the new index's name will be `u<short-workflow-id`.
+
+For [Unstructured Ingest](/ingestion/overview):
+
+- If an existing index name is specified, and Unstructured generates embeddings,
+but the number of dimensions that are generated does not match the existing index's embedding settings, the run will fail.
+You must change your Unstructured embedding settings or your existing index's embedding settings to match, and try the run again.
+- If an index name is not specified, Unstructured creates a new index in your Pinecone account. The new index's name will be `unstructuredautocreated`.
+
 <Note>
-Unstructured recommends that all records in the target index have a field
+If you create a new index or use an existing one, Unstructured recommends that all records in the target index have a field
 named `record_id` with a string data type.
 Unstructured can use this field to do intelligent document overwrites. Without this field, duplicate documents
 might be written to the index or, in some cases, the operation could fail altogether.
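
As a reading aid for the `pinecone.mdx` hunk above, the index rules it documents can be sketched in Python. This is a minimal sketch under stated assumptions, not Unstructured's implementation: `resolve_index_name`, `check_dimensions`, and all of their parameters are hypothetical names, and the Platform name used when no embeddings are generated is assumed to be `u<short-workflow-id>` (the added line in the hunk omits the closing `>`).

```python
from typing import Optional


def resolve_index_name(
    index_name: Optional[str],
    *,
    target: str,
    short_workflow_id: str = "",
    short_model_name: Optional[str] = None,
    dimensions: Optional[int] = None,
) -> str:
    """Return the index name a run would use, per the rules in the hunk.

    All names here are hypothetical. `target` is "platform" or "ingest".
    """
    # A user-supplied index name is always used as given.
    if index_name:
        return index_name

    # Unstructured Ingest: fixed name for the auto-created index.
    if target == "ingest":
        return "unstructuredautocreated"

    # Unstructured Platform: name depends on whether embeddings are generated.
    if target == "platform":
        if short_model_name and dimensions:
            return f"u{short_workflow_id}-{short_model_name}-{dimensions}"
        # Assumption: the documented pattern is `u<short-workflow-id>`.
        return f"u{short_workflow_id}"

    raise ValueError(f"unknown target: {target!r}")


def check_dimensions(generated: Optional[int], index_dimensions: int) -> None:
    """Raise if generated embeddings do not fit an existing index.

    Mirrors the documented failure case: an existing index is specified,
    embeddings are generated, and the dimension counts differ.
    """
    if generated is not None and generated != index_dimensions:
        raise ValueError(
            f"embedding dimensions {generated} do not match "
            f"index dimensions {index_dimensions}"
        )
```

For example, `resolve_index_name(None, target="platform", short_workflow_id="abc123")` falls through to the no-embeddings Platform case, while passing any non-empty `index_name` returns that name unchanged for either target.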

snippets/general-shared-text/weaviate.mdx

Lines changed: 1 addition & 1 deletion
@@ -37,7 +37,7 @@
 - If an existing collection name is specified, and Unstructured generates embeddings,
 but the number of dimensions that are generated does not match the existing collection's embedding settings, the run will fail.
 You must change your Unstructured embedding settings or your existing collection's embedding settings to match, and try the run again.
-- If a collection name is not specified, Unstructured creates a new collection in your Weaviate cluster. The new collection's name will be `Elements`.
+- If a collection name is not specified, Unstructured creates a new collection in your Weaviate cluster. The new collection's name will be `Unstructuredautocreated`.

 If Unstructured creates a new collection and generates embeddings, you will not see an embeddings property in tools such as the Weaviate Cloud
 **Collections** user interface. To view the generated embeddings, you can run a Weaviate GraphQL query such as the following. In this query, replace `<collection-name>` with
