2 changes: 1 addition & 1 deletion snippets/general-shared-text/astradb-api-placeholders.mdx
@@ -1,7 +1,7 @@
- `<name>` (_required_) - A unique name for this connector.
- `<token>` (_required_) - The application token for the database.
- `<api-endpoint>` (_required_) - The database’s associated API endpoint.
- `<collection-name>` (_required_) - The name of the collection in the namespace.
- `<collection-name>` - The name of the collection in the namespace. If no value is provided, see the beginning of this article for the behavior at run time.
- `<keyspace>` - The name of the keyspace in the collection. The default is `default_keyspace` if not otherwise specified.
- `<batch-size>` - The maximum number of records to send per batch. The default is `20` if not otherwise specified.
- `flatten_metadata` - Set to `true` to flatten the metadata into each record. Specifically, when flattened, the metadata key values are brought to the top level of the element, and the `metadata` key itself is removed. By default, the metadata is not flattened (`false`).
2 changes: 1 addition & 1 deletion snippets/general-shared-text/astradb-cli-api.mdx
@@ -13,7 +13,7 @@ These environment variables:
- `ASTRA_DB_API_ENDPOINT` - The API endpoint for the Astra DB database, represented by `--api-endpoint` (CLI) or `api_endpoint` (Python). To get the endpoint, see the **Database Details > API Endpoint** value on your database's **Overview** tab.
- `ASTRA_DB_APPLICATION_TOKEN` - The database application token value for the database, represented by `--token` (CLI) or `token` (Python). To get the token, see the **Database Details > Application Tokens** box on your database's **Overview** tab.
- `ASTRA_DB_KEYSPACE` - The name of the keyspace for the database, represented by `--keyspace` (CLI) or `keyspace` (Python).
- `ASTRA_DB_COLLECTION` - The name of the collection for the keyspace, represented by `--collection-name` (CLI) or `collection_name` (Python).
- `ASTRA_DB_COLLECTION` - The name of the collection for the keyspace, represented by `--collection-name` (CLI) or `collection_name` (Python). If no value is provided, see the beginning of this article for the behavior at run time.

Additional settings include:

2 changes: 1 addition & 1 deletion snippets/general-shared-text/astradb-platform.mdx
@@ -1,7 +1,7 @@
Fill in the following fields:

- **Name** (_required_): A unique name for this connector.
- **Collection Name** (_required_): The name of the collection in the namespace.
- **Collection Name**: The name of the collection in the namespace. If no value is provided, see the beginning of this article for the behavior at run time.
- **Keyspace** (_required_): The name of the keyspace in the collection.
- **Batch Size**: The maximum number of records per batch. The default is `20` if not otherwise specified.
- **Flatten Metadata**: Check this box to flatten the metadata into each record.
20 changes: 19 additions & 1 deletion snippets/general-shared-text/astradb.mdx
@@ -12,4 +12,22 @@ allowfullscreen
- A database in the Astra account. [Create a database in an account](https://docs.datastax.com/en/astra-db-classic/databases/manage-create.html).
- An application token for the database. [Create a database application token](https://docs.datastax.com/en/astra-db-serverless/administration/manage-application-tokens.html).
- A namespace in the database. [Create a namespace in a database](https://docs.datastax.com/en/astra-db-serverless/databases/manage-namespaces.html#create-namespace).
- A collection in the namespace. [Create a collection in a namespace](https://docs.datastax.com/en/astra-db-serverless/databases/manage-collections.html#create-collection).
- A collection in the namespace. [Create a collection in a namespace](https://docs.datastax.com/en/astra-db-serverless/databases/manage-collections.html#create-collection).

An existing collection is not required. At runtime, the collection behavior is as follows:

For the [Unstructured Platform](/platform/overview):

- If an existing collection name is specified, and Unstructured generates embeddings,
but the number of dimensions that are generated does not match the existing collection's embedding settings, the run will fail.
You must change your Unstructured embedding settings or your existing collection's embedding settings to match, and try the run again.
- If a collection name is not specified, Unstructured creates a new collection in your namespace. If Unstructured generates embeddings,
the new collection's name will be `u<short-workflow-id>_<short-embedding-model-name>_<number-of-dimensions>`.
If Unstructured does not generate embeddings, the new collection's name will be `u<short-workflow-id>`.

For [Unstructured Ingest](/ingestion/overview):

- If an existing collection name is specified, and Unstructured generates embeddings,
but the number of dimensions that are generated does not match the existing collection's embedding settings, the run will fail.
You must change your Unstructured embedding settings or your existing collection's embedding settings to match, and try the run again.
- If a collection name is not specified, Unstructured creates a new collection in your namespace. The new collection's name will be `unstructuredautocreated`.
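
The run-time rules above can be summarized in a small Python sketch. This is illustrative only and is not Unstructured's implementation: the function names, the `target` argument, and the sample values are hypothetical. The snippet does not say how the short workflow ID or short embedding model name are derived, so they are passed in as-is.

```python
from typing import Optional

# Name used by Unstructured Ingest when no collection name is given (from the text above).
INGEST_DEFAULT_COLLECTION = "unstructuredautocreated"


def platform_collection_name(
    short_workflow_id: str,
    short_model_name: Optional[str] = None,
    dimensions: Optional[int] = None,
) -> str:
    """Build the auto-created collection name described for the Unstructured Platform."""
    base = f"u{short_workflow_id}"
    # Without embeddings, only the workflow-based prefix is used.
    if short_model_name is None or dimensions is None:
        return base
    return f"{base}_{short_model_name}_{dimensions}"


def resolve_collection_name(
    requested: Optional[str],
    *,
    target: str,  # hypothetical switch: "platform" or "ingest"
    short_workflow_id: Optional[str] = None,
    short_model_name: Optional[str] = None,
    dimensions: Optional[int] = None,
) -> str:
    """Return the collection name a run would be expected to write to."""
    if requested:
        return requested  # an explicitly specified collection is used as given
    if target == "ingest":
        return INGEST_DEFAULT_COLLECTION
    if target == "platform":
        if short_workflow_id is None:
            raise ValueError("short_workflow_id is needed to derive a Platform name")
        return platform_collection_name(short_workflow_id, short_model_name, dimensions)
    raise ValueError(f"unknown target: {target!r}")


def check_dimensions(existing_dimensions: int, generated_dimensions: int) -> None:
    """Mirror the documented failure when embedding dimensions do not match."""
    if existing_dimensions != generated_dimensions:
        raise ValueError(
            f"existing collection expects {existing_dimensions} dimensions, "
            f"but {generated_dimensions} were generated"
        )
```

For example, with the made-up values `"abc123"`, `"embed"`, and `1536`, `platform_collection_name` would yield `uabc123_embed_1536`. Calling `check_dimensions(1024, 1536)` raises an error, mirroring the documented run failure.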