From 2a8bb67b0451e5583157e2e1feea99cec7d2e3bd Mon Sep 17 00:00:00 2001 From: Paul Cornell Date: Tue, 11 Mar 2025 16:57:33 -0700 Subject: [PATCH] Fix broken URLs/links --- {api => api-reference}/legacy-api/aws.mdx | 2 +- {api => api-reference}/legacy-api/azure.mdx | 0 .../legacy-api/free-api.mdx | 8 +- .../legacy-api/overview.mdx | 8 +- {api => api-reference}/overview.mdx | 10 +- .../partition/api-parameters.mdx | 20 +- .../partition/api-validation-errors.mdx | 0 {api => api-reference}/partition/chunking.mdx | 0 .../partition/document-elements.mdx | 0 {api => api-reference}/partition/examples.mdx | 2 +- .../partition/extract-image-block-types.mdx | 6 +- .../partition/generate-schema.mdx | 0 .../partition/get-chunked-elements.mdx | 4 +- .../partition/get-elements.mdx | 6 +- .../output-bounding-box-coordinates.mdx | 0 {api => api-reference}/partition/overview.mdx | 6 +- .../partition/partitioning.mdx | 0 .../partition/pipeline-1.mdx | 0 .../partition/post-requests.mdx | 8 +- {api => api-reference}/partition/sdk-jsts.mdx | 6 +- .../partition/sdk-python.mdx | 8 +- .../speed-up-large-files-batches.mdx | 6 +- .../partition/text-as-html.mdx | 4 +- .../partition/transform-schemas.mdx | 0 .../supported-file-types.mdx | 0 .../troubleshooting/api-key-url.mdx | 6 +- .../workflow/destinations/astradb.mdx | 0 .../workflow/destinations/azure-ai-search.mdx | 0 .../workflow/destinations/couchbase.mdx | 0 .../destinations/databricks-delta-table.mdx | 4 +- .../destinations/databricks-volumes.mdx | 2 +- .../workflow/destinations/delta-table.mdx | 2 +- .../workflow/destinations/elasticsearch.mdx | 0 .../workflow/destinations/google-cloud.mdx | 0 .../workflow/destinations/kafka.mdx | 0 .../workflow/destinations/milvus.mdx | 0 .../workflow/destinations/mongodb.mdx | 0 .../workflow/destinations/motherduck.mdx | 0 .../workflow/destinations/neo4j.mdx | 0 .../workflow/destinations/onedrive.mdx | 0 .../workflow/destinations/overview.mdx | 42 ++++ .../workflow/destinations/pinecone.mdx | 0 
.../workflow/destinations/postgresql.mdx | 0 .../workflow/destinations/qdrant.mdx | 0 .../workflow/destinations/redis.mdx | 0 .../workflow/destinations/s3.mdx | 0 .../workflow/destinations/snowflake.mdx | 0 .../workflow/destinations/weaviate.mdx | 0 {api => api-reference}/workflow/jobs.mdx | 12 +- {api => api-reference}/workflow/overview.mdx | 30 +-- .../workflow/sources/azure-blob-storage.mdx | 0 .../workflow/sources/box.mdx | 0 .../workflow/sources/confluence.mdx | 0 .../workflow/sources/couchbase.mdx | 0 .../workflow/sources/databricks-volumes.mdx | 0 .../workflow/sources/dropbox.mdx | 0 .../workflow/sources/elasticsearch.mdx | 0 .../workflow/sources/google-cloud.mdx | 0 .../workflow/sources/google-drive.mdx | 0 .../workflow/sources/kafka.mdx | 0 .../workflow/sources/mongodb.mdx | 0 .../workflow/sources/onedrive.mdx | 0 .../workflow/sources/outlook.mdx | 0 api-reference/workflow/sources/overview.mdx | 40 ++++ .../workflow/sources/postgresql.mdx | 0 .../workflow/sources/s3.mdx | 0 .../workflow/sources/salesforce.mdx | 0 .../workflow/sources/sharepoint.mdx | 0 .../workflow/sources/snowflake.mdx | 0 {api => api-reference}/workflow/workflows.mdx | 20 +- api/workflow/destinations/overview.mdx | 42 ---- api/workflow/sources/overview.mdx | 40 ---- examplecode/codesamples/api/huggingchat.mdx | 6 +- .../apioss/table-extraction-from-pdf.mdx | 4 +- .../oss/multi-files-api-processing.mdx | 2 +- examplecode/tools/langflow.mdx | 4 +- .../how-to/extract-image-block-types.mdx | 2 +- ingestion/ingest-cli.mdx | 2 +- ingestion/overview.mdx | 2 +- ingestion/python-ingest.mdx | 2 +- mint.json | 194 +++++++++--------- .../core-functionality/partitioning.mdx | 2 +- open-source/introduction/overview.mdx | 4 +- .../general-shared-text/azure-ai-search.mdx | 2 +- snippets/general-shared-text/couchbase.mdx | 2 +- .../databricks-volumes.mdx | 2 +- snippets/general-shared-text/dropbox.mdx | 4 +- .../general-shared-text/elasticsearch.mdx | 4 +- .../free-api-key-no-serverless-access.mdx | 6 +- 
snippets/general-shared-text/milvus.mdx | 2 +- snippets/general-shared-text/neo4j.mdx | 2 +- .../no-url-for-serverless-api.mdx | 2 +- snippets/general-shared-text/opensearch.mdx | 2 +- snippets/general-shared-text/postgresql.mdx | 4 +- snippets/general-shared-text/qdrant.mdx | 2 +- .../singlestore-schema.mdx | 2 +- .../sql-sample-index-schema.mdx | 2 +- snippets/general-shared-text/sqlite.mdx | 2 +- .../use-ingest-or-platform-instead.mdx | 2 +- .../weaviate-sample-index-schema.mdx | 2 +- snippets/general-shared-text/weaviate.mdx | 6 +- .../partition-by-api-oss.mdx | 4 +- snippets/ingestion/code-generator.mdx | 2 +- snippets/quickstarts/platform-api.mdx | 2 +- welcome.mdx | 14 +- 105 files changed, 320 insertions(+), 320 deletions(-) rename {api => api-reference}/legacy-api/aws.mdx (99%) rename {api => api-reference}/legacy-api/azure.mdx (100%) rename {api => api-reference}/legacy-api/free-api.mdx (83%) rename {api => api-reference}/legacy-api/overview.mdx (63%) rename {api => api-reference}/overview.mdx (77%) rename {api => api-reference}/partition/api-parameters.mdx (93%) rename {api => api-reference}/partition/api-validation-errors.mdx (100%) rename {api => api-reference}/partition/chunking.mdx (100%) rename {api => api-reference}/partition/document-elements.mdx (100%) rename {api => api-reference}/partition/examples.mdx (99%) rename {api => api-reference}/partition/extract-image-block-types.mdx (84%) rename {api => api-reference}/partition/generate-schema.mdx (100%) rename {api => api-reference}/partition/get-chunked-elements.mdx (94%) rename {api => api-reference}/partition/get-elements.mdx (91%) rename {api => api-reference}/partition/output-bounding-box-coordinates.mdx (100%) rename {api => api-reference}/partition/overview.mdx (96%) rename {api => api-reference}/partition/partitioning.mdx (100%) rename {api => api-reference}/partition/pipeline-1.mdx (100%) rename {api => api-reference}/partition/post-requests.mdx (89%) rename {api => 
api-reference}/partition/sdk-jsts.mdx (97%) rename {api => api-reference}/partition/sdk-python.mdx (96%) rename {api => api-reference}/partition/speed-up-large-files-batches.mdx (82%) rename {api => api-reference}/partition/text-as-html.mdx (84%) rename {api => api-reference}/partition/transform-schemas.mdx (100%) rename {api => api-reference}/supported-file-types.mdx (100%) rename {api => api-reference}/troubleshooting/api-key-url.mdx (88%) rename {api => api-reference}/workflow/destinations/astradb.mdx (100%) rename {api => api-reference}/workflow/destinations/azure-ai-search.mdx (100%) rename {api => api-reference}/workflow/destinations/couchbase.mdx (100%) rename {api => api-reference}/workflow/destinations/databricks-delta-table.mdx (88%) rename {api => api-reference}/workflow/destinations/databricks-volumes.mdx (91%) rename {api => api-reference}/workflow/destinations/delta-table.mdx (91%) rename {api => api-reference}/workflow/destinations/elasticsearch.mdx (100%) rename {api => api-reference}/workflow/destinations/google-cloud.mdx (100%) rename {api => api-reference}/workflow/destinations/kafka.mdx (100%) rename {api => api-reference}/workflow/destinations/milvus.mdx (100%) rename {api => api-reference}/workflow/destinations/mongodb.mdx (100%) rename {api => api-reference}/workflow/destinations/motherduck.mdx (100%) rename {api => api-reference}/workflow/destinations/neo4j.mdx (100%) rename {api => api-reference}/workflow/destinations/onedrive.mdx (100%) create mode 100644 api-reference/workflow/destinations/overview.mdx rename {api => api-reference}/workflow/destinations/pinecone.mdx (100%) rename {api => api-reference}/workflow/destinations/postgresql.mdx (100%) rename {api => api-reference}/workflow/destinations/qdrant.mdx (100%) rename {api => api-reference}/workflow/destinations/redis.mdx (100%) rename {api => api-reference}/workflow/destinations/s3.mdx (100%) rename {api => api-reference}/workflow/destinations/snowflake.mdx (100%) rename {api => 
api-reference}/workflow/destinations/weaviate.mdx (100%) rename {api => api-reference}/workflow/jobs.mdx (58%) rename {api => api-reference}/workflow/overview.mdx (98%) rename {api => api-reference}/workflow/sources/azure-blob-storage.mdx (100%) rename {api => api-reference}/workflow/sources/box.mdx (100%) rename {api => api-reference}/workflow/sources/confluence.mdx (100%) rename {api => api-reference}/workflow/sources/couchbase.mdx (100%) rename {api => api-reference}/workflow/sources/databricks-volumes.mdx (100%) rename {api => api-reference}/workflow/sources/dropbox.mdx (100%) rename {api => api-reference}/workflow/sources/elasticsearch.mdx (100%) rename {api => api-reference}/workflow/sources/google-cloud.mdx (100%) rename {api => api-reference}/workflow/sources/google-drive.mdx (100%) rename {api => api-reference}/workflow/sources/kafka.mdx (100%) rename {api => api-reference}/workflow/sources/mongodb.mdx (100%) rename {api => api-reference}/workflow/sources/onedrive.mdx (100%) rename {api => api-reference}/workflow/sources/outlook.mdx (100%) create mode 100644 api-reference/workflow/sources/overview.mdx rename {api => api-reference}/workflow/sources/postgresql.mdx (100%) rename {api => api-reference}/workflow/sources/s3.mdx (100%) rename {api => api-reference}/workflow/sources/salesforce.mdx (100%) rename {api => api-reference}/workflow/sources/sharepoint.mdx (100%) rename {api => api-reference}/workflow/sources/snowflake.mdx (100%) rename {api => api-reference}/workflow/workflows.mdx (97%) delete mode 100644 api/workflow/destinations/overview.mdx delete mode 100644 api/workflow/sources/overview.mdx diff --git a/api/legacy-api/aws.mdx b/api-reference/legacy-api/aws.mdx similarity index 99% rename from api/legacy-api/aws.mdx rename to api-reference/legacy-api/aws.mdx index 79db79ce..ea78c837 100644 --- a/api/legacy-api/aws.mdx +++ b/api-reference/legacy-api/aws.mdx @@ -314,7 +314,7 @@ For example, run one of the following, setting the following environment 
variabl - Set `UNSTRUCTURED_API_URL` to `http://`, followed by your load balancer's DNS name, followed by `/general/v0/general`. You can now use this value (`http://`, followed by your load balancer's DNS name, followed by `/general/v0/general`) in place of - calling the [Unstructured Partition Endpoint](/api/partition/overview) URL as described elsewhere in the Unstructured API documentation. + calling the [Unstructured Partition Endpoint](/api-reference/partition/overview) URL as described elsewhere in the Unstructured API documentation. - Set `LOCAL_FILE_INPUT_DIR` to the path on your local machine to the files for the Unstructured API to process. If you do not have any input files available, you can download any of the ones from the [example-docs](https://github.com/Unstructured-IO/unstructured-ingest/tree/main/example-docs) folder in GitHub. - Set `LOCAL_FILE_OUTPUT_DIR` to the path on your local machine for Unstructured API to send the processed output in JSON format: diff --git a/api/legacy-api/azure.mdx b/api-reference/legacy-api/azure.mdx similarity index 100% rename from api/legacy-api/azure.mdx rename to api-reference/legacy-api/azure.mdx diff --git a/api/legacy-api/free-api.mdx b/api-reference/legacy-api/free-api.mdx similarity index 83% rename from api/legacy-api/free-api.mdx rename to api-reference/legacy-api/free-api.mdx index a5b5daa8..91a2ed6b 100644 --- a/api/legacy-api/free-api.mdx +++ b/api-reference/legacy-api/free-api.mdx @@ -5,7 +5,7 @@ title: Free Unstructured API The Free Unstructured API is in the process of deprecation by April 4, 2025. It is no longer supported and is not being actively updated. - Unstructured recommends that you use the [Unstructured API](/api/overview) instead, which provides new users with 14 days of free usage at up to 1000 pages per day during that period. 
+ Unstructured recommends that you use the [Unstructured API](/api-reference/overview) instead, which provides new users with 14 days of free usage at up to 1000 pages per day during that period. This page is not being actively updated. It might contain out-of-date information. This page is provided for legacy reference purposes only. @@ -32,7 +32,7 @@ The Free Unstructured API is designed for prototyping purposes, and not for prod * Users of the Free Unstructured API do not get their own dedicated infrastructure. * The data sent over the Free Unstructured API can be used for model training purposes, and other service improvements. -If you require a production-ready API, consider using the [Unstructured API](/api/overview) instead. +If you require a production-ready API, consider using the [Unstructured API](/api-reference/overview) instead. import SharedPagesBilling from '/snippets/general-shared-text/pages-billing.mdx'; @@ -55,7 +55,7 @@ To work with the Free Unstructured API by using the [Unstructured Ingest CLI](/i - Set the `UNSTRUCTURED_API_KEY` environment variable to your Free Unstructured API key. - Set the `UNSTRUCTURED_API_URL` environment variable to your Free Unstructured API URL, which is `https://api.unstructured.io/general/v0/general` -- Have some compatible files on your local machine to be processed. [See the list of supported file types](/api/supported-file-types). If you do not have any files available, you can download some from the [example-docs](https://github.com/Unstructured-IO/unstructured-ingest/tree/main/example-docs) folder in the Unstructured repo on GitHub. +- Have some compatible files on your local machine to be processed. [See the list of supported file types](/api-reference/supported-file-types). If you do not have any files available, you can download some from the [example-docs](https://github.com/Unstructured-IO/unstructured-ingest/tree/main/example-docs) folder in the Unstructured repo on GitHub. 
Now, use the CLI to call the API, replacing: @@ -93,7 +93,7 @@ To work with Unstructured by using the [Unstructured Python library](/ingestion/ [Get your API key and API URL](#get-started). -- Have some compatible files on your local machine to be processed. [See the list of supported file types](/api/supported-file-types). If you do not have any files available, you can download some from the [example-docs](https://github.com/Unstructured-IO/unstructured-ingest/tree/main/example-docs) folder in the Unstructured repo on GitHub. If you do not have any files available, you can download some from the [example-docs](https://github.com/Unstructured-IO/unstructured-ingest/tree/main/example-docs) folder in the Unstructured repo on GitHub. +- Have some compatible files on your local machine to be processed. [See the list of supported file types](/api-reference/supported-file-types). If you do not have any files available, you can download some from the [example-docs](https://github.com/Unstructured-IO/unstructured-ingest/tree/main/example-docs) folder in the Unstructured repo on GitHub. If you do not have any files available, you can download some from the [example-docs](https://github.com/Unstructured-IO/unstructured-ingest/tree/main/example-docs) folder in the Unstructured repo on GitHub. Now, use the CLI to call the API, replacing: diff --git a/api/legacy-api/overview.mdx b/api-reference/legacy-api/overview.mdx similarity index 63% rename from api/legacy-api/overview.mdx rename to api-reference/legacy-api/overview.mdx index b744304b..772fe648 100644 --- a/api/legacy-api/overview.mdx +++ b/api-reference/legacy-api/overview.mdx @@ -4,15 +4,15 @@ title: Overview Unstructured has deprecated the following APIs: -- The [Free Unstructured API](/api/legacy-api/free-api) is in the process of deprecation by April 4, 2025. +- The [Free Unstructured API](/api-reference/legacy-api/free-api) is in the process of deprecation by April 4, 2025. 
It is no longer supported and is not being actively updated. Unstructured recommends that you use the - [Unstructured API](/api/overview) instead, which provides new users with 14 days of free usage at up to + [Unstructured API](/api-reference/overview) instead, which provides new users with 14 days of free usage at up to 1000 pages per day during that period. -- The [Unstructured API on AWS](/api/legacy-api/aws) is deprecated. It is no longer supported and is not being actively updated. +- The [Unstructured API on AWS](/api-reference/legacy-api/aws) is deprecated. It is no longer supported and is not being actively updated. Unstructured is now available on the AWS Marketplace as a private offering. To explore supported options for running Unstructured within your virtual private cloud (VPC), email Unstructured Sales at [sales@unstructured.io](mailto:sales@unstructured.io). -- The [Unstructured API on Azure](/api/legacy-api/azure) is deprecated. It is no longer supported and is not being actively updated. +- The [Unstructured API on Azure](/api-reference/legacy-api/azure) is deprecated. It is no longer supported and is not being actively updated. Unstructured is now available on the AWS Marketplace as a private offering. To explore supported options for running Unstructured within your virtual private cloud (VPC), email Unstructured Sales at [sales@unstructured.io](mailto:sales@unstructured.io). 
diff --git a/api/overview.mdx b/api-reference/overview.mdx similarity index 77% rename from api/overview.mdx rename to api-reference/overview.mdx index 6eb9cec3..316d41fd 100644 --- a/api/overview.mdx +++ b/api-reference/overview.mdx @@ -4,15 +4,15 @@ title: Overview The Unstructured API consists of two parts: -- The [Unstructured Workflow Endpoint](/api/workflow/overview) enables a full range of partitioning, chunking, embedding, and +- The [Unstructured Workflow Endpoint](/api-reference/workflow/overview) enables a full range of partitioning, chunking, embedding, and enrichment options for your files and data. It is designed to batch-process files and data in remote locations; send processed results to various storage, databases, and vector stores; and use the latest and highest-performing models on the market today. It has built-in logic - to deliver the highest quality results at the lowest cost. [Learn more](/api/workflow/overview). -- The [Unstructured Partition Endpoint](/api/partition/overview) is intended for rapid prototyping of Unstructured's + to deliver the highest quality results at the lowest cost. [Learn more](/api-reference/workflow/overview). +- The [Unstructured Partition Endpoint](/api-reference/partition/overview) is intended for rapid prototyping of Unstructured's various partitioning strategies, with limited support for chunking. It is designed to work only with processing of local files, one file - at a time. Use the [Unstructured Workflow Endpoint](/api/workflow/overview) for production-level scenarios, file processing in + at a time. Use the [Unstructured Workflow Endpoint](/api-reference/workflow/overview) for production-level scenarios, file processing in batches, files and data in remote locations, generating embeddings, applying post-transform enrichments, using the latest and - highest-performing models, and for the highest quality results at the lowest cost. [Learn more](/api/partition/overview). 
+ highest-performing models, and for the highest quality results at the lowest cost. [Learn more](/api-reference/partition/overview). # Benefits over open source diff --git a/api/partition/api-parameters.mdx b/api-reference/partition/api-parameters.mdx similarity index 93% rename from api/partition/api-parameters.mdx rename to api-reference/partition/api-parameters.mdx index d9e42370..2377fa2c 100644 --- a/api/partition/api-parameters.mdx +++ b/api-reference/partition/api-parameters.mdx @@ -12,26 +12,26 @@ The only required parameter is `files` - the file you wish to process. | POST, Python | JavaScript/TypeScript | Description | |-------------------------------------------|------------------------------------------|------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| | `files` (_shared.Files_) | `files` (_File_, _Blob_, _shared.Files_) | The file to process. | -| `chunking_strategy` (_str_) | `chunkingStrategy` (_string_) | Use one of the supported strategies to chunk the returned elements after partitioning. When no chunking strategy is specified, no chunking is performed and any other chunking parameters provided are ignored. Supported strategies: `basic`, `by_title`, `by_page`, and `by_similarity`. [Learn more](/api/partition/chunking). | +| `chunking_strategy` (_str_) | `chunkingStrategy` (_string_) | Use one of the supported strategies to chunk the returned elements after partitioning. When no chunking strategy is specified, no chunking is performed and any other chunking parameters provided are ignored. Supported strategies: `basic`, `by_title`, `by_page`, and `by_similarity`. [Learn more](/api-reference/partition/chunking). 
| | `content_type` (_str_) | `contentType` (_string_) | A hint to Unstructured about the content type to use (such as `text/markdown`), when there are problems processing a specific file. This value is a MIME type in the format `type/subtype`. For available MIME types, see [model.py](https://github.com/Unstructured-IO/unstructured/blob/main/unstructured/file_utils/model.py). | -| `coordinates` (_bool_) | `coordinates` (_boolean_) | True to return bounding box coordinates for each element extracted with OCR. Default: false. [Learn more](/api/partition/examples#saving-bounding-box-coordinates). | +| `coordinates` (_bool_) | `coordinates` (_boolean_) | True to return bounding box coordinates for each element extracted with OCR. Default: false. [Learn more](/api-reference/partition/examples#saving-bounding-box-coordinates). | | `encoding` (_str_) | `encoding` (_string_) | The encoding method used to decode the text input. Default: `utf-8`. | -| `extract_image_block_types` (_List[str]_) | `extractImageBlockTypes` (_string[]_) | The types of elements to extract, for use in extracting image blocks as Base64 encoded data stored in element metadata fields, for example: `["Image","Table"]`. Supported filetypes are image and PDF. [Learn more](/api/partition/extract-image-block-types). | +| `extract_image_block_types` (_List[str]_) | `extractImageBlockTypes` (_string[]_) | The types of elements to extract, for use in extracting image blocks as Base64 encoded data stored in element metadata fields, for example: `["Image","Table"]`. Supported filetypes are image and PDF. [Learn more](/api-reference/partition/extract-image-block-types). | | `gz_uncompressed_content_type` (_str_) | `gzUncompressedContentType` (_string_) | If file is gzipped, use this content type after unzipping. Example: `application/pdf` | -| `hi_res_model_name` (_str_) | `hiResModelName` (_string_) | The name of the inference model used when strategy is `hi_res`. Options are `layout_v1.1.0` and `yolox`. 
Default: `layout_v1.1.0`. [Learn more](/api/partition/examples#changing-partition-strategy-for-a-pdf). | +| `hi_res_model_name` (_str_) | `hiResModelName` (_string_) | The name of the inference model used when strategy is `hi_res`. Options are `layout_v1.1.0` and `yolox`. Default: `layout_v1.1.0`. [Learn more](/api-reference/partition/examples#changing-partition-strategy-for-a-pdf). | | `include_page_breaks` (_bool_) | `includePageBreaks` (_boolean_) | True for the output to include page breaks if the filetype supports it. Default: false. | -| `languages` (_List[str]_) | `languages` (_string[]_) | The languages present in the document, for use in partitioning and OCR. [View the list of available languages](https://github.com/tesseract-ocr/tessdata). [Learn more](/api/partition/examples#specifying-the-language-of-a-document-for-better-ocr-results). | +| `languages` (_List[str]_) | `languages` (_string[]_) | The languages present in the document, for use in partitioning and OCR. [View the list of available languages](https://github.com/tesseract-ocr/tessdata). [Learn more](/api-reference/partition/examples#specifying-the-language-of-a-document-for-better-ocr-results). | | `output_format` (_str_) | `outputFormat` (_string_) | The format of the response. Supported formats are `application/json` and `text/csv`. Default: `application/json`. | | `pdf_infer_table_structure` (_bool_) | `pdfInferTableStructure` (_boolean_) | **Deprecated!** If true and `strategy` is `hi_res`, any `Table` elements extracted from a PDF will include an additional metadata field, `text_as_html`, where the value (string) is a just a transformation of the data into an HTML table. | | `skip_infer_table_types` (_List[str]_) | `skipInferTableTypes` (_string[]_) | The document types that you want to skip table extraction for. Default: `[]`. | | `starting_page_number` (_int_) | `startingPageNumber` (_number_) | The page number to be be assigned to the first page in the document. 
This information will be included in elements' metadata and can be be especially useful when partitioning a document that is part of a larger document. | -| `strategy` (_str_) | `strategy` (_string_) | The strategy to use for partitioning PDF and image files. Options are `auto`, `vlm`, `hi_res`, `fast`, and `ocr_only`. Default: `auto`. [Learn more](/api/partition/partitioning). | +| `strategy` (_str_) | `strategy` (_string_) | The strategy to use for partitioning PDF and image files. Options are `auto`, `vlm`, `hi_res`, `fast`, and `ocr_only`. Default: `auto`. [Learn more](/api-reference/partition/partitioning). | | `unique_element_ids` (_bool_) | `uniqueElementIds` (_boolean_) | True to assign UUIDs to element IDs, which guarantees their uniqueness (useful when using them as primary keys in database). Otherwise a SHA-256 of the element's text is used. Default: false. | | `vlm_model` (_str_) | (Not yet available) | Applies only when `strategy` is `vlm`. The name of the vision language model (VLM) provider to use for partitioning. `vlm_model_provider` must also be specified. For a list of allowed values, see the end of this article. | | `vlm_model_provider` (_str_) | (Not yet available) | Applies only when `strategy` is `vlm`. The name of the vision language model (VLM) to use for partitioning. `vlm_model` must also be specified. For a list of allowed values, see the end of this article. | | `xml_keep_tags` (_bool_) | `xmlKeepTags` (_boolean_) | True to retain the XML tags in the output. Otherwise it will just extract the text from within the tags. Only applies to XML documents. | -The following parameters only apply when a chunking strategy is specified. Otherwise, they are ignored. [Learn more](/api/partition/chunking). +The following parameters only apply when a chunking strategy is specified. Otherwise, they are ignored. [Learn more](/api-reference/partition/chunking). 
| POST, Python | JavaScript/TypeScript | Description | |----------------------------------|-----------------------------------|------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| @@ -44,16 +44,16 @@ The following parameters only apply when a chunking strategy is specified. Other | `overlap_all` (_bool_) | `overlapAll` (_boolean_) | True to have an overlap also applied to "normal" chunks formed by combining whole elements. Use with caution, as this can introduce noise into otherwise clean semantic units. Default: none. | | `similarity_threshold` (_float_) | `similarityThreshold` (_number_) | Applies only when the chunking strategy is set to `by_similarity`. The minimum similarity text in consecutive elements must have to be included in the same chunk. Must be between 0.0 and 1.0, exclusive (0.01 to 0.99, inclusive). Default: 0.5. | -The following parameters are specific to the Python and JavaScript/TypeScript clients and are not sent to the server. [Learn more](/api/partition/sdk-python#page-splitting). +The following parameters are specific to the Python and JavaScript/TypeScript clients and are not sent to the server. [Learn more](/api-reference/partition/sdk-python#page-splitting). 
| POST, Python | JavaScript/TypeScript | Description | |---------------------------------------|---------------------------------------|------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| -| `split_pdf_page` (_bool_) | `splitPdfPage` (_boolean_) | True to split the PDF file client-side. [Learn more](/api/partition/sdk-python#page-splitting). | +| `split_pdf_page` (_bool_) | `splitPdfPage` (_boolean_) | True to split the PDF file client-side. [Learn more](/api-reference/partition/sdk-python#page-splitting). | | `split_pdf_allow_failed` (_bool_) | `splitPdfAllowFailed` (_boolean_) | When `true`, a failed split request will not stop the processing of the rest of the document. The affected page range will be ignored in the results. When `false`, a failed split request will cause the entire document to fail. Default: `false`. | | `split_pdf_concurrency_level` (_int_) | `splitPdfConcurrencyLevel` (_number_) | The number of split files to be sent concurrently. Default: 5. Maximum: 15. | | `split_pdf_page_range` (_List[int]_) | `splitPdfPageRange` (_number[]_) | A list of 2 integers within the range `[1, length_of_pdf]`. When pdf splitting is enabled, this will send only the specified page range to the API. | -Need help getting started? Check out the [Examples page](/api/partition/examples) for some inspiration. +Need help getting started? Check out the [Examples page](/api-reference/partition/examples) for some inspiration. 
Allowed values for `vlm_model_provider` and `vlm_model` pairs include the following: diff --git a/api/partition/api-validation-errors.mdx b/api-reference/partition/api-validation-errors.mdx similarity index 100% rename from api/partition/api-validation-errors.mdx rename to api-reference/partition/api-validation-errors.mdx diff --git a/api/partition/chunking.mdx b/api-reference/partition/chunking.mdx similarity index 100% rename from api/partition/chunking.mdx rename to api-reference/partition/chunking.mdx diff --git a/api/partition/document-elements.mdx b/api-reference/partition/document-elements.mdx similarity index 100% rename from api/partition/document-elements.mdx rename to api-reference/partition/document-elements.mdx diff --git a/api/partition/examples.mdx b/api-reference/partition/examples.mdx similarity index 99% rename from api/partition/examples.mdx rename to api-reference/partition/examples.mdx index 7921671e..18277919 100644 --- a/api/partition/examples.mdx +++ b/api-reference/partition/examples.mdx @@ -4,7 +4,7 @@ description: This page provides some examples of accessing Unstructured Partitio --- To use these examples, you'll first need to set an environment variable named `UNSTRUCTURED_API_KEY`, -representing your Unstructured API key. [Get your API key](/api/partition/overview). +representing your Unstructured API key. [Get your API key](/api-reference/partition/overview). 
For the POST and Unstructured JavaScript/TypeScript SDK examples, you'll also need to set an environment variable named `UNSTRUCTURED_API_URL` to the value `https://api.unstructuredapp.io/general/v0/general` diff --git a/api/partition/extract-image-block-types.mdx b/api-reference/partition/extract-image-block-types.mdx similarity index 84% rename from api/partition/extract-image-block-types.mdx rename to api-reference/partition/extract-image-block-types.mdx index 110c64e9..8875adef 100644 --- a/api/partition/extract-image-block-types.mdx +++ b/api-reference/partition/extract-image-block-types.mdx @@ -15,7 +15,7 @@ and then show it. ## To run this example You will need a document that is one of the document types supported by the `extract_image_block_types` argument. -See the `extract_image_block_types` entry in [API Parameters](/api/partition/api-parameters). +See the `extract_image_block_types` entry in [API Parameters](/api-reference/partition/api-parameters). This example uses a PDF file with embedded images and tables. 
import SharedAPIKeyURL from '/snippets/general-shared-text/api-key-url.mdx'; @@ -23,11 +23,11 @@ import ExtractImageBlockTypesPy from '/snippets/how-to-api/extract_image_block_t ## Code -For the [Unstructured Python SDK](/api/partition/sdk-python), you'll need: +For the [Unstructured Python SDK](/api-reference/partition/sdk-python), you'll need: ## See also -- [Extract text as HTML](/api/partition/text-as-html) +- [Extract text as HTML](/api-reference/partition/text-as-html) - [Table extraction from PDF](/examplecode/codesamples/apioss/table-extraction-from-pdf) \ No newline at end of file diff --git a/api/partition/generate-schema.mdx b/api-reference/partition/generate-schema.mdx similarity index 100% rename from api/partition/generate-schema.mdx rename to api-reference/partition/generate-schema.mdx diff --git a/api/partition/get-chunked-elements.mdx b/api-reference/partition/get-chunked-elements.mdx similarity index 94% rename from api/partition/get-chunked-elements.mdx rename to api-reference/partition/get-chunked-elements.mdx index 1990151c..00e6df7e 100644 --- a/api/partition/get-chunked-elements.mdx +++ b/api-reference/partition/get-chunked-elements.mdx @@ -55,11 +55,11 @@ You will need to chunk a document during processing. 
This example uses a PDF fil import GetChunkedElementsPy from '/snippets/how-to-api/get_chunked_elements.py.mdx'; import SharedAPIKeyURL from '/snippets/general-shared-text/api-key-url.mdx'; -For the [Unstructured Python SDK](/api/partition/sdk-python), you'll need: +For the [Unstructured Python SDK](/api-reference/partition/sdk-python), you'll need: ## See also - [Recovering chunk elements](/open-source/core-functionality/chunking#recovering-chunk-elements) -- [Chunking strategies](/api/partition/chunking) \ No newline at end of file +- [Chunking strategies](/api-reference/partition/chunking) \ No newline at end of file diff --git a/api/partition/get-elements.mdx b/api-reference/partition/get-elements.mdx similarity index 91% rename from api/partition/get-elements.mdx rename to api-reference/partition/get-elements.mdx index 4c9d8593..b0c9d601 100644 --- a/api/partition/get-elements.mdx +++ b/api-reference/partition/get-elements.mdx @@ -4,7 +4,7 @@ title: Get element contents ## Task -You want to get, manipulate, and print or save, the contents of the [document elements and metadata](/api/partition/document-elements) from the processed data that Unstructured returns. +You want to get, manipulate, and print or save, the contents of the [document elements and metadata](/api-reference/partition/document-elements) from the processed data that Unstructured returns. ## Approach @@ -14,7 +14,7 @@ The programmatic approach you take to get these document elements will depend on - For the [Unstructured Python SDK](/api/partition/sdk-python), calling an `UnstructuredClient` object's `general.partition_async` method returns a `PartitionResponse` object. + For the [Unstructured Python SDK](/api-reference/partition/sdk-python), calling an `UnstructuredClient` object's `general.partition_async` method returns a `PartitionResponse` object. This `PartitionResponse` object's `elements` variable contains a list of key-value dictionaries (`List[Dict[str, Any]]`). 
For example: @@ -78,7 +78,7 @@ The programmatic approach you take to get these document elements will depend on ``` - For the [Unstructured JavaScript/TypeScript SDK](/api/partition/sdk-jsts), calling an `UnstructuredClient` object's `general.partition` method returns a `Promise` object. + For the [Unstructured JavaScript/TypeScript SDK](/api-reference/partition/sdk-jsts), calling an `UnstructuredClient` object's `general.partition` method returns a `Promise` object. This `PartitionResponse` object's `elements` property contains an `Array` of string-value objects (`{ [k: string]: any; }[]`). For example: diff --git a/api/partition/output-bounding-box-coordinates.mdx b/api-reference/partition/output-bounding-box-coordinates.mdx similarity index 100% rename from api/partition/output-bounding-box-coordinates.mdx rename to api-reference/partition/output-bounding-box-coordinates.mdx diff --git a/api/partition/overview.mdx b/api-reference/partition/overview.mdx similarity index 96% rename from api/partition/overview.mdx rename to api-reference/partition/overview.mdx index 9ac9cfec..902d3325 100644 --- a/api/partition/overview.mdx +++ b/api-reference/partition/overview.mdx @@ -2,9 +2,9 @@ title: Overview --- -The Unstructured Partition Endpoint, part of the [Unstructured API](/api/overview), is intended for rapid prototyping of Unstructured's +The Unstructured Partition Endpoint, part of the [Unstructured API](/api-reference/overview), is intended for rapid prototyping of Unstructured's various partitioning strategies, with limited support for chunking. It is designed to work only with processing of local files, one file -at a time. Use the [Unstructured Workflow Endpoint](/api/workflow/overview) for production-level scenarios, file processing in +at a time. 
Use the [Unstructured Workflow Endpoint](/api-reference/workflow/overview) for production-level scenarios, file processing in batches, files and data in remote locations, generating embeddings, applying post-transform enrichments, using the latest and highest-performing models, and for the highest quality results at the lowest cost. @@ -159,7 +159,7 @@ available for download from [https://constitutioncenter.org/media/files/constitu -You can also call the Unstructured Partition Endpoint by using the [Unstructured Python SDK](/api/partition/sdk-python) or the [Unstructured JavaScript/TypeScript SDK](/api/partition/sdk-jsts). +You can also call the Unstructured Partition Endpoint by using the [Unstructured Python SDK](/api-reference/partition/sdk-python) or the [Unstructured JavaScript/TypeScript SDK](/api-reference/partition/sdk-jsts). ## Telemetry diff --git a/api/partition/partitioning.mdx b/api-reference/partition/partitioning.mdx similarity index 100% rename from api/partition/partitioning.mdx rename to api-reference/partition/partitioning.mdx diff --git a/api/partition/pipeline-1.mdx b/api-reference/partition/pipeline-1.mdx similarity index 100% rename from api/partition/pipeline-1.mdx rename to api-reference/partition/pipeline-1.mdx diff --git a/api/partition/post-requests.mdx b/api-reference/partition/post-requests.mdx similarity index 89% rename from api/partition/post-requests.mdx rename to api-reference/partition/post-requests.mdx index 253a9091..671ef6c9 100644 --- a/api/partition/post-requests.mdx +++ b/api-reference/partition/post-requests.mdx @@ -9,7 +9,7 @@ import SharedAPIKeyURL from '/snippets/general-shared-text/api-key-url.mdx'; -[Get your API key](/api/partition/overview). +[Get your API key](/api-reference/partition/overview). 
The API URL is `https://api.unstructuredapp.io/general/v0/general` @@ -32,14 +32,14 @@ curl --request 'POST' \ ``` In the example above we're representing the API endpoint with the environment variable `UNSTRUCTURED_API_URL`. Note, however, that you also need to authenticate yourself with -your individual API Key, represented by the environment variable `UNSTRUCTURED_API_KEY`. Learn how to obtain an API URL and API key in the [Unstructured Partition Endpoint guide](/api/partition/overview). +your individual API Key, represented by the environment variable `UNSTRUCTURED_API_KEY`. Learn how to obtain an API URL and API key in the [Unstructured Partition Endpoint guide](/api-reference/partition/overview). ## Parameters & examples The API parameters are the same across all methods of accessing the Unstructured Partition Endpoint. -* Refer to the [API parameters](/api/partition/api-parameters) page for the full list of available parameters. -* Refer to the [Examples](/api/partition/examples) page for some inspiration on using the parameters. +* Refer to the [API parameters](/api-reference/partition/api-parameters) page for the full list of available parameters. +* Refer to the [Examples](/api-reference/partition/examples) page for some inspiration on using the parameters. 
[//]: # (TODO: when we have the concepts page shared across products, link it from here for the users to learn about partition strategies, chunking strategies and other important shared concepts) diff --git a/api/partition/sdk-jsts.mdx b/api-reference/partition/sdk-jsts.mdx similarity index 97% rename from api/partition/sdk-jsts.mdx rename to api-reference/partition/sdk-jsts.mdx index 547bfc03..bde98e82 100644 --- a/api/partition/sdk-jsts.mdx +++ b/api-reference/partition/sdk-jsts.mdx @@ -5,7 +5,7 @@ title: JavaScript/TypeScript SDK The [Unstructured JavaScript/TypeScript SDK](https://github.com/Unstructured-IO/unstructured-js-client) client allows you to send one file at a time for processing by the Unstructured Partition Endpoint. To use the JavaScript/TypeScript SDK, you'll first need to set an environment variable named `UNSTRUCTURED_API_KEY`, -representing your Unstructured API key. [Get your API key](/api/partition/overview). +representing your Unstructured API key. [Get your API key](/api-reference/partition/overview). ## Installation @@ -286,6 +286,6 @@ The parameter names used in this document are for the JavaScript/TypeScript SDK, convention. The Python SDK follows the `snake_case` convention. Other than this difference in naming convention, the names used in the SDKs are the same across all methods. -* Refer to the [API parameters](/api/partition/api-parameters) page for the full list of available parameters. -* Refer to the [Examples](/api/partition/examples) page for some inspiration on using the parameters. +* Refer to the [API parameters](/api-reference/partition/api-parameters) page for the full list of available parameters. +* Refer to the [Examples](/api-reference/partition/examples) page for some inspiration on using the parameters. 
diff --git a/api/partition/sdk-python.mdx b/api-reference/partition/sdk-python.mdx similarity index 96% rename from api/partition/sdk-python.mdx rename to api-reference/partition/sdk-python.mdx index 6553d820..f8d48304 100644 --- a/api/partition/sdk-python.mdx +++ b/api-reference/partition/sdk-python.mdx @@ -3,10 +3,10 @@ title: Python SDK --- The [Unstructured Python SDK](https://github.com/Unstructured-IO/unstructured-python-client) client allows you to send one file at a time for processing by -the [Unstructured Partition Endpoint](/api/partition/overview). +the [Unstructured Partition Endpoint](/api-reference/partition/overview). To use the Python SDK, you'll first need to set an environment variable named `UNSTRUCTURED_API_KEY`, -representing your Unstructured API key. [Get your API key](/api/partition/overview). +representing your Unstructured API key. [Get your API key](/api-reference/partition/overview). ## Installation @@ -250,8 +250,8 @@ The parameter names used in this document are for the Python SDK, which follow s convention. Other than this difference in naming convention, the names used in the SDKs are the same across all methods. -* Refer to the [API parameters](/api/partition/api-parameters) page for the full list of available parameters. -* Refer to the [Examples](/api/partition/examples) page for some inspiration on using the parameters. +* Refer to the [API parameters](/api-reference/partition/api-parameters) page for the full list of available parameters. +* Refer to the [Examples](/api-reference/partition/examples) page for some inspiration on using the parameters. 
## Migration guide diff --git a/api/partition/speed-up-large-files-batches.mdx b/api-reference/partition/speed-up-large-files-batches.mdx similarity index 82% rename from api/partition/speed-up-large-files-batches.mdx rename to api-reference/partition/speed-up-large-files-batches.mdx index dcfedb6f..5d7aac89 100644 --- a/api/partition/speed-up-large-files-batches.mdx +++ b/api-reference/partition/speed-up-large-files-batches.mdx @@ -4,13 +4,13 @@ title: Speed up processing of large files and batches When you use Unstructured, here are some techniques that you can try to help speed up the processing of large files and large batches of files. -Choose your partitioning strategy wisely. For example, if you have simple PDFs that don't have images and tables, you might be able to use the `fast` strategy. Try the `fast` strategy on a few of your documents before you try using the `hi_res` strategy. [Learn more](/api/partition/partitioning). +Choose your partitioning strategy wisely. For example, if you have simple PDFs that don't have images and tables, you might be able to use the `fast` strategy. Try the `fast` strategy on a few of your documents before you try using the `hi_res` strategy. [Learn more](/api-reference/partition/partitioning). -To speed up PDF file processing, the [Unstructured SDK for Python](/api/partition/sdk-python) and the [Unstructured SDK for JavaScript/TypeScript](/api/partition/sdk-jsts) provide the following parameters to help speed up processing a large PDF file: +To speed up PDF file processing, the [Unstructured SDK for Python](/api-reference/partition/sdk-python) and the [Unstructured SDK for JavaScript/TypeScript](/api-reference/partition/sdk-jsts) provide the following parameters to help speed up processing a large PDF file: - `split_pdf_page` (Python) or `splitPdfPage` (JavaScript/TypeScript), when set to true, splits the PDF file on the client side before sending it as batches to Unstructured for processing. 
The number of pages in each batch is determined internally. Batches can contain between 2 and 20 pages. - `split_pdf_concurrency_level` (Python) or `splitPdfConcurrencyLevel` (JavaScript/TypeScript) is an integer that specifies the number of parallel requests. The default is 5. The maximum is 15. This behavior is ignored unless `split_pdf_page` (Python) or `splitPdfPage` (JavaScript/TypeScript) is also set to true. - `split_pdf_allow_failed` (Python) or splitPdfAllowFailed` (JavaScript/TypeScript), when set to true, allows partitioning to continue even if some pages fail. - `split_pdf_page_range` (Python only) is a list of two integers that specify the beginning and ending page numbers of the PDF file to be sent. A `ValueError` is raised if the specified range is not valid. This behavior is ignored unless `split_pdf_page` is also set to true. -[Learn more](/api/partition/sdk-python#page-splitting). +[Learn more](/api-reference/partition/sdk-python#page-splitting). diff --git a/api/partition/text-as-html.mdx b/api-reference/partition/text-as-html.mdx similarity index 84% rename from api/partition/text-as-html.mdx rename to api-reference/partition/text-as-html.mdx index 4e4d45f6..f9987e2c 100644 --- a/api/partition/text-as-html.mdx +++ b/api-reference/partition/text-as-html.mdx @@ -21,11 +21,11 @@ import ExtractTextAsHTMLPy from '/snippets/how-to-api/extract_text_as_html.py.md ## Code -For the [Unstructured Python SDK](/api/partition/sdk-python), you'll need: +For the [Unstructured Python SDK](/api-reference/partition/sdk-python), you'll need: ## See also -- [Extract images and tables from documents](/api/partition/extract-image-block-types) +- [Extract images and tables from documents](/api-reference/partition/extract-image-block-types) - [Table Extraction from PDF](/examplecode/codesamples/apioss/table-extraction-from-pdf) \ No newline at end of file diff --git a/api/partition/transform-schemas.mdx b/api-reference/partition/transform-schemas.mdx similarity index 
100% rename from api/partition/transform-schemas.mdx rename to api-reference/partition/transform-schemas.mdx diff --git a/api/supported-file-types.mdx b/api-reference/supported-file-types.mdx similarity index 100% rename from api/supported-file-types.mdx rename to api-reference/supported-file-types.mdx diff --git a/api/troubleshooting/api-key-url.mdx b/api-reference/troubleshooting/api-key-url.mdx similarity index 88% rename from api/troubleshooting/api-key-url.mdx rename to api-reference/troubleshooting/api-key-url.mdx index c5f596e0..ff2bb64c 100644 --- a/api/troubleshooting/api-key-url.mdx +++ b/api-reference/troubleshooting/api-key-url.mdx @@ -37,10 +37,10 @@ API error occurred: Status 404 For the API URL, note the following: -- For the [Unstructured Workflow Endpoint](/api/workflow/overview), the API URL is typically `https://platform.unstructuredapp.io/api/v1`. -- For the [Unstructured Partition Endpoint](/api/partition/overview), the API URL is typically `https://api.unstructuredapp.io/general/v0/general`. +- For the [Unstructured Workflow Endpoint](/api-reference/workflow/overview), the API URL is typically `https://platform.unstructuredapp.io/api/v1`. +- For the [Unstructured Partition Endpoint](/api-reference/partition/overview), the API URL is typically `https://api.unstructuredapp.io/general/v0/general`. -For the API key, the same API key works for both the [Unstructured Workflow Endpoint](/api/workflow/overview) key or [Unstructured Partition Endpoint](/api/partition/overview). This API key is in your Unstructured account dashboard. To access your dashboard: +For the API key, the same API key works for both the [Unstructured Workflow Endpoint](/api-reference/workflow/overview) key or [Unstructured Partition Endpoint](/api-reference/partition/overview). This API key is in your Unstructured account dashboard. 
To access your dashboard: ![Unstructured account settings](/img/ui/AccountSettings.png) diff --git a/api/workflow/destinations/astradb.mdx b/api-reference/workflow/destinations/astradb.mdx similarity index 100% rename from api/workflow/destinations/astradb.mdx rename to api-reference/workflow/destinations/astradb.mdx diff --git a/api/workflow/destinations/azure-ai-search.mdx b/api-reference/workflow/destinations/azure-ai-search.mdx similarity index 100% rename from api/workflow/destinations/azure-ai-search.mdx rename to api-reference/workflow/destinations/azure-ai-search.mdx diff --git a/api/workflow/destinations/couchbase.mdx b/api-reference/workflow/destinations/couchbase.mdx similarity index 100% rename from api/workflow/destinations/couchbase.mdx rename to api-reference/workflow/destinations/couchbase.mdx diff --git a/api/workflow/destinations/databricks-delta-table.mdx b/api-reference/workflow/destinations/databricks-delta-table.mdx similarity index 88% rename from api/workflow/destinations/databricks-delta-table.mdx rename to api-reference/workflow/destinations/databricks-delta-table.mdx index 8c512a12..249b784e 100644 --- a/api/workflow/destinations/databricks-delta-table.mdx +++ b/api-reference/workflow/destinations/databricks-delta-table.mdx @@ -6,10 +6,10 @@ title: Delta Tables in Databricks This article covers connecting Unstructured to Delta Tables in Databricks. For information about connecting Unstructured to Delta Tables in Amazon S3 instead, see - [Delta Tables in Amazon S3](/api/workflow/destinations/delta-table). + [Delta Tables in Amazon S3](/api-reference/workflow/destinations/delta-table). For information about connecting Unstructured to Databricks Volumes instead, see - [Databricks Volumes](/api/workflow/destinations/databricks-volumes). + [Databricks Volumes](/api-reference/workflow/destinations/databricks-volumes). Send processed data from Unstructured to a Delta Table in Databricks. 
diff --git a/api/workflow/destinations/databricks-volumes.mdx b/api-reference/workflow/destinations/databricks-volumes.mdx
similarity index 91%
rename from api/workflow/destinations/databricks-volumes.mdx
rename to api-reference/workflow/destinations/databricks-volumes.mdx
index 7170de5e..50a6dc41 100644
--- a/api/workflow/destinations/databricks-volumes.mdx
+++ b/api-reference/workflow/destinations/databricks-volumes.mdx
@@ -6,7 +6,7 @@ title: Databricks Volumes
 This article covers connecting Unstructured to Databricks Volumes. For information about connecting Unstructured to Delta Tables in Databricks instead, see
- [Delta Tables in Databricks](/api/workflow/destinations/databricks-delta-table).
+ [Delta Tables in Databricks](/api-reference/workflow/destinations/databricks-delta-table).
 Send processed data from Unstructured to Databricks Volumes.
diff --git a/api/workflow/destinations/delta-table.mdx b/api-reference/workflow/destinations/delta-table.mdx
similarity index 91%
rename from api/workflow/destinations/delta-table.mdx
rename to api-reference/workflow/destinations/delta-table.mdx
index 78e7eb42..8813c1ac 100644
--- a/api/workflow/destinations/delta-table.mdx
+++ b/api-reference/workflow/destinations/delta-table.mdx
@@ -5,7 +5,7 @@ title: Delta Tables in Amazon S3
 This article covers connecting Unstructured to Delta Tables in Amazon S3. For information about connecting Unstructured to Delta Tables in Databricks instead, see
- [Delta Tables in Databricks](/api/workflow/destinations/databricks-delta-table).
+ [Delta Tables in Databricks](/api-reference/workflow/destinations/databricks-delta-table).
 Send processed data from Unstructured to a Delta Table, stored in Amazon S3.
diff --git a/api/workflow/destinations/elasticsearch.mdx b/api-reference/workflow/destinations/elasticsearch.mdx
similarity index 100%
rename from api/workflow/destinations/elasticsearch.mdx
rename to api-reference/workflow/destinations/elasticsearch.mdx
diff --git a/api/workflow/destinations/google-cloud.mdx b/api-reference/workflow/destinations/google-cloud.mdx
similarity index 100%
rename from api/workflow/destinations/google-cloud.mdx
rename to api-reference/workflow/destinations/google-cloud.mdx
diff --git a/api/workflow/destinations/kafka.mdx b/api-reference/workflow/destinations/kafka.mdx
similarity index 100%
rename from api/workflow/destinations/kafka.mdx
rename to api-reference/workflow/destinations/kafka.mdx
diff --git a/api/workflow/destinations/milvus.mdx b/api-reference/workflow/destinations/milvus.mdx
similarity index 100%
rename from api/workflow/destinations/milvus.mdx
rename to api-reference/workflow/destinations/milvus.mdx
diff --git a/api/workflow/destinations/mongodb.mdx b/api-reference/workflow/destinations/mongodb.mdx
similarity index 100%
rename from api/workflow/destinations/mongodb.mdx
rename to api-reference/workflow/destinations/mongodb.mdx
diff --git a/api/workflow/destinations/motherduck.mdx b/api-reference/workflow/destinations/motherduck.mdx
similarity index 100%
rename from api/workflow/destinations/motherduck.mdx
rename to api-reference/workflow/destinations/motherduck.mdx
diff --git a/api/workflow/destinations/neo4j.mdx b/api-reference/workflow/destinations/neo4j.mdx
similarity index 100%
rename from api/workflow/destinations/neo4j.mdx
rename to api-reference/workflow/destinations/neo4j.mdx
diff --git a/api/workflow/destinations/onedrive.mdx b/api-reference/workflow/destinations/onedrive.mdx
similarity index 100%
rename from api/workflow/destinations/onedrive.mdx
rename to api-reference/workflow/destinations/onedrive.mdx
diff --git a/api-reference/workflow/destinations/overview.mdx b/api-reference/workflow/destinations/overview.mdx
new file mode 100644
index 00000000..8219337b
--- /dev/null
+++ b/api-reference/workflow/destinations/overview.mdx
@@ -0,0 +1,42 @@
+---
+title: Overview
+---
+
+To use the [Unstructured Workflow Endpoint](/api-reference/workflow/overview) to manage destination connectors, do the following:
+
+- To get a list of available destination connectors, use the `UnstructuredClient` object's `destinations.list_destinations` function (for the Python SDK) or
+ the `GET` method to call the `/destinations` endpoint (for `curl` or Postman).. [Learn more](/api-reference/workflow/overview#list-destination-connectors).
+- To get information about a destination connector, use the `UnstructuredClient` object's `destinations.get_destination` function (for the Python SDK) or
+ the `GET` method to call the `/destinations/` endpoint (for `curl` or Postman). [Learn more](/api-reference/workflow/overview#get-a-destination-connector).
+- To create a destination connector, use the `UnstructuredClient` object's `destinations.create_destination` function (for the Python SDK) or
+ the `POST` method to call the `/destinations` endpoint (for `curl` or Postman). [Learn more](/api-reference/workflow/overview#create-a-destination-connector).
+- To update a destination connector, use the `UnstructuredClient` object's `destinations.update_destination` function (for the Python SDK) or
+ the `PUT` method to call the `/destinations/` endpoint (for `curl` or Postman). [Learn more](/api-reference/workflow/overview#update-a-destination-connector).
+- To delete a destination connector, use the `UnstructuredClient` object's `destinations.delete_destination` function (for the Python SDK) or
+ the `DELETE` method to call the `/destinations/` endpoint (for `curl` or Postman). [Learn more](/api-reference/workflow/overview#delete-a-destination-connector).
+
+To create or update a destination connector, you must also provide settings that are specific to that connector.
+For the list of specific settings, see:
+
+- [Astra DB](/api-reference/workflow/destinations/astradb) (`ASTRADB` for the Python SDK or `astradb` for `curl` or Postman)
+- [Azure AI Search](/api-reference/workflow/destinations/azure-ai-search) (`AZURE_AI_SEARCH` for the Python SDK or `azure_ai_search` for `curl` or Postman)
+- [Couchbase](/api-reference/workflow/destinations/couchbase) (`COUCHBASE` for the Python SDK or `couchbase` for `curl` or Postman)
+- [Databricks Volumes](/api-reference/workflow/destinations/databricks-volumes) (`DATABRICKS_VOLUMES` for the Python SDK or `databricks_volumes` for `curl` or Postman)
+- [Delta Tables in Amazon S3](/api-reference/workflow/destinations/delta-table) (`DELTA_TABLE` for the Python SDK or `delta_table` for `curl` or Postman)
+- [Delta Tables in Databricks](/api-reference/workflow/destinations/databricks-delta-table) (`DATABRICKS_VOLUME_DELTA_TABLES` for the Python SDK or `databricks_volume_delta_tables` for `curl` or Postman)
+- [Elasticsearch](/api-reference/workflow/destinations/elasticsearch) (`ELASTICSEARCH` for the Python SDK or `elasticsearch` for `curl` or Postman)
+- [Google Cloud Storage](/api-reference/workflow/destinations/google-cloud) (`GCS` for the Python SDK or `gcs` for `curl` or Postman)
+- [Kafka](/api-reference/workflow/destinations/kafka) (`KAFKA_CLOUD` for the Python SDK or `kafka-cloud` for `curl` or Postman)
+- [Milvus](/api-reference/workflow/destinations/milvus) (`MILVUS` for the Python SDK or `milvus` for `curl` or Postman)
+- [MongoDB](/api-reference/workflow/destinations/mongodb) (`MONGODB` for the Python SDK or `mongodb` for `curl` or Postman)
+- [MotherDuck](/api-reference/workflow/destinations/motherduck) (`MOTHERDUCK` for the Python SDK or `motherduck` for `curl` or Postman)
+- [Neo4j](/api-reference/workflow/destinations/neo4j) (`NEO4J` for the Python SDK or `neo4j` for `curl` or Postman)
+- [OneDrive](/api-reference/workflow/destinations/onedrive) (`ONEDRIVE` for the Python SDK or `onedrive` for `curl` or Postman)
+- [Pinecone](/api-reference/workflow/destinations/pinecone) (`PINECONE` for the Python SDK or `pinecone` for `curl` or Postman)
+- [PostgreSQL](/api-reference/workflow/destinations/postgresql) (`POSTGRES` for the Python SDK or `postgres` for `curl` or Postman)
+- [Qdrant](/api-reference/workflow/destinations/qdrant) (`QDRANT_CLOUD` for the Python SDK or `qdrant-cloud` for `curl` or Postman)
+- [Redis](/api-reference/workflow/destinations/redis) (`REDIS` for the Python SDK or `redis` for `curl` or Postman)
+- [Snowflake](/api-reference/workflow/destinations/snowflake) (`SNOWFLAKE` for the Python SDK or `snowflake` for `curl` or Postman)
+- [S3](/api-reference/workflow/destinations/s3) (`S3` for the Python SDK or `s3` for `curl` or Postman)
+- [Weaviate](/api-reference/workflow/destinations/weaviate) (`WEAVIATE` for the Python SDK or `weaviate` for `curl` or Postman)
+
diff --git a/api/workflow/destinations/pinecone.mdx b/api-reference/workflow/destinations/pinecone.mdx
similarity index 100%
rename from api/workflow/destinations/pinecone.mdx
rename to api-reference/workflow/destinations/pinecone.mdx
diff --git a/api/workflow/destinations/postgresql.mdx b/api-reference/workflow/destinations/postgresql.mdx
similarity index 100%
rename from api/workflow/destinations/postgresql.mdx
rename to api-reference/workflow/destinations/postgresql.mdx
diff --git a/api/workflow/destinations/qdrant.mdx b/api-reference/workflow/destinations/qdrant.mdx
similarity index 100%
rename from api/workflow/destinations/qdrant.mdx
rename to api-reference/workflow/destinations/qdrant.mdx
diff --git a/api/workflow/destinations/redis.mdx b/api-reference/workflow/destinations/redis.mdx
similarity index 100%
rename from api/workflow/destinations/redis.mdx
rename to api-reference/workflow/destinations/redis.mdx
diff --git a/api/workflow/destinations/s3.mdx b/api-reference/workflow/destinations/s3.mdx
similarity index 100%
rename from api/workflow/destinations/s3.mdx
rename to api-reference/workflow/destinations/s3.mdx
diff --git a/api/workflow/destinations/snowflake.mdx b/api-reference/workflow/destinations/snowflake.mdx
similarity index 100%
rename from api/workflow/destinations/snowflake.mdx
rename to api-reference/workflow/destinations/snowflake.mdx
diff --git a/api/workflow/destinations/weaviate.mdx b/api-reference/workflow/destinations/weaviate.mdx
similarity index 100%
rename from api/workflow/destinations/weaviate.mdx
rename to api-reference/workflow/destinations/weaviate.mdx
diff --git a/api/workflow/jobs.mdx b/api-reference/workflow/jobs.mdx
similarity index 58%
rename from api/workflow/jobs.mdx
rename to api-reference/workflow/jobs.mdx
index 5ddd8fdc..423cb184 100644
--- a/api/workflow/jobs.mdx
+++ b/api-reference/workflow/jobs.mdx
@@ -2,13 +2,13 @@
 title: Jobs
 ---
-To use the [Unstructured Workflow Endpoint](/api/workflow/overview) to manage jobs, do the following:
+To use the [Unstructured Workflow Endpoint](/api-reference/workflow/overview) to manage jobs, do the following:
 - To get a list of available jobs, use the `UnstructuredClient` object's `jobs.list_jobs` function (for the Python SDK) or
- the `GET` method to call the `/jobs` endpoint (for `curl` or Postman). [Learn more](/api/workflow/overview#list-jobs).
+ the `GET` method to call the `/jobs` endpoint (for `curl` or Postman). [Learn more](/api-reference/workflow/overview#list-jobs).
 - To get information about a job, use the `UnstructuredClient` object's `jobs.get_job` function (for the Python SDK) or
- the `GET` method to call the `/jobs/` endpoint (for `curl` or Postman). [Learn more](/api/workflow/overview#get-a-job).
-- A job is created automatically whenever a workflow runs on a schedule; see [Create a workflow](/api/workflow/workflows#create-a-workflow).
- A job is also created whenever you run a workflow manually; see [Run a workflow](/api/workflow/overview#run-a-workflow). + the `GET` method to call the `/jobs/` endpoint (for `curl` or Postman). [Learn more](/api-reference/workflow/overview#get-a-job). +- A job is created automatically whenever a workflow runs on a schedule; see [Create a workflow](/api-reference/workflow/workflows#create-a-workflow). + A job is also created whenever you run a workflow manually; see [Run a workflow](/api-reference/workflow/overview#run-a-workflow). - To cancel a running job, use the `UnstructuredClient` object's `jobs.cancel_job` function (for the Python SDK) or - the `POST` method to call the `/jobs//cancel` endpoint (for `curl` or Postman). [Learn more](/api/workflow/overview#cancel-a-job). \ No newline at end of file + the `POST` method to call the `/jobs//cancel` endpoint (for `curl` or Postman). [Learn more](/api-reference/workflow/overview#cancel-a-job). \ No newline at end of file diff --git a/api/workflow/overview.mdx b/api-reference/workflow/overview.mdx similarity index 98% rename from api/workflow/overview.mdx rename to api-reference/workflow/overview.mdx index decde5ef..f74e4610 100644 --- a/api/workflow/overview.mdx +++ b/api-reference/workflow/overview.mdx @@ -5,7 +5,7 @@ title: Overview The [Unstructured UI](/ui/overview) features a no-code user interface for transforming your unstructured data into data that is ready for Retrieval Augmented Generation (RAG). -The Unstructured Workflow Endpoint, part of the [Unstructured API](/api/overview), enables a full range of partitioning, chunking, embedding, and +The Unstructured Workflow Endpoint, part of the [Unstructured API](/api-reference/overview), enables a full range of partitioning, chunking, embedding, and enrichment options for your files and data. 
It is designed to batch-process files and data in remote locations; send processed results to various storage, databases, and vector stores; and use the latest and highest-performing models on the market today. It has built-in logic to deliver the highest quality results at the lowest cost. @@ -190,8 +190,8 @@ Skip ahead to start learning about how to use the REST endpoints to work with The following Unstructured SDKs, tools, and libraries do _not_ work with the Unstructured Workflow Endpoint: -- The [Unstructured JavaScript/TypeScript SDK](/api/partition/sdk-jsts) -- [Local single-file POST requests](/api/partition/sdk-jsts) to the Unstructured Partition Endpoint +- The [Unstructured JavaScript/TypeScript SDK](/api-reference/partition/sdk-jsts) +- [Local single-file POST requests](/api-reference/partition/sdk-jsts) to the Unstructured Partition Endpoint - The [Unstructured open source Python library](/open-source/introduction/overview) - The [Unstructued Ingest CLI](/ingestion/ingest-cli) - The [Unstructured Ingest Python library](/ingestion/python-ingest) @@ -222,7 +222,7 @@ To filter the list of source connectors, use the `ListSourcesRequest` object's ` or the query parameter `source_type=` (for `curl` or Postman), replacing `` with the source connector type's unique ID (for example, for the Amazon S3 source connector type, `S3` for the Python SDK or `s3` for `curl` or Postman). -To get this ID, see [Sources](/api/workflow/sources/overview). +To get this ID, see [Sources](/api-reference/workflow/sources/overview). @@ -418,10 +418,10 @@ the `POST` method to call the `/sources` endpoint (for `curl` or Postman). In the `CreateSourceConnector` object (for the Python SDK) or the request body (for `curl` or Postman), specify the settings for the connector. For the specific settings to include, which differ by connector, see -[Sources](/api/workflow/sources/overview). +[Sources](/api-reference/workflow/sources/overview). 
For the Python SDK, replace `` with the source connector type's unique ID (for example, for the Amazon S3 source connector type, `S3`). -To get this ID, see [Sources](/api/workflow/sources/overview). +To get this ID, see [Sources](/api-reference/workflow/sources/overview). @@ -547,10 +547,10 @@ the `PUT` method to call the `/sources/` endpoint (for `curl` or P In the `UpdateSourceConnector` object (for the Python SDK) or the request body (for `curl` or Postman), specify the settings for the connector. For the specific settings to include, which differ by connector, see -[Sources](/api/workflow/sources/overview). +[Sources](/api-reference/workflow/sources/overview). For the Python SDK, replace `` with the source connector type's unique ID (for example, for the Amazon S3 source connector type, `S3`). -To get this ID, see [Sources](/api/workflow/sources/overview). +To get this ID, see [Sources](/api-reference/workflow/sources/overview). You must specify all of the settings for the connector, even for settings that are not changing. @@ -753,7 +753,7 @@ To filter the list of destination connectors, use the `ListDestinationsRequest` the query parameter `destination_type=` (for `curl` or Postman), replacing `` with the destination connector type's unique ID (for example, for the Amazon S3 source connector type, `S3` for the Python SDK or `s3` for `curl` or Postman). -To get this ID, see [Destinations](/api/workflow/destinations/overview). +To get this ID, see [Destinations](/api-reference/workflow/destinations/overview). @@ -948,10 +948,10 @@ the `POST` method to call the `/destinations` endpoint (for `curl` or Postman). In the `CreateDestinationConnector` object (for the Python SDK) or the request body (for `curl` or Postman), specify the settings for the connector. For the specific settings to include, which differ by connector, see -[Destinations](/api/workflow/destinations/overview). +[Destinations](/api-reference/workflow/destinations/overview). 
For the Python SDK, replace `` with the destination connector type's unique ID (for example, for the Amazon S3 source connector type, `S3`). -To get this ID, see [Destinations](/api/workflow/destinations/overview). +To get this ID, see [Destinations](/api-reference/workflow/destinations/overview). @@ -1076,7 +1076,7 @@ the `PUT` method to call the `/destinations/` endpoint (for `curl` In the `UpdateDestinationConnector` object (for the Python SDK) or the request body (for `curl` or Postman), specify the settings for the connector. For the specific settings to include, which differ by connector, see -[Destinations](/api/workflow/destinations/overview). +[Destinations](/api-reference/workflow/destinations/overview). You must specify all of the settings for the connector, even for settings that are not changing. @@ -1541,7 +1541,7 @@ the `POST` method to call the `/workflows` endpoint (for `curl` or Postman). In the `CreateWorkflow` object (for the Python SDK) or the request body (for `curl` or Postman), specify the settings for the workflow. For the specific settings to include, see -[Create a workflow](/api/workflow/workflows#create-a-workflow). +[Create a workflow](/api-reference/workflow/workflows#create-a-workflow). @@ -1757,7 +1757,7 @@ the `POST` method to call the `/workflows//run` endpoint (for `curl To run a workflow on a schedule instead, specify the `schedule` setting in the request body when you create or update a -workflow. See [Create a workflow](/api/workflow/workflows#create-a-workflow) or [Update a workflow](/api/workflow/workflows#update-a-workflow). +workflow. See [Create a workflow](/api-reference/workflow/workflows#create-a-workflow) or [Update a workflow](/api-reference/workflow/workflows#update-a-workflow). 
### Update a workflow @@ -1767,7 +1767,7 @@ the `PUT` method to call the `/workflows/` endpoint (for `curl` or In `UpdateWorkflow` object (for the Python SDK) or the request body (for `curl` or Postman), specify the settings for the workflow. For the specific settings to include, see -[Update a workflow](/api/workflow/workflows#update-a-workflow). +[Update a workflow](/api-reference/workflow/workflows#update-a-workflow). diff --git a/api/workflow/sources/azure-blob-storage.mdx b/api-reference/workflow/sources/azure-blob-storage.mdx similarity index 100% rename from api/workflow/sources/azure-blob-storage.mdx rename to api-reference/workflow/sources/azure-blob-storage.mdx diff --git a/api/workflow/sources/box.mdx b/api-reference/workflow/sources/box.mdx similarity index 100% rename from api/workflow/sources/box.mdx rename to api-reference/workflow/sources/box.mdx diff --git a/api/workflow/sources/confluence.mdx b/api-reference/workflow/sources/confluence.mdx similarity index 100% rename from api/workflow/sources/confluence.mdx rename to api-reference/workflow/sources/confluence.mdx diff --git a/api/workflow/sources/couchbase.mdx b/api-reference/workflow/sources/couchbase.mdx similarity index 100% rename from api/workflow/sources/couchbase.mdx rename to api-reference/workflow/sources/couchbase.mdx diff --git a/api/workflow/sources/databricks-volumes.mdx b/api-reference/workflow/sources/databricks-volumes.mdx similarity index 100% rename from api/workflow/sources/databricks-volumes.mdx rename to api-reference/workflow/sources/databricks-volumes.mdx diff --git a/api/workflow/sources/dropbox.mdx b/api-reference/workflow/sources/dropbox.mdx similarity index 100% rename from api/workflow/sources/dropbox.mdx rename to api-reference/workflow/sources/dropbox.mdx diff --git a/api/workflow/sources/elasticsearch.mdx b/api-reference/workflow/sources/elasticsearch.mdx similarity index 100% rename from api/workflow/sources/elasticsearch.mdx rename to 
api-reference/workflow/sources/elasticsearch.mdx diff --git a/api/workflow/sources/google-cloud.mdx b/api-reference/workflow/sources/google-cloud.mdx similarity index 100% rename from api/workflow/sources/google-cloud.mdx rename to api-reference/workflow/sources/google-cloud.mdx diff --git a/api/workflow/sources/google-drive.mdx b/api-reference/workflow/sources/google-drive.mdx similarity index 100% rename from api/workflow/sources/google-drive.mdx rename to api-reference/workflow/sources/google-drive.mdx diff --git a/api/workflow/sources/kafka.mdx b/api-reference/workflow/sources/kafka.mdx similarity index 100% rename from api/workflow/sources/kafka.mdx rename to api-reference/workflow/sources/kafka.mdx diff --git a/api/workflow/sources/mongodb.mdx b/api-reference/workflow/sources/mongodb.mdx similarity index 100% rename from api/workflow/sources/mongodb.mdx rename to api-reference/workflow/sources/mongodb.mdx diff --git a/api/workflow/sources/onedrive.mdx b/api-reference/workflow/sources/onedrive.mdx similarity index 100% rename from api/workflow/sources/onedrive.mdx rename to api-reference/workflow/sources/onedrive.mdx diff --git a/api/workflow/sources/outlook.mdx b/api-reference/workflow/sources/outlook.mdx similarity index 100% rename from api/workflow/sources/outlook.mdx rename to api-reference/workflow/sources/outlook.mdx diff --git a/api-reference/workflow/sources/overview.mdx b/api-reference/workflow/sources/overview.mdx new file mode 100644 index 00000000..87646c8b --- /dev/null +++ b/api-reference/workflow/sources/overview.mdx @@ -0,0 +1,40 @@ +--- +title: Overview +--- + +To use the [Unstructured Workflow Endpoint](/api-reference/workflow/overview) to manage source connectors, do the following: + +- To get a list of available source connectors, use the `UnstructuredClient` object's `sources.list_sources` function (for the Python SDK) or + the `GET` method to call the `/sources` endpoint (for `curl` or Postman). 
[Learn more](/api-reference/workflow/overview#list-source-connectors). +- To get information about a source connector, use the `UnstructuredClient` object's `sources.get_source` function (for the Python SDK) or + the `GET` method to call the `/sources/` endpoint (for `curl` or Postman). [Learn more](/api-reference/workflow/overview#get-a-source-connector). +- To create a source connector, use the `UnstructuredClient` object's `sources.create_source` function (for the Python SDK) or + the `POST` method to call the `/sources` endpoint (for `curl` or Postman). [Learn more](/api-reference/workflow/overview#create-a-source-connector). +- To update a source connector, use the `UnstructuredClient` object's `sources.update_source` function (for the Python SDK) or + the `PUT` method to call the `/sources/` endpoint (for `curl` or Postman). [Learn more](/api-reference/workflow/overview#update-a-source-connector). +- To delete a source connector, use the `UnstructuredClient` object's `sources.delete_source` function (for the Python SDK) or + the `DELETE` method to call the `/sources/` endpoint (for `curl` or Postman). [Learn more](/api-reference/workflow/overview#delete-a-source-connector). + +To create or update a source connector, you must also provide settings that are specific to that connector. 
+For the list of specific settings, see: + +- [Azure](/api-reference/workflow/sources/azure-blob-storage) (`AZURE` for the Python SDK or `azure` for `curl` and Postman) +- [Box](/api-reference/workflow/sources/box) (`BOX` for the Python SDK or `box` for `curl` and Postman) +- [Confluence](/api-reference/workflow/sources/confluence) (`CONFLUENCE` for the Python SDK or `confluence` for `curl` and Postman) +- [Couchbase](/api-reference/workflow/sources/couchbase) (`COUCHBASE` for the Python SDK or `couchbase` for `curl` and Postman) +- [Databricks Volumes](/api-reference/workflow/sources/databricks-volumes) (`DATABRICKS_VOLUMES` for the Python SDK or `databricks_volumes` for `curl` and Postman) +- [Dropbox](/api-reference/workflow/sources/dropbox) (`DROPBOX` for the Python SDK or `dropbox` for `curl` and Postman) +- [Elasticsearch](/api-reference/workflow/sources/elasticsearch) (`ELASTICSEARCH` for the Python SDK or `elasticsearch` for `curl` and Postman) +- [Google Cloud Storage](/api-reference/workflow/sources/google-cloud) (`GCS` for the Python SDK or `gcs` for `curl` and Postman) +- [Google Drive](/api-reference/workflow/sources/google-drive) (`GOOGLE_DRIVE` for the Python SDK or `google_drive` for `curl` and Postman) +- [Kafka](/api-reference/workflow/sources/kafka) (`KAFKA_CLOUD` for the Python SDK or `kafka-cloud` for `curl` and Postman) +- [MongoDB](/api-reference/workflow/sources/mongodb) (`MONGODB` for the Python SDK or `mongodb` for `curl` and Postman) +- [OneDrive](/api-reference/workflow/sources/onedrive) (`ONEDRIVE` for the Python SDK or `onedrive` for `curl` and Postman) +- [Outlook](/api-reference/workflow/sources/outlook) (`OUTLOOK` for the Python SDK or `outlook` for `curl` and Postman) +- [PostgreSQL](/api-reference/workflow/sources/postgresql) (`POSTGRES` for the Python SDK or `postgres` for `curl` and Postman) +- [S3](/api-reference/workflow/sources/s3) (`S3` for the Python SDK or `s3` for `curl` and Postman) +- 
[Salesforce](/api-reference/workflow/sources/salesforce) (`SALESFORCE` for the Python SDK or `salesforce` for `curl` and Postman) +- [SharePoint](/api-reference/workflow/sources/sharepoint) (`SHAREPOINT` for the Python SDK or `sharepoint` for `curl` and Postman) +- [Snowflake](/api-reference/workflow/sources/snowflake) (`SNOWFLAKE` for the Python SDK or `snowflake` for `curl` and Postman) + + diff --git a/api/workflow/sources/postgresql.mdx b/api-reference/workflow/sources/postgresql.mdx similarity index 100% rename from api/workflow/sources/postgresql.mdx rename to api-reference/workflow/sources/postgresql.mdx diff --git a/api/workflow/sources/s3.mdx b/api-reference/workflow/sources/s3.mdx similarity index 100% rename from api/workflow/sources/s3.mdx rename to api-reference/workflow/sources/s3.mdx diff --git a/api/workflow/sources/salesforce.mdx b/api-reference/workflow/sources/salesforce.mdx similarity index 100% rename from api/workflow/sources/salesforce.mdx rename to api-reference/workflow/sources/salesforce.mdx diff --git a/api/workflow/sources/sharepoint.mdx b/api-reference/workflow/sources/sharepoint.mdx similarity index 100% rename from api/workflow/sources/sharepoint.mdx rename to api-reference/workflow/sources/sharepoint.mdx diff --git a/api/workflow/sources/snowflake.mdx b/api-reference/workflow/sources/snowflake.mdx similarity index 100% rename from api/workflow/sources/snowflake.mdx rename to api-reference/workflow/sources/snowflake.mdx diff --git a/api/workflow/workflows.mdx b/api-reference/workflow/workflows.mdx similarity index 97% rename from api/workflow/workflows.mdx rename to api-reference/workflow/workflows.mdx index cd9bd1e5..381e0507 100644 --- a/api/workflow/workflows.mdx +++ b/api-reference/workflow/workflows.mdx @@ -2,23 +2,23 @@ title: Workflows --- -To use the [Unstructured Workflow Endpoint](/api/workflow/overview) to manage workflows, do the following: +To use the [Unstructured Workflow Endpoint](/api-reference/workflow/overview) to 
manage workflows, do the following: - To get a list of available workflows, use the `UnstructuredClient` object's `workflows.list_workflows` function (for the Python SDK) or - the `GET` method to call the `/workflows` endpoint (for `curl` or Postman). [Learn more](/api/workflow/overview#list-workflows). + the `GET` method to call the `/workflows` endpoint (for `curl` or Postman). [Learn more](/api-reference/workflow/overview#list-workflows). - To get information about a workflow, use the `UnstructuredClient` object's `workflows.get_workflow` function (for the Python SDK) or - the `GET` method to call the `/workflows/` endpoint (for `curl` or Postman)use the `GET` method to call the `/workflows/` endpoint. [Learn more](/api/workflow/overview#get-a-workflow). + the `GET` method to call the `/workflows/` endpoint (for `curl` or Postman). [Learn more](/api-reference/workflow/overview#get-a-workflow). - To create a workflow, use the `UnstructuredClient` object's `workflows.create_workflow` function (for the Python SDK) or the `POST` method to call the `/workflows` endpoint (for `curl` or Postman). [Learn more](#create-a-workflow). - To run a workflow manually, use the `UnstructuredClient` object's `workflows.run_workflow` function (for the Python SDK) or - the `POST` method to call the `/workflows//run` endpoint (for `curl` or Postman). [Learn more](/api/workflow/overview#run-a-workflow). + the `POST` method to call the `/workflows//run` endpoint (for `curl` or Postman). [Learn more](/api-reference/workflow/overview#run-a-workflow). - To update a workflow, use the `UnstructuredClient` object's `workflows.update_workflow` function (for the Python SDK) or the `PUT` method to call the `/workflows/` endpoint (for `curl` or Postman). [Learn more](#update-a-workflow).
- To delete a workflow, use the `UnstructuredClient` object's `workflows.delete_workflow` function (for the Python SDK) or - the `DELETE` method to call the `/workflows/` endpoint (for `curl` or Postman). [Learn more](/api/workflow/overview#delete-a-workflow). + the `DELETE` method to call the `/workflows/` endpoint (for `curl` or Postman). [Learn more](/api-reference/workflow/overview#delete-a-workflow). -The following examples assume that you have already met the [requirements](/api/workflow/overview#requirements) and -understand the [basics](/api/workflow/overview#basics) of working with the Unstructured Workflow Endpoint. +The following examples assume that you have already met the [requirements](/api-reference/workflow/overview#requirements) and +understand the [basics](/api-reference/workflow/overview#basics) of working with the Unstructured Workflow Endpoint. ## Create a workflow @@ -269,10 +269,10 @@ Replace the preceding placeholders as follows: - `` (_required_) - A unique name for this workflow. - `` (_required_) - The ID of the target source connector. To get the ID, use the `UnstructuredClient` object's `sources.list_sources` function (for the Python SDK) or - the `GET` method to call the `/sources` endpoint (for `curl` or Postman). [Learn more](/api/workflow/overview#list-source-connectors). + the `GET` method to call the `/sources` endpoint (for `curl` or Postman). [Learn more](/api-reference/workflow/overview#list-source-connectors). - `` (_required_) - The ID of the target destination connector. To get the ID, use the `UnstructuredClient` object's `destinations.list_destinations` function (for the Python SDK) or - the `GET` method to call the `/destinations` endpoint (for `curl` or Postman). [Learn more](/api/workflow/overview#list-destination-connectors). + the `GET` method to call the `/destinations` endpoint (for `curl` or Postman). [Learn more](/api-reference/workflow/overview#list-destination-connectors). 
- `` (for the Python SDK) or `` (for `curl` or Postman) (_required_) - The workflow type. Available values include `CUSTOM` (for the Python SDK) and `custom` (for `curl` or Postman). If `` is set to `CUSTOM` (for the Python SDK), or if `` is set to `custom` (for `curl` or Postman), you must add a `workflow_nodes` array. For instructions, see [Custom workflow DAG nodes](#custom-workflow-dag-nodes). @@ -307,7 +307,7 @@ the `PUT` method to call the `/workflows/` endpoint (for `curl` or `` with the workflow's unique ID. To get this ID, see [List workflows](#list-workflows). In the request body, specify the settings for the workflow. For the specific settings to include, see -[Create a workflow](/api/workflow/workflows#create-a-workflow). +[Create a workflow](/api-reference/workflow/workflows#create-a-workflow). diff --git a/api/workflow/destinations/overview.mdx b/api/workflow/destinations/overview.mdx deleted file mode 100644 index 64d296a0..00000000 --- a/api/workflow/destinations/overview.mdx +++ /dev/null @@ -1,42 +0,0 @@ ---- -title: Overview ---- - -To use the [Unstructured Workflow Endpoint](/api/workflow/overview) to manage destination connectors, do the following: - -- To get a list of available destination connectors, use the `UnstructuredClient` object's `destinations.list_destinations` function (for the Python SDK) or - the `GET` method to call the `/destinations` endpoint (for `curl` or Postman).. [Learn more](/api/workflow/overview#list-destination-connectors). -- To get information about a destination connector, use the `UnstructuredClient` object's `destinations.get_destination` function (for the Python SDK) or - the `GET` method to call the `/destinations/` endpoint (for `curl` or Postman). [Learn more](/api/workflow/overview#get-a-destination-connector). 
-- To create a destination connector, use the `UnstructuredClient` object's `destinations.create_destination` function (for the Python SDK) or - the `POST` method to call the `/destinations` endpoint (for `curl` or Postman). [Learn more](/api/workflow/overview#create-a-destination-connector). -- To update a destination connector, use the `UnstructuredClient` object's `destinations.update_destination` function (for the Python SDK) or - the `PUT` method to call the `/destinations/` endpoint (for `curl` or Postman). [Learn more](/api/workflow/overview#update-a-destination-connector). -- To delete a destination connector, use the `UnstructuredClient` object's `destinations.delete_destination` function (for the Python SDK) or - the `DELETE` method to call the `/destinations/` endpoint (for `curl` or Postman). [Learn more](/api/workflow/overview#delete-a-destination-connector). - -To create or update a destination connector, you must also provide settings that are specific to that connector. -For the list of specific settings, see: - -- [Astra DB](/api/workflow/destinations/astradb) (`ASTRADB` for the Python SDK or `astradb` for `curl` or Postman) -- [Azure AI Search](/api/workflow/destinations/azure-ai-search) (`AZURE_AI_SEARCH` for the Python SDK or `azure_ai_search` for `curl` or Postman) -- [Couchbase](/api/workflow/destinations/couchbase) (`COUCHBASE` for the Python SDK or `couchbase` for `curl` or Postman) -- [Databricks Volumes](/api/workflow/destinations/databricks-volumes) (`DATABRICKS_VOLUMES` for the Python SDK or `databricks_volumes` for `curl` or Postman) -- [Delta Tables in Amazon S3](/api/workflow/destinations/delta-table) (`DELTA_TABLE` for the Python SDK or `delta_table` for `curl` or Postman) -- [Delta Tables in Databricks](/api/workflow/destinations/databricks-delta-table) (`DATABRICKS_VOLUME_DELTA_TABLES` for the Python SDK or `databricks_volume_delta_tables` for `curl` or Postman) -- [Elasticsearch](/api/workflow/destinations/elasticsearch) 
(`ELASTICSEARCH` for the Python SDK or `elasticsearch` for `curl` or Postman) -- [Google Cloud Storage](/api/workflow/destinations/google-cloud) (`GCS` for the Python SDK or `gcs` for `curl` or Postman) -- [Kafka](/api/workflow/destinations/kafka) (`KAFKA_CLOUD` for the Python SDK or `kafka-cloud` for `curl` or Postman) -- [Milvus](/api/workflow/destinations/milvus) (`MILVUS` for the Python SDK or `milvus` for `curl` or Postman) -- [MongoDB](/api/workflow/destinations/mongodb) (`MONGODB` for the Python SDK or `mongodb` for `curl` or Postman) -- [MotherDuck](/api/workflow/destinations/motherduck) (`MOTHERDUCK` for the Python SDK or `motherduck` for `curl` or Postman) -- [Neo4j](/api/workflow/destinations/neo4j) (`NEO4J` for the Python SDK or `neo4j` for `curl` or Postman) -- [OneDrive](/api/workflow/destinations/onedrive) (`ONEDRIVE` for the Python SDK or `onedrive` for `curl` or Postman) -- [Pinecone](/api/workflow/destinations/pinecone) (`PINECONE` for the Python SDK or `pinecone` for `curl` or Postman) -- [PostgreSQL](/api/workflow/destinations/postgresql) (`POSTGRES` for the Python SDK or `postgres` for `curl` or Postman) -- [Qdrant](/api/workflow/destinations/qdrant) (`QDRANT_CLOUD` for the Python SDK or `qdrant-cloud` for `curl` or Postman) -- [Redis](/api/workflow/destinations/redis) (`REDIS` for the Python SDK or `redis` for `curl` or Postman) -- [Snowflake](/api/workflow/destinations/snowflake) (`SNOWFLAKE` for the Python SDK or `snowflake` for `curl` or Postman) -- [S3](/api/workflow/destinations/s3) (`S3` for the Python SDK or `s3` for `curl` or Postman) -- [Weaviate](/api/workflow/destinations/weaviate) (`WEAVIATE` for the Python SDK or `weaviate` for `curl` or Postman) - diff --git a/api/workflow/sources/overview.mdx b/api/workflow/sources/overview.mdx deleted file mode 100644 index bbee8b47..00000000 --- a/api/workflow/sources/overview.mdx +++ /dev/null @@ -1,40 +0,0 @@ ---- -title: Overview ---- - -To use the [Unstructured Workflow 
Endpoint](/api/workflow/overview) to manage source connectors, do the following: - -- To get a list of available source connectors, use the `UnstructuredClient` object's `sources.list_sources` function (for the Python SDK) or - the `GET` method to call the `/sources` endpoint (for `curl` or Postman). [Learn more](/api/workflow/overview#list-source-connectors). -- To get information about a source connector, use the `UnstructuredClient` object's `sources.get_source` function (for the Python SDK) or - the `GET` method to call the `/sources/` endpoint (for `curl` or Postman). [Learn more](/api/workflow/overview#get-a-source-connector). -- To create a source connector, use the `UnstructuredClient` object's `sources.create_source` function (for the Python SDK) or - the `POST` method to call the `/sources` endpoint (for `curl` or Postman). [Learn more](/api/workflow/overview#create-a-source-connector). -- To update a source connector, use the `UnstructuredClient` object's `sources.update_source` function (for the Python SDK) or - the `PUT` method to call the `/sources/` endpoint (for `curl` or Postman). [Learn more](/api/workflow/overview#update-a-source-connector). -- To delete a source connector, use the `UnstructuredClient` object's `sources.delete_source` function (for the Python SDK) or - the `DELETE` method to call the `/sources/` endpoint (for `curl` or Postman). [Learn more](/api/workflow/overview#delete-a-source-connector). - -To create or update a source connector, you must also provide settings that are specific to that connector. 
-For the list of specific settings, see: - -- [Azure](/api/workflow/sources/azure-blob-storage) (`AZURE` for the Python SDK or `azure` for `curl` and Postman) -- [Box](/api/workflow/sources/box) (`BOX` for the Python SDK or `box` for `curl` and Postman) -- [Confluence](/api/workflow/sources/confluence) (`CONFLUENCE` for the Python SDK or `confluence` for `curl` and Postman) -- [Couchbase](/api/workflow/sources/couchbase) (`COUCHBASE` for the Python SDK or `couchbase` for `curl` and Postman) -- [Databricks Volumes](/api/workflow/sources/databricks-volumes) (`DATABRICKS_VOLUMES` for the Python SDK or `databricks_volumes` for `curl` and Postman) -- [Dropbox](/api/workflow/sources/dropbox) (`DROPBOX` for the Python SDK or `dropbox` for `curl` and Postman) -- [Elasticsearch](/api/workflow/sources/elasticsearch) (`ELASTICSEARCH` for the Python SDK or `elasticsearch` for `curl` and Postman) -- [Google Cloud Storage](/api/workflow/sources/google-cloud) (`GCS` for the Python SDK or `gcs` for `curl` and Postman) -- [Google Drive](/api/workflow/sources/google-drive) (`GOOGLE_DRIVE` for the Python SDK or `google_drive` for `curl` and Postman) -- [Kafka](/api/workflow/sources/kafka) (`KAFKA_CLOUD` for the Python SDK or `kafka-cloud` for `curl` and Postman) -- [MongoDB](/api/workflow/sources/mongodb) (`MONGODB` for the Python SDK or `mongodb` for `curl` and Postman) -- [OneDrive](/api/workflow/sources/onedrive) (`ONEDRIVE` for the Python SDK or `onedrive` for `curl` and Postman) -- [Outlook](/api/workflow/sources/outlook) (`OUTLOOK` for the Python SDK or `outlook` for `curl` and Postman) -- [PostgreSQL](/api/workflow/sources/postgresql) (`POSTGRES` for the Python SDK or `postgres` for `curl` and Postman) -- [S3](/api/workflow/sources/s3) (`S3` for the Python SDK or `s3` for `curl` and Postman) -- [Salesforce](/api/workflow/sources/salesforce) (`SALESFORCE` for the Python SDK or `salesforce` for `curl` and Postman) -- [SharePoint](/api/workflow/sources/sharepoint) (`SHAREPOINT` 
for the Python SDK or `sharepoint` for `curl` and Postman) -- [Snowflake](/api/workflow/sources/snowflake) (`SNOWFLAKE` for the Python SDK or `snowflake` for `curl` and Postman) - - diff --git a/examplecode/codesamples/api/huggingchat.mdx b/examplecode/codesamples/api/huggingchat.mdx index f4eef381..050cd2d7 100644 --- a/examplecode/codesamples/api/huggingchat.mdx +++ b/examplecode/codesamples/api/huggingchat.mdx @@ -3,15 +3,15 @@ title: Query processed PDF with HuggingChat --- This example uses the [Unstructured Ingest Python library](/ingestion/python-ingest) or the -[Unstructured JavaScript/TypeScript SDK](/api/partition/sdk-jsts) to send a PDF file to -the [Unstructured Partition Endpoint](/api/partition/overview) for processing. Unstructured processes the PDF and extracts the PDF's content. +[Unstructured JavaScript/TypeScript SDK](/api-reference/partition/sdk-jsts) to send a PDF file to +the [Unstructured Partition Endpoint](/api-reference/partition/overview) for processing. Unstructured processes the PDF and extracts the PDF's content. This example then sends some of the content to [HuggingChat](https://huggingface.co/chat/), Hugging Face's open-source AI chatbot, along with some queries about this content. To run this example, you'll need: - The [hugchat](https://pypi.org/project/hugchat/) package for Python, or the [huggingface-chat](https://www.npmjs.com/package/huggingface-chat) package for JavaScript/TypeScript. -- Your Unstructured API key and API URL. [Get an API key and API URL](/api/partition/overview). +- Your Unstructured API key and API URL. [Get an API key and API URL](/api-reference/partition/overview). - Your Hugging Face account's email address and account password. [Get an account](https://huggingface.co/join). - A PDF file for Unstructured to process. 
This example uses a sample PDF file containing the text of the United States Constitution, available for download from [https://constitutioncenter.org/media/files/constitution.pdf](https://constitutioncenter.org/media/files/constitution.pdf). diff --git a/examplecode/codesamples/apioss/table-extraction-from-pdf.mdx b/examplecode/codesamples/apioss/table-extraction-from-pdf.mdx index 88525226..66f03255 100644 --- a/examplecode/codesamples/apioss/table-extraction-from-pdf.mdx +++ b/examplecode/codesamples/apioss/table-extraction-from-pdf.mdx @@ -4,7 +4,7 @@ description: This section describes two methods for extracting tables from PDF f --- -This sample code utilizes the [Unstructured Open Source](/open-source/introduction/overview "Open Source") library and also provides an alternative method the utilizing the [Unstructured Partition Endpoint](/api/partition/overview). +This sample code utilizes the [Unstructured Open Source](/open-source/introduction/overview "Open Source") library and also provides an alternative method utilizing the [Unstructured Partition Endpoint](/api-reference/partition/overview). ## Method 1: Using partition\_pdf @@ -33,7 +33,7 @@ print(tables[0].metadata.text_as_html) ## Method 2: Using Auto Partition or Unstructured API -By default, table extraction from all file types is enabled. To extract tables from PDFs and images using [Auto Partition](/open-source/core-functionality/partitioning#partition) or [Unstructured API parameters](/api/partition/api-parameters) simply set `strategy` parameter to `hi_res`. +By default, table extraction from all file types is enabled. To extract tables from PDFs and images using [Auto Partition](/open-source/core-functionality/partitioning#partition) or [Unstructured API parameters](/api-reference/partition/api-parameters), simply set the `strategy` parameter to `hi_res`.
**Usage: Auto Partition** diff --git a/examplecode/codesamples/oss/multi-files-api-processing.mdx b/examplecode/codesamples/oss/multi-files-api-processing.mdx index 83b52c6a..5ead525d 100644 --- a/examplecode/codesamples/oss/multi-files-api-processing.mdx +++ b/examplecode/codesamples/oss/multi-files-api-processing.mdx @@ -2,7 +2,7 @@ title: Multi-file API processing --- -This sample code utilizes the [Unstructured Partition Endpoint](/api/partition/overview). +This sample code utilizes the [Unstructured Partition Endpoint](/api-reference/partition/overview). ## Introduction diff --git a/examplecode/tools/langflow.mdx b/examplecode/tools/langflow.mdx index 0b921b10..5b41f5e1 100644 --- a/examplecode/tools/langflow.mdx +++ b/examplecode/tools/langflow.mdx @@ -21,7 +21,7 @@ Also: - [Sign up for an OpenAI account](https://platform.openai.com/signup), and [get your OpenAI API key](https://help.openai.com/en/articles/4936850-where-do-i-find-my-openai-api-key). - [Sign up for a free Langflow account](https://astra.datastax.com/signup?type=langflow). -- [Get your Unstructured Partition Endpoint key](/api/partition/overview). +- [Get your Unstructured Partition Endpoint key](/api-reference/partition/overview). ## Create and run the demonstration project @@ -233,7 +233,7 @@ such as processing multiple files or using a different vector store. In this demonstration, you pass to Unstructured a single local file. To pass multiple local or non-local files to Unstructured instead, you can use the -[Unstructured UI](/ui/overview) or the [Unstructured API](/api/overview) or +[Unstructured UI](/ui/overview) or the [Unstructured API](/api-reference/overview) or [Unstructured Ingest](/ingestion/overview) outside of Langflow. 
To do this, you can: diff --git a/ingestion/how-to/extract-image-block-types.mdx b/ingestion/how-to/extract-image-block-types.mdx index df700d97..f0c7247f 100644 --- a/ingestion/how-to/extract-image-block-types.mdx +++ b/ingestion/how-to/extract-image-block-types.mdx @@ -15,7 +15,7 @@ and then show it. ## To run this example You will need a document that is one of the document types supported by the `extract_image_block_types` argument. -See the `extract_image_block_types` entry in [API Parameters](/api/partition/api-parameters). +See the `extract_image_block_types` entry in [API Parameters](/api-reference/partition/api-parameters). This example uses a PDF file with embedded images and tables. import SharedAPIKeyURL from '/snippets/general-shared-text/api-key-url.mdx'; diff --git a/ingestion/ingest-cli.mdx b/ingestion/ingest-cli.mdx index c475239c..94e3d602 100644 --- a/ingestion/ingest-cli.mdx +++ b/ingestion/ingest-cli.mdx @@ -8,7 +8,7 @@ The Unstructured Ingest CLI enables you to use command-line scripts to send file The Unstructured Ingest CLI does not work with the Unstructured API. - For information about the Unstructured API, see the [Unstructured API Overview](/api/workflow/overview). + For information about the Unstructured API, see the [Unstructured API Overview](/api-reference/workflow/overview). ## Installation diff --git a/ingestion/overview.mdx b/ingestion/overview.mdx index f5c2d1e4..1ddf6bf5 100644 --- a/ingestion/overview.mdx +++ b/ingestion/overview.mdx @@ -3,7 +3,7 @@ title: Overview --- - Unstructured recommends that you use the [Unstructured API](/api/overview) instead of the + Unstructured recommends that you use the [Unstructured API](/api-reference/overview) instead of the Unstructured Ingest CLI or the Unstructured Ingest Python library. The Unstructured API provides a full range of partitioning, chunking, embedding, and enrichment options for your files and data. 
diff --git a/ingestion/python-ingest.mdx b/ingestion/python-ingest.mdx index 6c5d3fcd..1c004186 100644 --- a/ingestion/python-ingest.mdx +++ b/ingestion/python-ingest.mdx @@ -8,7 +8,7 @@ The Unstructured Ingest Python library enables you to use Python code to send fi The Unstructured Ingest Python library does not work with the Unstructured API. - For information about the Unstructured API, see the [Unstructured API Overview](/api/workflow/overview). + For information about the Unstructured API, see the [Unstructured API Overview](/api-reference/workflow/overview). The following 3-minute video shows how to use the Unstructured Ingest Python library to send multiple PDFs from a local directory in batches to be ingested by Unstructured for processing: diff --git a/mint.json b/mint.json index b2c93f98..d5859fcd 100644 --- a/mint.json +++ b/mint.json @@ -79,7 +79,7 @@ }, { "name": "API", - "url": "api" + "url": "api-reference" }, { "name": "Example code", @@ -259,105 +259,105 @@ { "group": "Unstructured API", "pages": [ - "api/overview", - "api/supported-file-types" + "api-reference/overview", + "api-reference/supported-file-types" ] }, { "group": "Workflow Endpoint", "pages": [ - "api/workflow/overview", + "api-reference/workflow/overview", { "group": "Sources", "pages": [ - "api/workflow/sources/overview", - "api/workflow/sources/azure-blob-storage", - "api/workflow/sources/box", - "api/workflow/sources/confluence", - "api/workflow/sources/couchbase", - "api/workflow/sources/databricks-volumes", - "api/workflow/sources/dropbox", - "api/workflow/sources/elasticsearch", - "api/workflow/sources/google-cloud", - "api/workflow/sources/google-drive", - "api/workflow/sources/kafka", - "api/workflow/sources/mongodb", - "api/workflow/sources/onedrive", - "api/workflow/sources/outlook", - "api/workflow/sources/postgresql", - "api/workflow/sources/s3", - "api/workflow/sources/salesforce", - "api/workflow/sources/sharepoint", - "api/workflow/sources/snowflake" + 
"api-reference/workflow/sources/overview", + "api-reference/workflow/sources/azure-blob-storage", + "api-reference/workflow/sources/box", + "api-reference/workflow/sources/confluence", + "api-reference/workflow/sources/couchbase", + "api-reference/workflow/sources/databricks-volumes", + "api-reference/workflow/sources/dropbox", + "api-reference/workflow/sources/elasticsearch", + "api-reference/workflow/sources/google-cloud", + "api-reference/workflow/sources/google-drive", + "api-reference/workflow/sources/kafka", + "api-reference/workflow/sources/mongodb", + "api-reference/workflow/sources/onedrive", + "api-reference/workflow/sources/outlook", + "api-reference/workflow/sources/postgresql", + "api-reference/workflow/sources/s3", + "api-reference/workflow/sources/salesforce", + "api-reference/workflow/sources/sharepoint", + "api-reference/workflow/sources/snowflake" ] }, { "group": "Destinations", "pages": [ - "api/workflow/destinations/overview", - "api/workflow/destinations/astradb", - "api/workflow/destinations/azure-ai-search", - "api/workflow/destinations/couchbase", - "api/workflow/destinations/databricks-volumes", - "api/workflow/destinations/delta-table", - "api/workflow/destinations/databricks-delta-table", - "api/workflow/destinations/elasticsearch", - "api/workflow/destinations/google-cloud", - "api/workflow/destinations/kafka", - "api/workflow/destinations/milvus", - "api/workflow/destinations/mongodb", - "api/workflow/destinations/motherduck", - "api/workflow/destinations/neo4j", - "api/workflow/destinations/onedrive", - "api/workflow/destinations/pinecone", - "api/workflow/destinations/postgresql", - "api/workflow/destinations/qdrant", - "api/workflow/destinations/redis", - "api/workflow/destinations/s3", - "api/workflow/destinations/snowflake", - "api/workflow/destinations/weaviate" + "api-reference/workflow/destinations/overview", + "api-reference/workflow/destinations/astradb", + "api-reference/workflow/destinations/azure-ai-search", + 
"api-reference/workflow/destinations/couchbase", + "api-reference/workflow/destinations/databricks-volumes", + "api-reference/workflow/destinations/delta-table", + "api-reference/workflow/destinations/databricks-delta-table", + "api-reference/workflow/destinations/elasticsearch", + "api-reference/workflow/destinations/google-cloud", + "api-reference/workflow/destinations/kafka", + "api-reference/workflow/destinations/milvus", + "api-reference/workflow/destinations/mongodb", + "api-reference/workflow/destinations/motherduck", + "api-reference/workflow/destinations/neo4j", + "api-reference/workflow/destinations/onedrive", + "api-reference/workflow/destinations/pinecone", + "api-reference/workflow/destinations/postgresql", + "api-reference/workflow/destinations/qdrant", + "api-reference/workflow/destinations/redis", + "api-reference/workflow/destinations/s3", + "api-reference/workflow/destinations/snowflake", + "api-reference/workflow/destinations/weaviate" ] }, - "api/workflow/workflows", - "api/workflow/jobs" + "api-reference/workflow/workflows", + "api-reference/workflow/jobs" ] }, { "group": "Partition Endpoint", "pages": [ - "api/partition/overview", - "api/partition/post-requests", - "api/partition/sdk-python", - "api/partition/sdk-jsts", - "api/partition/api-parameters", - "api/partition/api-validation-errors", - "api/partition/examples", - "api/partition/document-elements", - "api/partition/partitioning", - "api/partition/chunking", - "api/partition/speed-up-large-files-batches", - "api/partition/get-elements", - "api/partition/text-as-html", - "api/partition/extract-image-block-types", - "api/partition/get-chunked-elements", - "api/partition/transform-schemas", - "api/partition/generate-schema", - "api/partition/pipeline-1" + "api-reference/partition/overview", + "api-reference/partition/post-requests", + "api-reference/partition/sdk-python", + "api-reference/partition/sdk-jsts", + "api-reference/partition/api-parameters", + 
"api-reference/partition/api-validation-errors", + "api-reference/partition/examples", + "api-reference/partition/document-elements", + "api-reference/partition/partitioning", + "api-reference/partition/chunking", + "api-reference/partition/speed-up-large-files-batches", + "api-reference/partition/get-elements", + "api-reference/partition/text-as-html", + "api-reference/partition/extract-image-block-types", + "api-reference/partition/get-chunked-elements", + "api-reference/partition/transform-schemas", + "api-reference/partition/generate-schema", + "api-reference/partition/pipeline-1" ] }, { "group": "Legacy APIs", "pages": [ - "api/legacy-api/overview", - "api/legacy-api/free-api", - "api/legacy-api/aws", - "api/legacy-api/azure" + "api-reference/legacy-api/overview", + "api-reference/legacy-api/free-api", + "api-reference/legacy-api/aws", + "api-reference/legacy-api/azure" ] }, { "group": "Troubleshooting", "pages": [ - "api/troubleshooting/api-key-url" + "api-reference/troubleshooting/api-key-url" ] }, { @@ -513,43 +513,43 @@ "redirects": [ { "source": "/api-reference/api-services/accessing-unstructured-api", - "destination": "/api/overview" + "destination": "/api-reference/overview" }, { "source": "/api-reference/api-services/api-parameters", - "destination": "/api/partition/api-parameters" + "destination": "/api-reference/partition/api-parameters" }, { "source": "/api-reference/api-services/api-validation-errors", - "destination": "/api/partition/api-validation-errors" + "destination": "/api-reference/partition/api-validation-errors" }, { "source": "/api-reference/api-services/aws", - "destination": "/api/legacy-api/aws" + "destination": "/api-reference/legacy-api/aws" }, { "source": "/api-reference/api-services/azure", - "destination": "/api/legacy-api/azure" + "destination": "/api-reference/legacy-api/azure" }, { "source": "/api-reference/api-services/chunking", - "destination": "/api/partition/chunking" + "destination": "/api-reference/partition/chunking" 
}, { "source": "/api-reference/api-services/document-elements", - "destination": "/api/partition/document-elements" + "destination": "/api-reference/partition/document-elements" }, { "source": "/api-reference/api-services/examples", - "destination": "/api/partition/examples" + "destination": "/api-reference/partition/examples" }, { "source": "/api-reference/api-services/free-api", - "destination": "/api/legacy-api/free-api" + "destination": "/api-reference/legacy-api/free-api" }, { "source": "/api-reference/api-services/overview", - "destination": "/api/overview" + "destination": "/api-reference/overview" }, { "source": "/api-reference/api-services/partition-via-api", @@ -557,39 +557,39 @@ }, { "source": "/api-reference/api-services/partitioning", - "destination": "/api/partition/partitioning" + "destination": "/api-reference/partition/partitioning" }, { "source": "/api-reference/api-services/post-requests", - "destination": "/api/partition/post-requests" + "destination": "/api-reference/partition/post-requests" }, { "source": "/api-reference/api-services/saas-api-development-guide", - "destination": "/api/overview" + "destination": "/api-reference/overview" }, { "source": "/api-reference/api-services/sdk-jsts", - "destination": "/api/partition/sdk-jsts" + "destination": "/api-reference/partition/sdk-jsts" }, { "source": "/api-reference/api-services/sdk-python", - "destination": "/api/partition/sdk-python" + "destination": "/api-reference/partition/sdk-python" }, { "source": "/api-reference/api-services/supported-file-types", - "destination": "/api/supported-file-types" + "destination": "/api-reference/supported-file-types" }, { "source": "/api-reference/best-practices/speed-up-large-files-batches", - "destination": "/api/partition/speed-up-large-files-batches" + "destination": "/api-reference/partition/speed-up-large-files-batches" }, { "source": "/api-reference/general/pipeline-1", - "destination": "/api/partition/pipeline-1" + "destination": 
"/api-reference/partition/pipeline-1" }, { "source": "/api-reference/how-to/:slug*", - "destination": "/api/partition/:slug*" + "destination": "/api-reference/partition/:slug*" }, { "source": "/api-reference/ingest/:slug*", @@ -597,7 +597,7 @@ }, { "source": "/api-reference/troubleshooting/api-key-url", - "destination": "/api/troubleshooting/api-key-url" + "destination": "/api-reference/troubleshooting/api-key-url" }, { "source": "/glossary/glossary", @@ -613,27 +613,27 @@ }, { "source": "/platform/api/:slug*", - "destination": "/api/workflow/:slug*" + "destination": "/api-reference/workflow/:slug*" }, { "source": "/platform-api/api/:slug*", - "destination": "/api/workflow/:slug*" + "destination": "/api-reference/workflow/:slug*" }, { "source": "/platform-api/legacy-api/:slug*", - "destination": "/api/legacy-api/:slug*" + "destination": "/api-reference/legacy-api/:slug*" }, { "source": "/platform-api/partition-api/choose-hi-res-model", - "destination": "/api/partition/partitioning" + "destination": "/api-reference/partition/partitioning" }, { "source": "/platform-api/partition-api/choose-partitioning-strategy", - "destination": "/api/partition/partitioning" + "destination": "/api-reference/partition/partitioning" }, { "source": "/platform-api/partition-api/embedding", - "destination": "/api/partition/embedding" + "destination": "/api-reference/partition/embedding" }, { "source": "/platform-api/partition-api/filter-files", @@ -641,7 +641,7 @@ }, { "source": "/platform-api/partition-api/:slug*", - "destination": "/api/partition/:slug*" + "destination": "/api-reference/partition/:slug*" } ], "analytics": { diff --git a/open-source/core-functionality/partitioning.mdx b/open-source/core-functionality/partitioning.mdx index 9fbbd758..c5559143 100644 --- a/open-source/core-functionality/partitioning.mdx +++ b/open-source/core-functionality/partitioning.mdx @@ -692,7 +692,7 @@ elements = partition_via_api( ``` -If you are using the [Unstructured Partition 
Endpoint](/api/partition/overview), you can use the `api_url` kwarg to point the `partition_via_api` function at your Unstructured Partition URL. +If you are using the [Unstructured Partition Endpoint](/api-reference/partition/overview), you can use the `api_url` kwarg to point the `partition_via_api` function at your Unstructured Partition URL. ```python import os diff --git a/open-source/introduction/overview.mdx b/open-source/introduction/overview.mdx index 5cbb4f8d..3613170c 100644 --- a/open-source/introduction/overview.mdx +++ b/open-source/introduction/overview.mdx @@ -3,7 +3,7 @@ title: Unstructured Open Source sidebarTitle: Overview --- -The `unstructured` open source library is designed as a starting point for quick prototyping and has [limits](#limits). For production scenarios, see the [Unstructured API](/api/overview) instead. +The `unstructured` open source library is designed as a starting point for quick prototyping and has [limits](#limits). For production scenarios, see the [Unstructured API](/api-reference/overview) instead. The `unstructured` [library](https://github.com/Unstructured-IO/unstructured) offers an open-source toolkit designed to simplify the ingestion and pre-processing of diverse data formats, including images and text-based documents @@ -44,7 +44,7 @@ and use cases. ## Limits -The open source library has the following limits as compared to the [Unstructured UI](/ui/overview) and the [Unstructured API](/api/overview): +The open source library has the following limits as compared to the [Unstructured UI](/ui/overview) and the [Unstructured API](/api-reference/overview): * Not designed for production scenarios. * Significantly decreased performance on document and table extraction. 
diff --git a/snippets/general-shared-text/azure-ai-search.mdx b/snippets/general-shared-text/azure-ai-search.mdx index 6ef5080a..5551a189 100644 --- a/snippets/general-shared-text/azure-ai-search.mdx +++ b/snippets/general-shared-text/azure-ai-search.mdx @@ -942,4 +942,4 @@ Here are some more details about these requirements: - [Search indexes in Azure AI Search](https://learn.microsoft.com/azure/search/search-what-is-an-index) - [Schema of a search index](https://learn.microsoft.com/azure/search/search-what-is-an-index#schema-of-a-search-index) - [Example index schema](https://learn.microsoft.com/rest/api/searchservice/create-index#examples) - - [Unstructured document elements and metadata](/api/partition/document-elements) \ No newline at end of file + - [Unstructured document elements and metadata](/api-reference/partition/document-elements) \ No newline at end of file diff --git a/snippets/general-shared-text/couchbase.mdx b/snippets/general-shared-text/couchbase.mdx index b62bbe97..c4634716 100644 --- a/snippets/general-shared-text/couchbase.mdx +++ b/snippets/general-shared-text/couchbase.mdx @@ -1,4 +1,4 @@ -- For the [Unstructured UI](/ui/overview) or the [Unstructured API](/api/overview), only Couchbase Capella clusters are supported. +- For the [Unstructured UI](/ui/overview) or the [Unstructured API](/api-reference/overview), only Couchbase Capella clusters are supported. - For [Unstructured Ingest](/ingestion/overview), Couchbase Capella clusters and local Couchbase server deployments are supported.