From 0ff683da0ad65917074a82468f87278e99103fec Mon Sep 17 00:00:00 2001
From: Paul Cornell
Date: Mon, 25 Nov 2024 09:35:17 -0800
Subject: [PATCH] Platform: Remove Auto

---
 platform/overview.mdx     |  6 +++---
 platform/partitioning.mdx |  7 +------
 platform/workflows.mdx    | 15 +++------------
 3 files changed, 7 insertions(+), 21 deletions(-)

diff --git a/platform/overview.mdx b/platform/overview.mdx
index 9e6fd18c..f5ecc57d 100644
--- a/platform/overview.mdx
+++ b/platform/overview.mdx
@@ -23,9 +23,9 @@ To get your data RAG-ready, the Unstructured Platform moves it through the follo
 Routing determines which strategy Unstructured Platform uses to transforming your documents into Unstructured's canonical JSON schema. The Unstructured Platform provides these [partitioning](/platform/partitioning) strategies for document transformation:
- - **Fast** is great for when there is extractable text available, like in HTML files or in the Microsoft Office Document format.
- - **Hi Res** is best for PDFs and tables and where accurate classification of document elements is critical.
- - If you're unsure which strategy to use, choose **Auto**, and the Unstructured Platform will handle the decision for you.
+ - **Basic** is ideal for simple, text-only documents.
+ - **Advanced** is best for PDFs, images, and complex file types.
+ - **Platinum** is for challenging documents, including scanned and handwritten content.
diff --git a/platform/partitioning.mdx b/platform/partitioning.mdx
index e5279132..50bcddae 100644
--- a/platform/partitioning.mdx
+++ b/platform/partitioning.mdx
@@ -21,10 +21,5 @@ To choose one of these strategies, select one of the **Partition Strategy** opti
 - **Fast**: This strategy is ideal for simple, text-based documents.
 - **Hi-Res**: This strategy is best for PDFs, images, and complex file types.
-- **VLM**: For your most challenging documents, including scanned and handwritten content, use this strategy, which leverages vision
- language models (VLMs). During processing, files that are not PDFs or images are processed by using the **Hi-Res** strategy and are charged
- at the **Hi-Res** rate instead.
-- **Auto**: This strategy examines each file before processing it. If the file is an image, or if the file is a PDF and at least one embedded table
- or image is found in it, **Hi-Res** is used to process that file and charged at the **Hi-Res** rate for that file. Otherwise, **Fast** is used and charged at the
- **Fast** rate for that file.
+- **VLM**: This strategy is for your most challenging documents, including scanned and handwritten content.
diff --git a/platform/workflows.mdx b/platform/workflows.mdx
index 2bc05b7d..9068826e 100644
--- a/platform/workflows.mdx
+++ b/platform/workflows.mdx
@@ -54,8 +54,7 @@ To create an automatic workflow:
 - **Basic** Ideal for simple, text-only documents.
 - **Advanced** Best for PDFs, images, and complex file types.
- - **Platinum** For your most challenging documents, including scanned and handwritten content. It uses vision language models (VLMs).
- During processing, files that are not PDFs or images are processed by using the **Advanced** strategy and are charged at the **Advanced** rate instead.
+ - **Platinum** For your most challenging documents, including scanned and handwritten content.
 9. The **Reprocess all** box applies only to the Amazon S3 and Azure Blob Storage source connectors:
@@ -110,11 +109,7 @@ There are two ways to create a custom workflow:
 - **Fast**: Ideal for simple, text-only documents.
 - **Hi-Res**: Best for PDFs, images, and complex file types.
- - **VLM**: For your most challenging documents, including scanned and handwritten content. It uses vision language models (VLMs).
- During processing, files that are not PDFs or images are processed by using the **Hi-Res** strategy and are charged at the **Hi-Res** rate instead.
- - **Auto**: This strategy examines each file before processing it. If the file is an image, or if the file is a PDF and at least one embedded table
- or image is found in it, **Hi-Res** is used to process that file and charged at the **Hi-Res** rate for that file. Otherwise, **Fast** is used and charged at the
- **Fast** rate for that file.
+ - **VLM**: For your most challenging documents, including scanned and handwritten content.
 [Learn more](/platform/partitioning).
@@ -267,11 +262,7 @@ There are two ways to create a custom workflow:
 - **Fast**: Ideal for simple, text-only documents.
 - **Hi-Res**: Best for PDFs, images, and complex file types.
- - **VLM**: For your most challenging documents, including scanned and handwritten content. It uses vision language models (VLMs).
- During processing, files that are not PDFs or images are processed by using the **Hi-Res** strategy and are charged at the **Hi-Res** rate instead.
- - **Auto**: This strategy examines each file before processing it. If the file is an image, or if the file is a PDF and at least one embedded table
- or image is found in it, **Hi-Res** is used to process that file and charged at the **Hi-Res** rate for that file. Otherwise, **Fast** is used and charged at the
- **Fast** rate for that file.
+ - **VLM**: For your most challenging documents, including scanned and handwritten content.
 [Learn more](/platform/partitioning).
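
For review context, the two routing rules whose wording this patch deletes (the **Auto** selection rule and the **VLM** fallback for files that are not PDFs or images) can be written out as a small sketch. This only restates the removed documentation text; `FileInfo`, `auto_strategy`, and `vlm_effective_strategy` are hypothetical names made up for illustration and are not part of any Unstructured API.

```python
from dataclasses import dataclass


@dataclass
class FileInfo:
    """Hypothetical per-file summary; not a platform type."""
    is_image: bool = False
    is_pdf: bool = False
    has_embedded_table_or_image: bool = False


def auto_strategy(info: FileInfo) -> str:
    """Removed "Auto" wording: Hi-Res for images and for PDFs that contain
    at least one embedded table or image; Fast for everything else."""
    if info.is_image or (info.is_pdf and info.has_embedded_table_or_image):
        return "Hi-Res"
    return "Fast"


def vlm_effective_strategy(info: FileInfo) -> str:
    """Removed "VLM" note: files that are not PDFs or images are processed
    with Hi-Res instead."""
    if info.is_pdf or info.is_image:
        return "VLM"
    return "Hi-Res"
```

In both removed descriptions, the file is charged at the rate of the strategy actually used, which is the string each function returns.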