Closed
6 changes: 3 additions & 3 deletions platform/overview.mdx
@@ -23,9 +23,9 @@ To get your data RAG-ready, the Unstructured Platform moves it through the follo
<Step title="Route">
Routing determines which strategy Unstructured Platform uses to transform your documents into Unstructured's canonical JSON schema. The Unstructured Platform provides these [partitioning](/platform/partitioning) strategies for document transformation:

-- **Fast** is great for when there is extractable text available, like in HTML files or in the Microsoft Office Document format.
-- **Hi Res** is best for PDFs and tables and where accurate classification of document elements is critical.
-- If you're unsure which strategy to use, choose **Auto**, and the Unstructured Platform will handle the decision for you.
+- **Basic** is ideal for simple, text-only documents.
+- **Advanced** is best for PDFs, images, and complex file types.
+- **Platinum** is for challenging documents, including scanned and handwritten content.

</Step>
<Step title="Transform">
7 changes: 1 addition & 6 deletions platform/partitioning.mdx
@@ -21,10 +21,5 @@ To choose one of these strategies, select one of the **Partition Strategy** opti

- **Fast**: This strategy is ideal for simple, text-based documents.
- **Hi-Res**: This strategy is best for PDFs, images, and complex file types.
-- **VLM**: For your most challenging documents, including scanned and handwritten content, use this strategy, which leverages vision
-language models (VLMs). During processing, files that are not PDFs or images are processed by using the **Hi-Res** strategy and are charged
-at the **Hi-Res** rate instead.
-- **Auto**: This strategy examines each file before processing it. If the file is an image, or if the file is a PDF and at least one embedded table
-or image is found in it, **Hi-Res** is used to process that file and charged at the **Hi-Res** rate for that file. Otherwise, **Fast** is used and charged at the
-**Fast** rate for that file.
+- **VLM**: This strategy is for your most challenging documents, including scanned and handwritten content.
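
Reviewer note: the selection rules spelled out in the longer **Auto** and **VLM** entries in this hunk can be sketched as below. This is only an illustration of the documented wording, not Unstructured's implementation; `FileInfo`, its fields, and `resolve_strategy` are hypothetical names invented for the sketch.

```python
from dataclasses import dataclass


@dataclass
class FileInfo:
    """Hypothetical per-file summary; not part of any Unstructured API."""
    is_image: bool = False
    is_pdf: bool = False
    has_embedded_table_or_image: bool = False


def resolve_strategy(requested: str, info: FileInfo) -> str:
    """Return the strategy applied (and billed) for one file, per the doc text."""
    if requested == "Auto":
        # Images, and PDFs with at least one embedded table or image, use Hi-Res.
        if info.is_image or (info.is_pdf and info.has_embedded_table_or_image):
            return "Hi-Res"
        # Everything else is processed and charged as Fast.
        return "Fast"
    if requested == "VLM":
        # VLM applies only to PDFs and images; other files fall back to Hi-Res.
        if info.is_pdf or info.is_image:
            return "VLM"
        return "Hi-Res"
    # Fast and Hi-Res are applied as requested.
    return requested
```

For example, under these assumptions a plain-text PDF requested as **Auto** resolves to **Fast**, while a non-PDF, non-image file requested as **VLM** resolves to **Hi-Res**.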

15 changes: 3 additions & 12 deletions platform/workflows.mdx
@@ -54,8 +54,7 @@ To create an automatic workflow:

- **Basic** Ideal for simple, text-only documents.
- **Advanced** Best for PDFs, images, and complex file types.
-- **Platinum** For your most challenging documents, including scanned and handwritten content. It uses vision language models (VLMs).
-During processing, files that are not PDFs or images are processed by using the **Advanced** strategy and are charged at the **Advanced** rate instead.
+- **Platinum** For your most challenging documents, including scanned and handwritten content.

9. The **Reprocess all** box applies only to the Amazon S3 and Azure Blob Storage source connectors:

@@ -110,11 +109,7 @@ There are two ways to create a custom workflow:

- **Fast**: Ideal for simple, text-only documents.
- **Hi-Res**: Best for PDFs, images, and complex file types.
-- **VLM**: For your most challenging documents, including scanned and handwritten content. It uses vision language models (VLMs).
-During processing, files that are not PDFs or images are processed by using the **Hi-Res** strategy and are charged at the **Hi-Res** rate instead.
-- **Auto**: This strategy examines each file before processing it. If the file is an image, or if the file is a PDF and at least one embedded table
-or image is found in it, **Hi-Res** is used to process that file and charged at the **Hi-Res** rate for that file. Otherwise, **Fast** is used and charged at the
-**Fast** rate for that file.
+- **VLM**: For your most challenging documents, including scanned and handwritten content.

[Learn more](/platform/partitioning).

@@ -267,11 +262,7 @@ There are two ways to create a custom workflow:

- **Fast**: Ideal for simple, text-only documents.
- **Hi-Res**: Best for PDFs, images, and complex file types.
-- **VLM**: For your most challenging documents, including scanned and handwritten content. It uses vision language models (VLMs).
-During processing, files that are not PDFs or images are processed by using the **Hi-Res** strategy and are charged at the **Hi-Res** rate instead.
-- **Auto**: This strategy examines each file before processing it. If the file is an image, or if the file is a PDF and at least one embedded table
-or image is found in it, **Hi-Res** is used to process that file and charged at the **Hi-Res** rate for that file. Otherwise, **Fast** is used and charged at the
-**Fast** rate for that file.
+- **VLM**: For your most challenging documents, including scanned and handwritten content.

[Learn more](/platform/partitioning).
</Accordion>