From b75d15bd9264f6245c5f1a27ac4a3de51bd25345 Mon Sep 17 00:00:00 2001
From: Paul Cornell
Date: Wed, 30 Jul 2025 09:29:24 -0700
Subject: [PATCH] Enrichment workflow DAG nodes: clarify when enrichments are generated

---
 .../enrichment-image-summary-hi-res-only.mdx | 18 ++++++++++++++----
 .../enrichment-images-tables-hi-res-only.mdx | 18 ++++++++++++++----
 .../enrichment-table-summary-hi-res-only.mdx | 18 ++++++++++++++----
 .../enrichment-table-to-html-hi-res-only.mdx | 18 ++++++++++++++----
 4 files changed, 56 insertions(+), 16 deletions(-)

diff --git a/snippets/general-shared-text/enrichment-image-summary-hi-res-only.mdx b/snippets/general-shared-text/enrichment-image-summary-hi-res-only.mdx
index 998cb1b8..e86b191e 100644
--- a/snippets/general-shared-text/enrichment-image-summary-hi-res-only.mdx
+++ b/snippets/general-shared-text/enrichment-image-summary-hi-res-only.mdx
@@ -1,7 +1,17 @@
- Image summary descriptions are generated only when the **Partitioner** node in a workflow is set to use the **High Res** [partitioning strategy](/ui/partitioning) and
- the workflow also contains an image description enrichment node.
+ Unstructured can potentially generate image summary descriptions only for workflows that are configured as follows:
+
+ - With a **Partitioner** node set to use the **Auto** or **High Res** partitioning strategy, and an image summary description node is added.
+ - With a **Partitioner** node set to use the **VLM** partitioning strategy. No image summary description node is needed (or allowed).
+
+ Even with these configurations, Unstructured actually generates image summary descriptions only for files that contain images and are also eligible
+ for processing with the following partitioning strategies:
+
+ - **High Res**, when the workflow's **Partitioner** node is set to use **Auto** or **High Res**.
+ - **VLM** or **High Res**, when the workflow's **Partitioner** node is set to use **VLM**.
- Setting the **Partitioner** node to use **Auto**, **VLM**, or **Fast** in a workflow that also contains an image description enrichment node
- will not produce any image summary descriptions, and it could also cause the workflow to stop running or produce unexpected results.
+ Unstructured never generates image summary descriptions for workflows that are configured as follows:
+
+ - With a **Partitioner** node set to use the **Fast** partitioning strategy.
+ - With a **Partitioner** node set to use the **Auto**, **High Res**, or **VLM** partitioning strategy, for all files that Unstructured encounters that do not contain images.
\ No newline at end of file
diff --git a/snippets/general-shared-text/enrichment-images-tables-hi-res-only.mdx b/snippets/general-shared-text/enrichment-images-tables-hi-res-only.mdx
index deb32a01..2e03f83a 100644
--- a/snippets/general-shared-text/enrichment-images-tables-hi-res-only.mdx
+++ b/snippets/general-shared-text/enrichment-images-tables-hi-res-only.mdx
@@ -1,7 +1,17 @@
- Image summary descriptions, table summary descriptions, and table-to-HTML output is generated only when the **Partitioner** node in a workflow is set to use the **High Res** [partitioning strategy](/ui/partitioning) and
- the workflow also contains an image description, table description, or table-to-HTML enrichment node.
+ Unstructured can potentially generate image summary descriptions, table summary descriptions, and table-to-HTML output only for workflows that are configured as follows:
+
+ - With a **Partitioner** node set to use the **Auto** or **High Res** partitioning strategy, and an image summary description node, table summary description node, or table-to-HTML output node is added.
+ - With a **Partitioner** node set to use the **VLM** partitioning strategy. No image summary description node, table summary description node, or table-to-HTML output node is needed (or allowed).
+
+ Even with these configurations, Unstructured actually generates image summary descriptions, table summary descriptions, and table-to-HTML output only for files that contain images or tables and are also eligible
+ for processing with the following partitioning strategies:
+
+ - **High Res**, when the workflow's **Partitioner** node is set to use **Auto** or **High Res**.
+ - **VLM** or **High Res**, when the workflow's **Partitioner** node is set to use **VLM**.
- Setting the **Partitioner** node to use **Auto**, **VLM**, or **Fast** in a workflow that also contains an image description, table description, or table-to-HTML enrichment node
- will not generate any image summary descriptions, table summary descriptions, or table-to-HTML output, and it could also cause the workflow to stop running or produce unexpected results.
+ Unstructured never generates image summary descriptions, table summary descriptions, or table-to-HTML output for workflows that are configured as follows:
+
+ - With a **Partitioner** node set to use the **Fast** partitioning strategy.
+ - With a **Partitioner** node set to use the **Auto**, **High Res**, or **VLM** partitioning strategy, for all files that Unstructured encounters that do not contain images or tables.
\ No newline at end of file
diff --git a/snippets/general-shared-text/enrichment-table-summary-hi-res-only.mdx b/snippets/general-shared-text/enrichment-table-summary-hi-res-only.mdx
index f7f4c7a4..3b089c16 100644
--- a/snippets/general-shared-text/enrichment-table-summary-hi-res-only.mdx
+++ b/snippets/general-shared-text/enrichment-table-summary-hi-res-only.mdx
@@ -1,7 +1,17 @@
- Table summary descriptions are generated only when the **Partitioner** node in a workflow is set to use the **High Res** [partitioning strategy](/ui/partitioning) and
- the workflow also contains a table description enrichment node.
+ Unstructured can potentially generate table summary descriptions only for workflows that are configured as follows:
+
+ - With a **Partitioner** node set to use the **Auto** or **High Res** partitioning strategy, and a table summary description node is added.
+ - With a **Partitioner** node set to use the **VLM** partitioning strategy. No table summary description node is needed (or allowed).
+
+ Even with these configurations, Unstructured actually generates table summary descriptions only for files that contain tables and are also eligible
+ for processing with the following partitioning strategies:
+
+ - **High Res**, when the workflow's **Partitioner** node is set to use **Auto** or **High Res**.
+ - **VLM** or **High Res**, when the workflow's **Partitioner** node is set to use **VLM**.
- Setting the **Partitioner** node to use **Auto**, **VLM**, or **Fast** in a workflow that also contains a table description enrichment node
- will not produce any table summary descriptions, and it could also cause the workflow to stop running or produce unexpected results.
+ Unstructured never generates table summary descriptions for workflows that are configured as follows:
+
+ - With a **Partitioner** node set to use the **Fast** partitioning strategy.
+ - With a **Partitioner** node set to use the **Auto**, **High Res**, or **VLM** partitioning strategy, for all files that Unstructured encounters that do not contain tables.
\ No newline at end of file
diff --git a/snippets/general-shared-text/enrichment-table-to-html-hi-res-only.mdx b/snippets/general-shared-text/enrichment-table-to-html-hi-res-only.mdx
index 48cf9d19..efe58d5c 100644
--- a/snippets/general-shared-text/enrichment-table-to-html-hi-res-only.mdx
+++ b/snippets/general-shared-text/enrichment-table-to-html-hi-res-only.mdx
@@ -1,7 +1,17 @@
- Table-to-HTML generation happens only when the **Partitioner** node in a workflow is set to use the **High Res** [partitioning strategy](/ui/partitioning) and
- the workflow also contains a table-to-HTML enrichment node.
+ Unstructured can potentially generate table-to-HTML output only for workflows that are configured as follows:
+
+ - With a **Partitioner** node set to use the **Auto** or **High Res** partitioning strategy, and a table-to-HTML output node is added.
+ - With a **Partitioner** node set to use the **VLM** partitioning strategy. No table-to-HTML output node is needed (or allowed).
- Setting the **Partitioner** node to use **Auto**, **VLM**, or **Fast** in a workflow that also contains a table-to-HTML enrichment node
- will not generate any table-to-HTML output, and it could also cause the workflow to stop running or produce unexpected results.
+ Even with these configurations, Unstructured actually generates table-to-HTML output only for files that contain tables and are also eligible
+ for processing with the following partitioning strategies:
+
+ - **High Res**, when the workflow's **Partitioner** node is set to use **Auto** or **High Res**.
+ - **VLM** or **High Res**, when the workflow's **Partitioner** node is set to use **VLM**.
+
+ Unstructured never generates table-to-HTML output for workflows that are configured as follows:
+
+ - With a **Partitioner** node set to use the **Fast** partitioning strategy.
+ - With a **Partitioner** node set to use the **Auto**, **High Res**, or **VLM** partitioning strategy, for all files that Unstructured encounters that do not contain tables.
\ No newline at end of file
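
Review note: the rules added in all four snippets share one shape, which can be sketched as a small decision function. This is only an illustrative model of the wording in the patch, not Unstructured's implementation. The names `Strategy`, `enrichment_generated`, and the `file_strategy` parameter (the strategy a given file actually ends up being processed with) are hypothetical. The sketch also does not model the "(or allowed)" restriction for **VLM**; it simply ignores the enrichment-node flag in that case.

```python
from enum import Enum


class Strategy(Enum):
    """Partitioning strategies named in the snippets (hypothetical enum)."""
    AUTO = "Auto"
    HIGH_RES = "High Res"
    VLM = "VLM"
    FAST = "Fast"


def enrichment_generated(node_strategy, has_enrichment_node,
                         file_has_target, file_strategy):
    """Return True if, per the patched wording, an enrichment would be generated.

    node_strategy:       strategy set on the workflow's Partitioner node
    has_enrichment_node: whether the matching enrichment node was added
    file_has_target:     whether the file contains images/tables, as relevant
    file_strategy:       strategy the file is actually processed with (assumed input)
    """
    # Files without the relevant content never get enrichments.
    if not file_has_target:
        return False
    # Fast never produces enrichments.
    if node_strategy is Strategy.FAST:
        return False
    # Auto / High Res: needs the enrichment node and a High Res-eligible file.
    if node_strategy in (Strategy.AUTO, Strategy.HIGH_RES):
        return has_enrichment_node and file_strategy is Strategy.HIGH_RES
    # VLM: no enrichment node needed; file must be VLM- or High Res-eligible.
    if node_strategy is Strategy.VLM:
        return file_strategy in (Strategy.VLM, Strategy.HIGH_RES)
    return False
```

For example, `enrichment_generated(Strategy.AUTO, True, True, Strategy.FAST)` returns `False`, which matches the case where **Auto** routes a file to a strategy other than **High Res**.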