From b7a74168b714b040303fc571a91263ce6e1ad8bf Mon Sep 17 00:00:00 2001 From: Paul Cornell Date: Mon, 6 Oct 2025 15:48:52 -0700 Subject: [PATCH 1/5] Deprecate Claude Sonnet 3.5 --- api-reference/partition/api-parameters.mdx | 10 ++-- api-reference/workflow/workflows.mdx | 46 ++++++++++++++----- .../deprecated-models-api.mdx | 13 ++++++ .../deprecated-models-ui.mdx | 13 ++++++ ui/enriching/image-descriptions.mdx | 3 ++ ui/enriching/ner.mdx | 4 ++ ui/enriching/table-descriptions.mdx | 5 +- ui/enriching/table-to-html.mdx | 3 ++ ui/walkthrough.mdx | 4 +- ui/workflows.mdx | 3 ++ 10 files changed, 85 insertions(+), 19 deletions(-) create mode 100644 snippets/general-shared-text/deprecated-models-api.mdx create mode 100644 snippets/general-shared-text/deprecated-models-ui.mdx diff --git a/api-reference/partition/api-parameters.mdx b/api-reference/partition/api-parameters.mdx index 9867bf89..ae296a21 100644 --- a/api-reference/partition/api-parameters.mdx +++ b/api-reference/partition/api-parameters.mdx @@ -57,17 +57,21 @@ Need help getting started? 
Check out the [Examples page](/api-reference/partitio Allowed values for `vlm_model_provider` and `vlm_model` pairs include the following: +import DeprecatedModelsAPI from '/snippets/general-shared-text/deprecated-models-api.mdx'; + + + | `vlm_model_provider` | `vlm_model` | |----------------------|------------------------------------------------| -| `anthropic` | `claude-3-5-sonnet-20241022` | -| `anthropic_bedrock` | `claude-3-5-sonnet-20241022` | +| `anthropic` | `claude-sonnet-4-202505142` | | `bedrock` | `us.amazon.nova-lite-v1:0` | | `bedrock` | `us.amazon.nova-pro-v1:0` | -| `bedrock` | `us.anthropic.claude-3-5-sonnet-20241022-v2:0` | | `bedrock` | `us.anthropic.claude-3-haiku-20240307-v1:0` | | `bedrock` | `us.anthropic.claude-3-opus-20240229-v1:0` | | `bedrock` | `us.anthropic.claude-3-sonnet-20240229-v1:0` | +| `bedrock` | `us.anthropic.claude-sonnet-4-20250514-v1:0` | | `bedrock` | `us.meta.llama3-2-11b-instruct-v1:0` | | `bedrock` | `us.meta.llama3-2-90b-instruct-v1:0` | | `openai` | `gpt-4o` | +| `openai` | `gpt-5-mini-2025-08-07` | | `vertexai` | `gemini-2.0-flash-001` | \ No newline at end of file diff --git a/api-reference/workflow/workflows.mdx b/api-reference/workflow/workflows.mdx index 919f04c6..266bcc33 100644 --- a/api-reference/workflow/workflows.mdx +++ b/api-reference/workflow/workflows.mdx @@ -1030,6 +1030,8 @@ A **Partitioner** node has a `type` of `partition`. #### Auto strategy +import DeprecatedModelsAPI from '/snippets/general-shared-text/deprecated-models-api.mdx'; + ```python @@ -1076,10 +1078,16 @@ Fields for `settings` include: - `strategy`: _Required_. The partitioning strategy to use. This field must be set to `auto`. - `provider`: _Optional_. If the Auto partitioning strategy needs to use the VLM partitioning strategy, then use the specified VLM provider. Allowed values include `auto`, `openai`, `anthropic`, and `bedrock`. The default value is `anthropic`. - `provider_api_key`: _Optional_. 
If specified, use a non-default API key for calls to the specified VLM provider as needed. The default is none, which means to rely on using Unstructured's internal default API key for the VLM provider. -- `model`: _Optional_. If the Auto partitioning strategy needs to use the VLM partitioning strategy, then use the specified VLM. The default value is `claude-3-5-sonnet-20241022`. +- `model`: _Optional_. If the Auto partitioning strategy needs to use the VLM partitioning strategy, then use the specified VLM. The default value is `claude-sonnet-4-20250514`. + + + + - For `anthropic`, available values for `model` are: + + - `claude-3-7-sonnet-20250219` + - `claude-sonnet-4-20250514` + - `claude-sonnet-4-5-20250929` - - For `openai`, available values for `model` are `gpt-4o` and `gpt-4o-mini`. - - For `anthropic`, available values for `model` are `claude-3-5-sonnet-20241022` and `claude-3-7-sonnet-20250219`. - For `bedrock`, available values for `model` are: - `us.amazon.nova-lite-v1:0` @@ -1087,9 +1095,13 @@ Fields for `settings` include: - `us.anthropic.claude-3-opus-20240229-v1:0` - `us.anthropic.claude-3-haiku-20240307-v1:0` - `us.anthropic.claude-3-sonnet-20240229-v1:0` - - `us.anthropic.claude-3-5-sonnet-20241022-v2:0` - - `us.meta.llama3-2-11b-instruct-v1:0` - - `us.meta.llama3-2-90b-instruct-v1:0` + - `us.anthropic.claude-3-7-sonnet-20250219-v1:0` + - `us.anthropic.claude-sonnet-4-20250514-v1:0` + + - For `openai`, available values for `model` are: + + - `gpt-4o` + - `gpt-5-mini-2025-08-07` - `output_format`: _Output_. The format of the response. Allowed values include `text/html` and `application/json`. The default is `text/html`. - `format_html`: _Optional_. If the Auto partitioning strategy needs to use the VLM partitioning strategy, true (the default) to apply Beautiful Soup's `prettify` method to the HTML that is generated by the VLM partitioner, which for example adds indentation for better readability. 
@@ -1144,10 +1156,16 @@ Fields for `settings` include: - `provider`: _Optional_. Use the specified VLM provider. Allowed values include `auto`, `openai`, `anthropic`, and `bedrock`. The default value is `anthropic`. - `provider_api_key`: _Optional_. If specified, use a non-default API key for calls to the specified VLM provider as needed. The default is none, which means to rely on using Unstructured's internal default API key for the VLM provider. -- `model`: _Optional_. If the Auto partitioning strategy needs to use the VLM partitioning strategy, then use the specified VLM. The default value is `claude-3-5-sonnet-20241022`. +- `model`: _Optional_. If the Auto partitioning strategy needs to use the VLM partitioning strategy, then use the specified VLM. The default value is `claude-sonnet-4-20250514`. + + + + - For `anthropic`, available values for `model` are: + + - `claude-3-7-sonnet-20250219` + - `claude-sonnet-4-20250514` + - `claude-sonnet-4-5-20250929` - - For `openai`, available values for `model` are `gpt-4o` and `gpt-4o-mini`. - - For `anthropic`, available values for `model` are `claude-3-5-sonnet-20241022` and `claude-3-7-sonnet-20250219`. - For `bedrock`, available values for `model` are: - `us.amazon.nova-lite-v1:0` @@ -1155,9 +1173,13 @@ Fields for `settings` include: - `us.anthropic.claude-3-opus-20240229-v1:0` - `us.anthropic.claude-3-haiku-20240307-v1:0` - `us.anthropic.claude-3-sonnet-20240229-v1:0` - - `us.anthropic.claude-3-5-sonnet-20241022-v2:0` - - `us.meta.llama3-2-11b-instruct-v1:0` - - `us.meta.llama3-2-90b-instruct-v1:0` + - `us.anthropic.claude-3-7-sonnet-20250219-v1:0` + - `us.anthropic.claude-sonnet-4-20250514-v1:0` + + - For `openai`, available values for `model` are: + + - `gpt-4o` + - `gpt-5-mini-2025-08-07` - `output_format`: _Output_. The format of the response. Allowed values include `text/html` and `application/json`. The default is `text/html`. - `format_html`: _Optional_. 
True (the default) to apply Beautiful Soup's `prettify` method to the HTML that is generated by the VLM partitioner, which for example adds indentation for better readability. diff --git a/snippets/general-shared-text/deprecated-models-api.mdx b/snippets/general-shared-text/deprecated-models-api.mdx new file mode 100644 index 00000000..b730abd1 --- /dev/null +++ b/snippets/general-shared-text/deprecated-models-api.mdx @@ -0,0 +1,13 @@ + + The following models are no longer available as of the following dates: + + - For `anthropic`, `claude-3-5-sonnet-20241022`: October 22, 2025 + - For `bedrock`, `us.anthropic.claude-3-5-sonnet-20241022-v2:0`: October 22, 2025 + + Unstructured recommends the following actions: + + - For new workflows, do not use any of these models. + - For any workflow that uses any of these models, update that workflow as soon as possible to use a different model. + + Workflows that attempt to use any of these models on or after its associated date will return errors. + \ No newline at end of file diff --git a/snippets/general-shared-text/deprecated-models-ui.mdx b/snippets/general-shared-text/deprecated-models-ui.mdx new file mode 100644 index 00000000..777a0c9b --- /dev/null +++ b/snippets/general-shared-text/deprecated-models-ui.mdx @@ -0,0 +1,13 @@ + + The following models are no longer available as of the following dates: + + - Amazon Bedrock Claude Sonnet 3.5: October 22, 2025 + - Anthropic Claude Sonnet 3.5: October 22, 2025 + + Unstructured recommends the following actions: + + - For new workflows, do not use any of these models. + - For any workflow that uses any of these models, update that workflow as soon as possible to use a different model. + + Workflows that attempt to use any of these models on or after its associated date will return errors. 
+ \ No newline at end of file diff --git a/ui/enriching/image-descriptions.mdx b/ui/enriching/image-descriptions.mdx index 64787e3d..aefa865a 100644 --- a/ui/enriching/image-descriptions.mdx +++ b/ui/enriching/image-descriptions.mdx @@ -59,6 +59,7 @@ Any embeddings that are produced after these summaries are generated will be bas ## Generate image descriptions import EnrichmentImageSummaryHiResOnly from '/snippets/general-shared-text/enrichment-image-summary-hi-res-only.mdx'; +import DeprecatedModelsUI from '/snippets/general-shared-text/deprecated-models-ui.mdx'; To generate image descriptions, in an **Enrichment** node in a workflow, select **Image**, and then choose one of the available provider (and model) combinations that are shown. @@ -69,4 +70,6 @@ To generate image descriptions, in an **Enrichment** node in a workflow, select **Chunker** node before an image descriptions **Enrichment** node could cause incomplete or no image descriptions to be generated. + + diff --git a/ui/enriching/ner.mdx b/ui/enriching/ner.mdx index c60d59e7..e3b65401 100644 --- a/ui/enriching/ner.mdx +++ b/ui/enriching/ner.mdx @@ -137,12 +137,16 @@ prompt that is used to run NER. To do this, see the next section. # Generate a list of entities and their relationships +import DeprecatedModelsUI from '/snippets/general-shared-text/deprecated-models-ui.mdx'; + To generate a list of recognized entities and their relationships, in an **Enrichment** node in a workflow, specify the following: You can change a workflow's NER settings only through [Custom](/ui/workflows#create-a-custom-workflow) workflow settings. + + 1. Select **Text**. 2. For **Model**, select one of the available models that are shown. 3. The selected model will follow a default set of instructions (called a _prompt_) to perform NER using a set of predefined entity types and relationships. 
To experiment diff --git a/ui/enriching/table-descriptions.mdx b/ui/enriching/table-descriptions.mdx index bd8f4190..21b53c48 100644 --- a/ui/enriching/table-descriptions.mdx +++ b/ui/enriching/table-descriptions.mdx @@ -69,6 +69,7 @@ Any embeddings that are produced after these summaries are generated will be bas ## Generate table descriptions import EnrichmentTableSummaryHiResOnly from '/snippets/general-shared-text/enrichment-table-summary-hi-res-only.mdx'; +import DeprecatedModelsUI from '/snippets/general-shared-text/deprecated-models-ui.mdx'; To generate table descriptions, in an **Enrichment** node in a workflow, select **Table**, and then choose one of the available provider (and model) combinations that are shown. @@ -82,9 +83,9 @@ displayed, be sure to select **Table Description**. **Chunker** node before a table descriptions **Enrichment** node could cause incomplete or no table descriptions to be generated. - - + + ## Learn more diff --git a/ui/enriching/table-to-html.mdx b/ui/enriching/table-to-html.mdx index a3c98e37..e5aff12c 100644 --- a/ui/enriching/table-to-html.mdx +++ b/ui/enriching/table-to-html.mdx @@ -74,6 +74,7 @@ For workflows that use [chunking](/ui/chunking), note the following changes: ## Generate table-to-HTML output import EnrichmentTableToHTMLHiResOnly from '/snippets/general-shared-text/enrichment-table-to-html-hi-res-only.mdx'; +import DeprecatedModelsUI from '/snippets/general-shared-text/deprecated-models-ui.mdx'; To generate table-to-HTML output, in an **Enrichment** node in a workflow, for **Model**, select **OpenAI (GPT-4o)**. @@ -86,6 +87,8 @@ Make sure after you choose this provider and model, that **Table to HTML** is al **Chunker** node before a table-to-HTML output **Enrichment** node could cause incomplete or no table-to-HTML output to be generated. 
+ + ## Learn more diff --git a/ui/walkthrough.mdx b/ui/walkthrough.mdx index f6a289c2..6a51c631 100644 --- a/ui/walkthrough.mdx +++ b/ui/walkthrough.mdx @@ -204,7 +204,7 @@ more complex content such as complex tables, multilanguage characters, and handw a. Click the close (**X**) button above the output on the right side of the screen.
b. In the workflow designer, click the **Partitioner** node and then, in the node's settings pane's **Details** tab, select **VLM**.
- c. Under **Select VLM Model**, under **Anthropic**, select **Claude 3.5 Sonnet**.
+ c. Under **Select VLM Model**, under **OpenAI**, select **GPT-4o**.
d. Click **Test**.
@@ -243,7 +243,7 @@ more complex content such as complex tables, multilanguage characters, and handw a. Click the close (**X**) button above the output on the right side of the screen.
b. In the workflow designer, click the **Partitioner** node and then, in the node's settings pane's **Details** tab, select **VLM**.
- c. Under **Select VLM Model**, under **Anthropic**, select **Claude 3.5 Sonnet**.
+ c. Under **Select VLM Model**, under **OpenAI**, select **GPT-4o**.
d. Click **Test**.
12. Notice how the output changes, now that you are using the **VLM** strategy: diff --git a/ui/workflows.mdx b/ui/workflows.mdx index af28a25f..b11d6707 100644 --- a/ui/workflows.mdx +++ b/ui/workflows.mdx @@ -235,6 +235,7 @@ If you did not previously set the workflow to run on a schedule, you can [run th #### Custom workflow node types import PlatformPartitioningStrategies from '/snippets/general-shared-text/platform-partitioning-strategies.mdx'; +import DeprecatedModelsUI from '/snippets/general-shared-text/deprecated-models-ui.mdx'; @@ -244,6 +245,8 @@ import PlatformPartitioningStrategies from '/snippets/general-shared-text/platfo For **VLM**, you must also choose a VLM provider and model from among the available choices that are shown. + + When you use the **VLM** strategy with embeddings for PDF files of 200 or more pages, you might notice some errors when these files are processed. These errors typically occur when these larger PDF files have lots of tables and high-resolution images. 
From c8b1d4a5df4fbebd2edad44e0a68f29f856a0d28 Mon Sep 17 00:00:00 2001 From: Paul-Cornell Date: Mon, 6 Oct 2025 16:03:38 -0700 Subject: [PATCH 2/5] Apply suggestions from code review --- api-reference/partition/api-parameters.mdx | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/api-reference/partition/api-parameters.mdx b/api-reference/partition/api-parameters.mdx index ae296a21..1fb932a0 100644 --- a/api-reference/partition/api-parameters.mdx +++ b/api-reference/partition/api-parameters.mdx @@ -63,7 +63,7 @@ import DeprecatedModelsAPI from '/snippets/general-shared-text/deprecated-models | `vlm_model_provider` | `vlm_model` | |----------------------|------------------------------------------------| -| `anthropic` | `claude-sonnet-4-202505142` | +| `anthropic` | `claude-sonnet-4-20250514` | | `bedrock` | `us.amazon.nova-lite-v1:0` | | `bedrock` | `us.amazon.nova-pro-v1:0` | | `bedrock` | `us.anthropic.claude-3-haiku-20240307-v1:0` | From d2bace98e381a38623afe7f3262a76a288ff8fb8 Mon Sep 17 00:00:00 2001 From: Paul Cornell Date: Mon, 6 Oct 2025 16:13:01 -0700 Subject: [PATCH 3/5] Add warning to Enrichment node section --- ui/workflows.mdx | 2 ++ 1 file changed, 2 insertions(+) diff --git a/ui/workflows.mdx b/ui/workflows.mdx index b11d6707..e89d56e4 100644 --- a/ui/workflows.mdx +++ b/ui/workflows.mdx @@ -302,6 +302,8 @@ import DeprecatedModelsUI from '/snippets/general-shared-text/deprecated-models- Choose one of the following: + + - **Image** to summarize images. Also select one of the available provider (and model) combinations that are shown. 
From f5abc1571a096d8823e808b5839b27f7b68293cc Mon Sep 17 00:00:00 2001 From: Paul-Cornell Date: Tue, 7 Oct 2025 07:21:08 -0700 Subject: [PATCH 4/5] Apply suggestions from code review --- ui/walkthrough.mdx | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/ui/walkthrough.mdx b/ui/walkthrough.mdx index 6a51c631..65115086 100644 --- a/ui/walkthrough.mdx +++ b/ui/walkthrough.mdx @@ -204,7 +204,7 @@ more complex content such as complex tables, multilanguage characters, and handw a. Click the close (**X**) button above the output on the right side of the screen.
b. In the workflow designer, click the **Partitioner** node and then, in the node's settings pane's **Details** tab, select **VLM**.
- c. Under **Select VLM Model**, under **OpenAI**, select **GPT-4o**.
+ c. Under **Select VLM Model**, under **Anthropic**, select **Claude Sonnet 4**.
d. Click **Test**.
@@ -243,7 +243,7 @@ more complex content such as complex tables, multilanguage characters, and handw a. Click the close (**X**) button above the output on the right side of the screen.
b. In the workflow designer, click the **Partitioner** node and then, in the node's settings pane's **Details** tab, select **VLM**.
- c. Under **Select VLM Model**, under **OpenAI**, select **GPT-4o**.
+ c. Under **Select VLM Model**, under **Anthropic**, select **Claude Sonnet 4**.
d. Click **Test**.
12. Notice how the output changes, now that you are using the **VLM** strategy: From 5fc8bd3de6dc1681fc8d6eac2920a1a084872f61 Mon Sep 17 00:00:00 2001 From: Paul Cornell Date: Tue, 7 Oct 2025 07:46:54 -0700 Subject: [PATCH 5/5] Added Gemini 2.0 Flash VLM --- api-reference/workflow/workflows.mdx | 12 ++++++++++-- 1 file changed, 10 insertions(+), 2 deletions(-) diff --git a/api-reference/workflow/workflows.mdx b/api-reference/workflow/workflows.mdx index 266bcc33..3b1fdb93 100644 --- a/api-reference/workflow/workflows.mdx +++ b/api-reference/workflow/workflows.mdx @@ -1076,7 +1076,7 @@ import DeprecatedModelsAPI from '/snippets/general-shared-text/deprecated-models Fields for `settings` include: - `strategy`: _Required_. The partitioning strategy to use. This field must be set to `auto`. -- `provider`: _Optional_. If the Auto partitioning strategy needs to use the VLM partitioning strategy, then use the specified VLM provider. Allowed values include `auto`, `openai`, `anthropic`, and `bedrock`. The default value is `anthropic`. +- `provider`: _Optional_. If the Auto partitioning strategy needs to use the VLM partitioning strategy, then use the specified VLM provider. Allowed values include `anthropic`, `auto`, `bedrock`, `openai`, and `vertexai`. The default value is `anthropic`. - `provider_api_key`: _Optional_. If specified, use a non-default API key for calls to the specified VLM provider as needed. The default is none, which means to rely on using Unstructured's internal default API key for the VLM provider. - `model`: _Optional_. If the Auto partitioning strategy needs to use the VLM partitioning strategy, then use the specified VLM. The default value is `claude-sonnet-4-20250514`. @@ -1103,6 +1103,10 @@ Fields for `settings` include: - `gpt-4o` - `gpt-5-mini-2025-08-07` + - For `vertexai`, available values for `model` are: + + - `gemini-2.0-flash-001` + - `output_format`: _Output_. The format of the response. 
Allowed values include `text/html` and `application/json`. The default is `text/html`. - `format_html`: _Optional_. If the Auto partitioning strategy needs to use the VLM partitioning strategy, true (the default) to apply Beautiful Soup's `prettify` method to the HTML that is generated by the VLM partitioner, which for example adds indentation for better readability. - `unique_element_ids`: _Optional_. True (the default) to assign UUIDs to element IDs, which guarantees their uniqueness. This is useful for example when using them as primary keys in a database. False to assign a SHA-256 of the element's text as its element ID. @@ -1154,7 +1158,7 @@ Fields for `settings` include: Fields for `settings` include: -- `provider`: _Optional_. Use the specified VLM provider. Allowed values include `auto`, `openai`, `anthropic`, and `bedrock`. The default value is `anthropic`. +- `provider`: _Optional_. Use the specified VLM provider. Allowed values include `anthropic`, `auto`, `bedrock`, `openai`, and `vertexai`. The default value is `anthropic`. - `provider_api_key`: _Optional_. If specified, use a non-default API key for calls to the specified VLM provider as needed. The default is none, which means to rely on using Unstructured's internal default API key for the VLM provider. - `model`: _Optional_. If the Auto partitioning strategy needs to use the VLM partitioning strategy, then use the specified VLM. The default value is `claude-sonnet-4-20250514`. @@ -1181,6 +1185,10 @@ Fields for `settings` include: - `gpt-4o` - `gpt-5-mini-2025-08-07` + - For `vertexai`, available values for `model` are: + + - `gemini-2.0-flash-001` + - `output_format`: _Output_. The format of the response. Allowed values include `text/html` and `application/json`. The default is `text/html`. - `format_html`: _Optional_. True (the default) to apply Beautiful Soup's `prettify` method to the HTML that is generated by the VLM partitioner, which for example adds indentation for better readability. 
- `unique_element_ids`: _Optional_. True (the default) to assign UUIDs to element IDs, which guarantees their uniqueness. This is useful for example when using them as primary keys in a database. False to assign a SHA-256 of the element's text as its element ID.
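
The provider and model changes in this patch series can be sketched as a small pre-flight check for workflow settings. This is only an illustration: `check_vlm_model`, `DEPRECATED_MODELS`, and `AVAILABLE_MODELS` are hypothetical names and are not part of any Unstructured SDK. The values are copied from the lines visible in the `workflows.mdx` hunks and the deprecation snippet above, so the lists may be incomplete where a hunk boundary elides lines.

```python
# Hypothetical helper: not an Unstructured API. Values are taken from the
# visible lines of the diff above and may be incomplete.

# Provider/model pairs named in deprecated-models-api.mdx
# (no longer available as of October 22, 2025).
DEPRECATED_MODELS = {
    ("anthropic", "claude-3-5-sonnet-20241022"),
    ("bedrock", "us.anthropic.claude-3-5-sonnet-20241022-v2:0"),
}

# Models listed per provider in the updated workflows.mdx hunks.
AVAILABLE_MODELS = {
    "anthropic": {
        "claude-3-7-sonnet-20250219",
        "claude-sonnet-4-20250514",
        "claude-sonnet-4-5-20250929",
    },
    "bedrock": {
        "us.amazon.nova-lite-v1:0",
        "us.anthropic.claude-3-opus-20240229-v1:0",
        "us.anthropic.claude-3-haiku-20240307-v1:0",
        "us.anthropic.claude-3-sonnet-20240229-v1:0",
        "us.anthropic.claude-3-7-sonnet-20250219-v1:0",
        "us.anthropic.claude-sonnet-4-20250514-v1:0",
    },
    "openai": {"gpt-4o", "gpt-5-mini-2025-08-07"},
    "vertexai": {"gemini-2.0-flash-001"},
}


def check_vlm_model(provider: str, model: str) -> str:
    """Classify a provider/model pair as 'deprecated', 'ok', or 'unknown'."""
    if (provider, model) in DEPRECATED_MODELS:
        return "deprecated"
    if model in AVAILABLE_MODELS.get(provider, set()):
        return "ok"
    # Not listed in the visible diff lines; consult the published docs.
    return "unknown"


if __name__ == "__main__":
    for pair in [
        ("anthropic", "claude-3-5-sonnet-20241022"),
        ("anthropic", "claude-sonnet-4-20250514"),
        ("vertexai", "gemini-2.0-flash-001"),
    ]:
        print(pair, "->", check_vlm_model(*pair))
```

A result of `unknown` means only that the pair does not appear in the lines shown in this diff, not that the service rejects it.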