Enable vertical text detection for rotated images <- Ingest test fixtures update by ryannikolaidis · Pull Request #4331 · Unstructured-IO/unstructured

ryannikolaidis · 2026-04-09T20:10:57Z

This pull request includes updated ingest test fixtures.
Please review and merge if appropriate.

Note

Low Risk
Updates are limited to test fixture golden files (HTML/JSON) with small text/id changes, so production behavior is unaffected; risk is mainly around masking or legitimizing unintended extraction regressions.

Overview
Updates ingest golden fixtures for layout-parser-paper.pdf structured output in both HTML and JSON.

The expected extracted content changes slightly (author line character corrections and an added trailing page number in a ListItem) and corresponding element_ids are updated to match the new extraction output.
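Why a small text correction also changes an element_id can be illustrated with a minimal sketch. It assumes, purely for illustration, that an ID is a truncated hash of the element text; the function `sketch_element_id` is hypothetical, and the real ID scheme in unstructured may hash additional inputs, so this is not expected to reproduce the IDs in the fixtures.

```python
import hashlib


def sketch_element_id(text: str) -> str:
    """Hypothetical ID scheme: first 32 hex chars of the SHA-256 of the text."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()[:32]


# Tail of the author line before and after the fixture update.
old_text = "Jacob Carlson®, and Weining Li>"
new_text = "Jacob Carlson?, and Weining Li®"

old_id = sketch_element_id(old_text)
new_id = sketch_element_id(new_text)

# Any change to the text yields a different ID, so the golden file
# must update "element_id" together with "text".
print(old_id != new_id)
```

Under that assumption, a fixture whose text changes while its element_id stays the same would be internally inconsistent, which is why both fields move together in the diff.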

Reviewed by Cursor Bugbot for commit 191ba7e. Bugbot is set up for automated code reviews on this repo.

cursor

Cursor Bugbot has reviewed your changes and found 1 potential issue.


cursor · 2026-04-09T20:15:36Z

...uctured-output/local-single-file-with-pdf-infer-table-structure/layout-parser-paper.pdf.json

-    "element_id": "4dfee7e352ae892814e46bb220094b0f",
-    "text": "Zejiang Shen! (4), Ruochen Zhang”, Melissa Dell?, Benjamin Charles Germain Lee*, Jacob Carlson®, and Weining Li>",
+    "element_id": "607ee712429ac9cf3540dbdc5e55e143",
+    "text": "Zejiang Shen! (4), Ruochen Zhang”, Melissa Dell?, Benjamin Charles Germain Lee*, Jacob Carlson?, and Weining Li®",


Markdown expected fixture not updated alongside JSON/HTML

Medium Severity

The JSON and HTML expected fixtures were updated, but the corresponding markdown fixture at expected-structured-output-markdown/local-single-file-with-pdf-infer-table-structure/layout-parser-paper.pdf.md was not. The markdown file still contains the old values: the author line ending in "Jacob Carlson®, and Weining Li>" and the header without the trailing 13. Since the markdown is derived from the JSON via structured-json-to-markdown.sh, the markdown diff check will regenerate it from the updated JSON and fail against the stale expected markdown.

Additional Locations (1)

test_unstructured_ingest/expected-structured-output/local-single-file-with-pdf-infer-table-structure/layout-parser-paper.pdf.json#L3294-L3296
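The failure mode described in this comment can be sketched as follows. This is a minimal illustration, not the repository's actual check: `render_markdown` is a hypothetical stand-in for structured-json-to-markdown.sh (whose real rendering rules are not shown in this PR), and the element text is shortened to the tail of the author line.

```python
import json


def render_markdown(elements):
    """Hypothetical renderer: one paragraph per element's text."""
    return "\n\n".join(el["text"] for el in elements) + "\n"


def fixture_is_stale(json_text, expected_markdown):
    """True when markdown regenerated from the JSON differs from the stored fixture."""
    return render_markdown(json.loads(json_text)) != expected_markdown


# JSON fixture before and after this PR (text shortened for the example).
old_json = json.dumps([{
    "element_id": "4dfee7e352ae892814e46bb220094b0f",
    "text": "Jacob Carlson®, and Weining Li>",
}])
new_json = json.dumps([{
    "element_id": "607ee712429ac9cf3540dbdc5e55e143",
    "text": "Jacob Carlson?, and Weining Li®",
}])

# The markdown fixture was generated from the old JSON and never refreshed.
stale_markdown = render_markdown(json.loads(old_json))

print(fixture_is_stale(old_json, stale_markdown))  # consistent before the PR
print(fixture_is_stale(new_json, stale_markdown))  # mismatch after the PR
```

Regenerating the markdown fixture from the updated JSON in the same commit would make the second check pass again.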


Update ingest test fixtures

191ba7e

ryannikolaidis assigned vladimir-kivi-ds Apr 9, 2026

ryannikolaidis requested a review from vladimir-kivi-ds April 9, 2026 20:10

vladimir-kivi-ds merged commit cdc036e into vk/enable-vertical-text-detection-for-rotated-pages Apr 9, 2026
3 checks passed

vladimir-kivi-ds deleted the vk/enable-vertical-text-detection-for-rotated-pages|ingest-test-fixtures-update-21ed150 branch April 9, 2026 20:11

cursor bot reviewed Apr 9, 2026

vladimir-kivi-ds merged 1 commit into vk/enable-vertical-text-detection-for-rotated-pages from vk/enable-vertical-text-detection-for-rotated-pages|ingest-test-fixtures-update-21ed150
