Enable vertical text detection for rotated images <- Ingest test fixtures update #4331
cdc036e into vk/enable-vertical-text-detection-for-rotated-pages
Cursor Bugbot has reviewed your changes and found 1 potential issue.
Reviewed by Cursor Bugbot for commit 191ba7e.
- "element_id": "4dfee7e352ae892814e46bb220094b0f",
- "text": "Zejiang Shen! (4), Ruochen Zhang”, Melissa Dell?, Benjamin Charles Germain Lee*, Jacob Carlson®, and Weining Li>",
+ "element_id": "607ee712429ac9cf3540dbdc5e55e143",
+ "text": "Zejiang Shen! (4), Ruochen Zhang”, Melissa Dell?, Benjamin Charles Germain Lee*, Jacob Carlson?, and Weining Li®",
Markdown expected fixture not updated alongside JSON/HTML
Medium Severity
The JSON and HTML expected fixtures were updated, but the corresponding markdown fixture at expected-structured-output-markdown/local-single-file-with-pdf-infer-table-structure/layout-parser-paper.pdf.md was not. The markdown file still contains the old values: the author line ending in "Jacob Carlson®, and Weining Li>" and the header without 13. Because the markdown is derived from the JSON via structured-json-to-markdown.sh, the markdown diff check will regenerate it from the updated JSON and fail against the stale expected markdown.
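The failure mode described here can be illustrated with a minimal Python sketch. It does not reproduce structured-json-to-markdown.sh: `render_markdown` and `is_stale` are hypothetical names, the one-block-per-element rendering is an assumption, and the element ids are placeholders. Only the two author-line endings come from the fixtures.

```python
import json


def render_markdown(elements):
    """Hypothetical stand-in for the JSON-to-markdown step: one block per element text."""
    return "\n\n".join(el["text"] for el in elements if el.get("text")) + "\n"


def is_stale(expected_json, expected_markdown):
    """Return True when the stored markdown differs from what the JSON regenerates."""
    elements = json.loads(expected_json)
    return render_markdown(elements) != expected_markdown


# Placeholder ids; the texts are the old and new author-line endings.
old_json = json.dumps([{"element_id": "old", "text": "Jacob Carlson®, and Weining Li>"}])
new_json = json.dumps([{"element_id": "new", "text": "Jacob Carlson?, and Weining Li®"}])

# The markdown fixture was generated from the old JSON and never refreshed.
stored_markdown = render_markdown(json.loads(old_json))

print(is_stale(old_json, stored_markdown))  # False: fixtures agree
print(is_stale(new_json, stored_markdown))  # True: markdown lags behind the JSON
```

Under these assumptions, updating only the JSON fixture is enough to make the comparison fail, which is why the markdown fixture would need to be regenerated in the same change.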


This pull request includes updated ingest test fixtures.
Please review and merge if appropriate.
Note
Low Risk
Updates are limited to test fixture golden files (HTML/JSON) with small text/id changes, so production behavior is unaffected; risk is mainly around masking or legitimizing unintended extraction regressions.
Overview
Updates ingest golden fixtures for layout-parser-paper.pdf structured output in both HTML and JSON.

The expected extracted content changes slightly (author line character corrections and an added trailing page number in a ListItem), and the corresponding element_ids are updated to match the new extraction output.