Fix for detection of marked content position #27

Merged: KateOrient merged 1 commit into customizations-for-tagged-pdf from fix/bbox-position on Jun 3, 2026

@KateOrient

Closes https://trello.com/c/ttjRjDxc

PR includes the following changes:

  • Removed the EMC (OPS.endMarkedContent) operator from the list of operators that are treated as being inside a content item. This stops us from creating an unnecessary content item every time we reach the end of a content item block.
  • Replaced the simple string-to-UTF-8 conversion with a proper, safe one (stringToPDFString) during structure tree parsing. This fixes the "Failed to parse structure tree element: URI malformed" errors and ensures we do not discard the whole structure element when it contains a string that cannot be fully decoded.
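
To illustrate the first change, here is a minimal sketch rather than the actual patch. It assumes a simplified operator list in which each entry has an `fn` code and, for the opening operator, an `mcid`. The `OPS` values, `CONTENT_OPS`, `collectContentItems`, and `sampleOps` are all hypothetical names invented for this example. It shows how counting EMC as an "inside the content item" operator produces one extra item per marked-content block.

```javascript
// Hypothetical operator codes for illustration only;
// the real values live in the pdf.js OPS table.
const OPS = {
  beginMarkedContentProps: "BDC",
  showText: "Tj",
  paintImageXObject: "Do",
  endMarkedContent: "EMC",
};

// Operators assumed to draw something inside a content item.
// EMC is deliberately absent: it only closes the block.
const CONTENT_OPS = new Set([OPS.showText, OPS.paintImageXObject]);

function collectContentItems(operatorList, contentOps = CONTENT_OPS) {
  const items = [];
  const mcidStack = []; // ids of the currently open marked-content blocks
  for (const { fn, mcid } of operatorList) {
    if (fn === OPS.beginMarkedContentProps) {
      mcidStack.push(mcid);
      continue;
    }
    if (contentOps.has(fn) && mcidStack.length > 0) {
      // Attribute the operator to the innermost open block.
      items.push({ mcid: mcidStack[mcidStack.length - 1], op: fn });
    }
    if (fn === OPS.endMarkedContent) {
      mcidStack.pop();
    }
  }
  return items;
}

const sampleOps = [
  { fn: OPS.beginMarkedContentProps, mcid: 0 },
  { fn: OPS.showText },
  { fn: OPS.endMarkedContent },
  { fn: OPS.beginMarkedContentProps, mcid: 1 },
  { fn: OPS.paintImageXObject },
  { fn: OPS.endMarkedContent },
];

// Old behaviour (EMC counted as content): one spurious item per block.
const withEmc = new Set([...CONTENT_OPS, OPS.endMarkedContent]);
console.log(collectContentItems(sampleOps, withEmc).length); // 4

// New behaviour (EMC only closes the block).
console.log(collectContentItems(sampleOps).length); // 2
```

In this toy model the spurious items carry no drawing operator of their own, which is presumably why they distorted the detected positions.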

 - Wrong content items detection due to EMC processing
 - "Alt" text decoding errors
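
The decoding fix can be sketched in the same spirit. This is not the pdf.js implementation of stringToPDFString. It assumes the "URI malformed" error came from a `decodeURIComponent(escape(...))` style conversion, which is one common way to hit that error, and it treats undecodable input as Latin-1, which only approximates PDFDocEncoding. `naiveUtf8Decode` and `safeDecodePdfString` are hypothetical names.

```javascript
// Naive conversion: throws URIError ("URI malformed") when the bytes
// are not valid UTF-8, e.g. a lone 0xE9 byte.
function naiveUtf8Decode(str) {
  return decodeURIComponent(escape(str));
}

// Safer sketch: pick the encoding from the byte order mark and fall back
// to the raw characters instead of throwing.
function safeDecodePdfString(str) {
  // Treat each character of the binary string as one byte.
  const bytes = Uint8Array.from(str, ch => ch.charCodeAt(0) & 0xff);
  let encoding = null;
  let offset = 0;
  if (bytes[0] === 0xfe && bytes[1] === 0xff) {
    encoding = "utf-16be";
    offset = 2;
  } else if (bytes[0] === 0xff && bytes[1] === 0xfe) {
    encoding = "utf-16le";
    offset = 2;
  } else if (bytes[0] === 0xef && bytes[1] === 0xbb && bytes[2] === 0xbf) {
    encoding = "utf-8";
    offset = 3;
  }
  if (encoding) {
    try {
      const decoder = new TextDecoder(encoding, { fatal: true });
      return decoder.decode(bytes.subarray(offset));
    } catch {
      // Malformed data or unsupported encoding: fall through.
    }
  }
  // Simplification: bytes are read as Latin-1 here. Real PDFDocEncoding
  // maps some byte values differently.
  return str;
}

const altText = "caf\xE9"; // Latin-1 bytes, not valid UTF-8
try {
  naiveUtf8Decode(altText);
} catch (err) {
  console.log(err.name); // URIError
}
console.log(safeDecodePdfString(altText)); // café
```

The point of the fallback is that a single undecodable "Alt" string no longer aborts parsing of the whole structure element.
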
@coderabbitai

coderabbitai Bot commented Jun 3, 2026

Important

Review skipped

Auto reviews are disabled on base/target branches other than the default branch.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.


@KateOrient merged commit 0eb6c04 into customizations-for-tagged-pdf on Jun 3, 2026
4 of 14 checks passed
@KateOrient deleted the fix/bbox-position branch on June 3, 2026 09:53
3 participants