Fix unescaped "<" in MathML in PDF #18520
…ce`. It isn't used in speech (although maybe it should trigger a pause if wide), but it is used in some braille notations as a signal that this is a "fill in the blank" space. I also added the elementary math attributes used in MathPlayer. Neither MathCAT nor Access8Math currently support the elementary math notations, but it is on the list of things to implement for MathCAT. Note: potentially this could go into the beta, but I can't get the beta to build on my machine so I can't test the fix there.
This seems to have code changes from #18508. Could you clarify whether that PR should be closed in favour of this one? There will be merge conflicts when we squash merge these.
My apologies. I think I forgot to check out master before creating the new branch. They are separate bugs, but both deal with PDF math and are in proximity to each other, so they are slightly related. Please do whatever is simplest.
@NSoiffer - it would probably be better to remove the duplication, to make it easier to track and revert the separate changes.
Pull Request Overview
This PR fixes an HTML syntax error in MathML PDF content where unescaped "<" characters were causing parsing issues. The fix ensures proper HTML escaping of values retrieved from PDF structure trees when generating MathML content.
- Added HTML escaping to node values in MathML generation
- Updated parameter passing for `html.escape()` to use an explicit keyword argument
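The two changes listed above can be sketched roughly as follows. This is a minimal illustration, not the code in `adobeAcrobat.py`: the helper names `mathml_element` and `mathml_attribute` are hypothetical, and the PR text does not say which value is passed for the keyword argument, so the `quote=` values here are assumptions. Only `html.escape(s, quote=True)` itself is the standard-library call.

```python
import html


def mathml_element(tag: str, value: str) -> str:
    """Wrap a text value in a MathML element, escaping markup characters.

    Hypothetical helper: stands in for wherever a PDF node value is
    spliced into the generated MathML string.
    """
    # Element content only needs &, < and > escaped, so quotes are left alone.
    # (Assumption: the PR may well use a different quote= value.)
    return f"<{tag}>{html.escape(value, quote=False)}</{tag}>"


def mathml_attribute(name: str, value: str) -> str:
    """Format an attribute, escaping quotes as well as &, < and >."""
    # Passing quote= by keyword makes the intent explicit at the call site.
    return f'{name}="{html.escape(value, quote=True)}"'


# A "<" operator coming out of the structure tree is re-escaped on output.
less_than = mathml_element("mo", "<")
open_attr = mathml_attribute("open", '"')
```

Here `less_than` is `<mo>&lt;</mo>`, which a MathML consumer can parse, whereas splicing the raw value would have produced `<mo><</mo>`.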
Reviewed Changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| user_docs/en/changes.md | Added changelog entry documenting the MathML escaping fix |
| source/NVDAObjects/IAccessible/adobeAcrobat.py | Applied HTML escaping to PDF node values and updated parameter syntax |
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Link to issue number:
Fixes #18511
Summary of the issue:
If the structure tree is used for MathML in PDF, calls to `getValue` return the interpreted `&lt;`, so that it becomes `<` and we have an HTML syntax error.
Description of user facing changes:
The bug is fixed.
Description of developer facing changes:
None.
Description of development approach:
Called `html.escape()` on the result of `getValue` (when it was not "none").
Testing strategy:
I used the PDF document that was part of the issue and tested to make sure I saw the error before the fix and that there was no error after the fix.
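The manual before-and-after check could also be approximated without a PDF. The sketch below is not NVDA's test code, and the sample MathML string and the helper name `is_well_formed` are made up for illustration. It only shows that a raw `<` inside an element breaks parsing while the escaped form does not.

```python
import html
import xml.etree.ElementTree as ET


def is_well_formed(markup: str) -> bool:
    """Return True if the markup parses as XML, False otherwise."""
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        return False


# Stand-in for a node value that comes back already interpreted.
raw_value = "<"

# Before the fix: the raw value is spliced straight into the markup.
before = f"<math><mi>a</mi><mo>{raw_value}</mo><mi>b</mi></math>"

# After the fix: the value is escaped first.
after = f"<math><mi>a</mi><mo>{html.escape(raw_value)}</mo><mi>b</mi></math>"

before_ok = is_well_formed(before)
after_ok = is_well_formed(after)
```

With these inputs `before_ok` is `False` and `after_ok` is `True`, mirroring the error seen before the fix and its absence afterwards.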
Known issues with pull request:
Code Review Checklist:
@coderabbitai summary