(URGENT) PDF accessibility: math not showing up in PDF content text #871

Open
ronaldtse opened this issue Dec 1, 2022 · 6 comments

@ronaldtse
Contributor

ronaldtse commented Dec 1, 2022

When diffing PDFs, it is important that the generated content does not introduce systematic changes relative to the original PDFs.

In the Metanorma-generated PDFs, the math content is missing from the content text. This is a sample from ISO 10303-50:
(https://github.com/metanorma/iso-10303-detached-docs/tree/main/sources/iso-10303-50)

[Screenshot, 2022-12-01, 12:44 PM]

[Screenshot, 2022-12-01, 1:11 PM]

In the generated PDFs, all the formula contents are "missing" from the content text. If we insert AsciiMath for each formula, we would eliminate many of these false positives.

@Intelligent2013
Contributor

In the content tree there is text for x and y:
[screenshot of the content tree]

Copy-pasted text also contains them: "arguments y and x, which".

Currently, mn2pdf inserts the hidden math as transparent text (metanorma/metanorma-bipm#188).
It looks like Acrobat doesn't see such text in its comparison feature.

I'll try removing transparency mode locally as a temporary test and check the comparison result.

@ronaldtse
Contributor Author

Thank you for investigating! Indeed this is strange. I can verify in Preview that I can copy and paste this text.

@Intelligent2013
Contributor

I've tested different combinations of color + transparency for the hidden text:

  • color FFFFFF + transparency mode (currently) - there are differences in the Acrobat comparison results (i.e. Acrobat doesn't see the hidden text)
  • color FFFFFF without transparency - there are differences in the Acrobat comparison results (i.e. Acrobat doesn't see the hidden text)
  • color FEFEFE (almost 'white') without transparency - no differences in the Acrobat comparison results for y and x
    [screenshot of the comparison result]
  • color FEFEFE (almost 'white') + transparency mode - there are differences in the Acrobat comparison results (i.e. Acrobat doesn't see the hidden text)

I.e. if the text is white and/or transparent, Acrobat ignores it in the compare feature, although it is still available for copy-paste.
I haven't figured out which workaround could be applied...
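
For reference, the four results above can be written down as a small truth table. This is only a sketch in Python: `OBSERVED` and `compare_sees_text` are hypothetical names, and the rule is inferred from these four tests alone, not from any documented Acrobat behaviour.

```python
# Results reported above, keyed by (fill color, transparency mode enabled).
# True means Acrobat's compare feature picked up the hidden math text.
OBSERVED = {
    ("FFFFFF", True): False,   # current mn2pdf behaviour
    ("FFFFFF", False): False,
    ("FEFEFE", False): True,   # no differences reported for y and x
    ("FEFEFE", True): False,
}


def compare_sees_text(color: str, transparent: bool) -> bool:
    """Hypothetical rule inferred from the four observations only."""
    return color.upper() != "FFFFFF" and not transparent


# The inferred rule reproduces every observed combination.
for (color, transparent), seen in OBSERVED.items():
    assert compare_sees_text(color, transparent) == seen
```

Only the near-white, non-transparent combination was seen by the comparison. Other colors were not tested, so the rule may not hold beyond these four cases.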

@ronaldtse
Contributor Author

Then let's just keep it as it is for now. I don't think we do comparisons too often...

@ronaldtse
Contributor Author

This is an interesting topic that @stuartgalt would be interested in as the PDF guru...

@ronaldtse
Contributor Author

Similar to #870, this is a problem with Adobe Acrobat's Compare PDF feature. Letting @stuartgalt know in case the PDF TC has (or plans to have) specs for Compare PDF.
