[TrueType] Recover from a missing "glyf" table by replacing it with dummy data, utilizing the existing code in sanitizeGlyphLocations #6848

Merged
timvandermeij merged 1 commit into mozilla:master on Jan 18, 2016

Snuffleupagus
Collaborator

It seems to be fairly common for OCR software to include incomplete TrueType fonts, notably ones missing the "glyf" table, in PDF files. Since we currently reject such fonts, the result is that text-selection/copying is broken.

This patch contains a suggested approach to try and use these kinds of broken fonts, by using existing code in sanitizeGlyphLocations to replace a missing "glyf" table with dummy data.
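
To illustrate the general idea (this is not the code from the patch), here is a minimal sketch of substituting dummy data for a missing "glyf" table. The function name `recoverMissingGlyf`, its parameters, and the shape of the `tables` object are assumptions made up for this example. It relies only on the TrueType rule that "loca" stores `numGlyphs + 1` offsets, so all-zero offsets give every glyph a length of zero. Unlike the approach described above, which reuses sanitizeGlyphLocations, the sketch simply overwrites "loca" itself.

```javascript
// Illustrative sketch only; `recoverMissingGlyf` and the `tables` layout are
// assumptions for this example, not the code from the patch.
function recoverMissingGlyf(tables, numGlyphs, isLongLoca) {
  if (tables.glyf) {
    return false; // The table is present, nothing to recover.
  }
  // An empty "glyf" table: no glyph has any outline data.
  tables.glyf = { tag: 'glyf', data: new Uint8Array(0) };

  // "loca" holds numGlyphs + 1 offsets (2 bytes each in the short format,
  // 4 bytes each in the long format). All-zero offsets give every glyph
  // a length of zero, which is consistent with the empty "glyf" table.
  const entrySize = isLongLoca ? 4 : 2;
  tables.loca = {
    tag: 'loca',
    data: new Uint8Array((numGlyphs + 1) * entrySize),
  };
  return true;
}

// Usage: a font with three glyphs, short "loca" format, and no "glyf" table.
const tables = { head: { tag: 'head', data: new Uint8Array(54) } };
const recovered = recoverMissingGlyf(tables, 3, false);
console.log(recovered, tables.glyf.data.length, tables.loca.data.length);
```

With dummy data like this the glyphs draw nothing, but the font can still be loaded, which is what text-selection/copying needs.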

Please note: Given that I'm not sure if we actually want to do this, I didn't want to waste time creating a reduced test-case. If this patch is deemed acceptable, I'll be happy to add a test.

Fixes #4684.
Fixes #6007.
Fixes #6829.

@timvandermeij
Contributor

This looks like a good solution to me. I can confirm that the three mentioned issues are resolved with this PR. I think that if we are able to recover from bad data that we certainly should make an effort to do so. I'll leave the final decision for @yurydelendik or @brendandahl, but with a reduced test case added you have my blessing for this!

@brendandahl
Contributor

This fix seems good to me. A reduced text selection test case would be nice.

brendandahl self-assigned this Jan 15, 2016
…ummy data, utilizing the existing code in `sanitizeGlyphLocations`

It seems to be fairly common for OCR software to include incomplete TrueType fonts, notably ones missing the "glyf" table, in PDF files. Since we currently reject such fonts, the result is that text-selection/copying is broken.

This patch contains a suggested approach to try and use these kinds of broken fonts, by using existing code in `sanitizeGlyphLocations` to replace a missing "glyf" table with dummy data.

Fixes 4684.
Fixes 6007.
Fixes 6829.
@Snuffleupagus
Collaborator Author

A reduced text test-case has been added, thanks for the review!

/botio test

@timvandermeij
Contributor

/botio-linux preview

@pdfjsbot

From: Bot.io (Linux)


Received

Command cmd_preview from @timvandermeij received. Current queue size: 0

Live output at: http://107.21.233.14:8877/a021431e6d26a7c/output.txt

@timvandermeij
Contributor

/botio test

@pdfjsbot

From: Bot.io (Windows)


Received

Command cmd_test from @timvandermeij received. Current queue size: 0

Live output at: http://107.22.172.223:8877/fb0e31cd08b0e6d/output.txt

@pdfjsbot

From: Bot.io (Linux)


Received

Command cmd_test from @timvandermeij received. Current queue size: 0

Live output at: http://107.21.233.14:8877/e26e8021c0701b5/output.txt

@pdfjsbot

From: Bot.io (Windows)


Success

Full output at http://107.22.172.223:8877/fb0e31cd08b0e6d/output.txt

Total script time: 20.86 mins

  • Font tests: Passed
  • Unit tests: Passed
  • Regression tests: Passed

@pdfjsbot

From: Bot.io (Linux)


Success

Full output at http://107.21.233.14:8877/e26e8021c0701b5/output.txt

Total script time: 21.33 mins

  • Font tests: Passed
  • Unit tests: Passed
  • Regression tests: Passed

@timvandermeij
Contributor

/botio makeref

@pdfjsbot

From: Bot.io (Linux)


Received

Command cmd_makeref from @timvandermeij received. Current queue size: 0

Live output at: http://107.21.233.14:8877/8e8ee169c798263/output.txt

@pdfjsbot

From: Bot.io (Windows)


Received

Command cmd_makeref from @timvandermeij received. Current queue size: 0

Live output at: http://107.22.172.223:8877/8db9e8cf2094110/output.txt

@pdfjsbot

From: Bot.io (Windows)


Success

Full output at http://107.22.172.223:8877/8db9e8cf2094110/output.txt

Total script time: 21.17 mins

  • Lint: Passed
  • Make references: Passed
  • Check references: Passed

@pdfjsbot

From: Bot.io (Linux)


Success

Full output at http://107.21.233.14:8877/8e8ee169c798263/output.txt

Total script time: 21.19 mins

  • Lint: Passed
  • Make references: Passed
  • Check references: Passed

timvandermeij added a commit that referenced this pull request Jan 18, 2016
[TrueType] Recover from a missing "glyf" table by replacing it with dummy data, utilizing the existing code in `sanitizeGlyphLocations`
timvandermeij merged commit ec06610 into mozilla:master Jan 18, 2016
@timvandermeij
Contributor

Thank you for patching this!

Snuffleupagus deleted the recover-missing-glyf-table branch January 18, 2016 19:39
4 participants