Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve handling of named destinations in out-of-order NameTrees (PR 10274 follow-up) #13415

Merged

Conversation

Snuffleupagus
Copy link
Collaborator

According to the specification, see https://web.archive.org/web/20210404042322if_/https://www.adobe.com/content/dam/acom/en/devnet/acrobat/pdfs/PDF32000_2008.pdf#G6.2384179, the keys of a NameTree/NumberTree should be ordered.
For corrupt PDF files, which violate this assumption, it's thus possible that trying to lookup a single entry fails.

Previously, in PR #10274, we implemented a fallback that only applies to the "bottom" node of a NameTree/NumberTree, which in general might not actually help for sufficiently corrupt NameTree/NumberTree data.
Instead we remove the current limited fallback from NameOrNumberTree.get, and defer to the call-site to handle this case explicitly e.g. by using NameOrNumberTree.getAll for data where that makes sense. For well-formed documents, these changes should not lead to any additional data fetching/parsing.

Finally, as part of these changes, the validation of named destination data is improved in the Catalog and a new unit-test is also added.

@Snuffleupagus Snuffleupagus force-pushed the getDestination-out-of-order branch 3 times, most recently from 632f6ab to 0434bd0 Compare May 21, 2021 13:43
@pdfjsbot
Copy link

From: Bot.io (Linux m4)


Received

Command cmd_unittest from @Snuffleupagus received. Current queue size: 0

Live output at: http://54.67.70.0:8877/e58289562fe5c9d/output.txt

@pdfjsbot
Copy link

From: Bot.io (Windows)


Received

Command cmd_unittest from @Snuffleupagus received. Current queue size: 0

Live output at: http://3.101.106.178:8877/be7c99392e34cc6/output.txt

@pdfjsbot
Copy link

From: Bot.io (Linux m4)


Failed

Full output at http://54.67.70.0:8877/e58289562fe5c9d/output.txt

Total script time: 2.18 mins

  • Unit Tests: FAILED

…10274 follow-up)

According to the specification, see https://web.archive.org/web/20210404042322if_/https://www.adobe.com/content/dam/acom/en/devnet/acrobat/pdfs/PDF32000_2008.pdf#G6.2384179, the keys of a NameTree/NumberTree should be ordered.
For corrupt PDF files, which violate this assumption, it's thus possible that trying to lookup a single entry fails.

Previously, in PR 10274, we implemented a fallback that only applies to the "bottom" node of a NameTree/NumberTree, which in general might not actually help for sufficiently corrupt NameTree/NumberTree data.
Instead we remove the current *limited* fallback from `NameOrNumberTree.get`, and defer to the call-site to handle this case explicitly e.g. by using `NameOrNumberTree.getAll` for data where that makes sense. For well-formed documents, these changes should *not* lead to any additional data fetching/parsing.

Finally, as part of these changes, the validation of named destination data is improved in the `Catalog` and a new unit-test is also added.
@pdfjsbot
Copy link

From: Bot.io (Windows)


Failed

Full output at http://3.101.106.178:8877/be7c99392e34cc6/output.txt

Total script time: 3.72 mins

  • Unit Tests: FAILED

@pdfjsbot
Copy link

From: Bot.io (Linux m4)


Received

Command cmd_unittest from @Snuffleupagus received. Current queue size: 0

Live output at: http://54.67.70.0:8877/28d3c5cc1f8f508/output.txt

@pdfjsbot
Copy link

From: Bot.io (Windows)


Received

Command cmd_unittest from @Snuffleupagus received. Current queue size: 0

Live output at: http://3.101.106.178:8877/30933ee4a554762/output.txt

@pdfjsbot
Copy link

From: Bot.io (Linux m4)


Success

Full output at http://54.67.70.0:8877/28d3c5cc1f8f508/output.txt

Total script time: 3.72 mins

  • Unit Tests: Passed

@pdfjsbot
Copy link

From: Bot.io (Windows)


Success

Full output at http://3.101.106.178:8877/30933ee4a554762/output.txt

Total script time: 7.28 mins

  • Unit Tests: Passed

@timvandermeij timvandermeij merged commit d1d9b90 into mozilla:master May 21, 2021
@timvandermeij
Copy link
Contributor

Good idea, and thanks for the extra test!

@Snuffleupagus Snuffleupagus deleted the getDestination-out-of-order branch May 21, 2021 20:50
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants