[api-minor] Ensure that PDFDocumentProxy.hasJSActions won't fail if MissingDataExceptions are thrown during the associated worker-thread parsing #13234

Merged

Snuffleupagus
Collaborator

With the current implementation of PDFDocument.hasJSActions, in the worker-thread, we're not actually handling not-yet-loaded data correctly. This can thus fail in two different ways:

  • The PDFDocument.fieldObjects getter (and its helper method), while it may return a Promise, still fetches all of its data synchronously and it can thus throw a MissingDataException during parsing.
  • The Catalog.jsActions getter, which is completely synchronous, can obviously throw a MissingDataException during parsing.

If either of these cases occurs currently, the PDFDocumentProxy.hasJSActions method in the API can either return a rejected Promise (which it never should) or possibly "hang" and never resolve.
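
The failure mode, and one way of guarding against it, can be sketched as follows. This is a minimal illustration and not the patch itself: `MissingDataException` is a local stand-in for the PDF.js class, and `FakeCatalog`, `loadRange`, `ensure`, `hasJSActionsNaive` and `hasJSActions` are hypothetical names. Swallowing unrelated errors in the guarded version is likewise an assumption, made only to show a Promise that never rejects.

```javascript
// Local stand-in for the worker-thread exception (assumption, not the real class).
class MissingDataException extends Error {
  constructor(begin, end) {
    super(`Missing data [${begin}, ${end})`);
    this.name = "MissingDataException";
    this.begin = begin;
    this.end = end;
  }
}

// Hypothetical catalog whose synchronous getter throws until data is loaded.
class FakeCatalog {
  constructor() {
    this.loaded = false;
  }

  get jsActions() {
    if (!this.loaded) {
      throw new MissingDataException(0, 1024);
    }
    return new Map([["WillClose", ["app.alert('bye');"]]]);
  }

  // Hypothetical: pretend that the missing byte range has been fetched.
  async loadRange(begin, end) {
    this.loaded = true;
  }
}

// Retry a synchronous property read until the data it needs is available.
async function ensure(obj, prop) {
  while (true) {
    try {
      return obj[prop];
    } catch (ex) {
      if (!(ex instanceof MissingDataException)) {
        throw ex;
      }
      await obj.loadRange(ex.begin, ex.end);
    }
  }
}

// Naive version: the synchronous throw surfaces as a rejected Promise.
async function hasJSActionsNaive(catalog) {
  return catalog.jsActions !== null;
}

// Guarded version: missing data is loaded and the read is retried; any other
// error is mapped to `false` (an assumption) so the Promise never rejects.
async function hasJSActions(catalog) {
  try {
    const actions = await ensure(catalog, "jsActions");
    return actions !== null && actions.size > 0;
  } catch (ex) {
    return false;
  }
}
```

With a fresh `FakeCatalog`, `hasJSActionsNaive` rejects with the stand-in exception, whereas `hasJSActions` loads the data on the first failure and then resolves.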

Please note: While I've not yet seen this error in an actual PDF document, it can happen during loading if you're unlucky enough with e.g. the structure of the PDF document and/or the download speed offered by the server.
This patch is thus based on code-inspection and on manually throwing a MissingDataException on the first access of Catalog.jsActions to simulate this situation.
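
The kind of simulation described above can be approximated with a small fault-injection helper. This is only a sketch under stated assumptions: `throwOnFirstAccess` is a hypothetical helper, `MissingDataException` is a local stand-in, and the plain `catalog` object merely imitates something with a synchronous `jsActions` getter defined as an own property.

```javascript
// Local stand-in for the PDF.js exception type (assumption).
class MissingDataException extends Error {
  constructor(begin, end) {
    super(`Missing data [${begin}, ${end})`);
    this.begin = begin;
    this.end = end;
  }
}

// Hypothetical helper: redefine the own property `prop` on `obj` so that the
// first read throws a MissingDataException and later reads return the
// original value.
function throwOnFirstAccess(obj, prop) {
  const original = Object.getOwnPropertyDescriptor(obj, prop);
  let firstAccess = true;
  Object.defineProperty(obj, prop, {
    configurable: true,
    get() {
      if (firstAccess) {
        firstAccess = false;
        throw new MissingDataException(0, 1);
      }
      return original.get ? original.get.call(this) : original.value;
    },
  });
}

// Imitation of an object with a synchronous getter.
const catalog = {
  get jsActions() {
    return null; // i.e. a document without any JavaScript actions
  },
};
throwOnFirstAccess(catalog, "jsActions");
```

After this setup, the first read of `catalog.jsActions` throws and every later read returns the original value, which mimics data that becomes available on a retry.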

Finally, this patch adds a couple of API unit-tests for this (since none existed).

…Object

Given that this is only an internal helper method, used by the `Catalog.{javaScript, jsActions}` getters, this change simplifies iteration of the returned data.
We can also (slightly) re-factor the code of the `jsActions` getter, and remove an obsolete[1] JSDoc-comment from the `openAction` getter.

---
[1] Not really relevant now that we've got proper scripting support.
@Snuffleupagus
Collaborator Author

/botio test

@pdfjsbot

From: Bot.io (Windows)


Received

Command cmd_test from @Snuffleupagus received. Current queue size: 0

Live output at: http://3.101.106.178:8877/81dbe4b3194315a/output.txt

@pdfjsbot

From: Bot.io (Linux m4)


Received

Command cmd_test from @Snuffleupagus received. Current queue size: 0

Live output at: http://54.67.70.0:8877/17bafef863cfe8e/output.txt

@pdfjsbot

From: Bot.io (Linux m4)


Failed

Full output at http://54.67.70.0:8877/17bafef863cfe8e/output.txt

Total script time: 25.09 mins

  • Font tests: Passed
  • Unit tests: Passed
  • Integration Tests: FAILED
  • Regression tests: FAILED

Image differences available at: http://54.67.70.0:8877/17bafef863cfe8e/reftest-analyzer.html#web=eq.log

@pdfjsbot

From: Bot.io (Windows)


Failed

Full output at http://3.101.106.178:8877/81dbe4b3194315a/output.txt

Total script time: 28.58 mins

  • Font tests: Passed
  • Unit tests: Passed
  • Integration Tests: Passed
  • Regression tests: FAILED

Image differences available at: http://3.101.106.178:8877/81dbe4b3194315a/reftest-analyzer.html#web=eq.log

@timvandermeij timvandermeij merged commit ebeb3f7 into mozilla:master Apr 13, 2021
@timvandermeij
Contributor

Nice find, and thanks for adding more unit test coverage!

@Snuffleupagus Snuffleupagus deleted the hasJSActions-MissingDataException branch April 14, 2021 08:34