Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[api-minor] Add a new transferPdfData option to allow transferring more data to the worker-thread (bug 1809164) #15908

Merged
merged 1 commit into from Jan 11, 2023

Conversation

Snuffleupagus
Copy link
Collaborator

@Snuffleupagus Snuffleupagus commented Jan 9, 2023

Also, removes the initialData-parameter JSDocs for the getDocument-function given that this parameter has been completely unused since PR #8982 (over five years ago). Note that the initialData-parameter is, and always was, intended to be provided when initializing a PDFDataRangeTransport-instance.

src/display/api.js Outdated Show resolved Hide resolved
@mozilla mozilla deleted a comment from pdfjsbot Jan 9, 2023
@mozilla mozilla deleted a comment from pdfjsbot Jan 9, 2023
@mozilla mozilla deleted a comment from pdfjsbot Jan 9, 2023
@mozilla mozilla deleted a comment from pdfjsbot Jan 9, 2023
@Snuffleupagus Snuffleupagus changed the title [api-minor] Add a new takeOwnershipOfData option to allow transferring more data to the worker-thread (bug 1809164) [api-minor] Add a new transferBinaryData option to allow transferring more data to the worker-thread (bug 1809164) Jan 9, 2023
@mozilla mozilla deleted a comment from pdfjsbot Jan 9, 2023
@mozilla mozilla deleted a comment from pdfjsbot Jan 9, 2023
@mozilla mozilla deleted a comment from pdfjsbot Jan 9, 2023
@mozilla mozilla deleted a comment from pdfjsbot Jan 9, 2023
src/display/api.js Outdated Show resolved Hide resolved
@Snuffleupagus Snuffleupagus force-pushed the bug-1809164 branch 2 times, most recently from 3a25528 to 58ee701 Compare January 9, 2023 21:36
@mozilla mozilla deleted a comment from pdfjsbot Jan 9, 2023
@mozilla mozilla deleted a comment from pdfjsbot Jan 9, 2023
@mozilla mozilla deleted a comment from pdfjsbot Jan 9, 2023
@mozilla mozilla deleted a comment from pdfjsbot Jan 9, 2023
@Snuffleupagus
Copy link
Collaborator Author

/botio test

@pdfjsbot
Copy link

From: Bot.io (Linux m4)


Received

Command cmd_test from @Snuffleupagus received. Current queue size: 0

Live output at: http://54.241.84.105:8877/899d98324e41b51/output.txt

@pdfjsbot
Copy link

From: Bot.io (Windows)


Received

Command cmd_test from @Snuffleupagus received. Current queue size: 0

Live output at: http://54.193.163.58:8877/fd36afce3106dd0/output.txt

@pdfjsbot
Copy link

From: Bot.io (Linux m4)


Failed

Full output at http://54.241.84.105:8877/899d98324e41b51/output.txt

Total script time: 25.90 mins

  • Font tests: Passed
  • Unit tests: Passed
  • Integration Tests: Passed
  • Regression tests: FAILED
  different ref/snapshot: 6
  different first/second rendering: 1

Image differences available at: http://54.241.84.105:8877/899d98324e41b51/reftest-analyzer.html#web=eq.log

@pdfjsbot
Copy link

From: Bot.io (Windows)


Failed

Full output at http://54.193.163.58:8877/fd36afce3106dd0/output.txt

Total script time: 32.31 mins

  • Font tests: Passed
  • Unit tests: Passed
  • Integration Tests: FAILED
  • Regression tests: FAILED
  different ref/snapshot: 19

Image differences available at: http://54.193.163.58:8877/fd36afce3106dd0/reftest-analyzer.html#web=eq.log

@Snuffleupagus Snuffleupagus marked this pull request as ready for review January 10, 2023 12:17
@Snuffleupagus
Copy link
Collaborator Author

Hopefully this should work correctly, and there's now basic unit-tests for all the relevant code-paths.

@calixteman
Copy link
Contributor

What about:

const buffer = new Uint8Array(args.chunk).buffer;
?

@Snuffleupagus
Copy link
Collaborator Author

What about:

const buffer = new Uint8Array(args.chunk).buffer;

There's somewhat similar code found in all of the src/display/fetch_stream.js, src/display/network.js, and src/display/node_stream.js files, w.r.t. basically copying a TypedArray when reviving data, and it's not at all clear to me that it'd be generally safe to start changing that.


Even if we only change the mentioned src/display/transport_stream.js code, since that's probably the only thing that's generally safe, that code has quite limited test-coverage and I'm a bit worried about breaking the Firefox PDF Viewer in some weird edge-case.
Ninja-edit: Although looking the following code, I suppose that this should be safe in Firefox?

@calixteman
Copy link
Contributor

In looking at:
https://searchfox.org/mozilla-central/rev/a156a65ced2dae5913ae35a68e9445b8ee7ca457/toolkit/components/pdfjs/content/PdfStreamConverter.jsm
I noticed that all the postMessage calls don't use the 3rd argument for transferable.
So right now, it means that we get a copy of the data in pdf.js side.
I do wonder if we should it: I think it should be safe (note that I don't have a strong certainty), wdyt ?

Anyway I'm fine to take the risk to break some corner cases: if something is broken it's an opportunity to improve and add some new tests so it isn't that bad :).

Even the initial goal of the bug I filed was mainly to fix a memleak, I think it's not that bad to improve that stuff here: it could help to win few 0.01% of battery on mobile.

…more data to the worker-thread (bug 1809164)

Also, removes the `initialData`-parameter JSDocs for the `getDocument`-function given that this parameter has been completely unused since PR 8982 (over five years ago). Note that the `initialData`-parameter is, and always was, intended to be provided when initializing a `PDFDataRangeTransport`-instance.
@Snuffleupagus Snuffleupagus changed the title [api-minor] Add a new transferBinaryData option to allow transferring more data to the worker-thread (bug 1809164) [api-minor] Add a new transferPdfData option to allow transferring more data to the worker-thread (bug 1809164) Jan 10, 2023
@mozilla mozilla deleted a comment from pdfjsbot Jan 10, 2023
@mozilla mozilla deleted a comment from pdfjsbot Jan 10, 2023
@mozilla mozilla deleted a comment from pdfjsbot Jan 10, 2023
@mozilla mozilla deleted a comment from pdfjsbot Jan 10, 2023
@pdfjsbot
Copy link

From: Bot.io (Linux m4)


Received

Command cmd_unittest from @Snuffleupagus received. Current queue size: 0

Live output at: http://54.241.84.105:8877/34a66e107515721/output.txt

@pdfjsbot
Copy link

From: Bot.io (Windows)


Received

Command cmd_unittest from @Snuffleupagus received. Current queue size: 0

Live output at: http://54.193.163.58:8877/98b7e48ea477de2/output.txt

@pdfjsbot
Copy link

From: Bot.io (Linux m4)


Success

Full output at http://54.241.84.105:8877/34a66e107515721/output.txt

Total script time: 2.44 mins

  • Unit Tests: Passed

@pdfjsbot
Copy link

From: Bot.io (Windows)


Success

Full output at http://54.193.163.58:8877/98b7e48ea477de2/output.txt

Total script time: 7.02 mins

  • Unit Tests: Passed

Copy link
Contributor

@calixteman calixteman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Thank you.

@Snuffleupagus Snuffleupagus merged commit f28bf23 into mozilla:master Jan 11, 2023
@Snuffleupagus Snuffleupagus deleted the bug-1809164 branch January 11, 2023 16:59
@Snuffleupagus
Copy link
Collaborator Author

I noticed that all the postMessage calls don't use the 3rd argument for transferable.
So right now, it means that we get a copy of the data in pdf.js side.
I do wonder if we should it: I think it should be safe (note that I don't have a strong certainty), wdyt ?

For the data that's coming directly from the PdfStreamConverter.jsm file it really ought to be safe, given how that data is being read:

For the data that's coming from the PdfJsNetwork.jsm file, I suppose that it really depends on how the browser XMLHttpRequest-implementation works. Assuming that multiple onreadystatechange calls don't share one underlying ArrayBuffer, transferring this data ought to be safe as well.

Maybe we can just try updating the relevant postMessage calls with a transfers-argument, and test if that works?
However, we probably want to land this PR in mozilla-central first to not have all changes in the same Nightly-version.

@calixteman
Copy link
Contributor

As far as I can tell, it'd be very strange to not have full ownership of the ArrayBuffer returned by xhr because it'd mean that the data could change if not read/copy immediately and in such a case I'd expected to have a big warning in the MDN doc.
And yes, we should change that stuff in 111 (it should be next week iirc).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants