Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve the image quality of thumbnails rendered by PDFThumbnailView.draw (issue 8233) #13197

Merged
merged 3 commits into from Apr 9, 2021

Conversation

Snuffleupagus
Copy link
Collaborator

@Snuffleupagus Snuffleupagus commented Apr 7, 2021

The reason for the fairly large discrepancy, in the thumbnail quality, between the draw/setImage-methods is that in the former case we directly render the thumbnails at the final size that they'll appear at in the sidebar. In the latter case, we instead downsize the (generally) much larger "regular" pages.

To address this, I'm thus proposing that we let PDFThumbnailView.draw render thumbnails at twice their intended size and then downsize them to the final size.
Obviously this will increase peak memory usage during thumbnail rendering in PDFThumbnailView.draw, since doubling the width/height of a canvas will lead to its pixel-count increasing by a factor of 4. Furthermore, since you need four components per pixel (given that it's RGBA-data), this will thus lead to the temporary thumbnail canvas-sizes increasing by a factor of 16 during rendering. Hence why rendering thumbnails at their "original" scale, i.e. using something like PDFPageProxy.getViewport({ scale: 1 });, would be an absolutely terrible idea!

To reduce the size and scope of these changes, I've tried to re-factor and re-use as much of the existing downsizing-implementation already present in PDFThumbnailView as possible.

While this will generally not make thumbnails rendered by PDFThumbnailView.draw look identical to those based on the rendered pages (via PDFThumbnailView.setImage), it's a considerable improvement as far as I'm concerned and enough to call the issue fixed.

Please note: This patch will not lead to any additional overhead, in either memory usage or parsing, for thumbnails which are based on the rendered pages.

Fixes #8233

@Snuffleupagus Snuffleupagus force-pushed the issue-8233 branch 4 times, most recently from f967883 to 3c1024b Compare April 7, 2021 17:59
@Snuffleupagus Snuffleupagus force-pushed the issue-8233 branch 5 times, most recently from 4853a18 to 7ec5d08 Compare April 8, 2021 09:55
…riables

These properties are always updated/used together, and there's no other methods which depend on just one of them, hence they're changed into local variables instead.
Looking through the history of this code, it seems they were converted *from* local variables and to properties all the way back in PR 2914; however as far as I can tell from that diff it doesn't seem to have been necessary even back then!?
…oving the thumbnail

A loop is less efficient than just overwriting the content, which is what we've generally been using (for years) in other parts of the code-base (see e.g. `BaseViewer` and `PDFThumbnailViewer`).
@Snuffleupagus Snuffleupagus force-pushed the issue-8233 branch 2 times, most recently from 7e58ef0 to f4fe49d Compare April 8, 2021 11:44
@Snuffleupagus Snuffleupagus marked this pull request as ready for review April 8, 2021 11:48
@Snuffleupagus
Copy link
Collaborator Author

I needed some time to think through all of the code-paths involved here (hence why it's submitted as a "draft"), and hopefully this patch now covers all the edge-cases involved here. As mentioned above this isn't perfect, but I hope it's deemed good enough since we need to consider the memory usage as well.

/botio-linux preview

@pdfjsbot
Copy link

pdfjsbot commented Apr 8, 2021

From: Bot.io (Linux m4)


Received

Command cmd_preview from @Snuffleupagus received. Current queue size: 0

Live output at: http://54.67.70.0:8877/e2dc3703560a623/output.txt

@pdfjsbot
Copy link

pdfjsbot commented Apr 8, 2021

From: Bot.io (Linux m4)


Success

Full output at http://54.67.70.0:8877/e2dc3703560a623/output.txt

Total script time: 4.43 mins

Published

….draw` (issue 8233)

The reason for the fairly large discrepancy, in the thumbnail quality, between the `draw`/`setImage`-methods is that in the former case we *directly* render the thumbnails at the final size that they'll appear at in the sidebar. In the latter case, we instead downsize the (generally) much larger "regular" pages.

To address this, I'm thus proposing that we let `PDFThumbnailView.draw` render thumbnails at *twice* their intended size and then downsize them to the final size.
Obviously this will increase *peak* memory usage during thumbnail rendering in `PDFThumbnailView.draw`, since doubling the width/height of a `canvas` will lead to its pixel-count increasing by a factor of `4`. Furthermore, since you need four components per pixel (given that it's RGBA-data), this will thus lead to the *temporary* thumbnail `canvas`-sizes increasing by a factor of `16` during rendering. Hence why rendering thumbnails at their "original" scale, i.e. using something like `PDFPageProxy.getViewport({ scale: 1 });`, would be an absolutely terrible idea!

To reduce the size and scope of these changes, I've tried to re-factor and re-use as much of the existing downsizing-implementation already present in `PDFThumbnailView` as possible.

While this will generally *not* make thumbnails rendered by `PDFThumbnailView.draw` look *identical* to those based on the rendered pages (via `PDFThumbnailView.setImage`), it's a considerable improvement as far as I'm concerned and enough to call the issue fixed.

*Please note:* This patch will not lead to *any* additional overhead, in either memory usage or parsing, for thumbnails which are based on the rendered pages.
@Snuffleupagus
Copy link
Collaborator Author

Snuffleupagus commented Apr 8, 2021

Off-topic for this PR, but: I'm wondering if we should simply remove

* @property {boolean} [disableCanvasToImageConversion] - Don't convert the
* canvas thumbnails to images. This prevents `toDataURL` calls, but
* increases the overall memory usage. The default value is `false`.
since it is/was never actually used and we don't even expose a way to enable it (e.g. through AppOptions) besides manually editing the code?

In cases where access to canvas-data is blocked, often done to reduce fingerprinting, we'll have more general rendering issues (affecting pages) given some of the canvas-accesses in e.g. src/display/canvas.js and src/display/text_layer.js.

@timvandermeij timvandermeij merged commit f5e973d into mozilla:master Apr 9, 2021
@timvandermeij
Copy link
Contributor

Looks better to me too. Thank you for looking into this!

Regarding disableCanvasToImageConversion, I believe this was introduced indeed for avoiding canvas fingerprinting, in particular in the Tor browser (Firefox fork), but I don't know if it was actually used. All I can find is #7026.

@Snuffleupagus Snuffleupagus deleted the issue-8233 branch April 9, 2021 20:15
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Thumbnail downsampling is bad if thumbnail is rendered before full view of that page
3 participants