Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve performance of arrayBufferToBinaryString #3121

Merged
Changes from 1 commit
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Jump to
Jump to file
Failed to load files.
Diff view
Diff view
36 changes: 24 additions & 12 deletions src/modules/addimage.js
Original file line number Diff line number Diff line change
Expand Up @@ -43,6 +43,10 @@ import { atob, btoa } from "../libs/AtobBtoa.js";

var UNKNOWN = "UNKNOWN";

// Heuristic limit after which String.fromCharCode will start to overflow.
// Need to change to StringDecoder.
var ARRAY_BUFFER_LIMIT = 6000000;

Copy link
Contributor

@101arrowz 101arrowz Mar 24, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This figure doesn't seem right. The max frame size is 1MB to begin with, so 6MB is too big for sure. On my device, the limit is around 100kB, but I'm pretty sure going for around 16K is safe. Could you verify this in the console?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems a bit large for me, too. 16k sounds good.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some benchmarks from different batch sizes: https://jsbench.me/x5kmnswnmz/1. Sweet spot at least for Chromium 89 seems to be at 4096-8192 slight degradation at 16k but slowing down rapidly after that. Manual testing shows minor GCs going to major GC.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Then let's take 8k? :)

var imageFileTypeHeaders = {
PNG: [[0x89, 0x50, 0x4e, 0x47]],
TIFF: [
Expand Down Expand Up @@ -730,20 +734,28 @@ import { atob, btoa } from "../libs/AtobBtoa.js";
var arrayBufferToBinaryString = (jsPDFAPI.__addimage__.arrayBufferToBinaryString = function(
buffer
) {
try {
return atob(btoa(String.fromCharCode.apply(null, buffer)));
} catch (e) {
if (
typeof Uint8Array !== "undefined" &&
typeof Uint8Array.prototype.reduce !== "undefined"
) {
return new Uint8Array(buffer)
.reduce(function(data, byte) {
return data.push(String.fromCharCode(byte)), data;
}, [])
.join("");
// Skip direct conversion as it might take many hundred milliseconds before throwing.
if (buffer.length < ARRAY_BUFFER_LIMIT) {
try {
return atob(btoa(String.fromCharCode.apply(null, buffer)));
} catch (e) {
// Buffer was overflown, fall back to other methods
}
Copy link
Contributor

@101arrowz 101arrowz Mar 24, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this works (or at least it shouldn't if the second argument is an ArrayBuffer). ArrayBuffer cannot be indexed and therefore cannot be used in .apply. I'm pretty sure you need to convert to Uint8Array first.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, I know atob(btoa(str)) was in the previous code, but I don't think atob(btoa(str)) is any different from str. Is there an example that differs?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Was wondering about that as well.

}
// Text decoder is much faster than string concatenation or reduce/join.
if (typeof TextDecoder !== "undefined") {
var decoder = new TextDecoder("ascii");
return decoder.decode(buffer);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, for binary strings, this is not going to work. ASCII only covers 127 code points, i.e. 0x00 through 0x7F. We need 0x00 through 0xFF (the entire byte range). For example, this fails:

const inputBuffer = new Uint8Array([252, 129, 187, 147]);
const binStr = new TextDecoder('ascii').decode(inputBuffer.buffer);
console.log(binStr.split('').map(str => str.charCodeAt(0)));
// [252, 129, 187, 8220]

As it turns out, TextDecoder doesn't work well for binary string encoding, and this case should probably be removed entirely.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure how this passed the test cases actually...Any ideas? Since that simple example yielded malformed output I would expect something to have broken.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure how it passed library tests but it passed my own visual validation because the color palette was so small. In the end I did find it produced malformed data as some grays were going to green.
For some reason I had in mind ascii was 8 bit as well. And after some research TextDecoder still isn't a choice with any given text encoding to replace fromCharCode...

} else if (
typeof Uint8Array !== "undefined" &&
typeof Uint8Array.prototype.reduce !== "undefined"
) {
return new Uint8Array(buffer)
.reduce(function(data, byte) {
return data.push(String.fromCharCode(byte)), data;
}, [])
.join("");
}
Pantura marked this conversation as resolved.
Show resolved Hide resolved
});

/**
Expand Down