buffer: improve fill & normalizeEncoding performance #18790

Closed
BridgeAR wants to merge 5 commits from the buffer-fill branch

@BridgeAR BridgeAR commented Feb 15, 2018

This improves the performance of Buffer#fill and of normalizeEncoding. The latter focuses on the common cases as can be seen in the benchmarks.

I made the Buffer.isEncoding() implementation stricter again after it was loosened in #7207. It no longer returns true for an empty string.
normalizeEncoding is now stricter as well: it returns undefined for false, NaN and 0.
undefined, null and '' are still "valid" utf8 encodings.
Buffer#fill now also throws an OOB error if end is negative. This makes it consistent with start and helps to identify issues, since a negative end was previously just ignored.
If the OOB is detected in C++, Buffer#fill now throws the error from JS, so that it carries the proper error code.
Buffer#fill now also accepts null as a valid utf8 encoding when a string is provided. That was not the case before, but null is accepted in other places, so this is more consistent.
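
A minimal sketch of the behavior changes listed above, assuming a Node.js release that includes this PR. The helper name `describeFillChanges` is made up for illustration. The exact error name and code are only printed, not relied upon, since they may differ between releases.

```javascript
// Hypothetical helper that probes the behavior described in this PR.
// Assumes a Node.js version that contains these changes.
function describeFillChanges() {
  const result = {
    // Empty strings are no longer reported as a valid encoding.
    emptyIsEncoding: Buffer.isEncoding(''),
    // Regular encoding names are still accepted, independent of their case.
    upperIsEncoding: Buffer.isEncoding('UTF8'),
    // `null` as encoding is treated like 'utf8' when filling with a string.
    nullEncoding: Buffer.alloc(4).fill('a', 0, 4, null).toString(),
    negativeEnd: null
  };

  try {
    // A negative `end` is expected to throw instead of being ignored.
    Buffer.alloc(4).fill('a', 0, -1);
    result.negativeEnd = 'no error';
  } catch (err) {
    result.negativeEnd = `${err.name} (${err.code})`;
  }
  return result;
}

console.log(describeFillChanges());
```

On a release without this PR, `negativeEnd` would be expected to read `'no error'`, which makes the snippet a quick way to check which behavior a given runtime has.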

Buffer#fill performance
                                                                          confidence improvement accuracy (*)   (**)   (***)
 buffers/buffer-fill.js n=20000 size=10 type='fill("")'                          ***     16.25 %       ±5.16% ±6.89%  ±9.01%
 buffers/buffer-fill.js n=20000 size=10 type='fill("t", "utf8")'                 ***     17.49 %       ±3.59% ±4.78%  ±6.23%
 buffers/buffer-fill.js n=20000 size=10 type='fill("t", 0, "utf8")'              ***     14.82 %       ±4.17% ±5.55%  ±7.23%
 buffers/buffer-fill.js n=20000 size=10 type='fill("t", 0)'                      ***     21.59 %       ±4.88% ±6.50%  ±8.47%
 buffers/buffer-fill.js n=20000 size=10 type='fill("t")'                         ***     23.16 %       ±3.88% ±5.17%  ±6.73%
 buffers/buffer-fill.js n=20000 size=10 type='fill("test")'                      ***     12.85 %       ±3.91% ±5.24%  ±6.91%
 buffers/buffer-fill.js n=20000 size=10 type='fill(0)'                           ***     22.39 %       ±2.48% ±3.30%  ±4.30%
 buffers/buffer-fill.js n=20000 size=10 type='fill(100)'                         ***     24.41 %       ±4.52% ±6.02%  ±7.85%
 buffers/buffer-fill.js n=20000 size=10 type='fill(400)'                         ***     22.00 %       ±2.27% ±3.03%  ±3.95%
 buffers/buffer-fill.js n=20000 size=10 type='fill(Buffer.alloc(1), 0)'          ***      8.60 %       ±3.85% ±5.17%  ±6.84%
 buffers/buffer-fill.js n=20000 size=5000 type='fill("")'                        ***     22.84 %       ±6.16% ±8.20% ±10.68%
 buffers/buffer-fill.js n=20000 size=5000 type='fill("t", "utf8")'               ***      8.74 %       ±4.89% ±6.57%  ±8.67%
 buffers/buffer-fill.js n=20000 size=5000 type='fill("t", 0, "utf8")'            ***     10.69 %       ±4.94% ±6.60%  ±8.63%
 buffers/buffer-fill.js n=20000 size=5000 type='fill("t", 0)'                    ***     14.17 %       ±4.13% ±5.51%  ±7.20%
 buffers/buffer-fill.js n=20000 size=5000 type='fill("t")'                       ***     21.25 %       ±3.83% ±5.10%  ±6.65%
 buffers/buffer-fill.js n=20000 size=5000 type='fill("test")'                    ***     16.50 %       ±2.03% ±2.70%  ±3.52%
 buffers/buffer-fill.js n=20000 size=5000 type='fill(0)'                         ***     29.73 %       ±4.40% ±5.89%  ±7.72%
 buffers/buffer-fill.js n=20000 size=5000 type='fill(100)'                       ***     15.48 %       ±2.54% ±3.38%  ±4.40%
 buffers/buffer-fill.js n=20000 size=5000 type='fill(400)'                       ***     16.19 %       ±2.62% ±3.49%  ±4.55%
 buffers/buffer-fill.js n=20000 size=5000 type='fill(Buffer.alloc(1), 0)'         **      4.36 %       ±2.79% ±3.74%  ±4.90%
Normalize encoding performance
                                                                    confidence improvement accuracy (*)    (**)   (***)
 buffers/buffer-normalize-encoding.js n=1000000 encoding='ascii'           ***     43.44 %       ±5.20% ±7.00%  ±9.28%
 buffers/buffer-normalize-encoding.js n=1000000 encoding='ASCII'           ***    136.89 %       ±1.27% ±1.69%  ±2.20%
 buffers/buffer-normalize-encoding.js n=1000000 encoding='base64'          ***     66.18 %       ±2.95% ±3.93%  ±5.13%
 buffers/buffer-normalize-encoding.js n=1000000 encoding='BASE64'          ***    150.70 %       ±2.20% ±2.94%  ±3.88%
 buffers/buffer-normalize-encoding.js n=1000000 encoding='binary'           **      4.80 %       ±3.38% ±4.50%  ±5.86%
 buffers/buffer-normalize-encoding.js n=1000000 encoding='BINARY'          ***     81.52 %       ±2.15% ±2.87%  ±3.73%
 buffers/buffer-normalize-encoding.js n=1000000 encoding='hex'             ***     57.24 %       ±6.14% ±8.23% ±10.84%
 buffers/buffer-normalize-encoding.js n=1000000 encoding='HEX'             ***    209.63 %       ±3.27% ±4.35%  ±5.67%
 buffers/buffer-normalize-encoding.js n=1000000 encoding='latin1'          ***      7.58 %       ±1.59% ±2.12%  ±2.80%
 buffers/buffer-normalize-encoding.js n=1000000 encoding='LATIN1'          ***     99.93 %       ±2.00% ±2.69%  ±3.55%
 buffers/buffer-normalize-encoding.js n=1000000 encoding='ucs-2'           ***    -13.48 %       ±1.73% ±2.32%  ±3.06%
 buffers/buffer-normalize-encoding.js n=1000000 encoding='UCS-2'           ***     70.29 %       ±1.09% ±1.46%  ±1.91%
 buffers/buffer-normalize-encoding.js n=1000000 encoding='ucs2'            ***    -21.51 %       ±3.98% ±5.30%  ±6.92%
 buffers/buffer-normalize-encoding.js n=1000000 encoding='UCS2'            ***    101.15 %       ±4.36% ±5.84%  ±7.68%
 buffers/buffer-normalize-encoding.js n=1000000 encoding='utf-16le'        ***     12.98 %       ±4.21% ±5.60%  ±7.30%
 buffers/buffer-normalize-encoding.js n=1000000 encoding='UTF-16LE'        ***    159.48 %       ±2.72% ±3.65%  ±4.79%
 buffers/buffer-normalize-encoding.js n=1000000 encoding='utf-8'           ***      7.24 %       ±3.28% ±4.41%  ±5.84%
 buffers/buffer-normalize-encoding.js n=1000000 encoding='UTF-8'           ***    102.20 %       ±2.58% ±3.45%  ±4.53%
 buffers/buffer-normalize-encoding.js n=1000000 encoding='utf16le'         ***     10.11 %       ±1.42% ±1.90%  ±2.48%
 buffers/buffer-normalize-encoding.js n=1000000 encoding='UTF16LE'         ***    148.52 %       ±3.42% ±4.57%  ±6.01%
 buffers/buffer-normalize-encoding.js n=1000000 encoding='utf8'            ***     14.78 %       ±3.00% ±3.99%  ±5.21%
 buffers/buffer-normalize-encoding.js n=1000000 encoding='UTF8'            ***    153.65 %       ±3.99% ±5.31%  ±6.91%
Checklist
  • make -j4 test (UNIX), or vcbuild test (Windows) passes
  • tests and/or benchmarks are included
  • documentation is changed or added
  • commit message follows commit guidelines
Affected core subsystem(s)

buffer

@BridgeAR BridgeAR added the semver-major PRs that contain breaking changes and should be released in the next major version. label Feb 15, 2018
@nodejs-github-bot nodejs-github-bot added c++ Issues and PRs that require attention from people who are familiar with C++. lib / src Issues and PRs related to general changes in the lib or src directory. labels Feb 15, 2018
@BridgeAR BridgeAR force-pushed the buffer-fill branch 3 times, most recently from b2f5427 to f082114 Compare February 15, 2018 03:40
Member

@jasnell jasnell left a comment

LGTM with a good CITGM run.

@@ -1228,6 +1228,9 @@ console.log(buf1.equals(buf3));
<!-- YAML
added: v0.5.0
changes:
- version: REPLACEME
pr-url: https://github.com/nodejs/node/pull/REPLACEME
description: Negative `end` values throw an `ERR_INDEX_OF_OUT_BOUNDS` error.
Contributor

ERR_INDEX_OUT_OF_RANGE?

@vsemozhetbyt vsemozhetbyt added the performance Issues and PRs related to the performance of Node.js. label Feb 15, 2018
Member

@benjamingr benjamingr left a comment

Nice work!

I think it would be interesting to add fill with larger values to the benchmark.

Actual changes LGTM

}

function slowCases(enc) {
switch (enc.length) {
Member

I'm surprised this improves performance noticeably.
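
For readers who only see the two-line excerpt above, here is a simplified sketch of what switching on `enc.length` can look like. It is not the code from this PR: the function names are hypothetical and only a few encodings are covered. One plausible reason for the speedup is that a single integer comparison rules out most candidates, and `toLowerCase()` only runs when the exact-match checks fail.

```javascript
// Simplified sketch, NOT the code from this PR: only a subset of encodings.
function normalizeEncodingSketch(enc) {
  // Fast path for the most common inputs.
  if (enc == null || enc === 'utf8' || enc === 'utf-8') return 'utf8';
  return slowCasesSketch(enc);
}

function slowCasesSketch(enc) {
  // Non-strings such as false, NaN and 0 have no length, so they fall
  // through to the default branch and end up as undefined.
  switch (enc.length) {
    case 3:
      if (enc === 'hex' || enc === 'HEX' ||
          `${enc}`.toLowerCase() === 'hex') return 'hex';
      break;
    case 4:
      if (enc === 'UTF8') return 'utf8';
      if (enc === 'ucs2' || enc === 'UCS2') return 'utf16le';
      enc = `${enc}`.toLowerCase();
      if (enc === 'utf8') return 'utf8';
      if (enc === 'ucs2') return 'utf16le';
      break;
    case 5:
      if (enc === 'UTF-8') return 'utf8';
      if (enc === 'ascii' || enc === 'ASCII') return 'ascii';
      enc = `${enc}`.toLowerCase();
      if (enc === 'utf-8') return 'utf8';
      if (enc === 'ascii') return 'ascii';
      break;
    default:
      // The empty string is still treated as utf8.
      if (enc === '') return 'utf8';
  }
  return undefined;
}

console.log(normalizeEncodingSketch('Utf8'), normalizeEncodingSketch(false));
```

The exact-match checks for the all-lowercase and all-uppercase spellings come first, so mixed-case input is the only path that pays for the string conversion.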

@BridgeAR
Member Author

@benjamingr the performance gain for Buffer#fill shrinks as the buffer grows. The break-even point should be at about 25 kB. Above that, the filling itself becomes the main time consumer.

@benjamingr
Member

@BridgeAR right, but since we're adding a benchmark that will run when this code changes for a while, I think there is value in adding a larger buffer for the test case - I suspect allocating large'ish buffers is a pretty common use case. Even if there won't be a big difference here.
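
To get a rough feel for how the per-call cost scales with buffer size, a timing loop like the following can be used. This is only a sketch: `timeFill` is a hypothetical helper, it is not the benchmark harness that produced the numbers in this thread, and the results vary by machine and Node.js version.

```javascript
// Rough timing sketch, not the harness used for the numbers in this PR.
// `timeFill` is a hypothetical helper; results vary by machine and version.
function timeFill(size, iterations) {
  const buf = Buffer.allocUnsafe(size);
  const start = process.hrtime.bigint();
  for (let i = 0; i < iterations; i++) {
    buf.fill(0);
  }
  const elapsed = process.hrtime.bigint() - start;
  // Average nanoseconds per call.
  return Number(elapsed) / iterations;
}

for (const size of [10, 5000, 16384, 32768, 65536]) {
  console.log(`size=${size}: ${timeFill(size, 20000).toFixed(1)} ns per fill(0)`);
}
```

A single run like this has no statistical treatment, so it should only be read as an indication of the trend, not as a replacement for a proper benchmark comparison.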

@BridgeAR
Member Author

                                                                           confidence improvement accuracy (*)    (**)   (***)
 buffers/buffer-fill.js n=20000 size=16384 type='fill("")'                          *      7.97 %       ±7.19%  ±9.57% ±12.45%
 buffers/buffer-fill.js n=20000 size=16384 type='fill("t", "utf8")'                **     11.49 %       ±7.88% ±10.49% ±13.67%
 buffers/buffer-fill.js n=20000 size=16384 type='fill("t", 0, "utf8")'            ***     12.83 %       ±6.09%  ±8.15% ±10.70%
 buffers/buffer-fill.js n=20000 size=16384 type='fill("t", 0)'                    ***     16.12 %       ±6.62%  ±8.87% ±11.66%
 buffers/buffer-fill.js n=20000 size=16384 type='fill("t")'                       ***     13.09 %       ±5.81%  ±7.76% ±10.17%
 buffers/buffer-fill.js n=20000 size=16384 type='fill("test")'                     **      7.51 %       ±4.88%  ±6.50%  ±8.47%
 buffers/buffer-fill.js n=20000 size=16384 type='fill(0)'                         ***     16.09 %       ±7.95% ±10.60% ±13.86%
 buffers/buffer-fill.js n=20000 size=16384 type='fill(100)'                       ***     18.08 %       ±8.67% ±11.56% ±15.07%
 buffers/buffer-fill.js n=20000 size=16384 type='fill(400)'                       ***     16.70 %       ±7.64% ±10.17% ±13.25%
 buffers/buffer-fill.js n=20000 size=16384 type='fill(Buffer.alloc(1), 0)'                 1.10 %       ±7.26%  ±9.65% ±12.57%
 buffers/buffer-fill.js n=20000 size=32768 type='fill("")'                                 3.66 %       ±4.81% ±6.45% ±8.48%
 buffers/buffer-fill.js n=20000 size=32768 type='fill("t", "utf8")'                **      8.65 %       ±5.08% ±6.76% ±8.81%
 buffers/buffer-fill.js n=20000 size=32768 type='fill("t", 0, "utf8")'             **      4.75 %       ±3.31% ±4.41% ±5.76%
 buffers/buffer-fill.js n=20000 size=32768 type='fill("t", 0)'                             2.65 %       ±4.13% ±5.50% ±7.17%
 buffers/buffer-fill.js n=20000 size=32768 type='fill("t")'                         *      5.92 %       ±5.14% ±6.90% ±9.10%
 buffers/buffer-fill.js n=20000 size=32768 type='fill("test")'                             3.32 %       ±4.13% ±5.50% ±7.18%
 buffers/buffer-fill.js n=20000 size=32768 type='fill(0)'                                  3.52 %       ±4.73% ±6.29% ±8.19%
 buffers/buffer-fill.js n=20000 size=32768 type='fill(100)'                               -0.48 %       ±3.06% ±4.07% ±5.31%
 buffers/buffer-fill.js n=20000 size=32768 type='fill(400)'                         *      4.06 %       ±4.01% ±5.34% ±6.95%
 buffers/buffer-fill.js n=20000 size=32768 type='fill(Buffer.alloc(1), 0)'                -1.73 %       ±3.65% ±4.88% ±6.38%
 buffers/buffer-fill.js n=20000 size=65536 type='fill("")'                          *      1.77 %       ±1.33% ±1.78% ±2.33%
 buffers/buffer-fill.js n=20000 size=65536 type='fill("t", "utf8")'                        0.82 %       ±1.31% ±1.75% ±2.28%
 buffers/buffer-fill.js n=20000 size=65536 type='fill("t", 0, "utf8")'                    -0.97 %       ±1.56% ±2.08% ±2.72%
 buffers/buffer-fill.js n=20000 size=65536 type='fill("t", 0)'                             0.72 %       ±1.69% ±2.25% ±2.95%
 buffers/buffer-fill.js n=20000 size=65536 type='fill("t")'                               -0.19 %       ±1.64% ±2.19% ±2.87%
 buffers/buffer-fill.js n=20000 size=65536 type='fill("test")'                            -0.81 %       ±2.34% ±3.14% ±4.14%
 buffers/buffer-fill.js n=20000 size=65536 type='fill(0)'                                  1.10 %       ±2.34% ±3.12% ±4.06%
 buffers/buffer-fill.js n=20000 size=65536 type='fill(100)'                                0.40 %       ±0.87% ±1.16% ±1.51%
 buffers/buffer-fill.js n=20000 size=65536 type='fill(400)'                         *      1.41 %       ±1.39% ±1.85% ±2.41%
 buffers/buffer-fill.js n=20000 size=65536 type='fill(Buffer.alloc(1), 0)'                -0.09 %       ±1.94% ±2.58% ±3.36%

@@ -1228,6 +1228,9 @@ console.log(buf1.equals(buf3));
<!-- YAML
added: v0.5.0
changes:
- version: REPLACEME
pr-url: https://github.com/nodejs/node/pull/REPLACEME
description: Negative `end` values throw an `ERR_INDEX_OF_OUT_RANGE` error.
Contributor

OF_OUT -> OUT_OF :)

@BridgeAR BridgeAR added the author ready PRs that have at least one approval, no pending requests for changes, and a CI started. label Feb 16, 2018
@BridgeAR BridgeAR requested a review from a team February 16, 2018 14:36
Member

@mcollina mcollina left a comment

LGTM

1) This improves the performance for Buffer#fill by using shortcuts.
2) It also ports throwing errors to JS. That way they contain the
proper error code.
3) Using negative `end` values will from now on result in an error
instead of just doing nothing.
4) Passing in `null` as encoding is from now on accepted as 'utf8'.

This focuses on the common case by making sure they are prioritized.
It also changes some typeof checks to test for undefined since
that is faster and it adds a benchmark.

Due to a consolidation the isEncoding function got less strict in
version 5.x.x. This commit makes sure we do not return `true` for
empty strings.

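
The `typeof` change mentioned in the commit message above can be illustrated with a small sketch. Both helpers are hypothetical and not taken from the PR. They behave identically for a declared parameter. That the direct comparison is faster is the commit message's claim, which this snippet does not measure.

```javascript
// Hypothetical helpers resolving an optional `end` argument.

// Style before the change: string comparison on the typeof result.
function resolveEndTypeof(end, length) {
  return typeof end === 'undefined' ? length : end;
}

// Style after the change: direct comparison against undefined.
function resolveEndStrict(end, length) {
  return end === undefined ? length : end;
}

// Both variants agree for every input, including null and 0.
for (const value of [undefined, null, 0, 5]) {
  console.log(resolveEndTypeof(value, 10), resolveEndStrict(value, 10));
}
```

The `typeof` form is only required for identifiers that may not be declared at all, which is never the case for a function parameter.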
@BridgeAR
Member Author

Rebased due to conflicts.

CI https://ci.nodejs.org/job/node-test-pull-request/13331/

@BridgeAR
Member Author

BridgeAR commented Mar 2, 2018

Landed in d3af120...452eed9

@BridgeAR BridgeAR closed this Mar 2, 2018
BridgeAR added a commit to BridgeAR/node that referenced this pull request Mar 2, 2018
1) This improves the performance for Buffer#fill by using shortcuts.
2) It also ports throwing errors to JS. That way they contain the
proper error code.
3) Using negative `end` values will from now on result in an error
instead of just doing nothing.
4) Passing in `null` as encoding is from now on accepted as 'utf8'.

PR-URL: nodejs#18790
Reviewed-By: James M Snell <jasnell@gmail.com>
Reviewed-By: Benjamin Gruenbaum <benjamingr@gmail.com>
Reviewed-By: Matteo Collina <matteo.collina@gmail.com>
BridgeAR added a commit to BridgeAR/node that referenced this pull request Mar 2, 2018
PR-URL: nodejs#18790
Reviewed-By: James M Snell <jasnell@gmail.com>
Reviewed-By: Benjamin Gruenbaum <benjamingr@gmail.com>
Reviewed-By: Matteo Collina <matteo.collina@gmail.com>
BridgeAR added a commit to BridgeAR/node that referenced this pull request Mar 2, 2018
PR-URL: nodejs#18790
Reviewed-By: James M Snell <jasnell@gmail.com>
Reviewed-By: Benjamin Gruenbaum <benjamingr@gmail.com>
Reviewed-By: Matteo Collina <matteo.collina@gmail.com>
BridgeAR added a commit to BridgeAR/node that referenced this pull request Mar 2, 2018
This focuses on the common case by making sure they are prioritized.
It also changes some typeof checks to test for undefined since
that is faster and it adds a benchmark.

PR-URL: nodejs#18790
Reviewed-By: James M Snell <jasnell@gmail.com>
Reviewed-By: Benjamin Gruenbaum <benjamingr@gmail.com>
Reviewed-By: Matteo Collina <matteo.collina@gmail.com>
BridgeAR added a commit to BridgeAR/node that referenced this pull request Mar 2, 2018
Due to code consolidation in nodejs#7207
the isEncoding function got less strict. This commit makes sure
isEncoding returns false for empty strings as before the consolidation.

PR-URL: nodejs#18790
Refs: nodejs#7207
Reviewed-By: James M Snell <jasnell@gmail.com>
Reviewed-By: Benjamin Gruenbaum <benjamingr@gmail.com>
Reviewed-By: Matteo Collina <matteo.collina@gmail.com>
MayaLekova pushed a commit to MayaLekova/node that referenced this pull request May 8, 2018
1) This improves the performance for Buffer#fill by using shortcuts.
2) It also ports throwing errors to JS. That way they contain the
proper error code.
3) Using negative `end` values will from now on result in an error
instead of just doing nothing.
4) Passing in `null` as encoding is from now on accepted as 'utf8'.

PR-URL: nodejs#18790
Reviewed-By: James M Snell <jasnell@gmail.com>
Reviewed-By: Benjamin Gruenbaum <benjamingr@gmail.com>
Reviewed-By: Matteo Collina <matteo.collina@gmail.com>
MayaLekova pushed a commit to MayaLekova/node that referenced this pull request May 8, 2018
PR-URL: nodejs#18790
Reviewed-By: James M Snell <jasnell@gmail.com>
Reviewed-By: Benjamin Gruenbaum <benjamingr@gmail.com>
Reviewed-By: Matteo Collina <matteo.collina@gmail.com>
MayaLekova pushed a commit to MayaLekova/node that referenced this pull request May 8, 2018
PR-URL: nodejs#18790
Reviewed-By: James M Snell <jasnell@gmail.com>
Reviewed-By: Benjamin Gruenbaum <benjamingr@gmail.com>
Reviewed-By: Matteo Collina <matteo.collina@gmail.com>
MayaLekova pushed a commit to MayaLekova/node that referenced this pull request May 8, 2018
This focuses on the common case by making sure they are prioritized.
It also changes some typeof checks to test for undefined since
that is faster and it adds a benchmark.

PR-URL: nodejs#18790
Reviewed-By: James M Snell <jasnell@gmail.com>
Reviewed-By: Benjamin Gruenbaum <benjamingr@gmail.com>
Reviewed-By: Matteo Collina <matteo.collina@gmail.com>
MayaLekova pushed a commit to MayaLekova/node that referenced this pull request May 8, 2018
Due to code consolidation in nodejs#7207
the isEncoding function got less strict. This commit makes sure
isEncoding returns false for empty strings as before the consolidation.

PR-URL: nodejs#18790
Refs: nodejs#7207
Reviewed-By: James M Snell <jasnell@gmail.com>
Reviewed-By: Benjamin Gruenbaum <benjamingr@gmail.com>
Reviewed-By: Matteo Collina <matteo.collina@gmail.com>
@ChALkeR
Member

ChALkeR commented Aug 23, 2018

@nodejs/security thoughts on preventing such changes in the future?
That even had a guard comment which also got removed by this.

My opinion is that a testcase should have been introduced at the same time when the comment was. Or, perhaps, even instead of the comment.

Upd: filed #22492.

Labels
author ready PRs that have at least one approval, no pending requests for changes, and a CI started. c++ Issues and PRs that require attention from people who are familiar with C++. lib / src Issues and PRs related to general changes in the lib or src directory. performance Issues and PRs related to the performance of Node.js. semver-major PRs that contain breaking changes and should be released in the next major version.
7 participants