
net,stream: skip chunk check on incoming data #19559

Closed
wants to merge 2 commits

Conversation


@lpinca lpinca commented Mar 23, 2018

Do not validate data chunks read from the socket handle as they are
guaranteed to be buffers and validation is costly.

Checklist
  • make -j4 test (UNIX), or vcbuild test (Windows) passes
  • commit message follows commit guidelines
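For context, this is a simplified sketch of the kind of per-chunk validation being skipped. The helper name `chunkInvalid()` comes up later in this thread; the body here is modeled on the discussion, not copied from the Node.js source of that era.

```javascript
// Simplified sketch of the per-chunk validation that push() performs.
// The exact checks are illustrative, modeled on this thread's
// description, not copied from lib/_stream_readable.js.
function chunkInvalid(state, chunk) {
  if (chunk !== null &&
      typeof chunk !== 'string' &&
      !(chunk instanceof Buffer) &&
      !state.objectMode) {
    return new TypeError('Invalid non-string/buffer chunk');
  }
  return null;
}

// Chunks read from the socket handle are always Buffers, so for them
// this check can never fail and is pure overhead on a hot path.
console.log(chunkInvalid({ objectMode: false }, Buffer.from('ok'))); // null
```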

@nodejs-github-bot nodejs-github-bot added net Issues and PRs related to the net subsystem. stream Issues and PRs related to the stream subsystem. labels Mar 23, 2018
@@ -192,9 +192,8 @@ Readable.prototype._destroy = function(err, cb) {
 // This returns true if the highWaterMark has not been hit yet,
 // similar to how Writable.write() returns true if you should
 // write() some more.
-Readable.prototype.push = function(chunk, encoding) {
+Readable.prototype.push = function(chunk, encoding, skipChunkCheck) {
lpinca (Member Author):
Not sure if this is acceptable, but I have no better ideas. If it's ok, should we document it or keep it private?

lpinca (Member Author):

Another option is to use a special encoding value to specify that chunk is a buffer and should not be checked.

Member:

Not a fan of this approach. What would likely be better is a new internal-only Readable.prototype[kPush] = function(...) that Readable.prototype.push(...) can defer to and that our internal code can call directly, then keep the type check in the existing push() and skip it in the internal one.

Whatever happens here, though, the @nodejs/streams wg needs to take a look :-)
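A minimal sketch of the shape being suggested. This code was never written in the PR; the kPush symbol and the stand-in class are illustrative, assuming internal callers guarantee that chunks are Buffers.

```javascript
const kPush = Symbol('kPush');

// Stand-in for stream.Readable, reduced to the push paths only.
class Readable2 {
  constructor() { this.chunks = []; }

  // Internal fast path: no type check. Internal callers (e.g. data
  // arriving from a socket handle) guarantee chunk is a Buffer.
  [kPush](chunk) {
    this.chunks.push(chunk);
    return true;
  }

  // Public API keeps the validation and defers to the fast path.
  push(chunk) {
    if (chunk !== null && !(chunk instanceof Buffer) &&
        typeof chunk !== 'string') {
      throw new TypeError('Invalid non-string/buffer chunk');
    }
    return this[kPush](chunk);
  }
}
```

Since the symbol is not exported, external code can only reach the validating push(), while core code calls the unchecked path directly.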

lpinca (Member Author):

The private push would basically be just a helper for

function readableAddChunk(stream, chunk, encoding, addToFront, skipChunkCheck) {

so maybe we can add that to the prototype hidden behind a symbol?

mscdex commented Mar 23, 2018

Can we get benchmark (at least net and streams.Readable) results for this?

lpinca commented Mar 23, 2018

No noticeable difference on net benchmarks but that's because they are not stressing this code path enough. Will post results of an ad hoc Readable benchmark in a bit.

Btw I've opened this only because chunkInvalid() is marked as hot in some flame graphs I'm generating.


lpinca commented Mar 23, 2018

@mscdex

'use strict';

const common = require('../common');
const { Readable } = require('stream');

const bench = common.createBenchmark(main, {
  n: [1e6]
});

function main({ n }) {
  const buf = Buffer.alloc(64);
  const readable = new Readable({
    read() {}
  });

  readable.on('end', function() {
    bench.end(n);
  });
  readable.on('resume', function() {
    bench.start();
    for (var i = 0; i < n; i++)
      this.push(buf, undefined, true);
    this.push(null);
  });
  readable.resume();
}
$ cat compare2.csv | Rscript benchmark/compare.R 
                           confidence improvement accuracy (*)   (**)  (***)
 streams/skip.js n=1000000        ***      6.86 %       ±2.06% ±2.78% ±3.67%

so it's probably not worth the effort. I should try on v8.x and v9.x but I'm too lazy to recompile now.

lpinca commented Mar 24, 2018

Same benchmark on v9.x-staging:

                           confidence improvement accuracy (*)   (**)  (***)
 streams/skip.js n=1000000        ***     35.99 %       ±1.42% ±1.88% ±2.45%

which explains why chunkInvalid() was marked as hot in my flame graph.

Edit: same results on v8.x-staging.

@mcollina mcollina left a comment

I'm very conflicted by this PR. It'd be worth running benchmarks and applying this also for HTTP (POST) and TLS.

I would prefer having this as an option at creation time and documenting it. This seems like a feature we should support and offer to our users as well.

Also, some of the benefits might come from calling push with the right number of arguments and avoiding the trampoline.

lpinca commented Mar 24, 2018

@mcollina

I've run net benchmarks and there is no noticeable difference on master, but that's expected; I should and will rerun them on v8.x where the impact is big. Also, the stream must be in flowing mode to see anything appreciable.

This seems like a feature we should support and offer to our users as well.

Agreed, but why have it as an instance option instead of a per-push option?

Also, some of the benefits might come from calling push with the right number of arguments and avoiding the trampoline.

Not sure I understand, care to elaborate?

@mcollina:

Also, some of the benefits might come from calling push with the right number of arguments and avoid the trampoline.

Not sure I understand, care to elaborate?

If you define a function with 2 arguments but you invoke it with only one, V8 will place a trampoline between the two to fill in the missing arguments. For maximum performance, we should always call this.push(chunk, enc), not just this.push(chunk). I do not know how much this impacts performance.
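A tiny illustration of the arity mismatch being described. The adaptor frame itself is a V8 internal and not observable from JavaScript; this only shows the two call shapes.

```javascript
// push() declares two parameters, so its arity is 2.
function push(chunk, encoding) {
  return [chunk, encoding];
}

console.log(push.length); // 2 (declared arity)

// Call with one argument: at the time of this discussion (2018), V8
// inserted an arguments-adaptor frame to fill in `encoding` with
// undefined before entering the function.
push('chunk');

// Call with a matching argument count: no adaptor frame is needed,
// which is what "always call this.push(chunk, enc)" avoids.
push('chunk', null);
```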

Agreed, but why have it as an instance option instead of a per-push option?

Either your code is "safe" or it's not. Also, I would prefer not to add a third parameter to push, for the reasons above.

@addaleax:

Hey! Just for context: one of my motivations for #18537 (string_decoder in C++) was that we could move the .setEncoding() functionality for StreamBases into C++. That would get rid of an extra allocation + copy operation in the string case, and it gave around a 20-30% increase in those benchmarks (prototype: addaleax/node@e548b13).

I wouldn’t expect many people to call that for network sockets, but for file streams or HTTP/2 streams this seems a lot more likely, so I’ve been waiting for progress on #19060 (merging handling for StreamBases).

I don’t know whether that conflicts with this patch, but in case it does, I just wanted to mention it. :)

lpinca commented Mar 25, 2018

If you define a function with 2 arguments but you invoke it with only one, V8 will place a trampoline between the two to fill in the missing arguments. For maximum performance, we should always call this.push(chunk, enc), not just this.push(chunk). I do not know how much this impacts performance.

Yes, but it doesn't matter in this case: the benchmark always runs push() with 3 arguments and the 30% improvement is consistent.

lpinca commented Mar 25, 2018

@mcollina does it align better with your idea now?
New CI: https://ci.nodejs.org/job/node-test-pull-request/13860/

There are cases where there is no need to validate the `chunk` argument
of `Readable.prototype.push()` because it is known beforehand that
`chunk` is valid.

Make `Readable` constructor accept a `skipChunkCheck` option to skip
`chunk` validation.
lpinca commented Mar 25, 2018

Benchmark results on v9.x-staging:

$ cat compare.csv | Rscript benchmark/compare.R 
                                                                 confidence improvement accuracy (*)   (**)  (***)
 streams/readable-skip-chunk-check.js n=1000000 skipChunkCheck=0                 0.08 %       ±0.55% ±0.73% ±0.95%
 streams/readable-skip-chunk-check.js n=1000000 skipChunkCheck=1        ***     32.79 %       ±1.37% ±1.84% ±2.43%

Do not validate data chunks read from the socket handle as they are
guaranteed to be buffers and validation is costly.
@mcollina:

Yes, but it doesn't matter in this case: the benchmark always runs push() with 3 arguments and the 30% improvement is consistent.

I mean, in net it's currently called with one argument and the trampoline will be used.

I'm not really concerned about the benchmarks within stream, but rather something involving net or http. The benefits should be prominent everywhere.

lpinca commented Mar 26, 2018

@mcollina, I've just rerun net benchmarks on v9.x-staging and there aren't noticeable differences, just a tiny (~5-6%, one star) gain on some tests, but I think they are not representative.

Please take a look at this: https://foo-pcsvujxlnp.now.sh/

@mcollina:

Can you run http simple as well? I can't see a definite benefit in all my checks.

lpinca commented Mar 26, 2018

I think there is no point: if net benchmarks show no gain, I really doubt http (a layer on top) will be different.
That said, I still think there is value in skipping chunk validation when possible, because a good amount of CPU cycles is wasted there, as shown in the above graph and the added "readable" benchmark.
The only drawback of this PR is the maintenance burden added by the skipChunkCheck option.

Feel free to close if you think there is no value.

@mcollina:

What is puzzling me is why there is no impact of this in real-life scenarios. Truly, a 30% improvement in streams should be noticeable.
Maybe our current net and http benchmarks do not cover this hot path?
How did you spot this in the first place?

lpinca commented Mar 27, 2018

My guess is that the gain is masked by other, heavier things.

Maybe our current net and http benchmarks do not cover this hot path?

Yes, I think so, but I've also tried to write an ad hoc test case where a very fast TCP client writes many small chunks as fast as possible, with no luck: the chunks are coalesced into a single big TCP packet, making the test useless.

How did you spot this in the first place?

By running and profiling an echo server benchmark in ws using uws as client (to make sure the client is faster than the server) and sending a lot (100k) of small messages.

@mcollina:

@lpinca in that case, what was the performance improvement of this change?

lpinca commented Mar 27, 2018

@mcollina still insignificant, but chunkInvalid() appeared in the graph, so I thought: why pay for something that is not needed?

I think the reason why it doesn't make a difference is that only one chunk is pushed per tick? It might make sense to test with tens of thousands of connected sockets, but I'm not passionate enough to do that, sorry.

lpinca commented Mar 27, 2018

Closing as per #19559 (comment). It's easier to upgrade Node.js than to add an option that is only useful on Node.js < 10 and that we will have to support forever.

@lpinca lpinca closed this Mar 27, 2018
@lpinca lpinca deleted the skip/chunk-check branch March 27, 2018 19:32