Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

stream: use instanceof when checking chunk type #19872

Closed
wants to merge 1 commit into from

Conversation

lpinca
Copy link
Member

@lpinca lpinca commented Apr 7, 2018

Use the instanceof operator instead of Stream._isUint8Array() when
checking if a chunk is a Uint8Array as this is faster.

'use strict';

const assert = require('assert');
const Benchmark = require('benchmark');
const { _isUint8Array } = require('stream');

const suite = new Benchmark.Suite();
const buf = Buffer.alloc(0);
var ret;

suite.add('instanceof', function () {
  ret = buf instanceof Uint8Array;
  assert(ret);
});
suite.add('_isUint8Array', function () {
  ret = _isUint8Array(buf);
  assert(ret);
});
suite.on('cycle', function (event) {
  console.log(event.target.toString());
});
suite.on('complete', function () {
  console.log('Fastest is ' + this.filter('fastest').map('name'));
});

suite.run({ async: true });

Node.js 9.11.1:

$ node index.js 
instanceof x 151,468,107 ops/sec ±0.31% (90 runs sampled)
_isUint8Array x 66,015,037 ops/sec ±0.28% (89 runs sampled)
Fastest is instanceof

Node.js master:

$ ../node/node index.js 
instanceof x 217,093,783 ops/sec ±0.14% (94 runs sampled)
_isUint8Array x 179,445,229 ops/sec ±0.22% (93 runs sampled)
Fastest is instanceof
Checklist
  • make -j4 test (UNIX), or vcbuild test (Windows) passes
  • commit message follows commit guidelines

Use the `instanceof` operator instead of `Stream._isUint8Array()` when
checking if a chunk is a `Uint8Array` as this is faster.
@nodejs-github-bot nodejs-github-bot added the stream Issues and PRs related to the stream subsystem. label Apr 7, 2018
@lpinca
Copy link
Member Author

lpinca commented Apr 7, 2018

@devsnek
Copy link
Member

devsnek commented Apr 7, 2018

i feel compelled to mention util.types.isUint8Array, i assume we want to start making our type checks consistent

@lpinca
Copy link
Member Author

lpinca commented Apr 7, 2018

@devsnek Stream._isUint8Array === util.types.isUint8Array

@devsnek
Copy link
Member

devsnek commented Apr 7, 2018

@lpinca why not then change util.types.isUint8Array to use instanceof

@lpinca
Copy link
Member Author

lpinca commented Apr 7, 2018

Yes it makes sense.

@anliting
Copy link

anliting commented Apr 7, 2018

It seems that isUint8Array have a different semantic comparing to instanceof:

// start: copied from /lib/internal/util/types.js
const ReflectApply = Reflect.apply;
function uncurryThis(func) {
  return (thisArg, ...args) => ReflectApply(func, thisArg, args);
}
const TypedArrayPrototype = Object.getPrototypeOf(Uint8Array.prototype);
const TypedArrayProto_toStringTag =
    uncurryThis(
      Object.getOwnPropertyDescriptor(TypedArrayPrototype,
                                      Symbol.toStringTag).get);

function isUint8Array(value) {
  return TypedArrayProto_toStringTag(value) === 'Uint8Array';
}
// end: copied from /lib/internal/util/types.js

let a={__proto__:Uint8Array.prototype}
console.log(
    a instanceof Uint8Array,    // true
    isUint8Array(a)             // false
)

Also see (Uint8Array might also in the same case): https://stackoverflow.com/a/22289982

@mscdex mscdex added the performance Issues and PRs related to the performance of Node.js. label Apr 7, 2018
@lpinca
Copy link
Member Author

lpinca commented Apr 7, 2018

@anliting good point. That's a good reason to not change util.types functions. I don't know if we need to be so strict on streams. In v6.x there was no support for Uint8Array in streams and we were using instanceof to check if a chunk was a buffer.

@addaleax
Copy link
Member

addaleax commented Apr 7, 2018

This might potentially be breaking because instanceof does something else than util.types.isUint8Array, as noted above.

If you are looking for a faster alternative, did you try chunk[Symbol.toStringTag] === 'Uint8Array'? That is not fool-proof either, but I could imagine that the engine could handle it pretty well.

why not then change util.types.isUint8Array to use instanceof

That defeats the purpose of having these functions – the util.types checks are supposed something that works in a cross-context manner and isn’t fooled by modifying prototypes in some way.

@lpinca
Copy link
Member Author

lpinca commented Apr 8, 2018

If you are looking for a faster alternative, did you try chunk[Symbol.toStringTag] === 'Uint8Array'? That is not fool-proof either, but I could imagine that the engine could handle it pretty well.

It has the same performance of util.types.isUint8Array() and as you says it's not better than instanceof:

const assert = require('assert');

class Foo {
  get [Symbol.toStringTag]() {
    return 'Uint8Array';
  }
}

const foo = new Foo();
assert(foo[Symbol.toStringTag] === 'Uint8Array');

@BridgeAR
Copy link
Member

BridgeAR commented Apr 8, 2018

In general: I am a fan of having strict type checks. In this specific case, I believe it will not make a difference besides the performance gain, so I am +1.

@addaleax

This might potentially be breaking because instanceof does something else than util.types.isUint8Array [...]

The first check does not seem to be breaking, because it will just allow weird chunks to be accepted instead of throwing an error. The second entry seems different though and it could be counted as breaking. But why would someone want to try to pass in a manipulated chunk? It would only be bad for the user in that case, so I do not see an issue there either.

BridgeAR
BridgeAR previously approved these changes Apr 8, 2018
@TimothyGu
Copy link
Member

TimothyGu commented Apr 8, 2018

The compatibility issue at play is that this will no longer accept Uint8Array objects from other contexts. I am not convinced this is a good idea. It’s not that this new check is “not as strict,” but rather “not as correct.”

I would also be very much surprised if the result from this benchmark affects any real-world applications of Node.js.

@TimothyGu
Copy link
Member

TimothyGu commented Apr 8, 2018

I will also note that V8 has been willing to do more optimizations on %TypedArrayPrototype%[\@\@toStringTag], as seen in #15663 and https://crbug.com/v8/6874.

@BridgeAR
Copy link
Member

BridgeAR commented Apr 8, 2018

@TimothyGu thanks a lot for bringing up the context. I did not think about that anymore.

@BridgeAR BridgeAR dismissed their stale review April 8, 2018 17:49

Removing my LG due to the context part.

@BridgeAR
Copy link
Member

BridgeAR commented Apr 8, 2018

@TimothyGu

I will also note that V8 has been willing to do more optimizations on %TypedArrayPrototype%[@@toStringTag]

The referenced issue is marked as fixed. So master should already reflect that performance boost with the toStringTag.

@lpinca
Copy link
Member Author

lpinca commented Apr 8, 2018

The referenced issue is marked as fixed. So master should already reflect that performance boost with the toStringTag.

Yes, it is no longer 2/3x slower but still not as fast as instanceof as reported by the above benchmark.
WRT to the context issue, doesn't it also apply for Buffer.isBuffer() and all other places where instanceof is already used?

Also please take a look at #19559 which is the reason why I would like to make chunkInvalid() as fast as possible.

@addaleax
Copy link
Member

addaleax commented Apr 8, 2018

WRT to the context issue, does it also apply for Buffer.isBuffer() and all other places where instanceof is already used?

Yes, it’s a general issue with instanceoffoo instanceof Bar checks the prototype chain of foo for Bar.prototype, and different contexts generally will have different Bar.prototype objects for built-in Bars.

Also, by the way: We did move away from instanceof in quite a few places, because it’s such a complex operation and usually there were more performant alternatives than that.

If you are looking for a faster alternative, did you try chunk[Symbol.toStringTag] === 'Uint8Array'? That is not fool-proof either, but I could imagine that the engine could handle it pretty well.

It has the same performance of util.types.isUint8Array() and as you says it's not better than instanceof:

It can be fooled, but if it is, I think we can assume that it’s intentional – and the approach does work in a multi-context way.

Benchmark CI: https://ci.nodejs.org/view/Node.js%20benchmark/job/benchmark-node-micro-benchmarks/153/

@lpinca
Copy link
Member Author

lpinca commented Apr 8, 2018

@addaleax yes my point was that we still have the context issue in other places in particular with Buffer.isBuffer() and this is a known "issue". Is it common to move data chunk in streams between different contexts and have we had issues opened for this reason in Node.js <= 6?

Fooling instanceof is also intentional imho. If there is something safer and faster I am all for it obviously :)

@addaleax
Copy link
Member

addaleax commented Apr 8, 2018

Is it common to move data chunk in streams between different contexts and have we had issues opened for this reason in Node.js <= 6?

Not really, I guess.

Fooling instanceof is also intentional imho. If there is something safer and faster I am all for it obviously :)

I don’t know, tbh. Fwiw, here’s the PR that originally moved away from instanceof, including the benchmarks that showed a significant performance increase at the time: #11608

@lpinca
Copy link
Member Author

lpinca commented Apr 8, 2018

I remember, instanceof was very slow just like Function.prototype.bind() which is now fast! It's hard to follow VMs changes.

@BridgeAR
Copy link
Member

BridgeAR commented Apr 8, 2018

We did move away from instanceof in quite a few places, because it’s such a complex operation and usually there were more performant alternatives than that.

With Turbofan instanceof is by far not that bad anymore. It is still slower than e.g. typeof but not even close as bad as it used to be.

@addaleax
Copy link
Member

addaleax commented Apr 8, 2018

@lpinca How did you benchmark this? The benchmark CI shows no significant results for streams (positive or negative), which is kind of surprising to me.

@lpinca
Copy link
Member Author

lpinca commented Apr 9, 2018

@addaleax benchmark source is in PR description. You started it on master where the impact is low.
Here is a run against v8.x: https://ci.nodejs.org/view/Node.js%20benchmark/job/benchmark-node-micro-benchmarks/154/console
Not a huge improvement but still relevant.

@addaleax
Copy link
Member

addaleax commented Apr 9, 2018

@lpinca I see – I’m not sure then what to think. I would probably consider this a breaking change, but it seems like the places where we’d actually want it are v8.x and v9.x?

@lpinca
Copy link
Member Author

lpinca commented Apr 9, 2018

Yes, if it can't be backported it doesn't make sense. it's a ~6% improvement on read and ~6% on write, not bad for duplex streams, multiply that for thousands of streams and the gain is not insignificant imho.

There are no explicit -1 yet but it seems people are inclined toward that. If anyone makes it explicit I'll close this.

@BridgeAR
Copy link
Member

I have no strong opinion about this and would like to defer this to @addaleax

@lpinca
Copy link
Member Author

lpinca commented Apr 16, 2018

No approvals or disapprovals, this is going nowhere, closing.

@lpinca lpinca closed this Apr 16, 2018
@lpinca lpinca deleted the use/instanceof branch April 16, 2018 08:06
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
performance Issues and PRs related to the performance of Node.js. stream Issues and PRs related to the stream subsystem.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants