Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ES2016 compat issue with Buffer subclassing Uint8Array #4701

Closed
littledan opened this issue Jan 14, 2016 · 64 comments
Closed

ES2016 compat issue with Buffer subclassing Uint8Array #4701

littledan opened this issue Jan 14, 2016 · 64 comments
Assignees
Labels
buffer Issues and PRs related to the buffer subsystem.

Comments

@littledan
Copy link

In the ES2016 draft specification, TypedArray methods like
%TypedArray%.prototype.subarray() call out to a constructor for the result
based on the receiver. Ordinarily, the constructor is instance.constructor,
but subclasses can override this using the Symbol.species property on the
constructor.

Buffer.prototype.slice calls out to %TypedArray%.prototype.subarray, which
calls this calculated constructor with three arguments. The argument pattern
doesn't correspond to a constructor for Buffer, so without setting
Symbol.species appropriately, the wrong kind of result is created.

This issue came up because I'm working on implementing spec-compliant
subclassable TypedArrays in V8. feross helpfully reported that I broke his
buffer library for the browser. I wrote a couple patches to update both Node
and the browser buffer library for ES2016 TypedArray subclassing:
littledan@834338e
littledan/buffer@2e60129

For now (Chrome 49/V8 4.9), I'm keeping the legacy behavior that
%TypedArray%.prototype.subarray calls out to a base class constructor
like Uint8Array. However, in the future, I'd like to ship proper TypedArray subclassing,
which would benefit from a patch like this. What do you think?

@littledan
Copy link
Author

@jeisinger

@littledan
Copy link
Author

The associated V8 bug is https://bugs.chromium.org/p/v8/issues/detail?id=4665

@littledan
Copy link
Author

Another option is to call the three-argument constructor for TypedArray from Buffer.prototype.slice, rather than calling subarray. This would side-step the need to override @@species.

@jeisinger
Copy link
Contributor

I wouldn't recommend using @@species before it's shipping in v8

@Fishrock123
Copy link
Contributor

Could you please put @@species in backticks so that it does not link to the user? Thanks. :)

@littledan
Copy link
Author

@jeisinger To clarify, there's no need to merge this patch until @@species is shipping in V8.

@littledan
Copy link
Author

FWIW, a third possible fix is here: feross/buffer#95 . I think this would be a bit slower than basing it on @@species or calling the three-argument TypedArray constructor, however.

@LinusU
Copy link
Contributor

LinusU commented Jan 19, 2016

Wouldn't a solution be to let the buffer constructor accept (buffer, byteOffset, length), like the TypedArrays does? That way you don't even need the .setPrototypeOf call to coerce the returned Uint8Array into a Buffer.

@trevnorris
Copy link
Contributor

@LinusU Belated, but a PR is on it's way to allow Buffers to accept an ArrayBuffer with byteoffset and length.

@trevnorris
Copy link
Contributor

@LinusU

That way you don't even need the .setPrototypeOf call to coerce the returned Uint8Array into a Buffer.

Can you expand on this?

@littledan
Copy link
Author

Once you upgrade to V8 5.0, you can take advantage of ES2015's built-in support for TypedArray subclassing. We have a web compat workaround which means that subarray will return instances of a built-in TypedArray type, but generally you can just make an ES2015-style class declaration which extends Uint8Array and calls the super constructor. I don't know how the performance compares today on V8, but on some engines like SpiderMonkey, dynamic proto chain modifications carry a signficant performance penalty, and subclassing based on ES2015 classes is definitely a style which we want to encourage in the future.

@LinusU
Copy link
Contributor

LinusU commented Feb 16, 2016

@trevnorris If buffer is declared as class Buffer extends Uint8Array {} then let b = new Buffer(8); b.subarray(0, 4) will return a buffer.

Currently, it returns a Uint8Array, which we then use setPrototypeOf on.

Does it clear things up? I can try and make some better examples otherwise :)

@littledan
Copy link
Author

@LinusU Yes, but... I had to revert the behavior of subarray in particular in V8 due to a web compat issue from old versions of feross/buffer which did not support the three-argument constructor. For all other methods, it should work, though, and the ES2016 draft spec still gives subarray this behavior. Hopefully this will be eventually shippable though.

@LinusU
Copy link
Contributor

LinusU commented Feb 17, 2016

Yes, hopefully we'll be able to ship this sooner rather than later :)

@trevnorris
Copy link
Contributor

@LinusU We're changing the Buffer constructor so it'll accept all the same parameters as Uint8Array, but we won't be using class Buffer extends Uint8Array {}. There were a couple blockers when I initially investigated doing that, but it's late and can't think of them all. One is that new would then be required, whereas it's now optional. Another is in the case of passing a string. The string is sent directly to C++ where memory is allocated, and written to. Since there's no way to then call super() with an existing char* (well, okay. I could do it, but I don't want to even further hack around v8 internals than I already have) it's not doable.

@LinusU
Copy link
Contributor

LinusU commented Feb 18, 2016

Sounds good 👌

@littledan
Copy link
Author

@trevnorris +1

@jasnell
Copy link
Member

jasnell commented Mar 22, 2016

What's the status on this one @trevnorris ?

@trevnorris
Copy link
Contributor

@jasnell This should be testable now that you've landed the new Buffer(array_buffer[, byteOffset[, length]]) API.

@trevnorris
Copy link
Contributor

@jasnell My last comment was poorly worded. What I mean is that a blocker was the new signature that has now landed.

As for subclassing, that simply isn't an option. It prevents being able to allocate memory ourselves that the Uint8Array will use. Making it impossible to construct a new Buffer instance from incoming data packets without copying over the contents.

@littledan
Copy link
Author

@trevnorris @jasnell OK, I'd expect that with this change, my @@species-based patch is not needed.

Re subclassing: I don't see why subclassing a real TypedArray wouldn't allow you to allocate memory however you want. If you have an underlying ArrayBuffer, then you can make your constructor call the super constructor wit this array_buffer, byte_offset, length signature (spec'd here). This constructor does not copy the ArrayBuffer. It is the same mechanism that subarray uses. Or, what is the problem with that strategy?

@trevnorris
Copy link
Contributor

@littledan Don't know why that usage hadn't crossed my mind... I would like some analysis done to see how much of the ecosystem we'd break by requiring new. In the mean time I can get a PR put together.

@RReverser
Copy link
Member

Thought of it again, unfortunately that won't work exactly because we inherit from Uint8Array :/

So yeah, either disallowing regular calls without new or exposing FastBuffer are the only options (well, we can hack around V8 and allow calling our classes without new, but I don't like going down that road).

@littledan
Copy link
Author

@RReverser V8 is just trying to follow the spec here; if Node.js diverges from the ES spec, that's a pretty significant decision that has implications for the evolution of JavaScript. I think we all benefit from the web and Node using the same language.

There was a "call constructor" proposal before TC39 at some point, to allow classes to implement being called. However, it was dropped, I believe for technical reasons that I don't quite understand. It was championed by @wycats and @allenwb, who might be able to provide more information.

@RReverser
Copy link
Member

There was a "call constructor" proposal before TC39 at some point, to allow classes to implement being called. However, it was dropped

Ah, that's what happened. Just tried to find it among the proposals and couldn't, even though strongly remember that it existed.

@RReverser
Copy link
Member

V8 is just trying to follow the spec here; if Node.js diverges from the ES spec, that's a pretty significant decision that has implications for the evolution of JavaScript. I think we all benefit from the web and Node using the same language.

I'm not arguing with this in any way. Personally, I like the idea of enforcing new in major updates to simplify transition to ES6 classes without hacks, but Buffer (as well as other Node.js classes) was allowed to be "just called" long before ES6 classes appeared, and this is a breaking change that needs to be evaluated with a great care.

@allenwb
Copy link

allenwb commented Jul 12, 2016

Note that even without explicit callable constructor support it is still possible to use function declarations to define a subclassable "class".

@RReverser
Copy link
Member

RReverser commented Jul 12, 2016

@allenwb The difficult part here is not having function that other classes can extend from, but the part where such function instantiates this as descendant of Uint8Array - neither super() from regular function nor Uint8Array.apply(this, arguments) are not possible.

@allenwb
Copy link

allenwb commented Jul 12, 2016

@RReverser
can't you do:

function Buffer(...args) {
       let buf = Reflect.construt(Uint8Array, whatever(args), new.target);
       ...
       return buf
}

@RReverser
Copy link
Member

RReverser commented Jul 13, 2016

@allenwb That's not very different from factory approach we currently have in the sense that subclassing

class B extends Buffer {
  constructor() {
    super(); // still won't work
  }
}

@seishun
Copy link
Contributor

seishun commented Jul 13, 2016

I might be confusing something here, but isn't the issue basically that the Buffer constructor currently overrides this by returning an object?

@RReverser
Copy link
Member

@seishun Partially.

  1. Function returning an object (and not just class) is required for factory style to work: Buffer(10) == new Buffer(10).
  2. You can't subclass such function because super doesn't care about returned objects.
  3. You can't make function to subclass this from Uint8Array, you need a proper class for that.

The problem is that you can't have all at once - you either lose factory style without new (breaking backward compat for Node.js users) or don't have ES6-style subclassing (losing forward compat for ES6 users).

@allenwb
Copy link

allenwb commented Jul 13, 2016

function Buffer(...args) {       
     if (new.target===undefined) return new Buffer(...args); //called as factory function

     //called as constructor, either via direct new, [[Constructor]], or subclass constructor super  
     return Reflect.construt(Uint8Array, whatever(args), new.target);
}

class SubBuffer extends Buffer {
   constructor(...args) {
       super(...args);
    }
}

2 You can't subclass such function because super doesn't care about returned objects.

Sure it does, see step 8 of SuperCall : super Arguments at https://tc39.github.io/ecma262/#sec-super-keyword-runtime-semantics-evaluation and step 13 of https://tc39.github.io/ecma262/#sec-ecmascript-function-objects-construct-argumentslist-newtarget

@RReverser
Copy link
Member

Sure it does, see step 8 of SuperCall : super Arguments at https://tc39.github.io/ecma262/#sec-super-keyword-runtime-semantics-evaluation and step 13 of https://tc39.github.io/ecma262/#sec-ecmascript-function-objects-construct-argumentslist-newtarget

@allenwb I stand corrected! Didn't know that it will automagically become this, thought it wouldn't work for extending natives.

Your example seems to work, so I'll try to prepare PR that would make advantage of it.

@trevnorris
Copy link
Contributor

@RReverser can you check benchmarks? The Buffer constructor is one place we're already hurting.

@allenwb Just want to double sanity check this. We have to turn char*'s into a Buffer. Currently we're creating a Local<Uint8Array> then using SetPrototype() to set its prototype to that of Buffer.prototype. This approach won't affect that correct?

@RReverser
Copy link
Member

The Buffer constructor is one place we're already hurting.

Sure, that's exactly what I was fixing in previous PR and want to be sure it won't regress again.

@allenwb
Copy link

allenwb commented Jul 13, 2016

@trevnorris

almost, you would want to SetPrototypeOf to new.target.prototype to make sure that Buffer subclass instances have the correct prototype.

But, does using SetPrototypeOf have any deoptimizing impact in V8 or other engines? If it does, I would hope that Reflect.construct did not. (It's the difference between supply the prototype before allocation or changing it after allocation)

Regard, when you think you have a complete subclassable JS implementation for Buffer point me at it and I'll review it for you.

@trevnorris
Copy link
Contributor

@allenwb

almost, you would want to SetPrototypeOf to new.target.prototype to make sure that Buffer subclass instances have the correct prototype.

So you're saying we'd also want to do both?:

buf->SetPrototypeOf(context, buf_prototype);
buf->GetPrototype()->SetPrototypeOf(context, buf_prototype);

But, does using SetPrototypeOf have any deoptimizing impact in V8 or other engines?

Whether it does or not I'm not sure there's a way around it (that doesn't require hacking the ArrayBuffer allocator) since the data is coming in through the I/O event as a char* on the native side.

@allenwb
Copy link

allenwb commented Jul 13, 2016

@trevnorris

OK, I'm loosing track of what exactly you are trying to accomplish. I thought you were trying to define a version of Buffer that could be subclassed, similar to what I showed in #4701 (comment) . Whether Buffer is actually implemented in C++ rather than JS doesn't change the requirements.

Assuming that SubBuffer extends Buffer then an object created via new SubBuffer() must have its [[Prototype]] set to SubBuffer.prototype as part of its allocation process. This is complicated by that fact that the allocation takes place within the Buffer constructor (or perhaps the Uint8Array constructor) rather than within the SubBuffer constructor. To make that work, the actual constructor that new was applied to must be passed along to the superclass constructor that does the allocation. In JS code, that is the role played by the the new.target meta property. It is used to communicated the subclass being allocated up to the superclass constructor that is doing the allocation such that [[Prototype]] can be set appropriately.

This must be the case regardless of whether Buffer is implemented in JS code or in C++. However, if it is implemented in C++ then the V8 C++ API must provide something that is equivalent to new.target and Reflect.construct. I'm not very familiar with the V8 C++ APIs and don't know what they may have recently done to that API to support subclassing. But, "ES6 subclassable builtins" essentially require the equivalent of these mechanisms so I assume V8 has added something in the API to support them.

So you're saying we'd also want to do both?:

No, if we are talking about new SubBuffer() initiated allocation you would never set the prototype to buf_prototype, it needs to be set to SubBuffer.prototype, however that value is made available to you.

Whether it does or not I'm not sure there's a way around it (that doesn't require hacking the ArrayBuffer allocator) since the data is coming in through the I/O event as a char* on the native side.

That is essentially the job of Reflect.construct or an equivalent C++ level API. V8 should be providing this.

I probably need more details of the usage, but I'm not sure why ArrayBuffer would be an issue? Aren't we're talking about SubBuffer being a subclass of Buffer which is a subclass of Uint8Array? They would all default to using ArrayBuffer as their backing store. (Of course, ArrayBuffer is (according to ES6) also supposed to be subclassable if that should prove useful).

I'm probably missing something about what you need to accomplish so feel free to clarify.

@RReverser RReverser self-assigned this Jul 14, 2016
@trevnorris
Copy link
Contributor

@allenwb Basically the native code looks like so:

Local<ArrayBuffer> ab = ArrayBuffer::New(
    isolate, data, length, ArrayBufferCreationMode::kInternalized);
Local<Uint8Array> ui = Uint8Array::New(ab, 0, length);
Maybe<bool> mb = ui->SetPrototype(context, buffer_prototype_object);

Where buffer_prototype_object is Buffer.prototype. What I'm wondering is if the new JS implementation will require changes to the native code.

Nevermind my comment about leveraging the v8::ArrayBuffer::Allocator. It's a hack to allow using the JS API to create the new Buffer instance, but still passing in the required char*.

@allenwb
Copy link

allenwb commented Jul 14, 2016

@trevnorris the last line needs to pass the prototype of the Buffer subclass as the second argument to SetPrototype. But I don't know whether or how V8 makes that value available to you via its C++ API. It should be available because it is required to support the Reflect.construct function (and that is really what is going on here).

So, this sounds like a question for @littledan : Has V8 make Uint8Array subclassable yet, as required by ES6? How is the newTarget parameter of [[Construct]] exposed in the V8 C++ API such that super (or Reflect.construct) calls from subclass constructors will work as specified when the superclass is implemented in C++?

@RReverser
Copy link
Member

RReverser commented Jul 15, 2016

@trevnorris I guess we don't really need that even now; instead, we can construct FastBuffer from C++ land and pass that ArrayBuffer as an argument.

@trevnorris
Copy link
Contributor

@RReverser I would be interested in seeing benchmark results for that.

@RReverser
Copy link
Member

@trevnorris Well I've made changes for that one locally, but is there any specific benchmark code for those C++ New methods or should I just come up with some?

@trevnorris
Copy link
Contributor

@RReverser I was thinking of yanking the specific operations happening in Buffer::New(), though that probably won't make a big difference. We don't have any benchmarks for native modules in core (future improvement) so will need to make some new ones.

@littledan
Copy link
Author

If you're a constructor implemented in C++ (as a FunctionCallback) being called, then new.target is exposed to you through FunctionCallbackInfo::NewTarget. However, something implementing a subclass of a TypedArray constructor in C++ will run into other issues when trying to call its super constructor:

As far as I can tell, the V8 API currently only exposes new.target through Reflect.construct and the individual TypedArray constructors, which you can get to only through property access. Maybe we should expose a convenience API for Reflect.construct more directly in the V8 API to make it easier to call out to it.

Calling out to the actual TypedArray constructor this way will be correct, but might be slower in practice for now in some cases, as it is implemented in JavaScript whereas the constructor path exposed by the API is entirely C++; the transition may have some performance cost. I have not benchmarked it yet. I am looking into rewriting this path in C++ for code health reasons, and that change may have knock-on benefits for the speed of this sort of use case.

cc @jochen

@jasnell
Copy link
Member

jasnell commented May 30, 2017

Does this need to remain open?

@TimothyGu
Copy link
Member

I believe this can be closed since the original issue about subclassing was fixed, and Buffer subclassability was further discussed in #9531.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
buffer Issues and PRs related to the buffer subsystem.
Projects
None yet
Development

No branches or pull requests