Allow all ArrayBufferView types as Buffers #12223

Closed
wants to merge 8 commits into
from

8 participants
@TimothyGu
Member

TimothyGu commented Apr 5, 2017

Right now, many modules allow the use of plain Uint8Arrays in addition to Buffers. This PR begins the work to expand such support to all ArrayBufferView types (including all TypedArray types and DataView).

This work was inspired by #1826, which requests support for ArrayBuffer as well. But ArrayBuffer is treated separately from its views in V8's API and it is easier to get started with the work here.

The first two commits add the necessary infrastructure for consuming and testing all ArrayBufferView types. The last two demonstrate how to update the JS layer for it (specifically, by using the new isArrayBufferView utility function and updating uses of buf.length to buf.byteLength).

I don't intend this PR to make its way into 8.0.0.

Relevant discussions on IRC with @addaleax
2017-03-23 20:20:54 TimothyGu  So I was looking at https://github.com/nodejs/node/issues/1826
2017-03-23 20:21:23 TimothyGu  As we are trying to support Uint8Array in C++ binding it doesn't
                               look much more work to support all the other ArrayBufferView types
2017-03-23 20:21:40 TimothyGu  including all TypedArray and DataView
2017-03-23 20:21:54 addaleax   right … that was also pointed out to me in the last streams wg
                               meeting
2017-03-23 20:22:32 addaleax   I understand where it’s coming from, but so far I have wanted to
                               keep the changes pretty minimal
2017-03-23 20:23:06 addaleax   bc going from Buffer support to Uint8Array support is, for the most
                               part, a tiny change
2017-03-23 20:23:27 addaleax   if somebody (you?) wants to work on this, yay
2017-03-23 20:23:40 TimothyGu  TBH it doesn't take much of a change on the C++ side to get it
                               working
2017-03-23 20:23:46 TimothyGu  http://sprunge.us/SgET
2017-03-23 20:23:54 TimothyGu  or http://sprunge.us/SgET?diff
2017-03-23 20:25:12 TimothyGu  so I thought maybe we could just support all of them at the same
                               time
2017-03-23 20:25:25 addaleax   that’s nice, yes
...
2017-03-23 20:28:20 addaleax   re: generic ArrayBufferView support … the one reason I’ve not been
                               doing that is that my personal goal is to give users a more
                               standard way of passing in binary data, and Uint8Arrays work just
                               fine for that purpose. the other reason is that there is at least a
                               small bit of ambiguity, because interpreting e.g. a Uint16Array as
                               a Uint8Array can be done in two ways (entry-for-entry or via the...
2017-03-23 20:28:21 addaleax   ...underlying storage)
Checklist
  • make -j4 test (UNIX), or vcbuild test (Windows) passes
  • tests and/or benchmarks are included
  • documentation is changed or added
  • commit message follows commit guidelines
Affected core subsystem(s)
@mscdex

mscdex Apr 5, 2017

Contributor

I agree with @addaleax about the possible ambiguity with how to retrieve the byte values (by index or by byte of the underlying storage). That makes me lean towards -1 on this change.

Do we know if there are other libraries (Node.js or browser) that allow these other non-8-bit typed arrays to be used as 8-bit arrays? It might be good to see if there is a precedent for how we might handle it in Node.js.
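
The ambiguity raised here can be made concrete with a small sketch. The variable names are illustrative only and not part of the PR; the snippet just contrasts the two readings of a Uint16Array as bytes.

```javascript
// A Uint16Array with two 16-bit elements (4 bytes of storage).
const u16 = new Uint16Array([0x0102, 0x0304]);

// Reading 1: entry-for-entry. Each element is converted to a byte
// (modulo 256), giving one byte per element.
const byElement = Uint8Array.from(u16);

// Reading 2: underlying storage. The same memory is re-read as bytes,
// giving byteLength bytes in the platform's byte order.
const byStorage = new Uint8Array(u16.buffer, u16.byteOffset, u16.byteLength);

console.log(byElement.length); // 2
console.log(byStorage.length); // 4
```

The two results differ in both length and content, which is why an API has to pick one reading and document it.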

@TimothyGu

TimothyGu Apr 5, 2017

Member

@mscdex

Web standards usually use Web IDL as an abstraction for type conversions and other common tasks potentially used by many standards. In addition to ArrayBuffer, all TypedArray types, and DataView, Web IDL specifies two additional types for abstraction of buffer sources:

Additionally, Web IDL defines the abstract operations getting a reference to the bytes held by a buffer source and getting a copy of the bytes held by a buffer source, which are what the Web Standards actually call upon to perform the task of getting the bytes. They both retrieve the byte values by underlying storage, not by index.

This effectively means that all Web APIs using ArrayBufferView or BufferSource Web IDL type, i.e. almost all Web APIs that take a generic byte buffer, support all TypedArray types.
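
As a rough illustration of "getting a copy of the bytes held by a buffer source", the sketch below reads the viewed region of the underlying storage rather than the elements. The helper name copyBytes is hypothetical; it is not a Web IDL or Node.js API, only an approximation of the behaviour described above.

```javascript
// Hypothetical helper: copies the bytes behind an ArrayBuffer or any view.
function copyBytes(source) {
  if (ArrayBuffer.isView(source)) {
    // Use the view's window into the storage, not its element values.
    const { buffer, byteOffset, byteLength } = source;
    return new Uint8Array(buffer.slice(byteOffset, byteOffset + byteLength));
  }
  if (source instanceof ArrayBuffer) {
    return new Uint8Array(source.slice(0));
  }
  throw new TypeError('expected an ArrayBuffer or an ArrayBufferView');
}

const floats = new Float64Array([1.5, 2.5]);
const bytes = copyBytes(floats);
console.log(bytes.length); // 16

// A DataView over bytes 2..5 of an 8-byte buffer yields 4 bytes.
const window4 = copyBytes(new DataView(new ArrayBuffer(8), 2, 4));
console.log(window4.length); // 4
```

Because the result is a copy, writing to it does not affect the original view.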

Examples include:

At the same time, not all Web APIs accept all TypedArray types, but this always happens for a good reason. Two notable exceptions I found:

On the other hand, I'm not familiar with any examples in the Node.js ecosystem, considering Buffer has been the standard for byte buffers since Node.js' inception.

/cc @domenic

@domenic

domenic Apr 5, 2017

Member

> Two notable exceptions I found:

Another exception is in readable byte streams (e.g. fetch body streams). This is because we need not only bytes, but also an offset and a length (for reasons I'll leave out here but can explain if desired). And look, we already have a type for that: Uint8Array = { buffer: ArrayBuffer, byteOffset, byteLength }. So we are using Uint8Array for streams.

But this is basically a supporting argument: we only restrict to Uint8Array when we also find the byteOffset and byteLength into the larger ArrayBuffer to be meaningful. Otherwise, any form of binary data is acceptable. And indeed, as @TimothyGu pointed out, when accepting data in stream's BYOB-read() method, we accept any type.

> On the other hand, I'm not familiar with any examples in the Node.js ecosystem, considering Buffer has been the standard for byte buffers since Node.js' inception.

Well, jsdom's upcoming v10 will accept any of the BufferSource types, or Buffer.

@mscdex mscdex added the semver-major label Apr 5, 2017

@mscdex

mscdex Apr 5, 2017

Contributor

@TimothyGu Have you compared at least the Buffer benchmarks to see what (if any) performance difference there is before and after these changes?

@TimothyGu

TimothyGu Apr 5, 2017

Member

@mscdex there are no differences in zlib. I did not test crypto, but I don't expect there to be any performance differences since the code is practically identical (except for changing Uint8Array to ArrayBufferView), both in JS and in C++.

lib/crypto.js
@@ -42,7 +42,7 @@ const timingSafeEqual = binding.timingSafeEqual;
 const Buffer = require('buffer').Buffer;
 const stream = require('stream');
 const util = require('util');
-const { isUint8Array } = process.binding('util');
+const { isArrayBufferView } = process.binding('util');

@lpinca

lpinca Apr 5, 2017

Member

Isn't this equivalent to ArrayBuffer.isView()?

@addaleax

addaleax Apr 5, 2017

Member

> Isn't this equivalent to ArrayBuffer.isView()?

@lpinca yep, those do the same thing.

@TimothyGu

TimothyGu Apr 5, 2017

Member

Yes, I think the standard version should be better in this case.
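
For reference, a quick check of what the standard ArrayBuffer.isView() reports for the types discussed in this thread. This is only a sketch of the standard function's behaviour, not the code that landed.

```javascript
const backing = new ArrayBuffer(8);

console.log(ArrayBuffer.isView(new Uint8Array(backing)));   // true
console.log(ArrayBuffer.isView(new Float64Array(backing))); // true
console.log(ArrayBuffer.isView(new DataView(backing)));     // true
console.log(ArrayBuffer.isView(Buffer.alloc(8)));           // true (Buffer extends Uint8Array)
console.log(ArrayBuffer.isView(backing));                   // false (the buffer itself is not a view)
console.log(ArrayBuffer.isView([1, 2, 3]));                 // false
```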

@addaleax

Is this WIP? There are a few more places in the API where currently only Uint8Arrays are supported (or at least it’s documented that way).

Anyway, I think I’m +1 on this change… @nodejs/collaborators Thoughts?

@@ -99,7 +99,7 @@ function zlibBuffer(engine, buffer, callback) {
     var chunk;
     while (null !== (chunk = engine.read())) {
       buffers.push(chunk);
-      nread += chunk.length;
+      nread += chunk.byteLength;

@addaleax

addaleax Apr 5, 2017

Member

chunk is engine output, so it’s always a Buffer, right?

@TimothyGu

TimothyGu Apr 5, 2017

Member

I think so, but I'd rather err on the side of replacing too many Buffer.lengths rather than missing an ArrayBufferView.byteLength, since Buffer.length === Buffer.byteLength.
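
The reasoning can be checked directly: for a Buffer the two properties agree, while for wider element types or a DataView only byteLength counts bytes. A small sketch with illustrative variable names:

```javascript
const buf = Buffer.from('abcdefgh');          // 8 bytes
const u32 = new Uint32Array(2);               // 2 elements, 8 bytes
const dv = new DataView(new ArrayBuffer(8));  // 8 bytes, no length property

console.log(buf.length, buf.byteLength); // 8 8
console.log(u32.length, u32.byteLength); // 2 8
console.log(dv.length, dv.byteLength);   // undefined 8
```

Code that sums chunk.length would therefore undercount bytes for a Uint32Array and produce NaN for a DataView, whereas byteLength is correct for all three.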

test/common.js
  const out = [];
  for (const type of arrayBufferViews) {
    const { BYTES_PER_ELEMENT = 1 } = type;
    if (Number.isInteger(byteLength % BYTES_PER_ELEMENT)) {

@addaleax

addaleax Apr 5, 2017

Member

Did you mean byteLength % BYTES_PER_ELEMENT === 0, or Number.isInteger(byteLength / BYTES_PER_ELEMENT)?

@TimothyGu

TimothyGu Apr 5, 2017

Member

Uhh… that's embarrassing. Good catch.
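
To spell out the catch: the remainder of two integers is always an integer, so the reviewed condition can never be false. The sketch below shows that, followed by one possible shape for such a test helper. The names viewTypes and viewsOver are hypothetical, and the extra alignment check is an assumption of this sketch, not something taken from the PR.

```javascript
const totalBytes = 5;
const elementSize = 2;

// Always true for integer inputs, so it filters nothing.
console.log(Number.isInteger(totalBytes % elementSize)); // true
// Either suggested form detects that 5 bytes cannot fill 2-byte elements.
console.log(totalBytes % elementSize === 0);             // false
console.log(Number.isInteger(totalBytes / elementSize)); // false

const viewTypes = [
  Int8Array, Uint8Array, Uint8ClampedArray,
  Int16Array, Uint16Array, Int32Array, Uint32Array,
  Float32Array, Float64Array, DataView,
];

// Hypothetical helper: every view type that can cover exactly the same bytes.
function viewsOver(bytes) {
  const { buffer, byteOffset, byteLength } = bytes;
  const out = [];
  for (const Type of viewTypes) {
    // DataView has no BYTES_PER_ELEMENT, so treat it as byte-granular.
    const { BYTES_PER_ELEMENT = 1 } = Type;
    const fits = byteLength % BYTES_PER_ELEMENT === 0;
    // Typed array constructors throw on a misaligned offset (assumed check).
    const aligned = byteOffset % BYTES_PER_ELEMENT === 0;
    if (fits && aligned) {
      out.push(new Type(buffer, byteOffset, byteLength / BYTES_PER_ELEMENT));
    }
  }
  return out;
}

const views = viewsOver(new Uint8Array(6));
console.log(views.length); // 6
```

For six bytes at offset 0, the 1-byte and 2-byte element types and DataView qualify, while the 4-byte and 8-byte types are skipped.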

@mcollina

mcollina approved these changes Apr 5, 2017 edited

LGTM as it is. I think it makes sense to support JS types for binary data wherever possible.

@TimothyGu

TimothyGu Apr 5, 2017

Member

@addaleax

> Is this WIP? There are a few more places in the API where currently only Uint8Arrays are supported (or at least it’s documented that way).

Somewhat. My intention was to incrementally update individual modules, instead of changing all the modules at the same time – sort of like how the support for Uint8Arrays was incrementally added. The crypto and zlib modules in this PR serve as examples for how it's done.
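
From the caller's side, the crypto and zlib changes might look roughly like the sketch below. It assumes a Node.js version in which hash.update() and zlib.deflateSync() accept any TypedArray or DataView; the sample data and variable names are illustrative.

```javascript
const crypto = require('crypto');
const zlib = require('zlib');

const bytes = Buffer.from('hello world!'); // 12 bytes, a multiple of 4

// Copy into a fresh ArrayBuffer so the Uint32Array view starts at offset 0.
const u32 = new Uint32Array(
  bytes.buffer.slice(bytes.byteOffset, bytes.byteOffset + bytes.byteLength)
);

// Hashing the view hashes its underlying bytes, so both digests match.
const fromBuffer = crypto.createHash('sha256').update(bytes).digest('hex');
const fromView = crypto.createHash('sha256').update(u32).digest('hex');
console.log(fromBuffer === fromView); // true

// Compressing the view and inflating again returns the original bytes.
const roundTrip = zlib.inflateSync(zlib.deflateSync(u32));
console.log(roundTrip.equals(bytes)); // true
```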

@domenic

domenic Apr 11, 2017

Member

It looks to me like this PR does not take into account byteOffset. Is that correct? If so that seems pretty bad.

I think you need adapter code like https://github.com/tmpvar/jsdom/blob/28d08f58b82cc2100ad36b99cf5e8b4bbb3fc291/lib/api.js#L329-L333

@addaleax

addaleax Apr 11, 2017

Member

> It looks to me like this PR does not take into account byteOffset. Is that correct? If so that seems pretty bad.

@domenic Can you point to the code that you think doesn’t account for the offset? It does look like that’s handled correctly everywhere.

@domenic

domenic Apr 11, 2017

Member

Well, for example, https://github.com/TimothyGu/node/blob/13152405dd8f248192fea558a9bc206201f884f1/lib/crypto.js does not contain the string "byteOffset" anywhere.

@addaleax

addaleax Apr 11, 2017

Member

@domenic For most of the APIs that use C++, including the crypto ones, the ABV is unwrapped in https://github.com/nodejs/node/pull/12223/files#diff-772f489c7d0a32de3badbfbcb5fd200dR441. That does add the byteOffset to the pointer that actually ends up used (both with and without this PR), so I see no reason for concern here.
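
The concern and its resolution can be illustrated in JS terms. This is only a sketch of why byteOffset matters when a view covers part of a larger ArrayBuffer; it does not reproduce the C++ unwrapping code linked above.

```javascript
const backing = new ArrayBuffer(8);
new Uint8Array(backing).set([0, 1, 2, 3, 4, 5, 6, 7]);

// A view over bytes 2..5 only (two 16-bit elements starting at offset 2).
const view = new Uint16Array(backing, 2, 2);

// Ignoring byteOffset and byteLength exposes the whole backing store.
const wrong = Buffer.from(view.buffer);
// Passing them through yields exactly the viewed bytes.
const right = Buffer.from(view.buffer, view.byteOffset, view.byteLength);

console.log(wrong.length); // 8
console.log([...right]);   // [ 2, 3, 4, 5 ]
```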

@domenic

domenic Apr 11, 2017

Member

Great, that is what I was missing. Thank you!

@TimothyGu

Member

TimothyGu commented Apr 12, 2017

Landed in ec53921...2ced07c.

@TimothyGu TimothyGu closed this Apr 12, 2017

@TimothyGu TimothyGu deleted the TimothyGu:typedarray branch Apr 12, 2017

TimothyGu added a commit that referenced this pull request Apr 12, 2017

src: allow ArrayBufferView as instance of Buffer
PR-URL: #12223
Reviewed-By: Matteo Collina <matteo.collina@gmail.com>
Reviewed-By: James M Snell <jasnell@gmail.com>
Reviewed-By: Anna Henningsen <anna@addaleax.net>

TimothyGu added a commit that referenced this pull request Apr 12, 2017

test: add common.getArrayBufferViews(buf)
PR-URL: #12223
Reviewed-By: Matteo Collina <matteo.collina@gmail.com>
Reviewed-By: James M Snell <jasnell@gmail.com>
Reviewed-By: Anna Henningsen <anna@addaleax.net>

TimothyGu added a commit that referenced this pull request Apr 12, 2017

crypto: support all ArrayBufferView types
PR-URL: #12223
Reviewed-By: Matteo Collina <matteo.collina@gmail.com>
Reviewed-By: James M Snell <jasnell@gmail.com>
Reviewed-By: Anna Henningsen <anna@addaleax.net>

TimothyGu added a commit that referenced this pull request Apr 12, 2017

zlib: support all ArrayBufferView types
PR-URL: #12223
Reviewed-By: Matteo Collina <matteo.collina@gmail.com>
Reviewed-By: James M Snell <jasnell@gmail.com>
Reviewed-By: Anna Henningsen <anna@addaleax.net>

@TimothyGu TimothyGu referenced this pull request Apr 14, 2017

Closed

string_decoder: support Uint8Array input to methods #11613

4 of 4 tasks complete

@jasnell jasnell referenced this pull request May 11, 2017

Closed

8.0.0 Release Proposal #12220

tniessen added a commit to tniessen/node that referenced this pull request Feb 8, 2018

@tniessen tniessen referenced this pull request Feb 8, 2018

Closed

crypto: fix meta and description #18651

2 of 2 tasks complete

tniessen added a commit that referenced this pull request Feb 9, 2018

doc: fix description of createDecipheriv
PR-URL: #18651
Refs: #12223
Reviewed-By: Ruben Bridgewater <ruben@bridgewater.de>
Reviewed-By: Luigi Pinca <luigipinca@gmail.com>
Reviewed-By: Tiancheng "Timothy" Gu <timothygu99@gmail.com>

MylesBorins added a commit that referenced this pull request Feb 21, 2018

doc: fix description of createDecipheriv
PR-URL: #18651
Refs: #12223
Reviewed-By: Ruben Bridgewater <ruben@bridgewater.de>
Reviewed-By: Luigi Pinca <luigipinca@gmail.com>
Reviewed-By: Tiancheng "Timothy" Gu <timothygu99@gmail.com>

gibfahn added a commit that referenced this pull request Apr 13, 2018

doc: fix description of createDecipheriv
PR-URL: #18651
Refs: #12223
Reviewed-By: Ruben Bridgewater <ruben@bridgewater.de>
Reviewed-By: Luigi Pinca <luigipinca@gmail.com>
Reviewed-By: Tiancheng "Timothy" Gu <timothygu99@gmail.com>

MayaLekova added a commit to MayaLekova/node that referenced this pull request May 8, 2018

doc: fix description of createDecipheriv
PR-URL: nodejs#18651
Refs: nodejs#12223
Reviewed-By: Ruben Bridgewater <ruben@bridgewater.de>
Reviewed-By: Luigi Pinca <luigipinca@gmail.com>
Reviewed-By: Tiancheng "Timothy" Gu <timothygu99@gmail.com>