
allow writing to an existing Uint8Array, and add lastChunkHandling option #33

Merged · 2 commits merged into main on Jan 7, 2024

Conversation

@bakkot (Collaborator) commented Dec 13, 2023

Fixes #21: this adds let { read, written } = Uint8Array.fromBase64Into(string, target, options) and let { read, written } = Uint8Array.fromHexInto(string, target, options) methods, which allow writing to an existing Uint8Array, matching TextEncoder's encodeInto().
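A minimal usage sketch of the shape described above (the base64 input and buffer size are just illustrations):

```js
const target = new Uint8Array(8);
// 'aGVsbG8=' is the base64 encoding of "hello".
const { read, written } = Uint8Array.fromBase64Into('aGVsbG8=', target);
// read: characters consumed from the input (8 here, including padding)
// written: bytes produced (5 here)
console.log(target.subarray(0, written)); // the bytes of "hello"
```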

These methods also add an outputOffset option in the options bag, which lets you specify an offset to start writing at so you can skip the subarray call. TextEncoder doesn't have this, but apparently it's important for V8. (Edit: moved this part into #34.)
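Illustratively, the option compares to the subarray approach like this (treat the option as hypothetical here, since it now lives in #34):

```js
const input = 'aGVsbG8=';
const target = new Uint8Array(32);

// Hypothetical (moved to #34): decode starting at byte 16 of `target`
// without allocating an intermediate view.
Uint8Array.fromBase64Into(input, target, { outputOffset: 16 });

// The equivalent without the option, paying for a subarray allocation:
Uint8Array.fromBase64Into(input, target.subarray(16));
```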

Also fixes #13: this adds a lastChunkHandling: 'stop-before-partial' option to the base64 decoders, which allows an efficient streaming decoder to be implemented in userland (efficient meaning no extra passes over the input string in userland). See the updated stream.mjs. The other values for lastChunkHandling are 'loose' and 'strict'; 'strict' enforces padding and zero-ness of the overflow bits. 'loose' is the default and matches the behavior of atob.
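Roughly, the userland streaming pattern this enables looks like the following simplified sketch of the approach in stream.mjs (the function name and chunk plumbing are illustrative):

```js
let extra = ''; // leftover characters from a partial trailing quantum

// Decode one incoming chunk into `target` (assumed large enough here).
function decodeBase64Chunk(chunk, target) {
  const input = extra + chunk;
  const { read, written } = Uint8Array.fromBase64Into(input, target, {
    lastChunkHandling: 'stop-before-partial',
  });
  // Everything past `read` is an incomplete 4-character unit; carry it
  // into the next call instead of re-scanning the whole input.
  extra = input.slice(read);
  return written;
}
```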

As a consequence of the refactor to use lastChunkHandling, there's no longer an option to forbid whitespace in the base64 decoders. This is easy to do in userland if you really want to, though there is rarely occasion for it. Whitespace is always permitted in the base64 decoder, and never in the hex decoder.
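For instance, the userland whitespace check is a one-liner before decoding (a sketch, using the proposal's one-shot decoder):

```js
const input = 'aGVsbG8=';
// Reject whitespace up front, then decode normally.
if (/\s/.test(input)) {
  throw new SyntaxError('whitespace is not allowed in this input');
}
const bytes = Uint8Array.fromBase64(input);
```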

When this lands, the proposal should be ready for stage 3 (or rather 2.7).

cc @phoddie @syg

@annevk (Member) commented Dec 13, 2023

If subarray is slow, shouldn't that be optimized instead? Letting each API pay the price seems like the wrong tradeoff.

@domenic objected to this before and I kinda agree with him that it's bad.

Especially for a v1 we shouldn't do this. Once v1 has been out for a while and perf metrics are provided, we could reconsider, but note that nobody ever provided such metrics for encodeInto().

@syg commented Dec 13, 2023

> If subarray is slow, shouldn't that be optimized instead? Letting each API pay the price seems like the wrong tradeoff.

Can you elaborate on what "optimizing subarray" means? I don't think this is as widespread as "each API"; it's more like "each API that wishes to be as high-performance as possible", and those are much rarer.

V8's opinion here is that we have numerous internal complaints about TextDecoder's poor performance that point to static APIs taking explicit buffers and offsets as an incremental step, and the encoding version ought to be symmetric.

@phoddie commented Dec 13, 2023

FWIW – I agree with the comments of both @annevk and @syg. Passing the subview bounds as API arguments increases complexity, all the more so here where there are both an input and an output buffer. But performance also matters.

An alternative approach could keep the API minimal and still support high performance. The performance issue arises largely, as I understand it (and observe in Moddable's work on embedded devices), from having to create a new view to access different parts of the buffer. That's necessary because the byteOffset and byteLength properties of a view are fixed at construction. If they were mutable, a single view instance could be used to access different parts of the buffer.

For a very long time, there was no precedent for views changing after construction. @syg's proposal for Resizable ArrayBuffer changed that by allowing TypedArrays to opt into having their byteLength property track the size of the underlying resizable buffer. We could build on that precedent to allow a view to opt into having mutable byteOffset and byteLength properties.

Such an approach would require careful consideration (compatibility, memory safety, etc.). The benefits seem worthwhile: it would keep the Base64 APIs simple while allowing them to be performant, and it would also benefit other situations, such as the TextDecoder performance issue Shu cites.
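To make the cost concrete: today each region of a buffer needs its own view, because byteOffset and byteLength are fixed at construction (an illustrative sketch; process is a placeholder):

```js
const buffer = new ArrayBuffer(256);
const process = (view) => { /* placeholder consumer */ };

for (let offset = 0; offset < buffer.byteLength; offset += 64) {
  // One view allocation per region; a mutable-view design would let a
  // single instance be re-pointed at each region instead.
  process(new Uint8Array(buffer, offset, 64));
}
```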

@annevk (Member) commented Dec 14, 2023

> V8's opinion here is that we have numerous internal complaints about TextDecoder's poor performance that point to static APIs taking explicit buffers and offsets as an incremental step, and the encoding version ought to be symmetric.

Given that these complaints haven't surfaced until now (and performance metrics still haven't surfaced to my knowledge) while encodeInto() has been around for a long time, it's not clear that should drive decision-making here.

I'm more than willing to accept that we have to change certain APIs to also take an offset, but that should be done in a much more deliberate manner with data to back it up.

We agreed on the fundamental design here in 2015/2018.

There's some further background in whatwg/encoding#172.

@syg commented Dec 14, 2023

There are two threads here I'd like to tease apart:

  1. Whether the TextEncoder / TextDecoder APIs, which have performance issues in some Google workloads, should be changed or amended.
  2. Whether the design of this proposal needs to be analogous to TextEncoder / TextDecoder.

We should strive for (2), but it is not a hard requirement; I'm sensing that @annevk (and perhaps @domenic) weigh it more heavily than TC39 currently does?

> Given that these complaints haven't surfaced until now (and performance metrics still haven't surfaced to my knowledge) while encodeInto() has been around for a long time, it's not clear that should drive decision-making here.

Are you saying our performance desires shouldn't drive the design of this TC39 proposal because we've seen performance issues in TextDecoder and TextEncoder, and that keeping a close analogy with the existing API trumps that concern until the performance issues in the analogous API can be validated? That is, does this boil down to a judgment call about the DX footgun of a (small, IMO) divergence here that can be healed with future proposals?

I hear you on the "first I've heard of this" part; I'll try to compile some numbers.

@annevk (Member) commented Dec 14, 2023

You're saying you want a design that's different from TextDecoder because of concerns with that API that haven't been made public since the API was added. I'm saying we need strong justification for divergence of established patterns, especially when past proposals for equivalent API shapes (taking an offset) were rejected as documented in the referenced threads.

And to be clear, performance is also important for encodeInto(). It's important for all bring-your-own buffer APIs.

@syg commented Dec 14, 2023

> You're saying you want a design that's different from TextDecoder because of concerns with that API that haven't been made public since the API was added. I'm saying we need strong justification for divergence of established patterns, especially when past proposals for equivalent API shapes (taking an offset) were rejected as documented in the referenced threads.

Thanks, that's clearer. My read of the past rejection is @domenic's (and yours?) preference that we not design around performance limitations of object allocation, against which I am now raising the contradictory point from V8 that we do care about it. It's fair to ask why we reached this conclusion; I'll work to provide something there.

Still, I'd like to push back against "we need strong justification for divergence of established patterns" when that pattern is outside the purview of the core JS stdlib and the proposed divergence doesn't preclude re-convergence later.

> It's important for all bring-your-own buffer APIs.

Yep, that's the goal here. I think we agree that high-performance BYOB APIs are important. V8 in particular wants new APIs to have BYOB overloads that make the "right" (which is what we're debating) tradeoffs between DX and performance.

@annevk (Member) commented Dec 14, 2023

I don't see how it matters where the pattern has been established. No need to ship the org chart.

@ljharb (Member) commented Dec 14, 2023

It's not about the org, it's about what constraints and use cases the API was designed with and for.

@annevk (Member) commented Dec 15, 2023

As I explained above, those are the same.

@jridgewell (Member)

I think @bakkot is reacting to me saying subarray is slow. My measurements are for a source map VLQ library from a few years ago; let me whip up a new comparison that shows the relative performance of my current single-subarray approach vs the slower recreating-subarray approach.

@bakkot (Collaborator, Author) commented Dec 15, 2023

It's apparently also expensive in JSC, though I haven't benchmarked that either.

@jridgewell (Member) commented Dec 15, 2023

https://jsbench.github.io/#94c46cc814cab852a54aef37c56d59c4 shows a marginal improvement in Chrome and Safari (no change in Firefox) when using a single subarray rather than creating multiple. I remember this being a much larger difference. But even a relatively small number of subarrays (2 vs 36 in this example) is enough to impact the performance even when doing relatively expensive calculations (encoding source mappings into VLQ mappings).
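For readers, the two patterns being compared look roughly like this (a simplified sketch, not the code from the linked benchmark; consume is a placeholder):

```js
const bytes = new Uint8Array(1024);
const consume = (view, start = 0, end = view.length) => { /* placeholder */ };

// Recreating a subarray per segment:
for (let i = 0; i < 36; i++) {
  consume(bytes.subarray(i * 16, (i + 1) * 16));
}

// A single subarray created up front, reused with explicit indexes:
const view = bytes.subarray(0);
for (let i = 0; i < 36; i++) {
  consume(view, i * 16, (i + 1) * 16);
}
```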

@dead-claudia

> https://jsbench.github.io/#94c46cc814cab852a54aef37c56d59c4 shows a marginal improvement in Chrome and Safari (no change in Firefox) when using a single subarray rather than creating multiple. I remember this being a much larger difference. But even a relatively small number of subarrays (2 vs 36 in this example) is enough to impact the performance even when doing relatively expensive calculations (encoding source mappings into VLQ mappings).

@jridgewell I thought Base64 VLQ parsing and generation shouldn't be doing full Base64 <-> binary byte sequence conversion, just alphabet <-> 6-bit segment. It's not only much faster to do it directly letter by letter, but also easier.

Could you explain how full Base64 byte-sequence encoding and decoding come into play here, aside from decoding and encoding the content of inline data:application/json;charset=utf-8;base64,... source map URLs? I'm struggling with the premise here.
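For context, the alphabet-to-6-bit mapping described above is a per-character index lookup, with no byte-sequence conversion involved (a sketch):

```js
const ALPHABET =
  'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/';
// One base64 character -> one 6-bit value.
const sextet = (ch) => ALPHABET.indexOf(ch);
```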

@jridgewell (Member)

Huh? Discussion in this PR thread is about whether we should lean into typed array subarrays for the API, or use offset indexes into an existing typed array. The source map VLQ decoding is totally separate from this proposal; I'm using a library that I maintain to show a real performance impact caused by calling typedArray.subarray(…) repeatedly.

IMHO, we should be using offset indexes. And if someone wants to use subarrays because it makes their code cleaner, then it'll transparently work, because the offset indexes can default to 0.

@dead-claudia

@jridgewell Oh, okay. Sorry about that then. Feel free to ignore me.

@bakkot (Collaborator, Author) commented Jan 7, 2024

I am basically neutral on outputOffset. Since it's controversial, I'm going to split it out into its own PR (#34) and land the remaining, less controversial parts as-is.

@bakkot merged commit 82b8b6d into main on Jan 7, 2024 (1 check passed) and deleted the byob branch.
Successfully merging this pull request may close these issues: encoding into an existing buffer (#21); Streaming / multi-shot (#13).