What is the issue with the Encoding Standard?
I've been asked to file a new issue following this comment (#333 (comment)), and here I am doing that:
- no runtime on the server prefers the standard APIs to convert JS UTF-16 based strings into UTF-8 compatible strings, yet our operating systems, files, and the Web itself run on UTF-8 to maximize compatibility with all programming languages and to avoid misleading, or error-prone, conversions all over the place
- no library whose goal is to serialize JS data as binary uses TextEncoder or TextDecoder, except after trying to avoid both by all means, because these are extremely slow, via `encodeInto` variants or not, compared to more or less accurate JS-only solutions. All JS-only solutions are fast enough on laptops plugged in, but CPUs degrade performance 5x+ once devices are in battery-save mode ... or, better said: these APIs perform extremely badly compared to just-JS code, and it's not clear why they are so slow compared to solutions in Node.js, Bun, or other JS runtimes that are not based on these APIs (an illustrative benchmark follows this list)
- if all server runtimes need to avoid these APIs, and SharedArrayBuffer + Atomics is used to convert JS references otherwise impossible to pass around natively as binary, it's clear to me we're lacking a primitive whose only purpose would be to return a buffer, via `String.prototype.toUTF8Buffer()` or any other named API whose goal is to just do that without all the performance caveats behind the scenes (a sketch follows this list). At the same time, it's not clear why every JS solution literally outperforms these native APIs, so something might be really off behind these API implementations, as most libraries are based on RFC standards and have been proven to work for years while bypassing TextEncoder or TextDecoder usage
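
To make the proposal concrete, here is a minimal sketch of what such a primitive could polyfill to, assuming the `toUTF8Buffer` name suggested above (which does not exist today); the body is the kind of `charCodeAt` loop that userland serializers already ship to bypass TextEncoder:

```js
// Manual UTF-8 encoder: the charCodeAt loop userland libraries use
// to avoid TextEncoder. Lone surrogates become U+FFFD, matching the
// Encoding Standard's behavior.
function utf8Encode(str) {
  // 3 bytes per UTF-16 code unit is a safe upper bound
  // (surrogate pairs use 2 units for 4 bytes, i.e. 2 bytes/unit).
  const out = new Uint8Array(str.length * 3);
  let i = 0;
  for (let j = 0; j < str.length; j++) {
    let code = str.charCodeAt(j);
    // Combine a valid surrogate pair into one code point.
    if (code >= 0xd800 && code <= 0xdbff && j + 1 < str.length) {
      const next = str.charCodeAt(j + 1);
      if (next >= 0xdc00 && next <= 0xdfff) {
        code = 0x10000 + ((code - 0xd800) << 10) + (next - 0xdc00);
        j++;
      }
    }
    if (code >= 0xd800 && code <= 0xdfff) code = 0xfffd; // lone surrogate
    if (code < 0x80) out[i++] = code;
    else if (code < 0x800) {
      out[i++] = 0xc0 | (code >> 6);
      out[i++] = 0x80 | (code & 0x3f);
    } else if (code < 0x10000) {
      out[i++] = 0xe0 | (code >> 12);
      out[i++] = 0x80 | ((code >> 6) & 0x3f);
      out[i++] = 0x80 | (code & 0x3f);
    } else {
      out[i++] = 0xf0 | (code >> 18);
      out[i++] = 0x80 | ((code >> 12) & 0x3f);
      out[i++] = 0x80 | ((code >> 6) & 0x3f);
      out[i++] = 0x80 | (code & 0x3f);
    }
  }
  return out.slice(0, i); // exact-size copy
}

// The proposed primitive, sketched as a polyfill (the name comes from
// this issue's suggestion above -- not an existing or agreed-upon API):
String.prototype.toUTF8Buffer ??= function () {
  return utf8Encode(String(this));
};
```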
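
For what it's worth, the slowness claim is trivially checkable with a quick micro-benchmark along these lines (purely illustrative, not rigorous; the sample input and iteration counts are arbitrary, and absolute numbers vary across engines, hardware, and power modes):

```js
// Compare TextEncoder.encode against the utf8Encode sketch above.
const sample = "héllo wörld 🙏 ".repeat(10_000);
const encoder = new TextEncoder();

console.time("TextEncoder.encode");
for (let i = 0; i < 100; i++) encoder.encode(sample);
console.timeEnd("TextEncoder.encode");

console.time("charCodeAt loop");
for (let i = 0; i < 100; i++) utf8Encode(sample);
console.timeEnd("charCodeAt loop");
```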
In summary, it would be great to understand why these APIs are so slow, and why there is no interest in having best-in-class performance baked in for transforming strings into binary data and decoding them back.
Use cases:
- Atomics via SharedArrayBuffer, where the only language these APIs speak is binary and views over buffers (sketched after this list)
- cross-posting and cross-programming-language communication
- file handling, where all APIs return an `arrayBuffer`, but that's both async (usually) and still needs conversion to text if the data travels; leader-tab patterns used to enable OPFS, where data can only travel as binary to be fast enough, avoiding any encoding/decoding all over the place (buffers travel fast)
- serialization, where to know the byte length of a JS string we need all sorts of workarounds for what a `byteLength` could provide, and even that would not solve the whole issue: we also need/want that buffer afterwards (see the byte-length sketch after this list)
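
To make the first use case concrete, here is a minimal sketch of a string crossing an Atomics-coordinated SharedArrayBuffer, reusing the `utf8Encode` sketch above; the buffer layout and the `postString`/`readString` names are purely illustrative (in browsers this also assumes cross-origin isolation, and `Atomics.wait` must run in a worker):

```js
// Layout: one Int32 length header followed by the payload bytes.
// No bounds checks: sketch only.
const sab = new SharedArrayBuffer(4 + 1024);
const header = new Int32Array(sab, 0, 1);
const payload = new Uint8Array(sab, 4);

// Writer side (e.g. the leader tab): encode, copy, publish, notify.
function postString(str) {
  const bytes = utf8Encode(str); // strings must become bytes first
  payload.set(bytes);
  Atomics.store(header, 0, bytes.length);
  Atomics.notify(header, 0);
}

// Reader side (e.g. a worker): block until a length is published.
function readString() {
  Atomics.wait(header, 0, 0);
  const length = Atomics.load(header, 0);
  // slice() copies out of the SAB into a non-shared buffer before
  // decoding (engines have historically rejected SharedArrayBuffer
  // backed views here), so the round trip costs yet another copy.
  return new TextDecoder().decode(payload.slice(0, length));
}
```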
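
And for the serialization use case, a minimal sketch of the byte-length pre-computation the last item describes, via the same kind of `charCodeAt` loop; `utf8ByteLength` is an illustrative name, not an existing API:

```js
// UTF-8 byte length of a JS string without encoding it first.
function utf8ByteLength(str) {
  let bytes = 0;
  for (let i = 0; i < str.length; i++) {
    const code = str.charCodeAt(i);
    if (code < 0x80) bytes += 1;
    else if (code < 0x800) bytes += 2;
    else if (
      code >= 0xd800 && code <= 0xdbff &&
      i + 1 < str.length &&
      (str.charCodeAt(i + 1) & 0xfc00) === 0xdc00
    ) {
      bytes += 4; // valid surrogate pair: one 4-byte code point
      i++;
    } else bytes += 3; // BMP char, or lone surrogate encoded as U+FFFD
  }
  return bytes;
}
```

That computed length is typically needed right before encoding anyway (e.g. to write a length prefix), which is why a primitive returning the buffer directly would cover both needs at once.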
Thanks in advance for considering any alternative API that would boost performance, or for investigating why we need to loop `charCodeAt` all over the Web, either to be faster than the native APIs whose goal is supposedly to simplify that encoding/decoding dance, or to compute the resulting byte length of a JS string, usually right before needing that string as a buffer anyway 🙏