Skip to content

add btoa()/atob() builtins#1444

Open
saghul wants to merge 1 commit intomasterfrom
feat/base64
Open

add btoa()/atob() builtins#1444
saghul wants to merge 1 commit intomasterfrom
feat/base64

Conversation

@saghul
Copy link
Copy Markdown
Contributor

@saghul saghul commented Apr 9, 2026

See the updated commit description.

quickjs.c Outdated
static char B64_PAIR_LUT[4096][2];
static int B64_PAIR_INIT = 0;

static inline void b64_pair_init_once(void) {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not thread-safe. Why don't you precompute the LUT? Faster than doing it once per process, plus it can be stored in read-only memory.

Aside: ALL_CAPS suggests B64_PAIR_LUT etc. are macros when they're not.

quickjs.c Outdated
unsigned ch2 = (unsigned char)src[k];
uint8_t f2 = B64_FLAGS[ch2];
if (f2 & K_WS) continue; // ws
if (ch2!='=') { if (err) *err = 1; return 0; }
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This line is kind of hideous :)

@chqrlie
Copy link
Copy Markdown
Collaborator

chqrlie commented Apr 10, 2026

8KB of extra data seems wasteful for little gain for a rare use case. Please keep quickjs small.

@saghul
Copy link
Copy Markdown
Contributor Author

saghul commented Apr 10, 2026

There are also base64 operations in uint8array, this should be useful for that too.

@saghul
Copy link
Copy Markdown
Contributor Author

saghul commented Apr 10, 2026

@bnoordhuis I think I addressed all the feedback :-) I'll admit I just ported the original PR to master without looking much into it 😅

@chqrlie
Copy link
Copy Markdown
Collaborator

chqrlie commented Apr 10, 2026

The static array b64_pair is just useless bloat. Did anyone actually benchmark the performance gain and what usecase make it compelling to go down this road?

@saghul
Copy link
Copy Markdown
Contributor Author

saghul commented Apr 10, 2026

The static array b64_pair is just useless bloat. Did anyone actually benchmark the performance gain and what usecase make it compelling to go down this road?

The original author is @aabbdev and there was some discussion and benchmarks on the original PR. which got stalled and I'm trying to revive here. #1143

@bptato
Copy link
Copy Markdown
Contributor

bptato commented Apr 10, 2026

This PR still doesn't match the spec.

  • It encodes/decodes using the "base64url" charset, but standard btoa doesn't. '-' and '_' ought to be swapped with '+' and '/'. ref. forgiving-base64 encode, Table 1: The Base64 Alphabet. Test cases: btoa("\xFF"), btoa("\xFA").
    (Strangely enough, the original PR even acknowledges this in a comment.)
  • atob("Aaa==") should throw (too much padding), but it doesn't. It even accepts atob("Aaa==============") :)

On JS_AddIntrinsicBase64: I still think it should be together with JS_AddIntrinsicDOMException, otherwise the API is very easy to misuse. I'd suggest JS_AddIntrinsicSerialization which also anticipates structuredClone.

On the LUT: compared to chqrlie's naive solution, on my (old-ish) x86_64 CPU (+ clang 21) the LUT does speed up encoding by 8%. But considering that

  • even aabbdev says that calling costs are the main bottleneck, not encoding
  • according to chqrlie, the pair solution is actually slower on aarch64
  • a small size is one of QuickJS's main selling points

I agree with chqrlie in that including the LUT is a poor tradeoff.

On the decoder: chqrlie's decoder wins with -O3 -march=native. (With just -O3, aabbdev's decoder wins.)

@saghul
Copy link
Copy Markdown
Contributor Author

saghul commented Apr 10, 2026

Thanks for the feedback @bptato ! I'll try some alternatives. I already noticed some changes would ne becessary when making the uint8array base64 functions PR...

- Introduce global btoa() and atob() functions
- Standard base64 alphabet (RFC 4648)
- Decoder implements forgiving-base64-decode (WHATWG Infra spec)
- Tolerant to whitespace, validates padding per spec
- JS_AddIntrinsicAToB() ensures DOMException is registered

Co-authored-by: Saúl Ibarra Corretgé <s@saghul.net>
@saghul
Copy link
Copy Markdown
Contributor Author

saghul commented Apr 11, 2026

Updated, thanks for the patience everyone!

I think @chqrlie and @bptato are right, sorry it took me a bit to get it folks.

We can make it work first, then see if / how to optimize it.

I made the intrinsics function check and add DOMException if necessary. I'm not sure about giving it a more generic name yet, we can do that when we get to structuredClone.

Please take a look!

Comment on lines +60846 to +60848
uint32_t v = ((uint32_t)src[i] << 16)
| ((uint32_t)src[i + 1] << 8)
| (uint32_t)src[i + 2];
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is really just a suggestion but per the integer promotion rules you can write it more succinctly as demonstrated below and it compiles down to the same code:

Suggested change
uint32_t v = ((uint32_t)src[i] << 16)
| ((uint32_t)src[i + 1] << 8)
| (uint32_t)src[i + 2];
uint32_t v = 65536*src[i+0] + 256*src[i+1] + src[i+2];

If you were to read four fields you'd have to write the multiplier as 16777216u or 0x1000000u to force unsigned arithmetic.

Comment on lines +60885 to +60888
for (i = 0; i < len; i++) {
if (!(b64_flags[(unsigned char)src[i]] & K_WS))
nws++;
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm 98% sure any decent compiler will compile it to branchless code anyway but if you want to be explicit (and help tinycc a hand):

Suggested change
for (i = 0; i < len; i++) {
if (!(b64_flags[(unsigned char)src[i]] & K_WS))
nws++;
}
for (i = 0; i < len; i++)
nws += !(b64_flags[(unsigned char)src[i]] & K_WS);

nws++;
if (bits >= 8) {
bits -= 8;
dst[j++] = (uint8_t)((acc >> bits) & 0xFF);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The AND is superfluous if you're casting to uint8_t right after (and the cast itself isn't necessary either but it also doesn't hurt.)

JSString *s, *ostr;
size_t len, out_len, written;

val = JS_ToString(ctx, argv[0]);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe set ret = JS_EXCEPTION here, then you don't have to ret = ThingThatThrows(ctx) everywhere.

What you're doing now isn't wrong but ⤴️ is what we do in most places.

Comment on lines +60998 to +60999
written = b64_encode(in8, len, (char *)str8(ostr));
str8(ostr)[written] = '\0';
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It should be slightly faster to cache the result of str8(ostr) in a local variable because str8 is not so trivial that you can rely on compilers doing that for you.

Also applies to line 61,071 and below.

};

static const char b64_flags[256] = {
[' ']=K_WS, ['\t']=K_WS, ['\r']=K_WS, ['\n']=K_WS,
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"ASCII whitespace" also includes form feed (\f).
See: https://infra.spec.whatwg.org/#ascii-whitespace

@bptato
Copy link
Copy Markdown
Contributor

bptato commented Apr 11, 2026

I agree with getting it right before optimizing, but can we at least not make it O(N*2)?
Here's a cleaner option:

static size_t
b64_decode(const char *src, size_t len, uint8_t *dst, int *err)
{
    size_t i, j;
    uint32_t acc;
    int seen, pad;
    unsigned ch;

    acc = 0;
    seen = 0;
    for (i = 0, j = 0; i < len; i++) {
        ch = (unsigned char)src[i];
        if ((b64_flags[ch] & K_WS))
            continue;
        if (!(b64_flags[ch] & K_VAL))
            break;
        acc = (acc << 6) | b64_val[ch];
        seen++;
        if (seen == 4) {
            dst[j++] = (acc >> 16) & 0xFF;
            dst[j++] = (acc >> 8) & 0xFF;
            dst[j++] = (acc >> 0) & 0xFF;
            seen = 0;
            acc = 0;
        }
    }

    if (seen != 0) {
        if (seen == 3) { /* discard last 2 bits */
            dst[j++] = (acc >> 10) & 0xFF;
            dst[j++] = (acc >> 2) & 0xFF;
        } else if (seen == 2) { /* discard last 4 bits */
            dst[j++] = (acc >> 4) & 0xFF;
        } else { /* remainder is 1, return failure */
            *err = 1;
            return 0;
        }
        for (pad = 0; i < len; i++) {
            ch = (unsigned char)src[i];
            if (pad < 2 && ch == '=')
                pad++;
            else if (!(b64_flags[ch] & K_WS))
                break;
        }
        if (pad != 0 && seen + pad != 4) {
            /* got padding, but code point length is not a multiple of 4 */
            *err = 1;
            return 0;
        }
    }

    *err = i < len;
    return j;
}

If remainder (seen) is 0, there can be no padding (2.). If it is 1, we return failure (3.). And if it's 2 or 3, we read up to 2 =s, and then the sum of seen + pad must be 4 (2.1).
(K_PAD can be removed.)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants