Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

TLDR: We request the support for TextEncoder in FLEDGE bidding worklet. #961

Open
zhaozegoog opened this issue Dec 21, 2023 · 5 comments
Open

Comments

@zhaozegoog
Copy link

TLDR: We request the support for TextEncoder in FLEDGE bidding worklet.

We’re testing our WASM implementation of generateBid, and we found difficulties as FLEDGE worklet doesn’t support TextEncoder. This is essential as our implementation depends on Protobuf.

When we use polyfill instead of TextEncoder, the resulting bidding script is significantly slower than javascript, which is in contradict to what we observe in our simulation with TextEncoder (running under Chrome main thread).

@caraitto
Copy link
Collaborator

If you're using Protobufs (a binary format) and WASM (which takes binary data in formats like ArrayBuffer, where Uint8Array is a view into that ArrayBuffer), and are running into the lack of TextEncoder, it sounds like you're looking for ways to get strings (I assume encoding the binary data in things like base64) from JavaScript into WASM. In that case, TextEncoder might be the wrong API, as it takes UTF-16 code units from an input string and converts that into a UTF-8 Uint8Array -- IIUC, it can’t correctly round trip arbitrary byte sequences that aren’t valid UTF-16.

Instead, a library that does base64 -> ArrayBuffer is what you probably want instead -- I wrote a pure-JS library that seems to perform well (comparable to TextEncoder), code below. The base64 format is described in https://datatracker.ietf.org/doc/html/rfc4648.

Or is the protobuf parsing happening in a JavaScript library that itself invokes the TextEncoder library?

Here’s the pure JavaScript code I wrote that does UTF-8 -> ArrayBuffer conversion:

const b64Chars = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

const lookup = new Uint8Array(256);
for (let i = 0; i < b64Chars.length; i++) {
  lookup[b64Chars.charCodeAt(i)] = i;
}

function b64ToArrayBuf(b64String) {
  const outputBuf = allocateBufFor(b64String);
  return b64ToArrayBufInto(b64String, outputBuf);
}

function allocateBufFor(b64String) {
  if (b64String.length % 4 !== 0) {
    throw "b64ToArrayBuf(): Expected input size divisible by 4; actual size " +
        b64String.length;
  }

  if (b64String.length === 0) {
    return new ArrayBuffer(0);
  }

  let outputLength = b64String.length / 4 * 3;
  if (b64String[b64String.length - 1] === "=") {
    outputLength--;
    if (b64String[b64String.length - 2] === "=") {
      outputLength--;
    }
  }

  return new ArrayBuffer(outputLength);
}

function b64ToArrayBufInto(b64String, outputBuf) {
  const outputUint8Array = new Uint8Array(outputBuf);
  let outPtr = 0;

  for (let i = 0; i < b64String.length; i += 4) {
    const part0 = lookup[b64String.charCodeAt(i)];
    const part1 = lookup[b64String.charCodeAt(i + 1)];
    const part2 = lookup[b64String.charCodeAt(i + 2)];
    const part3 = lookup[b64String.charCodeAt(i + 3)];

    outputUint8Array[outPtr++] = (part0 << 2) | (part1 >> 4);
    outputUint8Array[outPtr++] = ((part1 & 0x0F) << 4) | (part2 >> 2);
    outputUint8Array[outPtr++] = ((part2 & 0x03) << 6) | part3;
  }

  return outputBuf;
}

@zhaozegoog
Copy link
Author

Thanks for the suggestion!

We're testing your implementation and will report back on the efficiency.

Or is the protobuf parsing happening in a JavaScript library that itself invokes the TextEncoder library?

Unfortunately that's the case. There is a function serializeBinary (which returns Uinit8Array) in the protobuf library (JS) that is calls TextEncoder underneath, and hence your suggested implementation won't be sufficient.

Would appreciate if you have any more suggestions.

@droundy
Copy link

droundy commented Feb 2, 2024

We're running into a similar issue with TextDecoder, which is used by the glue javascript generated by wasm-bindgen for rust.

@JensenPaul
Copy link
Collaborator

@droundy, would you please add your affiliation to your GitHub profile?
I checked out wasm-bindgen and was looking around to see how they used TextDecoder. I see suspect you're running into this use of it. Do you have a rust example that generates a use of TextDecoder?

@droundy
Copy link

droundy commented Feb 16, 2024

I've added my affiliation with NextRoll to my profile.

Cargo.toml
lib.rs

You can build (once you put them in the right directories with

wasm-pack build --target web --release --no-typescript --out-dir pkg

which results in a pkg/minimal_bidding_logic.js that uses TextDecoder to convert rust strings into javascript strings. Right now we're using a polyfill to manage this, but it's not ideal.

I'm sure my example could be more minimal, and I haven't actually tested this minimal example in a browser to confirm that it uses those bindings, because I don't know of a convenient way to trigger an auction without messing with our staging deployment.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants