Add TextEncoder polyfill (WHATWG Encoding Standard) #171

Merged
bkaradzic-microsoft merged 2 commits into BabylonJS:main from bkaradzic-microsoft:add-textencoder-polyfill
Jun 1, 2026

@bkaradzic-microsoft
Contributor

Adds a TextEncoder polyfill that mirrors the existing TextDecoder polyfill in this repository, so non-Babylon-Native consumers can get both halves of the WHATWG Encoding Standard from JsRuntimeHost without having to pull in or duplicate a separate implementation.

TextEncoder is needed by older Chakra-based runtimes where the global is not built in. Modern V8 / JSC / Hermes runtimes already expose it natively, in which case Initialize() is a no-op.
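
The no-op behaviour described above amounts to a feature check before installation. A minimal TypeScript sketch of that idea follows. The actual polyfill is implemented in C++, so `installIfMissing`, `GlobalBag`, `fakeV8Globals`, and `fakeChakraGlobals` are hypothetical names used only for illustration.

```typescript
// Hypothetical sketch of "install only when the global is missing".
// This is not the PR's C++ Initialize(); every name here is an illustrative assumption.
type GlobalBag = { [name: string]: unknown };

function installIfMissing(target: GlobalBag, name: string, create: () => unknown): boolean {
    // A runtime that already exposes the global keeps its native implementation.
    if (typeof target[name] !== "undefined") {
        return false;
    }
    target[name] = create();
    return true;
}

// Stand-ins for the global object of two kinds of runtime.
const nativeMarker = { source: "native" };
const fakeV8Globals: GlobalBag = { TextEncoder: nativeMarker };
const fakeChakraGlobals: GlobalBag = {};

const installedOnV8 = installIfMissing(fakeV8Globals, "TextEncoder", () => ({ source: "polyfill" }));
const installedOnChakra = installIfMissing(fakeChakraGlobals, "TextEncoder", () => ({ source: "polyfill" }));
```

In this sketch the first call leaves the existing entry untouched and returns `false`, while the second call installs the replacement and returns `true`.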

Surface

  • TextEncoder() constructor — UTF-8 only, per spec
  • encoding accessor — always "utf-8"
  • encode(input) → Uint8Array
  • encodeInto(source, destination) → { read, written }
    • read is in UTF-16 code units, so a 4-byte UTF-8 sequence (code point outside the BMP) reports 2
    • Multi-byte UTF-8 sequences are never split across the destination boundary
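
The `read` / `written` accounting listed above can be sketched in TypeScript as follows. This is an illustrative reimplementation under stated assumptions, not the PR's C++ code: `encodeIntoSketch` and `EncodeIntoResult` are hypothetical names, and lone surrogates are replaced with U+FFFD as the WHATWG Encoding Standard prescribes for scalar-value strings.

```typescript
// Hypothetical sketch of encodeInto semantics; not the PR's C++ implementation.
interface EncodeIntoResult {
    read: number;    // UTF-16 code units consumed from the source
    written: number; // UTF-8 bytes stored in the destination
}

function encodeIntoSketch(source: string, destination: Uint8Array): EncodeIntoResult {
    let read = 0;
    let written = 0;
    while (read < source.length) {
        let codePoint = source.charCodeAt(read);
        let units = 1;
        // Combine a valid surrogate pair into one code point (2 UTF-16 code units).
        if (codePoint >= 0xd800 && codePoint <= 0xdbff && read + 1 < source.length) {
            const low = source.charCodeAt(read + 1);
            if (low >= 0xdc00 && low <= 0xdfff) {
                codePoint = 0x10000 + ((codePoint - 0xd800) << 10) + (low - 0xdc00);
                units = 2;
            }
        }
        // A lone surrogate is encoded as U+FFFD.
        if (units === 1 && codePoint >= 0xd800 && codePoint <= 0xdfff) {
            codePoint = 0xfffd;
        }
        const size = codePoint < 0x80 ? 1 : codePoint < 0x800 ? 2 : codePoint < 0x10000 ? 3 : 4;
        // Stop before a sequence that would not fit, so it is never split.
        if (written + size > destination.length) {
            break;
        }
        if (size === 1) {
            destination[written++] = codePoint;
        } else if (size === 2) {
            destination[written++] = 0xc0 | (codePoint >> 6);
            destination[written++] = 0x80 | (codePoint & 0x3f);
        } else if (size === 3) {
            destination[written++] = 0xe0 | (codePoint >> 12);
            destination[written++] = 0x80 | ((codePoint >> 6) & 0x3f);
            destination[written++] = 0x80 | (codePoint & 0x3f);
        } else {
            destination[written++] = 0xf0 | (codePoint >> 18);
            destination[written++] = 0x80 | ((codePoint >> 12) & 0x3f);
            destination[written++] = 0x80 | ((codePoint >> 6) & 0x3f);
            destination[written++] = 0x80 | (codePoint & 0x3f);
        }
        read += units;
    }
    return { read, written };
}

// U+1F600 is one code point, two UTF-16 code units, four UTF-8 bytes.
const roomy = new Uint8Array(4);
const full = encodeIntoSketch("\ud83d\ude00", roomy);
// Three bytes are not enough, so nothing is read or written.
const tooSmall = encodeIntoSketch("\ud83d\ude00", new Uint8Array(3));
// "a" fits, but the 2-byte "é" does not fit in the one remaining byte.
const partial = encodeIntoSketch("a\u00e9", new Uint8Array(2));
```

With a 4-byte destination the sketch reports `read` as 2 and `written` as 4 for U+1F600, which matches the surrogate-pair behaviour described in the list.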

Build / wiring

  • New gated option JSRUNTIMEHOST_POLYFILL_TEXTENCODER (default ON), matching the pattern used by every other polyfill in this repo.
  • New Polyfills/TextEncoder/ library with the same layout as Polyfills/TextDecoder/ (CMakeLists.txt, Include/Babylon/Polyfills/TextEncoder.h, Source/TextEncoder.cpp, README.md).
  • Linked into the unit-test executable (Windows + Android) and initialized in Tests/UnitTests/Shared/Shared.cpp next to TextDecoder::Initialize.

Tests

Adds a new describe("TextEncoder", ...) block in Tests/UnitTests/Scripts/tests.ts covering:

  • encoding === "utf-8"
  • encode() on ASCII, undefined / no-arg, multi-byte UTF-8 (e.g. "é" → [0xC3, 0xA9]), and embedded null bytes
  • encodeInto() happy path, refusal to split a multi-byte sequence when the destination is too small, and the surrogate-pair read semantics for U+1F600
  • encodeInto() TypeError when the destination is not a Uint8Array
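
For readers who want to see the `encode()` expectations concretely, the checks below restate them as plain TypeScript. This is a sketch rather than the contents of tests.ts: it assumes a global `TextEncoder` is available (natively or through the polyfill), it avoids any test framework, and `sameBytes` and `checks` are hypothetical helper names.

```typescript
// Sketch only: assumes the runtime exposes a global TextEncoder.
// sameBytes and checks are illustrative names, not part of the PR.
function sameBytes(actual: Uint8Array, expected: number[]): boolean {
    if (actual.length !== expected.length) {
        return false;
    }
    for (let i = 0; i < expected.length; i++) {
        if (actual[i] !== expected[i]) {
            return false;
        }
    }
    return true;
}

const encoder = new TextEncoder();

const checks: { [label: string]: boolean } = {
    // The encoding accessor always reports UTF-8.
    encoding: encoder.encoding === "utf-8",
    // ASCII maps one byte per character.
    ascii: sameBytes(encoder.encode("Hi"), [0x48, 0x69]),
    // A missing or undefined argument encodes the empty string.
    noArgument: sameBytes(encoder.encode(), []),
    explicitUndefined: sameBytes(encoder.encode(undefined), []),
    // "é" (U+00E9) becomes a 2-byte sequence.
    twoByte: sameBytes(encoder.encode("\u00e9"), [0xc3, 0xa9]),
    // An embedded null is encoded as a single zero byte, not treated as a terminator.
    embeddedNull: sameBytes(encoder.encode("a\u0000b"), [0x61, 0x00, 0x62]),
};
```

Each entry in `checks` should be `true` on a conforming runtime.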

Motivation

Surfaced during the BabylonNative review of BabylonJS/BabylonNative#1708, where this polyfill originally lived. Per @bghgary's review feedback, TextEncoder belongs alongside TextDecoder in JsRuntimeHost so it's available to any consumer of this runtime host, not just BN.

Mirrors the existing TextDecoder polyfill so non-Babylon-Native consumers can
get both halves of the WHATWG Encoding Standard from JsRuntimeHost without
having to pull in or duplicate a separate implementation.

TextEncoder is needed by older Chakra-based runtimes where the global is not
built in. Modern V8 / JSC / Hermes runtimes already expose it natively, in
which case `Initialize()` is a no-op.

Surface:
- TextEncoder() constructor (UTF-8 only, per spec)
- encoding accessor (always "utf-8")
- encode(input) -> Uint8Array
- encodeInto(source, destination) -> { read, written }
  - "read" is in UTF-16 code units (so a 4-byte UTF-8 sequence reports 2)
  - multi-byte UTF-8 sequences are never split across the destination boundary

Wired into the gated CMake option JSRUNTIMEHOST_POLYFILL_TEXTENCODER (ON),
the same pattern used for every other polyfill in this repo, and exercised
by new unit tests alongside the existing TextDecoder tests.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Copilot AI review requested due to automatic review settings June 1, 2026 15:55
Contributor

Copilot AI left a comment

Pull request overview

Adds a WHATWG-compatible TextEncoder polyfill (UTF-8 only) to complement the existing TextDecoder polyfill, with build-time gating and unit-test wiring so older Chakra-based runtimes can rely on JsRuntimeHost for both APIs.

Changes:

  • Introduces new Polyfills/TextEncoder library (C++ implementation + public header + README) and wires it into the polyfills CMake.
  • Adds JSRUNTIMEHOST_POLYFILL_TEXTENCODER CMake option (default ON) and includes the new polyfill when enabled.
  • Updates unit tests to link/initialize TextEncoder and adds JS test coverage for encoding, encode, and encodeInto.

Reviewed changes

Copilot reviewed 10 out of 10 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| Tests/UnitTests/Shared/Shared.cpp | Initializes the new TextEncoder polyfill in the unit test runtime. |
| Tests/UnitTests/Scripts/tests.ts | Adds a TextEncoder test suite covering encoding, encode, and encodeInto. |
| Tests/UnitTests/CMakeLists.txt | Links the new TextEncoder library into the unit test executable. |
| Tests/UnitTests/Android/app/src/main/cpp/CMakeLists.txt | Links TextEncoder into the Android unit test JNI target. |
| Polyfills/TextEncoder/Source/TextEncoder.cpp | Implements the TextEncoder polyfill (encoding, encode, encodeInto). |
| Polyfills/TextEncoder/README.md | Documents supported surface area and usage. |
| Polyfills/TextEncoder/Include/Babylon/Polyfills/TextEncoder.h | Adds the public Initialize API for the polyfill. |
| Polyfills/TextEncoder/CMakeLists.txt | Defines the TextEncoder library target and include paths. |
| Polyfills/CMakeLists.txt | Adds conditional inclusion of the TextEncoder subdirectory. |
| CMakeLists.txt | Adds JSRUNTIMEHOST_POLYFILL_TEXTENCODER option. |

Comment thread Polyfills/TextEncoder/Source/TextEncoder.cpp Outdated
Comment thread Tests/UnitTests/Scripts/tests.ts Outdated
bkaradzic-microsoft added a commit to bkaradzic-microsoft/BabylonNative that referenced this pull request Jun 1, 2026
- Initialize AbortController polyfill in AppContext alongside the other
  JsRuntimeHost polyfills and link it into the Win32 / Android Playground
  binaries (Android JNI link fix for BabylonNativeJNI).
- Extend the validation_native.js `document` shim with `createEvent` and
  `dispatchEvent` so playgrounds that synthesize DOM events stop tripping
  `document.dispatchEvent is not a function`.
- Re-enable the one playground (serialization round-trip) that was previously
  quarantined for that error.

Note on review history: an earlier revision of this PR also introduced
native `TextEncoder` and `PointerEvent` polyfills directly under
`Polyfills/`. Per review feedback those have been split off:
- `TextEncoder` belongs alongside `TextDecoder` in JsRuntimeHost
  (WHATWG Encoding Standard); proposed there in BabylonJS/JsRuntimeHost#171
  and will be wired into Playground in a follow-up once that lands.
- `PointerEvent` was dropped pending an offline discussion about whether
  BabylonNative should polyfill DOM input types at all when
  `DeviceInputSystem` already exists.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Comment thread Polyfills/TextEncoder/Source/TextEncoder.cpp Outdated
Babylon.js does not use TextEncoder.encodeInto anywhere in its sources
(verified by grep across packages/), and the (non-trivial) UTF-16 code
unit accounting it requires is the bulk of the polyfill's complexity.
Keep only encoding/encode for the common case; encodeInto can be added
back at the time a real consumer appears.

This also obsoletes the destination-type validation feedback from
review (the typed-array guard only existed for encodeInto).

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@bkaradzic-microsoft bkaradzic-microsoft enabled auto-merge (squash) June 1, 2026 19:44
@bkaradzic-microsoft bkaradzic-microsoft merged commit 81b01d2 into BabylonJS:main Jun 1, 2026
23 checks passed
4 participants