Skip to content

N-API v7 across runtimes (V8 / JavaScriptCore / Chakra)#189

Draft
matthargett wants to merge 48 commits into
BabylonJS:mainfrom
rebeckerspecialties:napi-v7
Draft

N-API v7 across runtimes (V8 / JavaScriptCore / Chakra)#189
matthargett wants to merge 48 commits into
BabylonJS:mainfrom
rebeckerspecialties:napi-v7

Conversation

@matthargett
Copy link
Copy Markdown

Third PR in the N-API stack (after #116 napi-tests, #183 napi-shared-android). Raises the default
NAPI_VERSION 5 -> 7 and brings the v6/v7 surface across every runtime. Fork CI: 22/22 green.

Engine BigInt instance_data ArrayBuffer detach
V8 (Android/Win/Linux) native yes yes
JavaScriptCore (macOS/iOS) C-API (macOS 15+) + JS-eval fallback yes transfer()
JavaScriptCore (jsc-android ~2020) ENOTSUP feature-detect (engine has no BigInt) yes n/a
Chakra (Win32/UWP) ENOTSUP feature-detect yes
JSI n/a — backend is the node-addon-api C++ wrapper, not the C API

Conformance is honest per engine: test_bigint runs for real on V8 + Apple JSC; a literal-free
test_bigint_unsupported fallback asserts the ENOTSUP throw on engines with no BigInt (jsc-android,
Chakra). test_instance_data runs everywhere v6 is supported. detach + get_all_property_names land at
the v9 bump (their vendored tests need node_api_basic_env); the detach feature is implemented now.

Notable: jsc-android (~2020) has no BigInt at all — its parser rejects 0n literals — so the JSC
backend feature-detects BigInt at env init and reports ENOTSUP, exactly like Chakra. The JSI backend
implements only the C++ Napi:: wrapper (zero C napi_*), so the C-API conformance suite is gated off
it; covering JSI would mean a full C-API-over-jsi port, out of scope here.

Also fixed several pre-existing cross-platform CI failures this surfaced (first broad fork CI of the
napi stack): node_lite engine selection by compile-define (Android-JSC, Linux-V8), <cstdint> on
Linux, napi-into-addon linking on Windows/UWP, the GSL C4875 warnings-as-error toolchain drift, the
JSI shared-header include path, and on-device surfacing of in-process node_lite failures.

Draft — stacked; base is upstream main so the diff includes the #116 + #183 commits until those merge
(merge order: #183, then #116, then this). Validated 22/22 on the fork CI matrix via the twin rebeckerspecialties#4.


Landing sequence

Inter-related N-API PRs; intended order (✅ done / ⏳ pending):

  1. Fix Windows/UWP CI build (MSVC C4875 + JSI include path) #185 — Fix Windows/UWP CI build (MSVC C4875 + JSI include path). Independent build/toolchain fix; lands first so CI is green under everything below. (fork twin Fix Windows/UWP CI build (MSVC C4875 + JSI include path) #185 — 22/22 green)
  2. Build napi as a shared library on Android #183 — Build napi as a shared library on Android (so in-process addons resolve napi_*).
  3. Add N-API compliance tests #116 — N-API compliance tests (the Android addons resolve napi_* via the shared libnapi.so from the step above).
  4. N-API v7 across runtimes (V8 / JSC / Chakra) — this PR. (fork twin N-API v7 across runtimes (V8 / JavaScriptCore / Chakra) rebeckerspecialties/JsRuntimeHost#4 — 22/22 green on the fork CI matrix.)this PR

Motivating RFCs: #186 (jsc-android → maintained JSC) · #187 (WebWorker via N-API → Factotum into BabylonNative).

…ction when the receiver was undefined, so the VM forcibly substituted the global object even in strict mode. The new implementation always routes through Function.prototype.call, preserving the exact thisArg. This only affected JSC: Chakra already pushes recv onto the argv array before invoking JsCallFunction, and V8 hands the raw recv value to Function::Call. Neither engine coerces in strict mode, so no additional fixes were required.
…bug fix in strict mode was actually found by the suite! The failing behavior was exercised by Tests/NodeApi/test/js-native-api/3_callbacks/test.js. New cmake targets emit a node-lite binary and a NodeApiTests binary, all currently enabled tests for currently supported NAPI v5 pass on Mac. Next step is to enable them for running in Android simulator.
…x handles during instrumentation, causing crashes, and now route console output through the new NodeLiteRuntime::Callbacks. On Android we forward stdout/stderr to logcat via callbacks to work around this for now. Added Android-specific shims (node_lite_android.cpp, child_process_android.cpp) so native module loading uses dlopen and JS child_process.spawnSync safely reports “unsupported”. Extended the Node‑API harness to allow in-process execution: RunNodeLiteScript captures output, SetNodeApiTestEnvironment lets the JNI layer provide a base directory and asset manager, and the GTest registration path uses that configuration instead of shelling out to the node_lite executable
…-- a use-after-free. Will check sanitizers under Android next
Two build-restoration fixes (no behavior/impl or NAPI-version changes), needed
after rebasing onto upstream HEAD and building with the current Xcode/libc++:

- node_lite: NodeApi::CallFunction took std::span<napi_value> but is only ever
  called with braced-init-lists ({a,b,c}). Newer libc++ correctly rejects
  constructing a non-const std::span from an initializer_list (that ctor is
  C++26). Switch the parameter to std::initializer_list<napi_value> (begin()
  yields the const napi_value* napi_call_function wants).

- Tests/NodeApi: the POST_BUILD copy_directory of each .node runs as an Xcode
  script phase BEFORE Xcode's implicit CodeSign phase signs the original, so the
  copied addons that node_lite/NodeApiTests dlopen are unsigned on a clean build
  and macOS refuses to load them. Ad-hoc sign the copies directly (APPLE only).
…mulator)

Build-restoration fixes for the Android in-process NodeApi harness after rebasing
onto upstream HEAD (no impl/NAPI-version changes). Each was a latent break in the
napi-tests Android integration, surfaced by a clean build on a current toolchain:

- CMakeLists.txt: drop the AndroidExtensions Globals.cpp 'patch' step. It file(COPY)'d
  patches/AndroidExtensions/Globals.cpp, which was never committed in any ref (author's
  local-only file). Upstream uses a newer AndroidExtensions pin and needs no patch.
- build.gradle: bump default ndkVersion 23.1.7779620 -> 28.2.13676358 (matches CI's
  NDK_VERSION). NDK 23's libc++ can't compile googletest 1.17.0's <=> usage. Also map the
  Android sanitizer flag JSR_ENABLE_ASAN -> ENABLE_SANITIZERS (the upstream option kept
  during the rebase).
- Tests/NodeApi/CMakeLists.txt: use ${JsRuntimeHost_SOURCE_DIR} instead of
  ${CMAKE_SOURCE_DIR} for Core/Node-API include paths. On Android JsRuntimeHost is added
  as a subdirectory of the app, so CMAKE_SOURCE_DIR was the app dir (headers not found);
  the project-scoped var is correct in both standalone (macOS) and nested (Android) builds.
- Tests/NodeApi/CMakeLists.txt: allow the .node modules to link with unresolved napi_*
  symbols on Android (-Wl,--unresolved-symbols=ignore-all), the ELF equivalent of Apple's
  -undefined dynamic_lookup; they bind at dlopen time from the host (UnitTestsJNI).
- Shared.cpp: gate the Android NodeApi-harness block on NODE_API_AVAILABLE_NATIVE_TESTS
  (defined only by UnitTestsJNI) so the standalone UnitTests target -- built but unused on
  Android -- doesn't try to compile AndroidExtensions/NodeApi code it doesn't link.
The instrumented run aborted with 'use of deleted global reference': the harness fell
back to android::global::GetAppContext() (GetFilesDir -> GetObjectClass) whose JNI global
ref is not valid during the instrumented run. JNI.cpp now computes a writable base dir from
the still-valid instrumentation Context and passes it plus the native AAssetManager to
SetNodeApiTestEnvironment() before RunTests() -- the wiring the harness was designed for
(see e1fce6b) but which was never actually connected.

This removes the crash and lets ConfigureNodeApiTests run. NOTE: on-device execution of the
NodeApi conformance tests is still not achieved -- CopyAssetsRecursive relies on
AAssetManager subdirectory enumeration (AAssetDir_getNextFileName lists files only, not
dirs) so the nested test tree isn't copied, and the native .node modules are neither
packaged nor loadable from an app-writable dir on API 29+. Tracked as follow-up.
Before this, the instrumented run passed vacuously -- no NodeApi tests ran. Several
layered fixes get them executing on the emulator (macOS path unchanged: still 12/12):

#1 Asset enumeration: AAssetManager can't list subdirectories, so CopyAssetsRecursive
   copied nothing. copyNodeApiTests now emits a file manifest (manifest.txt -- not a
   dotfile, which aapt would drop) and Shared.cpp copies each listed file.

#2 Native module packaging/loading: build each addon as lib<name>.so on Android so AGP
   packages it into lib/<abi>/ (nativeLibraryDir, the only dlopen-able location on API
   29+); node_lite_android loads it by soname; ResolveModulePath resolves the (on-disk
   absent) .node so LoadNativeModule runs.

V8 lifecycle (in-process node_lite shares the host's V8): reuse the host's already-
   initialized V8 platform (fixes 'Wrong initialization order'); hold a Locker +
   Isolate::Scope so multi-isolate access is locked (fixes 'Entering the V8 API without
   proper locking').

KNOWN REMAINING (tracked): node_lite calls Node-API outside any napi callback during
   NodeLiteRuntime::Initialize/script execution, which on V8 needs a live HandleScope +
   current Context. v8::HandleScope/Context::Scope are stack-only (operator new is
   private) so they can't be held across the holder; this needs a scope-wrapping rework
   of node_lite's V8 entry points (or napi_open_handle_scope + context enter). Until then
   the on-device native tests segfault in napi_create_object.
…eate_object segfault)

NodeApiEnvScope -> jsr_open_napi_env_scope was a no-op stub: it allocated a scope
struct but never entered the env's V8 isolate/context. On JSC that's fine (the env
carries its context explicitly), but on V8 node_lite then calls Node-API outside any
napi callback with no *current context*, so napi_create_object -> v8::Object::New(isolate)
segfaulted during NodeLiteRuntime::Initialize. Enter the env's context on open and exit
it on close (Android only). The in-process V8 runtime now initializes and runs tests.
… error

Step toward in-process error handling: ExitOnException was noexcept, but the in-process
runner installs a fatal handler that throws NodeLiteFatalError (rather than std::exit) so
the harness can turn a JS error into a ProcessResult. Throwing from the noexcept function
std::terminate'd the test process. Dropped noexcept so it propagates to RunNodeLiteScript.

(Partial: other noexcept teardown paths -- NodeApiHandleScope/NodeApiEnvScope dtors calling
NODE_LITE_CALL, and the env-holder dtor's onUnhandledError -> ExitWithJSError -- can still
throw during unwinding when a test errors. Full in-process error-path exception-safety is
the remaining Android item.)
NodeApiHandleScope/NodeApiEnvScope destructors used the throwing NODE_LITE_CALL, and the
JsRuntimeHostEnvHolder destructor's onUnhandledError can invoke the throwing in-process fatal
handler -- both std::terminate if they fire while a NodeLiteFatalError is unwinding. Make the
scope dtors ignore the close status and wrap onUnhandledError in try/catch.

Correct robustness fixes, but they do NOT yet resolve the remaining in-process failure: when a
test errors, a *second* NodeLiteFatalError is thrown during unwinding (double-exception ->
std::terminate). The escaping throw site isn't visible in the tombstone (stack already unwound)
and needs on-device lldb to pinpoint. macOS unaffected (12/12).
…winding

Don't re-throw NodeLiteFatalError from the in-process fatal handler when std::uncaught_exceptions()
> 0, to avoid a double-exception std::terminate. (Correct hardening, but the remaining in-process
abort is a *single* uncaught NodeLiteFatalError escaping RunNodeLiteScript's catch -- a scope-exit
destructor throw on a test that leaves a pending exception; needs on-device lldb to pinpoint.)
…(fixes terminate)

THE fix for the in-process abort. ExitWithJSError / ExitWithJSAssertError / ExitWithMessage
were declared noexcept. With the default fatal handler they call std::exit (never throw), but
the in-process runner installs a handler that *throws* NodeLiteFatalError (caught by
RunNodeLiteScript and turned into a ProcessResult). A throw crossing a noexcept boundary is an
immediate std::terminate -- so when any test errored (e.g. the expected-error basics tests
throw_string/mustcall_failure), the whole instrumented run aborted instead of reporting a
result. Removing noexcept lets the throw unwind to the catch. Confirmed on the emulator via a
temporary _Unwind_Backtrace probe (now removed): the throw stack was
HandleFatalError <- ExitWithMessage(noexcept!) <- ExitWithJSError <- RunTestScript <- RunNodeLiteScript.

Net effect: the in-process Android run no longer aborts; the js-native-api v5 tests (2-5) pass;
the remaining failures are the basics harness self-tests, run through the generic fixture rather
than the specialized test_basics.cpp path macOS uses. macOS unaffected (still 12/12).
… napi

The js-native-api conformance addons are dlopen'd in-process by the Android
harness and import napi_* from the host (libUnitTestsJNI.so). The host is
loaded RTLD_LOCAL by System.loadLibrary, and bionic's linker-namespace model
does not surface its statically-linked (but exported) napi_* symbols to a
dlopen'd module -- so the addon cannot bind them at load time. Post-hoc
RTLD_GLOBAL promotion of the host is a no-op on bionic (confirmed on device:
the module dlopen still returns NULL with the host re-opened RTLD_GLOBAL).

Making these tests runnable on Android requires building napi as a shared
library (libnapi.so) depended on by both the host and the addons -- a
packaging change affecting every Android consumer, deferred to a separate
change per the v5-suite-in-place scope. Until then, skip the in-process addon
tests on Android with a clear reason; macOS runs the full v5 addon suite
(12/12) as the reference.

This unblocks the Android suite: it now builds, the in-process harness runs
without aborting, and the suite passes (addon tests reported SKIPPED).
… in-process)

Dynamic .node loading is never shipped to the Play / Quest stores, and bionic won't resolve a
dlopen'd addon's napi_* imports against the System.loadLibrary-loaded host anyway (the addon carries
no DT_NEEDED for napi; RTLD_GLOBAL host promotion is a no-op on bionic). Rather than make napi a
shared library for every Android consumer (tracked separately, task BabylonJS#9), compile the conformance
addons directly into the host (UnitTestsJNI) and resolve them in-process.

To link several addons into one binary without symbol clashes:
- node_api.h: make NODE_API_MODULE_REGISTER_FUNCTION / _GET_API_VERSION_FUNCTION overridable.
- entry_point.h (JSR_NODE_API_STATIC_LINK): give Init internal linkage and emit a per-addon load-time
  constructor that self-registers its uniquely-suffixed registrar/version functions with the host.
- The Android CMakeLists compiles each addon as an OBJECT library with per-module unique entry-point
  names and links them into UnitTestsJNI.
- node_lite_android LoadFunction resolves entry points from the in-process static registry by module
  name instead of dlopen+dlsym.

Removes the Android GTEST_SKIP. The 4 v5 js-native-api conformance tests now execute in-process and
PASS on Android (2_function_arguments, 3_callbacks, 4_object_factory, 5_function_factory). macOS is
unchanged (the desktop dynamic .node path uses the #else branches).
The conformance suite runs gtest in-process; its results (RUN/OK/FAILED and failure
file:line:message) went to stdout, which Android discards -- leaving only the JUnit "expected 0,
was 1" with no detail. Pump stdout/stderr to logcat (tag NodeApiTests) so test output and any
pre-crash native context are visible via `adb logcat -s NodeApiTests`.
…by static linking)

The conformance addons are statically linked into the in-process Android test host (f32130e), so the
standalone SHARED .so + -Wl,--unresolved-symbols=ignore-all + lib<name>.so naming that the old
dlopen-on-Android path needed are dead. add_node_api_module now early-returns on Android and is a
clean desktop-only MODULE .node helper. Also fixes a stale node_lite comment describing the
abandoned soname-dlopen path.

No functional change on desktop (MODULE .node, -undefined dynamic_lookup, POST_BUILD staging,
codesign all preserved); macOS still 12/12.
V8Platform::EnsureInitialized() became a no-op once we found the host AppRuntime already initializes
V8's process-global platform; the class and its unused init_flag_/platform_ members were left over
from the abandoned platform-init attempt. Fold the (still-important) "don't re-init the platform"
rationale into a comment at the isolate-creation site, and drop the dead class plus the now-unused
<mutex> / <libplatform> includes. No behavior change; Android still 4/4 js-native-api.
…d libnapi.so

Replaces the interim static-link-into-host approach with the dynamic .node model used by
nodejs/node-api-cts (add_node_api_cts_addon), so the Android suite and a future node-api-cts
migration share one addon model.

- Core/Node-API: build napi as a SHARED library (libnapi.so) on Android. It exports all 106 napi_*
  (default visibility -- no global -fvisibility=hidden), and the host plus every addon depend on the
  one libnapi.so via a real DT_NEEDED, so there is a single napi instance. Static elsewhere.
- The conformance addons are again standalone SHARED lib<name>.so (packaged into nativeLibraryDir),
  now linking napi (DT_NEEDED libnapi.so) instead of -Wl,--unresolved-symbols=ignore-all.
- node_lite_android resolves entry points via dlopen(soname)+dlsym again; the addon's napi_* bind
  from libnapi.so at load. Reverts the static-link infra (entry_point.h JSR_NODE_API_STATIC_LINK
  branch, node_api.h overridable registrar macros, the host's per-addon OBJECT libraries).

Verified on device: lib2_function_arguments.so has DT_NEEDED [libnapi.so], its napi_* are imports,
libnapi.so exports the 106 napi_*, and all 4 v5 js-native-api tests pass in-process. macOS unchanged
(12/12; desktop keeps static napi + dlopen'd MODULE .node).
Replace the hand-rolled pipe+thread stdout pump (added while bringing up the in-process Node-API
harness) with android::StdoutLogger::Start()/Stop() from AndroidExtensions, which the rest of the
UnitTests host already uses. Same effect -- the in-process gtest output (incl. failure
file:line:message) is visible in logcat (tag StdoutLogger) -- with less bespoke code.

Verified on emulator: 8/8 UnitTests pass incl. 4/4 js_native_api, gtest output present in logcat.
Adds test_reference_double_free (js-native-api) to the enabled suite -- a real v5 reference/double-free
test, green on macOS (system JSC, including ASan) and Android (V8). Its test_wrap.js is quarantined via
a known-failing allow-list in test_main.cpp: JSC napi_remove_wrap on an unwrapped object returns
napi_invalid_arg (the consecutive remove_wrap+delete_reference path itself does not crash); tracked as a
separate JSC fix.

The rest of the reference/finalizer/wrap suite (test_reference, test_finalizer, 6_object_wrap) targets
newer Node-API -- node_api_symbol_for (v9), napi_get_instance_data (v6),
node_api_basic_env / node_api_post_finalizer (v9) -- and is documented for the staged NAPI_VERSION bump.
…e, Windows error reporting)

- JSC napi_call_function: invoke through the canonical Function.prototype.call (captured once at env
  init) rather than the target function's own, user-overridable "call" property -- so `func.call = ...`
  cannot change native call behavior.
- child_process_android SpawnSync: return status 1 (was -1, which wrapped in the uint32_t status
  field) for the unsupported path.
- node_lite_windows LoadFunction: format the narrow lib_path.string() with %s (path::c_str() is
  wchar_t* on Windows -> UB) and report ::GetLastError() instead of errno for LoadLibraryA failures.
Bump the [BABYLON-NATIVE-ADDITION] default (still -DNAPI_VERSION-overridable) so the v6/v7 surface is
exposed. Non-breaking: the enabled conformance addons use only v1-v5, and the JSC/macOS build stays
green at v7 (13 pass / 1 skip). V8 implements the full surface for free; JSC/Chakra v6/v7 functions
are added incrementally in follow-ups, feature-detect-failing where the engine C API can't express
them (BigInt, ArrayBuffer detach).
Per-env instance data + a finalizer that runs at env teardown. Enables the test_instance_data
conformance test, green on JSC/macOS at v7.
BigInt: create via the macOS-15+ C API (JSBigIntCreateWith*) with a JS-BigInt-global eval fallback
for jsc-android/older; extract (get_value_*) via BigInt.asIntN/asUintN + toString round-trips; typeof
reports napi_bigint (kJSTypeBigInt on macOS 15+); create_bigint_words enforces the Node INT_MAX /
RangeError size limits before reading the words buffer. Detach: ArrayBuffer.prototype.transfer()
(ES2024) with .detached for is_detached; ENOTSUP throw where transfer() is absent. test_bigint green
on JSC/macOS.
…cate)

macOS 15+/iOS 18+ use the kJSTypeBigInt fast path; older JSC (jsc-android ~2020) does not surface
kJSTypeBigInt through the C API, so fall back to a cached `typeof v === 'bigint'` predicate created at
env init. Keeps the v6 BigInt conformance test green on the jsc-android target.
Chakra: add napi_set/get_instance_data (per-env slot, finalized at env teardown by ~napi_env__) and the
BigInt create/get functions as feature-detection stubs that throw a JS-catchable ENOTSUP error -- the
Win10 OS edge-mode Chakra predates BigInt and exposes no JsBigInt* API, so there is no value-preserving
fallback. JSI (Core/Node-API-JSI) does not implement the v6/v7 C surface, so its conformance addons are
gated to the v1-v5 set. V8 + JavaScriptCore execute the v6/v7 tests for real; Chakra links them.
The harness picked the engine from the platform (__APPLE__ == JSC, __ANDROID__ == V8), which breaks any
non-default pairing: Android-JSC compiled the V8 path (#include <v8.h> -> not found) and Linux-V8 hit no
engine branch at all. Expose JSR_NAPI_ENGINE_<ENGINE> as a PUBLIC compile definition on the napi target
and branch on that instead. Fixes the Android_JSC + Ubuntu(V8) regressions; the #else is now a
compiling throw stub so Chakra/JSI builds still compile (they don't run node_lite in CI).
… at link

The .node addons referenced napi_* but never linked the napi library, so every Win32/UWP build failed
with unresolved external symbols (napi_create_function, napi_get_last_error_info, ...). MSVC binds DLL
imports at link time, so link napi directly (mirrors the Android branch). Fixes the pre-existing
Windows/UWP addon link failures.
…him not green on-device)

test_instance_data (v6) runs on both Android engines; BigInt runs for real on V8. On jsc-android the
eval-based BigInt shim fails on-device with no error detail from the in-process runner (macOS JSC + V8
both pass test_bigint), so it's gated to V8 and tracked as a follow-up.
…uild

windows-latest's newer MSVC promotes C4875 (deprecated non-string [[gsl::suppress]] arg) from the
vendored GSL headers to an error under /WX, breaking every Win32/UWP build (pre-existing on main).
Suppress the dependency-side deprecation.
…ng JSI C1083)

The JSI napi target only had its own include/ on the include path, but napi.h includes the shared
<napi/js_native_api_types.h> from Core/Node-API/Include/Shared (not built when the engine is JSI). Add
that directory so the JSI backend compiles. Pre-existing failure (JSI was red on main with the same
C1083); unblocks Win32/UWP JSI.
…jsc-android ~2020)

jsc-android (~2020) ships without BigInt -- its parser even rejects `0n` literals -- so the eval-based
BigInt shim can't work there. Detect BigInt once at env init (`typeof BigInt === 'function' && typeof
BigInt(1) === 'bigint'`, which parses on every JSC) and have the create/get paths throw a JS-catchable
ENOTSUP when it's absent, matching the Chakra backend. macOS/iOS JSC (which have BigInt) are unaffected.
…e BigInt by engine support

The standard test_bigint uses `0n` literals, which don't parse on engines without BigInt (jsc-android
~2020, Win10 Chakra). Add a feature-detection fallback that asserts napi_create_bigint_int64 throws
ENOTSUP, and gate it to run exactly where BigInt is absent (Chakra; jsc-android) while the real
test_bigint runs on V8 + Apple JSC. Keeps the desktop + Android run-lists in sync.
The in-process runner keeps the assertion/exception message + stack in ProcessResult.std_error, which
never reached the device log -- making on-device conformance failures undebuggable (a 0 ms FAILED with
no detail). Log it on non-zero status. This is what surfaced the jsc-android BigInt SyntaxError.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant