Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for -sEXPORT_ES6/*.mjs on Node.js #17915

Merged
merged 3 commits into from Nov 17, 2022

Conversation

kleisauke
Copy link
Collaborator

@kleisauke kleisauke commented Sep 22, 2022

As described in #11792, require() and __dirname doesn't exist in
an ES6 module. Emscripten uses this to import built-in core Node.js
modules. For example, the node:fs module is used for synchronously
importing the *.wasm binary, when not linking with -sSINGLE_FILE.

To work around this, ES6 modules on Node.js may import
createRequire() from node:module to construct the require()
function, allowing modules to be imported in a CommonJS manner.

Emscripten targets a variety of environments, which can be
categorized as:

  1. Multi-environment builds, which is the default when
       -sENVIRONMENT=* is not specified at link time.
  2. Single-environment, e.g. only web or Node.js as target.

For use case (1), this commit ensures that an async function is
emitted, allowing Node.js modules to be dynamically imported. This is
necessary given that static import declarations cannot be put in
conditionals. Inside the module, for Node.js only, it's using the
above-mentioned createRequire()-construction.

For use case (2), when only Node.js is targeted, a static import
declaration utilize the same createRequire()-construction.

For both use cases, -sUSE_ES6_IMPORT_META=0 is not allowed, when
Node.js is among the targets, since it is not possible to mimic
__dirname when import.meta support is not available.

This commit does not change anything for use case (2), when only the
web is targeted (-sENVIRONMENT=web).

Resolves: #11792.
Resolves: #17797.

@sbc100 sbc100 requested review from kripken and RReverser and removed request for kripken September 22, 2022 18:28
@kleisauke kleisauke changed the title Ensure EXPORT_ES6 is tested on Node.js Add support for -sEXPORT_ES6/*.mjs on Node.js Oct 14, 2022
@kleisauke
Copy link
Collaborator Author

This should be ready for review, but I'm unsure how to proceed due to this FIXME:

emscripten/src/shell.js

Lines 205 to 208 in bc619d3

// FIXME: How about the -sUSE_ES6_IMPORT_META=0 case? Is there a way of
// getting the current absolute path of the module when support for
// `import.meta.url` is not available?
scriptDirectory = nodePath.dirname(fileURLToPath(import.meta.url));

Also, perhaps we should swap all require()'s with import()'s?

# TODO: Swap all `require()`'s with `import()`'s?

But AFAIK that requires a lot more changes, and I'm not sure it's worth it. For example, should we then eagerly import the crypto module too?

emscripten/src/library.js

Lines 2229 to 2238 in f9d26b0

if (ENVIRONMENT_IS_NODE) {
// for nodejs with or without crypto support included
try {
var crypto_module = require('crypto');
// nodejs has crypto support
return () => crypto_module['randomBytes'](1)[0];
} catch (e) {
// nodejs doesn't have crypto support
}
}

/cc @curiousdannii the creator of issue #11792.

@kripken
Copy link
Member

kripken commented Oct 17, 2022

cc @brendandahl who might know this ES6 stuff? I'm afraid I don't know much about it myself...

emcc.py Show resolved Hide resolved
/**
* @param {URL|string} url
*/
var fileURLToPath = function(url) {};
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

According to the docs seem to exist on the url module, rather than at the top level..?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

They are, but they're imported into the module's global scope (see emcc.py).

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But shouldn't closure be able to see that then? It seems odd to me that we should need to declare these in this global location like this? Perhaps a local annotation where they are declared/imported?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It does seem odd. Maybe closure doesn't understand await import('url')?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indeed, it's a bit odd that this is needed. I think this was necessary when targeting both web/node as well as node only. That is:

if (typeof process == 'object' &&
    typeof process.versions == 'object' &&
    typeof process.versions.node == 'string') {
  const { createRequire } = await import('module');
  var { pathToFileURL, fileURLToPath } = await import('url');
  var require = createRequire(import.meta.url);
}

(-sENVIRONMENT=web,node)

import { createRequire } from 'module';
import { pathToFileURL, fileURLToPath } from 'url';
const require = createRequire(import.meta.url);

(-sENVIRONMENT=node)

Perhaps closure is run on an indeterminate file instead of the final JS file with these imports? Let me verify this.

Copy link
Collaborator Author

@kleisauke kleisauke Nov 4, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems that Closure indeed has some limitations with dynamic import expressions.
https://github.com/google/closure-compiler/wiki/JS-Modules#dynamic-import-expressions

I moved these externals to the correct file with commit 68377b6 and removed the need of this URL import() with commit bae5bf5.

src/shell.js Show resolved Hide resolved

@parameterized({
'': (True, [],),
'no_import_meta': (False, ['-sUSE_ES6_IMPORT_META=0'],),
})
@node_pthreads
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why pthreads here?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The test links with -pthread, so Node.js before version 16 needs the --experimental-wasm-bulk-memory --experimental-wasm-threads flags to successfully execute the subdir/hello_world.mjs script.

Copy link
Collaborator

@sbc100 sbc100 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm with some comments

@curiousdannii
Copy link
Contributor

This is a clever temporary solution, but I'm pretty sure it wouldn't help when the code is bundled, so it's not a full solution to #11792. Substantial parts of the JS output will need to be redesigned in order to be async, so as I've said before, before #11792 can be properly solved it will need direction from the core Emscripten team.

@sbc100
Copy link
Collaborator

sbc100 commented Oct 17, 2022

This is a clever temporary solution, but I'm pretty sure it wouldn't help when the code is bundled, so it's not a full solution to #11792. Substantial parts of the JS output will need to be redesigned in order to be async, so as I've said before, before #11792 can be properly solved it will need direction from the core Emscripten team.

Are you ok with landing this as a temporary fix? It seems like have some test coverage is better than none.

@curiousdannii
Copy link
Contributor

I haven't tested it myself, but the idea is a good one, and it would definitely be good to get some more test coverage.

@brendandahl
Copy link
Collaborator

cc @brendandahl who might know this ES6 stuff? I'm afraid I don't know much about it myself...

Looks and works how'd I'd expect.

@kleisauke
Copy link
Collaborator Author

but I'm pretty sure it wouldn't help when the code is bundled

Indeed, I think some bundlers will still fail on await import() (although that is only used when targeting both web/node) or import.meta.url (needed to get the current absolute path of the module, see the FIXME in #17915 (comment)).

Substantial parts of the JS output will need to be redesigned in order to be async

Is there a design document of how this would look like? I suppose Emscripten needs to change this emit:

var Module = (() => {
  var _scriptDir = import.meta.url;
  
  return (
function(Module) {
  Module = Module || {};

...

  return Module.ready
})();
export default Module;

to:

var Module = (() => {
  var _scriptDir = import.meta.url;
  
  return (
async function(Module) {
  Module = Module || {};

...

  return Module.ready
})();
export default Module;

(i.e., second function needs async, untested)

To allow await import() statements inside the module (annotated with ... in the above example). However, that wouldn't work for normal ES import statements (import ...), since that can't be put in conditionals. Currently, this (draft) PR uses this import syntax only when targeting Node (-sENVIRONMENT=node).

@kleisauke
Copy link
Collaborator Author

I resolved the above mentioned FIXME by ignoring -sUSE_ES6_IMPORT_META=0 when targeting Node, see commit bae5bf5. AFAIK, this option was introduced to fix compat on old browsers and bundlers.

@RReverser
Copy link
Collaborator

I suppose Emscripten needs to change this emit:

Actually... maybe best to do that as part of this PR? Then it wouldn't depend on top-level await, which is a relatively new feature in browsers (definitely newer than import.meta and we use conditional checks even for that) and still not supported by some tooling/bundlers.

@kleisauke
Copy link
Collaborator Author

Then it wouldn't depend on top-level await, which is a relatively new feature in browsers (definitely newer than import.meta and we use conditional checks even for that)

Ah, I think you're right. Let me fix that. I assumed that guarding the top-level await import for Node.js would prevent browsers from failing, but this probably didn't work for Node.js before v14.8 either.

@kleisauke
Copy link
Collaborator Author

... should be fixed with commit e8f83e9.

src/preamble.js Outdated Show resolved Hide resolved
src/shell.js Outdated
// EXPORT_ES6 + ENVIRONMENT_IS_NODE always requires use of import.meta.url,
// since there's no way getting the current absolute path of the module when
// support for that is not available.
scriptDirectory = nodePath.dirname(nodeUrl.fileURLToPath(import.meta.url));
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This one I'm less sure about, but also worth checking if we can get away with just keeping the URL here.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indeed, I wasn't sure about this. I removed the fileURLToPath construction with commit cff5945, however, that may cause the locateFile function to return file:// strings on Node.js.

emscripten/src/shell.js

Lines 152 to 161 in 84a6341

// `/` should be present at the end if `scriptDirectory` is not empty
var scriptDirectory = '';
function locateFile(path) {
#if expectToReceiveOnModule('locateFile')
if (Module['locateFile']) {
return Module['locateFile'](path, scriptDirectory);
}
#endif
return scriptDirectory + path;
}

Let's see if the CI likes that, though locally it still passes the other.test_emcc_output_mjs* other.test_export_es6* tests.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

however, that may cause the locateFile function to return file:// strings on Node.js.

Now I'm really curious, if file:// strings are not accepted directly, then how would same strings returned from locateFile work...

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now I'm really curious, if file:// strings are not accepted directly, then how would same strings returned from locateFile work...

IIUC, file:// strings are accepted after that isFileURI(filename) ? new URL(filename) ... change in read_, which will probably always be called on Node.js when reading files. But maybe it lacks some test coverage for this.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh ok, that makes sense. (well, at least if nothing else uses the result of locateFile)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

9074bb7 :)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Though, while doing that, I noticed this:

emscripten/src/shell.js

Lines 136 to 137 in 84a6341

#if EXPORT_ES6
var _scriptDir = import.meta.url;

... a couple of lines further ...

emscripten/src/shell.js

Lines 364 to 370 in 84a6341

#if MODULARIZE
// When MODULARIZE, this JS may be executed later, after document.currentScript
// is gone, so we saved it, and we use it here instead of any other info.
if (_scriptDir) {
scriptDirectory = _scriptDir;
}
#endif

So, perhaps scriptDirectory was already a file://-string on Node.js when linking with -sEXPORT_ES6? Let me verify this.

Copy link
Collaborator Author

@kleisauke kleisauke Nov 7, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

... nvm, I didn't look carefully, that is guarded with #if SHARED_MEMORY && !MODULARIZE, though MODULARIZE is forcefully enabled when linking with -sEXPORT_ES6:

emscripten/emcc.py

Lines 2253 to 2257 in 89a494f

if settings.EXPORT_ES6 and not settings.MODULARIZE:
# EXPORT_ES6 requires output to be a module
if 'MODULARIZE' in user_settings:
exit_with_error('EXPORT_ES6 requires MODULARIZE to be set')
settings.MODULARIZE = 1

So I'm not sure if this comment is up-to-date:

emscripten/src/shell.js

Lines 134 to 135 in 84a6341

// In MODULARIZE mode _scriptDir needs to be captured already at the very top of the page immediately when the page is parsed, so it is generated there
// before the page load. In non-MODULARIZE modes generate it here.

Edit: I need more coffee.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

6e52016 :)

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now you're in a danger zone :) But if it works, great!

src/shell.js Outdated Show resolved Hide resolved
src/shell.js Outdated Show resolved Hide resolved
@kleisauke
Copy link
Collaborator Author

This PR failed to auto-roll due to a failed test case on the macOS runner, see:
https://logs.chromium.org/logs/emscripten-releases/buildbucket/cr-buildbucket/8797224671288030801/+/u/Emscripten_testsuite__other_/stdout

Looks like Node/V8 failed on an assertion(?), curious. I'll investigate.

@kleisauke
Copy link
Collaborator Author

... I opened PR #18230 for that.

@RReverser
Copy link
Collaborator

Yet another reason to finally update Node.js in emsdk to supported versions...

@lgarron
Copy link

lgarron commented Nov 18, 2022

Since this PR has landed, it can be tested by installing the "tip-of-tree" build:

Fabulous, thank you for the instructions! 😻
I'm able to confirm that this allows https://github.com/cubing/twsearch/ to run directly in node. 🤩

This also includes passing through esbuild.

Before:

> node build/esm-test/test.js 
file:///Users/lgarron/Code/git/github.com/cubing/twsearch/build/esm-test/twsearch-5NPKYV2U.js:43
        scriptDirectory = __dirname + "/";
        ^

ReferenceError: __dirname is not defined
    at file:///Users/lgarron/Code/git/github.com/cubing/twsearch/build/esm-test/twsearch-5NPKYV2U.js:43:9
    at importOnce (file:///Users/lgarron/Code/git/github.com/cubing/twsearch/build/esm-test/test.js:10:16)

Node.js v19.0.1

After:

> node build/esm-test/test.js
Created new moves U2 U' L2 L' F2 F' R2 R' B2 B' D2 D'
State size is about 2.2145 x 10^23 log2 77.5513
Requiring 117 bits 16 bytes per entry; 16 from identity.
Found 7 canonical move states.
Calculated canonical states in 0.00399995
For memsize 1073741824 bytesize 1073741824 subshift 42 memshift 32 shardshift 26
Initializing memory in 0.173
Filling table at depth 0 with val 0 saw 1 (1) in 0
Filling table at depth 1 with val 0 saw 18 (19) in 0
Filling table at depth 2 with val 0 saw 243 (262) in 0
Filling table at depth 3 with val 0 saw 3240 (3502) in 0.000999928
Solving test
Filling table at depth 4 with val 0 saw 43239 (46756) in 0.00699997
Filling table at depth 5 with val 0 saw 575085 (624124) in 0.04
Depth 11 finished in 1.514
Filling table at depth 6 with val 0 saw 7623202 (8331112) in 0.547
Depth 12 finished in 1.545
 L U2 F' R2 L' U' L U R2 F2 U2 F' L'
Found 1 solution at maximum depth 13 lookups 35468730 in 7.063 rate 5.02177e+06
Alg { startCharIndex: 0, endCharIndex: 35 } L U2 F' R2 L' U' L U R2 F2 U2 F' L'

lgarron added a commit to cubing/twsearch that referenced this pull request Nov 18, 2022
@lgarron
Copy link

lgarron commented Nov 18, 2022

I do have a small nit, though, which is that the imports are still not prefixed with node:, e.g. node:module. This is necessary to distinguish the import from an npm package name. Would that be most appropriate to file a new issue for?

@lgarron
Copy link

lgarron commented Nov 18, 2022

I do have a small nit, though, which is that the imports are still not prefixed with node:, e.g. node:module. This is necessary to distinguish the import from an npm package name. Would that be most appropriate to file a new issue for?

I've created #18235 for this.

@RReverser
Copy link
Collaborator

I think node: prefix is a relatively recent addition in Node 18+, and not something supported in Node 14 that Emscripten ships with. Besides, I'm not sure what the bundler & other tooling support is like.

I'd rather err on the side of compatibility with more tooling for now.

shesek added a commit to shesek/libwally-core that referenced this pull request Dec 2, 2022
Updating emsdk to v3.1.27 was necessary for emscripten-core/emscripten#17915,
which allows using the same WASM build for both browser and nodejs envs.
shesek added a commit to shesek/libwally-core that referenced this pull request Dec 2, 2022
Updating emsdk to v3.1.27 was necessary for emscripten-core/emscripten#17915,
which allows using the same WASM build for both browser and nodejs envs.
shesek added a commit to shesek/libwally-core that referenced this pull request Dec 2, 2022
Updating emsdk to v3.1.27 was necessary for emscripten-core/emscripten#17915,
which allows using the same WASM build for both browser and nodejs envs.
shesek added a commit to shesek/libwally-core that referenced this pull request Dec 2, 2022
Updating emsdk to v3.1.27 was necessary for emscripten-core/emscripten#17915,
which allows using the same WASM build for both browser and nodejs envs.
shesek added a commit to shesek/libwally-core that referenced this pull request Dec 2, 2022
Updating emsdk to v3.1.27 was necessary for emscripten-core/emscripten#17915,
which allows using the same WASM build for both browser and nodejs envs.
jgriffiths pushed a commit to ElementsProject/libwally-core that referenced this pull request Dec 2, 2022
Updating emsdk to v3.1.27 was necessary for emscripten-core/emscripten#17915,
which allows using the same WASM build for both browser and nodejs envs.
jgriffiths pushed a commit to ElementsProject/libwally-core that referenced this pull request Dec 2, 2022
Updating emsdk to v3.1.27 was necessary for emscripten-core/emscripten#17915,
which allows using the same WASM build for both browser and nodejs envs.
jgriffiths pushed a commit to ElementsProject/libwally-core that referenced this pull request Dec 2, 2022
Updating emsdk to v3.1.27 was necessary for emscripten-core/emscripten#17915,
which allows using the same WASM build for both browser and nodejs envs.
zbjornson added a commit to cellengine/lmfit.js that referenced this pull request Dec 30, 2022
* `EXPORT_ES6` and `ENVIRONMENT=*node*` requires `USE_ES6_IMPORT_META` to be set. This changed in emscripten-core/emscripten#17915.

* `_malloc` and `_free` must be declared as exported functions to avoid removal by the compiler.

* Remove [`WASM_ASYNC_COMPILATION=0`](https://emsettings.surma.technology/#WASM_ASYNC_COMPILATION).

  The current code was broken by emscripten-core/emscripten#12650. Previously, `WASM_ASYNC_COMPILATION=0` would return `Promise<Module>`, but that PR made it return `Module` directly. (Took ages to find the cause of this!)

  Removing the flag to keep it async avoids a breaking change in this library. Node.js v14.8.0 and later support top-level await, so async isn't that ugly. Alternatively, the Node.js version could easily be changed to load synchronously. Node.js has no limit on the size of synchronously loaded WebAssembly modules, unlike Chromium.
zbjornson added a commit to cellengine/lmfit.js that referenced this pull request Dec 30, 2022
* `EXPORT_ES6` and `ENVIRONMENT=*node*` requires `USE_ES6_IMPORT_META` to be set. This changed in emscripten-core/emscripten#17915.

* `_malloc` and `_free` must be declared as exported functions to avoid removal by the compiler.

* Remove [`WASM_ASYNC_COMPILATION=0`](https://emsettings.surma.technology/#WASM_ASYNC_COMPILATION).

  The current code was broken by emscripten-core/emscripten#12650. Previously, `WASM_ASYNC_COMPILATION=0` would return `Promise<Module>`, but that PR made it return `Module` directly. (Took ages to find the cause of this!)

  Removing the flag to keep it async avoids a breaking change in this library. Node.js v14.8.0 and later support top-level await, so async isn't that ugly. Alternatively, the Node.js version could easily be changed to load synchronously. Node.js has no limit on the size of synchronously loaded WebAssembly modules, unlike Chromium.
@paulcdejean
Copy link

Hello this is a neat new feature but I don't see any documentation of it anywhere.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
8 participants