module: support require()ing synchronous ESM graphs #51977
Before we get into the technical details, I just want to give a heartfelt THANK YOU to @joyeecheung for taking this on, and express my awe of her brilliance in figuring out how to achieve it.
I think the hooks do affect
What does this mean? Doing the extension searching for .cjs and/or .mjs in the filename? I wouldn’t worry about that for this PR; anyone doing
I LOVE this idea. It will simplify so many things. Let's keep going with it.
They only affect the
It means I don't know what happens in that case, and there are not yet any tests for it.
Big +1 on the idea and I think bun shows this is feasible and users like it.
I wanted to use this feature with the mime package. Its documentation explains the usage in an ES module like this:

```mjs
import mime from 'mime';
mime.getType('txt'); // ⇨ 'text/plain'
```

So after reading through this thread I thought that if I use the flag

```cjs
const mime = require('mime');
mime.getType('txt'); // ⇨ TypeError: mime.getType is not a function
```

But it didn't work. I had to extract the

```cjs
const mime = require('mime').default;
mime.getType('txt'); // ⇨ 'text/plain'
```

My question: is this the intended behaviour?
Yes,

```cjs
const mime = require('mime');
mime.getType('txt'); // ⇨ TypeError: mime.getType is not a function
```

is equivalent to

```mjs
import * as mime from 'mime';
mime.getType('txt'); // ⇨ TypeError: mime.getType is not a function
```
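To make the equivalence above concrete, here is a minimal sketch. It does not load the real `mime` package: `fakeNamespace` is a hand-built stand-in for the namespace object that `require()` would return, and `unwrapDefault` is a hypothetical helper, not a Node.js API.

```javascript
// Stand-in for the module namespace object of a package whose only export
// is a default export (an assumption made for illustration).
const fakeNamespace = Object.freeze({
  __proto__: null,
  default: {
    getType: (ext) => (ext === 'txt' ? 'text/plain' : null),
  },
});

// Hypothetical helper: return the default export when there is one,
// otherwise return the value unchanged.
function unwrapDefault(ns) {
  const isObject = ns !== null && typeof ns === 'object';
  return isObject && 'default' in ns ? ns.default : ns;
}

// Only named exports live directly on the namespace, so a method of the
// default export is not reachable without going through `.default`.
console.log(typeof fakeNamespace.getType); // undefined

const mime = unwrapDefault(fakeNamespace);
console.log(mime.getType('txt')); // text/plain
```

A helper like this would also unwrap a CommonJS module that happens to export a property named `default`, so it is only a convenience for cases where the shape of the dependency is known.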
OK, I understand. Is it stated somewhere in the docs? Because what you said:
is in my opinion very important and should be highlighted to prevent confusion in the future.
https://nodejs.org/docs/latest/api/modules.html#loading-ecmascript-modules-using-require mentions that "require() will load the requested module as an ES Module, and return the module name space object", but maybe if it's not clear enough you could consider submitting a PR that adds
This patch adds `require()` support for synchronous ESM graphs under
the flag `--experimental-require-module`.

This is based on the following design aspect of ESM:

- The resolution can be synchronous (up to the host)
- The evaluation of a synchronous graph (without top-level await) is
  also synchronous, and, by the time the module graph is instantiated
  (before evaluation starts), this is already known.

If `--experimental-require-module` is enabled, and the ECMAScript
module being loaded by `require()` meets the following requirements:

- Explicitly marked as an ES module with a `"type": "module"` field in
  the closest package.json or a `.mjs` extension.
- Fully synchronous (contains no top-level `await`).

`require()` will load the requested module as an ES Module, and return
the module name space object. In this case it is similar to dynamic
`import()` but is run synchronously and returns the name space object
directly.

```mjs
// point.mjs
export function distance(a, b) { return (b.x - a.x) ** 2 + (b.y - a.y) ** 2; }
class Point {
  constructor(x, y) { this.x = x; this.y = y; }
}
export default Point;
```

```cjs
const required = require('./point.mjs');
// [Module: null prototype] {
//   default: [class Point],
//   distance: [Function: distance]
// }
console.log(required);

(async () => {
  const imported = await import('./point.mjs');
  console.log(imported === required);  // true
})();
```

If the module being `require()`'d contains top-level `await`, or the module
graph it `import`s contains top-level `await`,
[`ERR_REQUIRE_ASYNC_MODULE`][] will be thrown. In this case, users should
load the asynchronous module using `import()`.

If `--experimental-print-required-tla` is enabled, instead of throwing
`ERR_REQUIRE_ASYNC_MODULE` before evaluation, Node.js will evaluate the
module, try to locate the top-level awaits, and print their location to
help users fix them.

PR-URL: nodejs#51977
Reviewed-By: Chengzhong Wu <legendecas@gmail.com>
Reviewed-By: Matteo Collina <matteo.collina@gmail.com>
Reviewed-By: Guy Bedford <guybedford@gmail.com>
Reviewed-By: Antoine du Hamel <duhamelantoine1995@gmail.com>
Reviewed-By: Geoffrey Booth <webadmin@geoffreybooth.com>
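The advice to fall back to `import()` when `ERR_REQUIRE_ASYNC_MODULE` is thrown could be wrapped roughly as follows. This is a sketch under assumptions: `loadModule` is a hypothetical helper, and the loaders are injected fakes so the snippet runs without real modules or the experimental flag. In real code they would be `require` and `(s) => import(s)`.

```javascript
// Hypothetical helper: try a synchronous load first and fall back to an
// asynchronous one only when the graph contains top-level await.
function loadModule(specifier, requireFn, importFn) {
  try {
    return { sync: true, value: requireFn(specifier) };
  } catch (err) {
    if (err && err.code === 'ERR_REQUIRE_ASYNC_MODULE') {
      // The caller receives a promise here and has to await it.
      return { sync: false, value: importFn(specifier) };
    }
    throw err; // unrelated failures are not swallowed
  }
}

// Fakes that mimic the two outcomes; the error message is made up.
const fakeRequire = (specifier) => {
  if (specifier === './async.mjs') {
    const err = new Error('simulated: graph contains top-level await');
    err.code = 'ERR_REQUIRE_ASYNC_MODULE';
    throw err;
  }
  return { default: 'sync module' };
};
const fakeImport = async (specifier) => ({ default: `imported ${specifier}` });

const syncResult = loadModule('./sync.mjs', fakeRequire, fakeImport);
const asyncResult = loadModule('./async.mjs', fakeRequire, fakeImport);
console.log(syncResult.sync, asyncResult.sync); // true false
```

Returning a tagged result keeps the synchronous path synchronous; a helper that always returned a promise would be simpler to consume but would give up the main benefit of `require(esm)`.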
Notable changes:

benchmark:
  * add AbortSignal.abort benchmarks (Raz Luvaton) nodejs#52408
buffer:
  * improve `base64` and `base64url` performance (Yagiz Nizipli) nodejs#52428
crypto:
  * deprecate implicitly shortened GCM tags (Tobias Nießen) nodejs#52345
deps:
  * (SEMVER-MINOR) update simdutf to 5.0.0 (Daniel Lemire) nodejs#52138
  * (SEMVER-MINOR) update undici to 6.3.0 (Node.js GitHub Bot) nodejs#51462
  * (SEMVER-MINOR) update undici to 6.2.1 (Node.js GitHub Bot) nodejs#51278
dns:
  * (SEMVER-MINOR) add order option and support ipv6first (Paolo Insogna) nodejs#52492
doc:
  * update release gpg keyserver (marco-ippolito) nodejs#52257
  * add release key for marco-ippolito (marco-ippolito) nodejs#52257
  * add UlisesGascon as a collaborator (Ulises Gascón) nodejs#51991
  * (SEMVER-MINOR) deprecate fs.Stats public constructor (Marco Ippolito) nodejs#51879
events,doc:
  * mark CustomEvent as stable (Daeyeon Jeong) nodejs#52618
fs:
  * add stacktrace to fs/promises (翠 / green) nodejs#49849
lib, url:
  * (SEMVER-MINOR) add a `windows` option to path parsing (Aviv Keller) nodejs#52509
net:
  * (SEMVER-MINOR) add CLI option for autoSelectFamilyAttemptTimeout (Paolo Insogna) nodejs#52474
report:
  * (SEMVER-MINOR) add `--report-exclude-network` option (Ethan Arrowood) nodejs#51645
src:
  * (SEMVER-MINOR) add `string_view` overload to snapshot FromBlob (Anna Henningsen) nodejs#52595
  * (SEMVER-MINOR) add C++ ProcessEmitWarningSync() (Joyee Cheung) nodejs#51977
  * (SEMVER-MINOR) add uv_get_available_memory to report and process (theanarkh) nodejs#52023
  * (SEMVER-MINOR) preload function for Environment (Cheng Zhao) nodejs#51539
stream:
  * (SEMVER-MINOR) support typed arrays (IlyasShabi) nodejs#51866
test_runner:
  * (SEMVER-MINOR) add suite() (Colin Ihrig) nodejs#52127
  * (SEMVER-MINOR) add `test:complete` event to reflect execution order (Moshe Atlow) nodejs#51909
util:
  * (SEMVER-MINOR) support array of formats in util.styleText (Marco Ippolito) nodejs#52040
v8:
  * (SEMVER-MINOR) implement v8.queryObjects() for memory leak regression testing (Joyee Cheung) nodejs#51927
watch:
  * mark as stable (Moshe Atlow) nodejs#52074

PR-URL: nodejs#52793
@joyeecheung if you are interested, I guess we can backport this to v20 (only 2 out of 3 commits landed on v20.x)
If it doesn't land cleanly I can prepare a backport - there are a couple of bug fixes that should be backported together too.
Yes, thanks. I think this has been baking long enough for LTS.
Note: it's not landing cleanly because #50322 and #51999 are not yet on v20.x. #51999 is technically not non-breaking, but I imagine the breakage is tiny (it only prints some warnings when the process is about to quit with unsettled TLA), though I'd let the releasers decide whether it's okay to backport it to v20.x. For now I'll just prepare a backport with non-risky patches.
It seems backporting the detection-related patches would be somewhat problematic if #50322 isn't backported first (then all the module type detection logic would need to be rewritten to match the old internal API, which would make backporting #50322 later even harder). I'll try to just backport the require(esm) part and drop detection for now until #50322 is backported.
#52868 depends on #52058, which depends on #51758, which depends on a new V8 API that's not on v20.x. I guess the safest way to backport the chain is to add a variant of #51758 which uses the old V8 API. That would be slow, but it would at least allow us to avoid conflicts in future patches depending on the utility...
Opened #53500
Summary
This patch adds `require()` support for synchronous ESM graphs under the flag `--experimental-require-module`.

This is based on the following design aspect of ESM:

- The resolution can be synchronous (up to the host)
- The evaluation of a synchronous graph (without top-level await) is also synchronous, and, by the time the module graph is instantiated (before evaluation starts), this is already known.

If `--experimental-require-module` is enabled, and the ECMAScript module being loaded by `require()` meets the following requirements:

- Explicitly marked as an ES module with a `"type": "module"` field in the closest package.json or a `.mjs` extension.
- Fully synchronous (contains no top-level `await`).

`require()` will load the requested module as an ES Module, and return the module name space object. In this case it is similar to dynamic `import()` but is run synchronously and returns the name space object directly.

If the module being `require()`'d contains top-level `await`, or the module graph it `import`s contains top-level `await`, `ERR_REQUIRE_ASYNC_MODULE` will be thrown. In this case, users should load the asynchronous module using `import()`.

If `--experimental-print-required-tla` is enabled, instead of throwing `ERR_REQUIRE_ASYNC_MODULE` before evaluation, Node.js will evaluate the module, try to locate the top-level awaits, and print their location to help users find them.
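The "explicitly marked as an ES module" requirement can be approximated with a short check. The sketch below is a simplification and not Node.js's actual lookup (which has more rules, for example around `node_modules` boundaries); `isExplicitlyEsm` is a hypothetical name.

```javascript
const fs = require('node:fs');
const os = require('node:os');
const path = require('node:path');

// Simplified approximation: a file counts as explicitly ESM if it ends in
// .mjs, or if the closest package.json found by walking up the directory
// tree has "type": "module". A .cjs file is always treated as CommonJS.
function isExplicitlyEsm(filename) {
  if (filename.endsWith('.mjs')) return true;
  if (filename.endsWith('.cjs')) return false;
  let dir = path.dirname(path.resolve(filename));
  while (true) {
    const candidate = path.join(dir, 'package.json');
    if (fs.existsSync(candidate)) {
      const pkg = JSON.parse(fs.readFileSync(candidate, 'utf8'));
      return pkg.type === 'module';
    }
    const parent = path.dirname(dir);
    if (parent === dir) return false; // reached the filesystem root
    dir = parent;
  }
}

// Usage: build a throwaway package in a temporary directory.
const root = fs.mkdtempSync(path.join(os.tmpdir(), 'require-esm-'));
fs.writeFileSync(path.join(root, 'package.json'), JSON.stringify({ type: 'module' }));
fs.mkdirSync(path.join(root, 'lib'));
const esmByPackage = isExplicitlyEsm(path.join(root, 'lib', 'index.js'));
const esmByExtension = isExplicitlyEsm(path.join(root, 'lib', 'point.mjs'));
const cjsByExtension = isExplicitlyEsm(path.join(root, 'lib', 'legacy.cjs'));
fs.rmSync(root, { recursive: true, force: true });

console.log(esmByPackage, esmByExtension, cjsByExtension); // true true false
```

Note that this only covers the first requirement. Whether the graph is fully synchronous cannot be decided from file names; per the description above, that is known once the module graph has been instantiated.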
Background
There were some previous discussions about this idea back in 2019 (e.g. #49450). I didn't go through all of them, but in 2024 I believe we can agree that not supporting `require(esm)` is creating enough pain for our users that we should really deprioritize the drawbacks of it. A non-perfect solution is still better than having nothing at all IMO.

There was a previous attempt in #30891 which tried to support TLA from the start and thus needed to run the event loop recursively, which would be unsafe, and therefore it was closed (synchronous-only `require(esm)` was brought up in #30891 (comment) but the PR didn't end up going that way). I have the impression that there were some other attempts before, but none are active AFAIK.

This PR tries to keep it simple: only load ESM synchronously when we know it's synchronous (which is part of the design of ESM and is supported by the V8 API), and if it contains TLA, we throw. That should at least address the majority of use cases of ESM (TLA in a module that's supposed to be imported is already not a great idea; it is more meant for entry points. If it is really needed, users can use `import()` to make that asynchronicity explicit).

When I was refactoring the module loader implementation and touching the V8 Module API to fix other issues, this idea appeared natural to me (since ESM is really designed with this synchronicity in mind) and does not actually need that much work in 2024 (er, with some refactorings that I already did for other issues at least...), so here is another attempt at it.
Motivation
The motivation for this is probably obvious, but I'll give my take again in case there are unfamiliar readers: CJS/ESM interop would always be done on a best-effort basis and they should not be mixed if avoidable, but today the majority of the popular packages out there in the registry are still CJS. There needs to be an escape hatch for simple cases while the transition happens.
With `require(esm)`, when a dependency goes ESM-only, it is less likely to be a breaking change for users as long as it's a synchronous ESM (with no top-level await), which should be the case most of the time. This helps package authors transition to ESM without worrying about user experience, or having to release it as a dual module, which bloats the `node_modules` size even further and leads to identity problems due to the duplication.

The design of ESM already ensures that synchronous evaluation, and therefore interop with CJS for a synchronous graph, is possible (e.g. see tc39/proposal-top-level-await#61), and we won't be alone in restricting TLA for certain features (e.g. w3c/ServiceWorker#1407: service workers on the web also disallow TLA), so it would be a shame not to make use of that. Ongoing proposals like import defer could also help address the lazy-loading needs without breaking the synchronous aspect of ESM.
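The identity problem caused by shipping a package as a dual module can be illustrated without any real package. In this sketch `evaluatePackage` is a hypothetical stand-in for evaluating the same source twice, once as the CommonJS build and once as the ESM build.

```javascript
// Each call simulates one evaluation of the package's source, so each
// copy gets its own class object and its own module-level state.
function evaluatePackage() {
  class Point {
    constructor(x, y) { this.x = x; this.y = y; }
  }
  const registry = new Map();
  return { Point, registry };
}

const cjsCopy = evaluatePackage(); // what a require() consumer would see
const esmCopy = evaluatePackage(); // what an import consumer would see

const p = new esmCopy.Point(1, 2);
console.log(p instanceof esmCopy.Point); // true
console.log(p instanceof cjsCopy.Point); // false

esmCopy.registry.set('key', 1);
console.log(cjsCopy.registry.has('key')); // false
```

When `require()` and `import` resolve to the same ES module instance, as in the `imported === required` example from the commit message, there is only one evaluation and neither mismatch can occur.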
TODOs
There are still some feature interactions that this implementation doesn't handle (e.g. `--experimental-detect-module` or `--experimental-loader` or `--experimental-wasm-modules`). Some edge cases involving cycles probably would have undefined behaviors. I don't think this needs to handle interactions with everything (especially other experimental features) perfectly to land as a first iteration of an experimental feature. We can continue iterating on it while it's experimental.