module: make CJS load from ESM loader #47999
This is really promising! Great work!
I assume the end goal is to eventually remove the current CommonJS loader? And so we would need to figure out how to make the whole “CJS-only” flow entirely sync, so as not to break async hooks tests.
- How to make `require('./file.json')` work if `require` does not have import attributes but the ESM loader enforces a `type: 'json'` attribute to load JSON modules?
I think we should track the origin of each module load: whether it was via `require` or `import`. Then the answer here is simple: if the origin was `require`, look at the file extension. (I assume that’s how the CJS loader does it today.) Basically if the origin is `require`, apply all the existing CJS semantics with extension searching, etc.
- Calling `require('./file.cjs')` shortcuts the cjs-module-lexer pass that detects CJS exports (because WASM instantiation is async), but how do we ensure that a later `import('./test.cjs')` dynamic import gets the expected list of exports?
I feel like we should be able to get away with not needing to know the list of named exports for a CommonJS module. The CJS loader doesn’t know or use them, I think? I know the ESM loader expects them as it does the linking phase as part of inserting ESM modules into V8, but maybe there could be some alternate code path within that part of the flow where, if the origin was `require`, we insert the module into V8 the way the CJS loader does now, without the linking phase.
- How to make dynamic import work from CJS?
Well it already works in the current CJS loader; what’s the issue specifically?
- How much of the legacy CJS API do we want to port?
As in, stuff like `require.cache` and `require.resolve` and so on? I assume we’d want to port as much as possible, especially stuff we see used in big popular projects.
-
- all the TODOs and probably more
Specifically, here's the use case:

    // target.cjs
    exports.test = 1;

    // target-reexport.cjs
    module.exports = require('./target.cjs');

    // middlepoint.cjs
    require('./target-reexport.cjs');

    // middlepoint.mjs
    import { test } from './target-reexport.cjs';

    // entry1.mjs
    import './middlepoint.mjs';
    import('./middlepoint.cjs');
    // No issue, node is able to detect `test` as a named export.

    // entry2.mjs
    import './middlepoint.cjs';
    import('./middlepoint.mjs'); // SyntaxError: Named export 'test' not found.

Meaning we'll get different results for the same file depending on the order of imports. I don't think that's acceptable, but we might get away with it as long as this is opt-in and experimental.
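To make the order dependence concrete, here is a toy model of it. It is not how Node.js is implemented: `createLoader`, `detect`, and the cache layout are hypothetical, and the sketch assumes only what the thread describes, namely that a `require`-originated load skips export detection and that the cached entry is reused by later loads.

```javascript
// Toy model only: every name here is hypothetical, not a Node.js internal.
function createLoader(detectExports) {
  const moduleMap = new Map(); // url -> { namedExports: string[] | null }

  return {
    // A require()-originated load skips export detection entirely.
    require(url) {
      if (!moduleMap.has(url)) moduleMap.set(url, { namedExports: null });
      return moduleMap.get(url);
    },
    // An import-originated load runs detection, but only on a cache miss.
    importNamed(url, name) {
      if (!moduleMap.has(url)) {
        moduleMap.set(url, { namedExports: detectExports(url) });
      }
      const entry = moduleMap.get(url);
      if (!entry.namedExports?.includes(name)) {
        throw new SyntaxError(`Named export '${name}' not found.`);
      }
      return entry;
    },
  };
}

const detect = () => ['test']; // stands in for the lexer pass
const url = './target-reexport.cjs';

// Like entry1.mjs: the import comes first, so the export list gets cached.
const first = createLoader(detect);
first.importNamed(url, 'test');
first.require(url);

// Like entry2.mjs: the require comes first, so no export list exists.
const second = createLoader(detect);
second.require(url);
try {
  second.importNamed(url, 'test');
} catch (err) {
  console.log(`${err.name}: ${err.message}`);
}
```

Under these assumptions the only difference between the two runs is which kind of load populated the cache first, which is the inconsistency being objected to.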
|
Surely in order to drop the current CJS loading code, every single use case, with no exceptions, would need to be met? The CJS system has been marked "stable" and "locked", I believe, for a very long time, and to me that means that no feature would ever be removed. |
I’m not following this. This is an ESM file importing an ESM file; I would expect it to work. Is this issue that the stack is
This just seems like something to be fixed in the code. Surely there’s some way to make this work in a unified-loader architecture since it already works in the current split loader setup. |
That's my understanding as well.
That seems obvious, where else would we fix it? |
I just mean that this isn't a design question, it's just a matter of fixing a bug. As in we don't need to contemplate changing the API or anything like that. |
Speaking of

    $ node -p __filename
    [eval]
    $ node -p __dirname
    .
    $ echo 'console.log({__filename,__dirname})' | node
    { __filename: '[stdin]', __dirname: '.' }

We could look at how other tools in the ecosystem hook into the CJS loader to provide virtual module capabilities; they usually rely on also overriding the fs module so it's transparent to users, but I have a feeling that's not a goal for us. |
0eee2c3 to 859bb3e
That's my understanding as well. Node's classic module system has always assumed that the module identifier is a filename, and this matches I think a reasonably un-surprising choice would be for edit to add: Another idea would be to allow the loader to provide a |
    @@ -211,6 +221,7 @@ translators.set('commonjs', async function commonjsStrategy(url, source,
     });

     function cjsPreparseModuleExports(filename, source) {
       // TODO: Do we want to keep hitting the user mutable CJS loader here?
I think we need to in order to interoperate with transpiling workflows that overwrite require.extensions today, correct?
I think there would be value in proposing an alternative that gets rid of all the legacy stuff (including `require.extensions`) – even if that can never be the default, because of backward compat – otherwise it's simply too difficult to integrate. Maybe it's a bad idea, we'll see, I'm not married to it, but I'd like us to try it.
"Never" is quite a long time. Having a "no legacy" opt in would definitely be a very nice first step towards removing require.extensions eventually.
I think like the other TODOs, we want to avoid referencing the CommonJS loader whenever possible; if a custom loader has opted into handling CommonJS, then that custom loader’s author should know not to expect `require.extensions` to be used for those modules. If users mix loaders that opt into the new behavior and loaders that use `require.extensions`, well, I don’t know; maybe we try to error informatively in such a case? But I don’t think that’s something that we need to go out of our way to support.
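For readers unfamiliar with the transpiling workflow this thread refers to, the following is a minimal sketch of a `require.extensions` hook. The `.upper` extension, the temporary file, and the "transpile" step (upper-casing text) are made up for illustration; `require.extensions` and `module._compile` are the long-standing, deprecated-but-working CJS loader hooks that such tools build on.

```javascript
const fs = require('node:fs');
const os = require('node:os');
const path = require('node:path');

// Register a handler for a made-up ".upper" extension (illustrative only).
// The legacy CJS loader calls it with the Module instance and the filename.
require.extensions['.upper'] = function loadUpper(module, filename) {
  const text = fs.readFileSync(filename, 'utf8');
  // "Transpile" the file into CommonJS source that exports the text upper-cased.
  const compiled = `module.exports = ${JSON.stringify(text.trim().toUpperCase())};`;
  module._compile(compiled, filename);
};

// Write a throwaway input file and load it through the hook.
const dir = fs.mkdtempSync(path.join(os.tmpdir(), 'ext-demo-'));
const file = path.join(dir, 'greeting.upper');
fs.writeFileSync(file, 'hello from a custom extension\n');

const greeting = require(file);
console.log(greeting);
```

A module that reaches the ESM loader with a `source` supplied by a custom `load` hook would, per the comments above, not be expected to pass through a handler like this one.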
af6b95c to 8df8dc0
There are 13 test failures to address:
There are probably tons of edge cases to take into consideration as well. Supporting every CJS edge case is probably not a goal for this PR, but we should at least document what's working and what's not; thankfully @isaacs has volunteered to help with writing tests for that. Any additional help would be appreciated, please consider getting involved :) |
@aduh95 why is that not a goal? The CJS module loading system is the most stable one in node, which includes all the edge cases. |
It's not a goal for this PR, I should have said (edited my message above). The long-term plans of what to support and how are still open to discussion; the goal in this PR is to achieve an MVP. |
thanks, that clarifies |
@aduh95 found a few issues: https://gist.github.com/isaacs/a9aacc1efeca7d565f2f7828467d9e6e
I just started looking at this again, so not much insight as to why that's happening, but figured it might be obvious/easy to you, having spent more time in this code. I'll keep poking at it, hopefully have more useful feedback here shortly. |
It is performed, the issue was that the new translator didn't support |
Can you please wait for a review from me before landing this? The off-thread loader change broke ESM support in APMs and we want to make sure this doesn't break CJS too. |
Can you make a list of APIs that are necessary to preserve? I would imagine that APMs will just be required to start using the loaders API, even for CommonJS, to preserve current functionality. We’ll probably land #46826 before this (or we can ensure that it’s released before this) which should make the upgrade path simpler for APM authors and hidden from end users; the |
Forcing APMs to "catch up" is going to go very badly. ESM changes way too quickly for APMs to keep up with it unless we have people full time on just that and that's not likely to happen. We're more likely to just not support those Node.js versions and push harder on increasing diagnostics_channel adoption. Most APM users are very slow to upgrade anyway. |
I'm a little confused. Off-threading was over ½ a year in the making, and the design from an external perspective never changed throughout that period. Why is it a game of catch-up now, and what do you need to make that easier? (If this is a lengthy discussion, let's move it to the loaders repo). PS We are very, very eager to know use-cases in the wild we need to consider, and we're actively collecting them. If you can detail your considerations, that would be most welcome, or, better yet, codify them in tests to make it obvious if we've broken them. |
@Qard Please provide a list of CommonJS APIs that APMs rely on and we can try to preserve their functionality while getting the Loaders API to support CommonJS. But in my mind, if there’s any conflict between those goals, finishing the Loaders API takes priority. It’s just broken without full CommonJS support, and we can’t leave it in that state indefinitely just because some vendors who are profiting off of Node won’t invest in supporting Node’s newest versions. That said, I’m not trying to break anyone so if there’s any way to maintain compatibility with monkey-patching CommonJS internals or whatever techniques these tools rely on, I’m happy to continue supporting it whenever possible. |
Can you clarify this point? Would this break how |
I agree that adding ESM to CJS should be designed as an extension to be successful, not a replacement. |
That was just a hypothetical musing, that we don't need to debate as it's unlikely to happen at least anytime soon. The goal with this PR is to get all tests passing, which would mean we're not breaking any backward compatibility for any CommonJS APIs that are documented. My question for @Qard is whether there are any CommonJS APIs that APMs are relying on that aren't covered by tests; things like monkey patching that we don't officially support, but can try to preserve if this PR breaks them. |
See: https://github.com/DataDog/dd-trace-js/blob/master/packages/dd-trace/src/ritm.js I won't have time to get into more detail until Tuesday, but there are a bunch of underscore-prefixed methods on the Module class which need to be accessible for APMs to continue working. My plan has been to migrate people to diagnostics_channel so module patching becomes no longer relevant; however, that module patching mechanism needs to continue to work for at least the mid-term. It's important to understand that you're dealing with big enterprises that plan things in quarters or years. If you want to break compatibility with something, especially something this heavily relied upon, you can't just go do that in a few months; you generally need a year or more to migrate users, and there needs to be something to migrate to before that can happen. And pre-release discussion doesn't count; there's no visibility for that unless a company has people actively engaging in that group, which is generally not the case. That's why the off-thread loaders thing was so disruptive, because there was no flag to turn it off in the mid-term. Yes, there was some leading work that went into it, but we don't have people actively engaging in the loaders group because we are a small team and don't have the time to be in every discussion. At most APMs the Node.js team is only a couple of people and most of their time is going into delivering OKRs. I would love to have people working on Node.js core full-time, and I've been pushing for that, but that's simply not the reality, especially right now with hiring across the market being so tight. |
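As a rough illustration of the kind of module patching described above, the sketch below wraps `Module.prototype.require` to observe every CommonJS `require()` call. It is a simplified stand-in and not the code in the linked `ritm.js`; the `seen` array and `patchedRequire` name are made up, and real tools also touch underscore-prefixed `Module` methods that are not shown here.

```javascript
const Module = require('node:module');

const seen = []; // ids passed to require() while the patch is active
const originalRequire = Module.prototype.require;

// Replace the shared implementation that every module's `require` delegates to.
Module.prototype.require = function patchedRequire(id) {
  seen.push(id);
  // An instrumentation tool would wrap or replace the returned exports here.
  return originalRequire.apply(this, arguments);
};

require('node:path');
require('node:os');

// Restore the original so the patch does not leak past this demo.
Module.prototype.require = originalRequire;

console.log(seen);
```

A `require` function that does not delegate to `Module.prototype.require`, such as the one the PR description says is handed to hook-loaded CommonJS modules, would bypass a patch like this, which is the compatibility concern raised in this comment.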
Notable changes:

deps:
* V8: cherry-pick 93275031284c (Joyee Cheung) nodejs#48660

doc:
* add new TSC members (Michael Dawson) nodejs#48841
* add rluvaton to collaborators (Raz Luvaton) nodejs#49215

esm:
* unflag import.meta.resolve (Guy Bedford) nodejs#49028
* add `initialize` hook, integrate with `register` (Izaak Schroeder) nodejs#48842
* unflag `Module.register` and allow nested loader `import()` (Izaak Schroeder) nodejs#48559

inspector:
* (SEMVER-MINOR) open add `SymbolDispose` (Chemi Atlow) nodejs#48765

module:
* implement `register` utility (João Lenon) nodejs#46826
* make CJS load from ESM loader (Antoine du Hamel) nodejs#47999

src:
* add built-in `.env` file support (Yagiz Nizipli) nodejs#48890
* initialize cppgc (Daryl Haresign and Joyee Cheung) nodejs#48660 and nodejs#45704

test_runner:
* (SEMVER-MINOR) expose location of tests (Colin Ihrig) nodejs#48975

PR-URL: nodejs#49185
PR-URL: nodejs#49530
Refs: nodejs#48842
Refs: nodejs#47999
Reviewed-By: Geoffrey Booth <webadmin@geoffreybooth.com>
Reviewed-By: Akhil Marsonya <akhil.marsonya27@gmail.com>
Reviewed-By: Yagiz Nizipli <yagiz@nizipli.com>
Reviewed-By: Jacob Smith <jacob@frende.me>
* module: make CJS load from ESM loader (nodejs/node#47999) * lib: improve esm resolve performance (nodejs/node#46652)
* chore: upgrade to Node.js v20 * src: allow embedders to override NODE_MODULE_VERSION nodejs/node#49279 * src: fix missing trailing , nodejs/node#46909 * src,tools: initialize cppgc nodejs/node#45704 * tools: allow passing absolute path of config.gypi in js2c nodejs/node#49162 * tools: port js2c.py to C++ nodejs/node#46997 * doc,lib: disambiguate the old term, NativeModule nodejs/node#45673 * chore: fixup Node.js BSSL tests * nodejs/node#49492 * nodejs/node#44498 * deps: upgrade to libuv 1.45.0 nodejs/node#48078 * deps: update V8 to 10.7 nodejs/node#44741 * test: use gcUntil() in test-v8-serialize-leak nodejs/node#49168 * module: make CJS load from ESM loader nodejs/node#47999 * src: make BuiltinLoader threadsafe and non-global nodejs/node#45942 * chore: address changes to CJS/ESM loading * module: make CJS load from ESM loader (nodejs/node#47999) * lib: improve esm resolve performance (nodejs/node#46652) * bootstrap: optimize modules loaded in the built-in snapshot nodejs/node#45849 * test: mark test-runner-output as flaky nodejs/node#49854 * lib: lazy-load deps in modules/run_main.js nodejs/node#45849 * url: use private properties for brand check nodejs/node#46904 * test: refactor `test-node-output-errors` nodejs/node#48992 * assert: deprecate callTracker nodejs/node#47740 * src: cast v8::Object::GetInternalField() return value to v8::Value nodejs/node#48943 * test: adapt test-v8-stats for V8 update nodejs/node#45230 * tls: ensure TLS Sockets are closed if the underlying wrap closes nodejs/node#49327 * test: deflake test-tls-socket-close nodejs/node#49575 * net: fix crash due to simultaneous close/shutdown on JS Stream Sockets nodejs/node#49400 * net: use asserts in JS Socket Stream to catch races in future nodejs/node#49400 * lib: fix BroadcastChannel initialization location nodejs/node#46864 * src: create BaseObject with node::Realm nodejs/node#44348 * src: implement DataQueue and non-memory resident Blob nodejs/node#45258 * sea: add support for V8 
bytecode-only caching nodejs/node#48191 * chore: fixup patch indices * gyp: put filenames in variables nodejs/node#46965 * build: modify js2c.py into GN executable * fix: (WIP) handle string replacement of fs -> original-fs * [v20.x] backport vm-related memory fixes nodejs/node#49874 * src: make BuiltinLoader threadsafe and non-global nodejs/node#45942 * src: avoid copying string in fs_permission nodejs/node#47746 * look upon my works ye mighty and dispair * chore: patch cleanup * [api] Remove AllCan Read/Write https://chromium-review.googlesource.com/c/v8/v8/+/5006387 * fix: missing include for NODE_EXTERN * chore: fixup patch indices * fix: fail properly when js2c fails in Node.js * build: fix js2c root_gen_dir * fix: lib/fs.js -> lib/original-fs.js * build: fix original-fs file xforms * fixup! module: make CJS load from ESM loader * build: get rid of CppHeap for now * build: add patch to prevent extra fs lookup on esm load * build: greatly simplify js2c modifications Moves our original-fs modifications back into a super simple python script action, wires up the output of that action into our call to js2c * chore: update to handle moved internal/modules/helpers file * test: update @types/node test * feat: enable preventing cppgc heap creation * feat: optionally prevent calling V8::EnableWebAssemblyTrapHandler * fix: no cppgc initialization in the renderer * gyp: put filenames in variables nodejs/node#46965 * test: disable single executable tests * fix: nan tests failing on node headers missing file * tls,http2: send fatal alert on ALPN mismatch nodejs/node#44031 * test: disable snapshot tests * nodejs/node#47887 * nodejs/node#49684 * nodejs/node#44193 * build: use deps/v8 for v8/tools Node.js hard depends on these in their builtins * test: fix edge snapshot stack traces nodejs/node#49659 * build: remove js2c //base dep * build: use electron_js2c_toolchain to build node_js2c * fix: don't create SafeSet outside packageResolve Fixes failure in 
parallel/test-require-delete-array-iterator: === release test-require-delete-array-iterator === Path: parallel/test-require-delete-array-iterator node:internal/per_context/primordials:426 constructor(i) { super(i); } // eslint-disable-line no-useless-constructor ^ TypeError: object is not iterable (cannot read property Symbol(Symbol.iterator)) at new Set (<anonymous>) at new SafeSet (node:internal/per_context/primordials:426:22) * fix: failing crashReporter tests on Linux These were failing because our change from node::InitializeNodeWithArgs to node::InitializeOncePerProcess meant that we now inadvertently called PlatformInit, which reset signal handling. This meant that our intentional crash function ElectronBindings::Crash no longer worked and the renderer process no longer crashed when process.crash() was called. We don't want to use Node.js' default signal handling in the renderer process, so we disable it by passing kNoDefaultSignalHandling to node::InitializeOncePerProcess. * build: only create cppgc heap on non-32 bit platforms * chore: clean up util:CompileAndCall * src: fix compatility with upcoming V8 12.1 APIs nodejs/node#50709 * fix: use thread_local BuiltinLoader * chore: fixup v8 patch indices --------- Co-authored-by: Keeley Hammond <vertedinde@electronjs.org> Co-authored-by: Samuel Attard <marshallofsound@electronjs.org>
This PR adds support for `source` in the `load` hook when loading `'commonjs'` modules. CJS modules loaded this way will receive a `require` function that's not the standard CJS one. It implements only a subset of the CJS API, and monkey-patching the CJS API will (mostly [1]) not affect it. Loading a module using that `require` function will not call into the CJS loader but the ESM one, and similarly, calling `require.resolve` will use the ESM `resolve` hook.

Footnotes

1. There are still some dependencies on the legacy loader. The long-term goal would be to get rid of those.
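A `load` hook that takes advantage of this might look like the sketch below. The `(url, context, nextLoad)` signature and the `format`, `source`, and `shortCircuit` fields follow the documented hooks API; everything else is an assumption for illustration. In real use the function would be exported from a hooks module registered with `module.register()`; here it is a plain function so it can be called directly.

```javascript
const { readFile } = require('node:fs/promises');
const { fileURLToPath } = require('node:url');

// Sketch of a `load` hook that supplies `source` for CommonJS files itself,
// which is what routes them through the ESM loader instead of the legacy one.
async function load(url, context, nextLoad) {
  if (url.startsWith('file:') && url.endsWith('.cjs')) {
    const source = await readFile(fileURLToPath(url), 'utf8');
    return {
      format: 'commonjs',
      source, // a transform (e.g. instrumentation) could be applied here
      shortCircuit: true, // skip any remaining `load` hooks
    };
  }
  // Anything else is delegated to the next hook in the chain.
  return nextLoad(url, context);
}
```

Because the hook returns `source`, the module it loads would receive the non-standard `require` described above, so `require.extensions` handlers and patches to the legacy loader should not be assumed to apply to it.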