Async hooks and CommonJS loaders #45711

GeoffreyBooth · 2022-12-02T06:40:46Z

GeoffreyBooth
Dec 2, 2022
Collaborator

A requested feature for the Loaders API is to have loaders support CommonJS as well as ESM. This would allow libraries like ts-node to use a single code path for their customizations, rather than implementing TypeScript transpilation for ESM using loader hooks and again for CommonJS using require.extensions, for example; and it would allow things that are currently impossible, like require of an HTTPS URL.

The technical way to achieve this in the Node codebase is to merge the code in lib/internal/modules/cjs/ with the far larger lib/internal/modules/esm/, to have just one loader rather than two that lib/internal/modules/run_main.js chooses between. The ESM loader can already load both ESM and CommonJS modules, so the change here is that it would do so for require as well as import, and for CommonJS entry points. There would be some tricky problems to solve around promises and synchronous loading; but even if it proves impossible for some reason, there are maintainability reasons for trying to have the current ESM loader at least handle all entry points even if it can’t handle all require calls.

In #43408 (comment) and following comments, and #44323 spun off from that discussion, we investigated Node core tests that fail when the ESM loader is used to load a CommonJS entry point. For background, currently the node command uses the CommonJS loader unless the entry point is detected as ESM (.mjs file or .js under "type": "module", or string/STDIN with --input-type=module) or if either of the --import or --loader flags are passed. Arguably all the Node codebase tests should pass regardless of whether the entry point is handled by the CommonJS or the ESM loader, since the ESM loader can load CommonJS; if any tests fail, that points to possible bugs or brittle tests.
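The entry-point rules described above can be sketched as a small decision function. This is only an illustration of the stated conditions: the function name, its parameters, and the option object are hypothetical and are not taken from run_main.js.

```javascript
// Hypothetical sketch of the entry-point rules described above.
// None of these names come from Node's run_main.js.
const path = require('node:path');

function shouldUseESMLoader({ entry, packageType, inputType, execArgv = [] }) {
  // Passing --import or --loader (also spelled --experimental-loader)
  // sends the entry point through the ESM loader.
  const hasLoaderFlag = execArgv.some((arg) =>
    /^--(import|loader|experimental-loader)(=|$)/.test(arg));
  if (hasLoaderFlag) return true;

  // String or STDIN input has no file; only --input-type decides.
  if (entry === undefined) return inputType === 'module';

  const ext = path.extname(entry);
  if (ext === '.mjs') return true;
  // A .js file is ESM only under a package.json with "type": "module".
  if (ext === '.js') return packageType === 'module';
  return false;
}
```

With these assumptions, `shouldUseESMLoader({ entry: 'app.js' })` falls through to the CommonJS loader, while adding `execArgv: ['--import', './setup.mjs']` switches the same entry point to the ESM loader.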

I ran the experiment again recently, of setting run_main.js line 77 to const useESMLoader = true and seeing which tests fail when the ESM loader is always used. In 80dcc068fb there are 160 tests failing, most of which are async_hooks tests. Based on discussion in #44323 and openjs-foundation/summit#340, it seems that most if not all of these failures are expected: createHook captures internal async activity, and when the ESM loader is engaged there’s simply more internal async activity going on, and so the hooks are called more often; but the tests are asserting that a specific number of calls should have happened, and therefore the tests fail. Arguably the tests are just too specific, and could be rewritten to ignore hook calls that happen as a result of internal activity rather than async resources created by the tests themselves.
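As a rough illustration of the difference between a brittle assertion and a more tolerant one, the sketch below records every init call and then filters down to resources of the expected type that were triggered from the test's own scope. Filtering by type and triggerAsyncId is an assumption about how such tests could be rewritten, not a description of how the Node core tests work.

```javascript
const async_hooks = require('node:async_hooks');

// Record every resource the hook sees, along with what triggered it.
const seen = [];
const hook = async_hooks.createHook({
  init(asyncId, type, triggerAsyncId) {
    seen.push({ asyncId, type, triggerAsyncId });
  },
});

hook.enable();
// The async ID of the scope in which the "test" creates its resources.
const testScopeId = async_hooks.executionAsyncId();
const timer = setTimeout(() => {}, 1); // the resource under test
hook.disable();
clearTimeout(timer);

// Brittle: counts everything the hook observed, so any extra internal
// async activity while the hook is enabled changes the number.
const total = seen.length;

// More tolerant: count only Timeout resources triggered from the test scope.
const own = seen.filter(
  (r) => r.type === 'Timeout' && r.triggerAsyncId === testScopeId);
```

An assertion on `own.length` keeps holding even if a different loader adds unrelated entries to `seen`, whereas an assertion on `total` would not.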

Would it be acceptable to go ahead and make the ESM loader apply to all entry points (so like useESMLoader = true) and update the tests accordingly? This would be a precursor to the lib/internal/modules/ refactor that would move CommonJS handling into the same flow as ESM and therefore through the ESM loader hooks. Are any users relying on the same expectations as these tests, that there shouldn’t be any internal async activity pending when createHook is called? If not, or if we’re okay with breaking that expectation, then we can just update the tests. Arguably we should be okay doing that, as even if we consider createHook not quite experimental, no one should be relying on internals being unchanging. What does everyone think, or are there better solutions to consider?

@nodejs/tsc @nodejs/async_hooks @nodejs/loaders

joyeecheung · 2022-12-02T10:56:26Z

joyeecheung
Dec 2, 2022
Maintainer

I have some concerns over merging the loaders:

Loading the ESM loader while loading a CJS-only graph incurs unnecessary overhead, as the ESM loader is much more complicated than the CJS loader.
That makes support of CJS modules in startup snapshots much harder, because now support for ESM becomes a prerequisite, which is much harder because of the promises and asynchronicity.
The CJS loader was...supposed to be permanently locked at one point, until Unlock module and remove Locked from stability index #11661, so we still allow changes to it to some extent, but I think changing the synchronous nature of it and adding promises would be too far when that only covers a niche use case that can be worked around in the user land.

20 replies

GeoffreyBooth Dec 7, 2022
Collaborator Author

So that’s just an implementation detail that we can work around, right?

The ESM loader does some stuff that needs to be async. Sure we could replace readFile with readFileSync, but there’s --experimental-network-imports which allows import of https URLs and does a bunch of async calls as part of fetching remote modules (

node/lib/internal/modules/esm/load.js

Lines 32 to 63 in f6052c6

    
    async function getSource(url, context) {
      const parsed = new URL(url);
      let responseURL = url;
      let source;
      if (parsed.protocol === 'file:') {
        source = await readFileAsync(parsed);
      } else if (parsed.protocol === 'data:') {
        const match = RegExpPrototypeExec(DATA_URL_PATTERN, parsed.pathname);
        if (!match) {
          throw new ERR_INVALID_URL(url);
        }
        const { 1: base64, 2: body } = match;
        source = BufferFrom(decodeURIComponent(body), base64 ? 'base64' : 'utf8');
      } else if (experimentalNetworkImports && (
        parsed.protocol === 'https:' ||
        parsed.protocol === 'http:'
      )) {
        const res = await fetchModule(parsed, context);
        source = await res.body;
        responseURL = res.resolvedHREF;
      } else {
        const supportedSchemes = ['file', 'data'];
        if (experimentalNetworkImports) {
          ArrayPrototypePush(supportedSchemes, 'http', 'https');
        }
        throw new ERR_UNSUPPORTED_ESM_URL_SCHEME(parsed, supportedSchemes);
      }
      if (policy?.manifest) {
        policy.manifest.assertIntegrity(parsed, source);
      }
      return { __proto__: null, responseURL, source };
    }

). Even if I tried to isolate that as much as possible into its own function separate from the main flow, at some point near the root of the ESM loader I would need an async function so that I could call and await this new “get remote source” function; and even if that function never gets called, the fact that the root function itself is async at all generates async activity that async hooks detects. The async function itself, and each chunk of it between await keywords (correct me here @Qard and @mcollina) generates an async resource that async hooks trigger for. As long as there’s any async activity anywhere in the ESM loader, whether it’s this built-in feature or supporting user custom hooks that could be async (even if async only for ESM) then we’ll have async activity that triggers async hooks. As far as I’m aware it’s not possible to reimplement the ESM loader to be fully synchronous without breaking a lot of current functionality (at least, if it stays as JavaScript) and it’s also not possible to make it fully synchronous for certain flows only, because the root would need to be async since some of the children would be async.
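The point about an async root function can be checked with a minimal sketch. The two load functions below are hypothetical stand-ins, not loader code: one is declared async but never reaches its await, the other is plain synchronous. Only PROMISE resources are counted.

```javascript
const async_hooks = require('node:async_hooks');

let promiseInits = 0;
const hook = async_hooks.createHook({
  init(asyncId, type) {
    if (type === 'PROMISE') promiseInits++;
  },
});

// Hypothetical stand-in for an async loader root. The "remote" branch,
// the only place that awaits, is never taken in this example.
async function loadAsync(source, isRemote) {
  if (isRemote) {
    source = await Promise.resolve(source); // stand-in for a network fetch
  }
  return source;
}

// Hypothetical fully synchronous equivalent.
function loadSync(source) {
  return source;
}

hook.enable();
loadSync('x');
const afterSync = promiseInits;   // no promise was created
loadAsync('x', false);
const afterAsync = promiseInits;  // the async call itself created a promise
hook.disable();
```

Calling an async function always returns a promise, so the hook observes a new PROMISE resource even though no await ran. This is the sense in which an async root "generates async activity" regardless of which branch is taken.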

GeoffreyBooth Dec 8, 2022
Collaborator Author

The JS-based ESM loader itself, to me, is already years-old code that’s difficult to make sense of (I once thought the CJS loader was difficult to refactor to support in snapshot, now it seems trivial in comparison).

100% agree.

if we just copy-paste the one in d8 and use it as an alternative loader for ESM support in snapshot - would actually be quite easy (probably even easier than supporting CJS loader in the snapshot)

This sounds fine to me. I think it’s fine to impose some limitations on the code that users want to snapshot; even requiring it to be a single file doesn’t bother me. The point of the snapshots feature is to squeeze out high performance, right? So it follows that users would tree-shake and bundle and minify, and saving that output as a single file (or other transformations to fit within whatever d8’s limitations are) doesn’t seem unreasonable, especially while the feature is still early in its development. Some features of the ESM loader, like supporting user loaders or the CommonJS named exports detection, feel like they would be ridiculously complicated to reimplement in C++; so if the snapshots loader is never going to have parity with the ESM loader, it might as well target just what’s needed for the goal (fast startup) and leave the rest out, and users have to conform their input to the requirements.

joyeecheung Dec 8, 2022
Maintainer

The point of the snapshots feature is to squeeze out high performance, right?

The point of the snapshots is to have the heap deserialized instead of initializing it from scratch. Having better startup performance is just one of the implications, it also allows you to serialize loaded data/code into the blob in a VM-specific format and avoid tampering. I don't think a build step gives you much benefit, just like most people don't do any of that when deploying a Node.js app.

like supporting user loaders or the CommonJS named exports detection, feel like they would be ridiculously complicated to reimplement in C++

I don't think it would be much harder to implement them in C++ (don't we already implement CommonJS named exports detection in C anyway?), as we can always call back to JS if necessary, and if the JS part is written cleanly enough there are no implications for snapshots either. The current ESM loader is difficult to snapshot because the code depends on too much runtime states from the top level, and involves a bunch of TDZs and circular dependencies. That's not even inherent to implementing certain things in JS, it's just code smell; the difference is that when the code smell is in C++ it does not affect the snapshotability much, whereas code smell in JS makes it harder to refactor for snapshotability. I don't think the spec mandates that you can't have a JS-based ESM loader that has no brittle TDZs and no circular dependencies, only queries CLI options as late as possible, does not create promises until it starts loading modules, and always finishes the promises/async operations after loading is complete. That loader would be snapshotable even if implemented in JS, but it has been difficult to refactor lib/internal/modules/esm to be like that.

Anyway I think as long as internally we can have an async-free code path for CJS, it would still be possible to just use that for supporting CJS in snapshot; similarly we can also just support ESM in snapshot by having a different loader. And we probably want to keep the async-free code path for a very long time anyway, so that users can always revert to that path with some kind of option when they cannot update their dependency to work around the breakage.
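The properties listed above (query CLI options as late as possible, create no promises until loading starts, settle everything once loading completes) amount to deferring all state to first use. The sketch below only illustrates that deferral pattern with hypothetical names. It is not code from lib/internal/modules/esm, and following the pattern does not by itself make a module snapshotable.

```javascript
// Hypothetical loader module written so that requiring it has no side effects.
let options;       // CLI-derived state, filled in on first use
let pendingLoads;  // promise bookkeeping, created only when loading starts

function getOptions() {
  // Query flags as late as possible rather than at module top level.
  options ??= {
    networkImports:
      process.execArgv.includes('--experimental-network-imports'),
  };
  return options;
}

function load(specifier) {
  pendingLoads ??= new Map();
  const job = Promise.resolve({ specifier, options: getOptions() });
  pendingLoads.set(specifier, job);
  // Clear the bookkeeping once the load settles so nothing stays pending.
  return job.finally(() => pendingLoads.delete(specifier));
}

// True while nothing has been queried or created yet.
function isPristine() {
  return options === undefined && pendingLoads === undefined;
}
```

Until `load()` is first called, `isPristine()` stays true: no flags have been read and no promises exist.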

GeoffreyBooth Dec 8, 2022
Collaborator Author

The current ESM loader is difficult to snapshot because the code depends on too much runtime states from the top level, and involves a bunch of TDZs and circular dependencies.

Do you mind opening an issue highlighting these concerns specifically, including links to the lines of code that need fixing? This is worth cleaning up regardless of the snapshots feature.

don’t we already implement CommonJS named exports detection in C anyway?

No, it’s in Wasm, using https://github.com/nodejs/cjs-module-lexer. There aren’t that many async parts of the ESM loader; the knottiest part is the ModuleJob stuff, where a bunch of promises are created as part of preparing all the modules for evaluation. If the core of the ESM loader could be instrumented in C++, and all the async stuff happens there and it calls out into JavaScript for user loaders and named exports detection and other things that are hard to implement in C++, that might be a reasonable refactor. It would probably also be a big performance boost to any medium-to-large app, as module loading is one of the hottest paths in the codebase. Unfortunately neither I nor Jacob are familiar with C++, so it would need to be you or @aduh95 or anyone else who knows C++ and modules and wants to take this on.

joyeecheung Dec 8, 2022
Maintainer

Do you mind opening an issue highlighting these concerns specifically, including links to the lines of code that need fixing?

By the time I can find the words to describe concretely what needs to be done, I probably would’ve already made the refactor happen. I think “not having TDZ, not having circular dependencies, and don’t query system states like CLI options at the top level” is already pretty actionable, no?

No, it’s in Wasm, using https://github.com/nodejs/cjs-module-lexer.

It’s not hand-written wasm (I don’t think any sane person would do that, not for anything complex at least), just wasm compiled from C ;) So it’s technically possible to just copy paste and use the C version directly (if we ignore whether it’s civil to simply copy code over).

And again, I still find it hard to believe that implementing the loader in async JS is absolutely inevitable, when most other runtimes are implementing their loader in native, where async JS isn’t a thing. I think it’s fine to say “the ESM loader is too complicated but no one really wants to spend so much time in the dark corners and refactor it, so we will leave it as is as long as it works”. But it’s not fine to say “we will also block/break other features if no one is willing to go into the dark corners of the ESM loader and refactor to make it work for them”. If we want to make the ESM loader the bottleneck of the project, at least the loader should be one of the easiest pieces of code to work with first, but personally I think what we have now is the opposite of that.

mcollina · 2022-12-02T18:00:02Z

mcollina
Dec 2, 2022
Maintainer

I'd be -1 on such a proposal. I would be ok if the two paths were merged only if a loader was enabled.

9 replies

mcollina Dec 2, 2022
Maintainer

monkey patching and async activity.

GeoffreyBooth Dec 2, 2022
Collaborator Author

So the monkey-patching concern I feel like I have a grasp on; we’d have to identify what CommonJS loader APIs people are commonly monkey-patching and avoid breaking those patterns. What about the async activity? That’s the main part I’m asking about; what are the consequences of the minimal version of this change (like useESMLoader = true, without any refactoring or removal of the cjs files) with regards to async hooks and other APIs that observe internal async resources?

JakobJingleheimer Dec 4, 2022
Collaborator

We were already planning to provide module utilities like resolvePackageJSON(). I dunno what all people might be monkey-patching in CJS, but would that possibly be addressed by those module utilities? If so, we could phase out monkey-patching during v19 and/or v20, and then cut it completely in the next major.

If we made this The One Right—err, ModuleLoader and the necessary pieces of CJSLoader were modularly available, I don't see why ModuleLoader couldn't behave synchronously for a fully CJS path (and asynchronously for an ESM or hybrid path); this one ModuleLoader might be a monstrosity just from the merge, but if it isn't, my first thoughts say having 2 paths (on top of the merge) within it wouldn't add significantly more complexity.

Qard Dec 5, 2022
Collaborator

That's likely way too short of a time scale to actually work. Monkey-patching of the CJS module system is quite deeply embedded into the ecosystem. It will take quite a bit of time to migrate the ecosystem away and that will require alternative APIs being available to migrate to before that can even start.

joyeecheung Dec 7, 2022
Maintainer

I dunno what all people might be monkey-patching in CJS, but would that possibly be addressed by those module utilities?

If the plan is to replace these monkey-patched parts, I think we need to have more concrete ideas about how to replace these and then get enough feedback to check if the alternatives actually cover the use cases, and leave ample time for users to migrate away (I would say we need multiple major releases). Even then it’s still questionable if we can simply remove the code. “We’ll provide alternatives for you to rewrite your code to avoid breakage” is generally not great in the first place, especially when the component being broken is something as old and widely used as the CJS module loader. It would be better to still leave an option for users to completely revert to the old loader to work around broken dependencies that cannot be updated.

Qard · 2022-12-02T18:06:51Z

Qard
Dec 2, 2022
Collaborator

It's worth pointing out that until APMs support loaders, switching off the default CJS module loader would completely break every APM. I don't think we can do this in the short or probably even medium-term. We need to migrate APMs to support loaders first. As far as I'm aware, Datadog is still the only APM with loader support, and only very experimental support, which I suspect is also currently broken on latest Node.js.

I'm of the opinion that until loaders is properly stable we should not be considering removing the existing CJS implementation, and even then we will need a full deprecation cycle.

4 replies

GeoffreyBooth Dec 2, 2022
Collaborator Author

It’s worth pointing out that until APMs support loaders, switching off the default CJS module loader would completely break every APM.

Can you go into detail on this? What exactly would break? The stuff they accomplish by monkey-patching require, or other internals? (Which ones?) Couldn’t the new flow still support the important things being monkey-patchable, to preserve the functionality that existing tools rely on?

Qard Dec 2, 2022
Collaborator

Every APM currently functions by patching the existing CJS module system internals. We function by intercepting the module loads which then lets us intercept the module itself before it's returned to the user and apply our patches first. If the module loader changes our patches to it will break and we won't be able to patch any of the modules we instrument, rendering every APM completely non-functional.
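A minimal sketch of the kind of interception described above wraps Module.prototype.require so that every load passes through the wrapper before the caller receives the exports. Real instrumentation libraries are considerably more involved; the bookkeeping array here is purely illustrative.

```javascript
const Module = require('node:module');

const originalRequire = Module.prototype.require;
const intercepted = []; // illustrative record of what passed through the patch

// Wrap require: the wrapper sees the module ID and the exports
// before the calling code does, which is where an APM would patch them.
Module.prototype.require = function patchedRequire(id) {
  const exports = originalRequire.apply(this, arguments);
  intercepted.push(id);
  return exports;
};

const os = require('node:os'); // goes through the patched require

// Restore the original so the patch does not leak past this example.
Module.prototype.require = originalRequire;
```

A patch like this only sees loads that actually go through the CommonJS require path, which is why routing loads elsewhere would bypass it unless that path is preserved.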

Supporting loaders in APMs is one possible solution to that, but nobody does that yet because it's been very much a moving target up to now so no one has tried to support it yet other than Datadog. There hasn't been enough use of it yet to verify it actually works correctly though. It needs a bunch more investment from us to get it to actually stable, which necessitates loaders itself being stable first. The rest of the ecosystem also hasn't even tried it yet, so there's a long wait for the ecosystem to adopt it.

Another, and in my opinion better, solution to the patching problem is to migrate the ecosystem to diagnostics_channel, which has had more traction so far and eliminates the need for loaders for the APM case altogether.
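For comparison, a diagnostics_channel version needs no patching at all: the library publishes events on a named channel and the APM subscribes to it. The channel name and the runQuery function below are hypothetical, and the sketch assumes a Node version that provides the top-level subscribe and unsubscribe functions.

```javascript
const dc = require('node:diagnostics_channel');

// Library side: publish on a named channel (the name is made up).
const queryChannel = dc.channel('example-db:query');

function runQuery(sql) {
  // Skip building the message when nobody is listening.
  if (queryChannel.hasSubscribers) {
    queryChannel.publish({ sql });
  }
  return `result of ${sql}`;
}

// APM side: subscribe by name, with no access to the library's internals.
const observed = [];
function onQuery(message) {
  observed.push(message.sql);
}
dc.subscribe('example-db:query', onQuery);

runQuery('SELECT 1'); // observed by the subscriber
dc.unsubscribe('example-db:query', onQuery);
runQuery('SELECT 2'); // no longer observed
```

The trade-off is the one noted above: each library has to opt in by publishing, so this direction also depends on ecosystem adoption.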

Both directions require ecosystem effort though and will therefore take time to reach a point where we're not so reliant on patching the CJS module loader the way we currently do.

GeoffreyBooth Dec 2, 2022
Collaborator Author

If the module loader changes our patches to it will break

I think this is a bit oversimplified. We can change the module loader while still exposing whatever APIs libraries are monkey-patching, so that they can continue to do so. Such a requirement might limit how much we can refactor the module loader, but obviously we would want to preserve as much backward compatibility as possible and so therefore I would want a really good reason to break existing APIs (even if they’re supposedly internal) that people are relying on. It would help to know which APIs are most important for APMs, and what they’re doing with them.

The original question was about additional async activity happening internally. Assume that we do preserve all the CommonJS APIs that people monkey-patch, and the monkey-patching still works; would the changed internal async activity break any APMs?

Qard Dec 2, 2022
Collaborator

Yeah, I meant that as just directly swapping useESMLoader right now will break everything. If you manage to keep the original API largely intact that should be okay, but I would recommend working with some APMs to make sure nothing breaks. Specifically, we want to be sure https://www.npmjs.com/package/require-in-the-middle still works as intended.

And yeah, the change in internal async activity may break some APMs, but to be frank that will only happen if they are using async_hooks in unsafe ways. I know at least the major ones are all generally written defensively enough that they should not be impacted.

I'm somewhat of the opinion that we should just not worry about that as it's very, very edge-case-y and exact timing/ordering of async_hooks should never have been relied on in the first place given that async_hooks itself can also be enabled/disabled at arbitrary times and therefore can easily be constructed in ways that can not resolve the complete graph--which is fine and expected, or at least is supposed to be expected.

GeoffreyBooth · 2022-12-02T18:57:42Z

GeoffreyBooth
Dec 2, 2022
Collaborator Author

I ran a quick-and-dirty benchmark to see if a no-op application loaded faster under the CommonJS loader as compared with the ESM loader. They’re equivalent:

So simply setting useESMLoader = true by itself shouldn’t have a performance impact, I would think; though the follow-up of piping module loading through the ESM loader’s load hook might very well be slower than however CommonJS currently does it. We could possibly preserve that part of the flow for the old code if we discover a performance regression there.

2 replies

Qard Dec 2, 2022
Collaborator

What's the difference when you have a tree with some depth though? Say five modules deep, but the modules do nothing but import/require the next thing? I suspect that is a more real-world example of what the performance would be like.

Either way though, I personally don't particularly care about startup performance. I've never found that to be nearly as important as actual runtime performance. 😅

GeoffreyBooth Dec 3, 2022
Collaborator Author

Making a series of files 0.js through 5.js, where each one is like import './1.js' or require('./1') etc until the last is just a semicolon, also shows no appreciable difference:

What’s more relevant I think is whether using the ESM loader for the CommonJS set of files is noticeably slower, and it doesn’t seem to be:

(The no-op --import is there to cause useESMLoader to evaluate to true without me needing to rebuild Node.)

I’m sure there must be some circumstances where it makes a noticeable difference; we could run the full benchmark suite on a build where useESMLoader is hard-coded to true to try to tease those out, or run a command like hyperfine --warmup 3 'node entry.js' 'node --import "data:text/javascript,;" entry.js' where entry.js is the entry point of a large CommonJS app, to measure if useESMLoader = true makes a real-world large CommonJS app noticeably slower, at least to start up.
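The chained-module setup described earlier could be generated with a short script along these lines. The helper names are hypothetical, and a single wall-clock measurement is only a smoke test; a tool like hyperfine remains the better way to compare.

```javascript
const fs = require('node:fs');
const os = require('node:os');
const path = require('node:path');
const { execFileSync } = require('node:child_process');

// Write 0.js .. <depth>.js, each requiring the next; the last is a semicolon.
function makeChain(depth) {
  const dir = fs.mkdtempSync(path.join(os.tmpdir(), 'chain-'));
  for (let i = 0; i <= depth; i++) {
    const body = i < depth ? `require('./${i + 1}');\n` : ';\n';
    fs.writeFileSync(path.join(dir, `${i}.js`), body);
  }
  return dir;
}

// Run the chain's entry point once and return the elapsed milliseconds.
function timeRun(dir, extraArgs = []) {
  const start = process.hrtime.bigint();
  execFileSync(process.execPath, [...extraArgs, path.join(dir, '0.js')]);
  return Number(process.hrtime.bigint() - start) / 1e6;
}

const chainDir = makeChain(5);
const cjsMs = timeRun(chainDir);
```

On a Node version that supports --import, calling `timeRun(chainDir, ['--import', 'data:text/javascript,;'])` would exercise the same files with the ESM loader handling the entry point, as in the hyperfine command above.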

GeoffreyBooth · 2022-12-05T17:45:51Z

GeoffreyBooth
Dec 5, 2022
Collaborator Author

I wanted to split this discussion (#45711 (reply in thread), #45711 (reply in thread), #45711 (reply in thread), #45711 (reply in thread), #45711 (reply in thread)) out into its own thread so it doesn’t get lost in the noise:

What expectations should users have regarding new features being supported for CommonJS apps versus ESM apps?

In other words, when is it acceptable to ship a new feature that only works in CommonJS? Or only in ESM? What if shipping it for both requires breaking changes to the CommonJS module loader or to async hooks?

Both CommonJS and ESM are marked as stable, and have been for years (the last version of Node that had ESM as experimental has gone EOL already). When users ask, as they often do, about whether (or when) Node will deprecate CommonJS we always reply that there are no intentions to do so, and that both module systems are fully supported as equals and first-class.

In my mind, the implications of that statement are that new features need to support both module systems before the new feature can become stable (if not even earlier in the development process) and that existing features that lack equivalent support in both systems can be said to have a bug that needs fixing, unless there’s something fundamental about the feature that precludes parity.

Alternatively, we could change our public position on this. As @joyeecheung wrote above, “I don’t think [the CommonJS loader] being a non-legacy means it needs to get new features. It just needs to be stable and not change much (including not going to get any fancy new features that would break potential edge cases) beyond bug fixes or security fixes. I think sacrificing compatibility (e.g. by making it async in any way) for new features really goes against how we had been advertising it with - an almost-locked, stable, as backwards-compatible as possible component.” Whatever we call this, and both “legacy” and “stable” seem like not-quite-right fits, this is a different promise than both systems being first-class and fully supported.

If the spectrum spans from “support both fully” to “CommonJS is locked,” then what doesn’t fit would be new features that only work in CommonJS. That would imply that we’re not supporting ESM fully, when I don’t think there’s any doubt that ESM is stable and not on any path toward legacy or deprecation. If we want to allow new features that are ESM-only, I think that would be fine, providing that we state somewhere that CommonJS users should have no expectation of new features working in CommonJS; but the reverse doesn’t make sense. I’m also fine with a rule that says that new features must work in both unless something fundamentally precludes them from doing so, and the bar for shipping “only works in one system” features should be very high.

The practical implications of this question on the discussions above are related to --loader supporting CommonJS (it wouldn’t have to if we consider the CommonJS loader locked/legacy) and startup snapshots (it would not be allowed to become stable until it supported ESM). I don’t think we should allow a policy of new features only supporting one module system or the other, where either system could be chosen, as that would leave users with the worst of both worlds: some new features would only work in CommonJS but other new features would only work in ESM, and therefore some combinations of features would be impossible to use together.

25 replies

joyeecheung Dec 9, 2022
Maintainer

and we can’t tell our users “sorry, but doesn’t work for ESM apps because it was too much work for us to either get the feature working with the ESM loader we have or to refactor the ESM loader as needed.”

Why not as long as the thing is still experimental? We have been effectively saying something similar for Web APIs in many builtins. I think it's totally fine to tell users "sorry, this experimental feature doesn't work for X" no matter what X is and what the reason is. I think being experimental already implies that it's WIP and can be broken in some use cases. Actually we even had done that for ESM for some time since it initially landed as an experimental feature ("sorry, you can't do require('./foo.mjs') in CJS apps, and you can't even do import() or import.meta in ESM, but that's okay, because ESM is experimental, and we'll figure out how to let you do it later").

jasnell Dec 9, 2022
Maintainer

Exactly. There should be no expectation while a thing is experimental and WIP that it will work for all uses. Otherwise it becomes way too difficult to make progress

GeoffreyBooth Dec 9, 2022
Collaborator Author

There should be no expectation while a thing is experimental and WIP that it will work for all uses.

People keep bringing this up but it’s not part of the original question. I was asking about new features becoming stable. Clearly experimental features will need some leeway to be broken or incomplete so that people can land reasonably-sized chunks. It seems like we have consensus though that a feature should work for ESM users before becoming stable, so I think we can consider that question settled.

I do have a concern about a feature becoming relied upon by users while never graduating to stable, like async hooks. That’s a problem I would like a solution for, whether it’s different release lines or a “beta” status between experimental and stable or something else. Criteria for a feature to become stable are meaningless if a feature can stay experimental forever and get wide adoption as such.

joyeecheung Dec 9, 2022
Maintainer

I think we are far from the point where we need to start worrying about these features blocking the adoption of ESM. The most important reason why ESM still isn't adopted by the ecosystem is that the interop between ESM and CJS alone still doesn't allow painless migration. Breakages in async hooks or startup snapshots are only going to affect a small proportion of the users (many users aren't even aware of their existence), while breakages in module loading are going to affect many more people. When most of the ecosystem can painlessly migrate to ESM without running into issues in module loading (e.g. not having to ask for changes in your dependency regarding exports), we probably would've already solved any additional breakages in other power features, so I don't think this is the time when we should prioritize making sure all the features support ESM. It might be different if we are talking about an "experimental but everyone uses it" feature, but I don't see any new feature becoming as widely used as the module loading itself. At least I think the concern is too hypothetical to be worth any investment in a separate release line (personally I don't think anything is worth the work of a separate release line, given how much we struggle with build these days, unless we suddenly start to have a flock of contributors working on build full-time).

Qard Dec 9, 2022
Collaborator

Async_hooks should be legacy. That's basically it. It hasn't really been "experimental" for a long time.

arcanis · 2022-12-05T20:14:25Z

arcanis
Dec 5, 2022

This will probably be a very controversial point, but ESM asynchronous nature adds so many challenges, would it make things easier to turn them into CJS? Then we could specify that loaders may export both sync and async versions of their hooks, and use one version or another depending on the context. ESM would use the async version, whereas CJS would use the sync version. Keeping sync and async in sync would be up to loaders' authors.

I know a lot of work was already done on the assumption that loaders are implemented in ESM (including off-threading), and I'd myself have to modify the (experimental) loaders my team already ships, so I understand if that doesn't fit the direction.
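The dual-hook idea could look roughly like the sketch below. Everything in it is hypothetical, including the resolveSync name, the hook signature, and the default resolvers. Node does not provide such an API; the sketch only shows how a loader might expose both variants and how the caller might pick one by context.

```javascript
// Hypothetical loader exporting a sync and an async version of one hook.
// The shared helper is what keeps the two variants "in sync".
function pickFormat(specifier) {
  return specifier.endsWith('.ts') ? 'typescript' : undefined;
}

const exampleLoader = {
  resolveSync(specifier, next) {
    const format = pickFormat(specifier);
    return format ? { url: specifier, format } : next(specifier);
  },
  async resolve(specifier, next) {
    const format = pickFormat(specifier);
    return format ? { url: specifier, format } : next(specifier);
  },
};

// Hypothetical defaults standing in for the built-in resolution.
function defaultResolveSync(specifier) {
  return { url: specifier, format: 'commonjs' };
}
async function defaultResolve(specifier) {
  return { url: specifier, format: 'module' };
}

// require() would use the sync variant and get a plain value back.
function resolveForRequire(loader, specifier) {
  return loader.resolveSync(specifier, defaultResolveSync);
}

// import would use the async variant and get a promise back.
function resolveForImport(loader, specifier) {
  return loader.resolve(specifier, defaultResolve);
}
```

Here `resolveForRequire(exampleLoader, './a.ts')` returns the result directly, while `resolveForImport(exampleLoader, './a.ts')` returns a promise for the same shape.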

3 replies

JakobJingleheimer Dec 5, 2022
Collaborator

I dunno if that's controversial, just factual: ESM has concerns that CJS doesn't.

JakobJingleheimer Dec 5, 2022
Collaborator

I dunno if making a custom loader potentially CJS solves any problems, but it sure as heck creates a few very difficult problems that Node.js has specifically avoided for a long time. MAYBE if we've merged node's module loader it would be more feasible, but several people smarter than me have called "not it", so I'd be nervous about tackling it.

GeoffreyBooth Dec 5, 2022
Collaborator Author

ESM asynchronous nature adds so many challenges, would it make things easier to turn them into CJS?

Making user loaders CommonJS, or sync, wouldn’t help matters; the ESM loader itself generates async activity even with no user loaders present. And we can’t transpile ESM to CommonJS at runtime, that’s a nonstarter, not least because it would mean breaking spec (no more top-level await, broken execution order, etc.).

It has occurred to me, regarding #45711 (reply in thread): what if we replaced the ESM loader’s async readFile with readFileSync like the CommonJS loader has? And generally refactored the rest of the ESM loader to be entirely synchronous like the CommonJS loader is, if that’s even possible? (And user loaders would be able to stay async if we can land the off-thread PR, or they would change to needing to be sync too.) Then all of these problems would go away: the async hooks tests would pass under ESM, etc.

But I’m pretty sure this isn’t an option, probably for spec reasons; it just wouldn’t be spec-compliant ESM if it were sync, I’d assume. And the refactoring would be immense; there are a lot of async functions in the ESM loader. So it’s probably pie-in-the-sky. But is it an option, if perhaps a distant one?
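To make the difference concrete, here is a rough sketch that counts the `init` calls an `async_hooks` hook sees during a synchronous read versus a promise-based read. It only illustrates why a loader built on async file reads produces extra hook calls. It is not the ESM loader's code, and the `countInits` helper and the temp file are assumptions for the example.

```javascript
const async_hooks = require('node:async_hooks');
const fs = require('node:fs');
const os = require('node:os');
const path = require('node:path');

// Temp file standing in for a module's source (an assumption for this demo).
const dir = fs.mkdtempSync(path.join(os.tmpdir(), 'hooks-demo-'));
const file = path.join(dir, 'module.js');
fs.writeFileSync(file, 'module.exports = 42;\n');

// Hypothetical helper: counts async resources created while `fn` runs.
// The init hook only increments a counter; logging inside it is avoided
// because console output can itself create async resources.
function countInits(fn) {
  let inits = 0;
  const hook = async_hooks.createHook({ init() { inits += 1; } });
  hook.enable();
  try {
    fn();
  } finally {
    hook.disable();
  }
  return inits;
}

// Synchronous read, as the CommonJS loader does.
const syncInits = countInits(() => { fs.readFileSync(file); });

// Promise-based read, comparable to an async loader.
let pending;
const asyncInits = countInits(() => { pending = fs.promises.readFile(file); });

// Clean up once the async read has settled.
pending.catch(() => {}).finally(() => fs.rmSync(dir, { recursive: true, force: true }));

console.log({ syncInits, asyncInits });
```

The synchronous read should register no async resources, while the promise-based read registers at least one. The exact count depends on the Node.js version, which is the kind of detail the failing tests assert on.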

bmeck · 2022-12-09T00:42:33Z

bmeck
Dec 9, 2022

Just for context's sake, the original ESM loader was in C++. I believe it was ported to JS in 2019 in an effort to make it easier to hack on, to mixed effect.

…

On Thu, Dec 8, 2022, 4:55 PM Geoffrey Booth wrote:

If we had a better ESM loader implementation I would’ve agreed that it’s time to prioritize ESM, but unfortunately we don’t. It’s sad that we already have technical debt in ESM loader even before it actually gets wide adoption, but I don’t think making the technical debt our bottleneck is the solution of this problem.

I don’t think it’s productive to blame the ESM loader, as if new features supporting ESM is blocked by some undefined refactoring that the loader needs. I’m sure the ESM loader can be improved, but it’s doing a lot of things—much more than the CommonJS loader—and it’s been stable for years. There are lots of parts of the codebase that can be improved, and we can’t tell our users “sorry, but doesn’t work for ESM apps because it was too much work for us to either get the feature working with the ESM loader we have or to refactor the ESM loader as needed.”

It’s time to prioritize ESM because if we don’t, any feature that doesn’t work in ESM is incurring technical debt that we’ll have to pay off when lack of ESM support goes from annoying to unacceptable. So if there’s no easy way to get snapshots working with ESM without refactoring the ESM loader, then whoever is championing snapshots will need to refactor the ESM loader (or find a way to ship snapshots ESM support without such a refactoring).

Most of the original authors of the ESM loader aren’t around anymore.

0 replies

Flarna · 2022-12-09T17:49:45Z

Flarna
Dec 9, 2022
Collaborator

If loader hooks are async, then as a result the ESM loader must also be async. Would this imply that something like the following no longer works when loader hooks are used, because require is then async?

const winston = require("winston")
const logger = winston.createLogger();

Or is this made sync again by moving the whole async loading part into the upcoming loader thread and doing an event-loop-blocking wait on the actual main/worker thread?
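As a sketch of that second option, the general "block this thread while another thread does the async work" pattern can be built from `worker_threads`, a `SharedArrayBuffer` and `Atomics.wait`. Everything below is an assumption for illustration (the `resolveSync` name, the message format, the 10 ms timer standing in for async hooks). It is not how Node.js implements or plans to implement the loader thread.

```javascript
const { Worker, MessageChannel, receiveMessageOnPort } = require('node:worker_threads');

// Source for the stand-in "loader thread": it does async work, posts the
// result, then wakes the thread that is blocked waiting for it.
const workerSource = `
  const { workerData } = require('node:worker_threads');
  const { port, lock } = workerData;
  port.on('message', async (specifier) => {
    // Stand-in for async loader hooks.
    await new Promise((resolve) => setTimeout(resolve, 10));
    port.postMessage('resolved:' + specifier);
    Atomics.store(lock, 0, 1);
    Atomics.notify(lock, 0);
  });
`;

const { port1, port2 } = new MessageChannel();
const lock = new Int32Array(new SharedArrayBuffer(4));
const worker = new Worker(workerSource, {
  eval: true,
  workerData: { port: port2, lock },
  transferList: [port2],
});

// Hypothetical synchronous facade over the async work done in the worker.
function resolveSync(specifier) {
  Atomics.store(lock, 0, 0);
  port1.postMessage(specifier);
  Atomics.wait(lock, 0, 0); // blocks this thread, including its event loop
  return receiveMessageOnPort(port1).message;
}

const result = resolveSync('winston');
console.log(result);

// Let the process exit once the demo is done.
worker.terminate();
```

From the caller's point of view `resolveSync` behaves like a plain synchronous function, so code such as `const winston = require("winston")` would keep working. The cost is that the calling thread does nothing else while it waits.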

0 replies


Replies: 9 comments · 63 replies
