Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Will module modes be non-deterministic? #1

Closed
robpalme opened this issue Nov 21, 2018 · 23 comments
Closed

Will module modes be non-deterministic? #1

robpalme opened this issue Nov 21, 2018 · 23 comments

Comments

@robpalme
Copy link
Contributor

The doc says:

(a) a given module location can only be instantiated in one mode (CJS or ESM)
(b) the mode observed by "import" is deterministic based on the package that contains the module
(c) we're not changing the behaviour of require()

So far so good. Now assume we have an ESM package containing a leaf-node module (no dependencies).

  • If someone require()s the module (perhaps via a deep-import), it will be loaded as CJS.
  • If someone import()s the module, it will be loaded as ESM.

Only one of these can succeed due to (a). Presumably the other will throw. Now we have a module whose mode is determined by whoever dynamically loads it first. The seems like it could be non-deterministic and subject to races.

Are we introducing the possibility of non-deterministic mode selection?

@GeoffreyBooth
Copy link
Owner

@bmeck should probably answer this as he’s more the expert, (and please chime in to correct me), but my thinking is that a “dual-mode” package should really be thought of as two packages that happen to share the same package name/bare specifier. So say lodash jumps on the ESM bandwagon and adds an "exports" field to its package.json, but it also has a "main" field. Say "exports" points to an esm/ subfolder in the package and "main" points to a cjs/ subfolder. If you import lodash you’ll get literally a different package than if you require it, in any implementation (the new one we’re working on or in --experimental-modules). This is the “identity” problem that @bmeck was talking about.

Most of the time one of those subfolders will be just the transpiled output of the other, like cjs/ being the Babel output of esm/, but there’s no reason they couldn’t each be hand-crafted; and there’s no reason they couldn’t have entirely different code. Maybe the ESM version of Lodash has a few more functions than the CJS version does. They’re literally two different packages. Even if one is just the transpiled version of the other, that doesn’t mean they’re equivalent; there could be subtle differences in behavior between say native async functions and Babel’s polyfilled versions, or the Babel transpilation itself could have bugs. No matter what, the required and imported versions of lodash are at least a little different, even if the difference is unnoticeable.

For a package like Lodash consisting entirely of static functions, that doesn’t really matter. But a package where a singleton object stores state in the object will have a really hard time with this behavior, as there’s really two copies of the object and two copies of the state. (@bmeck please correct me if I’m screwing this up.) This is the danger inherent in dual-mode packages, something that dual-mode package authors will need to be aware of; it might be a reason to not release a package as dual-mode, or to release the ESM version under a new name (or deprecate the CJS version behind a breaking change version bump). If it’s enough of an issue Node could decide to not support dual-mode packages at all.

Another thing to consider is different packages in the module graph loading the same package. Say your project imports lodash, but one of your project’s CommonJS dependencies requires it. Now both the ESM and the CJS versions of lodash are imported into the module graph. This is similar to different versions being imported, like say your project imports lodash@2 and a dependency imports lodash@1, and each package gets the version they’re expecting. Package boundaries can help with distinguishing the ESM/CJS versions, just as they currently help (I think) with distinguish different semver versions of a package. Another potential solution to this identity problem could be restricting a package boundary to a single ESM/CJS version of a package. Just as your project can’t both require lodash@1 and lodash@2 within the same package, your package would also need to choose between importing either the ESM or CJS version of lodash.

@robpalme
Copy link
Contributor Author

This question is not really about dual-mode packages. We face this problem even with pure ESM packages into which people can perform a deep require.

Whilst we can say "don't do that", I just want to be clear on what we are specifying will happen in that case. I expect it will arise in practice if well-known packages attempt to convert to ESM without renaming.

@GeoffreyBooth
Copy link
Owner

So the “deep require question” I assume means what happens when you write code like this?

import "dual-mode-package/ambiguous.js";
// const require = createRequire...
require("dual-mode-package/ambiguous.js");

Where ambiguous.js contains neither import/export statements nor require or exports or any other CommonJS global, so the file actually could be parsed as either CommonJS or as ESM. And this could be an issue because the file might behave differently if it’s parsed as ESM and therefore gets automatic strict mode. (See demo of that.)

I guess Node would have no choice but to throw in such a circumstance, because like you say it would be a crapshoot as to whether the import statement or the require statement happened first. Node could also throw if you try to both import and require the same bare specifier; technically it wouldn’t need to, but it’s probably safer to do so as I don’t know what valid use case there would be for loading both the ESM and CJS versions of a dual-mode package. The important part would be to restrict everything to a package boundary, so that my project could import lodash but a dependency could require it.

@robpalme
Copy link
Contributor Author

Yes. The code example captures the problematic case.

And it can all be side-stepped if users are careful to stick to public package entrypoints.

When you say, "Node would have to throw", would that throw be:

  1. Upon the load in which a mode-switch is attempted, or
  2. Immediately on any attempt to deep require into an ESM-only package

@Fishrock123
Copy link

a given module location

This means absolute filepath on disk, right? Does that have any browser implications?

@robpalme
Copy link
Contributor Author

@Fishrock123 It's the resolved location, which for Node will be a fully qualified file path, yes.

@Fishrock123
Copy link

Fishrock123 commented Nov 21, 2018

Ok I have another question here too:

Given this case:

import "dual-mode-package/is-dual-mode.js";
// const require = createRequire...
require("dual-mode-package/is-dual-mode.js");

Does require() now get a Promise?

If so I think that is possibly problematic because suddenly one case does not work out of the box.

@GeoffreyBooth
Copy link
Owner

I don’t know when that throw would be. Perhaps @bmeck can answer. I’m not sure how require works in an ESM context. I know (I think) that all import and export statements are statically evaluated and modules loaded before code gets evaluated (?) and so issues related to specifiers can throw at that point before any code is executed; but I assume that can’t be the case for require, so the throw would happen when require is evaluated. Maybe that’s too late, I guess is what you’re implying, and so it should be addressed.

I think this is an issue somewhat unrelated to this proposal, that both implementations have now, since both implementations have both import statements and createRequire. So perhaps open this as an issue in the main repo or in the ecmascript-modules repo? Or is there something particular to this proposal that’s affected by this?

@Fishrock123 I don’t think browsers are affected by this because browsers can’t import CJS and they also don’t have require. I guess the issue we’re discussing is related to what would happen in this case, where ambiguous.js could be either a Script or a Module as in https://github.com/GeoffreyBooth/browser-equivalence-edge-case:

<script src="ambiguous.js"></script>
<script type="module" src="ambiguous.js"></script>

I don’t know how browsers handle that, but presumably we should do something similar.

@GeoffreyBooth
Copy link
Owner

Does require() now get a Promise?

I don’t think so, because that would be a breaking change in terms of how require works. I think require is always synchronous, even the require generated via createRequireFromPath. But someone else who understands that feature better should please correct me.

@robpalme
Copy link
Contributor Author

@GeoffreyBooth To be clear, my feeling is that module type selection ought to be statically analysable. This is an important invariant. Races are bad news for a module system.

Whilst we could debate which repo this issue goes into, the reason we are discussing it is because the choices/constraints declared by this proposal lead to the question. I don't think we can declare this proposal accepted/coherent until we answer it.

@robpalme
Copy link
Contributor Author

@Fishrock123 require will never wrap the result in promise. The need to preserve backwards compatibility prevents that from ever happening.

@robpalme
Copy link
Contributor Author

@GeoffreyBooth I'll try be more constructive and direct.

This proposal chooses to use package definitions to determine the types of the modules contained within the package. This seems reasonable.

I think we should take that to the logical conclusion: if someone violates the static type designation by attempting to dynamically load the module using a different mode, we ought to guarantee that fails, i.e. the require must throw.

This preserves the integrity of the static claims and makes the module graph statically analysable.

@GeoffreyBooth
Copy link
Owner

I’ll try be more constructive and direct.

Much appreciated 😄

Again I’m not the expert on how Node’s module loading system works, I haven’t contributed code to it, so someone else should address this. If you feel confident, would you mind submitting a PR to this proposal describing the behavior you expect to happen and under what conditions? And others who know the loading system well can review it and we can discuss the best way to handle the issue.

@robpalme
Copy link
Contributor Author

@GeoffreyBooth Yes, I will send you the PR tomorrow morning. It is late in London now 😉

@Fishrock123
Copy link

require will never wrap the result in promise. The need to preserve backwards compatibility prevents that from ever happening.

import "dual-mode-package/is-dual-mode.js";
// const require = createRequire...
require("dual-mode-package/is-dual-mode.js");

In this case, wouldn't that still be non-deterministic?

Say it is a "dual mode" package.

If you import it first, an ESM is loaded. To load an ESM source in CJS from require() you MUST return a promise. If we aren't doing that, you need to load the CJS version, if that happens, you have two versions of the same module loaded.

If you require() it first, both get CJS.

@GeoffreyBooth
Copy link
Owner

@robpalme please see #3 before you start your PR, I’m assuming you’ll probably want to base off of that (assuming the rest of the team likes it 😄)

@Fishrock123 The proposal has require of ESM as unsupported, perhaps you meant import()?

I did a little experiment. I created an index.html with this:

<html>
	<head>
		<script src="./ambiguous.js"></script>
		<script src="./ambiguous.js" type="module"></script>
	</head>
	<body></body>
</html>

And ambiguous.js with this:

var mode = this === undefined ? 'In strict mode' : 'In sloppy mode';

console.log(mode);

And served them via http-server, and loaded them in Chrome. In the console I see:

In sloppy mode
In strict mode

So it seems to me that the browser loads the same file twice, once as a Script and again as a Module, and executes both. Oddly, whether I put the type="module" first or second, I always see In sloppy mode printed first. Not sure what this all means, but it’s at least not throwing an error.

@Fishrock123
Copy link

Fishrock123 commented Nov 23, 2018

Well, what I was getting at is that, as I understand it, if dual-mode-package is a dual mode package:

// const require = createRequire...
require("dual-mode-package"); // loads cjs

import "dual-mode-package"; // now loads same cjs
import "dual-mode-package"; // loads esm
// const require = createRequire...
require("dual-mode-package"); // error?

or, replace those import statements with import() functions.

@guybedford
Copy link
Contributor

Implementing a stateful error like this seems incredibly brittle if it is relying on "first load" or similar.

Also bear in mind that if a dependency already loads a dual mode package as CommonJS, that stopping an app user from loading modules seems like an unnecessary restriction.

The whole point of the constraint of having a single interpretation per file is that we get predictable loading.

I haven't yet see anyone propose a deterministic way to do this that won't lead to any of the above issues. And without that I personally don't see a problem with having the ESM loader having its own classification of modules (and which absolute paths to defer to the CJS loader), while the CJS loader is treated as separate legacy loader which isn't changed to support backwards compatibility.

The above gives us the guarantee that for the ESM loader, there is only one interpretation of each file. While yes dual mode packages in the CJS loader might result in duplicate interpretations in some cases, but that seems preferable to sacrificing backwards compatibility or forward compatibility even.

@GeoffreyBooth
Copy link
Owner

So I did some reading after my experiment above. According to this article, browsers treat <script mode="module" as defer by default, so they will always be loaded after any traditional non-module <script> tags. That’s why I was always seeing “In sloppy mode” first.

Since import and export statements are statically analyzable, couldn’t Node always load them first? And that would just be the rule: all ESM imports and exports load first, and if you try to require the same thing then it either happens second (if we allow such things) or throws (if we don’t). This would be the opposite of browsers’ always loading scripts before modules, but at least it’s deterministic.

@Fishrock123 in your most recent example, you have import "dual-mode-package"; // now loads same cjs; per this repo’s proposal, an import statement of a dual-mode package would always load the ESM version of that package. Even if we make that correction, though, I think imports and exports are still evaluated in a different phase than require. There was a lot of talk in earlier issues about parse phases and so on within Node’s module loader, and require statements would come at the end of the process because require requires code evaluation whereas import does not, so I think import and export would always happen first. There’s some discussion of this in nodejs/modules#81, I welcome someone who understands the phases to please correct me and/or explain this better than I can.

@GeoffreyBooth
Copy link
Owner

From @guybedford in #3 (comment):

Because you can have dynamic import. You can also run CommonJS first via -r or other process hooks like in loaders. . . .

Personally I would be fine with import getting the ESM version and require getting the CJS version, I think that would align with most user intuitions for dual mode packages. If we really don’t want this then the solution is simple - don’t have dual mode packages at all!

I think there’s a case to be made for giving users the rope to potentially hang themselves if that’s what they choose. Node need not catch every mistake. I think what we should focus on is ensuring that if a race condition occurs because of dual loading, it should be the result of an intentional user action (like --require dual-mode-package before a file with import 'dual-mode-package') and not because of something like a loader or a dependency pulling in a package in a different mode. We need to protect the user from other people’s code causing a race condition, but not the user’s own code.

There is also a middle ground between allowing and banning dual-mode packages: ban import interoperability of dual-mode packages. So a CommonJS file can require('dual-mode-package') and an ESM file can import 'dual-mode-package', but Node would throw if the CommonJS file tried to import() that package or the ESM file tried to require (via createRequireFromPath) that package. This middle ground would ensure that each context, CommonJS or ESM, would only get exactly one version of a package.

@bmeck
Copy link

bmeck commented Nov 26, 2018

I'd like to separate the ideas of implementation preventing improper usage vs static definition of formats.


We have a proposal only covering import and asserting forming a static knowledge of formats applied to import.

require will never be sanely locked down due to a variety decisions that prevent any future APIs from being able to avoid collisions:

  1. require.extensions forces conversion to CJS even if other formats eventually become supported
  2. Any unknown file will be treated as CJS (such as .wasm, .coffee, etc.)

Those two combine to make it so that any API that acts as a module loading system inherently collides with require due to having too loose of restrictions to allow for future compatibility when require was being designed and shipped.

The browser has a similar problem with <script type=text/javascript>. It loads any resource as a JS Script, regardless of if the format is declared to be a PNG image or even an HTML page.

The solution is to define statically what formats things are. We cannot replace/"fix" require's logic. However, we can declare that require does not respect formats. Even if we are currently only talking about import in this proposal, the ability to define formats is universal an can apply to many things. Therefore, any API re-using the same format logic can become safe to use in the future by ensuring it doesn't accidentally occupy more space than expected, thus avoiding part of the problem that require faces in overreach.


Now, preventing usage in which a module is loaded in the wrong format.

Currently require has no knowledge of formats. We could add warnings or throw or even abort when it loads something that is not in the expected format but we encounter problems that make any such approach brittle as @guybedford talks about.

  1. Adding logic to require.extensions means that some things could have both a CJS and ESM loading path. Which one should prevent the other from working? Since ESM loaders are not being looked at as dynamically loaded, we can apply them to format analysis first. However, require is the backwards compatibility path and if we treat ESM as first that means that CJS loading using require.extensions breaks.
  2. This proposal allows package.json to contain both exports and main which means that even if it defines a package to be ESM, a file at main is expected to be CJS. We could prohibit ESM loaders from allowing such collisions by giving warnings/throwing/aborting. However, this main could just be a delegation file that makes require('foo') act similar to import('foo'). In addition exports as defined in https://github.com/jkrems/proposal-pkg-exports allows for people to prevent ever loading the main as ESM:
{
  "main": "cjs.js",
  "exports": {
    "": "esm.js",
    "cjs.js": "esm.js"
  }
}

Since cjs.js would be loaded as ESM if it could ever be loaded from an in-package import it would still have a known format, but I would still expect it to be the ESM format and the usage of main to fall under legacy behavior since it deals with require which is an API that doesn't handle formats.


I would rather we just let people see the errors and see how people are working around them than trying to make these trade offs our selves.

We cannot prevent some format problems due to how require played loose with formats and didn't define a static system for determining a format that could be shared over time. I'd expect dual mode packages to be fairly rare given the problems when react tried to introduce them and ended up completely removing support in their libraries and recommending avoiding them after various problems. We should follow the logic that the community is already moving away from the idea of dual mode packages due to this.

@GeoffreyBooth GeoffreyBooth mentioned this issue Nov 30, 2018
robpalme added a commit to robpalme/node-import-file-specifier-resolution-proposal that referenced this issue Jan 14, 2019
Resolve the open issue of what happens when a file is requested to be loaded in both modes: ESM and CommonJS.

See GeoffreyBooth#1 for discussion.
robpalme added a commit to robpalme/node-import-file-specifier-resolution-proposal that referenced this issue Jan 14, 2019
Resolve the open issue of what happens when a file is requested to be loaded in both modes: ESM and CommonJS.

See GeoffreyBooth#1 for discussion.
@robpalme
Copy link
Contributor Author

Thanks for all the feedback.

From @guybedford it seems clear that we don't want the ambiguous case to result in a brittle non-deterministic runtime error. I also buy @bmeck 's argument that we also cannot forbid require() from overriding any interpretation semantics we define for the modern import syntax. And finally, we're all agreed that require() won't provide access to loading ES modules.

To solve these constraints, I propose permitting dual-instantiation of files. Effectively each module system (ESM and CJS) gets its own registry/namespace to store instantiated modules. In my opinion, this is a relatively small change purely to provide well-defined semantics. The wording for this can be found in #11

@GeoffreyBooth
Copy link
Owner

Resolved via #11 (I think, please reopen if otherwise).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

5 participants