This repository has been archived by the owner on Sep 2, 2023. It is now read-only.

Discussion about an implicit (or explicit) "node" scheme #169

Closed
SMotaal opened this issue Aug 15, 2018 · 28 comments

@SMotaal

SMotaal commented Aug 15, 2018

This complements #168 with a more open discussion about the notion of assuming an implicit (or explicit) node scheme which builds on the existing internal/implicit use of node: in experimental-modules.

This thread opens the discussion about the notion that a theoretical (or experimental) Node.js ESM loader can be considered the one and only implementation for a node+file: scheme.

Other schemes which are not supported by such a loader (like node+https:) can still be valid schemes that live in the realm of other platforms like Electron, or be handled by third-party handlers that may or may not interface with node's own loader.

@SMotaal
Author

SMotaal commented Aug 15, 2018

@jkrems

> Would a protocol like commonjs:// be more clear?

Absolutely, which is more ambiguous than node+commonjs: (dropping node+ since it is always within the node scheme). Funny enough, commonjs: is not relevant for CommonJS modules, since the only time a specifier is actually a file: URL (not an OS-specific path) is "theoretically" when importing into ESM or when using import(…) instead of require(…) in cjs.

> The disadvantage of a protocol solution (last time I tried it) was that it becomes weird for non-absolute things. commonjs://./foo.node looks really weird and isn't quite how "normal" URLs would work.

Certainly, those are merely conceptual schemes for design purposes (and possibly internal resolution algorithms). The only time you would need to explicitly refer to commonjs://./foo.node is when commonjs: is defined in isolation. Within Node (or Electron) the idea is that an implied node scheme exists, which presupposes an explicit referrer (an absolute URI for the requesting module) along with any absolute or relative specifiers, the main module maybe being the one exception. This means that it resolves to commonjs://./foo.js only internally, which defers to the commonjs loader that converts this file-like URI to an OS-specific file-system path.
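
To make the "looks really weird" point concrete, here is a minimal sketch using the WHATWG `URL` class that Node exposes as a global. The `commonjs:` scheme is only the hypothetical one from this thread, and the `/app/main.js` referrer is a made-up path. Nothing here reflects how Node's loader actually resolves modules.

```javascript
// Hypothetical: `commonjs:` is not a scheme Node resolves; it is only parsed here.
const weird = new URL('commonjs://./foo.node');

// After `//` the parser reads an authority, so `.` ends up as the host
// rather than meaning "the current directory".
console.log(weird.hostname); // '.'
console.log(weird.pathname); // '/foo.node'

// A relative specifier only gets its usual meaning once it is resolved
// against an absolute referrer URL (made-up path for illustration).
const referrer = 'file:///app/main.js';
const resolved = new URL('./foo.node', referrer);
console.log(resolved.href); // 'file:///app/foo.node'
```

This is why a scheme-prefixed relative specifier would only make sense as an internal, already-resolved form and not as something written in source text.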

@devsnek
Member

devsnek commented Aug 15, 2018

strong -1 on commonjs: or node+commonjs or whatever else... schemes refer to the protocol used to access a resource, not the type of the resource itself. it would be a misuse of urls to do so.

@SMotaal
Author

SMotaal commented Aug 15, 2018

@devsnek commonjs: is neither a protocol nor a scheme per se, but within the context of a URI-based module system (like ESM) that can theoretically load a commonjs module through an interface with that system, such specifiers must assume a "non-standard" scheme which takes the form of a URI that makes it possible to infer concrete and consistent parameters that will result in exactly one "wrapped" module instance throughout the runtime span.

@SMotaal changed the title from 'Discussion about an implicit (or explicit) node scheme' to 'Discussion about an implicit (or explicit) "node" scheme' on Aug 15, 2018

@ljharb
Member

ljharb commented Aug 15, 2018

I don't think it's likely that import will work in node with any schemes, including http/https, and i don't even think it should work with file:// - unless we have consensus on those, I don't think it makes sense to add a new scheme.

@SMotaal
Author

SMotaal commented Aug 26, 2018

@ljharb so I've thought about this for a while and wanted to push this discussion further but with more clarity.

Assume for a moment that I am not proposing that import … from 'node:…' or import('node:…') would be handled by node. I am simply saying that for a theoretical third-party implementation (ie bundlers, renderers... etc) parsing an ESM file that is somehow identified as a "node" ESM file (maybe MIME or not), a bare specifier will need to be associated with a scheme, at the very least to construct a URL relative to a base or referrer.

So this whole direction of the discussion about whether or not to support protocols in node files has nothing to do with my intent. Let's try extra hard to not let hot topics overshadow seemingly related issues which can be very critical.

Can you clarify if you are just against the node resolving part (which is not the intent) and if not, are you against specifiers being URLs in general? I am just trying to relay my own thinking to steer the conversation in a more productive path.
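
As a rough illustration of why a bare specifier needs some scheme before it can be turned into a URL relative to a referrer, consider this minimal sketch. It uses only the global WHATWG `URL` class. The referrer path and the `node:lodash` form are made up for the example and are not something Node resolves.

```javascript
// Made-up referrer for illustration.
const referrer = 'file:///app/src/main.js';

// A relative specifier resolves against the referrer as expected.
const relative = new URL('./util.js', referrer).href;

// A bare specifier has no scheme, so plain URL resolution treats it
// exactly like a relative path, which is not what a package lookup means.
const bare = new URL('lodash', referrer).href;

// Hypothetical: with an explicit scheme the specifier is already absolute,
// so the referrer no longer changes what it identifies.
const schemed = new URL('node:lodash', referrer).href;

console.log(relative); // 'file:///app/src/util.js'
console.log(bare);     // 'file:///app/src/lodash'
console.log(schemed);  // 'node:lodash'
```

A third-party tool that relies on URL resolution alone would therefore confuse `lodash` with `./lodash` unless it associates bare specifiers with a scheme of their own, at least conceptually.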

@ljharb
Member

ljharb commented Aug 26, 2018

Although anyone will be able to use loaders and transpilers to use any specifier they want - I’m against node having specifiers work by default. Every path in node is file:// and so does not need an explicit protocol, http/https URLs would be dangerous imo (altho these might end up being acceptable), and i don’t see the benefit of adding additional protocols. A protocol talks about how you access the format - but not what the format is. What access method needs disambiguation here?

(In addition, I’m firmly against encoding the module format anywhere but in the module itself - either in its contents (altho this won’t work for JS scripts vs modules), in its filename (ie, its extension), or in “closest package.json” metadata that overrides one of those two options)

@SMotaal
Author

SMotaal commented Aug 26, 2018

> http/https URLs would be dangerous imo
>
> I’m firmly against encoding the module format

Again, that is all beside the points I am trying to raise. And as far as node is concerned, I don't think I disagree with your logic.

> Every path in node is file:// and so does not need an explicit protocol

This imho is where I might be able to articulate my thinking better. My version of this is more like "while every path in node is file:// and so does not need an explicit protocol, not every specifier is a path".

> A protocol talks about how you access the format - but not what the format is. What access method needs disambiguation here?

For instance, bare specifiers are neither a path nor a URL until they are resolved by node's ESM subsystem to a file in some package. More importantly, bare specifiers that resolve to a dynamic module instance (at the very least every single builtin module, not to mention all non-ESM modules if you consider the --experimental-modules implementation) are not paths, but they are resources, just like blobs are resources, and they too have a scheme.
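
The distinction between path-like specifiers, specifiers that already carry a scheme, and bare specifiers can be sketched roughly as follows. `classifySpecifier` and its category names are invented for illustration and do not mirror Node's actual resolver. The sketch relies only on the global `URL` constructor throwing when a string has no scheme and no base is given.

```javascript
// Hypothetical helper: the category names are invented for illustration
// and do not mirror Node's actual resolver.
function classifySpecifier(specifier) {
  // Relative or root-relative: needs a referrer to mean anything.
  if (/^\.{0,2}\//.test(specifier)) return 'path-like';
  try {
    // Parses without a base only when the specifier carries its own scheme.
    const { protocol } = new URL(specifier);
    return `url (${protocol})`;
  } catch {
    // No scheme and not path-like: left to a resolution algorithm.
    return 'bare';
  }
}

console.log(classifySpecifier('./foo.js'));           // 'path-like'
console.log(classifySpecifier('file:///app/foo.js')); // 'url (file:)'
console.log(classifySpecifier('node:fs'));            // 'url (node:)'
console.log(classifySpecifier('fs'));                 // 'bare'
console.log(classifySpecifier('@org/package'));       // 'bare'
```

Only the last category has no URL form of its own, which is the gap an implicit scheme would conceptually fill.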

But here is where this all matters from my point of view, and @demurgos's example is spot on: we should not just be focusing on specifiers from the resolution aspect (not touching this one); we should also consider how they interoperate, at the very least inside the parent runtime, which happens to be v8/chrome+webkit, possibly chakra... and so on. This is just conceptual, not implementational.

@ljharb
Member

ljharb commented Aug 26, 2018

I could certainly see the internal - not user-visible - use of an implicit node:// protocol on bare specifiers, just like resolved file paths might have file:// and source text modules might have another protocol. However, i wouldn’t want users to be able by default to have two ways to represent a specifier to the same module (noting that ./foo and ../src/foo might refer to the same module, but I’d still consider these the “same way”). In other words, i don’t want such a user-visible signal, as mentioned in the linked comment.
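
The remark that ./foo and ../src/foo might refer to the same module can be checked with a small sketch. The referrer path is made up, and the `Map` is only a stand-in for whatever module map a runtime keeps, not Node's own data structure.

```javascript
// Made-up referrer: a module located in /app/src/.
const referrer = 'file:///app/src/index.js';

const a = new URL('./foo', referrer).href;
const b = new URL('../src/foo', referrer).href;
console.log(a === b); // true, both are 'file:///app/src/foo'

// Stand-in module map keyed by resolved URL: two spellings, one entry.
const moduleMap = new Map();
for (const specifier of ['./foo', '../src/foo']) {
  moduleMap.set(new URL(specifier, referrer).href, { specifier });
}
console.log(moduleMap.size); // 1
```

Both spellings go through the same relative-resolution step, which is different from offering a second, scheme-prefixed spelling for the same module.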

@SMotaal
Author

SMotaal commented Aug 26, 2018

> i wouldn’t want users to be able by default to have two ways to represent a specifier to the same module (noting that ./foo and ../src/foo …

Absolutely, my own thinking too. The source text specifier node:foo would be invalid (unless maybe a custom loader wants to play with that), and assuming it makes it somehow to the module wrap, the parent runtime will see it as node:[…]node:foo, and this will be an issue for the custom loader(s) and never a node thing to address.

An instantiated module "node:process" in an electron app for instance, is only accessible by an ES module when the source text is requiring or importing "process". It cannot be accessed from a <script type=module> unless you register a custom scheme "node:" and somehow map it to the module instance for that same module wrap (this last part is secondary to the issue at hand)

@devsnek
Member

devsnek commented Aug 26, 2018

@SMotaal
Author

SMotaal commented Aug 26, 2018

@devsnek I know, I blame you for all this discussion. If I did not see this, I think I would have remained relatively ignorant of the notion of the ModuleWrap's src and url and how they interplay with v8's Modules...

My concerns are with all other non-path ModuleWraps (dynamic modules like cjs) and how a simple spinoff on "node:builtins" which do not factor into node's ESM resolution at all, could potentially prevent conflicting coercions or assumptions in other realms of node's own ecosystem (for instance Electron & NW.js)

@devsnek
Member

devsnek commented Aug 26, 2018

@SMotaal the implementation has a lot of freedom about what it does for specifiers. the only constraint we have is that our specifiers must be valid javascript strings. we currently use urls because they're a nice generic interface for saying where something came from, but we can change it to be anything we want.

@SMotaal
Author

SMotaal commented Aug 26, 2018

That's true, and we usually aim for what is best for everyone involved in the ecosystem. Which is why I think node should be a little more clear (at least in concept) about why specifiers (aside from relative paths to specific files) make sense, which can happen if we align them to the theoretical notion of a scheme. Then we can tell someone who has no trouble understanding URLs, and who for all intents and purposes has associated specifiers with relative or absolute URLs, that a specifier resolved by node is a little different from file://, because sometimes it resolves a relative path slightly differently than file:// would, and it resolves a bare specifier, which does not really exist in the file:// protocol.

If I see URLs, and everyone tells me URLs, the browser says URLs, node tells me special characters (or windows path aspects) are not valid (ie expects proper file:// syntax), and I am quite the expert in my own domain, I even lookup the specs, but not necessarily in node's own story with modules…

This guy really wants your help, and that means making sure that our implementation (which does not resolve node:* at all) can be viewed as one that follows a special scheme, that the scheme has clear rules and clear exceptions to those rules, that sometimes a custom loader can break those rules, and that this is why you probably need to file an issue there or consider a different loader. This guy will likely try to look for a loader that clearly states that it conforms to the same conceptual scheme and is far more predictable than the one that only accidentally does until it does not.
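
The point about special characters needing proper file:// syntax can be seen with a minimal sketch that uses only the global `URL` class. The file name is made up. For real conversions, Node's `url` module provides `pathToFileURL` and `fileURLToPath`; they are left out here so the output does not depend on the operating system.

```javascript
// Made-up file name containing a space and a '#'.
const fileName = 'my file#1.js';

// Naively pasting the name into a URL: '#' starts a fragment,
// so part of the file name silently leaves the path.
const naive = new URL('file:///tmp/' + fileName);
console.log(naive.pathname); // '/tmp/my%20file'
console.log(naive.hash);     // '#1.js'

// Percent-encoding the name first keeps it inside the path.
const encoded = new URL('file:///tmp/' + encodeURIComponent(fileName));
console.log(encoded.pathname);                     // '/tmp/my%20file%231.js'
console.log(decodeURIComponent(encoded.pathname)); // '/tmp/my file#1.js'
```

Someone who thinks of specifiers purely as file paths would not expect the first result, which is the kind of surprise a clearly stated conceptual scheme is meant to explain.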

@SMotaal
Author

SMotaal commented Aug 26, 2018

I will somewhat proactively reiterate that this is simply conceptual. It will affect how we look at things and on the very rare occasion may influence some design choices, but in no way am I associating this with the idea that node will resolve node:module instead of, as alias for, or aliased by the specifier module (per the current --experimental-modules default). node:module is merely a ModuleWrap's url in the reference implementation and does not interfere with the ESM loader's resolve nor the ModuleWrap's (last I checked).

@SMotaal added the modules-agenda (To be discussed in a meeting) label on Sep 21, 2018

@SMotaal
Author

SMotaal commented Sep 21, 2018

I am hoping we can discuss this in our next meeting regarding:

  1. Specifiers are/not URLs (near future at least)
  2. Predictable resource indication (ie ModuleMap vs Debugging/Introspection)
  3. Implications for renderers like Electron and NW.js (and non-Chromium engines)
  4. Closing this issue?

@ljharb
Member

ljharb commented Sep 21, 2018

The next meeting overlaps with TC39, so a number of us will be unable to attend. It may need to wait until the following meeting.

@SMotaal
Author

SMotaal commented Sep 21, 2018

In that case, we would have a casual discussion and hold the more formal one next meeting.

@GeoffreyBooth
Member

So are we meeting this week? If not, can that be announced somewhere? I potentially have big plans for Wednesday at noon Pacific! 😋

@jkrems
Contributor

jkrems commented Oct 3, 2018

One reason I have been playing with a cjs-bridge:// scheme is that it cleanly expresses the difference between "this module represents 1:1 the contents of the resource that can be found at this URL" and "this module represents an artificial module whose source will not match what is at this URL". E.g. if I open the devtools and look at file:///some-commonjs.js, I would expect it to match what is in the file and not to be some cryptic generated wrapping code. Given that V8 won't ever realistically support running CommonJS natively, this makes CommonJS a special case where resource content and V8 won't ever actually match.
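
One way to picture the distinction is a pair of helpers that map a file URL to a bridge URL and back. This is only a sketch under stated assumptions: the thread does not say what a cjs-bridge:// URL would look like, so the shape used here (same host and path, different scheme) and the names `toBridgeUrl` and `fromBridgeUrl` are hypothetical.

```javascript
// Hypothetical mapping: keep host and path, swap only the scheme.
// The real shape of a cjs-bridge URL is not specified in this thread.
function toBridgeUrl(fileUrl) {
  const u = new URL(fileUrl);
  if (u.protocol !== 'file:') throw new TypeError('expected a file: URL');
  // Built as a string: the URL protocol setter does not allow switching
  // between special (file:) and non-special schemes.
  return 'cjs-bridge://' + u.host + u.pathname;
}

function fromBridgeUrl(bridgeUrl) {
  const u = new URL(bridgeUrl);
  if (u.protocol !== 'cjs-bridge:') throw new TypeError('expected a cjs-bridge: URL');
  return 'file://' + u.host + u.pathname;
}

const source = 'file:///some-commonjs.js';
const bridge = toBridgeUrl(source);

// The generated facade would be registered under the bridge URL,
// leaving the file: URL to mean "exactly what is on disk".
console.log(new URL(bridge).protocol);         // 'cjs-bridge:'
console.log(fromBridgeUrl(bridge) === source); // true
```

With a mapping like this, developer tooling could show the wrapper under the bridge URL and the untouched source under the file: URL, without the scheme ever appearing in user-written specifiers.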

@ljharb
Member

ljharb commented Oct 3, 2018

It seems like it puts the necessity of knowing the module format in the hands of the consumer, which a number of us find unacceptable.

@jkrems
Contributor

jkrems commented Oct 3, 2018

Not necessarily, depending on where it appears and how it's used. It can be used 100% internally for developer tooling niceness.

@bmeck
Member

bmeck commented Oct 3, 2018

@jkrems we might need to give it a more generic term; for things like WASM you can see existing implementations that load modules that are compiled to source text module records. The problem of modules being facades is not limited to CJS.

@jkrems
Contributor

jkrems commented Oct 3, 2018

Yeah, there are various overlapping concerns ("bridge module", "reflection util module", "not actually a file but compiled into node as a built-in"). For WASM my current assumption/hope is that it would end up as a "true" module in V8 because it would actually run as a module and should support linking etc. But that's pending additional work that we don't (directly) control.

EDIT: Added the example of node built-in as a protocol reason.

@MylesBorins
Contributor

Can this be removed from the agenda?

@SMotaal
Author

SMotaal commented Nov 7, 2018

I think it needs to be addressed down the road... So I'll remove the label, but keep it open please.

@SMotaal removed the modules-agenda (To be discussed in a meeting) label on Nov 7, 2018

@MylesBorins
Contributor

Can this be closed?

@SMotaal
Author

SMotaal commented Apr 20, 2019

I don't know @MylesBorins… If std: (or hopefully something more/less communicable) is moving forward, this thread might be of relevance.

The concepts of @nodejs as a package scope and an implicit/explicit node: locator scheme are not contentious concepts in my mind — c/p node:@nodejs/… and node:@org/package.

But I really think this discussion is best punted for the time being.

Do we close and reopen?

@MylesBorins
Contributor

Namespace discussions are still quite premature / early.

closing for now.
