Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

esm: introduce node:interop to expose the path of the current module #49070

Closed
wants to merge 2 commits into from

Conversation

aduh95
Copy link
Contributor

@aduh95 aduh95 commented Aug 8, 2023

This is an alternative to #48740. This exposes a new "magic" module node:interop that exports the path of the importing module. This aims to simplify uses of paths from within ESM modules, obviously it only works for modules loaded from the file: protocol (i.e. the modules corresponding to files on the FS). I think this would be a better approach than #48740 because it doesn't pollute import.meta, and it provides an early error when it is used from an unsupported protocol, and doesn't go in the way of folks who do not want any path within their ESM code.

Before it can land, this would need:

  • agreement on the principle
  • bikesheding on the module specifier (currently node:interop)
  • bikeshedding on the export names (currently __filename and __dirname)
  • bikeshedding on the error that it should throw when imported from a non-file: protocol (currently ERR_INVALID_URL_SCHEME)
  • Docs

@nodejs-github-bot
Copy link
Collaborator

Review requested:

  • @nodejs/loaders
  • @nodejs/modules

@nodejs-github-bot nodejs-github-bot added esm Issues and PRs related to the ECMAScript Modules implementation. needs-ci PRs that need a full CI run. labels Aug 8, 2023
@bakkot
Copy link

bakkot commented Aug 8, 2023

it doesn't pollute import.meta

I know there's some concerns about adding stuff to import.meta, but for what it's worth, when TC39 added import.meta it was explicitly supposed to be for stuff like this. It is in fact the very first use case mentioned in the readme for the proposal. I would not consider this to be "polluting" import.meta - that's what it's for. By contrast, a module with different behavior depending on where you import it is much more magic, and is definitely not something we (TC39) intended.

Node can do what it wants, of course, but personally #48740 makes much more sense and fits better with the language, in my opinion.

(Also, as a user: the whole point of #48740 is convenience, and this PR is strictly less convenient than the original.)

Copy link
Contributor

@JakobJingleheimer JakobJingleheimer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 on the implementation—very clever 🙌

@GeoffreyBooth
Copy link
Member

GeoffreyBooth commented Aug 8, 2023

👍 on the implementation—very clever 🙌

I second this. I was doubtful that this would be possible.

What if the same module is resolved multiple times, from different places? Like from a sibling file, but then from a file one level down via something like import '../file.js'? I assume everything would be fine, but maybe we should have a test for cases like this.

Also I feel like node:interop should be importable from CommonJS too. It could just return __dirname and __filename like the local variables.

@GeoffreyBooth
Copy link
Member

when TC39 added import.meta it was explicitly supposed to be for stuff like this.

I understand that that’s what TC39 intended, but we (and Deno, and WHATWG) are very concerned about code portability and interoperability. import.meta seems to users like part of the language, not something that’s obviously runtime-specific like import.meta.node would be, for example. We feel that additions to import.meta need to go through a standardization process to avoid incompatibility surprises; see wintercg/proposal-common-minimum-api#50. We’re happy to add anything that gets blessed by WinterCG and/or WHATWG, but inevitably there will be things that we won’t be able to reach consensus on and an approach like the one in this PR allows us to still provide such features to our users without the compatibility hazard.

@bakkot
Copy link

bakkot commented Aug 8, 2023

I understand that that’s what TC39 intended, but we (and Deno, and WHATWG) are very concerned about code portability and interoperability.

Sure, but this approach is strictly worse for code portability and interoperability, isn't it? let logPrefix = import.meta.filename ?? '' is interoperable everywhere. A module containing import { __filename } from 'node:interop' won't execute anywhere except on Node, even if all of the rest of it is cross-platform. And it has fundamentally different module semantics than the rest of the language - modules aren't generally supposed to behave differently depending on which file you're importing them from.

So this approach seems worse from an interop point of view, in addition to a consistency-of-the-language point of view. (And also from a developer experience point of view.)

@GeoffreyBooth
Copy link
Member

Sure, but this approach is strictly worse for code portability and interoperability, isn’t it?

No, because it’s obvious. If you’re importing from node:interop, you know you’re writing code that will only run on Node (or on platforms that offer Node compatibility). It’s still not great for code portability and interoperability, but it’s not as much of a footgun. A better question is whether this approach is better than import.meta.node.*.

Ideally we can reach a standard with WinterCG and we can ship that. I don’t think anyone would prefer this over wintercg/proposal-common-minimum-api#50, so I doubt this would land while the process is still in progress for trying to reach a standard. However it’s probably a good idea to explore how we would ship module-specific APIs in the future that weren’t (or couldn’t be) standards, because sooner or later there will be such an API that Node wants to add that we can’t get spec support for.

@bakkot
Copy link

bakkot commented Aug 8, 2023

I guess I have the opposite opinion: I would regard a design with which it is at least possible to write portable code to be more interoperable than a design which makes that impossible, even though if you write non-portable code it will be more obvious with the latter design.

It's good to provide hints to users, but forcing otherwise-portable correct code to be non-portable for the sake of louder errors in the case of incorrect code - that really seems like it has the priorities backwards.

const path = fileURLToPath(parsedParentURL);
const dir = dirname(path);
return {
url: `data:text/javascript,export const __filename=${JSONStringify(path)};export const __dirname=${JSONStringify(dir)}`,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can this be implemented as a SyntheticModule?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What would be the benefit of doing so? (Not opposed, just curious.)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cheaper on VM side / doesn't need to tie it to a poison-able cache key.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I assume we could also implement these as functions, like:

import { dirname } from 'node:interop'
const __dirname = dirname()

And then maybe we wouldn’t need to create them ahead of time. We could generate the code to be evaluated, and then evaluate it, only when the function is called.

Yes it’s slightly more annoying to type dirname() instead of __dirname or dirname, but if it improves performance then it’s probably worth it.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think a cached getter would be a "6 of one", but with better DX :) i think no additional implementation effort (just a choice of A or B).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It isn't just CPU/Mem when having unused values here, data: will cache the source text in the V8 Context's Module cache and engage a full parser vs just piping values where SyntheticModule ~= a Map and avoiding cache growing.

@bmeck
Copy link
Member

bmeck commented Aug 8, 2023

There was desire for js:context on my part (even was in an ancient esm impl) but it was countered and import.meta was landed somewhat in response for these. I didn't take part in the design of import meta so much due to family issues at the time but revisiting why it was discouraged. Particularly the main objections came from web spec wanting to ensure module specifiers resolved to fully qualified unique names (leading to URL cache key etc). This should be a bit more careful to cache and sidechannel poison probably from 2 modules sharing the synthetic/data one probably and have a story for fully resolving to a flat cache space.

@Qard
Copy link
Member

Qard commented Aug 9, 2023

I'm with @bakkot that this seems a bit counter to interoperability. By putting stuff on import.meta it does mean there's some divergence, but that divergence can be resolved fairly easily by implementing the same feature in the same way in other runtimes whereas this node:interop module provides no path for other runtimes to align with unless they want to mirror this very implementation-specific behaviour. 🤔

@ljharb
Copy link
Member

ljharb commented Aug 9, 2023

I second everything @bakkot says here; import.meta is the only place this sort of thing is intended for, and it violates the spec imo to have a magic module that imports a different value based on who imports it.

@isaacs
Copy link
Contributor

isaacs commented Aug 9, 2023

import should pull in something that isnt dependent on the thing importing it, and is always the same thing no matter who imports it (assuming it resolves to the same module specifier).

Having something imported that varies based on the module path of the thing importing feels like an inappropriate intimacy code smell to me.

Copy link
Member

@targos targos left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have the same concerns as @bakkot, @ljharb and @isaacs

@aduh95
Copy link
Contributor Author

aduh95 commented Aug 9, 2023

I'm fairly certain this does not violate the spec, the same specifier can resolve to different URLs based on who is importing it (e.g. import './file.js' will load a different module based on which subdirectory the module importing it is).

The other critics are valid, but subjective (which is fine, but I have nothing to counter them except a different opinion).

@ljharb
Copy link
Member

ljharb commented Aug 9, 2023

@aduh95 that's fair; it would certainly be incredibly surprising to import something that's not a relative path/URL and have it depend on where you import it from, at least.

@arcanis
Copy link
Contributor

arcanis commented Aug 9, 2023

Technically that's the case of all bare imports in the ecosystem, which resolve to different versions and thus content depending on which package imports them.

That said here it definitely feels crossing a line. Enabling introspection this way is an interesting thought experiment, but I think I'd personally prefer a boring solution to the boring problem of "where is stored the current script".

@bmeck
Copy link
Member

bmeck commented Aug 9, 2023

@aduh95

I'm fairly certain this does not violate the spec, the same specifier can resolve to different URLs based on who is importing it (e.g. import './file.js' will load a different module based on which subdirectory the module importing it is).

That is valid, but how caches for things like import attributes are designed spec wise, they expect a flat cache in the end. There are plenty of ways to do this but each imported module is expected to have a way to uniquely identify it. So when solving a poisoned cache/side channel issue of using data: moving towards a SyntheticModule makes sense, but TC39 is designing for flat unique identity systems when designing things including more forwards looking Compartment compatibility which uses the term full module specifier. Having modules resolve in a context sensitive manner is inherent but the final resolution is not expected to retain that linkage for a variety of things including GC.

@benjamingr
Copy link
Member

I agree with others' sentiment - this is very clever, but I'd prefer the "boring" import.meta.filename approach. It lets us use import.meta.filename to check if there is a filename to log for example during runtime whereas if I import node:interop for the same test I need a sham in my import map in order to use the code in a browser - making it less portable.

So I am also -1 on this unless we can't land import.meta.filename.

Co-authored-by: Moshe Atlow <moshe@atlow.co.il>
@isaacs
Copy link
Contributor

isaacs commented Aug 13, 2023

@aduh95 @arcanis Yes, that's why I said:

(assuming it resolves to the same module specifier)

Ie, if a file at /x/y/z does import './file.js' and a file at /x/y does import './z/file.js', then those would be expected to be the same thing, because they're resolving to the same absolute/final module specifier.

Builtins always resolve to themselves, so if import 'node:interop' resolves to a different thing based on the importer, then idk if that's a violation of the spec, officially, but it's certainly a violation of reasonable expectations.

Module-specific stuff belongs on import.meta (for esm) or in the module scope as a pseudo-global (for commonjs). If it can't go there for some reason, I think that's a strong argument for not having it anywhere.

We already have fileURLToPath(import.meta.url). Let's not add new warts on the consistency of the API just to try to make that one use case slightly prettier, that never ends well. It's amazing that we've kept it as clean as it is.

@aduh95
Copy link
Contributor Author

aduh95 commented Jan 29, 2024

#48740 has landed

@aduh95 aduh95 closed this Jan 29, 2024
@aduh95 aduh95 deleted the node-interop branch January 29, 2024 14:52
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
esm Issues and PRs related to the ECMAScript Modules implementation. needs-ci PRs that need a full CI run.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet