Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Reduce pseudo internal Node APIs #66

Closed
vkurchatkin opened this issue Aug 20, 2017 · 9 comments
Closed

Reduce pseudo internal Node APIs #66

vkurchatkin opened this issue Aug 20, 2017 · 9 comments

Comments

@vkurchatkin
Copy link

vkurchatkin commented Aug 20, 2017

Currently esm uses the following APIs:

  • new Module
  • Module.wrap
  • Module._compile
  • Module._extensions
  • Module._resolveFilename
  • Module._cache
  • Module._nodeModulePaths
  • path._makeLong
  • process.binding
    • fs
      • getStatValues
      • stat
      • internalModuleReadFile
      • internalModuleStat
    • natives
    • util
      • arrow_message_private_symbol
      • decorated_private_symbol
      • decorateErrorStack
      • setHiddenValue
  • process._tickCallback
  • process._preload_modules
  • REPLServer.prototype.createContext

This APIs are considered internal (or private) and shouldn't be used. Their behaviour can be changed or they can disappear at any time, even in patch release, thus breaking esm and all of its dependants.

@jdalton
Copy link
Member

jdalton commented Aug 20, 2017

Hi @vkurchatkin!

Our usage of pseudo internals falls into a couple buckets.

  • Required for hooking the common module loader, REPL, or run main scenario
    • Common Loader

      • new Module
      • Module.wrap
      • Module._cache
      • Module._compile
      • Module._extensions
      • Module._nodeModulePaths
      • Module._resolveFilename
    • REPL

      • REPLServer.prototype.createContext
    • Run main

      • process._preload_modules
      • process._tickCallback

These are used to either enable a hook or recreate behavior of various code paths.
The hooks for each scenario (common loader, REPL, run main) are only enabled for the given scenario and not all at once.

  • Bonus for enhancing functionality
    • path._makeLong
    • process.binding
      • fs
        • getStatValues
        • internalModuleReadFile
        • internalModuleStat
        • stat
      • natives
      • util
        • arrow_message_private_symbol
        • decorated_private_symbol
        • decorateErrorStack
        • setHiddenValue

When possible, operations are guarded by fallbacks or catches to prevent failure. This is especially true for operations that enhance debugging or performance.

While it's true some pseudo internal APIs can and do change the ones we're using for hooks tend to be a bit more resilient. They have existed for many years and Node versions. Because so many packages rely on their existence and behavior they are essentially frozen. In addition, to help solidify cow paths I plan to have Lodash take a dependency on @std/esm, in part leveraging its CITGM release gate status to avoid preventable breaks. (or at least get the heads up on them)

Having access to pseudo internals enables some pretty incredible things, but understandably increases the compat and support burden of Node core. This is why, while Node's CJS is a bit of the wild west, Node's ESM support is being planned in such a way to restrict these bits.

I'm only closing the issue as invalid since you've offered nothing actionable and our usage of pseudo internals is by design. If you have ideas 💡 to avoid some of them that would be great!

@jdalton jdalton closed this as completed Aug 20, 2017
@MylesBorins
Copy link

@jdalton would you be willing to reopen this issue so we can use it as a place to discuss what kind of APIs you might be able to use to avoid relying on internals?

@jdalton jdalton reopened this Aug 20, 2017
@jdalton
Copy link
Member

jdalton commented Aug 20, 2017

would you be willing to reopen this issue so we can use it as a place to discuss what kind of APIs you might be able to use to avoid relying on internals?

Done!

@refack
Copy link

refack commented Aug 20, 2017

First I said it before, and I'll say it again. IMHO this is a valuable module, that solves the real problem of migrating modules to ESM.
I'd be happy if node core would find a way to incorporate/recommend/sanctify it as the way to do ESM in existing version of node.

In addition, to help solidify cow paths I plan to have Lodash take a dependency on @std/esm, in part leveraging its CITGM release gate status to avoid preventable breaks. (or at least get the heads up on them)

I think that until there is full buy-in from node core, the cons of introducing it to lodash outway the pros. As was commented in nodejs/CTC#164 (comment) changing the global behaviour of node (even with supported APIs) should be left to the end user. IMHO Introducing it in a deep dependency like lodash will lead to fragility.

There is an interesting idea in nodejs/node#12349 about implementing a hook for transforming modules during require, maybe we can work off that.

@jdalton
Copy link
Member

jdalton commented Aug 20, 2017

@refack

First I said it before, and I'll say it again. IMHO this is a valuable module, that solves the real problem of migrating modules to ESM.
I'd be happy if node core would find a way to incorporate/recommend/sanctify it as the way to do ESM in existing version of node.

🙌

I think that until there is full buy-in from node core, the cons of introducing it to lodash outway the pros.

Having buy-in from Node core would rock! If the situation doesn't improve and Node core is still super concerned when the time for Lodash v5 comes closer I could always offer it as a package flavor (like lodash-es) for folks that are up for it.

As was commented in nodejs/CTC#164 (comment) changing the global behaviour of node (even with supported APIs) should be left to the end user.

The @std/esm loader isn't a global hammer. Only packages that opt-in to it by having it as a dependency, dev dependency, peer dependency, or options config get the loader. For everyone else they get the default built-in behavior. So package A (which doesn't use std/esm) can depend on package B (that does use std/esm) and package A will continue to load as normal without the ESM hooks being applied to it.

The @std/esm loader also supports versioning so package B can use version 1 of @std/esm can depend on package C that uses version 2 of @std/esm and both work together without stepping on each others toes.

There is an interesting idea in nodejs/node#12349 about implementing a hook for transforming modules during require, maybe we can work off that.

Neat. I'll give it a look for sure!

@mcollina
Copy link

If the situation doesn't improve and Node core is still super concerned when the time for Lodash v5 comes closer I could always offer it as a package flavor (like lodash-es) for folks that are up for it.

I think having it as a flavor would be awesome for everyone. However, let's do our best to lift the doubts.

The @std/esm loader isn't a global hammer. Only packages that opt-in to it by having it as a dependency, dev dependency, peer dependency, or options config get the loader. For everyone else they get the default built-in behavior. So package A (which doesn't use std/esm) can depend on package B (that does use std/esm) and package A will continue to load as normal without the ESM hooks being applied to it.

@jdalton I briefly looked for this information in the README, and there is no doc for it. Can you please explain how the hooks encapsulation is achieved? I am really worried about this becoming a global hammer.

I would be extremely less worried if the API would be:

module.exports = require("@std/esm")("./main.mjs").default

which would not require monkey patching core internals (if I understand correctly how all of this work). In this way all of that is loaded in sub-tree started from the above file will use require("@std/esm"), but everything else will be left untouched.

The preloader should stay it is as it is a great feature for applications.

@jdalton
Copy link
Member

jdalton commented Aug 21, 2017

@mcollina

I briefly looked for this information in the README, and there is no doc for it. Can you please explain how the hooks encapsulation is achieved? I am really worried about this becoming a global hammer.

The main hook wraps Module._extensions of ".js", ".gz", ".js.gz", ".mjs.gz", ".mjs" and then defers to previous handler if the file hasn't opt-ed in, or goes down the ESM route if it has:

esm/src/hook/compile.js

Lines 33 to 45 in 339f92e

function managerWrapper(manager, func, args) {
const filePath = args[1]
const pkgInfo = PkgInfo.get(dirname(filePath))
let wrapped = null
if (pkgInfo !== null) {
wrapped = Wrapper.find(exts, ".js", pkgInfo.range)
}
return wrapped === null
? func.apply(this, args)
: wrapped.call(this, manager, func, pkgInfo, args)
}

I believe your idea of require("@std/esm")("./main.mjs").default would work. It avoids the need to hook Module._extensions in the first place since I code-transform calls I control what the require or import translate into and already have my own handlers. This could simplify versioning code paths too! I'll work on patching that ASAP!

@mcollina
Copy link

I believe your idea of require("@std/esm")("./main.mjs").default would work. It avoids the need to hook Module._extensions in the first place since I code transform calls I control what the require or import translate into and already have my own handlers. This could also simplify versioning too! I'll work on patching that today!

It also reduces the number of internal hooks that we need to expose in core.

@jdalton
Copy link
Member

jdalton commented Aug 21, 2017

Ok so it looks like using @mcollina's approach should be a winner. To reduce the number of pseudo internals used I'm simply going to copy their source code over to reimplement locally as individual modules (they're just written in JS after all).

This means there will be essentially no pseudo internal usage for the primary module use beyond those remaining for guarded bonus functionality (optimized fs operations and error stack cleanup).

References I can copy over or inline results from include:

  • new Module
  • Module.wrap
  • Module._cache
  • Module._compile
  • Module._extensions
  • Module._nodeModulePaths
  • Module._resolveFilename
  • path._makeLong
  • process.binding("natives")

References I can guard or are already guarded:

  • process._preload_modules (guarded; only used for node -r and REPL)
  • process._tickCallback (need to guard; only used for node -r)
  • process.binding("fs") (guarded; bonus optimized fs operations)
  • process.binding("util") (guarded; bonus error stack cleanup; deferrable until error is thrown)

Leaving just the following unguarded:

  • REPLServer.prototype.createContext (only used in REPL)

The node -r and REPL scenarios can be partially guarded, but as they are supplemental functionality, touch very few pseudo internals, are not critical to general use, and not loaded for general use, I'm ok taking the support burden for them.

With this there's no longer a hook into the primary module loading mechanism and significantly reduced pseudo internal usage.

I'll update the thread with commit references as they come.
These changes will land before the next release.

Update

Fixed by v0.6.0 🎉 🎉 🎉
Please note I've changed some usage instructions.

@jdalton jdalton closed this as completed Aug 21, 2017
@standard-things standard-things locked and limited conversation to collaborators Aug 22, 2017
@standard-things standard-things deleted a comment from benjamn Aug 23, 2017
@jdalton jdalton changed the title Stop using Node's internal APIs Reduce pseudo internal Node APIs Aug 25, 2017
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Development

No branches or pull requests

5 participants