Tool config transpilation for webpack, gulp, etc #7

Open
cspotcode opened this issue May 24, 2021 · 30 comments

Comments

@cspotcode
Contributor

I'm trying to best understand what a user's and tool author's experiences will be in the future when:

a) users write a project in pure ESM with package.json "type": "module"
b) users write a config file for a tool -- webpack, gulp, etc -- in a transpiled language, for example TypeScript or CoffeeScript
c) tool author does not like asking users to prefix with NODE_OPTIONS='--loader xyz', use cross-env on windows, etc

I've read the meeting minutes and the issues on nodejs/node and I haven't seen this use-case discussed lately. Has this been considered? If not, I can describe the challenges in greater detail in this issue.

@DerekNonGeneric
Contributor

Has this been considered?

Yes, I believe this has been considered. One of the options proposed was to have a package.json field that would allow users to specify the relative path of a loader file. I also have another idea if there is push-back against this. I can provide additional details if necessary, but wonder at this point if we are talking about the same problem space. Am I on the right track?

@cspotcode
Contributor Author

cspotcode commented May 24, 2021

I think so, but I'm not sure. Where should I look to find more information? Is there a nodejs/node ticket, or a search query I can use to find it?

As a concrete example, what is the experience for a user of webpack with a webpack.config.ts?

// package.json
{
  "name": "users-cool-project",
  "type": "module",
  "dependencies": {
    "webpack": "latest",
    "webpack-cli": "latest"
  }
}

// webpack.config.ts
import {bar} from 'esm-library';
export default {
  foo: await bar(),
};

The user wants to run webpack to build their project. Webpack's current guidance says to npm install ts-node and it will work. How does that guidance change, if at all?

@ljharb
Member

ljharb commented May 24, 2021

I wouldn't expect it to change much - ts-node basically is a loader.

@cspotcode
Contributor Author

cspotcode commented May 24, 2021

Currently, my understanding of webpack's behavior is:

  • find webpack.config.*
  • based on file extension and consulting the mappings in interpret, webpack loads ts-node/register
  • webpack attempts to import('./webpack.config.ts') EDIT module specifier should be an absolute path to webpack.config.ts

Something will need to change so that an ESM loader gets installed, otherwise it can't handle webpack.config.ts. Where in the process will that happen, and which package.json file will need a new loader config field added?
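
The lookup described in the steps above can be sketched roughly as follows. This is only an illustration: `HOOKS_BY_EXTENSION` and `pickRegisterHook` are hypothetical names, and the table is a small hand-written stand-in for the mappings that interpret provides, not its actual data.

```javascript
// Hypothetical stand-in for interpret-style mappings: config file
// extension -> module whose import installs a CommonJS require hook.
const HOOKS_BY_EXTENSION = new Map([
  ['.ts', 'ts-node/register'],
  ['.coffee', 'coffeescript/register'], // assumed entry, for illustration
]);

// Return the hook module for a config path, or null when the file
// has no mapped extension and is treated as plain JavaScript.
function pickRegisterHook(configPath) {
  const dot = configPath.lastIndexOf('.');
  const ext = dot === -1 ? '' : configPath.slice(dot);
  return HOOKS_BY_EXTENSION.get(ext) ?? null;
}

console.log(pickRegisterHook('/project/webpack.config.ts')); // ts-node/register
```

A table like this only helps while the hook can be installed in-process before the config is loaded, which is the step that has no obvious ESM equivalent in this scenario.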

@DerekNonGeneric
Contributor

Where in the process will that happen, and which package.json file will need a new loader config field added?

We have not really discussed the feasibility of specifying a loader in a package.json extensively, and I am unaware of any open tickets regarding the matter. It was brought up on the first call we had when the team was formed, but there was also skepticism about whether or not it would be approved by another core team member who was not present, so no promises there.

As far as integration w/ Webpack, I am not a user of that tool, but I believe @guybedford would be able to comment if we do end up following through on having a loader specified in a package.json, which is not something I can guarantee will ever materialize but is worth further investigation.

@cspotcode
Contributor Author

Ok, agreed it is worth investigating because the webpack use-case applies to other tools, and I believe it is in a blind spot being missed by current design discussions. gulp is another example of a tool that supports code-as-config.

In the webpack case, here are the difficulties I foresee:

  • webpack's package.json cannot specify the loader -- it must be the user's package.json -- and yet webpack binary is the process's entrypoint
  • webpack does not know which loader to use until parsing CLI args and performing FS traversal to locate webpack.config.*
  • webpack config contains functions, class instances, etc so cannot be marshalled across thread or process boundaries

If anyone is able to comment on the specifics of a package.json-specified loader, I can figure out if it supports this use-case.

@DerekNonGeneric
Contributor

If anyone is able to comment on the specifics of a package.json-specified loader, I can figure out if it supports this use-case.

Well, it's a rather simple concept if you think about it.

The package root directory would contain a root-relative path to a loader file (i.e., "loader": "./loader.mjs"). The package entry point "exports": "./index.mjs" would then be loaded using the loader specified.

If there is anything else that needs describing, let me know and I will try to clarify it for you.
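
As a rough sketch of how such a field might be read: Node.js does not read a "loader" field from package.json, so everything here is hypothetical, including the `resolveDeclaredLoader` name and the rule that the path must start with "./" (borrowed from how "exports" targets are written).

```javascript
// Hypothetical: resolve a "loader" field from a parsed package.json
// against the package root URL. This only illustrates the proposal;
// it is not behavior that Node.js implements.
function resolveDeclaredLoader(pkg, packageRootUrl) {
  if (typeof pkg.loader !== 'string') return null; // no loader declared
  if (!pkg.loader.startsWith('./')) {
    // Assumed rule: only root-relative paths are accepted.
    throw new TypeError('"loader" must be a root-relative path starting with "./"');
  }
  return new URL(pkg.loader, packageRootUrl).href;
}

const pkg = { exports: './index.mjs', loader: './loader.mjs' };
console.log(resolveDeclaredLoader(pkg, 'file:///project/')); // file:///project/loader.mjs
```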

@cspotcode
Contributor Author

What's the scope of the loader's influence? If a package's entry-point is loaded using the loader, and that package attempts to dynamically or statically import a path that the loader resolves to be outside of the package, is that other path also loaded using the loader?

When dynamic imports are used, does this mean that loaders can be added dynamically at runtime? E.g. if a package is dynamically imported, and it declares a loader, does this mean the loader is being added at runtime?

Do these loaders compose with ones already active via --loader?

Does this enable webpack to cleanly support transpiled config files, or will it require additional NODE_OPTIONS or other env var hacks from end-users? webpack can declare a loader, but the user's config file is not a part of the webpack package, nor is it an entrypoint of any package.

@DerekNonGeneric
Contributor

What's the scope of the loader's influence? If a package's entry-point is loaded using the loader, and that package attempts to dynamically or statically import a path that the loader resolves to be outside of the package, is that other path also loaded using the loader?

Remember that a loader file is generally nothing more than a collection of hook functions. These hooks get activated every single time a module is loaded, regardless of where the module is located (whether the specifier is pointing to a location anywhere on the filesystem or somewhere on the internet), and they override the default loader behavior. So, to answer your question, there are no real “scope” boundaries (everywhere is fair game).

When dynamic imports are used, does this mean that loaders can be added dynamically at runtime? E.g. if a package is dynamically imported, and it declares a loader, does this mean the loader is being added at runtime?

Technically, one would be able to modify a package.json's loader field to some other path at runtime, but since this is all still very hypothetical, we may need to make some alterations for it to work. The ergonomics of this do not sound pleasant, though, and I am curious how common an occurrence this will be. I'm a bit unsure whether or not that will end up being something frequently done. If so, we should probably provide a more convenient way of going about this.

Do these loaders compose with ones already active via --loader?

We will have to see how this works out… Seeing as there would be an additional loader specified in the package.json, we would need to decide where that one would sit in the sequence. I would probably say that the one specified in the package.json would be the first loader in the sequence (imagine it as the loader at index position 0 in the chain below).

node --loader=./mod1.mjs --loader=./mod2.mjs --loader=./mod3.mjs index.mjs
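
The ordering suggested above could be expressed as below. This is a sketch of a hypothetical rule only: `buildLoaderChain` is an invented name, and nothing here reflects how Node.js currently orders loaders.

```javascript
// Hypothetical ordering: the package.json loader (if any) comes first,
// followed by the --loader flags in command-line order.
function buildLoaderChain(packageJsonLoader, cliLoaders) {
  return packageJsonLoader
    ? [packageJsonLoader, ...cliLoaders]
    : [...cliLoaders];
}

const chain = buildLoaderChain('./loader.mjs', [
  './mod1.mjs',
  './mod2.mjs',
  './mod3.mjs',
]);
console.log(chain[0]); // ./loader.mjs
```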

Does this enable webpack to cleanly support transpiled config files, or will it require additional NODE_OPTIONS or other env var hacks from end-users? webpack can declare a loader, but the user's config file is not a part of the webpack package, nor is it an entrypoint of any package.

¯\_(ツ)_/¯

[ I will have to get back to you on this after becoming more familiar w/ that tool. It might be a couple of weeks, though. ]

@cspotcode
Contributor Author

there are no real “scope” boundaries (everywhere is fair game).

That's good to hear. It means that a package can declare a configurable pass-through loader, then make special requests to the loader to further configure it at runtime.

For example:

// a contrived example
await import('node-configurable-loader:configure/add?name=https-loader');
await import('node-configurable-loader:configure/add?name=ts-node/esm');
const lib = await import('https://deno.land/x/path_to_regexp@v6.2.0/index.ts');

// a more realistic example
await import('node-configurable-loader:configure/add?name=ts-node/esm');
const config = await import('./jest.config.ts');

Packages that may need runtime-configurable loader behavior can declare a dependency on this configurable loader, declare it in their package.json, and then take advantage of it if necessary. Otherwise it will be a passthrough and will not affect node behavior at all.

Effectively, we can implement loader composition and runtime configuration of loaders in user-space as a module.


If so, we should probably provide a more convenient way of going about this.

Yeah, that's why I filed this issue. I believe that runtime configuration of loaders is useful. Currently CJS can do this and ESM can't, which is inconvenient for users and module authors.


I'm using webpack as a concrete example, but this may affect other tools as well, for example:

Jest
Gulp
Anything using Liftoff

@JakobJingleheimer
Contributor

JakobJingleheimer commented May 26, 2021

there are no real “scope” boundaries (everywhere is fair game).

That's good to hear. It means that a package can declare a configurable pass-through loader, then make special requests to the loader to further configure it at runtime.

There is potentially a relevant caveat: what happens in Vegas stays in Vegas. Loaders' hook(s) are executed in a semi-isolated context; namely that an import within a loader is isolated from an import in "user-space". So if you tried to communicate with the loader via an import they both use, that won't work.

However, passing configuration to the hook via the specifier (such as in the example of the previous comment) seems like it could work. In your example, I imagine your hook would look something like¹:

// application side
await import('node-configurable-loader:configure/add?esm=ts-node/esm');
// note that `name` → format to which the loader is applicable

// loader side
const loaders = new Map(/* pre-baked loaders here */);

export async function load(inputUrl, context /*, … */) {
  let url;
  try { url = new URL(inputUrl) }
  catch (error) {/* … */}

  if (url.protocol === 'node-configurable-loader:') {
    if (url.pathname === 'configure/add') {
      for (
        const [applicableFormat, loaderPkgName]
        of new URLSearchParams(url.search)
      ) loaders.set(
        applicableFormat,
        await import(loaderPkgName) // do this better
      );
    }
  } else {
    let format; // determine inputUrl's format (left unimplemented here)
    const loader = loaders.get(format); // get appropriate loader

    if (loader) return loader(inputUrl, context /* … */);
  }

  // other cases
}

¹ Take this with a grain of salt, as phase 2 of loaders is not yet final.

With the above, beware race conditions, whereby a "configured" loader has not yet been set up before something else tries to import (you would need to ensure your config imports occur at the very beginning).

Also, I'm not necessarily saying you should do this (merely that you probably can). Phase 2 adds loader (hook) chaining, and this would effectively reinvent that wheel.

@cspotcode
Contributor Author

My goal here is that I believe phase 2 of loaders needs to consider this use-case, and the hack I've proposed above is simply one way we can achieve it with a third-party module. The loaders team can choose to make this easier with the right node API surface.

The ts-node/esm loader handles some advanced resolution logic across all formats, so it cannot be limited to the ESM format as in the example above.

I expect the node-configurable-loader would be published as a module offering both the loader and an application-side API which executes the correct dynamic import() calls to communicate with the loader.
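
A minimal sketch of what that application-side API might look like, assuming the `node-configurable-loader:` scheme and `configure/add?name=…` form from the earlier examples. The function names `addLoaderSpecifier` and `addLoader` are hypothetical, and `addLoader` only does anything useful when such a loader is actually active in the process.

```javascript
// Hypothetical application-side helper for the configurable loader.
// The scheme and path mirror the examples earlier in this thread.
const SCHEME = 'node-configurable-loader:';

// Build the specifier that asks the loader to add another loader.
// URLSearchParams percent-encodes the name so it survives URL parsing.
function addLoaderSpecifier(name) {
  return `${SCHEME}configure/add?${new URLSearchParams({ name })}`;
}

// Ask the loader to add `name`; callers must await this before
// importing anything that depends on the newly added loader.
async function addLoader(name) {
  await import(addLoaderSpecifier(name));
}

console.log(addLoaderSpecifier('ts-node/esm'));
// node-configurable-loader:configure/add?name=ts-node%2Fesm
```

Hiding the special `import()` calls behind a function like this would let the specifier format change without touching the tools that call it.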

Out of curiosity, which high-profile loaders has the team looked at as part of their design work? There are already a few out in the wild. Should we make a list? yarn2's ESM loader is in a draft PR, ts-node/esm is being used in the wild, I believe nyc has one, and I'm sure there are others. These will need to compose together.

@DerekNonGeneric
Contributor

Out of curiosity, which high-profile loaders has the team looked at as part of their design work?

It has been very difficult for me to gather information about what currently exists in the ecosystem as far as published loader packages are concerned, so I appreciate you informing us of what is currently out there. As far as I know, there is no way to gather information about published loader packages on npm.

There are already a few out in the wild. Should we make a list?

Yes, please do make a list — even maintaining a Gist containing a list of this nature would be extremely useful.

@cspotcode, we are going to be making a member roster soon, so if you are interested in influencing the design of the next generation loader, it seems to me like you would be a great addition to the team.

@cspotcode
Contributor Author

Ok, sounds good. I created #8 to start, but I can move it to a gist or anywhere else if you want.

I am definitely interested but I don't want to over-extend, so I will have to think about it and get back to you in a few days. Regardless, I will definitely be tuning in to the design meeting this Friday.

@JakobJingleheimer
Contributor

I've taken a look at a couple (most notably, webpack). Your "hack" proposal reminded me a lot of webpack loaders. I feel like there's potential there.

At the previous meeting we discussed a runtime API-based option for this, and I believe @bmeck cautioned that it would likely be very difficult to support.

We do recognise the clear use-case for it. The "how" seemed to be the sticking point. It was deemed a "tomorrow's problem" since phase 1 is independent and not yet landed.

@bmeck
Member

bmeck commented May 27, 2021

cross referencing relevant issues:

  • Per package loader hooks: "RFC: Per Package Loader Hooks" (node#18233). This would basically enable packages to produce a scope in which loaders are applied. It is actually important that such behavior is scoped, to avoid collisions (two packages might differ in how they transform .html files, for example).
  • Replacing a thread with another: "Allow main thread to be supplanted by another" (node#38454). This is needed due to problems with using an API to instrument loaders properly via application code.

To clarify somewhat, we can completely provide a configurable API for loaders. In fact, various things like https://github.com/targos/multiloader do exist. However, doing so via an API accessible from application code causes various issues. Service workers (SW) have an activation-based API somewhat in this vein and have a long history of the "doesn't work on first load" problem, and just today there was a comment on a similar issue that import maps are facing: WICG/import-maps#7 (comment). However, unlike the general issues we have had with loaders, the web's stance is much firmer on not allowing any JS execution in its pipeline. Allowing it in Node's pipeline was done with some care, to avoid problems with ahead-of-time tooling and to ensure that runtime-based communication was limited and/or constrained so that problems like timing issues wouldn't arise.

@cspotcode
Contributor Author

The use-case described in this thread does not seem susceptible to the "doesn't work on first load" problem.

await installLoaderUsingTheLoaderInstallationApi('transpiling-loader');
const v = await import('./plugin-or-config-etc.transpiled-language-file-extension');

The loader's necessity is not known before code (gulp CLI, etc) is able to traverse the filesystem, discovering the config file's location and extension. So the loader's installation is done async, not triggered by a static import.

The tool which requires the loader is installing it asynchronously and understands full well that it must wait for the loader to be installed before attempting to use it. Using the loader is not done by a static import; it's a dynamic one. So it is deliberately and easily postponed by the application (gulp CLI, etc) until after the loader is ready.

@GeoffreyBooth
Member

The specific use case here of transpiled config files seems like it would be more directly addressed by the vm module (essentially Node’s version of eval). The build tool would read the config file from disk as a string, pass that string through the transpiler (just a regular function call, that takes a string and returns a transpiled string) and this output JavaScript string would be passed into vm to be executed. The return value of that (probably an object) would be the configuration that the build tool wants. No loaders are involved.
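
A minimal sketch of that read, transpile, evaluate flow, under a few assumptions: `loadConfigViaVm` is a hypothetical name, `transpile` is a caller-supplied string-to-string function standing in for a real transpiler, and the config is a single expression rather than a module.

```javascript
// Sketch: transpile a config source string, then evaluate it with node:vm.
// `transpile` is a caller-supplied function (hypothetical stand-in for
// a real transpiler such as the TypeScript or CoffeeScript compiler).
async function loadConfigViaVm(source, transpile) {
  const vm = await import('node:vm');
  const javascript = transpile(source);
  // runInNewContext returns the value of the last expression evaluated.
  return vm.runInNewContext(javascript, {}, { filename: 'tool.config.js' });
}

// Usage with an identity "transpiler" and an object-literal config.
loadConfigViaVm('({ mode: "production" })', (s) => s)
  .then((config) => console.log(config.mode));
```

Because the string is evaluated as a script, a config that uses static `import` or `export` would not parse in this form, so the sketch covers only the simplest kind of config file.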

@cspotcode
Contributor Author

cspotcode commented May 28, 2021

I looked into that briefly, and wasn't sure if it allows us to handle imports and exports. For example if webpack.config.ts does import {constantsForDefinePlugin} from './src/config/constants.js', then we need to:

a) resolve constants.js to constants.ts
b) transpile constants.ts

Does the vm module let us hook imports and exports transitively, playing nice with any other --loaders that might be running as well?

@JakobJingleheimer
Contributor

With vm you explicitly must provide resolution mapping and orchestrate things. But I think it does not at all play with loaders?

That said, TypeScript has a rather…unique issue: The file extension in the specifier should not be .js, it should be .ts because the actual file on disk is named with .ts. The TypeScript authors just refuse to rewrite .ts in an import path to .js in tsc's output (despite how trivial it likely would be), so ts devs are forced to use the incorrect .js to avoid broken output.

I believe we/Node.js should not go out of our way to pick up the can TypeScript kicked down the road. But in general, I do believe tooling configs needing transpilation should be supported 🙂

@giltayar
Contributor

giltayar commented Jun 2, 2021

That said, TypeScript has a rather…unique issue: The file extension in the specifier should not be .js, it should be .ts

Yup. Which is why in my babel-register-esm that deals with Babel transpilation, I explicitly added a resolve hook to deal with this weird issue, even though it has nothing to do with babel transformations: https://github.com/giltayar/babel-register-esm/blob/873b1b7d84cc051054e364e83b44bb010b74b91b/src/babel-register-esm.js#L24.
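
The kind of mapping such a resolve hook has to perform can be sketched as a pure function. This is not the babel-register-esm implementation; `candidateSpecifiers` is a hypothetical helper, and trying the specifier as written before the `.ts` variant is an assumed ordering.

```javascript
// Hypothetical helper for a resolve hook: given a specifier as written
// in TypeScript source, list the specifiers to try, in order.
function candidateSpecifiers(specifier) {
  const candidates = [specifier]; // first, the specifier exactly as written
  if (specifier.endsWith('.js')) {
    // then the same path with the extension the file has on disk
    candidates.push(specifier.slice(0, -3) + '.ts');
  }
  return candidates;
}

console.log(candidateSpecifiers('./src/config/constants.js'));
// [ './src/config/constants.js', './src/config/constants.ts' ]
```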

@cspotcode
Contributor Author

cspotcode commented Jun 2, 2021 via email

@JakobJingleheimer
Contributor

JakobJingleheimer commented Jun 2, 2021

🤔 Actually, I think there's no special problem to solve: something.config.js → foo.ts → bar.coffee should work just fine via loaders.

Let's take Webpack for example:

$> NODE_OPTIONS='--loader=coffeeLoader --loader=tsLoader' \
webpack \
--config ./webpack.config.ts \
./src

// webpack.config.ts

import foo from 'foo.coffee';

// …

When webpack goes to load the config file, it should use a dynamic import, which would invoke tsLoader behind the scenes. Then the config file's import of foo.coffee would be handled by coffeeLoader.

No magic required.

@cspotcode
Contributor Author

Yeah, 100% agreed, loaders is the way to go. I wanted to be sure we were not still considering that webpack should use vm instead of loaders, since that had been proposed earlier.

I was thinking about the idea to declare loaders in package.json. A few things we'd need to get right, but nothing too terrible:

  • user is required to add the loader declaration to their package.json
    • webpack cannot configure this on their behalf nor specify it in webpack's package.json
    • an extra line of boilerplate, but not a huge hardship for users
    • allows the loader author to explain configuration in a way that's tool-agnostic
    • One declaration will work for gulp, webpack, shell scripts, etc
  • will have to think about module boundaries w/multi-package workspaces
    • if 20 packages in a monorepo all specify the ts-node loader, can they all use the same loader instance / state internally? Good for performance since e.g. type-checking crosses package boundaries
  • publishConfig should be able to strip loader from package.json for publishing
    • Not a huge deal. Will require a PR to npm, pnpm, yarn

@JakobJingleheimer
Contributor

Yes? I think the package.json proposal is quite viable and seems an appropriate place.

In terms of workspaces and mono-repos, if they're executing together, they would likely share loader state. I don't know enough about mono-repos and workspaces though.

@DerekNonGeneric
Contributor

To start to address the initial question of “tool config transpilation”, it might be good to check out what is being done today (several months later):

Gulp

Webpack

TODO

Rollup

TODO

@cspotcode
Contributor Author

cspotcode commented Feb 22, 2022

The esm module executes code as CommonJS, right? And it breaks for stuff like export default await asynchronouslyDoStuffBecauseThisIsEsm(); because esm tries to pretend that exports are determined synchronously. Last I checked it doesn't really follow the rules for top-level await.

EDIT added missing await in my example above

@DerekNonGeneric
Contributor

The esm module, a userland loader to polyfill versions of Node.js without native ES Modules, is no longer necessary in v13.2+. This is because most of the ES module implementation was unflagged in the v13.x release line.

The esm module executes code as CommonJS, right?

Correct, the esm module does not put your code in the module context. It is still in the commonjs context, and I would advise against using it over the new ES module implementation in Node.js, i.e., #enabling.

And it breaks for stuff like export default await asynchronouslyDoStuffBecauseThisIsEsm(); because esm tries to pretend that exports are determined synchronously.
Last I checked, it doesn't follow the rules for top-level await.

Yes, I do recall there being async-related limitations as you describe. It's also no longer being actively developed or maintained, so probably not the best choice in most scenarios.

@cspotcode
Contributor Author

For me, the takeaway is still the same as it was some months ago: these tools achieve some level of convenience because they install a CJS loader hook in-process, mid-execution. To achieve the same within the limitations of the current loader implementation, we effectively need a --loader late-binding-loader.mjs which exposes an API to add hooks at runtime: process[Symbol.for('late-binding-loader')].addHooks('module-name')
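
To make the idea concrete, here is a sketch of the registry such a late-binding loader would need internally. Everything in it is hypothetical: `createLateBindingLoader` is an invented name, and `addHooks` here takes a hook object directly, whereas the API described above takes a module name (which matters in practice, since loader hooks run in a semi-isolated context and function values may not be shareable with application code).

```javascript
// Hypothetical late-binding registry: hooks can be added after startup,
// and each load() call consults whatever hooks exist at that moment.
function createLateBindingLoader() {
  const hooks = [];
  return {
    addHooks(hook) {
      hooks.push(hook);
    },
    // Run hooks newest-first; a hook may handle the URL or defer to next().
    load(url, defaultLoad) {
      const run = (index) =>
        index < 0
          ? defaultLoad(url)
          : hooks[index].load(url, () => run(index - 1));
      return run(hooks.length - 1);
    },
  };
}

const loader = createLateBindingLoader();
const defaultLoad = (url) => `default:${url}`;

// A stand-in "TypeScript" hook that only claims .ts URLs.
loader.addHooks({
  load: (url, next) => (url.endsWith('.ts') ? `transpiled:${url}` : next()),
});

console.log(loader.load('file:///jest.config.ts', defaultLoad));
console.log(loader.load('file:///index.js', defaultLoad));
```

With no hooks added, `load` falls straight through to `defaultLoad`, which matches the pass-through behavior described earlier in the thread.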

@JakobJingleheimer
Contributor

JakobJingleheimer commented Feb 23, 2022

We're cognisant of a desire for an API like that, and I do believe there are no philosophical objections in principle. I think addressing those use-cases is somewhere down the roadmap (at least mentally tracked). Loaders is not ready to start designing a solution for those use-cases yet, though.


7 participants