Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: add esm experimental-loader support to @babel/register #15456

Open
wants to merge 27 commits into
base: main
Choose a base branch
from

Conversation

mkg20001
Copy link

@mkg20001 mkg20001 commented Feb 23, 2023

Q                       A
Fixed Issues? Fixes #1, Fixes #2
Patch: Bug Fix?
Major: Breaking Change?
Minor: New Feature? ESM support for @babel/register
Tests Added + Pass? none yet
Documentation PR Link
Any Dependency Changes?
License MIT

This adds experimental-loader/ESM support to @babel/register, which is a much wanted feature

I'm aware of experimental-loader being unstable, I'd still find it cool if it would be available at least behind a flag. Since it has to be loaded with --experimental-loader it should be clear that it's prone to breakage.

The code is basically "let's just get it working". Unstructured first attempt, for the project I tested it on it appears to work.

Since there is no way I'm aware to load @babel/register "the old way" and specify the config along, I added an environment variable which can be used to specify one. I'm open to better solutions. (edit: there's now a way to do it via ipc between loader and application, but it may be async and BABEL_REGISTER automatically preloads config. for esm configs there's --import, but it's not in any lts yet nodejs/node#40110)

BABEL_REGISTER=$PWD/register.mjs node --experimental-loader @babel/register/loader.mjs src/bin.ts
import { dirname, join } from 'path'
import { fileURLToPath } from 'url'

const __dirname = dirname(fileURLToPath(import.meta.url))

// This adds the registration to the Node args, which are passed
// to child processes by Node when we fork.
const loader = process.argv[process.argv.indexOf('--experimental-loader') + 1]
process.execArgv.push('--experimental-loader', loader);

export default {
    cwd: join(__dirname, 'src'),
    only: [join(__dirname, 'src')],
    presets: ['module:@babel/preset-typescript'],
    extensions: ['.ts', '.js'],
    envName: 'development',
}

@babel-bot
Copy link
Collaborator

babel-bot commented Feb 23, 2023

Build successful! You can test your changes in the REPL here: https://babeljs.io/repl/build/54047/

@mkg20001
Copy link
Author

i came across ./packages/babel-core/src/vendor/import-meta-resolve.js... maybe that has better import resolution code.

@mkg20001
Copy link
Author

mkg20001 commented Feb 23, 2023

hrm... so support goes back to node 12

biggest change is context being just a string (parentURL). a few functions are named differently. the loader.mjs should be working if I could get it to be compiled via the build system as ESM, as some stuff isn't available in node 12/14

@mkg20001
Copy link
Author

mkg20001 commented Feb 23, 2023

the loader seems to be in a different context, i'm handling this now (tests aren't done). cjs loading is explicitly not supported by esm loaders and needs to be handeled by the application context (aka the usual way). for cjs workers will likely be needed (setup/config can be done via messageport) but the esm part likely won't need any workers, since the loader itself will be the worker, so it may make sense to get rid of that eventually

@mkg20001 mkg20001 marked this pull request as ready for review February 23, 2023 19:22
Gulpfile.mjs Outdated
@@ -705,6 +705,12 @@ gulp.task(
gulp.series("copy-dts", () => buildRollupDts(dtsBundles))
);

gulp.task("build-other", () =>
getFiles(`packages/babel-register/src/loader.mjs`, {})
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no idea how to make it keep just this particular file as ESM. theoretically since I'm re-exporting via loader.mjs this might aswell be a cjs file.

@liuxingbaoyu liuxingbaoyu added PR: New Feature 🚀 A type of pull request used for our changelog categories i: discussion pkg: register labels Feb 23, 2023
Copy link
Member

@nicolo-ribaudo nicolo-ribaudo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We initially wanted to wait until Node.js stabilizes their API, but since it's taking so long I'm ok with relying on the current experimental one.

export const resolve = major >= 16 && _resolve;
export const globalPreload = major >= 16 && _globalPreload;

export const getSource = major < 16 && _getSource;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since this is all experimental anyway, I would prefer to only support Node.js 16&18 and simplify our implementation.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The backwards compatibility - at least down to node 14 - consists of renamed functions only plus getFormat, which is 5 lines.

Copy link
Author

@mkg20001 mkg20001 Mar 1, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note also about this particular snippet: What I tried to accomplish was not exposing the old functions on node 16+, since that causes unnecessary warnings about old functions being used. But node 14 didn't like it.

Edit: fixed it now

@mkg20001
Copy link
Author

mkg20001 commented Mar 1, 2023

Currently tests fail, but it's the babel-node ones, not the one from this change. Not sure but it could be that switching the worker argument from class to string might have messed up something that babel-node depends on.

Problem is that loaders run in seperate context and all we get is a message port there, which doesn't like pushing complex objects over it.

edit: tests fixed

@mkg20001
Copy link
Author

mkg20001 commented Mar 1, 2023

I've released a built package here if anyone wants to try: https://github.com/mkg20001/babel/releases

It worked on the node 14 project i tried it on, but might break on some edge case that I'd like to know about

@mkg20001
Copy link
Author

mkg20001 commented Mar 1, 2023

ah windows, forgot about that... most likely just path seperator stuff

and node 12 doesn't do await in top-level

@liuxingbaoyu
Copy link
Member

Since there is no way I'm aware to load @babel/register "the old way" and specify the config along

Sorry I didn't fully understand what you mean, can we use similar usage?
https://github.com/esbuild-kit/esm-loader#usage

ah windows, forgot about that... most likely just path seperator stuff

Don't worry, I can help if you don't have a windows device.

// skip everything if below version 12
if (major < 12) {
// eslint-disable-next-line no-global-assign
it = it.skip;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's use a new variable.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ide doesn't detect it as mocha test then, for running it from ui


flag[resolved] = true;

// console.log('map', resolved)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TODO: remove

Co-authored-by: liuxingbaoyu <30521560+liuxingbaoyu@users.noreply.github.com>
@mkg20001
Copy link
Author

mkg20001 commented Mar 2, 2023

Sorry I didn't fully understand what you mean, can we use similar usage? esbuild-kit/esm-loader#usage

we already do, because we also use environment variables to configure. we can autoload .babelrc like babel-node, but maybe we can just extend babel-node to support esm aswell.

edit: babel-node is the tool that would have a usage most similar to esm-loader's cli

if (type) return type;
// Otherwise, (if not at the root) continue checking the next directory up
// If at the root, stop and return false
return dir.length > 1 && getPackageType(resolvePath(dir, ".."));
Copy link
Author

@mkg20001 mkg20001 Mar 2, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is likely going to break on windows, since dir will end up being d: or something. is there a function to detect if we have a root folder for a certain drive letter? or just dir.split(path.sep).length === 1?

edit: otherwise this will loop forever, calling path.dirname and expecting it to return just "/" eventually

}
}

export async function resolve(specifier, context, nextResolve) {
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should we handle experimental network imports? (aka http(s):// as import path)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you expand on why we need a resolve hook? Babel should only care about transforming the already resolved files, delegating to Node.js for the actual resolution.

This is what already happens with the require hook, where we only hook into the loading process and not in the resolution process.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think it is necessary. According to my understanding, esm-loader allows multiple and sequential processing, and https has another loader to provide support.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree, I don't think we should really hook into how resolve behaves. The Node.js docs provided an example coffeescript ESM loader: https://nodejs.org/api/esm.html#transpiler-loader, maybe we can start from that example.

const ext = extname(url);
if (ext === ".cjs") return "commonjs";
if (ext === ".mjs") return "module";
const isFilePath = !!ext;
Copy link
Author

@mkg20001 mkg20001 Mar 2, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is from the docs and is based on the simple assumption that directories don't have file extensions, i guess we should stat here (and maybe cache the stat call)... but since the call below is perfectly fine with the file being not found it may be not necesarry

@mkg20001
Copy link
Author

mkg20001 commented Mar 2, 2023

how is a file with .cjs extension supposed to be handled by babel? we can autodetect the type and pass it along as parameter to the worker. currently it just detects it as a module because that's the result we get from the worker after transformation.

we need to get the type from the worker after transformation, as the code might be transpiled to commonjs (in the case of the loader this would mean that we'll return commonjs as the type and node then uses the pirates hook to load the file)

};

if (env.BABEL_REGISTER) {
const { default: opts } = await import(pathToFileURL(env.BABEL_REGISTER));
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this breaks on node 12

/* eslint-disable import/extensions */

// this is a workarround to make sure global._BABEL_ESM_REGISTER is set to true before anything else gets loaded
import { readFile, access } from "node:fs/promises";
Copy link
Member

@liuxingbaoyu liuxingbaoyu Mar 4, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nodejs/node#37583
nodejs/node#38006
I personally prefer to use the sync version, because it is more convenient to use and there is no obvious performance loss.
NOTE: This is just my opinion, I'm not sure what other reviewers think.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The async version allows Node.js to call out loader with multiple files in parallel, but if the promise-based version is so much slower we could use the callback-based version.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There should be little difference in the new version of nodejs.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Initially I went with not using it because I thought existsSync was deperacted. turns out only exists is, not existsSync. But having it async if it doesn't make a difference speed-wise is better then.

Comment on lines +93 to +100
async function exists(path) {
try {
await access(path, constants.R_OK);
return true;
} catch (error) {
// do nothing
}
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 for existsSync. Node.js had an asynchronus exist API, that had been deprecated because checking if a file exists was effectively a sync operation anyway.

@mkg20001
Copy link
Author

mkg20001 commented Mar 6, 2023 via email

@nicolo-ribaudo
Copy link
Member

nicolo-ribaudo commented Mar 6, 2023

Files that are modules could be transformed into commonjs, the user might
want to do things like not specify a file extension (as is common in
typescript) and resolve needs to account for that, etc.

I'd prefer to not to this (especially by default), and instead stick to the default semantics of Node.js. If users want different behaviors, they can use multiple loaders for that. Babel should only handle transpilation.

@mkg20001
Copy link
Author

mkg20001 commented Mar 6, 2023

Per spec things first go through resolve and that then returns whether the file is commonjs or a module (with load only being called for format=module), so we need to return it there anyways. It is planned to allow CJS loading with experimental loaders eventually. We need our own resolve as nodeJS's resolve won't resolve for example a .ts file, because that's not an extension it recognizes as valid. It'll just throw ERR_UNKNOWN_FILE_EXTENSION.

// loader. Avoiding the need for a separate CommonJS handler is a future
// enhancement planned for ES module loaders.

https://nodejs.org/api/esm.html#transpiler-loader

But that aside, typescript can be compiled to both cjs and esm both under the .ts extension, depending on which settings are used. Say you have a bunch of typescript in a package that is "type": "module", that is where that code is really necesarry. esm-loader has it's own resolve function aswell. https://github.com/esbuild-kit/esm-loader/blob/develop/src/loaders.ts#L98 Also, typescript can be written with import statements, but then end up being compiled to cjs in a package that isn't "type": "module". (Typescript is just a good example, it applies to pretty much everything)

I'd be open to taking the esm-loader logic as a base or having something like pirates but for esm that handles the more complicated parts and simply gives us files to transpile, but until that exists we need our own resolve logic. I tried to re-use as much as possible directly from the node internals, ending up with what is currently this loader's resolve function. Ain't pretty but it works.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
i: discussion pkg: register PR: New Feature 🚀 A type of pull request used for our changelog categories
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants