[Feature] Code splitting on async import() statements. #16

Open
tracker1 opened this issue Feb 17, 2020 · 65 comments

Comments

@tracker1

Support code splitting on dynamic import() statements, and additionally split/join on shared bundles for shared dependency models.

@evanw
Owner

evanw commented Feb 17, 2020

This is definitely something I plan to get to because I want to be able to use it myself. Right now import(path) turns into Promise.resolve().then(() => require(path)) so dynamic imports still "work" although they don't result in additional bundles. In the future it will generate separate bundles. I may also add support for common chunk and/or more advanced shared dependency analysis.
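
As a rough illustration of the lowering described above, here is a minimal sketch in plain JavaScript. The names `lowerDynamicImport`, `fakeRequire`, and `bundledModules` are hypothetical and are not esbuild internals. The sketch only shows that the caller gets a promise and the module is loaded on a later microtask, while everything still lives in one file.

```javascript
// Sketch of what `import(path)` is lowered to when no extra bundle is produced.
// `requireFn` stands in for the bundle's own `require` (an assumption here).
function lowerDynamicImport(requireFn, path) {
  return Promise.resolve().then(() => requireFn(path));
}

// Hypothetical registry of modules that were bundled into the same file.
const bundledModules = { "./math.js": { add: (a, b) => a + b } };
const loaded = [];
function fakeRequire(path) {
  loaded.push(path);
  return bundledModules[path];
}

const pending = lowerDynamicImport(fakeRequire, "./math.js");
// Nothing has been required yet: the callback runs on a later microtask.
const loadedSynchronously = loaded.length;
pending.then((mod) => console.log(mod.add(1, 2)));
```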

@tracker1 changed the title from "Code splitting on async import() statements." to "[Feature] Code splitting on async import() statements." on Feb 18, 2020
@jpmaga

jpmaga commented Apr 16, 2020

@evanw do you have any kind of roadmap somewhere for esbuild? I am particularly interested in this feature, and it would be cool to know where it is in terms of planning. Cheers.

@andrewvarga

This would be awesome to have!

@evanw
Owner

evanw commented May 17, 2020

I don't have a specific date but I'm currently focused on a rewrite of the bundler to enable code splitting, tree shaking, ES6 module export, and a few other features. I have to do these together because they are all interrelated.

I've done the R&D prototype to prove it out and I've settled on an approach. I'm currently working on doing the rewrite for real on a local branch. There's still a lot left to do to not break features I've added in the meantime (stdin/stdout support, transform API, etc) so it'll take a while. I have a lot of test failures to work through :)

I was worried about the performance hit because the graph analysis algorithms inherently reduce parallelism, but some early performance measurements seem to indicate that it won't slow it down that much, if any. I hope to ship this sometime in the next few weeks. We'll see how it goes!

@jpmaga

jpmaga commented May 18, 2020

> I don't have a specific date but I'm currently focused on a rewrite of the bundler to enable code splitting, tree shaking, ES6 module export, and a few other features. I have to do these together because they are all interrelated.
>
> I've done the R&D prototype to prove it out and I've settled on an approach. I'm currently working on doing the rewrite for real on a local branch. There's still a lot left to do to not break features I've added in the meantime (stdin/stdout support, transform API, etc) so it'll take a while. I have a lot of test failures to work through :)
>
> I was worried about the performance hit because the graph analysis algorithms inherently reduce parallelism, but some early performance measurements seem to indicate that it won't slow it down that much, if any. I hope to ship this sometime in the next few weeks. We'll see how it goes!

Damn! You're the man. This is the only thing I am missing to start using it in production, in smaller projects for starters, and see how it goes. PS: Have tested a couple locally, without code splitting, and everything worked flawlessly, even in one with a fairly large codebase using react and typescript. 👍

@ponsifiax

Hello here,
Do you have any news about this?
That's the last feature we need before we can use it in production 👍

@evanw
Owner

evanw commented Jun 23, 2020

> Do you have any news about this?

It's mostly working already. The chunk splitting analysis has already landed. All that's left is to bind imports and exports across chunks. I'm working on that in a branch and this will be my main focus soon.

@evanw
Owner

evanw commented Jun 30, 2020

I just released version 0.5.15 with an experimental version of code splitting. See the release notes for details. It's still a work in progress but it's far enough along now that it's ready for feedback. Please try it out and let me know what you think.

@garygreen

Excellent news! Thank you for all your hard work on this, Evan. Code splitting was vital for us. Does this code splitting feature split CSS imports into separate files and add them at runtime? Simple CSS support is the next main thing we are eagerly looking forward to.

@evanw
Owner

evanw commented Jun 30, 2020

> Simple CSS support is the next main thing we are eagerly looking forward to.

You and me both! CSS support is currently the next major feature I want to implement after code splitting. That’s tracked by a separate issue, however: #20.

@matthiasg

Works really well in initial testing. We will test more complicated setups (rush repo, nested pnpm deps) more fully in the coming weeks.

@evanw
Owner

evanw commented Jul 1, 2020

That's great to hear! Thanks so much for trying it out.

@evanw
Owner

evanw commented Jul 29, 2020

I have a small progress update on code splitting. From the release notes for the upcoming release (not out yet):

Code that is shared between multiple entry points is separated out into "chunk" files when code splitting is enabled. These files are named chunk.HASH.js where HASH is a string of characters derived from a hash (e.g. chunk.iJkFSV6U.js).

Previously the hash was computed from the paths of all entry points which needed that chunk. This was done because it was a simple way to ensure that each chunk was unique, since each chunk represents shared code from a unique set of entry points. But it meant that changing the contents of the chunk did not cause the chunk name to change.

Now the hash is computed from the contents of the chunk file instead. This better aligns esbuild with the behavior of other bundlers. If changing the contents of the file always causes the name to change, you can serve these files with a very large max-age so the browser knows to never re-request them from your server if they are already cached.

Note that the names of entry points do not currently contain a hash, so this optimization does not apply to entry points. Do not serve entry point files with a very large max-age or the browser may not re-request them even when they are updated. Including a hash in the names of entry point files has not been done in this release because that would be a breaking change. This release is an intermediate step to a state where all output file names contain content hashes.

The reason why this hasn't been done before now is because this change makes chunk generation more complex. Generating the contents of a chunk involves generating import statements for the other chunks which that chunk depends on. However, if chunk names now include a content hash, chunk generation must wait until the dependency chunks have finished. This more complex behavior has now been implemented.

Care was taken to still parallelize as much as possible despite parts of the code having to block. Each input file in a chunk is still printed to a string fully in parallel. Waiting was only introduced in the chunk assembly stage where input file strings are joined together. In practice, this change doesn't appear to have slowed down esbuild by a noticeable amount.
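
To make the difference between the two naming schemes concrete, here is a small sketch. It is not esbuild's code: the FNV-1a hash is a stand-in (the notes above do not say which hash esbuild uses), and `chunkNameFromEntryPoints` and `chunkNameFromContents` are hypothetical helper names.

```javascript
// Small non-cryptographic hash (FNV-1a), used only as a stand-in here.
function fnv1a(text) {
  let hash = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    hash ^= text.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0;
  }
  return hash.toString(36).padStart(7, "0");
}

// Old scheme: the name depends only on which entry points need the chunk.
function chunkNameFromEntryPoints(entryPoints) {
  return `chunk.${fnv1a([...entryPoints].sort().join("\n"))}.js`;
}

// New scheme: the name depends on the generated contents of the chunk.
function chunkNameFromContents(contents) {
  return `chunk.${fnv1a(contents)}.js`;
}

const entryPoints = ["src/a.js", "src/b.js"];
const before = "export const shared = 1;";
const after = "export const shared = 2;";

// Same entry points, edited contents: only the content hash notices the change.
const pathNameBefore = chunkNameFromEntryPoints(entryPoints);
const pathNameAfter = chunkNameFromEntryPoints(entryPoints);
const contentNameBefore = chunkNameFromContents(before);
const contentNameAfter = chunkNameFromContents(after);
console.log(pathNameBefore === pathNameAfter);       // true
console.log(contentNameBefore === contentNameAfter); // false
```

With the old scheme a cached `chunk.HASH.js` could go stale, because an edit to the shared code left its name unchanged. With the new scheme any edit produces a new name, which is what makes a very large max-age safe for chunk files.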

@matthiasg

@evanw Thanks a lot for this detailed write-up! This is the kind of information required for using a tool such as this.

@evanw
Owner

evanw commented Aug 11, 2020

Another code splitting update:

I finally got around to implementing per-chunk symbol renaming, which I view as required for the code splitting feature. I've made several attempts at this in the past but I haven't landed them because I don't want to severely regress performance (or memory usage, which I've started to also pay attention to). I finally figured out a good algorithm for doing per-chunk symbol renaming that's fast and parallelizable while not using too much memory. It's actually two algorithms, one when minifying and a different one when not minifying.

From the release notes:

Previously, bundling with code splitting assigned minified names using a single frequency distribution calculated across all chunks. This meant that typical code changes in one chunk would often cause the contents of all chunks to change, which negated some of the benefits of the browser cache.

Now symbol renaming (both minified and not minified) is done separately per chunk. It was challenging to implement this without making esbuild a lot slower and causing it to use a lot more memory. Symbol renaming has been mostly rewritten to accomplish this and appears to actually usually use a little less memory and run a bit faster than before, even for code splitting builds that generate a lot of chunks. In addition, minified chunks are now slightly smaller because a given minified name can now be reused by multiple chunks.

@guybedford
Contributor

@evanw it would be very interesting if you could expand somewhere on the exact symbol naming technique you converged on here. I'm sure it will make sense looking at the outputs too though of course.

@evanw
Owner

evanw commented Aug 11, 2020

> @evanw it would be very interesting if you could expand somewhere on the exact symbol naming technique you converged on here.

I just wrote up some documentation about the parallel symbol minification algorithm here.

The non-minified symbol renaming algorithm isn't described in the docs yet but it's pretty simple. Just rename symbols to avoid collisions by appending an increasing number to the name until there's no longer a collision. Each symbol will need to check for collisions in all parent scopes. Symbols in top-level scopes must be renamed in serial but symbols in nested scopes can be renamed in parallel.
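
The non-minified renaming rule described above can be sketched in a few lines. This is an illustration under stated assumptions, not esbuild's implementation: `createScope`, `isTaken`, and `declare` are hypothetical names, and starting the counter at 2 is a guess, since the comment only says "an increasing number".

```javascript
// A scope knows its own declared names and its parent scope.
function createScope(parent = null) {
  return { parent, names: new Set() };
}

// True if `name` is already taken in this scope or any parent scope.
function isTaken(scope, name) {
  for (let s = scope; s !== null; s = s.parent) {
    if (s.names.has(name)) return true;
  }
  return false;
}

// Declare `name` in `scope`, appending an increasing number on collision.
// Starting the counter at 2 is an assumption made for this sketch.
function declare(scope, name) {
  let candidate = name;
  for (let n = 2; isTaken(scope, candidate); n++) {
    candidate = `${name}${n}`;
  }
  scope.names.add(candidate);
  return candidate;
}

const topLevel = createScope();
const first = declare(topLevel, "render");  // "render"
const second = declare(topLevel, "render"); // "render2"
const nested = createScope(topLevel);
const third = declare(nested, "render");    // "render3"
```

Sibling nested scopes only read from their parents and write to their own name set, which is consistent with the remark that nested scopes can be renamed in parallel once the top level is settled.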

@mtsewrs

mtsewrs commented Oct 12, 2020

@evanw Do you plan on supporting code splitting with other formats apart from esm?

@evanw
Owner

evanw commented Oct 12, 2020

> @evanw Do you plan on supporting code splitting with other formats apart from esm?

Yes, that's why this issue is still open. However I want to fix issues with the current esm code splitting first: #399.

@DanielHeath

If the file contents are included in the hash, does that imply that circular references cannot be built (since each file contains a reference to another)?

Or is the hash calculated before rewriting the imported filenames?

@evanw
Owner

evanw commented Nov 25, 2020

> does that imply that circular references cannot be built (since each file contains a reference to another)?

Yes, code splitting currently generates an acyclic module graph.

The current automatic code splitting algorithm makes sure that a) a given piece of code only ever lives in one chunk and b) a given entry point doesn't import any code that it won't use. This means it generates one chunk for each unique overlap of entry points. So if there are three entry points A, B, and C, that means there could potentially be up to 7 chunks: A, B, C, A+B, A+C, B+C, and A+B+C. The chunk for A would only include code accessible by A but not by B or by C, the chunk A+B includes all code accessible by A and B but not by C, and the chunk A+B+C is for all code that is used by all entry points. Because of this structure, cyclic imports are not ever generated by construction. Two chunks wouldn't ever need to import each other because if they do reference each other, they would be considered a connected component in the graph and would have been written out as part of the same chunk.
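
The "one chunk per unique overlap of entry points" rule can be sketched as follows. This is a simplified illustration, not esbuild's code: it works on whole modules rather than on individual pieces of code, and `reachableFrom`, `assignChunks`, and the module graph are hypothetical.

```javascript
// Collect every module reachable from `entry` through static imports.
function reachableFrom(graph, entry) {
  const seen = new Set();
  const stack = [entry];
  while (stack.length > 0) {
    const id = stack.pop();
    if (seen.has(id)) continue;
    seen.add(id);
    for (const dep of graph[id] ?? []) stack.push(dep);
  }
  return seen;
}

// Group modules by the exact set of entry points that can reach them.
function assignChunks(graph, entryPoints) {
  const reachedBy = new Map(); // module id -> sorted list of entry points
  for (const entry of [...entryPoints].sort()) {
    for (const id of reachableFrom(graph, entry)) {
      if (!reachedBy.has(id)) reachedBy.set(id, []);
      reachedBy.get(id).push(entry);
    }
  }
  const chunks = new Map(); // "A+B" -> module ids
  for (const [id, entries] of reachedBy) {
    const key = entries.join("+");
    if (!chunks.has(key)) chunks.set(key, []);
    chunks.get(key).push(id);
  }
  return chunks;
}

// Hypothetical module graph: each key imports the modules in its array.
const graph = {
  A: ["onlyA", "sharedAB", "sharedAll"],
  B: ["sharedAB", "sharedAll"],
  C: ["sharedAll"],
};
const chunks = assignChunks(graph, ["A", "B", "C"]);
console.log(Object.fromEntries(chunks));
```

Here only five of the seven possible chunks appear, because nothing is shared by exactly A+C or B+C. With n entry points the upper bound is 2^n - 1 chunks.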

This automatic algorithm was a good experiment but it has some drawbacks. The main drawback is just that it's automatic. Many people want to have control over the algorithm in various ways. With many entry points, I'm sure you can see how the current algorithm can potentially generate a lot of chunks due to the combinatorial explosion of overlaps. People familiar with ESM have said that this is fine since the browser can handle a lot of chunk files (>100). Other people are turned off by the idea of having lots of generated chunks and have been requesting manual control over chunk files. Potentially people are just more used to fewer chunks from Webpack setups with manual chunk generation and lack of HTTP/2. I'm not sure what to think about the trade-offs between these approaches because I haven't done extensive performance analysis myself.

To implement manual chunk assignment you would need two things:

  1. You would need the ability to include code in the bundle that's guaranteed to never be used.

    For example, people may want to direct esbuild to turn a whole library into a single chunk even though not all of that library is used by all entry points. This will result in dead code. Right now this is impossible because esbuild's tree shaking algorithm automatically removes dead code. I'm currently designing a different linking model that will allow for keeping dead code while still keeping most of the benefits of ESM's static binding. It involves making module execution lazily-evaluated while still keeping module binding eagerly-evaluated. I'm not sure if this approach will work out but it seems hopeful.

  2. You would need the ability for chunks to potentially participate in an import cycle.

    Manual chunk assignment means esbuild can't generate an acyclic graph since code in a connected component may have multiple different manual chunk labels. I can think of two ways of linking cyclic chunks together. One way is to use dummy text for import paths, calculate all of the file hashes, then swap the dummy text for the real import paths. The file hashes will be "wrong" in that they won't be a hash of the ultimate file contents, but presumably it'd still be ok for cache invalidation as long as you mix in the hashes of all files involved in a cycle with each other. The other way is to pull out the hashes into an import map. That adds a level of indirection between the import paths and the actual hashed file names. It can lead to better caching because changing a dependency doesn't involve also changing the dependents, but import maps aren't a part of the web platform yet so this approach is presumably not viable for a while.
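
One possible reading of the path-embedding idea (dummy text first, hashes second, real paths last) is sketched below. Everything here is an assumption made for illustration: the placeholder format, the FNV-1a stand-in hash, and the way the cycle members' hashes are mixed together are not taken from esbuild.

```javascript
// Stand-in hash (FNV-1a); any stable hash would do for this sketch.
function fnv1a(text) {
  let hash = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    hash ^= text.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0;
  }
  return hash.toString(36).padStart(7, "0");
}

// Two chunks that import each other. Import paths start out as dummy text.
const chunks = {
  a: { placeholder: "__CHUNK_A__", code: 'import "__CHUNK_B__";\nexport const a = 1;' },
  b: { placeholder: "__CHUNK_B__", code: 'import "__CHUNK_A__";\nexport const b = 2;' },
};
const ids = Object.keys(chunks).sort();

// Step 1: hash each chunk while it still contains the dummy paths.
const ownHash = {};
for (const id of ids) ownHash[id] = fnv1a(chunks[id].code);

// Step 2: mix in the hashes of every chunk in the cycle (here: both chunks),
// so that editing one member also renames the others.
const cycleHash = fnv1a(ids.map((id) => ownHash[id]).join(""));
const fileName = {};
for (const id of ids) fileName[id] = `chunk.${fnv1a(ownHash[id] + cycleHash)}.js`;

// Step 3: swap the dummy text for the real import paths.
const output = {};
for (const id of ids) {
  let code = chunks[id].code;
  for (const otherId of ids) {
    code = code.split(chunks[otherId].placeholder).join(`./${fileName[otherId]}`);
  }
  output[fileName[id]] = code;
}
console.log(output);
```

As noted above, the resulting names are not hashes of the final file contents. They still change whenever any chunk in the cycle changes, which is the property cache invalidation needs.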

That's where my thinking is at the moment. I'm currently in the design phase for the next version of code splitting. The next iteration should hopefully finish the code splitting feature. I want to address the current known import ordering bug, get code splitting working for the cjs and iife formats, and potentially also implement manual chunk assignment. And it'd be really great to do top-level await too, although I may punt on that.

Edit: part of why I'm posting this is that I'm curious what people think about the path embedding approach vs. the import map approach.

@pft

pft commented Nov 3, 2021

Code splitting with dynamic import() of a JSON file that has key names at the root that would not be valid JavaScript identifiers yields incorrect named exports:

File a.json:

{ "x-y": "foo" }

File imp.js:

const getJSON = () => import("./a.json");
getJSON();

Build an esm bundle with splitting:

[user@dom0 ~]$ esbuild --splitting --bundle --format=esm --outdir=app imp.js

Output:

File app/a-ZAVKVQOM.js:

// a.json
var x_y = "foo";
var a_default = { "x-y": x_y };
export {
  a_default as default,
  x_y as "x-y"
};

File app/imp.js:

// imp.js
var getJSON = () => import("./a-ZAVKVQOM.js");
getJSON();

I propose to simply not try and export those fields separately.

By the way, the JSON module proposal does not do named exports at all for JSON files, precisely because of this reason (and because it's conceptually a single thing); this reasoning is at the bottom of that page.

@evanw
Owner

evanw commented Dec 14, 2021

> Code splitting with dynamic import() of a JSON file that has key names at the root that would not be valid JavaScript identifiers yields incorrect named exports:
>
> // a.json
> var x_y = "foo";
> var a_default = { "x-y": x_y };
> export {
>   a_default as default,
>   x_y as "x-y"
> };

This is perfectly valid JavaScript. It uses a new JavaScript syntax feature called Arbitrary Module Namespace Identifiers. I can understand the confusion because this feature was somehow added even though it bypassed the TC39 proposal process, and was therefore not ever really announced despite being a significant addition to the language. But it has already been added to the ECMAScript specification and support for it has shipped in Chrome 90+, Firefox 87+, and node 16+. It's a real JavaScript language feature. As with all new JavaScript language features, you need to make sure to set esbuild's --target= setting appropriately to tell esbuild to not use syntax features that are newer than what your target environment supports. For example, if you pass --target=node14 the x-y export will not be generated.

> I propose to simply not try and export those fields separately.

It's true that this is a bundler-specific extension, and not part of a standard. Node doesn't behave this way for example. But it's a useful extension because it lets you import specific fields from the JSON file without importing the whole thing. For example, you can import { version } from './package.json' and all fields except version will be tree-shaken away. With the Arbitrary Module Namespace Identifiers feature you can also import { 'x-y' as x_y } from './a.json' if you need to.

@pft

pft commented Dec 17, 2021

Thanks for clarifying @evanw, about this new spec and how to deal with it if the intended environment does not support it.

One question though: In dynamic imports, there is no syntax to import stuff like that, or am I missing something?

@hyrious

hyrious commented Dec 17, 2021

@pft Dynamic imports:

var { "x-y": x_y } = await import("./b.mjs")
console.log(x_y)

@pft

pft commented Dec 17, 2021

@hyrious Wow. And, just for completeness' sake, this works too:

import("./a.js").then(({"x-y": x_y}) => console.log(x_y));

@hyrious

hyrious commented Dec 30, 2021

@pablo-mayrgundter You need to use a newer target which supports dynamic import (import()), simply edit that field to esnext or something else.

@The-Code-Monkey

@evanw I have a different use case from the ones above. I have this dynamic import:

const getIcon = (name) => {
  return lazy(() => import(`./icons/${name}`));
};

The problem is that esbuild is bundling all the icons into the index.js file rather than keeping them in separate files. Is it possible to tell esbuild to leave this import as it is?

@weilandia

weilandia commented Aug 18, 2022

Trying to track down a recent update on "Code splitting is still a work in progress. It currently only works with the esm output format. There is also a known ordering issue with import statements across code splitting chunks."

Is a fix for these issues planned?

@mattfysh

mattfysh commented Oct 9, 2022

I switched to code splitting on dynamic imports but now I've come across a strange bug, has anyone seen this before?

const { parse } = await import('css-what')
parse(selector)

css-what has both ESM and CJS code internally as well as both main and module package.json entries pointing to the relevant file, but for some reason esbuild is using the CJS code instead of ESM. When I edit the package.json and point main to the same place as module, the code starts working again.

I don't think this is a bug with css-what, but I could be wrong

@hyrious

hyrious commented Oct 9, 2022

@mattfysh If you're using node.js to run this code, then this is because node.js ignores the module field. More details at doc: main-fields. You can import the ESM file directly with:

import {parse} from "css-what/lib/es/index.js"

If you're using esbuild to bundle the code with --platform=node, the reason is the same because esbuild is trying to behave the same as node.js. You can try to add --main-fields=module,main to your build script.
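
As a simplified picture of why the order of main fields matters, the sketch below picks the first field a package defines. `pickMainField` and the example package are hypothetical, and real resolution involves more than this (for example the `exports` field), so treat it as an illustration only.

```javascript
// Return the first main field that the package defines, in priority order.
// This ignores the "exports" field and every other resolution rule.
function pickMainField(pkg, mainFields) {
  for (const field of mainFields) {
    if (typeof pkg[field] === "string") return pkg[field];
  }
  return "index.js"; // conventional fallback when no field is present
}

// Hypothetical package.json with both a CommonJS and an ESM entry.
const pkg = {
  name: "example-lib",
  main: "lib/commonjs/index.js",
  module: "lib/es/index.js",
};

const nodeStyle = pickMainField(pkg, ["main", "module"]);
const esmFirst = pickMainField(pkg, ["module", "main"]);
console.log(nodeStyle); // lib/commonjs/index.js
console.log(esmFirst);  // lib/es/index.js
```

Putting `module` first, as `--main-fields=module,main` does, is what makes the ESM entry win when a package ships both.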

For package authors: If you really want users to use the native ESM way to use your package, at least do this:

"exports": {
	"node": {
		"import": "./dist/index.mjs",
		"require": "./dist/index.js"
	},
	"default": "./dist/index.mjs"
}

More details at doc: how-conditions-work.
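
A rough sketch of how such a conditional `exports` map is evaluated: keys are checked in order, and the first one that is an active condition (or `default`) wins. `resolveConditions` is a hypothetical helper that leaves out subpaths, arrays, and `null` targets.

```javascript
// Resolve a conditional "exports" value: walk the keys in order and take the
// first one that is an active condition (or "default"), recursing into objects.
function resolveConditions(target, conditions) {
  if (typeof target === "string") return target;
  for (const [key, value] of Object.entries(target)) {
    if (key === "default" || conditions.has(key)) {
      const resolved = resolveConditions(value, conditions);
      if (resolved !== undefined) return resolved;
    }
  }
  return undefined;
}

const exportsField = {
  node: { import: "./dist/index.mjs", require: "./dist/index.js" },
  default: "./dist/index.mjs",
};

const viaRequire = resolveConditions(exportsField, new Set(["node", "require"]));
const viaImport = resolveConditions(exportsField, new Set(["node", "import"]));
const inBrowser = resolveConditions(exportsField, new Set(["browser", "import"]));
console.log(viaRequire); // ./dist/index.js
console.log(viaImport);  // ./dist/index.mjs
console.log(inBrowser);  // ./dist/index.mjs
```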

@mattfysh

mattfysh commented Oct 9, 2022

thanks @hyrious - the fix for my case was to use the --main-fields flag, thanks! One other thing I've noticed is the size of my output directory is much larger, I'm guessing that no tree shaking occurs when using code splitting and dynamic imports?

@eamodio

eamodio commented Mar 4, 2023

Is there any way to control the code-splitting to only look at async/dynamic imports?

@evanw
Owner

evanw commented Mar 5, 2023

If you bundle each entry point separately, then entry points won’t share any code.

@eamodio

eamodio commented Mar 5, 2023

> If you bundle each entry point separately, then entry points won’t share any code.

Not sure I fully understand that, but I tried setting up separate entry points for a couple of my dynamic import() calls and the main entry point still fully bundled everything, and then it created separate bundles for those imports (but they weren't used).

And if I try to use splitting then I still get WAY too many split files

@JounQin

JounQin commented Apr 30, 2023

cjs can also use import(path) inside, so I'm wondering why splitting can only be enabled with the esm format. What I mean is that even with the cjs output format, the dynamic chunks could still be esm. Of course, the correct extensions (.cjs vs .mjs) should be applied in this case.

@evanw

@mxdvl

mxdvl commented Apr 30, 2023

> If you bundle each entry point separately, then entry points won’t share any code.

> And if I try to use splitting then I still get WAY too many split files

One important caveat is that if you're importing entry points dynamically, you need to make sure that they are marked as external.

For example, if → indicates a dynamic import and your entry points are A, D & C:

A → B → C
D → C
C

You would need to make sure that your onResolve callback marks dynamic imports of A, D & C as external.
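
A minimal sketch of such a callback is shown below, written as a plugin-shaped object so that it runs without esbuild installed. The plugin name and the fake `build` object are hypothetical. It relies on esbuild's `onResolve` hook, where `args.kind` is `"dynamic-import"` for `import()` expressions and returning `external: true` keeps the import out of the bundle. Note that this version externalizes every dynamic import, so a real setup would probably narrow the filter to the entry points.

```javascript
// Plugin-shaped object using esbuild's onResolve hook.
const externalizeDynamicImports = {
  name: "externalize-dynamic-imports",
  setup(build) {
    build.onResolve({ filter: /.*/ }, (args) => {
      if (args.kind !== "dynamic-import") return undefined; // default handling
      // Leave the import() in the output instead of bundling its target.
      return { path: args.path, external: true };
    });
  },
};

// Minimal fake of the `build` object, only to exercise the callback here.
let onResolveCallback;
externalizeDynamicImports.setup({
  onResolve(_options, callback) {
    onResolveCallback = callback;
  },
});
const dynamicResult = onResolveCallback({ path: "./c.js", kind: "dynamic-import" });
const staticResult = onResolveCallback({ path: "./b.js", kind: "import-statement" });
console.log(dynamicResult, staticResult);
```

Because the path is passed through unchanged, the emitted `import("./c.js")` has to resolve at runtime relative to the output file, so the separately built entry points need matching output locations.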

@Murtatrxx

Is this still WIP?

@millsp

millsp commented Nov 17, 2023

Initially meant to be posted here #1341

In my case, I needed to bundle dependencies and chunk them, while keeping the output files separate (not one big bundle). All that works except that only format: 'esm' is currently supported, so I wrote a plugin to transpile to CJS again 🙃.

Definitely not ideal, but I can live with it.

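import * as esbuild from 'esbuild'

// Re-runs esbuild over the ESM output files to convert them to CJS in place.
// Requires `metafile: true` on the outer build; otherwise result.metafile is
// undefined and nothing is converted.
export const esmSplitCodeToCjs: esbuild.Plugin = {
  name: 'esmSplitCodeToCjs',
  setup(build) {
    build.onEnd(async (result) => {
      const outFiles = Object.keys(result.metafile?.outputs ?? {})
      // Only JavaScript outputs are converted (source maps etc. are skipped)
      const jsFiles = outFiles.filter((f) => f.endsWith('js'))

      // No `bundle: true` here, so each file is converted on its own
      await esbuild.build({
        outdir: build.initialOptions.outdir,
        entryPoints: jsFiles,
        allowOverwrite: true, // output paths are the same as the input paths
        format: 'cjs',
        logLevel: 'error',
      })
    })
  },
}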
