[Case Study] Improving how yarn and yarn plugins are distributed #2117

Open
andreialecu opened this issue Nov 12, 2020 · 7 comments
Labels: case study Package compatibility report

@andreialecu
Contributor

What package is covered by this investigation?

@yarnpkg/berry

Describe the goal of the investigation

Investigate moving away from webpack and completely into zip-file based bundling for yarn itself, and for plugins.

Investigation report

While experimenting with Node.js worker_threads, it quickly became apparent that trying to work with them in the yarn codebase, or in plugins, is infeasible because of the single-file philosophy enforced by webpack bundling.

There are a bunch of problems and inefficiencies with this approach:

  • Worker threads generally require putting the worker code in a separate file.
  • Additionally, webpack itself gets in the way and makes things difficult.
  • Every time yarn is executed, the entire bundle is parsed, regardless of which code paths are actually relevant to the command being run.
  • The bundle could be smaller.
  • Code is unnecessarily obfuscated.
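
The parsing cost in the third point comes from evaluating every command's code up front. A minimal sketch of the alternative, deferring `require` until a command is actually dispatched, might look like the following. The command names and the `loadCommand` helper are hypothetical, and Node built-ins stand in for what would be separate per-command files in an unbundled distribution; this is not how yarn dispatches commands today.

```javascript
// Hypothetical command table: each entry is a thunk, so the module behind it
// is only resolved and evaluated when that command is actually requested.
// Node built-ins stand in for per-command files (an assumption for this sketch).
const lazyCommands = {
  hash: () => require("node:crypto"),
  gzip: () => require("node:zlib"),
};

// Names of the commands whose modules have been requested so far.
const loadedCommands = new Set();

function loadCommand(name) {
  if (!Object.prototype.hasOwnProperty.call(lazyCommands, name)) {
    throw new Error(`Unknown command: ${name}`);
  }
  loadedCommands.add(name);
  // require() caches modules, so repeated calls do not re-evaluate them.
  return lazyCommands[name]();
}
```

Calling `loadCommand("gzip")` touches only the `gzip` entry; the module behind `hash` is never requested unless that command is run.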

I think it would be worth investigating the following approach:

  • Instead of bundling all the code into a single JS file, use yarn pack to bundle, and distribute plugins and yarn itself as zip files.
  • Distribute a small wrapper that can read the yarn zip file and execute it (e.g. run-yarn.cjs).
  • Distribute yarn itself as a zip bundle (yarn-berry.zip) of separate, unobfuscated, non-bundled files. Optionally tree-shaken in-place via something like https://rollupjs.org/guide/en/#outputpreservemodules
  • All plugins themselves would also be zip files, reducing support cost by making @yarnpkg/builder unnecessary.
  • By allowing plugins to be distributed as zip files, they could also be installed from a package registry by default, via the normal yarn dependency installation mechanism.
  • Workarounds for packing assets inside plugins and/or yarn itself will no longer be necessary. They can simply be files inside the plugin zip.

Benefits:

  • Decrease perceived bundle size for both yarn and yarn plugins.
  • Make it easier to use nodejs features like worker threads.
  • Increase performance by not having to parse a huge blob of JS for every yarn execution.
  • Reduce code obfuscation, and thus increase security.
  • Allows packing assets as normal files, no longer needing custom scripts to bundle them as base64 encoded brotli blobs.
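
For context on the last point, embedding an asset in a single-file bundle as a base64-encoded brotli blob can be sketched roughly as below. The `encodeAsset` and `decodeAsset` names are hypothetical and this is not yarn's actual build script; it only illustrates the round trip that plain files inside a zip would make unnecessary.

```javascript
const zlib = require("node:zlib");

// Build step (hypothetical helper): compress an asset and encode it so it can
// be pasted into a single-file bundle as a string literal.
function encodeAsset(content) {
  return zlib.brotliCompressSync(Buffer.from(content)).toString("base64");
}

// Runtime step (hypothetical helper): decode the embedded string back into
// the original bytes before the asset can be used.
function decodeAsset(blob) {
  return zlib.brotliDecompressSync(Buffer.from(blob, "base64"));
}
```

Every asset therefore pays for a decode and decompress step at runtime, and the build needs a custom script to produce the blob in the first place.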
@andreialecu andreialecu added the case study Package compatibility report label Nov 12, 2020
@arcanis
Member

arcanis commented Nov 12, 2020

We support zip access for the packages because that's critical to the package manager model itself. By contrast, zip support for plugins would be "nice to have", but not critical. In this context, I'd prefer such a "package bundle" initiative to wait until Node supports this kind of workflow by default (cf. https://github.com/WICG/webpackage).

> Worker threads generally require putting the worker code in a separate file.

They support data URLs. While it's impractical, you can bundle your worker into a JS string that you then encode into your "real" plugin bundle.
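
A minimal sketch of that data URL approach, assuming a Node.js version whose `Worker` constructor accepts `data:` URLs. The `runInlineWorker` helper and the worker body are hypothetical; in a real bundle the string would be produced by the build step.

```javascript
const { Worker } = require("node:worker_threads");

// Worker code kept as a string, as it might be after being inlined into a
// single-file bundle. Workers started from a data: URL are loaded through the
// ES module loader, so the worker source uses `import` rather than `require`.
const workerSource = `
  import { parentPort, workerData } from "node:worker_threads";
  parentPort.postMessage(workerData.a + workerData.b);
`;

// Hypothetical helper: start a worker from a source string and resolve with
// the first message it posts back.
function runInlineWorker(source, workerData) {
  const encoded = Buffer.from(source).toString("base64");
  const url = new URL(`data:text/javascript;base64,${encoded}`);
  return new Promise((resolve, reject) => {
    const worker = new Worker(url, { workerData });
    worker.once("message", resolve);
    worker.once("error", reject);
    worker.once("exit", (code) => {
      if (code !== 0) reject(new Error(`Worker exited with code ${code}`));
    });
  });
}
```

Usage would be along the lines of `runInlineWorker(workerSource, { a: 1, b: 2 }).then(console.log)`. The impractical part is that the worker cannot import files relative to the bundle, so everything it needs has to be inlined into the string.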

@andreialecu
Contributor Author

While this is also important for plugins, it's not just for plugins.

The goal is to simplify the developer experience for both core yarn and for plugin developers, while also reducing repo size for end-users, and greatly improving yarn core performance both via parallelization, and partly by giving V8 less code to parse for common commands.

@arcanis
Member

arcanis commented Nov 12, 2020

I have some experience on the Yarn core 🙂, and this will not help. At best it would decrease the startup time, but in a way that would lead to the millisecond speedups being immediately lost to the time it takes to boot the wasm libzip and access the files. To be a net positive in terms of perfs, it requires a cooperative interpreter, hence nodejs/node#1278.

That being said, the thread on Node is currently closed; I think it's a good time to reopen it / open a new one, if only to express that the ecosystem would now benefit from this in various aspects that didn't exist in 2016.

@andreialecu
Contributor Author

Startup time is not really a big deal anyway, that's why I said partly 😆

Everything else is much more important.

@andreialecu
Contributor Author

andreialecu commented Nov 12, 2020

Just for kicks, here's a benchmark using hyperfine:

I assume not much work is being done to get the version:

$ hyperfine 'yarn -v'

Benchmark #1: yarn -v
  Time (mean ± σ):     233.8 ms ±   4.9 ms    [User: 228.1 ms, System: 40.5 ms]
  Range (min … max):   228.2 ms … 245.6 ms    12 runs

This uses lodash from inside a zip file, as per default PnP behavior (so it boots libzip):

// index.js:
const add = require("lodash/add");
console.log(add(1, 1));

$ hyperfine 'node -r ./.pnp.js index.js'

Benchmark #1: node -r ./.pnp.js index.js
  Time (mean ± σ):      89.4 ms ±   2.4 ms    [User: 222.1 ms, System: 17.6 ms]
  Range (min … max):    84.9 ms …  94.6 ms    33 runs

Looks like simply running yarn -v has roughly 145 ms of extra startup delay (233.8 ms vs. 89.4 ms).

@arcanis
Member

arcanis commented Nov 12, 2020

> I assume not much work is being done to get the version

More than you'd think, since we need to locate the configuration to find the yarnPath field, and potentially shell out to a secondary process.

@andreialecu
Contributor Author

Indeed, I guess it was parsing both yarn 1 and 2 via the global yarn command. This should prevent v1 from interfering:

$ hyperfine '.yarn/releases/yarn-berry.cjs -v'

Benchmark #1: .yarn/releases/yarn-berry.cjs -v
  Time (mean ± σ):     167.0 ms ±   1.5 ms    [User: 180.3 ms, System: 17.8 ms]
  Range (min … max):   164.4 ms … 169.5 ms    18 runs

Still about 80 ms extra (167.0 ms vs. 89.4 ms) compared with parsing .pnp.js (which is pretty large) and booting libzip.
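
For anyone wanting to reproduce this kind of comparison without hyperfine, a rough sketch in Node follows. The `benchmark` helper is hypothetical, and it does not replicate hyperfine's methodology (no warmup runs, no shell calibration), so absolute numbers will differ.

```javascript
const { spawnSync } = require("node:child_process");

// Hypothetical helper: run `command args...` `runs` times and return
// wall-clock statistics in milliseconds.
function benchmark(command, args, runs = 10) {
  const samples = [];
  for (let i = 0; i < runs; i++) {
    const start = process.hrtime.bigint();
    const result = spawnSync(command, args, { stdio: "ignore" });
    const end = process.hrtime.bigint();
    if (result.error) throw result.error;
    samples.push(Number(end - start) / 1e6); // nanoseconds to milliseconds
  }
  const mean = samples.reduce((sum, s) => sum + s, 0) / runs;
  // Sample standard deviation; defined as 0 when there is a single run.
  const variance =
    runs > 1
      ? samples.reduce((acc, s) => acc + (s - mean) ** 2, 0) / (runs - 1)
      : 0;
  return {
    mean,
    stddev: Math.sqrt(variance),
    min: Math.min(...samples),
    max: Math.max(...samples),
    runs,
  };
}
```

For example, `benchmark(process.execPath, ["-r", "./.pnp.js", "index.js"])` and `benchmark(process.execPath, [".yarn/releases/yarn-berry.cjs", "-v"])` would give the two means to subtract.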
