Skip to content
This repository has been archived by the owner on Mar 3, 2023. It is now read-only.

Introduce scheme to npm dedupe npm dependencies of Atom packages #8217

Closed
bolinfest opened this issue Aug 4, 2015 · 11 comments
Closed

Introduce scheme to npm dedupe npm dependencies of Atom packages #8217

bolinfest opened this issue Aug 4, 2015 · 11 comments

Comments

@bolinfest
Copy link
Contributor

Currently, the organization of the ~/.atom/packages directory makes it impossible to dedupe dependencies, potentially causing a lot of bloat.

Specifically, in the case of Nuclide, we have our nuclide-installer package that is used to install our collection of Nuclide packages. In the most recent release of Nuclide (v0.0.26), there are 17 such packages to install: https://github.com/facebooknuclideapm/nuclide-installer/blob/v0.0.26/lib/config.json.

The way nuclide-installer works is that it starts up and then runs apm install for each package in that config.json file. On my machine (where ~/.npm was already populated with all of the dependencies used by these packages), it took nuclide-installer 6 minutes to install all 17 Atom packages and the resulting ~/.atom/packages directory was 903M! That's enormous! This is bad for several reasons:

  • 903M is an order of magnitude more disk space than is necessary if dependencies could be de-duped.
  • The installation process is much slower than it has to be.
  • Much more JavaScript code has to be loaded at runtime.
  • I suspect that the JavaScript in ~/.atom/packages is not loaded as efficiently as the JavaScript in /Applications/Atom.app/Contents/Resources/app.asar, putting third-party code at even more of a disadvantage.

I suspect that the only thing I can do right now without any changes to Atom is have ~/.atom/nuclide-installer create a special directory structure under itself. Maybe something like:

~/.atom/nuclide-installer/node_modules/
~/.atom/nuclide-installer/node_modules/atom_packages/

where npm packages common to multiple Nuclide Atom packages would be written to ~/.atom/nuclide-installer/node_modules/ (obviously there could only be one version per common dependency) and then each Atom package would be in its own directory under ~/.atom/nuclide-installer/node_modules/atom_packages/. Under that scheme, only an npm dependency that already had an entry under ~/.atom/nuclide-installer/node_modules/ would also have to exist under ~/.atom/nuclide-installer/node_modules/atom_packages/<ATOM PACKAGE>/node_modules if a different version were required.

Once everything was in place, nuclide-installer would run apm link for each entry in ~/.atom/nuclide-installer/node_modules/atom_packages.

While I'm rubber ducking, I would probably add another directory of indirection so that the directory structure would be parameterized by version number:

~/.atom/nuclide-installer/v0.0.26/node_modules/
~/.atom/nuclide-installer/v0.0.26/node_modules/atom_packages/

This way, nuclide-installer could be writing out v0.0.27 and would not switch over until all of the files were in place.

This is really important for Nuclide, so I'm willing to hack this into nuclide-installer, but it would be nice if there were a way for Atom to support this natively so that others could also provide package bundles efficiently for Atom.

@anaisbetts
Copy link
Contributor

I suspect that the JavaScript in ~/.atom/packages is not loaded as efficiently as the JavaScript in /Applications/Atom.app/Contents/Resources/app.asar, putting third-party code at even more of a disadvantage.

This is definitely true, especially on Windows, calling CreateFile on like 82023402343x JS files is pretty bad for perf

@nathansobo
Copy link
Contributor

We could de-dupe modules common to all packages by storing them in ~/.atom/packages/node_modules. Of course, this will only work when all packages share a common version. I don't see a great way of automatically grouping packages into directories as you're doing for the special case of Nuclide, however.

This really makes me wonder about doing something that isn't based on the file system in terms of reusing required modules across the entire application. The problem with such a change is it breaks the simple semantics of Node's require system, though we've already gone a step toward that with asar bundles.

@bolinfest
Copy link
Contributor Author

Hmm, if we had:

~/.atom/packages/hyperclick

as a symlink to (that was created via apm link):

~/.atom/packages/nuclide-installer/v0.0.27/node_modules/atom_packages/hyperclick

would it look in ~/.atom/packages/nuclide-installer/v0.0.27/node_modules for dependencies? Maybe we would have to use ~/.atom/packages/node_modules or ~/.atom/node_modules after all.

The thing I dislike about that is that while in the process of going from v0.0.26 to v0.0.27, the versions of our nuclide- packages in ~/.atom/node_modules would not match what was needed by the Atom packages during that transition. If that process fails, then things are in a bad state.

@bolinfest
Copy link
Contributor Author

@nathansobo Is there any way that third-party packages can leverage asar bundles to get some sort of benefit?

@nathansobo
Copy link
Contributor

@nathansobo Is there any way that third-party packages can leverage asar bundles to get some sort of benefit?

I haven't investigated but it sounds promising. @kevinsawicki would have more information about what it would take to create an asar bundle for multiple or individual packages.

@methodbox
Copy link

Why don't you check for an existing npm package that matches during your install? If the version check passes, use the existing module, if not run apm install.

I've also run into similar issues where I need to use different versions of some packages and the same on others. I created a symlink between folders so the program didn't know it was checking another folder.

@bolinfest
Copy link
Contributor Author

@methodbox The problem is that if you are using another package's npm dependency, and that package upgrades itself, it could break you, no?

@UnsungHero97
Copy link

any updates?

@dhaupin
Copy link

dhaupin commented Dec 30, 2015

Not trying to beat a known horse (src install should work, right), but I'm running into install time far longer than 6 mins due to capped I/O when trying nuclide-installer. Although it hasn't fully installed yet, its happening on Windows 10 via Atom GUI. I started the install at aprx ~10:20am, and now it's a little over halfway done at ~11:40am. Considering its mentioned that there are only 17 packages within that meta plugin, this seems wonked.

"Evented I/O for V8 JavaScript (32 bit)" seems to be the culprit using 100% disk. Im seeing it cap out above 6 MB/s with a sustained CPU between 30-80%. Considering this is a spinner drive (100+gb free), its very much affecting the performance of the whole PC.

Albeit it non SSD, this is a 64bit 8gb 3k machine, it's not like it's too underpowered to run an editor. I concur it's all the files in nuclide-installer meta, although perhaps Evented I/O running 32bit instead of 64bit could compound that effect.

evented

EDIT: after installing, Atom crashes out totally spikes I/O way higher peaking 15 MB/s

@stale
Copy link

stale bot commented Aug 29, 2017

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@stale stale bot closed this as completed Sep 12, 2017
@lock
Copy link

lock bot commented Mar 30, 2018

This issue has been automatically locked since there has not been any recent activity after it was closed. If you can still reproduce this issue in Safe Mode then please open a new issue and fill out the entire issue template to ensure that we have enough information to address your issue. Thanks!

@lock lock bot locked and limited conversation to collaborators Mar 30, 2018
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

No branches or pull requests

7 participants