Bundle/Include Dependencies #4210

Open
arei opened this Issue Nov 27, 2013 · 26 comments
@arei

Feature Description

Create a means within npm of generating a .tgz representation of an npm package AND all of its dependencies, such that npm install can be run against the .tgz representation to install the package AND all of its dependencies.

This involves two separate and distinct aspects:

  1. Add an ability to npm to create a .tgz file that contains a single npm package in its uninstalled form and its entire dependency tree in their uninstalled forms. See below for additional details.

  2. Change npm install <tarball file> to understand that a <tarball file> may contain the package and the cache from which all dependencies should be installed. See below for additional details.

This is a rewrite of the closed ticket #3429.

Reason for Feature Request

The reason for this feature request is to provide a means for npm to install a given package in an offline environment, without relying on the npm registry being available online. By using a bundled npm package tarball that contains the package and all necessary dependencies, npm is not forced to go out to the internet and find the required dependencies. This functionality is desirable for isolated/off-the-grid systems and for package owners who wish to distribute their package as a complete unit.

Consider this simple problem case: Install socket.io on a system that is 100% offline and not connected to the internet. (The target system has node/npm installed already.)

(In the process described below "LIVE" indicates a system with an internet connection and "OFFLINE" indicates a system without an internet connection.)

Your first attempt might go something like this:

  1. [LIVE] Create a new directory and change to it.
  2. [LIVE] Install socket.io using npm: npm install socket.io
  3. [LIVE] tar the node_modules directory.
  4. [LIVE] copy the tar to some removable media like a memory stick.
  5. [OFFLINE] move the memory stick to the remote offline system.
  6. [OFFLINE] copy the tar from the removable media.
  7. [OFFLINE] untar the node_modules

The problem with this is that this did not actually install socket.io on the remote system, but just duplicated the files of an installed socket.io. This ignores any special install process instructions that socket.io might want to perform on the system to which it is installed. In some cases it's good enough, but it's not the right answer.

Your next attempt might be to get the <tarball file> of socket.io and use that.

  1. [LIVE] Create a new directory and change to it.
  2. [LIVE] Get the 'socket.io' tarball using npm pack: npm pack socket.io
  3. [LIVE] copy the tar to some removable media like a memory stick.
  4. [OFFLINE] move the memory stick to the remote offline system.
  5. [OFFLINE] copy the tar from the removable media.
  6. [OFFLINE] Install socket.io from the tarball: npm install socket.io-0.9.16.tgz
  7. [OFFLINE] Stare confusedly at:
npm WARN optional dep failed, continuing redis@0.7.3
npm ERR! Error: connect ECONNREFUSED
npm ERR!     at errnoException (net.js:901:11)
npm ERR!     at Object.afterConnect [as oncomplete] (net.js:892:19)
npm ERR!  { [Error: connect ECONNREFUSED]
npm ERR!   code: 'ECONNREFUSED',
npm ERR!   errno: 'ECONNREFUSED',
npm ERR!   syscall: 'connect' }

npm is attempting to go out and get the dependencies from http://registry.npmjs.org which clearly will not work on our remote offline system.

Okay, so you can just get and install all the dependencies. Except there isn't a way to list the dependencies of an npm package from npm. The best way to get the dependency tree in node is to actually install the npm package, so let's do that.

  1. [LIVE] Create a new directory and change to it.
  2. [LIVE] Install socket.io using npm: npm install socket.io
  3. [LIVE] Make a note of everything that was installed: socket.io, socket.io-client/0.9.16, policyfile/0.0.4, base64id/0.1.0, redis/0.7.3, uglify-js/1.2.5, ws, xmlhttprequest/1.4.2, active-x-obfuscator/0.0.1, zeparser/0.0.5, commander, tinycolor, nan, and options
  4. [LIVE] For each item installed, go get the <tarball file> using npm pack. Don't forget to change the '/' to an '@' for packages that are given with a specific version.
  5. [LIVE] npm pack socket.io
  6. [LIVE] npm pack socket.io-client@0.9.16
  7. [LIVE] npm pack policyfile@0.0.4
  8. [LIVE] npm pack base64id@0.1.0
  9. [LIVE] npm pack redis@0.7.3
  10. [LIVE] npm pack uglify-js@1.2.5
  11. [LIVE] npm pack ws
  12. [LIVE] npm pack xmlhttprequest@1.4.2
  13. [LIVE] npm pack active-x-obfuscator@0.0.1
  14. [LIVE] npm pack zeparser@0.0.5
  15. [LIVE] npm pack commander
  16. [LIVE] npm pack tinycolor
  17. [LIVE] npm pack nan
  18. [LIVE] npm pack options
  19. [LIVE] copy the tars to some removable media like a memory stick.
  20. [OFFLINE] move the memory stick to the remote offline system.
  21. [OFFLINE] copy the tars from the removable media.
  22. [OFFLINE] Install socket.io from the tarball: npm install socket.io-0.9.16.tgz
  23. [OFFLINE] Stare confusedly at:
npm WARN optional dep failed, continuing redis@0.7.3
npm ERR! Error: connect ECONNREFUSED
npm ERR!     at errnoException (net.js:901:11)
npm ERR!     at Object.afterConnect [as oncomplete] (net.js:892:19)
npm ERR!  { [Error: connect ECONNREFUSED]
npm ERR!   code: 'ECONNREFUSED',
npm ERR!   errno: 'ECONNREFUSED',
npm ERR!   syscall: 'connect' }
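Step 4 above (turning each "name/version" note into an npm pack argument) is easy to get wrong by hand. A minimal sketch of scripting it follows; toPackSpec and packCommands are hypothetical helper names, not part of npm, and the sketch assumes unscoped package names as in the list above.

```javascript
// Convert install notes such as "redis/0.7.3" or "ws" into `npm pack` commands.
// Hypothetical helpers for illustration only; assumes unscoped package names.
function toPackSpec(note) {
  // "name/version" becomes "name@version"; a bare "name" is left unchanged.
  const [name, version] = note.trim().split('/');
  return version ? `${name}@${version}` : name;
}

function packCommands(notes) {
  return notes.map((note) => `npm pack ${toPackSpec(note)}`);
}

// A few of the entries noted in step 3.
const notes = ['socket.io', 'socket.io-client/0.9.16', 'redis/0.7.3', 'ws'];
console.log(packCommands(notes).join('\n'));
```

This only generates the commands; as the next paragraph explains, having the tarballs side by side is still not enough for npm install to use them.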

Well, this is the same problem as before. Basically, there is not an easy way to tell npm to use all these .tgz files that we have, or is there?

npm does cache its files, and we can point that cache somewhere of our choosing, so maybe we can use the cache functionality.

  1. [LIVE] Create a new directory and change to it.
  2. [LIVE] Install socket.io using npm because this is the only way to get our dependency tree. Additionally, let us store our cache somewhere clean: npm --cache ./.cache install socket.io
  3. [LIVE] tar up the .cache folder.
  4. [LIVE] copy the tar to some removable media like a memory stick.
  5. [OFFLINE] move the memory stick to the remote offline system.
  6. [OFFLINE] copy the tar from the removable media.
  7. [OFFLINE] untar the tar, creating .cache in the process.
  8. [OFFLINE] Install socket.io from the .cache directory: npm install --cache ./.cache socket.io
  9. [OFFLINE] Wait. (This step will take a little while as npm tries to connect to the registry to see if newer versions are available. You can speed this up by playing around with some of the npm config options regarding fetch.)

Success! A little convoluted and a bit of extra work, but a success. But I think we can do better...

Feature Breakdown

Goals for the Feature

  • Be able to install a package including all dependencies in an offline environment.
  • Minimize the objects that have to be moved (tar everything up).
  • Minimize the number of steps necessary to create and install.
  • Disable checking the registry for things that are already in the bundle.

Specifically, there are two aspects to this feature request: Bundling a Package into a .tgz that includes its dependencies, and using npm to install from said bundle.

Bundle Package

In this aspect of the feature the objective is to create a .tgz that contains the package and all the dependencies necessary to install that package and the other dependencies.

For a given package...

  • Build the dependency tree for the package including the dependency tree for each dependent package.
  • Download each item in the dependency tree.
  • Bundle all items into a .tgz.
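As a rough sketch of the first step, the function below walks an already-resolved dependency tree and collects each unique name@version once, which gives the list of items to download and bundle. It assumes a nested object shaped like the output of npm ls --json (name, version, and a dependencies map); flattenTree is a hypothetical name, and the nesting in the sample tree is illustrative only.

```javascript
// Walk a dependency tree and collect each unique "name@version" once.
// Assumes the nested shape printed by `npm ls --json`:
//   { name, version, dependencies: { depName: { version, dependencies } } }
// flattenTree is a hypothetical helper, not an npm API.
function flattenTree(root) {
  const seen = new Set();

  function visit(name, node) {
    if (!node.version) return; // e.g. a dependency that is not installed
    const spec = `${name}@${node.version}`;
    if (seen.has(spec)) return; // same version reached through another branch
    seen.add(spec);
    for (const [depName, depNode] of Object.entries(node.dependencies || {})) {
      visit(depName, depNode);
    }
  }

  visit(root.name, root);
  return [...seen];
}

// Illustrative tree using versions mentioned earlier in this issue.
const tree = {
  name: 'socket.io',
  version: '0.9.16',
  dependencies: {
    policyfile: { version: '0.0.4' },
    'socket.io-client': {
      version: '0.9.16',
      dependencies: {
        'active-x-obfuscator': {
          version: '0.0.1',
          dependencies: { zeparser: { version: '0.0.5' } },
        },
      },
    },
  },
};
console.log(flattenTree(tree));
```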

Suggestions and additional things to keep in mind...

  • The .tgz file created should have the name and version of the initial package. Additionally the name might include something to indicate that this also bundles the dependencies. socket.io-0.9.16.withDependencies.tgz for example.
  • This could be its own command like npm box socket.io or it could be a variant of npm pack like npm pack --includeDependencies socket.io or npm pack --all socket.io or npm packall socket.io.
  • One approach might be to simply insert the dependencies into the initial package .tgz and then make npm install <tarball file> look inside the tarball for the dependencies before going out to the registry.
  • Should include optional modules as if --optional was specified.

Install from Bundle

In this aspect of the feature the objective is to take a previously bundled .tgz file and install the package and all necessary dependencies from said file.

From a given .tgz file...

  • Install the initial package from the .tgz.
  • Install all dependent packages from the .tgz
  • Fall back to searching the registry for packages when they are missing from a .tgz file.

Suggestions and additional things to keep in mind...

  • Allow for all the normal behavior of npm install such as global.
  • If a dependency exists in the .tgz and is of a valid version, it must be used.
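The lookup order described above (prefer the bundle, fall back to the registry) could be sketched as follows. This is only an illustration under simplifying assumptions: resolveSource and satisfies are hypothetical names, the bundle contents are modelled as a plain map of package name to available versions, and only exact versions and "*" are understood, whereas a real implementation would need full semver range matching.

```javascript
// Decide where a dependency should come from during an install from a bundle.
// `bundled` maps a package name to the versions present in the .tgz.
// Simplification: only exact versions and "*" are matched here.
function satisfies(version, range) {
  return range === '*' || range === '' || range === version;
}

function resolveSource(name, range, bundled) {
  const match = (bundled[name] || []).find((v) => satisfies(v, range));
  return match
    ? { name, version: match, source: 'bundle' } // must be used when valid
    : { name, range, source: 'registry' };       // missing from the bundle
}

const bundled = { redis: ['0.7.3'], policyfile: ['0.0.4'] };
console.log(resolveSource('redis', '0.7.3', bundled));
console.log(resolveSource('ws', '*', bundled));
```

Returning a description of the source, rather than fetching anything, keeps the decision separate from the actual install work.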

Relevant Discussion/Links

This ticket was initially written up in #3429. I believe that #3429 was closed due to misunderstanding of the desired feature and/or the problem space. This rewrite is an attempt to clarify that misunderstanding.

npmbox is an attempt to provide an example of the desired functionality described in this feature request in a third party tool. However, npmbox is not a final solution because it depends on npmbox being first installed on the target system and thus we have a chicken and egg problem.

pac is somewhat related, but also not really the solution desired. It works by examining your package.json file and then downloading the tarball of any package listed there as a dependency into the .modules folder. While this might seem like a similar answer, it does not download the entire dependency tree nor does it provide a means to install the package or its dependencies.

@jackgill

I think I have something that will help with this: https://gist.github.com/jackgill/7687308

It's a script that installs a package, rewrites its package.json to copy dependencies to bundleDependencies, and then packs it. This creates a .tgz file which can be installed offline using npm. You would use it like this:

  1. [LIVE] node bundle.js socket.io
  2. [OFFLINE] copy socket.io-0.9.16.tgz to the offline machine via sneakernet
  3. [OFFLINE] npm install socket.io-0.9.16.tgz

The script is a rough cut, but it should be the right idea -- let me know if I'm missing something about your use case.

It would be nice to have a feature like this built into npm, since rewriting package.json for 3rd party modules seems sketchy.
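For readers who have not opened the gist, the package.json rewrite described above might look roughly like the sketch below. It is not the gist's code; withBundledDependencies is a hypothetical name, and the sample package object lists only a few dependencies with the versions mentioned earlier in this issue.

```javascript
// Return a copy of a parsed package.json in which every regular dependency
// is also listed under bundleDependencies, so that `npm pack` includes the
// installed copies from node_modules in the tarball.
// Hypothetical helper; the input object is not modified.
function withBundledDependencies(pkg) {
  const names = Object.keys(pkg.dependencies || {});
  return { ...pkg, bundleDependencies: names };
}

const pkg = {
  name: 'socket.io',
  version: '0.9.16',
  dependencies: { policyfile: '0.0.4', base64id: '0.1.0', redis: '0.7.3' },
};
console.log(JSON.stringify(withBundledDependencies(pkg), null, 2));
```

In a full script the result would be written back to package.json after npm install and before npm pack.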

@mdvanes

I would like to +1 this feature request. I've been using npm for different applications, but recently I've tried to install JSHint support for Sublime Text Editor on a virtual machine without internet access. Unfortunately, JSHint is only available through npm. I finally ran into npmbox, but I think it would make a great standard component of npm.

Likewise, the same might be applicable when running Grunt for your workflow on such a disconnected development machine.

@wwweaponizer

+1 I need this functionality.

@rtucker88

+1 this functionality would be great!

@medikoo

I don't think it's really needed.

First, if you need a project to be installable offline, you should just install it, pack it with all its dependencies, and distribute it that way. Afterwards, all that is needed to install is to unpack it and run npm rebuild in its folder (it just recompiles packages that need to be compiled); no network is needed.

I have many projects that need to be totally network independent, and I have no issues with the above workflow.

The one thing npm is missing is the ability to install a project without compilation. Currently, after install, you need to make sure to exclude all build folders to get a clean bundle.

@badmadrad

+1 will be easy for secure enterprise environments to package things up into binary repos like nexus

@isaacs
npm member

So, basically, what you want is a npm pack that first installs and bundles all the deps, rather than creating a "publish-style" tarball like npm pack does.

Also, just for convenience, it should probably automatically ignore anything that looks like $pkgname-$version.tgz, so that if you run it multiple times you don't end up including every previous pack every time. (That really annoys me about npm pack anyway.)

This is a good and useful idea, and probably not that hard to work in, though it's not trivial. (See the code of npm pack to see how little it actually does. So, it really can't leverage that much.) Also, what to call it? Maybe something like npm fullpack or npm bundle or something?
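The convenience rule above (skip anything that looks like $pkgname-$version.tgz) could be sketched as a simple file filter. The names isPreviousPack and filesToPack are hypothetical, the version pattern is a loose approximation rather than a full semver grammar, and unscoped package names are assumed.

```javascript
// Filter out tarballs left behind by earlier packs of the same package.
// Hypothetical helpers; the version pattern is deliberately loose.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

function isPreviousPack(fileName, pkgName) {
  // Matches "<pkgName>-<major>.<minor>.<patch><anything>.tgz"
  const pattern = new RegExp(
    `^${escapeRegExp(pkgName)}-\\d+\\.\\d+\\.\\d+.*\\.tgz$`
  );
  return pattern.test(fileName);
}

function filesToPack(fileNames, pkgName) {
  return fileNames.filter((fileName) => !isPreviousPack(fileName, pkgName));
}

const files = ['index.js', 'package.json', 'socket.io-0.9.16.tgz'];
console.log(filesToPack(files, 'socket.io'));
```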

@wilmoore

"probably automatically ignore anything that looks like $pkgname-$version.tgz"

Perhaps we can simply re-use .gitignore and/or .npmignore so we don't have to bake that magic into the tool.

"...npm bundle or something"

My vote would be for npm bundle.

@arei

@isaacs What I'm looking for is an npm pack xyz that includes all the dependencies of xyz inside the produced .tgz file, plus changes to npm install so that when it sees a .tgz with included dependencies, it uses those dependencies instead of going out to the internet to look for them. My module npmbox (npm install -g npmbox) is a good example of what I am trying to describe in this feature request, but it has the limitations I described above in the OP.

I personally like the idea of just adding a switch to npm pack like --all or --withDependencies instead of a new command, but I can go either way.

@badmadrad

I agree that adding a switch such as --full or --all to the existing npm pack would suffice, since it fits into the realm of what npm pack already does.

@bachp

+1 for adding this.
It would also help to package node.js packages in OpenEmbedded / Yocto

@davidcl64

+1

I played around with something like this a bit - basically pointing the cache at a subfolder then installing. To install only off of the local cache, run install with the registry url unset so only the cache is used. While this generally works, one of the downsides is the cache structure is pretty verbose (both the expanded and tar'd versions of the libraries are there)

It basically looks like this:

// prime the cache
npm install --cache ./cache modulename

// install off the cache only
npm install --registry=false modulename --cache ./cache

Perhaps this is an approach that could work with minimal changes - either a replacement cache that just does tarballs or modifying the existing cache logic to store/work with a bit less data. Then add a nice command line option(s) to package all this logic up.

This wouldn't provide what @arei was looking for exactly (a module with its dependencies inside a single .tgz) but would result in a local only install option.

@vladikoff

Just in case anyone is looking for other solutions besides npmbox and pac, I started a project called Freight, which is a hosted server that helps you create bundles of node_modules and then quickly download / extract them from anywhere.

@arei

Looks like we're finally going to get some resolution, or at least the tools to build the resolution. http://blog.npmjs.org/post/91303926460/npm-cli-roadmap-a-periodic-update

@hekevintran

+1 This is a very good idea. I just want to add that it should not be necessary to include a package's dependency tree inside the package's tarball. An alternative would be to download the package and its dependencies all as individual tarball files and point the installer program to a directory of these tarballs.

In the Python world we have this feature in Buildout and Virtualenv. For example I can run Buildout online and tell it to download and install my packages, but at the same time also keep copies of the source tarballs in a directory. Then I can initiate an offline install of all the packages by running this command:

python bootstrap.py --find-links=./third_party/dist/

Another nice feature is if your Buildout config file does not list a particular dependency of a listed package, Buildout will print to the console these unlisted dependencies and you can copy and paste the list into your config file to make your listing complete.

@jsoref jsoref referenced this issue in apache/cordova-lib Aug 20, 2014
Closed

CB-7336 Fix cordova platform add blackberry10 #75

@othiym23 othiym23 changed the title from [Feature] Bundle/Include Dependencies to Bundle/Include Dependencies Sep 20, 2014
@keithchilders

I support the suggestion to bring back npm bundle and have it implement this behavior.

@schmod

Any recent progress on this front?

@grdryn

A builtin way to do offline installs would be great! 👍

@ralberts

It's not the best solution, but I ended up creating an npm overlay that augments 'npm install' behavior to copy the final directory to a cache and then retrieve the cached directory when offline.

@foxxyz

I'm surprised this has been such a topic of debate. In many organizations there are closed intranets that can greatly benefit from this kind of functionality.

Python's package manager makes this incredibly easy:

On the online machine: pip install -r list_of_dependencies.txt --download /path/to/some/dir
On the offline machine: pip install -r list_of_dependencies.txt --no-index --find-links /path/to/stored/packages

@arei Thank you for your ongoing crusade in lobbying for this!

@majgis

I created the following package to behave like npm-pack, but it accepts the additional arguments of npm-install and includes dependencies in the output:
https://www.npmjs.com/package/npm-bundle

I propose that any official npm-bundle feature consider accepting the same arguments as npm install, as it is pretty slick to npm-bundle something directly from the registry or a preexisting tarball.

One interesting issue I found when creating this package is that bundledDependencies is not recognized by npm v3 unless the --legacy-bundling option is used. Is this issue already documented somewhere?

@brndn4

+1 We have been relying on npmbox at my company for offline installs of back-office systems at customer sites. Would love to have this supported by npm.

@JosefJezek JosefJezek referenced this issue in StartPolymer/generator-startpolymer Jan 12, 2016
Open

Bundling of npm packages #3

@othiym23 othiym23 removed the ready label Jan 29, 2016
@axelfontaine

+1

(For anyone looking for a solution you can use today, npm-bundle by @majgis works great)

@73rhodes

It's definitely time for this. Please.

@kenberkeley
  1. In an online environment, run npm install --no-bin-links. You will have an entire flattened node_modules
  2. Then, bundle this flawless node_modules with tar / zip / rar / 7z etc
  3. In offline environment, extract the bundle, that's it

P.S. node-pac is another option, but it can't handle packages that still need to download something during installation.

@bachp bachp referenced this issue in imyller/meta-nodejs May 10, 2016
Closed

Not compatible with Yocto 2.1 #39

@othiym23

This is something that is highly relevant to @seldo's interests – having a better story for deployment is something that concerns a lot of the people at npm, Inc. thinking about deployment as a product. That said, it's not that clear where this fits on the product roadmap either for the company or for the CLI team, so I don't know when we'll work on this. There are quite a few subtleties to getting this set up right (e.g. do we deal with cross-platform support for native modules? if so, how? or do we just deal with cross-compilation? or nothing at all and force an npm rebuild once it's deployed? etc), so I've removed patch-welcome and added needs-discussion. There will be more later from us, I'm sure.
