Why should we use pnpm?
2017-03-19 18:39:00 +0200
pnpm is an alternative package manager for Node.js. It is a drop-in replacement for npm, but faster and more efficient.
How fast? 3 times faster! See benchmarks here.
Why more efficient? When you install a package, we keep it in a global store on your machine, then we create a hard link from it instead of copying. For each version of a module, there is only ever one copy kept on disk. When using npm or yarn for example, if you have 100 packages using lodash, you will have 100 copies of lodash on disk. Pnpm allows you to save gigabytes of disk space!
Why not Yarn?
TBH, I was really disappointed when Yarn became public. I was heavily contributing to pnpm for several months and there was nowhere any news about Yarn. The info about its development was not public.
After a few days, I realized that Yarn is just a small improvement over npm. Although it makes installations faster and it has some nice new features, it uses the same flat node_modules structure that npm does (since version 3).
And flattened dependency trees come with a bunch of issues:
- modules can access packages they don't depend on
- the algorithm of flattening a dependency tree is pretty complex
- some of the packages have to be copied inside one project's node_modules folder
Furthermore, there are issues that Yarn doesn't plan to solve, like the disk space usage issue. So I decided to continue investing my time to pnpm, and with great success. As of now (March 2017), pnpm has all the additional features that Yarn has over npm:
- security. Like Yarn, pnpm has a special file with all the installed packages' checksums to verify the integrity of every installed package before its code is executed.
- offline mode. pnpm saves all the downloaded package tarballs in a local registry mirror.
It never makes requests when a package is available locally. With the
--offlineparameter, HTTP requests can be prohibited at all.
- speed. pnpm is not only faster than npm, it is faster than Yarn. It is faster than Yarn both with cold and hot cache. Yarn copies files from cache whereas pnpm just links them from the global store.
How is it possible?
As I mentioned earlier, pnpm does not flatten the dependency tree. As a result, the algorithms used by pnpm can be a lot easier! That's why it is possible that only 1 developer could keep pace with the dozens of contributors of Yarn.
So how does pnpm structure the node_modules directory, if not by flattening? To understand it, we should recall how did the node_modules folder look like before npm version 3. Prior to npm@3, the node_modules structure was predictable and clean, as every dependency in node_modules had its own node_modules folder with all of its dependencies specified in package.json.
node_modules └─ foo ├─ index.js ├─ package.json └─ node_modules └─ bar ├─ index.js └─ package.json
This approach had two serious issues:
- frequently packages were creating too deep dependency trees, which caused long directory paths issue on Windows
- packages were copy pasted several times when they were required in different dependencies
To solve these issues, npm rethought the node_modules structure and came up with flattening. With npm@3 the node_modules structure now looks like this:
node_modules ├─ foo | ├─ index.js | └─ package.json └─ bar ├─ index.js └─ package.json
For more info about the npm v3 dependency resolution, see npm v3 Dependency Resolution.
Unlike npm@3, pnpm tries to solve the issues that npm@2 had, without flattening the dependency tree. In a node_modules folder created by pnpm, all packages have their own dependencies grouped together, but the directory tree is never as deep as with npm@2. pnpm keeps all dependencies flat but uses symlinks to group them together.
-> - a symlink (or junction on Windows) node_modules ├─ foo -> .registry.npmjs.org/foo/1.0.0/node_modules/foo └─ .registry.npmjs.org ├─ foo/1.0.0/node_modules | ├─ bar -> ../../bar/2.0.0/node_modules/bar | └─ foo | ├─ index.js | └─ package.json └─ bar/2.0.0/node_modules └─ bar ├─ index.js └─ package.json
To see a live example, visit the sample pnpm project repo.
Although the example seems too complex for a small project, for bigger projects the structure looks better structured than what is created by npm/yarn. Let's see why it works.
First of all, you might have noticed, that the package in the root of node_modules is
just a symlink. This is fine as Node.js ignores symlinks and executes the realpath.
require('foo') will execute the file in
Secondly, none of the installed packages have their own node_modules folder inside their directories. So how can foo require bar? Lets have a look on the folder that contains the foo package:
node_modules/.registry.npmjs.org/foo/1.0.0/node_modules ├─ bar -> ../../bar/2.0.0/node_modules/bar └─ foo ├─ index.js └─ package.json
As you can see
- the dependencies of foo (which is just bar) are installed, but one level up in the directory structure.
- both packages are inside a folder called node_modules
foo can require bar, because Node.js looks modules up in the directory structure till the root of the disk. And foo can also require foo, because it is in a folder called node_modules (yep, this is what some packages do).
Are you convinced?
Just install pnpm via npm:
npm install -g pnpm. And use it instead of npm whenever you want to install something:
pnpm i foo.