Skip to content

alexanderGugel/ied

Repository files navigation

ied

Travis npm Inline docs

__/\\\\\\\\\\\__/\\\\\\\\\\\\\\\__/\\\\\\\\\\\\____
 _\/////\\\///__\/\\\///////////__\/\\\////////\\\__
  _____\/\\\_____\/\\\_____________\/\\\______\//\\\_
   _____\/\\\_____\/\\\\\\\\\\\_____\/\\\_______\/\\\_
    _____\/\\\_____\/\\\///////______\/\\\_______\/\\\_
     _____\/\\\_____\/\\\_____________\/\\\_______\/\\\_
      _____\/\\\_____\/\\\_____________\/\\\_______/\\\__
       __/\\\\\\\\\\\_\/\\\\\\\\\\\\\\\_\/\\\\\\\\\\\\/___
        _\///////////__\///////////////__\////////////_____

An alternative package manager for Node.

  • Concurrent Installations - ied installs sub-dependencies in parallel. This means that the download of a dependency might have been completed before that of its parent or any of its siblings even started.

  • Correct Caching - Downloaded packages are being cached locally. Similarly to the entry dependencies stored in node_modules, they are being identified by their checksums. Therefore we can guarantee the consistency of the cache itself without (manually) invalidating dependencies (e.g. due to overridden version numbers).

  • node_modules as CAS - Packages are always being referenced by their SHA-1 checksums. Therefore a node_modules directory can be considered to be a Content Addressable Storage, meaning that packages are being identified by their contents, not by arbitrary identifiers, such as package names that are not guaranteed to be unique across different registries.

  • Flat node_modules - Due to the CAS-based design, conflicts due to naming collisions are more or less impossible. Therefore all dependencies can be stored in a flat directory structure. Circular dependencies and dependencies on different versions of the same packages are still being handled correctly.

  • Guaranteed uniqueness - Since the directory in which a specific package is being stored is determined by its shasum, identical packages can't conflict due to their location in the file system itself. This also means that the same dependency won't be installed more than once. Dependencies don't need to be explicitly declared as peerDependencies, since shared sub-dependencies are the default, not an option.

  • Atomic installs - The atomicity of installs can be ensured on a package-level. "In progress" downloads are being stored in node_modules/.tmp and moved into node_modules once their download has been completed. In order to prevent deadlocks, packages that have circular dependencies are exempt from this limitation. In most cases however, the node_modules directory is consistent at any given point in time during the main installation procedure.

  • Package names as links - While packages are being referenced by their shasum internally, they can still be required via their human-readable equivalent name. Package names themselves are simply symbolic links to the actual content-addressed package itself. A nice side-effect of this design is that in contrast to other package managers, you can not accidentally require a sub-dependency that hasn't been installed as such.

  • Semantic Versioning - Semantic version numbers are being resolved correctly.

  • Arbitrary package groups - Packages can be grouped into "package groups", such as dependencies and devDependencies. Dependencies can be installed exclusively based on the group they are in.

Internals

Under the hood, ied maintains an "object database", similar to git. Instead of storing packages by some arbitrary name, a SHA1-checksum is being generated to approximate their contents. The checksums can not only be used for guaranteeing a certain level of trust and consistency, but they also simplify the algorithm through which dependencies are being managed.

The algorithm through which packages are being installed guarantees consistency through atomic installs. The installation of a package either fails or succeeds, but at no point in time can a dependency itself be required without having its own sub-dependencies installed (with the exception of shared circular dependencies).

The checksum of a package is based on the contents of the package itself, not of its sub-dependencies. Therefore the validity of a package can be verified by hashing the package itself. Subsequent dependency updates have no effect of the generated checksum.

Since node_modules is essentially a file-system based content addressable storage, multiple versions of the same package can co-exist in the same project. In order to expose dependencies via CommonJS, symbolic links are being created that reference a specific version of the package. This has multiple advantages:

  1. Undeclared dependencies that have been installed as sub-dependencies of "direct" dependencies are unlikely to be required "accidentally".

  2. There is no need to "manually" (as in additionally to the installation procedure itself) de-duplicate the dependency graph. As long as the uniqueness of filenames itself can be guaranteed on an OS-level, it is impossible to install the same package twice. This does not prevent users from installing different versions of the same dependency as long as the content is different (whereas a different version declared in the package.json counts as different contents).

  3. Shorter pathnames and less problems due to OS-level limitations (as in Windows where the maximum path length is limited).

  4. Additional application-level startup performance improvements. require needs to traverse less directories. A limited number of symbolic links need to be followed. This performance improvement is primarily useful for continuously running tests, where startup time is actually noticeable for larger test suits.

Directory Structure

The used directory structure is primarily optimized for reducing the amount of IO interaction with the file system during subsequent installations and guaranteeing the consistency of installed packages.

A consequence of the require.resolve algorithm used by Node, all packages need to be stored in a project-level node_modules directory. This directory is completely flat on a package-level, meaning that there are no nested packages inside it.

Instead each package is being stored in its content-addressed directory. Such a directory has two sub-directories:

  • package - This is where the unpacked package contents is being stored. At no point in time will this directory be modified. This enables us to verify the integrity of the package at a later point in time by comparing the actual checksum to the one defined by other dependents or registries.

  • node_modules - Sub-dependencies of the dependency installed in package are being referenced by symbolic links in node_modules of the package itself. require.resolve will fall-back to this level after failing to locate a dependency in package. This means checked in dependencies are still supported, provided that their sub-dependencies are also available (anywhere in the dependency graph).

On a project level, the node_modules directory contains the fetched packages, installed dependencies and links that expose the packages to user-land via require.

A comparison of sample directory structures produced by ied, npm 2 and npm 3 is available as a GitHub Gist.

Why?

The original idea was to implement npm's pre-v3 install algorithm in as few lines as possible. This goal was achieved in c4ba56f.

Currently the main goal of this project is to provide a more performant alternative to npm.

Installation

The easiest way to install ied is using npm:

npm i -g ied

Alternatively you can also "bootstrap" ied. After an initial installation via npm, ied will install its own dependencies:

git clone https://github.com/alexanderGugel/ied ied && cd $_ && make install

Usage

The goal of ied is to support ~ 80 per cent of the npm commands that one uses on a daily basis. Feature parity with npm other than with its installation process itself is not an immediate goal. Raw performance is the primary concern during the development process.

A global configuration can be supplied via environment variables. NODE_DEBUG can be used in order to debug specific sub-systems. The progress bar will be disabled in that case.

Although run-script is supported, lifecycle scripts are not.

At this point in time, the majority of the command API is self-documenting. More extensive documentation will be available once the API is stabilized.

A high-level USAGE help is also supplied. The main goal is to keep the API predictable for regular npm-users. This means certain flags, such as for example --save, --save-dev, --save-optional, are supported.

  ied is a package manager for Node.

  Usage:

    ied [command] [arguments]

  The commands are:

    install     fetch packages and dependencies
    run         run a package.json script
    shell       enter a sub-shell with augmented PATH
    ping        check if the registry is up
    config      print the used config
    init        initialize a new package
    link        link the current package or into it
    unlink      unlink the current package or from it
    start       runs `ied run start`
    stop        runs `ied run stop`
    build       runs `ied run build`
    test        runs `ied run test`

  Flags:
    -h, --help          show usage information
    -v, --version       print the current version
    -S, --save          update package.json dependencies
    -D, --save-dev      update package.json devDependencies
    -O, --save-optional update package.json optionalDependencies
    -r, --registry      use a custom registry
                        (default: http://registry.npmjs.org/)
    -b, --build         execute lifecycle scripts upon completion
                        (e.g. postinstall)

  Example:
    ied install
    ied install <pkg>
    ied install <pkg>@<version>
    ied install <pkg>@<version range>

    Can specify one or more: ied install semver@^5.0.1 tape
    If no argument is supplied, installs dependencies from package.json.
    Sub-commands can also be called via their shorthand aliases.

  README:  https://github.com/alexanderGugel/ied
  ISSUES:  https://github.com/alexanderGugel/ied/issues

Development notes

To run the test suite, run npm test. The test suite mocks all HTTP requests, with fixtures cached inside fixtures/generated/. If you make new tests that perform HTTP requests, it'll be saved there.

npm test

Credits

Some ideas and (upcoming) features of ied are heavily inspired by Nix, a purely functional package manager.

License

Licensed under the MIT license. See LICENSE.