Gsoc Contributions NPM (VS master at moment of deadline) #1

Open
wants to merge 9 commits into
base: base-gsoc

@jellelicht jellelicht commented Aug 23, 2016

This PR shows the differences between my GSoC branch, and a copy of the master branch at the moment of the deadline.

What follows here is a copy of my closing up text to the guix-devel mailing list:

--- Mail starts here ---
Hello Guix,

In the last hours of GSoC, the time has come to report my progress, issues and
plans regarding my project. To reiterate, the goals of this project were:

  • The ability to parse npm version data
  • An npm backend for guix import
  • Npm modules in guix
  • An actual build system for npm packages

When the project started, there was some code written by David Thompson that was
exactly what I needed to start on a node build system. For the importer side of
things, I started by looking at the gem importer; it seemed simple enough to
grok, while still offering the basic functionality I needed to get a running
start.

To start off with something that did not work out as well as I had hoped:
getting a popular build system (e.g. Gulp, Grunt, Broccoli and others) packaged.
As mentioned in my earlier mails, the list of transitive dependencies of any of
these suffers from at least the following:

  • It is a list with more than 4000 packages on it
  • It is a list that at some point has the package itself on it

As a halfway point, I wanted to get a testing framework packaged instead,
because everyone likes testing. While looking at the dependencies of a testing
framework, I noticed that I had a need for CoffeeScript. Having a passing
familiarity with the CoffeeScript dialect, I researched how one could achieve
this. At the moment of this writing, I have packaged CoffeeScript v1.0.0, to be
found in my git repo. What I did not account for, nor foresee, was that
bootstrapping CoffeeScript would take some effort. My earlier, optimistic
estimates were based on a flawed setup that lacked the isolation needed for a
proper reproducible build. Anyway, if any of you want to play around with
CoffeeScript v1.0.0 or any of the 50+(!) preceding versions, be my guest.

I also took both Ludovic's and Catonano's detailed feedback on the initial
draft of the recursive importer into account when rewriting it. It should now
only visit each node in the dependency graph once, and be a whole lot more
efficient as well. It is still based on the multiple return values that drove
Ricardo's initial work on the CRAN recursive importer.
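
The visit-once property can be illustrated with a small sketch. This is not the Guile code from the branch; it is a Python illustration only, and `REGISTRY`, `recursive_import` and `counting_fetch` are hypothetical names standing in for real lookups against the npm registry.

```python
from typing import Callable, Dict, Iterable, List

# Hypothetical stand-in for registry metadata: name -> direct dependencies.
REGISTRY: Dict[str, List[str]] = {
    "a": ["b", "c"],
    "b": ["c"],
    "c": ["a"],  # deliberate cycle back to "a"
}

def recursive_import(root: str,
                     fetch_dependencies: Callable[[str], Iterable[str]]) -> List[str]:
    """Return every package reachable from root, fetching each one only once."""
    seen = set()
    order = []
    stack = [root]
    while stack:
        name = stack.pop()
        if name in seen:
            continue  # already imported; this also breaks dependency cycles
        seen.add(name)
        order.append(name)
        stack.extend(fetch_dependencies(name))
    return order

lookups = []  # records every (simulated) registry request

def counting_fetch(name: str) -> List[str]:
    lookups.append(name)
    return REGISTRY.get(name, [])

imported = recursive_import("a", counting_fetch)
```

Because a name is marked as seen before its dependencies are pushed, the cycle from "c" back to "a" costs nothing beyond a set membership test.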

The number of npm packages and the complexity of how they depend on one another
is enormous. As discussed on the guix-devel ML[0], it would be useful to gain
some insights into which packages would be worthwhile to get into guix. As
Catonano noted, the problem is not on Guix's side; we have the (elementary)
building blocks with which to do the graph processing. The issue here is how
to implement something akin to `fold-packages' for npm packages in order to
traverse the dependency graph. After rewriting the recursive importer to be more
sane, I scrawled some notes on my notepad that basically boil down to the
following:

  1. We should only look up each npm package once, if possible
  2. We should have a list of all npm package names.
  3. We should be able to specify the maximum traversal depth

For (1.), a simplified version of the recursive npm importer can be used. For
(2.), once one has installed node (with npm) and executed some `npm search'
commands, there should be a file in
`$HOME/.npm/registry.npmjs.org/-/all/.cache.json' that contains, among other
things, a listing of all package names. npm can be configured to update this
cache quite often (or almost never). It does weigh in at a hefty 160MB. What is
left is wiring all this together, which was not my priority these months.
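
To make the three notes concrete, here is a minimal Python sketch of how they could fit together; it is an illustration under assumptions, not code from the branch. The names `make_cached_lookup`, `closure`, `TOY_GRAPH` and the `pkg-*` packages are all hypothetical, and the toy dictionary stands in for both the registry and the list of package names.

```python
from collections import deque
from typing import Callable, Dict, Iterable, List, Set

def make_cached_lookup(lookup: Callable[[str], Iterable[str]]) -> Callable[[str], List[str]]:
    """Wrap lookup so that each package name is only ever fetched once (note 1)."""
    cache: Dict[str, List[str]] = {}
    def cached(name: str) -> List[str]:
        if name not in cache:
            cache[name] = list(lookup(name))
        return cache[name]
    return cached

def closure(name: str, lookup: Callable[[str], List[str]], max_depth: int) -> Set[str]:
    """Dependencies of name reachable within max_depth steps (note 3)."""
    seen = {name}
    queue = deque([(name, 0)])
    while queue:
        current, depth = queue.popleft()
        if depth == max_depth:
            continue  # do not expand beyond the requested depth
        for dep in lookup(current):
            if dep not in seen:
                seen.add(dep)
                queue.append((dep, depth + 1))
    seen.discard(name)  # a package is not counted as its own dependency
    return seen

# Hypothetical data; its keys play the role of the list of all names (note 2).
TOY_GRAPH = {
    "pkg-leaf": [],
    "pkg-small": ["pkg-leaf"],
    "pkg-big": ["pkg-small", "pkg-mid"],
    "pkg-mid": ["pkg-leaf", "pkg-big"],
}
calls = []

def toy_lookup(name: str) -> List[str]:
    calls.append(name)
    return TOY_GRAPH.get(name, [])

lookup = make_cached_lookup(toy_lookup)
sizes = {name: len(closure(name, lookup, max_depth=2)) for name in TOY_GRAPH}
```

Sorting `sizes` by value would then give a rough ranking of which packages have the fewest dependencies within the chosen depth.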

Regarding `guix refresh', one has to re-import an npm package in order to get an
up-to-date package-representation usable by guix. Originally I had thought that
this would be of similar difficulty to the other importers. Because we only use
the npm registry [1] to retrieve metadata and the location of the actual source
archive, we have no way of knowing whether a particular guix package originated
from the npm registry.

An easy-yet-inelegant solution would be to include the package name as used
within the npm registry as metadata via an argument to the node-build-system.
Think of an `#:npm-name' key in the `arguments' field of the guix package
definition.

The importer should be able to handle most of the valid (and invalid) source
URIs you can find in the wild, especially github-related urls and shorthands.
See [3] for a list of packages that might need some changes to either their
package.json/npm registry metadata, or call for a change to the importer logic.
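
As an illustration of the kind of normalisation involved, the following Python sketch maps a few GitHub-related forms that appear in package.json files onto one canonical clone URL. It is not the importer's actual logic; `normalize_github_source` is a hypothetical name, the set of accepted forms is deliberately small, and anything unrecognised yields `None`.

```python
import re
from typing import Optional

# Bare "user/repo" or "github:user/repo" shorthands.
_SHORTHAND = re.compile(r"^(?:github:)?([\w.-]+)/([\w.-]+)$")
# Full URLs such as git+https://, git://, https:// or git+ssh://git@...
_URL = re.compile(
    r"^(?:git\+)?(?:https?|git|ssh)://(?:[^@/]+@)?github\.com[/:]"
    r"([\w.-]+)/([\w.-]+?)(?:\.git)?/?(?:#.*)?$"
)

def normalize_github_source(spec: str) -> Optional[str]:
    """Return a canonical https clone URL, or None if spec is not recognised."""
    spec = spec.strip()
    match = _URL.match(spec) or _SHORTHAND.match(spec)
    if match is None:
        return None
    user, repo = match.groups()
    if repo.endswith(".git"):
        repo = repo[:-4]  # shorthand may carry a trailing ".git"
    return f"https://github.com/{user}/{repo}.git"

examples = [
    "user/repo",
    "github:user/repo",
    "git+https://github.com/user/repo.git",
    "git://github.com/user/repo.git#v1.0",
    "git+ssh://git@github.com/user/repo.git",
    "https://example.org/archive.tar.gz",
]
normalized = [normalize_github_source(e) for e in examples]
```

Returning `None` instead of guessing keeps the unhandled cases visible, which is how a list like [3] can be collected in the first place.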

The current version of the importer only looks at the latest version of
packages. It should be easy to fix this by handling the `@version' suffix like
the hackage importer does. This could be useful to break some of the dependency
cycles that exist between npm packages. For this to work, a scheme different
from the current NODE_PATH will have to be considered. The first module with a
certain name found in NODE_PATH will be loaded at runtime, so in the current
implementation it is not possible to have multiple versions of a package with
the same name loaded at one moment.
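
Both points can be sketched in Python, purely as an illustration. `split_spec` shows one way to separate a `@version' suffix from a package name, and `resolve_in_node_path` is a simplified model of the first-match lookup (real Node.js also considers files and package.json entry points). Both function names and the `foo' package are hypothetical.

```python
import os
import tempfile
from typing import List, Optional, Tuple

def split_spec(spec: str) -> Tuple[str, Optional[str]]:
    """Split 'name@version' into (name, version); version is None if absent."""
    # Search from index 1 so the leading '@' of a scoped name is left alone.
    at = spec.find("@", 1)
    if at == -1:
        return spec, None
    return spec[:at], spec[at + 1:]

def resolve_in_node_path(name: str, node_path: List[str]) -> Optional[str]:
    """Return the first directory called name found along node_path."""
    for directory in node_path:
        candidate = os.path.join(directory, name)
        if os.path.isdir(candidate):
            return candidate  # later entries with the same name are shadowed
    return None

# Two store-like directories that both provide a module called "foo".
with tempfile.TemporaryDirectory() as root:
    old = os.path.join(root, "foo-1.0", "node_modules")
    new = os.path.join(root, "foo-2.0", "node_modules")
    os.makedirs(os.path.join(old, "foo"))
    os.makedirs(os.path.join(new, "foo"))
    found = resolve_in_node_path("foo", [old, new])
    found_is_old = found == os.path.join(old, "foo")
    missing = resolve_in_node_path("bar", [old, new])
```

Whichever entry comes first wins, so the second copy of `foo' can never be loaded through this mechanism alone.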

Ricardo's idea of a recursive importer is pretty nice, imho. It should be doable
to implement some more of them in a fashion similar to what has been done for
cran and npm.

While I hope nobody (including myself) has to package so many variants of the
same package again, it would be nice to somehow download only the revision you
are interested in. AFAIK, there is no proper way for git to do this for the
general 'give me this commit' case. Something that I eventually did in order to
alleviate the ~3 minute checkout times for each iteration of CS, was the
following hack[2]. It basically puts a recent-enough copy of the CS git repo in
my store, and then makes a shallow copy from that when using git-fetch. This
took my build times down to less than 10 seconds per iteration.

If you are interested in my work, have a look at
https://github.com/wordempire/guix/commits/gsoc-final, or just run:

  git clone https://github.com/wordempire/guix.git
  cd guix
  git checkout gsoc-final

I will be trickling a patch series onto the ML over the next few days.

I guess that is enough text from me again. I would still like to express my
gratitude to my mentors David Thompson and Christopher Allan Webber, as well as
the rest of #guix and guix-devel (and some folks at GHM as well) for dealing
with my ramblings and questions, and for helping me keep this project fun.
Special thanks to Catonano for having a close look at my code.

With just some tweaks to the importer, we should be able to at least package a huge subset of all the packages that require zero to few dependencies, once we are able to identify them.

I probably forgot quite some important and unimportant details, so if you have
any questions, tips or just want to blame me for bringing horrible, horrible
JavaScript into guix-land, send me a mail :-)

- Jelle Licht

[0] https://lists.gnu.org/archive/html/guix-devel/2016-07/msg01726.html
[1] https://www.npmjs.com/
[2] http://paste.lisp.org/display/323999 <- beware, here be dragons etc
[3] http://paste.lisp.org/display/324007

Remove <https://debbugs.gnu.org/23744> and
<https://debbugs.gnu.org/23723> workaround.

* gnu/packages/node.scm (node): Update to 6.4.0.
  (node)[arguments]: Disabled more tests. Remove custom 'patch-shebangs'
  phase. Manually patch npm script shebang in new 'patch-npm-shebang'
  phase.
* gnu/packages/node.scm (http-parser): New variable.
* gnu/packages/node.scm (define-module): Import gnu packages tls with
  tls: prefix
* gnu/packages/node.scm (node)[native-search-paths]: New field.

The Node build system was previously building its own copies of
C-ares and http-parser.

* gnu/packages/node.scm (node)[inputs]: Add c-ares and http-parser.
[arguments]: Add configure flags for using system libraries.
* gnu/packages/node.scm (define-module): Import gnu packages compression
  with a prefix
(node): Likewise.

Older versions of Node.js are required to bootstrap coffee-script, and
possibly other packages.  Be warned that these old versions have several
unpatched security problems.

* gnu/packages/node.scm (define-module): Import gnu packages curl
(define-module): Import gnu packages pkg-config
(node-0.5): New variable.
(node-0.3.1): New variable.
(node-0.3.0): New variable.
(node-0.1.101): New variable.
(node-0.1.98): New variable.
(node-0.1.95): New variable.
(node-0.1.90): New variable.
(node-0.1): New variable.
(node-0.1.32): New variable.
(node-0.1.31): New variable.
(node-0.1.30): New variable.
(node-0.1.29): New variable.
(node-0.1.28): New variable.
* guix/build/node-build-system: New file.
* guix/build-system/node.scm: New file.
* doc/guix.texi: Document it.
* Makefile.am: Added new files.
* doc/guix.texi: ("invoking guix import"): Document it.
* guix/build/git.scm (git-fetch-tags): New function.
* guix/build/json.scm: New file.
* guix/import/npm.scm: New file.
* guix/scripts/import/npm.scm: New file.
* guix/scripts/import.scm (importers): Add "npm".
* tests/npm.scm: New file.
* Makefile.am (MODULES): Add new files when guile-json is present.
(SCM_TESTS): Add new test file.
* gnu/packages/coffee-script.scm: New file.
* gnu/local.mk (GNU_SYSTEM_MODULES): Add it.