Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Adding support for nix buildpack in repo2docker #407

Merged
merged 1 commit into from Nov 5, 2018

Conversation

@costrouc
Copy link
Contributor

costrouc commented Sep 17, 2018

This pull request is a work in progress. I would like feedback from the nix community that it is implemented correctly and has the necessary features.

Demo repo2docker repository: https://gitlab.com/costrouc/nix-binder-example

Why add nix as a buildpack?

Nix would be a great addition to reproducible data science. It is a unique package manager. Some notable features:

  • 100% reproducible environments (pin to exact commit in repository)
  • both a source and binary package repository
  • allows customized compilation and version of every package
  • can run identical environment outside of docker (all linux distros + dawin)
  • as of now 45,000+ packages
  • fully declarative environments
  • packages: python, javascript, julia, R, haskell, perl, and many other languages (some better than others).

Questions needing feedback:

repo2docker questions

  1. Should I use a custom Dockerfile not taking advantage of base?

The reason for this is that nix does not need apt or other package managers. So the smaller the base image the better. Nix handles all dependencies.

  1. Where should NixBuildPack be in ordered list of buildpacks?

As of now NixBuildPack only depends on a default.nix file being present.

nix specific questions

  1. Any shorter way to initialize nix in a docker container and launch with a nix shell?

The definition as of now is quite short but I would like to make it shorter. In general things that I was not happy including nixshell command and nix-channel initialization. This adds about 25 Mb to the image.

  1. Should nix-shell be launched with the --pure option?

Launching with a pure option would make the environment more reproducible by removing all non included packages. Has the downside that some expected tools will not be available.

  1. Anyway to ensure that jupyter is in installed dependencies?

Right now I do not check that jupyter is present.

  1. How to prevent nix-shell from evaluating shellHook twice?

Right now shellHook is evaluated in lines:

  • nix-shell default.nix --command "command -v jupyter"
  • ENTRYPOINT ["/usr/bin/nixshell"] when the docker container is run
  1. Should I include the unstable channel in the NIX_PATH?

I have included it for ease of use. However, like in the nix binder example I pin the exact nixpkgs version https://gitlab.com/costrouc/nix-binder-example/blob/master/default.nix#L4-7

@costrouc costrouc changed the title [WIP] Feature: Adding nix support for repo2docker [WIP] Adding support for nix buildpack in repo2docker Sep 17, 2018
@costrouc costrouc force-pushed the costrouc:costrouc/nix-buildpack branch from a9691bf to 7444c90 Sep 17, 2018
@betatim

This comment has been minimized.

Copy link
Member

betatim commented Sep 17, 2018

Nice work!

re: New base image or not? I would keep the shared base image. We try and have our build packs be composable (you can use the R and conda build pack at the same time) which will be hard to do if they don't share a base image. It also means a lot of the standard things (creating the right user, copying the repo, etc0 are taken care of and we only need to maintain one setup.

@costrouc

This comment has been minimized.

Copy link
Contributor Author

costrouc commented Sep 17, 2018

Definitely. Yeah I see from a maintainer side that sharing the same dockerfile would be a better approach. From the tests I did the difference is about 100 MB in the final uncompressed docker image. I will make the changes.

@betatim

This comment has been minimized.

Copy link
Member

betatim commented Sep 17, 2018

In my experience the images that you end up with are measured in GBs, so I think we don't have to feel too bad about inflating them by 100MB.

For maintenance this is a big win. Also I'd imagine that people who use nix might want to also pull in things from other package managers (or vice versa). Say 90% of my packages comes from nix and the rare ones I directly install with pip. Or something like that. This means that maintaining composability is a big thing for users as well.

Maybe we should add a test that mixes nix and pip (or some other build pack) to make sure that works nicely and will keep working in the future (cue 'if you liked it you should have put a test on it').

EXPOSE 8888
# Copy and chown stuff. This doubles the size of the repo, because
# you can't actually copy as USER, only as root! Thanks, Docker!

This comment has been minimized.

This comment has been minimized.

Copy link
@betatim

betatim Sep 18, 2018

Member

👍 Would be interesting to explore. Do you want to create a PR to get going on this? Otherwise we should create a new issue to track this.

This comment has been minimized.

Copy link
@betatim

betatim Sep 18, 2018

Member

#273 would also be relevant for this.

This comment has been minimized.

Copy link
@minrk

minrk Sep 18, 2018

Member

This is fixed in docker itself. We just need to add a check for the docker version, or bump our base supported docker version to 17.09 and use it: #164

@costrouc costrouc force-pushed the costrouc:costrouc/nix-buildpack branch from 7444c90 to e60e604 Sep 18, 2018
@costrouc

This comment has been minimized.

Copy link
Contributor Author

costrouc commented Sep 18, 2018

@betatim I have modified the commit from feedback to make it composable with other buildpacks. Looking at the size of the builds the difference is 900 MB with alpine as a base image and 1.7 GB for buildpack as the base. That is with installing numpy, scipy, and jupyterlab. The 1.7 GB may not be such a big deal especially since the early layers will match other images.

Additionally I have taken care that the nix installation needs minimal root privileges. The only root privilege used are chowning the correct files and creating a /nix directory owned by ${NB_USER}.

I have checked that it works using the following command locally. Please let me know what additionally needs to be done for testing. I have added a nix test that hopefully is done right.

repo2docker https://gitlab.com/costrouc/nix-binder-example
@costrouc costrouc force-pushed the costrouc:costrouc/nix-buildpack branch from e60e604 to 8eebc77 Sep 19, 2018
from ..base import BaseImage


class NixBuildPack(BaseImage):

This comment has been minimized.

Copy link
@betatim

betatim Sep 27, 2018

Member

if we inherit from PythonBuildPack instead people will be able to use nix together with conda and pip.

This depends a bit on the use-case you see for nix. Should nix be something that people use without anything else (right now) or as an addition to install tools that they can't install with what they already have (a bit like apt.txt which lets you install some additional things as well as using your environment.yml or REQUIRE).

This comment has been minimized.

Copy link
@costrouc

costrouc Oct 11, 2018

Author Contributor

Nix should be fully composable with other package managers so I will change it to inherit from PythonBuildPack.

@@ -64,6 +64,7 @@ def _default_log_level(self):

buildpacks = List(
[
NixBuildPack,

This comment has been minimized.

Copy link
@betatim

betatim Sep 27, 2018

Member

nix should probably be where the Julia and R build packs are in this list, not at the end here. The Dockerfile build pack is the ultimate escape hatch, so we need to give priority to it over all other build packs.

@betatim

This comment has been minimized.

Copy link
Member

betatim commented Sep 27, 2018

Any thoughts from @minrk or @yuvipanda on adding nix support in general and specifics when it comes to composing it with other ways of specifying dependencies? Unfortunately nix isn't a tool I use or know anyone who uses a lot so trying to gently feel my way around for what is best.

@minrk

This comment has been minimized.

Copy link
Member

minrk commented Sep 27, 2018

Re: alpine, I would avoid it here. Lots of things don't really behave as expected (many Python packages don't work right on alpine, and Python wheels can't be installed). It's a great base for well-defined service images, but I wouldn't recommend it for anything interactive.

@minrk

This comment has been minimized.

Copy link
Member

minrk commented Sep 27, 2018

I think it definitely should use the same base image, though I'm not sure about inheriting the default Python env or not. It depends a bit on what is to be installed (i.e. is it the kernel environment). One of the reasons composing works well with the other setups is that the installation mechanisms (conda/pip/julia/r) play nice together (at least if they are done in the right order). Since nix owns its whole install, it's less clear what the right thing to do is:

  1. give nix control over the whole env, or
  2. inherit the notebook server env, and give nix control of the kernel env only
@costrouc

This comment has been minimized.

Copy link
Contributor Author

costrouc commented Oct 11, 2018

Nix installs all of the dependencies for a package (including things like libc etc.) so technically the base image does not matter to nix. I think that sharing a base image for now is the best path because as of now nix does not have every python package and conda does handle some packages better. In the future I certainly see that a smaller base image could be used.

I have spent some time trying to figure out why the travis test is failing? Is it because of appendix? The issue is that in order to expose all of the nix package I need to create a "virtualenv" by setting the ENTRYPOINT=nix-shell. It is similar to running pipenv shell or source activate myenv.

edit: solved it turns out that I was not exiting from the nix-shell command properly. This has been fixed. I had to catch SIGTERM properly.

@costrouc costrouc force-pushed the costrouc:costrouc/nix-buildpack branch 5 times, most recently from 2b39d24 to dd00b9f Oct 11, 2018
@costrouc

This comment has been minimized.

Copy link
Contributor Author

costrouc commented Oct 11, 2018

@betatim , @minrk . Changes have been made with your feedback and all the NixBuildPack tests pass. For some reason julia keeps failing? Looks like this happens in other PRs. I have intentionally kept this PR as the bare minimum to have nix fully working with repo2docker. This bare minimum should be more than enough for any nix based repository.

  • based on base BuildPack. I found that when basing off of PythonBuildPack there was no advantage since nix can install conda and associated packages if it needs it. Using PythonBuildPack increased the docker image size by around 700 MB.
  • a non pure nix-shell is used for now. This means that the programs installed by the base image are still available.
  • Resulting docker images are around 1.5 GB. With the first 700 MB cachable. If alpine will be used as a base image in the future the images would be around 800 MB total.
  • Documentation has been added for how to use nix

I believe that with these changes it is ready for merging pending additional questions.

@costrouc costrouc changed the title [WIP] Adding support for nix buildpack in repo2docker Adding support for nix buildpack in repo2docker Oct 11, 2018
@costrouc costrouc force-pushed the costrouc:costrouc/nix-buildpack branch from dd00b9f to 868dc8f Oct 17, 2018
@costrouc costrouc force-pushed the costrouc:costrouc/nix-buildpack branch from 868dc8f to 1e3630c Oct 17, 2018
@costrouc

This comment has been minimized.

Copy link
Contributor Author

costrouc commented Oct 17, 2018

Fixed merge conflict and small documentation typo. Ready for merge pending any questions that you may have. I've described in the previous comments exactly what was included in this commit.

@betatim

This comment has been minimized.

Copy link
Member

betatim commented Nov 5, 2018

Let's merge this and then figure out #448.

@betatim betatim merged commit 5cdd0ee into jupyter:master Nov 5, 2018
3 checks passed
3 checks passed
codecov/patch 100% of diff hit (target 0%)
Details
codecov/project 75.64% (target 0%)
Details
continuous-integration/travis-ci/pr The Travis CI build passed
Details
@choldgraf

This comment has been minimized.

Copy link
Collaborator

choldgraf commented Nov 5, 2018

woooo! thanks for all the work @costrouc

@costrouc

This comment has been minimized.

Copy link
Contributor Author

costrouc commented Nov 5, 2018

Just out of curiosity when would this be available on mybinder? I am unfamiliar with the release cycle with that tool. Looking forward to showing it to others!

@betatim

This comment has been minimized.

Copy link
Member

betatim commented Nov 5, 2018

You could deploy it tomorrow if you want to. Deploys are made via a PR to https://github.com/jupyterhub/mybinder.org-deploy/ with a guide here: https://mybinder-sre.readthedocs.io/en/latest/deployment/how.html#updating-repo2docker

We usually batch deploys on Tuesdays and Thursdays so that people can plan to be around for a few hours after the deploy they made to revert/fix side effects.

While I have your attention maybe you have some thoughts on #448.

@costrouc

This comment has been minimized.

Copy link
Contributor Author

costrouc commented Nov 5, 2018

Thank you! I looked in the PRs and saw how easy it is to update.

I commented here with my thoughts earlier #448 (comment). I don't expect any changes to the base image to have any effect on nix so I think the change is perfectly fine. It will only affect the resulting docker image size from my perspective.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
5 participants
You can’t perform that action at this time.