[WIP] Add saving image to tarball and a buildpack to load it #778

Open
wants to merge 6 commits into
base: master
from

@nuest
Contributor

commented Sep 6, 2019

This adds two features:

  • a --save-image option that saves the image to a file image.tar in the binder directory
  • a TarballBuildPack that will load and run that image if a file image.tar is found
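
The two features above roughly correspond to a `docker save` followed later by a `docker load`. A minimal sketch of that round trip, assuming the Docker SDK for Python (`Image.save` and `client.images.load`); the helper names `save_image` and `load_image` and the default path are illustrative and are not taken from this PR's diff:

```python
from pathlib import Path

# Hypothetical default location, mirroring the "binder/image.tar" convention above.
DEFAULT_TARBALL = Path("binder") / "image.tar"


def save_image(client, tag, target=DEFAULT_TARBALL):
    """Write the image named `tag` to `target` as a tarball.

    `client` is expected to behave like `docker.from_env()`.
    """
    image = client.images.get(tag)
    target = Path(target)
    target.parent.mkdir(parents=True, exist_ok=True)
    with target.open("wb") as f:
        # named=True asks the SDK to keep repository/tag information in the tarball.
        for chunk in image.save(named=True):
            f.write(chunk)
    return target


def load_image(client, source=DEFAULT_TARBALL):
    """Load a tarball written by save_image and return the loaded images."""
    with Path(source).open("rb") as f:
        # Reads the whole tarball into memory; acceptable for a sketch only.
        return client.images.load(f.read())
```

With a real daemon this would be called as `save_image(docker.from_env(), "r2d-2e1567783481")`. Passing the client in keeps the helpers easy to exercise with a stub.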

This can be useful if a specific workflow should be preserved. It currently prints a warning if the r2d version used to create the image mismatches the one used to load it, but with #550 it could also switch to that version (pending changes discussed in #490).
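
The version warning described above could be sketched as a comparison between a label stored on the image and the running version. The label name `repo2docker.version`, the function name `check_version`, and the message wording are assumptions for illustration; the actual implementation in this PR may differ:

```python
import logging

log = logging.getLogger("tarball-sketch")

# Assumed label under which the image records the repo2docker version that built it.
VERSION_LABEL = "repo2docker.version"


def check_version(labels, running_version):
    """Return True if the image's recorded version matches the running one.

    `labels` is the image's label dictionary. A missing label is treated
    as a mismatch. A mismatch only logs a warning; it does not abort.
    """
    image_version = labels.get(VERSION_LABEL)
    if image_version == running_version:
        return True
    log.warning(
        "version mismatch: image label has %r but running %r",
        image_version,
        running_version,
    )
    return False
```

Returning a boolean instead of raising leaves the caller free to continue with a warning, as the PR currently does, or to switch versions once #550 and #490 are resolved.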

I did not test with --subdir yet.

Try out locally with tarball from Zenodo:

repo2docker https://sandbox.zenodo.org/record/367144

Here is an example interaction.

(binderhubsprint) daniel@nuest:~/git/elife-sprint/repo2docker/tests/conda/binder-dir$ repo2docker --save-image --no-run .
Picked Local content provider.
Using local repo ..
Using CondaBuildPack builder
Step 1/51 : FROM buildpack-deps:bionic
 ---> 536a38f87e4b
Step 2/51 : ENV DEBIAN_FRONTEND=noninteractive
 ---> Using cache
 ---> d520bc3e9203
[...]
Step 51/51 : CMD ["jupyter", "notebook", "--ip", "0.0.0.0"]
 ---> Running in a3ffe7bb8bf8
Removing intermediate container a3ffe7bb8bf8
 ---> 5311e39c82be
{"aux": {"ID": "sha256:5311e39c82be10a6a0a0940c3e08fc904d66fc53f0b3e7d5df2163181668cc10"}}Successfully built 5311e39c82be
Successfully tagged r2d-2e1567783481:latest
Saving image to file binder/image.tar
Successfully saved image
(binderhubsprint) daniel@nuest:~/git/elife-sprint/repo2docker/tests/conda/binder-dir$ tree .
.
├── binder
│   ├── environment.yml
│   └── image.tar
├── Dockerfile
├── environment.yml
└── verify

1 directory, 5 files
(binderhubsprint) daniel@nuest:~/git/elife-sprint/repo2docker/tests/conda/binder-dir$ repo2docker .
Picked Local content provider.
Using local repo ..
Using TarballBuildPack builder
[I 15:33:33.679 NotebookApp] Writing notebook server cookie secret to /home/daniel/.local/share/jupyter/runtime/notebook_cookie_secret
[I 15:33:33.973 NotebookApp] JupyterLab extension loaded from /srv/conda/envs/notebook/lib/python3.5/site-packages/jupyterlab
[I 15:33:33.973 NotebookApp] JupyterLab application directory is /srv/conda/envs/notebook/share/jupyter/lab
[I 15:33:33.979 NotebookApp] nteract extension loaded from /srv/conda/envs/notebook/lib/python3.5/site-packages/nteract_on_jupyter
[I 15:33:33.981 NotebookApp] Serving notebooks from local directory: /home/daniel
[I 15:33:33.981 NotebookApp] The Jupyter Notebook is running at:
[I 15:33:33.981 NotebookApp] http://127.0.0.1:33661/?token=016d6f08e11e5e22f896d7ae48401726b650770fa4954079
[I 15:33:33.981 NotebookApp] Use Control-C to stop this server and shut down all kernels (twice to skip confirmation).
[W 15:33:33.981 NotebookApp] No web browser found: could not locate runnable browser.
[C 15:33:33.982 NotebookApp] 
    
    Copy/paste this URL into your browser when you connect for the first time,
    to login with a token:
        http://127.0.0.1:33661/?token=016d6f08e11e5e22f896d7ae48401726b650770fa4954079
@manics

commented Sep 6, 2019

I'm curious, what made you prefer this workflow over pushing to a docker registry?

@nuest

Contributor Author

commented Sep 6, 2019

@manics A tarball deposited in a scientific data repository alongside the data that was analysed is (hopefully) available long term. That is more likely to happen, and to be accepted in a scholarly context, than a scholarly publisher running a container registry long term.

Does that make sense to you?

@nuest

Contributor Author

commented Sep 11, 2019

As a further illustration, here is the log I just got from running the example from the Zenodo Sandbox after the merge with master:

$ repo2docker https://sandbox.zenodo.org/record/367144
Picked Zenodo content provider.
Fetching Zenodo record 367144.
Fetching image.tar
Using TarballBuildPack builder
repo2docker version missmatch: image label has '0.10.0+14.gb20eb6a.dirty' but running '0.10.0+55.g371b925'
[I 14:03:07.617 NotebookApp] Writing notebook server cookie secret to /home/daniel/.local/share/jupyter/runtime/notebook_cookie_secret
[I 14:03:07.900 NotebookApp] JupyterLab extension loaded from /srv/conda/envs/notebook/lib/python3.7/site-packages/jupyterlab
[I 14:03:07.900 NotebookApp] JupyterLab application directory is /srv/conda/envs/notebook/share/jupyter/lab
[I 14:03:07.905 NotebookApp] nteract extension loaded from /srv/conda/envs/notebook/lib/python3.7/site-packages/nteract_on_jupyter
[I 14:03:07.906 NotebookApp] Serving notebooks from local directory: /home/daniel
[I 14:03:07.906 NotebookApp] The Jupyter Notebook is running at:
[I 14:03:07.906 NotebookApp] http://127.0.0.1:54013/?token=bc2ae6e7fd87a9447cb0eba74e1eee44111042179788d767
[I 14:03:07.906 NotebookApp] Use Control-C to stop this server and shut down all kernels (twice to skip confirmation).
[W 14:03:07.911 NotebookApp] No web browser found: could not locate runnable browser.
[C 14:03:07.911 NotebookApp] 
    
    To access the notebook, open this file in a browser:
        file:///home/daniel/.local/share/jupyter/runtime/nbserver-1-open.html
    Or copy and paste one of these URLs:
        http://127.0.0.1:54013/?token=bc2ae6e7fd87a9447cb0eba74e1eee44111042179788d767

Note the "version mismatch" message in line 5 of the log, which #490 could potentially solve.

@manics

commented Sep 11, 2019

@nuest I see what you're getting at. Would using an established Docker registry such as Docker Hub or quay.io work? One issue with tar files is that you still need to publish them and ideally make them discoverable, which means adding metadata wherever they're hosted.

@nuest

Contributor Author

commented Sep 11, 2019

@manics I admit I did not consider container registries so far. AFAIK it's not so easy to download a tarball from a registry without a docker client (see https://devops.stackexchange.com/questions/2731/downloading-docker-images-from-docker-hub-without-using-docker), so a ContainerRegistryBuildpack would make more sense to me in that case.

I'd like to cover the case where users of BinderHub intentionally create a snapshot of their Binder and publish it in a data repository, with the metadata needed to enable discovery.

@nuest

Contributor Author

commented Sep 11, 2019

@betatim Any idea why the tests might fail on Travis but not locally?

Also note that the tests do fail on Azure, but the job says "successful": https://dev.azure.com/jupyter/repo2docker/_build/results?buildId=25&view=logs&jobId=7ff9283a-ab30-5e9f-8967-b9fdc546360c
