Treat jupyter as an editor (decouple jupyter and user environments) #868

Open
mdeff opened this issue Mar 27, 2020 · 7 comments

@mdeff
Contributor

mdeff commented Mar 27, 2020

I think jupyter should be treated as an editor, decoupled from the user's code and environment. As a start, it could be installed in its own environment, separate from the environment where the user code runs. (This is true in general: desktop users are better off with a system-installed jupyter and kernelspecs pointing to their environments.)
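
A minimal sketch of what "kernelspecs pointing to environments" can look like, written by hand rather than taken from repo2docker. A kernelspec is a directory holding a `kernel.json` whose `argv` names the interpreter to launch. The environment prefix, the kernel name `experiment`, and the display name below are hypothetical placeholders, and the spec is written to a temporary directory by default so the script is safe to run anywhere.

```shell
#!/bin/sh
set -eu

# Hypothetical prefix of the environment that holds the user code.
ENV_PREFIX="${ENV_PREFIX:-/opt/envs/experiment}"
# Where to write the kernelspec; defaults to a throwaway directory.
KERNEL_DIR="${KERNEL_DIR:-$(mktemp -d)/kernels/experiment}"

mkdir -p "$KERNEL_DIR"

# The editor only needs this file; the interpreter in argv belongs to the
# user environment. {connection_file} is filled in by jupyter at launch.
cat > "$KERNEL_DIR/kernel.json" <<EOF
{
  "argv": ["$ENV_PREFIX/bin/python", "-m", "ipykernel_launcher", "-f", "{connection_file}"],
  "display_name": "Python (experiment)",
  "language": "python"
}
EOF

echo "wrote $KERNEL_DIR/kernel.json"
```

On Linux, a per-user kernelspec would typically live under `~/.local/share/jupyter/kernels/`. Note that this only relocates the editor: the target environment still needs ipykernel installed for the kernel to start.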

The main issue with the current setup arises when running old code with a modern jupyter, because dependencies might conflict. That is especially true when their versions are specified by the user (in requirements.txt, environment.yml, etc.) for reproducibility's sake. I have in mind scientific experiments which, for reproducibility, shouldn't be updated (unlike living projects).

There is, however, a deeper issue: ipykernel itself has many dependencies that must be installed in the user environment. Longer term, it would be good to decouple ipykernel and jupyter further (to execute code with an old ipykernel specified by the user environment from a modern jupyter editor). Even better would be to push the interface down to the notebook file to entirely decouple the editor from the user environment. After all, if jupyter is an editor, running notebooks should not require installing anything in the user environment.

This is potentially a far-fetched and long-term issue, and I have little idea how to realize it technically, if it is even possible. Please let me know your thoughts and whether it has already been discussed elsewhere. As always, thanks for your amazing work!

@betatim
Member

betatim commented Mar 27, 2020

We need to do a bit of digging but there should be a few issues already around splitting the environment of the kernel and the environment of the notebook server. This is what we do for Python 2 already. The kernel runs in an environment with Python 2, but the notebook server (and other r2d infrastructure) runs on Python 3.

We have talked about adopting that approach for Python in general. One immediate hurdle is that you now need two files: one to specify dependencies for your kernel and one for the notebook server (for example to change the JupyterLab version or install an extension). How to solve this nicely is a problem for which we need a few attempts (I think).

Someone who wants to work on this or explore options via code examples would be super welcome.


As a general point we will continue to ship Jupyter as the default UI with repo2docker. We need some form of default UI that can be accessed over the web. Jupyter seems like a good fit for that. We also have a bit of infrastructure already to proxy other UIs. There are even examples like https://github.com/danlester/binderhub-voila-direct and/or https://github.com/danlester/binderhub-voila-native that run without installing Jupyter (well, they install voila which is Jupyter but ... they show how you could run something else :) ). So I think repo2docker will continue with shipping Jupyter as the UI.

@mdeff
Contributor Author

mdeff commented Mar 27, 2020

I didn't know it already worked this way for Python 2. Great! It shouldn't be too much of a hurdle to do it for Python 3 then. What's the situation for non-Python kernels? All the better if it unifies operations across kernels.

The notebook server environment could be specified in a .binder/requirements_ui.txt. It also makes sense from a reproducibility point of view, as you might want to update your UI (or use r2d's default) while preserving the experiment code and environment.

Completely agree with shipping Jupyter as the default UI. My point in general is to refrain from "polluting" (or altering) the user code environment.

@mdeff
Contributor Author

mdeff commented Jun 15, 2020

Some thoughts about a potential shorter-term fix. It's possible to pin the base environment (specified in repo2docker/buildpacks/conda/environment.yml) by pinning those packages (and their dependencies) in the repo's environment.yml or requirements.txt. But what if some packages shouldn't be installed (like jupyterlab on ancient environments)? Or if newer versions of repo2docker add packages there?

As long as ipykernel needs to alter the user environment through its dependencies, separating the user and editor environments will only solve part of the issue. In the meantime, we need a way to pin the editor environment that is injected into the user environment.
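
To make the pinning idea above a little more concrete, here is a small hedged helper that reports which editor-side packages a repo's requirements.txt does not pin exactly. The function name `unpinned_packages` is hypothetical, the package list is up to the caller, and the match is a plain text check for `name==`, so it does not normalize hyphens and underscores or understand environment.yml.

```shell
#!/bin/sh
# unpinned_packages REQUIREMENTS NAME...
# Prints each NAME that has no exact "name==version" line in REQUIREMENTS.
unpinned_packages() {
  reqs="$1"
  shift
  for name in "$@"; do
    # Case-insensitive match of "name ==" at the start of a line.
    if ! grep -Eiq "^[[:space:]]*${name}[[:space:]]*==" "$reqs"; then
      echo "$name"
    fi
  done
}

# Example call (not executed here):
#   unpinned_packages requirements.txt jupyterlab notebook ipykernel
```

Any name it prints would be free to float whenever the base environment changes, which is the situation the questions above are worried about.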

@mdeff
Contributor Author

mdeff commented Jun 18, 2020

For reference, I ended up achieving the desired independence by creating a venv from postBuild, requesting the Python version in runtime.txt, and deferring to the default conda env for the jupyter UI.

# .binder/postBuild
python3.6 -m venv ./env
./env/bin/pip install -r requirements.txt
# Shadow the default kernelspec for jupyter to use our environment by default.
./env/bin/python -m ipykernel install --user

The problem for https://github.com/mdeff/fma is that the old computational environment (that I want to preserve for reproducibility) is not compatible with any jupyterlab (required by r2d's default conda env). I like this solution because my users can run the frozen env from the cloud in the latest jupyterlab! Happy to make a binder-example if you think that's a good (even if temporary) solution.
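
One way to check that the shadowed kernelspec really launches the venv interpreter is to read the first `argv` entry of the installed `kernel.json`. This is a rough sketch, not part of the postBuild above: the helper name `kernel_python` is hypothetical, the parsing is a simple text match rather than a real JSON parser, and the path assumes the Linux per-user location and the default kernel name `python3`.

```shell
#!/bin/sh
# kernel_python KERNEL_JSON
# Prints the first element of "argv" in a kernel.json file, i.e. the
# interpreter that jupyter launches for that kernel.
kernel_python() {
  tr -d '\n' < "$1" |
    sed -n 's/.*"argv"[[:space:]]*:[[:space:]]*\[[[:space:]]*"\([^"]*\)".*/\1/p'
}

# Assumed location of the kernelspec written by `ipykernel install --user`
# on Linux; adjust if your jupyter data directory differs.
spec="${HOME:-}/.local/share/jupyter/kernels/python3/kernel.json"
if [ -f "$spec" ]; then
  echo "default kernel runs: $(kernel_python "$spec")"
fi
```

If the printed path points into ./env, notebooks opened in the default UI run in the frozen environment while jupyterlab itself stays in the conda env.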

@manics
Member

manics commented Sep 18, 2020

Cross-referencing related issues:

@manics
Member

manics commented Sep 29, 2020

Another use-case: Disabling the mybinder.org jitsi extension jupyterhub/mybinder.org-deploy#1562 (comment)

@meeseeksmachine

This issue has been mentioned on Jupyter Community Forum. There might be relevant details there:

https://discourse.jupyter.org/t/previous-built-binder-repo-suddenly-with-404-error/13047/5
