Customizable user environments #67

Closed
mrocklin opened this issue Jan 12, 2018 · 6 comments
@mrocklin
Member

We're getting a few comments from @rsignell-usgs and @rabernat about having their own custom user environments both in their local environment and in their worker environments.

They can already customize worker environments in two ways: by building custom Docker images (manually or with https://github.com/jupyter/repo2docker), or by setting the EXTRA_PIP_PACKAGES and EXTRA_CONDA_PACKAGES environment variables.
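For illustration, here is a minimal sketch of how those two variables might be attached to a worker container. The dictionary follows the usual Kubernetes container layout (`name`, `image`, `env`); the helper name `worker_container`, the container name, and the space-separated value format are assumptions made for this example, not something the worker image is confirmed to require.

```python
def worker_container(image, pip_packages=(), conda_packages=()):
    """Build a Kubernetes-style container dict for a dask worker.

    Hypothetical helper: the space-separated value format is an assumption.
    """
    env = []
    if pip_packages:
        env.append({"name": "EXTRA_PIP_PACKAGES", "value": " ".join(pip_packages)})
    if conda_packages:
        env.append({"name": "EXTRA_CONDA_PACKAGES", "value": " ".join(conda_packages)})
    # Only the variables that were actually requested end up in the spec.
    return {"name": "dask-worker", "image": image, "env": env}
```

Calling `worker_container("my-image", pip_packages=["git+https://github.com/dask/dask"])` (with a placeholder image name) would yield a container entry carrying only EXTRA_PIP_PACKAGES.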

Ideally, they could manage environments in their notebooks using standard pip/conda commands from the terminal. I've personally not been able to get this to work (see jupyterhub/zero-to-jupyterhub-k8s#393).

Generally I'm curious what the right way is to approach this. I suspect that it has been well handled before. cc @yuvipanda @choldgraf

@rsignell-usgs
Member

rsignell-usgs commented Jan 12, 2018

I figured it out. If I add nb_conda_kernels to the root environment, then my custom environments show up in the dropdown menu (as long as those environments also have nb_conda_kernels in them).

[Screenshot: 2018-01-11_22-00-44]

@rsignell-usgs
Member

rsignell-usgs commented Jan 12, 2018

See jupyterhub/zero-to-jupyterhub-k8s#393 (comment) for persisting custom environments.

@rabernat
Member

Here are some of the use cases I would like to support.

  • I want to change my default environment. The desired environment is defined in an environment.yaml file that lives on GitHub.
  • I have multiple different environments for different projects. I would like to be able to pick which one to use when I launch a notebook from JupyterHub.
  • I'm in an interactive session (using the dask cluster) and realize I need another package. I install it from conda / pip / git and somehow push this new environment out to the dask workers. (I recognize the cluster will have to be restarted; that's ok.)

@mrocklin
Member Author

It's worth noting that all of @rabernat's situations apply both to the notebook and to the workers. There are two competing concerns here:

  1. Rapid development of the local environment. These users are actively writing code, and pulling in new development versions of packages as they iterate on solutions.
  2. Easy deployment of that environment across a cluster. Workers will need to match the local environment pretty exactly.

Our current solution to the rapidly changing environment is to have the dask-workers check for the EXTRA_PIP_PACKAGES environment variable before loading and pip install the contents of that variable. This allows users to direct their workers to download from git repositories. There is a similar EXTRA_CONDA_PACKAGES environment variable, but the 10-30 seconds that conda spends in dependency solving ends up being pretty annoying for interactive use.
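A rough sketch of that startup check, written in Python for illustration only. The actual worker image may well do this in a shell entrypoint; the function name `build_install_commands`, the conda-before-pip ordering, and the assumption that the variables hold space-separated package specs are all hypothetical.

```python
import os
import shlex
import sys


def build_install_commands(environ=None):
    """Return the install commands implied by the EXTRA_* variables.

    Hypothetical helper; nothing is executed here, the commands are
    only assembled as argument lists.
    """
    environ = os.environ if environ is None else environ
    commands = []

    conda_packages = shlex.split(environ.get("EXTRA_CONDA_PACKAGES", ""))
    if conda_packages:
        # Assumed ordering: conda first, since its dependency solve is the slow step.
        commands.append(["conda", "install", "--yes", *conda_packages])

    pip_packages = shlex.split(environ.get("EXTRA_PIP_PACKAGES", ""))
    if pip_packages:
        # Use the running interpreter's pip so packages land in the worker's environment.
        commands.append([sys.executable, "-m", "pip", "install", *pip_packages])

    return commands
```

A worker entrypoint could then pass each returned list to `subprocess.check_call` before starting the worker process. Entries such as `git+https://github.com/dask/dask` go through unchanged, which is what lets users point workers at git repositories.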

This only solves the problem for small changes to the environment. In other cases, such as when users want to switch between anaconda/defaults and conda-forge packages (for example, to use different versions of GDAL), the pip solution certainly doesn't work. Currently the solution is to encourage them to build a Docker image and point KubeCluster to that. This is probably not as convenient for the average scientist. Conda environment.yml files might be a solution; interactive construction of environments followed by conda list --export might be another.
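To make the "workers must match the local environment" requirement concrete, here is a small sketch that compares two `conda list --export` dumps. It assumes the usual export format of one `name=version=build` entry per line with `#` comment lines; the function names are made up for this example.

```python
def parse_conda_export(text):
    """Map package name -> (version, build) from `conda list --export` text."""
    packages = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and the comment header
        name, _, rest = line.partition("=")
        version, _, build = rest.partition("=")
        packages[name] = (version, build)
    return packages


def diff_environments(local_text, worker_text):
    """Return {name: (local, worker)} for packages that differ or are missing."""
    local = parse_conda_export(local_text)
    worker = parse_conda_export(worker_text)
    mismatches = {}
    for name in sorted(set(local) | set(worker)):
        if local.get(name) != worker.get(name):
            # None on either side means the package is absent there.
            mismatches[name] = (local.get(name), worker.get(name))
    return mismatches
```

An empty result would mean the two exports agree on every package, version, and build string. Anything else points at what a rebuilt image or a new environment.yml would need to fix.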

cc @jjhelmus as a conda representative. There may be someone better suited to take part in this conversation, but @jjhelmus has some background with this community. Jonathan, feel free to ignore this completely; I just thought that someone working on conda packaging might want to be aware of our use case here.

@stale

stale bot commented Jun 15, 2018

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@stale stale bot added the stale label Jun 15, 2018
@stale

stale bot commented Jun 22, 2018

This issue has been automatically closed because it had not seen recent activity. The issue can always be reopened at a later date.

@stale stale bot closed this as completed Jun 22, 2018