feat: Add lifecycle hooks to VSCode and Jupyterlab Workspaces to provide persistence of conda, pip, and extensions. #646


aleksandrmelnikov
Contributor

What this PR does:
Adds a design document of the implementation.

  • Particularly, the background and explanation of the solution.

Adds migrations to add lifecycle hooks into the workspace templates.

Which issue(s) this PR fixes:

Fixes #623

Special notes for your reviewer:

Checklist

Please check if applies

  • I have added/updated relevant unit tests
  • I have added/updated relevant documentation

Required

  • I accept to release these changes under the Apache 2.0 License

- Added lifecycle hooks to template
- This enables persistence of conda, pip, and jupyter labextensions.
- Added lifecycle hooks to template
- This enables persistence of conda, pip, and vscode extensions.
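A minimal sketch of the save/restore pair that such lifecycle hooks could run for VS Code extensions, assuming a preStop hook writes the list and a postStart hook reads it back. The function names and the CODE_SERVER override are hypothetical and are not taken from the migration; `--list-extensions` and `--install-extension` are existing code-server flags.

```shell
# Hypothetical helpers; CODE_SERVER can be overridden (for example, in tests).
CODE_SERVER="${CODE_SERVER:-code-server}"

# preStop side: write one extension ID per line to a file on the persistent volume.
save_vscode_extensions() {
    "$CODE_SERVER" --list-extensions > "$1"
}

# postStart side: reinstall every saved extension.
# A missing list file is not an error (fresh workspace).
restore_vscode_extensions() {
    [ -f "$1" ] || return 0
    while IFS= read -r ext; do
        if [ -n "$ext" ]; then
            # Detach stdin so the installer cannot consume the rest of the list.
            "$CODE_SERVER" --install-extension "$ext" < /dev/null || return 1
        fi
    done < "$1"
}
```

In a hook this might be called as `save_vscode_extensions /data/.vscode-extensions.txt` on shutdown and `restore_vscode_extensions /data/.vscode-extensions.txt` on start.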
@aleksandrmelnikov aleksandrmelnikov self-assigned this Oct 8, 2020
@aleksandrmelnikov aleksandrmelnikov marked this pull request as ready for review October 9, 2020 19:16
@rushtehrani
Contributor

@aleksandrmelnikov getting this error when running the migration. Same error when I directly copy/paste the template:

FATA[0004] Failed to run database go migrations: failed to run Go migration "20201008153033_update_jupyter_lab_template.go": error converting YAML to JSON: yaml: line 21: found character that cannot start any token 

@aleksandrmelnikov
Contributor Author

Please don't merge this yet; I need to move the docs to another repo.

@rushtehrani
Contributor

It looks like restoring pip/conda packages this way adds an extra 10-minute delay (with JupyterLab). We need to verify whether conda is taking a long time to restore or whether the delay comes from somewhere else.

@rushtehrani
Contributor

It seems that conda env update attempts to update all packages, even those that are already installed.

There is an open issue for this: conda/conda#10218

@aleksandrmelnikov
Contributor Author

Good find on the "update all packages".
I'm wondering if instead of restoring the environment, we can add some logic to compare the packages.

  • If a package already exists, skip it; if it doesn't, install it?

@rushtehrani
Contributor

I'm wondering if instead of restoring the environment, we can add some logic to compare the packages.

Possibly, or if there is a way to only export user installed packages...
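The comparison idea above can be sketched as a set difference over two name lists. This is an illustration only: the function name and the one-name-per-line file format are assumptions, and it compares names without looking at versions. For the "only user installed packages" idea, conda documents a `--from-history` flag on `conda env export` that limits the export to explicitly requested packages; whether that fits this workflow is not verified here.

```shell
# Print every package name listed in $1 (saved) that is absent from $2 (installed).
# Both files are assumed to hold one package name per line, in any order.
missing_packages() {
    saved_sorted=$(mktemp)
    installed_sorted=$(mktemp)
    sort -u "$1" > "$saved_sorted"
    sort -u "$2" > "$installed_sorted"
    # comm -23 keeps only the lines unique to the first (saved) file.
    comm -23 "$saved_sorted" "$installed_sorted"
    rm -f "$saved_sorted" "$installed_sorted"
}
```

Only the names printed by `missing_packages` would then need to be installed on resume, instead of updating the whole environment.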

aleksandrmelnikov and others added 2 commits October 26, 2020 13:29
- Otherwise, the extension installation process also kicks off a
production minimization of code, which takes minutes.
- This leads to Jupyterlab Workspace taking ~10 minutes to resume, even
if no new extensions had been installed.
@aleksandrmelnikov
Contributor Author

@rushtehrani Note the change in the migration to jupyterlab. This should fix the long start time for it.

Time command to debug postStart hook:
Jupyterlab

condayml="/data/.environment.yml";
jupytertxt="/data/.jupexported.txt";
if [ -f "$condayml" ]; then { time conda env update -f "$condayml"; } 2>> /data/time.txt; fi;
if [ -f "$jupytertxt" ]; then { time cat "$jupytertxt" | xargs -n 1 jupyter labextension install --no-build; } 2>> /data/juptime.txt; fi;

VScode

condayml="/data/.environment.yml";
vscodetxt="/data/.vscode-extensions.txt";
if [ -f "$condayml" ]; then { time conda env update -f "$condayml"; } 2>> /data/time.txt; fi;
if [ -f "$vscodetxt" ]; then { time cat "$vscodetxt" | xargs -n 1 code-server --install-extension; } 2>> /data/juptime.txt; fi;
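
With the `{ time ...; } 2>> file` form used above, anything the command itself writes to stderr lands in the same file as the timing. A possible alternative, sketched here with a hypothetical helper name, records only the elapsed wall-clock seconds and the exit status:

```shell
# Run a command and append "<seconds>s (exit <status>): <command>" to a log file.
# Usage: log_elapsed LOGFILE COMMAND [ARGS...]
log_elapsed() {
    log=$1
    shift
    start=$(date +%s)
    "$@"
    status=$?
    end=$(date +%s)
    echo "$((end - start))s (exit $status): $*" >> "$log"
    # Preserve the command's exit status for the caller.
    return "$status"
}
```

For example, `log_elapsed /data/time.txt conda env update -f /data/.environment.yml` would add a single line to the log. Resolution is whole seconds, which is coarse but enough for steps that take minutes.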

- This is SQL and will never evaluate to true. These templates are
go migrations, not sql based.
See onepanelio#646
for context.
@aleksandrmelnikov
Contributor Author

All issues should be resolved.

@rushtehrani
Contributor

rushtehrani commented Oct 27, 2020

Getting the message below after resuming a paused JupyterLab workspace.

I'm not sure if the --no-build flag is the way to go if this is going to happen... I did click "rebuild" in this case and it only took about 2 mins to rebuild and another 20 seconds to restart. Is the rebuild triggered by the UI any different from the command-line version?

image

- This ensures all the extensions installed with "--no-build" will
be compiled in one go, instead of individually.
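A rough sketch of the "install with --no-build, then build once" approach this commit describes. The function name and the JUPYTER override are hypothetical and this is not the code from the migration; `jupyter labextension install --no-build` and `jupyter lab build` are the real commands it wraps.

```shell
# Hypothetical helper; JUPYTER can be overridden (for example, in tests).
JUPYTER="${JUPYTER:-jupyter}"

# Install every extension listed in $1 without building, then build once.
restore_labextensions() {
    [ -f "$1" ] || return 0
    while IFS= read -r ext; do
        if [ -n "$ext" ]; then
            # Stop at the first failed install so the build below is skipped.
            "$JUPYTER" labextension install --no-build "$ext" < /dev/null || return 1
        fi
    done < "$1"
    # One build for all extensions instead of one per extension.
    "$JUPYTER" lab build
}
```

Called as `restore_labextensions /data/.jupexported.txt`, this pays the asset build cost a single time per resume.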
@aleksandrmelnikov
Contributor Author

@rushtehrani I was able to recreate this error and I pushed up a fix.
I tested with the following packages.

conda install nbresuse 
jupyter labextension install --no-build jupyterlab-topbar-extension jupyterlab-system-monitor

The time to install and build seems to be ~5-6 minutes.

root@jup2-0:/data# tail jupbuild.txt 
[LabBuildApp] JupyterLab 2.2.5
[LabBuildApp] Building in /opt/conda/share/jupyter/lab
[LabBuildApp] Building jupyterlab assets (build:prod:minimize)

real	2m37.921s
user	3m25.626s
sys	0m48.704s
root@jup2-0:/data# tail time.txt 

Please update conda by running

    $ conda update -n base conda



real	2m8.134s
user	1m21.518s
sys	0m5.535s
root@jup2-0:/data# 

- We want it to run only if the installation succeeds.
@rushtehrani
Contributor

rushtehrani commented Oct 27, 2020

Based on my testing:

  • Launch with no builds: 1m 30s
  • Launch with latest template (with builds): ~8m
  • Launch with --minimize=False flag set: ~4m

The last option seems best as far as startup time goes. However, since it doesn't minimize the extension files, it could slow down browser load times.

Question: is there a way to save JupyterLab extensions in a mounted directory rather than in a directory on the OS disk?

@aleksandrmelnikov
Contributor Author

Let me take a look at changing the jupyter installation directory.

@rushtehrani
Contributor

Let me take a look at changing the jupyter installation directory.

Sounds good. If that is not an option, I think our best bet is to set the --minimize flag.

@aleksandrmelnikov
Contributor Author

Let me take a look at changing the jupyter installation directory.

Sounds good. If that is not an option, I think our best bet is to set the --minimize flag.

@rushtehrani Investigated this a bit.
Jupyter lab supports a --app-dir flag.

This flag dictates where JupyterLab installs its assets and extensions.

[LabBuildApp] Building in /opt/conda/share/jupyter/lab
That's our current default.

We can change that to /data/jupyterlab, or any other volume.

If we go ahead and save the extensions and jupyter to the mounted disk, we have to make sure the workspace resumes and uses that directory for jupyterlab.

  • Either through --app-dir or through an environment variable JUPYTERLAB_DIR.

I'm testing if JUPYTERLAB_DIR will work on start-up.
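To illustrate the two options, here is a small sketch of the directory precedence as understood here: an explicit `--app-dir` value first, then the JUPYTERLAB_DIR environment variable, then the default from the build log above. The helper is hypothetical and only mirrors that order; it is not JupyterLab's own resolution code.

```shell
# Default application directory, taken from the build log quoted above.
DEFAULT_LAB_DIR=/opt/conda/share/jupyter/lab

# Print the directory JupyterLab would be expected to use.
# $1 (optional) stands in for a value passed via --app-dir.
lab_app_dir() {
    if [ -n "$1" ]; then
        printf '%s\n' "$1"
    else
        printf '%s\n' "${JUPYTERLAB_DIR:-$DEFAULT_LAB_DIR}"
    fi
}
```

On a real workspace, `jupyter lab path` prints the application directory actually in effect, which is a quick way to confirm whether the environment variable was picked up on start-up.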

But is your intent to save the extensions to that volume on shut-down?
And restore from that volume?

@rushtehrani
Contributor

rushtehrani commented Oct 28, 2020

@aleksandrmelnikov this is only going to be useful if it allows us to either bypass build or reduce the time it takes to build the extensions.

If it does, then putting it in /data/.jupyterlab would probably make the most sense, assuming we can still use /data as the working directory.

The intent is to always save the data there and not have to move it on start or shutdown.

How would this work with extensions already installed in the image? Can we separate user-installed extensions?

- This is a trade-off to reduce workspace start-up time, but increase
browser load times for Jupyterlab.
@aleksandrmelnikov
Contributor Author

Per discussion here: https://onepanelio.slack.com/archives/GJ88JMG9E/p1603913216187400

We're going to go ahead and use the --minimize=False flag as a trade-off.
Using volumes will require additional steps and logic to ensure it works properly.
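
As an illustration of the trade-off, a build step could take a mode argument and pass `--minimize=False` only when start-up time matters more than asset size. The function name, the "fast" mode, and the JUPYTER override are assumptions made for this sketch; the merged template may simply hard-code the flag.

```shell
# Hypothetical helper; JUPYTER can be overridden (for example, in tests).
JUPYTER="${JUPYTER:-jupyter}"

# Build JupyterLab assets. Pass "fast" to skip minification:
# quicker workspace start-up, larger files for the browser to load.
build_lab() {
    if [ "$1" = "fast" ]; then
        "$JUPYTER" lab build --minimize=False
    else
        "$JUPYTER" lab build
    fi
}
```

In the timings reported earlier in this thread, the unminimized build brought launch time down from roughly 8 minutes to roughly 4.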

@rushtehrani rushtehrani merged commit 79d4c44 into onepanelio:master Oct 28, 2020
@aleksandrmelnikov aleksandrmelnikov deleted the feat/core.623-persist.packages.between.ws.switch.pause branch October 28, 2020 20:57
Development

Successfully merging this pull request may close these issues.

Persist pip packages, jupyterlab extensions, and vscode extensions between switching and pausing.
3 participants