Stop and restart Jupyter Servers while maintaining state #4857
Comments
/cc @kimwnasptd
As a workaround for now, the following kubectl commands can be used to trigger stop/restart:
|
@karlschriek how do those annotations work?
/home/jovyan should be on the PVC, so any git credentials or pip packages installed in /home/jovyan won't be lost if the user restarts the notebook using the same PVC.
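One way to check this for pip packages: `pip install --user` places packages in Python's per-user site directory, which normally sits under the home directory. Below is a minimal sketch, assuming the home directory is /home/jovyan and that it is the PVC mount point. `persists_on_home_volume` is a hypothetical helper name, not part of Kubeflow or Jupyter.

```python
import os
import site


def persists_on_home_volume(path: str, home: str = "/home/jovyan") -> bool:
    """Return True if `path` lies inside `home`.

    Assumption: `home` is the directory backed by the PVC, so anything
    under it survives when the same PVC is attached to a new server.
    """
    path = os.path.realpath(path)
    home = os.path.realpath(home)
    # commonpath compares whole path components, so /home/jovyan2 does
    # not count as being inside /home/jovyan.
    return os.path.commonpath([path, home]) == home


if __name__ == "__main__":
    # Where `pip install --user` would put packages for this interpreter.
    user_site = site.getusersitepackages()
    print(user_site, persists_on_home_volume(user_site))
```

Packages installed without `--user` usually land in the interpreter's system site-packages, which this check would report as outside the home volume.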
As far as I can gather, the annotations trigger a culling service that looks for pods to shut down (presumably it scales the StatefulSet down to zero). So in principle the functionality to stop and start a server already exists; it just needs to be exposed to users in the dashboard.

True, /home/jovyan will be on the PVC, and thinking about it a bit more, that might be sufficient as long as it is clear to users that they cannot (or should not) change any system-wide settings and expect them to still be there the next day. But this probably runs counter to what many users would expect.

This wouldn't matter with a stop/restart approach rather than a delete/create (with re-attach) approach, since in that case the state would be saved entirely. It also seems much more intuitive to me than asking users to remember which PVC they used and to attach it to a new server every time.
I think we can either improve the culling feature in Jupyter or add a new button that scales the StatefulSet to 0. Actually, both could be supported for different use cases. We should also show the notebook status in the UI.
How would replacing the controller's backend with a Knative Service work?
If this works:
it seems like we could get to this functionality with these changes:
I know there are probably a lot of weird edge cases, but it seems like those are all achievable steps without having to do major re-architecting?
For those watching, this feature has been implemented in PR #5280.
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
I am gonna bump this, because we have implemented it, but it's not merged into master yet.
Closing this issue, as the feature to start/stop a notebook server is integrated into the Jupyter Web App that is being released with 1.3. The issue of whether user-installed pip packages persist does remain. However, I have included a small section about this in the README for the new notebook images.
/kind feature
Why you need this feature:
I support data science teams with sophisticated Machine Learning toolsets and in particular have several teams right now who are interested in adopting Kubeflow largely because of the Jupyter Notebook functionality.
There is a lot to like about what is on there at the moment: you can customize the image you want to use, you can set the computing resources you need, you can attach volumes and configurations etc.
There are, however, some limitations. Most notably, you cannot stop a Jupyter Server and restart it the next day to continue where you left off. Sure, you can create a new server and reattach the PV that you worked on, but that will only restore the notebooks you created. Anything else you did (such as installing additional Python packages, changing Jupyter's theme, or setting git credentials) will be lost.
Most users will balk at this. In this case it would be much simpler for them to just start their own compute instance and run Jupyter from there, happily installing and changing stuff as they go along, and then just shutting the instance down at the end of the day.
Describe the solution you'd like:
Quite simply, a "shut down server" button in the central dashboard that scales the StatefulSet for the server down to zero, and a "start server" button that scales it back up again. I don't work with StatefulSets that often, so I will admit that I am not certain of the complexity involved here.
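To make the request concrete, here is a rough sketch of what the two buttons could do behind the scenes, using the standard `kubectl scale` command. It is an illustration only, not how Kubeflow implements it: the function names are hypothetical, and it assumes `kubectl` is on the PATH, that each server is backed by one StatefulSet with a single replica, and that no controller reverts a manual scale change.

```python
import subprocess
from typing import Callable, List


def scale_command(statefulset: str, namespace: str, replicas: int) -> List[str]:
    """Build a `kubectl scale` invocation for one StatefulSet."""
    if replicas < 0:
        raise ValueError("replicas must be non-negative")
    return [
        "kubectl", "scale", "statefulset", statefulset,
        f"--replicas={replicas}",
        "--namespace", namespace,
    ]


def stop_server(statefulset: str, namespace: str,
                run: Callable = subprocess.run):
    """'Shut down server': scale the StatefulSet down to zero pods."""
    return run(scale_command(statefulset, namespace, 0), check=True)


def start_server(statefulset: str, namespace: str,
                 run: Callable = subprocess.run):
    """'Start server': scale the StatefulSet back up to one pod."""
    return run(scale_command(statefulset, namespace, 1), check=True)
```

The `run` parameter exists only so the commands can be inspected or tested without a cluster; calling `stop_server("my-notebook", "my-namespace")` with the default would shell out to `kubectl`. Both names in that call are placeholders.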