Highly-available JupyterHub proxy with Traefik
JupyterHub uses a proxy to direct incoming user requests to
notebook servers. A proxy's routing table determines which
requests are sent where. For example, if a user's
server is available at the address
10.0.1.2:8000, the routing
table should contain a mapping from that user's URL path to that address.
For each request, the proxy determines the URL, consults this routing
table, and directs the request to the appropriate address.
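As a rough illustration of the lookup described above, the sketch below models a routing table as a Python dictionary and resolves a request path by longest matching prefix. The username `alice`, the `/user/alice/` prefix, the hub address, and the `resolve` helper are all assumptions made for this example; only the address 10.0.1.2:8000 comes from the text. A real proxy does considerably more than this.

```python
from typing import Dict, Optional

# Hypothetical routing table: URL path prefix -> backend address.
# "alice" and the default "/" target are illustrative, not from the project.
routing_table: Dict[str, str] = {
    "/": "http://10.0.1.1:8081",
    "/user/alice/": "http://10.0.1.2:8000",
}


def resolve(path: str, table: Dict[str, str]) -> Optional[str]:
    """Return the target of the longest prefix that matches path, if any."""
    matches = [prefix for prefix in table if path.startswith(prefix)]
    if not matches:
        return None
    # The most specific (longest) prefix wins over the catch-all "/".
    return table[max(matches, key=len)]
```

With this table, a request for `/user/alice/lab` would go to the user's server, while any other path falls through to the default `/` entry.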
As users start / stop their servers, JupyterHub must dynamically modify the routing table to add / remove routes. Modifying the routing table should also not disrupt requests that are currently being processed.
configurable-http-proxy is currently the most widely used proxy implementation for JupyterHub. The
routing table is kept in-memory, which means you can only run a
single copy of
configurable-http-proxy at a time. If the proxy
process is disrupted for some reason (the node it is running on goes down,
it uses too much memory, etc.), the whole JupyterHub is unavailable.
This is particularly a problem in dynamic, large-scale systems like
Zero to JupyterHub on Kubernetes, where
nodes dynamically come and go.
In this project, you will implement a JupyterHub proxy that uses traefik to do the routing, and etcd to store the routing table. This makes it easy to run multiple copies of the proxy at once, making the proxy highly available. You will also integrate this proxy implementation into our high-scale Kubernetes distribution, Zero to JupyterHub on Kubernetes.
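To make the idea of a shared routing table concrete, here is a minimal sketch in which two proxy replicas read and write routes through one key-value store. It rests on several assumptions: `FakeKVStore` is an in-memory stand-in and not the etcd client API, the key prefix is made up, and the method names (`add_route`, `delete_route`, `get_all_routes`) are modeled on JupyterHub's proxy interface, so check the JupyterHub documentation for the actual signatures. No traefik configuration is shown.

```python
import asyncio
import json


class FakeKVStore:
    """In-memory stand-in for etcd (hypothetical, not the real client API)."""

    def __init__(self):
        self._data = {}

    async def put(self, key, value):
        self._data[key] = value

    async def delete(self, key):
        self._data.pop(key, None)

    async def get_prefix(self, prefix):
        return {k: v for k, v in self._data.items() if k.startswith(prefix)}


class SharedStoreProxy:
    """Keeps no routes in memory; every replica reads the shared store."""

    PREFIX = "/jupyterhub/routes"  # hypothetical key prefix

    def __init__(self, store):
        self.store = store

    async def add_route(self, routespec, target, data):
        record = json.dumps(
            {"routespec": routespec, "target": target, "data": data}
        )
        await self.store.put(self.PREFIX + routespec, record)

    async def delete_route(self, routespec):
        await self.store.delete(self.PREFIX + routespec)

    async def get_all_routes(self):
        raw = await self.store.get_prefix(self.PREFIX)
        records = (json.loads(value) for value in raw.values())
        return {record["routespec"]: record for record in records}


async def demo():
    store = FakeKVStore()
    replica_a = SharedStoreProxy(store)
    replica_b = SharedStoreProxy(store)
    # A route added through one replica is visible through the other.
    await replica_a.add_route(
        "/user/alice/", "http://10.0.1.2:8000", {"user": "alice"}
    )
    seen_by_b = await replica_b.get_all_routes()
    await replica_b.delete_route("/user/alice/")
    after_delete = await replica_a.get_all_routes()
    return seen_by_b, after_delete
```

Because neither replica holds state of its own, losing one of them does not lose the routing table, which is the property the in-memory design lacks.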
configurable-http-proxy is written in nodejs, so it requires admins
to have nodejs installed before they can set up JupyterHub. This complicates
setup, since you need to have two runtimes installed (nodejs & python3) rather than
just one (python3). A stretch goal would be to make this situation better,
by writing a proxy that runs in the same process as JupyterHub & does all
the proxying required. This will make deploying JupyterHub much easier for
smaller installs, make debugging easier and have a host of other benefits.
How can applicants make a contribution to the project?
We require that students finish at least one project-specific microtask before they apply. https://github.com/jupyterhub/outreachy/labels/project-traefik-proxy lists the various microtasks that are specific to this project. You should complete at least one of them. Comment on the issue, or reach out to us at https://gitter.im/jupyterhub/jupyterhub for help!
Remember that we do not expect you to already have all the skills required to complete the tasks. Ask and we shall help!
You'll learn important development skills in this project:
- Asynchronous programming with Python
- Modeling & building distributed systems
- Tradeoffs between simplicity, high-availability & latency in distributed systems
- Direct experience with modern large-scale system tools, such as Kubernetes, etcd & traefik.
You'll also learn to work with a distributed community of people in various fields from across the world. Your work will be featured prominently on the Project Jupyter Blog, and lots of people around the world will likely use this proxy in many ways.
JupyterHub is gaining adoption in large-scale deployments that place a lot of value in highly available systems. The ability to make use of a highly-available proxy would be a big step in that direction. In the long term, it reduces the total amount of code the community will have to maintain, and lets the community easily benefit from improvements made in the traefik / etcd communities. There will also be other performance & reliability improvements as a side effect of this change.
Use this timeline as a starting point for this project in the application. Feel free to make adjustments as appropriate:
- Month 1:
  - Create a new jupyterhub-traefik-proxy package
  - Initial implementation of Proxy API for traefik
  - Common test suite for verifying uniform behavior of JupyterHub proxy implementations
- Month 2:
  - Test deployments of jupyterhub using the new proxy
  - Implementations of both local toml and etcd configurations
- Month 3:
  - Integrate traefik proxy implementation into zero-to-jupyterhub as the new default proxy
  - Profiling of performance with traefik proxy
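The common test suite from Month 1 could start from checks that only use the proxy interface, so the same check runs against every implementation. The sketch below is one possible shape, not the project's actual suite: `DictProxy` is a trivial in-memory implementation written for this example, `check_add_get_delete` is a hypothetical helper, and the method names are again assumed to follow JupyterHub's proxy interface.

```python
import asyncio


class DictProxy:
    """Trivial in-memory implementation, used only to exercise the check."""

    def __init__(self):
        self.routes = {}

    async def add_route(self, routespec, target, data):
        self.routes[routespec] = {
            "routespec": routespec,
            "target": target,
            "data": data,
        }

    async def delete_route(self, routespec):
        self.routes.pop(routespec, None)

    async def get_all_routes(self):
        return dict(self.routes)


async def check_add_get_delete(proxy):
    """Behavior that every proxy implementation is assumed to share."""
    routespec, target = "/user/test/", "http://127.0.0.1:9000"
    await proxy.add_route(routespec, target, {"test": "data"})
    routes = await proxy.get_all_routes()
    assert routes[routespec]["target"] == target
    await proxy.delete_route(routespec)
    assert routespec not in await proxy.get_all_routes()
    return True
```

Running `asyncio.run(check_add_get_delete(DictProxy()))` exercises the in-memory version; a traefik-backed implementation could be passed to the same function without changing the check.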