Inspired by this article
This code creates a JupyterLab setup for a list of users. For each user, you set up:
- A JupyterLab deployment to run the Notebooks.
- A Service to access your JupyterLab environment.
- An Inverting Proxy Agent deployment to get a unique URL per user (and per node).
- Set variables
    # TODO(developer): Replace with your project
    PROJECT_ID="[YOUR_PROJECT_ID]"
    CLUSTER_NAME="kupyterhub"
    ZONE="us-central1-a"
- Install Kubernetes tools
    gcloud components install kubectl --quiet
- Create the managed Kubernetes cluster (GKE)
    gcloud beta container clusters create ${CLUSTER_NAME} \
      --project ${PROJECT_ID} \
      --zone ${ZONE} \
      --release-channel regular \
      --enable-ip-alias \
      --scopes "https://www.googleapis.com/auth/cloud-platform" \
      --num-nodes 1 \
      --machine-type n1-standard-4
- Configure kubectl access
    gcloud container clusters get-credentials ${CLUSTER_NAME} \
      --project ${PROJECT_ID} \
      --zone ${ZONE}
To deploy one or several Notebook servers on GKE, do the following:
- Set your variables
    # Both Docker images can be your own or Google-provided ones.
    DOCKER_IMAGE_AGENT="gcr.io/${PROJECT_ID}/agent:gke"
    DOCKER_IMAGE_JUPYTERLAB="gcr.io/${PROJECT_ID}/ain:gke"
    # List of users. Any email address registered with Google works, including Gmail.
    # Your system must filter email addresses first or run its own Inverting Proxy server.
    DEPLOYMENT_NAMES_LIST="user1@example.com,user2@example.com,user3@gmail.com"
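Because the scripts receive the users as one comma-separated string, it may help to check the list before deploying. The following is a minimal sketch and not part of the repository: `split_users`, `is_valid_email`, and `check_users` are hypothetical helpers, and the regular expression is only a rough syntactic check.

```shell
# Hypothetical helpers, not part of the repository.

# Print one entry per line from a comma-separated list.
split_users() {
  printf '%s\n' "$1" | tr ',' '\n'
}

# Rough syntactic check only; it does not prove the account exists.
is_valid_email() {
  printf '%s\n' "$1" |
    grep -Eq '^[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}$'
}

# Return non-zero if any entry does not look like an email address.
check_users() {
  check_status=0
  while IFS= read -r user; do
    if ! is_valid_email "${user}"; then
      echo "Invalid email: ${user}" >&2
      check_status=1
    fi
  done <<EOF
$(split_users "$1")
EOF
  return "${check_status}"
}
```

Calling `check_users "${DEPLOYMENT_NAMES_LIST}"` before the deploy step would stop early on a malformed entry.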
- Create a Docker image for JupyterLab
  - This step is not required if you are using one of the AI Notebooks standard images.
  - If you decide to build your own image, you can either use a standard image as a base (recommended) or use your own from scratch.
    gcloud builds submit --tag ${DOCKER_IMAGE_JUPYTERLAB} ./docker/jupyterlab
- Update the image reference for JupyterLab
    # You can also edit the file manually.
    # The | delimiter is used because the image path contains slashes.
    sed -i "s|<DOCKER_IMAGE_JUPYTERLAB>|${DOCKER_IMAGE_JUPYTERLAB}|g" "gke/configs/jupyterlab/deployment.yaml"
- Create a Docker image for the Inverting Proxy agent.
  - This step is not required if you use the Agent image that Google provides as a public image on the gcr.io registry (example: gcr.io/inverting-proxy/agent). If you do not know the URL, build the image and host it where relevant:
    gcloud builds submit --tag ${DOCKER_IMAGE_AGENT} ./docker/agent
- Update the Docker image for the [agent](gke/configs/agent/deployment.yaml)
    # You can also edit the file manually.
    # The | delimiter is used because the image path contains slashes.
    sed -i "s|<DOCKER_IMAGE_AGENT>|${DOCKER_IMAGE_AGENT}|g" "gke/configs/agent/deployment.yaml"
- Run the deploy script. It creates temporary GKE YAML files for each ID, then deploys them.
    cd gke
    bash deploy.sh ${PROJECT_ID} ${DEPLOYMENT_NAMES_LIST}
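The contents of `deploy.sh` are not reproduced here. As a rough illustration of the per-user templating this step describes, the sketch below turns an email address into a Kubernetes-safe resource name and fills in a template. The `<DEPLOYMENT_NAME>` placeholder, the `email_to_name` and `render_manifest` functions, and the naming rule are all assumptions made for illustration, not the script's actual behavior.

```shell
# Hypothetical: derive a DNS-label-style name from an email address
# (lowercase letters, digits, and hyphens; at most 63 characters).
email_to_name() {
  printf '%s\n' "$1" \
    | tr '[:upper:]' '[:lower:]' \
    | sed 's/[^a-z0-9]/-/g' \
    | cut -c1-63 \
    | sed -e 's/^-*//' -e 's/-*$//'
}

# Hypothetical: write a template to stdout with the assumed
# <DEPLOYMENT_NAME> placeholder replaced by the given name.
render_manifest() {
  sed "s|<DEPLOYMENT_NAME>|$2|g" "$1"
}

# Possible use inside a loop over users (not executed here):
#   render_manifest configs/jupyterlab/deployment.yaml \
#     "$(email_to_name "${user}")" | kubectl apply -f -
```

Rendering to stdout and piping into `kubectl apply -f -` would avoid leaving temporary files behind, although the source states that the real script writes temporary YAML files.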
- Wait for the deployment to finish
    kubectl get pods
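Rather than polling `kubectl get pods` by hand, one option is to block until the deployments report as available. This is a sketch that assumes the deployments live in the current namespace. The `wait_for_deployments` wrapper and its default timeout are illustrative choices, not part of the repository.

```shell
# Block until every deployment in the current namespace is Available,
# or fail once the timeout expires (default 600s, an arbitrary choice).
wait_for_deployments() {
  kubectl wait deployment --all \
    --for=condition=Available \
    --timeout="${1:-600s}"
}
```

For example, `wait_for_deployments 300s` would return non-zero if any deployment is still unavailable after five minutes.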
- Get the Inverting Proxy URLs
    bash get_urls.sh ${DEPLOYMENT_NAMES_LIST}
- Access a Notebook using the relevant URL. Note that a user logged in to Google can only access the URL that matches their identity.
- Delete the deployments
    bash delete.sh ${DEPLOYMENT_NAMES_LIST}
- Kubernetes Engine: TODO
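The Kubernetes Engine step is still marked TODO. One plausible sketch, assuming the cluster was created with the `CLUSTER_NAME`, `PROJECT_ID`, and `ZONE` variables set earlier, is to delete the cluster with `gcloud`. The `delete_cluster` wrapper is hypothetical. Deleting the cluster removes every workload running on it.

```shell
# Hypothetical wrapper: delete the GKE cluster created earlier.
# Aborts if any of the required variables is unset or empty.
delete_cluster() {
  : "${CLUSTER_NAME:?CLUSTER_NAME is not set}"
  : "${PROJECT_ID:?PROJECT_ID is not set}"
  : "${ZONE:?ZONE is not set}"
  gcloud container clusters delete "${CLUSTER_NAME}" \
    --project "${PROJECT_ID}" \
    --zone "${ZONE}" \
    --quiet
}
```

The `--quiet` flag skips the confirmation prompt, so it is worth double-checking the variables before calling the function.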
- One user can only have one instance per node.
  - The Inverting Proxy URL is built from the user email and the VM ID, so the same pair always yields the same unique URL.
  - It needs a valid Google email address to authenticate the user when they access the URL.
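To see why the same user on the same node always maps to a single URL, consider a deterministic identifier derived from the two inputs. The sketch below is an illustration only and is not the Inverting Proxy's actual algorithm: `stable_id` is a hypothetical function, the hashing scheme is an assumption, and it requires `sha256sum` from GNU coreutils.

```shell
# Illustration only: NOT the Inverting Proxy's real algorithm.
# Derive a stable 16-character identifier from an email and a VM ID.
# Assumes sha256sum (GNU coreutils) is installed.
stable_id() {
  printf '%s|%s' "$1" "$2" | sha256sum | cut -c1-16
}
```

With any scheme of this kind, two Notebook servers for the same user on the same VM produce the same identifier and therefore collide, while the same user on a different VM gets a different one.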
- If users ask for more than one Notebook server each, find out why. Currently:
  - Notebooks persist on GCS and survive the deletion of a Notebook server.
  - With custom images for JupyterLab, users can quickly start predefined environments.
- Running one agent per Notebook server adds many deployments. A single agent could work, but it would need a routing system like JupyterHub.
- Administrators must limit email addresses to their needs because Inverting Proxy URLs also work with Gmail accounts.
- Notebook security is not currently enforced on GCS. You would need to set up ACLs if this is a requirement.
  - A user can access the Notebooks of other users.
  - This makes it easy to collaborate, but you might want to manage it.
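One way to restrict access, sketched here under the assumption that each user's Notebooks live in a dedicated bucket, is to grant object access per user with `gsutil iam ch`. The `grant_bucket_access` function and the one-bucket-per-user layout are hypothetical; the setup described above does not create such buckets.

```shell
# Hypothetical: give one user read/write access to objects in one bucket.
# $1 = user email, $2 = bucket name (without the gs:// prefix).
grant_bucket_access() {
  gsutil iam ch "user:$1:objectAdmin" "gs://$2"
}
```

For instance, `grant_bucket_access user1@example.com user1-notebooks` would bind the Storage Object Admin role for that user on a bucket named `user1-notebooks`, where the bucket name is again only an example.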