
# JupyterHub on Kubernetes for Data 8


This repo contains the Kubernetes config, container images, and docs for Data 8's deployment of JupyterHub on Kubernetes.

## Getting Started

### Google Container Engine

Log into the Google Cloud console at console.cloud.google.com.

Create a cluster. Go to **Container Engine > Container clusters > + Create Cluster**. Fill out the required information, and make sure you know how many instances you will need and what their memory and CPU requirements will be.

Go back to your dashboard in the GCP console and click **Activate Google Cloud Shell** in the upper right-hand corner (the icon looks like a small terminal).

In the new terminal window, clone the jupyterhub-k8s repository:

```bash
git clone https://github.com/data-8/jupyterhub-k8s
```

Set the default compute zone:

```bash
gcloud config set compute/zone <your zone>
```

Get credentials for your cluster:

```bash
gcloud container clusters get-credentials <your cluster>
```
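
For example, with a hypothetical zone and cluster name, those two commands would look like this:

```bash
# Hypothetical values; substitute your own zone and cluster name.
gcloud config set compute/zone us-central1-a
gcloud container clusters get-credentials data8-cluster
```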

Edit the `docker-settings.json` file. Set the Docker repo name corresponding to your cloud provider, and set the image types (you can leave these blank if you are only using the base image). Set the context prefix to whatever you want.

Here is an example:

```json
{
    "clusters": ["dev", "prod"],
    "buildSettings": {
        "dockerRepo": "gcr.io/<your project>",
        "imageTypes": ["datahub", "prob140", "stat28"]
    },
    "gcloud": {
        "project": "<your project>",
        "zone": "<your zone>"
    }
}
```

Run the build script to generate Docker images:

```bash
./build.bash [ hub | proxy | base | user {user_type} ]
```

`hub` builds the JupyterHub image and `proxy` builds the JupyterHub proxy image; you will find their Dockerfiles in the respective subdirectories. The single-user server builds share a base image specified by `user/Dockerfile.base`, and the various single-user server images are built on top of it from `user/Dockerfile.{user_type}`. For example, to build the single-user image for the course Stat 28, run `./build.bash base` at least once, then `./build.bash user stat28`.
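
Putting the pieces together, a full build for the example settings above might look like this (the only ordering constraint is that `base` must be built before any `user` image):

```bash
# Build the hub and proxy images.
./build.bash hub
./build.bash proxy

# Build the shared single-user base image, then a course image on top of it.
./build.bash base
./build.bash user stat28
```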

Each Docker image is tagged with the git commit hash of the last git revision that touched its build files. When the build completes, the script outputs the name of the tagged Docker image along with a suggestion to run `populate.bash`. The populate step preseeds the Docker images onto the cluster nodes.
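
Where that tag comes from can be illustrated with plain git. This is only a sketch under the assumption that the tag is the hash of the most recent commit touching an image's build directory; check `build.bash` for the script's actual logic.

```bash
# Assumption: the tag is the hash of the last commit that touched hub/.
TAG=$(git log -n 1 --pretty=format:%H -- hub/)
echo "hub image tag: ${TAG}"
```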

Edit the `helm-chart/values.yaml` file wherever it says `# Must be overridden`. Set the image names and tags to those of the Docker images you just built with `./build.bash`. You may also adjust some of the other settings in `values.yaml` if necessary.
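
The keys below are purely illustrative; the authoritative structure is whatever `helm-chart/values.yaml` already contains. A minimal sketch of the override might look like:

```yaml
# Hypothetical keys; edit the fields marked "# Must be overridden" in
# helm-chart/values.yaml, using the image names and tags printed by build.bash.
hub:
  image: "gcr.io/<your project>/jupyterhub-k8s-hub"
  tag: "<git commit hash from build.bash>"
proxy:
  image: "gcr.io/<your project>/jupyterhub-k8s-proxy"
  tag: "<git commit hash from build.bash>"
```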

Install Helm.

Initialize Helm, then install the chart:

```bash
helm init

helm --kube-context=<your context prefix><your cluster> install ./helm-chart
```

Later, when you want to change your deployment, run:

```bash
helm list

helm --kube-context=<your context prefix><your cluster> upgrade <release name> ./helm-chart
```
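
As a concrete illustration, with a hypothetical context prefix `dev-`, cluster `data8-cluster`, and a release that `helm list` reports as `quiet-otter`:

```bash
# All names here are hypothetical; substitute your own prefix, cluster, and release.
helm --kube-context=dev-data8-cluster upgrade quiet-otter ./helm-chart
```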

Congratulations! You just deployed your own JupyterHub cluster using Kubernetes! :D

## File / Folder structure

The `manifest.yaml` file in the project root directory contains the entirety of the Kubernetes configuration for this deployment.

The subdirectories contain the Dockerfiles and scripts for the images used for this deployment.

All the images for this deployment are pushed to the data8 Docker Hub organization and are named `data8/jupyterhub-k8s-<name>`, where `<name>` is the name of the folder containing that image.
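
For instance, under this naming scheme the image built from the `hub/` subdirectory would be `data8/jupyterhub-k8s-hub`, which can be pulled directly:

```bash
# Pull the hub image from the data8 Docker Hub organization.
docker pull data8/jupyterhub-k8s-hub
```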

## Development

Current work on this project lives on a ZenHub board for this repo. You must install the ZenHub browser extension to see the board.

After installing the extension, navigate to the issue board or press `b`. You'll see a screen that looks something like this:

*(Screenshot of the ZenHub issue board, showing the Icebox, This week, and In Progress pipelines.)*

- **Icebox** contains future tasks that haven't been prioritized.
- **This week** contains tasks that we plan to finish this week.
- **In Progress** contains tasks that someone is currently working on. All of these tasks have at least one person assigned to them.
- When a task is complete, we close the related issue.

Epics are groups of tasks that correspond to a complete feature. To see only the issues that belong to a specific Epic, toggle the "Filter by this epic" button on the Epic.

### Workflow

  1. As tasks / issues first get created, they land in the Icebox pipeline and are categorized into an Epic if needed.
  2. During our weekly planning meetings we'll move tasks from Icebox to This Week.
  3. When team members start actively working on a task, they'll assign themselves to the task and move it into the In Progress pipeline.
  4. When team members finish a task, they'll make a Pull Request for the task. When the PR gets merged, they'll close the task to take it off the board.

## Cal Blueprint

![bp](https://cloud.githubusercontent.com/assets/2468904/11998649/8a12f970-aa5d-11e5-8dab-7eef0766c793.png "BP Banner")

This project was developed in close collaboration with Cal Blueprint, a student-run UC Berkeley organization devoted to matching the skills of its members to our desire to see social good enacted in our community. Each semester, teams of 4-5 students work closely with a non-profit to bring technological solutions to the problems it faces every day.