Skip to content

Commit

Permalink
Add kaggle notebook Dockerfile
Browse files Browse the repository at this point in the history
  • Loading branch information
Peter MacKinnon committed Jul 12, 2018
1 parent a730ddf commit 9e9e4af
Show file tree
Hide file tree
Showing 2 changed files with 91 additions and 0 deletions.
75 changes: 75 additions & 0 deletions components/contrib/kaggle-notebook-image/Dockerfile
Original file line number Diff line number Diff line change
@@ -0,0 +1,75 @@
# Copyright (c) Jupyter Development Team.
# Distributed under the terms of the Modified BSD License.

# use basic syntax for now
FROM gcr.io/kaggle-images/python:latest

USER root

ENV DEBIAN_FRONTEND noninteractive

ENV NB_USER jovyan
ENV NB_UID 1000
ENV HOME /home/$NB_USER
# We prefer to have a global conda install
# to minimize the amount of content in $HOME
ENV CONDA_DIR=/opt/conda
ENV PATH $CONDA_DIR/bin:$PATH

# Use bash instead of sh
SHELL ["/bin/bash", "-c"]

# add https support
RUN apt-get update && apt-get install -yq --no-install-recommends apt-transport-https

RUN echo "en_US.UTF-8 UTF-8" > /etc/locale.gen && \
locale-gen

ENV LC_ALL en_US.UTF-8
ENV LANG en_US.UTF-8
ENV LANGUAGE en_US.UTF-8

# Create jovyan user with UID=1000 and in the 'users' group
# but allow for non-initial launches of the notebook to have
# $HOME provided by the contents of a PV
RUN useradd -M -s /bin/bash -N -u $NB_UID $NB_USER && \
chown -R ${NB_USER}:users /usr/local/bin && \
mkdir -p $HOME

RUN export CLOUD_SDK_REPO="cloud-sdk-$(lsb_release -c -s)" && \
echo "deb https://packages.cloud.google.com/apt $CLOUD_SDK_REPO main" > /etc/apt/sources.list.d/google-cloud-sdk.list && \
curl https://packages.cloud.google.com/apt/doc/apt-key.gpg | apt-key add - && \
apt-get update && \
apt-get install -y google-cloud-sdk kubectl

# Install Tini - used as entrypoint for container
RUN cd /tmp && \
wget --quiet https://github.com/krallin/tini/releases/download/v0.10.0/tini && \
echo "1361527f39190a7338a0b434bd8c88ff7233ce7b9a4876f3315c22fce7eca1b0 *tini" | sha256sum -c - && \
mv tini /usr/local/bin/tini && \
chmod +x /usr/local/bin/tini

RUN chown -R ${NB_USER}:users $HOME

ENV GITHUB_REF https://raw.githubusercontent.com/kubeflow/kubeflow/master/components/tensorflow-notebook-image

ADD --chown=jovyan:users $GITHUB_REF/jupyter_notebook_config.py /tmp

# Wipe $HOME for PVC detection later
WORKDIR $HOME
RUN rm -fr $(ls -A $HOME)

# Get init scripts from kubeflow
ADD --chown=jovyan:users \
$GITHUB_REF/start-singleuser.sh \
$GITHUB_REF/start-notebook.sh \
$GITHUB_REF/start.sh \
$GITHUB_REF/pvc-check.sh \
/usr/local/bin/

RUN chmod a+rx /usr/local/bin/*

# Configure container startup
EXPOSE 8888
ENTRYPOINT ["tini", "--"]
CMD ["start-notebook.sh"]
16 changes: 16 additions & 0 deletions components/contrib/kaggle-notebook-image/README.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,16 @@
# Kaggle notebook

This Dockerfile builds an image that is derived from the latest [Kaggle python image](gcr.io/kaggle-images/pythoni:latest) but which is compatible for launching from the Kubeflow JupyterHub. [Kaggle](https://www.kaggle.com/) is the home of data science collaboration and competition.

Important notes:
* this notebook is not curated by the Kubeflow project and is not regularly tested
* the versions of TensorFlow, PyTorch, and the other libraries included may change at any time
* this is a very large notebook, over 21 Gb in size. Since our notebook uses the latest Kaggle image, docker pulls (and notebook launches) can take a lengthy period of time.
* the base image size for docker is 10 Gb, which won't be large enough to run this image. Your docker daemon must be configured for at least 30 Gb (`-storage-opt dm.basesize=30G`) or use a storage driver like overlay2.
* the Kaggle image includes TensorFlow 1.9 or greater built with AVX2 support, so the image may not run on older CPU

To build the image run:
```
docker build --pull -t kubeflow-kaggle-notebook:latest .
```
Of course specify whatever repo and image tag you need for your purposes.

0 comments on commit 9e9e4af

Please sign in to comment.