machine-learning-helpers/docker-python-jupyter

Docker images to support Machine Learning (ML) in Python

Overview

This project produces Docker images that provide ready-to-use Artificial Intelligence (AI) / Machine Learning (ML) Python Jupyter environments on a few well-known and stable Linux distributions (e.g., CentOS 9 Stream, CentOS 8 Stream, Debian 12 (Bookworm), Debian 11 (Bullseye), Ubuntu 22.04 LTS (Jammy Jellyfish), Ubuntu 20.04 LTS (Focal Fossa) and Ubuntu 18.04 LTS (Bionic Beaver)).

The Docker images simply add sample Jupyter notebooks and data sets on top of general-purpose C++/Python Docker images, which are produced by a dedicated project on GitHub and are also available on Docker Hub.

The Python virtual environments are managed with pyenv and pipenv, as detailed in the dedicated procedure of the Python induction notebook sub-project. Any additional Python module should be installed in a dedicated virtual environment, controlled by pipenv through local Pipfile (and Pipfile.lock) files, which should be kept under version control. The Docker images therefore do not install those modules globally; only the pyenv and pipenv utilities are provided (and correctly configured).
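
As an illustration of that per-project approach, here is a minimal sketch of a shell helper that installs modules into a pipenv-managed environment. The function name `add_modules`, the `DRY_RUN` variable and the project path are hypothetical and not part of the images; only the `pipenv install` command itself is standard pipenv usage.

```shell
# Hypothetical helper: install Python modules into the pipenv environment
# of a project directory. pipenv records them in Pipfile and Pipfile.lock.
# Arguments: <project-dir> <module>...
# With DRY_RUN=1 the pipenv command is printed instead of executed
# (the project directory is still created).
add_modules() (
  project_dir="$1"
  shift
  mkdir -p "$project_dir" && cd "$project_dir" || exit 1
  if [ -n "${DRY_RUN:-}" ]; then
    printf 'pipenv install %s\n' "$*"
  else
    pipenv install "$@"
  fi
)

# Example (hypothetical project path):
#   add_modules ~/dev/ml/my-project pandas scikit-learn
```

After a real run, the resulting Pipfile and Pipfile.lock are the two files to commit alongside the notebooks.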

These Docker images are intended to run any collection of Jupyter notebooks, using any collection of data sets, that you may have locally. The Docker images provide the engine (Jupyter Lab), and you provide the gas (Jupyter notebooks and data sets). To extend the analogy, some sample gas is provided for convenience:

  • Sample Jupyter notebooks, available in the /notebook top directory of the Docker images (when not overshadowed by your own Jupyter notebook volume)
  • Sample data sets, available in the /data top directory of the Docker images (when not overshadowed by your own data set volume)

Another GitHub repository features lightweight Python Docker images, aimed at deploying Data Science applications in operational environments such as cloud-based Kubernetes clusters or services (e.g., AWS EKS, Azure AKS, IBM/Red Hat OpenShift v4 or Google GKE). Those images are available in their own Docker Hub repository.

Simple use

  • Download the Docker image for your preferred Linux distribution (where <linux-distrib> is one of centos9, centos8, debian12, debian11, ubuntu2204, ubuntu2004 or ubuntu1804):
$ docker pull infrahelpers/python-jupyter:<linux-distrib>
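
A mistyped tag only fails once Docker has contacted the registry, so it can help to validate the tag locally first. The following is a minimal sketch, assuming the seven tags listed above are the only valid ones; the names `SUPPORTED_DISTRIBS`, `image_ref`, `pull_image` and `DRY_RUN` are hypothetical.

```shell
# Tags taken from the list above (assumed to be exhaustive).
SUPPORTED_DISTRIBS="centos9 centos8 debian12 debian11 ubuntu2204 ubuntu2004 ubuntu1804"

# Print the full image reference for a distribution tag;
# fail with an error message if the tag is not in the list.
image_ref() (
  distrib="$1"
  for known in $SUPPORTED_DISTRIBS; do
    if [ "$distrib" = "$known" ]; then
      printf 'infrahelpers/python-jupyter:%s\n' "$distrib"
      exit 0
    fi
  done
  printf 'Unknown distribution: %s\n' "$distrib" >&2
  exit 1
)

# Pull the image for a distribution tag.
# With DRY_RUN=1 the docker command is printed instead of executed.
pull_image() (
  ref="$(image_ref "$1")" || exit 1
  if [ -n "${DRY_RUN:-}" ]; then
    printf 'docker pull %s\n' "$ref"
  else
    docker pull "$ref"
  fi
)

# Example:
#   pull_image debian12
```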

With the Jupyter notebook and data set samples provided by the Docker images

  • Launch Jupyter Lab within the Docker image (where <port> is the local port on which Jupyter Lab is exposed; inside the container, Jupyter Lab listens on port 8888, which is also the usual choice for <port>):
$ docker run -d -p <port>:8888 infrahelpers/python-jupyter:<linux-distrib>

With your own Jupyter notebooks and data sets

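  • Launch Jupyter Lab within the Docker image, mounting your own notebook and data set directories (here, notebook/induction and data/induction under the current directory) as /notebook and /data (where <port> is the local port on which Jupyter Lab is exposed; inside the container, Jupyter Lab listens on port 8888, which is also the usual choice for <port>):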
$ docker run -d -p <port>:8888 -v ${PWD}/notebook/induction:/notebook -v ${PWD}/data/induction:/data infrahelpers/python-jupyter:<linux-distrib>
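
When a host path given to `-v` does not exist, Docker creates it as an empty directory, which then silently hides the sample notebooks or data sets. The sketch below checks both directories before launching; the function name `launch_jupyter` and the `DRY_RUN` variable are hypothetical, while the `docker run` options are the ones used above.

```shell
# Hypothetical helper: launch Jupyter Lab with local directories mounted
# on /notebook and /data, after checking that both directories exist.
# Arguments: <linux-distrib> <port> <notebook-dir> <data-dir>
# With DRY_RUN=1 the docker command is printed instead of executed.
launch_jupyter() (
  distrib="$1"
  port="${2:-8888}"
  notebooks="$3"
  data="$4"
  for dir in "$notebooks" "$data"; do
    if [ ! -d "$dir" ]; then
      printf 'Missing directory: %s\n' "$dir" >&2
      exit 1
    fi
  done
  # Convert to absolute paths, as expected by the -v option
  notebooks="$(cd "$notebooks" && pwd)"
  data="$(cd "$data" && pwd)"
  set -- docker run -d -p "${port}:8888" \
    -v "${notebooks}:/notebook" -v "${data}:/data" \
    "infrahelpers/python-jupyter:${distrib}"
  if [ -n "${DRY_RUN:-}" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
)

# Example:
#   launch_jupyter debian12 9000 notebook/induction data/induction
```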

Interact with Jupyter Lab in a Web browser

Jupyter Lab (run from the Docker image) is now available in a Web browser at http://localhost:<port> (that is, http://localhost:8888 when <port> is 8888). The local port may be changed at your convenience through the -p option of docker run.

Build your own Docker image

  • Clone the Git repository:
$ mkdir -p ~/dev/ml && cd ~/dev/ml
$ git clone https://github.com/machine-learning-helpers/docker-python-jupyter.git
$ cd docker-python-jupyter
  • Build the Docker image:
$ docker build -t infrahelpers/python-jupyter:<linux-distrib> <linux-distrib>/
$ docker images
REPOSITORY                  TAG             IMAGE ID     CREATED            SIZE
infrahelpers/python-jupyter <linux-distrib> 33a1ad533140 About a minute ago 2.29GB
  • (Optional) Push the newly built image to Docker Hub. That step is usually not needed, as the images are automatically built every time there is a change on GitHub:
$ docker login
$ docker push infrahelpers/python-jupyter:<linux-distrib>
  • Shut down the Docker container:
$ docker ps
CONTAINER ID IMAGE                     COMMAND                CREATED        STATUS        PORTS                  NAMES
7b69efc9dc9a ai/python-jupyter:centos9 "/bin/sh -c 'pipenv …" 48 seconds ago Up 47 seconds 0.0.0.0:9000->8888/tcp vigilant_merkle
$ docker kill vigilant_merkle
vigilant_merkle
$ docker ps
CONTAINER ID IMAGE                     COMMAND                CREATED        STATUS        PORTS                  NAMES
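
`docker kill` sends SIGKILL immediately, whereas `docker stop` first sends SIGTERM and only kills the container after a grace period, which gives Jupyter Lab a chance to exit cleanly. Because the containers above are started without `--rm`, a stopped container also remains listed by `docker ps -a` until it is removed. A minimal sketch of a gentler shutdown follows; the function name `stop_container` and the `DRY_RUN` variable are hypothetical.

```shell
# Hypothetical helper: stop a container gracefully, then remove it.
# Arguments: <container-name-or-id> [grace-period-seconds, default 10]
# With DRY_RUN=1 the docker commands are printed instead of executed.
stop_container() (
  name="$1"
  grace="${2:-10}"
  if [ -z "$name" ]; then
    printf 'Usage: stop_container <container> [seconds]\n' >&2
    exit 2
  fi
  if [ -n "${DRY_RUN:-}" ]; then
    printf 'docker stop -t %s %s\n' "$grace" "$name"
    printf 'docker rm %s\n' "$name"
  else
    # Remove the container only if it was stopped successfully
    docker stop -t "$grace" "$name" && docker rm "$name"
  fi
)

# Example, with the container name reported by docker ps:
#   stop_container vigilant_merkle 30
```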
