Add cuda12 variant of tensorflow-notebook (#2100)
* Add cuda12 variant for tensorflow-notebook

* Reduce size of CPU version of tensorflow-notebook

* Try to fix tests

* Update docs/using/selecting.md

Co-authored-by: Ayaz Salikhov <mathbunnyru@users.noreply.github.com>

* Update images/tensorflow-notebook/cuda12/Dockerfile

Co-authored-by: Ayaz Salikhov <mathbunnyru@users.noreply.github.com>

* Update tests/docker-stacks-foundation/test_packages.py

Co-authored-by: Ayaz Salikhov <mathbunnyru@users.noreply.github.com>

* Remove obsolete XLA_FLAGS env var

* Install CUDA and cuDNN using pip instead of mamba

* Fix pre-commit shell checks

* Change tensorflow variant name from cuda12 to cuda

* Update selecting.md

* Update selecting.md

---------

Co-authored-by: Ayaz Salikhov <mathbunnyru@users.noreply.github.com>
ChristofKaufmann and mathbunnyru committed Mar 26, 2024
1 parent dd06b93 commit b9553a8
Showing 6 changed files with 61 additions and 7 deletions.
13 changes: 13 additions & 0 deletions .github/workflows/docker.yml
@@ -196,6 +196,17 @@ jobs:
     needs: [x86_64-scipy]
     if: ${{ !contains(github.event.pull_request.title, '[FAST_BUILD]') }}

+  x86_64-tensorflow-cuda:
+    uses: ./.github/workflows/docker-build-test-upload.yml
+    with:
+      parent-image: scipy-notebook
+      image: tensorflow-notebook
+      variant: cuda
+      platform: x86_64
+      runs-on: ubuntu-latest
+    needs: [x86_64-scipy]
+    if: ${{ !contains(github.event.pull_request.title, '[FAST_BUILD]') }}
+
   aarch64-pytorch:
     uses: ./.github/workflows/docker-build-test-upload.yml
     with:
@@ -378,6 +389,7 @@ jobs:
             { image: r-notebook, variant: default },
             { image: julia-notebook, variant: default },
             { image: tensorflow-notebook, variant: default },
+            { image: tensorflow-notebook, variant: cuda },
             { image: pytorch-notebook, variant: default },
             { image: pytorch-notebook, variant: cuda11 },
             { image: pytorch-notebook, variant: cuda12 },
@@ -439,6 +451,7 @@ jobs:
             { image: r-notebook, variant: default },
             { image: julia-notebook, variant: default },
             { image: tensorflow-notebook, variant: default },
+            { image: tensorflow-notebook, variant: cuda },
             { image: pytorch-notebook, variant: default },
             { image: pytorch-notebook, variant: cuda11 },
             { image: pytorch-notebook, variant: cuda12 },
9 changes: 5 additions & 4 deletions docs/using/selecting.md
@@ -18,11 +18,12 @@ The following sections describe these images, including their contents, relation

 ## CUDA enabled variant

-We provide CUDA accelerated version of `pytorch-notebook` image.
-Prepend a CUDA version prefix (like `cuda12-`) to the image tag to allow PyTorch operations to use compatible NVIDIA GPUs for accelerated computation.
-We only build images for 2 last major versions of CUDA.
+We provide CUDA accelerated version of `pytorch-notebook` and `tensorflow-notebook` images.
+Prepend a CUDA version prefix (like `cuda12-` for `pytorch-notebook` or `cuda-` for `tensorflow-notebook`) to the image tag
+to allow PyTorch or TensorFlow operations to use compatible NVIDIA GPUs for accelerated computation.
+Note: We only build `pytorch-notebook` for 2 last major versions of CUDA, `tensorflow-notebook` image only supports the latest CUDA version listed in the [officially tested build configurations](https://www.tensorflow.org/install/source#gpu).

-For example, you can use an image `quay.io/jupyter/pytorch-notebook:cuda12-python-3.11.8`
+For example, you can use an image `quay.io/jupyter/pytorch-notebook:cuda12-python-3.11.8` or `quay.io/jupyter/tensorflow-notebook:cuda-latest`

 ### jupyter/docker-stacks-foundation
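
Note on the tag scheme documented in this hunk: the sketch below shows how the CUDA prefix combines with the rest of the tag. The helper `image_reference` and its parameters are hypothetical and not part of this repository; only the two example references come from the documentation text above.

```python
def image_reference(
    image: str,
    tag: str = "latest",
    cuda_prefix: str = "",
    registry: str = "quay.io",
    owner: str = "jupyter",
) -> str:
    """Build a full image reference (hypothetical helper, for illustration only).

    When cuda_prefix is non-empty, it is prepended to the tag with a dash,
    as described in docs/using/selecting.md.
    """
    full_tag = f"{cuda_prefix}-{tag}" if cuda_prefix else tag
    return f"{registry}/{owner}/{image}:{full_tag}"


# The two examples from the documentation text
print(image_reference("pytorch-notebook", "python-3.11.8", cuda_prefix="cuda12"))
print(image_reference("tensorflow-notebook", cuda_prefix="cuda"))
```

The prefix differs by image: `pytorch-notebook` carries the CUDA major version (`cuda11`, `cuda12`), while `tensorflow-notebook` uses a plain `cuda` prefix.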

5 changes: 3 additions & 2 deletions images/tensorflow-notebook/Dockerfile
@@ -11,7 +11,8 @@ LABEL maintainer="Jupyter Project <jupyter@googlegroups.com>"
 # Fix: https://github.com/koalaman/shellcheck/wiki/SC3014
 SHELL ["/bin/bash", "-o", "pipefail", "-c"]

-# Install Tensorflow with pip
-RUN pip install --no-cache-dir tensorflow && \
+# Install tensorflow with pip, on x86_64 tensorflow-cpu
+RUN [[ $(uname -m) = x86_64 ]] && TF_POSTFIX="-cpu" || TF_POSTFIX="" && \
+    pip install --no-cache-dir "tensorflow${TF_POSTFIX}" && \
     fix-permissions "${CONDA_DIR}" && \
     fix-permissions "/home/${NB_USER}"
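
Note on the package selection in this hunk: the `RUN` line picks `tensorflow-cpu` on x86_64 and plain `tensorflow` on every other architecture. A minimal Python sketch of the same decision follows; the function `tensorflow_package` is hypothetical and only mirrors the shell logic. It assumes `platform.machine()` reports the same string as `uname -m`, which is the case on Linux.

```python
import platform
from typing import Optional


def tensorflow_package(machine: Optional[str] = None) -> str:
    """Return the pip package name the Dockerfile would install.

    machine defaults to the current architecture; pass a value explicitly
    to see what would be chosen on another platform.
    """
    if machine is None:
        machine = platform.machine()
    # Same rule as the shell line: the "-cpu" postfix only on x86_64
    postfix = "-cpu" if machine == "x86_64" else ""
    return f"tensorflow{postfix}"


print(tensorflow_package("x86_64"))
print(tensorflow_package("aarch64"))
```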
27 changes: 27 additions & 0 deletions images/tensorflow-notebook/cuda/Dockerfile
@@ -0,0 +1,27 @@
+# Copyright (c) Jupyter Development Team.
+# Distributed under the terms of the Modified BSD License.
+ARG REGISTRY=quay.io
+ARG OWNER=jupyter
+ARG BASE_CONTAINER=$REGISTRY/$OWNER/scipy-notebook
+FROM $BASE_CONTAINER
+
+LABEL maintainer="Jupyter Project <jupyter@googlegroups.com>"
+
+# Fix: https://github.com/hadolint/hadolint/wiki/DL4006
+# Fix: https://github.com/koalaman/shellcheck/wiki/SC3014
+SHELL ["/bin/bash", "-o", "pipefail", "-c"]
+
+# Install TensorFlow, CUDA and cuDNN with pip
+RUN pip install --no-cache-dir "tensorflow[and-cuda]" && \
+    fix-permissions "${CONDA_DIR}" && \
+    fix-permissions "/home/${NB_USER}"
+
+# workaround for https://github.com/tensorflow/tensorflow/issues/63362
+RUN mkdir -p "${CONDA_DIR}/etc/conda/activate.d/" && \
+    fix-permissions "${CONDA_DIR}"
+
+COPY --chown="${NB_UID}:${NB_GID}" nvidia-lib-dirs.sh "${CONDA_DIR}/etc/conda/activate.d/"
+
+# https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/latest/docker-specialized.html#dockerfiles
+ENV NVIDIA_VISIBLE_DEVICES="all" \
+    NVIDIA_DRIVER_CAPABILITIES="compute,utility"
9 changes: 9 additions & 0 deletions images/tensorflow-notebook/cuda/nvidia-lib-dirs.sh
@@ -0,0 +1,9 @@
+#!/bin/bash
+# Copyright (c) Jupyter Development Team.
+# Distributed under the terms of the Modified BSD License.
+
+# This adds the NVIDIA libraries to the LD_LIBRARY_PATH. Workaround for
+# https://github.com/tensorflow/tensorflow/issues/63362
+NVIDIA_DIR=$(dirname "$(python -c 'import nvidia;print(nvidia.__file__)')")
+LD_LIBRARY_PATH=$(echo "${NVIDIA_DIR}"/*/lib/ | sed -r 's/\s+/:/g')${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}
+export LD_LIBRARY_PATH
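
Note on the path assembly in this script: it collects every `lib` directory one level below the `nvidia` package directory, joins them with colons, and appends the previous `LD_LIBRARY_PATH` when one is set. The Python sketch below illustrates that logic under stated assumptions; `nvidia_library_path` is a hypothetical function and takes the package directory as a parameter instead of importing `nvidia`. Unlike the shell version, it does not emit trailing slashes and yields an empty result when nothing matches.

```python
import glob
import os


def nvidia_library_path(nvidia_dir: str, existing: str = "") -> str:
    """Join <nvidia_dir>/*/lib directories into an LD_LIBRARY_PATH value.

    existing is the previous LD_LIBRARY_PATH; it is appended only when
    non-empty, matching the ${LD_LIBRARY_PATH:+:...} expansion.
    """
    pattern = os.path.join(nvidia_dir, "*", "lib")
    # Keep directories only; sort for the same stable order a shell glob gives
    lib_dirs = sorted(p for p in glob.glob(pattern) if os.path.isdir(p))
    parts = lib_dirs + ([existing] if existing else [])
    return ":".join(parts)
```

A caller could pass `os.environ.get("LD_LIBRARY_PATH", "")` as `existing` to reproduce what the activation script does at environment activation time.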
5 changes: 4 additions & 1 deletion tagging/taggers.py
@@ -98,7 +98,10 @@ def tag_value(container: Container) -> str:
 class TensorflowVersionTagger(TaggerInterface):
     @staticmethod
     def tag_value(container: Container) -> str:
-        return "tensorflow-" + _get_pip_package_version(container, "tensorflow")
+        try:
+            return "tensorflow-" + _get_pip_package_version(container, "tensorflow")
+        except AssertionError:
+            return "tensorflow-" + _get_pip_package_version(container, "tensorflow-cpu")


 class PytorchVersionTagger(TaggerInterface):
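
Note on the fallback in this hunk: because the CPU image now installs `tensorflow-cpu` on x86_64, the tagger first looks up `tensorflow` and falls back to `tensorflow-cpu`. The sketch below reproduces the pattern in isolation. Here `get_pip_package_version` is a hypothetical stand-in that reads from a dictionary; it assumes the repository's `_get_pip_package_version` raises `AssertionError` when a package is missing, which is what the `except` clause in the diff implies.

```python
from typing import Dict


def get_pip_package_version(installed: Dict[str, str], package: str) -> str:
    """Hypothetical stand-in for the repository helper.

    Raises AssertionError when the package is not installed (an assumption
    inferred from the except clause in the diff).
    """
    assert package in installed, f"{package} is not installed"
    return installed[package]


def tensorflow_tag(installed: Dict[str, str]) -> str:
    """Return the version tag, trying tensorflow first, then tensorflow-cpu."""
    try:
        return "tensorflow-" + get_pip_package_version(installed, "tensorflow")
    except AssertionError:
        return "tensorflow-" + get_pip_package_version(installed, "tensorflow-cpu")


# The version numbers here are made-up sample values
print(tensorflow_tag({"tensorflow": "2.16.1"}))
print(tensorflow_tag({"tensorflow-cpu": "2.16.1"}))
```

Both variants produce a tag with the same `tensorflow-` prefix, so the published tag does not reveal which pip package was installed.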