# RAPIDS AI notebook
This Dockerfile builds an image derived from the current RAPIDS image but compatible with launching from the Kubeflow JupyterHub. RAPIDS uses NVIDIA CUDA for high-performance GPU execution, exposing GPU parallelism and high memory bandwidth through user-friendly Python interfaces. RAPIDS provides several Python APIs, including cuDF, a GPU DataFrame library with a pandas-like API, and cuML, a GPU-accelerated library of machine learning algorithms.

Prerequisites:
- Pascal or better GPU (e.g., Tesla P100), ideally with 32 GB of GPU memory
- CUDA 9.2 or higher
- NVIDIA driver 396.44 or higher
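As a quick sanity check, you can compare the driver version reported by `nvidia-smi --query-gpu=driver_version --format=csv,noheader` against the minimum above. The helper below is a minimal sketch; the function name is ours, not part of RAPIDS:

```python
# Compare dotted NVIDIA driver versions numerically, not lexically
# (so "410.79" correctly satisfies a "396.44" minimum).
def meets_minimum(driver_version, minimum="396.44"):
    parse = lambda v: tuple(int(p) for p in v.split("."))
    return parse(driver_version) >= parse(minimum)

print(meets_minimum("410.79"))  # True
print(meets_minimum("390.30"))  # False
```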
The image includes sample notebooks for cuML and cuDF within the sub-directories under
To build the image, run:
```
docker build --pull -t kubeflow-rapidsai-notebook:latest .
```
Specify whatever repo and image tag you need for your purposes.
## cuDF E2E notebook
The demonstration of the cuDF API in `E2E.ipynb` performs intensive ETL of Fannie Mae mortgage data that can be downloaded into your notebook. The defaults in the notebook use 8 dask workers (one per GPU, for 8 GPUs) and assume mortgage data for 16 years partitioned across 16 files. This configuration may exceed the GPU count and GPU RAM you actually have available. Simply change the defaults in notebook cells 2 and 4 to match the number of workers to your available GPUs and to use a reduced span of years and partitions from the mortgage dataset.
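One way to scale those defaults down proportionally is sketched below. This is an assumption-laden helper of our own, not code from the notebook; the names `full_years`, `full_parts`, and `full_workers` simply mirror the notebook's described defaults:

```python
# Hypothetical sketch: shrink the E2E notebook defaults (cells 2 and 4)
# to fit the GPUs actually available.
def scale_defaults(gpu_count, full_years=16, full_parts=16, full_workers=8):
    workers = min(full_workers, gpu_count)   # one dask worker per GPU
    scale = workers / full_workers
    years = max(1, int(full_years * scale))  # span of mortgage-data years
    parts = max(1, int(full_parts * scale))  # number of dataset partitions
    return workers, years, parts

print(scale_defaults(2))  # (2, 4, 4)
```

With 2 GPUs, for example, you would run 2 workers over 4 years of data split across 4 partitions.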
## GKE deployment notes
It is possible to run this image with the latest NVIDIA drivers, even if they are not yet installed in GKE. Log in to your Kubeflow GKE cluster and apply this daemonset to install the desired NVIDIA drivers. You must add the following environment variables to the init container in the daemonset YAML:
```yaml
- name: NVIDIA_DRIVER_VERSION
  value: "410.79"
- name: IGNORE_MISSING_MODULE_SYMVERS
  value: "1"
```
Once you have applied the new daemonset, you must recycle the GPU nodes in your cluster. Note that this is a temporary workaround until newer NVIDIA drivers are added to GKE; it may no longer be applicable at some point in the future.
When deploying this Kubeflow image to GKE, you will need to ensure that `/etc/ld.so.cache` is updated after launch in order to use the cuDF Python API. Simply open a terminal in JupyterHub and execute `ldconfig`. Permissions have been modified in this image to allow the `ld.so.cache` update by the `jovyan` user. cuDF provides shared libs that are currently unable to dynamically resolve their CUDA dependencies from the