In this tutorial, researchers and practitioners will learn to use Kubernetes to scale and parallelize AI/ML and advanced analytics workloads, so that they can run Big Data workloads on clusters of any size, from local clusters to commercial clouds to publicly funded research platforms. The tutorial covers building reproducible, portable software containers with Docker and deploying them to Kubernetes clusters. It walks through each step from local development to cluster deployment, including building custom containers and pushing them to image registries, writing Kubernetes pod and job YAML files, and deploying pods and jobs to the cluster.
If you are participating in the tutorial, please make sure you have an active GitHub or ORCID account, which you will need to sign in to the computing environment for the tutorial's hands-on activities.
If you are participating in GP-ENGINE training, please complete this survey to help us report on our project outreach efforts.
Survey Link (3 minutes to complete): https://forms.gle/hmtJQpiM7UAVbZNW8
The notebooks folder contains all the hands-on notebooks for the IEEE Big Data 2023 tutorial.
Supporting files and extra material are provided in the docker, scripts, and yaml folders.
These are intended as starter material for self-study and research tasks once the tutorial training is complete.
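For readers who want a feel for the Kubernetes job files mentioned in the overview before opening the notebooks, here is a minimal sketch in Python. It is not taken from this repository's yaml folder. The helper `make_job_manifest`, the job name, the image reference, and the command are all hypothetical placeholders chosen for illustration; only the manifest field names (`apiVersion`, `kind`, `parallelism`, `completions`, `backoffLimit`, `restartPolicy`) come from the Kubernetes `batch/v1` Job API. The manifest is emitted as JSON so the example needs only the standard library.

```python
import json


def make_job_manifest(name, image, command, parallelism=1, completions=1):
    """Return a Kubernetes batch/v1 Job manifest as a plain dict.

    Hypothetical helper for illustration; not part of the tutorial material.
    """
    return {
        "apiVersion": "batch/v1",
        "kind": "Job",
        "metadata": {"name": name},
        "spec": {
            # Number of pods allowed to run at the same time.
            "parallelism": parallelism,
            # Number of successful pod completions required for the Job.
            "completions": completions,
            # Retries before the Job is marked as failed.
            "backoffLimit": 2,
            "template": {
                "spec": {
                    # Jobs require a restartPolicy of Never or OnFailure.
                    "restartPolicy": "Never",
                    "containers": [
                        {"name": name, "image": image, "command": list(command)}
                    ],
                }
            },
        },
    }


# Placeholder values: substitute your own image reference and command.
manifest = make_job_manifest(
    name="example-train-job",
    image="registry.example.org/user/example-image:latest",
    command=["python", "train.py"],
    parallelism=4,
    completions=4,
)
manifest_json = json.dumps(manifest, indent=2)
print(manifest_json)
```

Saving the output to a file such as `job.json` and running `kubectl apply -f job.json` would submit it to whichever cluster the current kubectl context points at, since kubectl accepts JSON manifests as well as YAML. Whether the job actually runs depends on the image and namespace being valid for that cluster.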