<a href="https://www.nvidia.com/dli"> <img src="images/DLI_Header.png" alt="Header" style="width: 400px;"/> </a>

# 9.0 Enabling GPU within a Kubernetes (K8s) Cluster
## (part of Lab 3)

<img src="images/k8s/kubernetes_stack_0.png" style="float: right;">
In this notebook, you'll learn how to prepare a Kubernetes cluster for GPU acceleration, a prerequisite for full production deployment of conversational AI applications.<br><br>

**[9.1 Launch a K8s Cluster](#9.1-Launch-a-K8s-Cluster)<br>**
**[9.2 Deploy a CUDA Test Application](#9.2-Deploy-a-CUDA-Test-Application)<br>**
**[9.3 Add GPU Awareness to K8s](#9.3-Add-GPU-Awareness-to-K8s)<br>**
**[9.4 Interact with GPU Resources in K8s](#9.4-Interact-with-GPU-Resources-in-K8s)<br>**
&nbsp;&nbsp;&nbsp;&nbsp;[9.4.1 Exercise: Configure Pod](#9.4.1-Exercise:-Configure-Pod)<br>
&nbsp;&nbsp;&nbsp;&nbsp;[9.4.2 Final Checks and Shutdown](#9.4.2-Final-Checks-and-Shutdown)<br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;[9.4.2.1 Exercise: Delete a Pod](#9.4.2.1-Exercise:-Delete-a-Pod)<br>

In the previous parts of the class, you deployed NVIDIA Riva using basic shell commands. As convenient as this method is during development, it becomes impractical when deploying to production, that is, when managing larger numbers of servers and services. 

[Kubernetes](https://kubernetes.io/), also known as K8s, is an open-source system for automating deployment, scaling, and management of containerized applications. 
In this part of the class, we will launch a K8s cluster, enable it for GPU acceleration, and interact with its GPU resources. This is our first step toward monitoring, managing, and deploying conversational AI applications in production. Monitoring and deployment will be covered in later notebooks.

### Notebook Dependencies
The steps in this notebook assume that you are starting with a clean environment. To ensure this, stop any previous Kubernetes installation and all Docker containers, then check the environment's state.

In [1]:
# Check running docker containers. This should be empty.
!docker ps

CONTAINER ID   IMAGE     COMMAND   CREATED   STATUS    PORTS     NAMES


In [2]:
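# If not empty, clear Docker containers to start fresh.
# Note: when no containers are running, `docker kill` receives no arguments
# and prints a usage error, which is harmless here.
!docker kill $(docker ps -q)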

# Check for clean environment - this should be empty
!docker ps

"docker kill" requires at least 1 argument.
See 'docker kill --help'.

Usage:  docker kill [OPTIONS] CONTAINER [CONTAINER...]

Kill one or more running containers
CONTAINER ID   IMAGE     COMMAND   CREATED   STATUS    PORTS     NAMES


In [3]:
# Deletes local Kubernetes cluster if it exists
!minikube delete

🙄  "minikube" profile does not exist, trying anyways.
💀  Removed all traces of the "minikube" cluster.


--- 
# 9.1 Launch a K8s Cluster

A [Kubernetes cluster](https://kubernetes.io/docs/concepts/overview/components/) consists of a set of worker machines (physical or virtual), called nodes, that run containerized applications. Every cluster has at least one worker node, though it can also support thousands of nodes! For this class, we will use [Minikube](https://minikube.sigs.k8s.io/docs/), which allows us to deploy a local and self-contained Kubernetes cluster with a single node. 

Review the class hardware resources available and launch the K8s cluster.

We can see details and status of the available GPU using the `nvidia-smi` command.

<img src="images/k8s/nvidia_smi.png">

In [4]:
# What GPU are we using and how much memory does it have?
!nvidia-smi

Wed Apr 27 09:52:42 2022       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 460.32.03    Driver Version: 460.32.03    CUDA Version: 11.2     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|   0  Tesla T4            Off  | 00000000:00:1E.0 Off |                    0 |
| N/A   28C    P8     9W /  70W |      0MiB / 15109MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Proces

In [5]:
# What type of CPU processor(s) are we using?
!cat /proc/cpuinfo | grep "model name"

model name	: Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz
model name	: Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz
model name	: Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz
model name	: Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz


In [6]:
# How many processors are available?
!nproc

4


In [7]:
# Launch the K8s cluster using Minikube
!minikube start --driver=none

😄  minikube v1.19.0 on Ubuntu 20.04 (docker/amd64)
✨  Using the none driver based on user configuration
👍  Starting control plane node minikube in cluster minikube
🤹  Running on localhost (CPUs=4, Memory=15717MB, Disk=297738MB) ...
ℹ️  OS release is Ubuntu 20.04.1 LTS
    > kubeadm.sha256: 64 B / 64 B [--------------------------] 100.00% ? p/s 0s
    > kubectl.sha256: 64 B / 64 B [--------------------------] 100.00% ? p/s 0s
    > kubelet.sha256: 64 B / 64 B [--------------------------] 100.00% ? p/s 0s
    > kubeadm: 37.40 MiB / 37.40 MiB [-------------] 100.00% 1.87 GiB p/s 219ms
    > kubectl: 38.37 MiB / 38.37 MiB [------------] 100.00% 12.91 GiB p/s 206ms
    > kubelet: 108.73 MiB / 108.73 MiB [---------] 100.00% 320.44 MiB p/s 539ms

Once the cluster is successfully launched, we expect to see a number of containers running.  Check this by executing `docker ps` again.

In [8]:
# List the Kubernetes components deployed
!docker ps

CONTAINER ID   IMAGE                                     COMMAND                  CREATED          STATUS          PORTS     NAMES
d8ce88e65dd1   gcr.io/k8s-minikube/storage-provisioner   "/storage-provisioner"   4 seconds ago    Up 3 seconds              k8s_storage-provisioner_storage-provisioner_kube-system_d60a76c3-475a-486f-ab1a-ef78d7f28437_0
15e8c210c8e5   k8s.gcr.io/pause:3.2                      "/pause"                 5 seconds ago    Up 5 seconds              k8s_POD_storage-provisioner_kube-system_d60a76c3-475a-486f-ab1a-ef78d7f28437_0
a78bfacd9623   bfe3a36ebd25                              "/coredns -conf /etc…"   8 seconds ago    Up 7 seconds              k8s_coredns_coredns-74ff55c5b-zv7v2_kube-system_830c4107-4186-4d70-a93b-d3b70c6b84cb_0
068edd4a0048   43154ddb57a8                              "/usr/local/bin/kube…"   8 seconds ago    Up 7 seconds              k8s_kube-proxy_kube-proxy-n6m24_kube-system_6217ce95-7758-4d6f-abf8-d4218355eb21_0
601c57a3f3ff   k8s.gcr.io

We should now have access to the [kubectl command line tool](https://kubernetes.io/docs/reference/kubectl/overview/), which is used to interact with the cluster. List the nodes and services in the cluster using the `kubectl get` command:

In [9]:
# List nodes in the cluster
!kubectl get nodes

NAME           STATUS   ROLES                  AGE   VERSION
442be9fcde81   Ready    control-plane,master   87s   v1.20.2


In [10]:
# List all services deployed
!kubectl get services

NAME         TYPE        CLUSTER-IP   EXTERNAL-IP   PORT(S)   AGE
kubernetes   ClusterIP   10.96.0.1    <none>        443/TCP   100s
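
If you want to use `kubectl get` output from Python rather than read it by eye, the table can be split into records. The following is a minimal sketch, not part of the class material: the helper name `parse_kubectl_table` is hypothetical, and it assumes that columns are separated by at least two spaces and that no cell is empty, which holds for the node output shown above.

```python
import re

# Output copied from the `kubectl get nodes` cell above.
SAMPLE_NODES = """\
NAME           STATUS   ROLES                  AGE   VERSION
442be9fcde81   Ready    control-plane,master   87s   v1.20.2
"""


def parse_kubectl_table(text):
    """Turn whitespace-aligned `kubectl get` output into a list of dicts.

    Assumes columns are separated by two or more spaces and no cell is empty.
    """
    lines = [line for line in text.splitlines() if line.strip()]
    if not lines:
        return []
    header = re.split(r"\s{2,}", lines[0].strip())
    return [dict(zip(header, re.split(r"\s{2,}", line.strip()))) for line in lines[1:]]


nodes = parse_kubectl_table(SAMPLE_NODES)
not_ready = [node["NAME"] for node in nodes if node["STATUS"] != "Ready"]
print(f"{len(nodes)} node(s), not ready: {not_ready}")
```

For anything beyond a quick check, `kubectl get ... -o json` is usually a more robust input than the human-readable table.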


--- 
# 9.2 Deploy a CUDA Test Application

Next, we will deploy a simple GPU-accelerated application. This is a toy application which randomly generates two very large vectors and adds them. Print out the YAML configuration file needed to deploy the application:

In [11]:
# Set the configuration directory
CONFIG_DIR='/dli/task/kubernetes-config'

In [12]:
# Review the application we will deploy
!cat $CONFIG_DIR/gpu-pod.yaml

apiVersion: v1
kind: Pod
metadata:
  name: gpu-operator-test
spec:
  restartPolicy: OnFailure
  containers:
  - name: cuda-vector-add
    image: "nvidia/samples:vectoradd-cuda10.2"
    resources:
      limits:
         nvidia.com/gpu: 1

The main difference between a YAML file for a GPU-accelerated application and one for a non-GPU-accelerated application is the GPU resource request. In our case, we have created a basic configuration requesting a single NVIDIA GPU by setting `resources: limits: nvidia.com/gpu:` to 1.
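
To make the structure of that request explicit, here is a minimal sketch that builds the same pod specification as a Python dictionary and prints it as JSON, which `kubectl apply -f` accepts as well as YAML. The function name `gpu_pod_manifest` and its parameters are hypothetical; the field values are copied from `gpu-pod.yaml` above.

```python
import json


def gpu_pod_manifest(name, image, gpus=1):
    """Build a Pod manifest that requests `gpus` NVIDIA GPUs (hypothetical helper)."""
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name},
        "spec": {
            "restartPolicy": "OnFailure",
            "containers": [
                {
                    "name": "cuda-vector-add",
                    "image": image,
                    # The only GPU-specific part of the manifest:
                    "resources": {"limits": {"nvidia.com/gpu": gpus}},
                }
            ],
        },
    }


manifest = gpu_pod_manifest("gpu-operator-test", "nvidia/samples:vectoradd-cuda10.2")
print(json.dumps(manifest, indent=2))
```

Removing the `resources` entry would give an ordinary pod that the scheduler could place on any node, with or without a GPU.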

To deploy an application, execute the `kubectl apply` command, specifying the YAML configuration file with the `-f` file option.

In [13]:
# Deploy the application
!kubectl apply -f $CONFIG_DIR/gpu-pod.yaml

pod/gpu-operator-test created


Once deployed, we can observe the status of a pod created with `kubectl get`:

In [14]:
# Get the status of the pod deployed
!kubectl get pods gpu-operator-test

NAME                READY   STATUS    RESTARTS   AGE
gpu-operator-test   0/1     Pending   0          57s


At this stage, the application is in the "Pending" state. <br>
Why do you think this is the case? Is it just that we have not given the application enough time to launch, or are there other reasons for this behavior? Try executing the same command again to see if the status changes.

In [15]:
# Checking again. Is it still pending?
!kubectl get pods gpu-operator-test

NAME                READY   STATUS    RESTARTS   AGE
gpu-operator-test   0/1     Pending   0          73s


So the application is indeed in the "Pending" state and it will remain like that irrespective of the amount of time we wait. Why? Begin to answer this by looking at the configuration of the available nodes (in our case we just have one). In particular, look for any NVIDIA-specific configuration using the `kubectl describe` command, as this will help us identify GPU resources:

In [16]:
# Can we see the GPU?
!kubectl describe nodes

Name:               442be9fcde81
Roles:              control-plane,master
Labels:             beta.kubernetes.io/arch=amd64
                    beta.kubernetes.io/os=linux
                    kubernetes.io/arch=amd64
                    kubernetes.io/hostname=442be9fcde81
                    kubernetes.io/os=linux
                    minikube.k8s.io/commit=15cede53bdc5fe242228853e737333b09d4336b5
                    minikube.k8s.io/name=minikube
                    minikube.k8s.io/updated_at=2022_04_27T10_44_30_0700
                    minikube.k8s.io/version=v1.19.0
                    node-role.kubernetes.io/control-plane=
                    node-role.kubernetes.io/master=
Annotations:        kubeadm.alpha.kubernetes.io/cri-socket: /var/run/dockershim.sock
                    node.alpha.kubernetes.io/ttl: 0
                    volumes.kubernetes.io/controller-managed-attach-detach: true
CreationTimestamp:  Wed, 27 Apr 2022 10:44:29 +0000
Taints:             <none>
Unschedulable:    

Can you find anything? Try again, filtering the output with `grep`:

In [17]:
# Let's look for the lines containing the word "nvidia"
!kubectl describe nodes | grep nvidia

We did not find anything, which explains why the application is still pending: our cluster is not aware of the GPU. The scheduler cannot place the pod because our YAML requests GPU resources that, as far as the cluster knows, do not exist. We need to add the NVIDIA GPU device plugin.
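
The scheduler also records its reason on the pod itself, in the `PodScheduled` condition returned by `kubectl get pod gpu-operator-test -o json`. The sketch below reads that condition from a trimmed sample. The helper name is hypothetical, and the sample message is illustrative of the usual wording rather than output captured from the class environment.

```python
def unschedulable_reason(pod):
    """Return the scheduler's message if the pod could not be scheduled, else None."""
    for condition in pod.get("status", {}).get("conditions", []):
        if condition.get("type") == "PodScheduled" and condition.get("status") == "False":
            return condition.get("message")
    return None


# Illustrative, trimmed pod object (assumption: not copied from a real cluster).
sample_pod = {
    "metadata": {"name": "gpu-operator-test"},
    "status": {
        "phase": "Pending",
        "conditions": [
            {
                "type": "PodScheduled",
                "status": "False",
                "reason": "Unschedulable",
                "message": "0/1 nodes are available: 1 Insufficient nvidia.com/gpu.",
            }
        ],
    },
}

print(unschedulable_reason(sample_pod))
```

The same information appears in the "Events" section of `kubectl describe pod gpu-operator-test`.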

--- 
# 9.3 Add GPU Awareness to K8s
To take advantage of GPU acceleration on Kubernetes, install the [NVIDIA GPU device plugin](https://kubernetes.io/docs/tasks/manage-gpus/scheduling-gpus/#deploying-nvidia-gpu-device-plugin) in the cluster. Before adding it, look at the status of the cluster without the plugin using `kubectl get`:

In [18]:
# Try to find the GPU device plugin. Not there 
!kubectl get pods -A

NAMESPACE     NAME                                   READY   STATUS    RESTARTS   AGE
default       gpu-operator-test                      0/1     Pending   0          2m56s
kube-system   coredns-74ff55c5b-zv7v2                1/1     Running   0          7m11s
kube-system   etcd-442be9fcde81                      1/1     Running   0          7m22s
kube-system   kube-apiserver-442be9fcde81            1/1     Running   0          7m22s
kube-system   kube-controller-manager-442be9fcde81   1/1     Running   0          7m22s
kube-system   kube-proxy-n6m24                       1/1     Running   0          7m11s
kube-system   kube-scheduler-442be9fcde81            1/1     Running   0          7m22s
kube-system   storage-provisioner                    1/1     Running   0          7m25s


To install the NVIDIA GPU plugin, we can use the Kubernetes package manager [Helm](https://helm.sh/).

In [19]:
# Install the device plugin with the Helm package manager
!helm repo add nvdp https://nvidia.github.io/k8s-device-plugin \
   && helm repo update
!helm install --version=0.9.0 --generate-name nvdp/nvidia-device-plugin

"nvdp" has been added to your repositories
Hang tight while we grab the latest from your chart repositories...
...Successfully got an update from the "nvdp" chart repository
Update Complete. ⎈Happy Helming!⎈
NAME: nvidia-device-plugin-1651056784
LAST DEPLOYED: Wed Apr 27 10:53:04 2022
NAMESPACE: default
STATUS: deployed
REVISION: 1
TEST SUITE: None


Check the status again to make sure the plugin was deployed:

In [20]:
# Now the device plugin "nvidia-device-plugin-*" should be "Running" after a "ContainerCreating" status
!kubectl get pods -A

NAMESPACE     NAME                                    READY   STATUS      RESTARTS   AGE
default       gpu-operator-test                       0/1     Completed   0          4m33s
kube-system   coredns-74ff55c5b-zv7v2                 1/1     Running     0          8m48s
kube-system   etcd-442be9fcde81                       1/1     Running     0          8m59s
kube-system   kube-apiserver-442be9fcde81             1/1     Running     0          8m59s
kube-system   kube-controller-manager-442be9fcde81    1/1     Running     0          8m59s
kube-system   kube-proxy-n6m24                        1/1     Running     0          8m48s
kube-system   kube-scheduler-442be9fcde81             1/1     Running     0          8m59s
kube-system   nvidia-device-plugin-1651056784-z2zj2   1/1     Running     0          29s
kube-system   storage-provisioner                     1/1     Running     0          9m2s


We should now see the NVIDIA-specific configuration listed against the nodes:

In [21]:
# Now we should see allocatable GPUs
!kubectl describe nodes

Name:               442be9fcde81
Roles:              control-plane,master
Labels:             beta.kubernetes.io/arch=amd64
                    beta.kubernetes.io/os=linux
                    kubernetes.io/arch=amd64
                    kubernetes.io/hostname=442be9fcde81
                    kubernetes.io/os=linux
                    minikube.k8s.io/commit=15cede53bdc5fe242228853e737333b09d4336b5
                    minikube.k8s.io/name=minikube
                    minikube.k8s.io/updated_at=2022_04_27T10_44_30_0700
                    minikube.k8s.io/version=v1.19.0
                    node-role.kubernetes.io/control-plane=
                    node-role.kubernetes.io/master=
Annotations:        kubeadm.alpha.kubernetes.io/cri-socket: /var/run/dockershim.sock
                    node.alpha.kubernetes.io/ttl: 0
                    volumes.kubernetes.io/controller-managed-attach-detach: true
CreationTimestamp:  Wed, 27 Apr 2022 10:44:29 +0000
Taints:             <none>
Unschedulable:    

In [22]:
# Let's look for the lines containing the word nvidia
!kubectl describe nodes | grep nvidia

  nvidia.com/gpu:     1
  nvidia.com/gpu:     1
  kube-system                 nvidia-device-plugin-1651056784-z2zj2    0 (0%)        0 (0%)      0 (0%)           0 (0%)         76s
  nvidia.com/gpu     0           0
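
In the output above, the first two `nvidia.com/gpu` lines most likely come from the node's "Capacity" and "Allocatable" sections, and the last one from "Allocated resources" (GPU requests and limits currently in use). The same check can be done on the structured output of `kubectl get nodes -o json`, where each node carries a `status.allocatable` map. This is a minimal sketch with hand-written sample data; the helper name `allocatable_gpus` is hypothetical.

```python
def allocatable_gpus(node):
    """Number of GPUs a node advertises to the scheduler (0 if none)."""
    allocatable = node.get("status", {}).get("allocatable", {})
    # Kubernetes reports resource quantities as strings.
    return int(allocatable.get("nvidia.com/gpu", "0"))


# Trimmed, hand-written node objects (assumption: illustrative only).
before_plugin = {"status": {"allocatable": {"cpu": "4", "pods": "110"}}}
after_plugin = {"status": {"allocatable": {"cpu": "4", "nvidia.com/gpu": "1", "pods": "110"}}}

print(allocatable_gpus(before_plugin))  # 0
print(allocatable_gpus(after_plugin))   # 1
```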


Now that we have deployed the GPU device plugin, what do you think happened to our application?

In [23]:
# Let's check the application again
!kubectl get pods gpu-operator-test

NAME                READY   STATUS      RESTARTS   AGE
gpu-operator-test   0/1     Completed   0          5m33s


Our application executed successfully when the GPU resources became available. In fact, it has now completed so we can have a look at its execution logs with `kubectl logs`:

In [24]:
# Let's look at the output
!kubectl logs gpu-operator-test

[Vector addition of 50000 elements]
Copy input data from the host memory to the CUDA device
CUDA kernel launch with 196 blocks of 256 threads
Copy output data from the CUDA device to the host memory
Test PASSED
Done


Check the list of Helm charts installed with the `helm list` command (see the [Helm documentation](https://helm.sh/docs/helm/helm_list/)). The `--filter` option allows filtering by name.  Use the `--output` option to specify the output format ("json", "table", or "yaml").  
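
As a sketch of what the JSON output can be used for, the snippet below filters a list of releases by name with a regular expression, similar in spirit to `--filter`. The sample string is hand-written and trimmed; its field names are an assumption based on Helm 3's JSON output and should be checked against your Helm version. The helper name `filter_releases` is hypothetical.

```python
import json
import re

# Hand-written sample in the shape of `helm list --output json` (assumption).
SAMPLE_HELM_LIST = """
[
  {
    "name": "nvidia-device-plugin-1651056784",
    "namespace": "default",
    "revision": "1",
    "status": "deployed",
    "chart": "nvidia-device-plugin-0.9.0"
  }
]
"""


def filter_releases(helm_json, pattern):
    """Return the releases whose name matches the regular expression `pattern`."""
    releases = json.loads(helm_json)
    return [release for release in releases if re.search(pattern, release["name"])]


for release in filter_releases(SAMPLE_HELM_LIST, "nvidia"):
    print(release["name"], release["chart"], release["status"])
```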

Now, let's delete the Kubernetes pod `gpu-operator-test`:

In [25]:
# Let's delete the pod
!kubectl delete pod gpu-operator-test 

pod "gpu-operator-test" deleted


Congratulations! You deployed a GPU-accelerated application with Kubernetes. So far, we have requested a single GPU without specifying which type of GPU we want.

--- 
# 9.4 Interact with GPU Resources in K8s

Now, let's see how to get more control over the GPU-accelerated cluster. Being able to control the GPU type, or the MIG ([Multi-Instance GPU](https://www.nvidia.com/en-us/technologies/multi-instance-gpu/)) partition on an Ampere GPU, is very important because GPUs vary in computational capability, memory, and cost. MIG allows users to partition a GPU into as many as seven instances (on A100). This allows more granular control over the resources in the cluster and better application isolation. 

In order to control the GPU type, we'll add the `gpu-feature-discovery` plugin and deploy it with Helm. This plugin can be configured with several options, as described in the [gpu-feature-discovery repository](https://github.com/NVIDIA/gpu-feature-discovery#deployment-via-helm). One of the most interesting options when working with Ampere GPUs is the ability to support MIG partitions. The feature discovery plugin can be deployed with the following configurable features:


|Feature|Description|Default|
|-|-|-|
|`failOnInitError`|Fail if there is an error during initialization of any label sources|"true"|
|`sleepInterval`|Time to sleep between labeling|"60s"|
|`migStrategy`|Pass the desired strategy for labeling MIG devices on GPUs that support it: `none`, `single`, or `mixed`|"none"|
|`nfd.deploy`|When set to true, deploy NFD as a subchart with all of the proper parameters set for it|"true"|

In this class, we are not using Ampere GPUs, so we will do a simple install:

In [26]:
!helm repo add nvgfd https://nvidia.github.io/gpu-feature-discovery \
    && helm repo update
!helm install \
    --version=0.4.1 \
    --generate-name \
    nvgfd/gpu-feature-discovery

"nvgfd" has been added to your repositories
Hang tight while we grab the latest from your chart repositories...
...Successfully got an update from the "nvgfd" chart repository
...Successfully got an update from the "nvdp" chart repository
Update Complete. ⎈Happy Helming!⎈
NAME: gpu-feature-discovery-1651057196
LAST DEPLOYED: Wed Apr 27 10:59:56 2022
NAMESPACE: default
STATUS: deployed
REVISION: 1
TEST SUITE: None
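
On a MIG-capable cluster, the options in the table above could be overridden at install time with Helm's standard `--set key=value` flag. The sketch below only builds and prints such a command; it does not run it. The helper name `gfd_install_command` is hypothetical, and the chosen values (`mixed`, `30s`) are illustrative rather than recommendations.

```python
import shlex


def gfd_install_command(version="0.4.1", **values):
    """Build a `helm install` argument list with one --set per chart value."""
    command = ["helm", "install", f"--version={version}", "--generate-name"]
    for key, value in sorted(values.items()):
        command += ["--set", f"{key}={value}"]
    command.append("nvgfd/gpu-feature-discovery")
    return command


command = gfd_install_command(migStrategy="mixed", sleepInterval="30s")
print(shlex.join(command))
```

Building the command as a list keeps each argument separate, so it can be passed to `subprocess.run` without shell quoting issues if you decide to execute it.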


Let's look at additional information that we have about our system:

In [27]:
# Looking for all of the NVIDIA related information
!kubectl describe nodes | grep "nvidia.com" -A 15

  nvidia.com/gpu:     1
  pods:               110
Allocatable:
  cpu:                4
  ephemeral-storage:  304884524Ki
  hugepages-1Gi:      0
  hugepages-2Mi:      0
  memory:             16095212Ki
  nvidia.com/gpu:     1
  pods:               110
System Info:
  Machine ID:                 84fb46bd39d2483a97ab4430ee4a5e3a
  System UUID:                4138acd6-eadc-4c1a-8c42-4fbbc960846d
  Boot ID:                    a3589ab0-527a-424b-aa63-1ab9fd222aca
  Kernel Version:             5.4.0-1041-aws
  OS Image:                   Ubuntu 20.04.1 LTS
  Operating System:           linux
  Architecture:               amd64
  Container Runtime Version:  docker://20.10.3
  Kubelet Version:            v1.20.2
  Kube-Proxy Version:         v1.20.2
PodCIDR:                      10.244.0.0/24
PodCIDRs:                     10.244.0.0/24
Non-terminated Pods:          (11 in total)
--
  nvidia.com/gpu     0           0
Events:
  Type    Reason                   Age                From        Messa

You should see a wide range of GPU-specific information, including the driver and CUDA information, as well as which GPU is in use from `nvidia.com/gpu.product`.

This is probably a Tesla-T4, unless you are running the class on an alternative GPU. Recall that we deployed our test application `gpu-operator-test` with a generic "GPU".  It is possible to deploy it with more specific information regarding the GPU. 

A new YAML file, `gpu-pod-T4.yaml`, is already prepared. Let's inspect it first:

In [28]:
# Review the application we are deploying
!cat $CONFIG_DIR/gpu-pod-T4.yaml

apiVersion: v1
kind: Pod
metadata:
  name: gpu-operator-test-a100
spec:
  restartPolicy: OnFailure
  containers:
  - name: cuda-vector-add
    image: "nvidia/samples:vectoradd-cuda10.2"
    resources:
      limits:
         nvidia.com/gpu: 1
  nodeSelector: 
    nvidia.com/gpu.product: A100-SXM4-40GB

As you might have noticed, the YAML was configured to deploy on an A100 GPU, which is not available in the class. Go ahead and deploy the application anyway.

In [29]:
!kubectl apply -f $CONFIG_DIR/gpu-pod-T4.yaml

pod/gpu-operator-test-a100 created


In [30]:
!kubectl get pods gpu-operator-test-a100

NAME                     READY   STATUS    RESTARTS   AGE
gpu-operator-test-a100   0/1     Pending   0          8s


Just as we saw earlier, before the cluster was GPU-aware, the pod is in the "Pending" state and will remain there until a node with an A100 GPU becomes available or the pod is deleted. 
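
The rule behind this behavior is simple: a pod can only be scheduled on a node whose labels contain every key/value pair in the pod's `nodeSelector`. The sketch below illustrates that rule in plain Python. It is not the scheduler's actual code, the helper name is hypothetical, and the node labels are an assumption based on the class hardware (a Tesla T4).

```python
def selector_matches(node_labels, node_selector):
    """True if every nodeSelector key/value pair is present in the node's labels."""
    return all(node_labels.get(key) == value for key, value in node_selector.items())


# Assumed labels for the class node (illustrative subset).
node_labels = {
    "kubernetes.io/arch": "amd64",
    "nvidia.com/gpu.product": "Tesla-T4",
}

print(selector_matches(node_labels, {"nvidia.com/gpu.product": "A100-SXM4-40GB"}))  # False
print(selector_matches(node_labels, {"nvidia.com/gpu.product": "Tesla-T4"}))        # True
```

An empty `nodeSelector` matches every node, which is why the first test pod needed only the GPU resource to become available.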

## 9.4.1 Exercise: Configure Pod

Modify the YAML file and deploy the `gpu-operator-test` application on the correct GPU.
Open the [gpu-pod-T4.yaml](kubernetes-config/gpu-pod-T4.yaml) config file and make these changes:
* Change the pod name to "gpu-operator-test-t4"
* Set the GPU product to "Tesla-T4" instead of the A100

Check your work against the [solution](solutions/ex9.4.1.yaml) before moving on:

In [34]:
# TODO modify gpu-pod-T4.yaml so that this cell verifies changes are correct
# Check your work - you'll get no output if the files match
!diff $CONFIG_DIR/gpu-pod-T4.yaml solutions/ex9.4.1.yaml

Next, deploy the `gpu-operator-test-t4` pod using the modified [gpu-pod-T4.yaml](kubernetes-config/gpu-pod-T4.yaml).

In [35]:
!kubectl apply -f $CONFIG_DIR/gpu-pod-T4.yaml

pod/gpu-operator-test-t4 created


## 9.4.2 Final Checks and Shutdown
It might take a few seconds, but the application should deploy and finish successfully.

In [36]:
# Get the status of the pod deployed
!kubectl get pods gpu-operator-test-t4

NAME                   READY   STATUS      RESTARTS   AGE
gpu-operator-test-t4   0/1     Completed   0          12s


In [37]:
# Let's look at the output
!kubectl logs gpu-operator-test-t4

[Vector addition of 50000 elements]
Copy input data from the host memory to the CUDA device
CUDA kernel launch with 196 blocks of 256 threads
Copy output data from the CUDA device to the host memory
Test PASSED
Done


### 9.4.2.1 Exercise: Delete a Pod

Delete the Kubernetes pod `gpu-operator-test-t4`. Check the [solution](solutions/ex9.4.2.ipynb) before moving on:

In [39]:
# TODO delete the pod
!kubectl delete pod gpu-operator-test-t4


pod "gpu-operator-test-t4" deleted


Before moving forward to the next notebook, shut down K8s and clean up the docker environment.

In [40]:
# Shut down K8s
!minikube delete
# Shut down running docker containers
# (prints a harmless usage error if none are running)
!docker kill $(docker ps -q)
# Check for clean environment - this should be empty
!docker ps

🔄  Uninstalling Kubernetes v1.20.2 using kubeadm ...
🔥  Deleting "minikube" in none ...
💀  Removed all traces of the "minikube" cluster.
"docker kill" requires at least 1 argument.
See 'docker kill --help'.

Usage:  docker kill [OPTIONS] CONTAINER [CONTAINER...]

Kill one or more running containers
CONTAINER ID   IMAGE     COMMAND   CREATED   STATUS    PORTS     NAMES


---
<h2 style="color:green;">Congratulations!</h2>

In this notebook, you have:
- Launched a K8s cluster
- Interacted with K8s using `kubectl`
- Installed plugins with Helm
- Enabled GPU acceleration and GPU feature discovery
- Deployed an application

Next, you'll monitor activity on the cluster. Move on to [Monitoring GPU within Kubernetes Cluster](010_K8s_Monitor.ipynb).
