Additional kubernetes GPU docs #2078

arnaldo2792 · 2022-04-18T18:01:09Z

Issue number:
N / A

Description of changes:

QUICKSTART-EKS: add NVIDIA GPUs sample configuration

The documentation now explicitly states that a GPU can be assigned to each
orchestrated container, and it links to the official Kubernetes
documentation on scheduling NVIDIA GPUs.

README: add NVIDIA GPUs section

This adds a new section for NVIDIA GPUs and lists which EC2 instance
types are supported by the official Bottlerocket `nvidia` k8s AMIs.

Testing done:

Links work as expected

Terms of contribution:

By submitting this pull request, I agree that this contribution is dual-licensed under the terms of both the Apache License, version 2.0, and the MIT license.

webern · 2022-04-18T19:58:58Z

README.md

@@ -740,6 +741,11 @@ There are a few important caveats about the provided kdump support:
 * The system kernel will reserve 256MB for the crash kernel, only when the host has at least 2GB of memory; the reserved space won't be available for processes running in the host
 * The crash kernel will only be loaded when the `crashkernel` parameter is present in the kernel's cmdline and if there is memory reserved for it

+### NVIDIA GPUs Support
+Bottlerocket's `nvidia` kubernetes variants include the required packages and configurations to leverage NVIDIA GPUs.
+The official AMIs for these variants can be used with the following EC2 instance types: `p2`, `p3`, `p4`, `g4dn`, `g5` and `g5g`.


Suggestion: Guard against having an incorrect list in the future.

Suggested change

-The official AMIs for these variants can be used with the following EC2 instance types: `p2`, `p3`, `p4`, `g4dn`, `g5` and `g5g`.
+The official AMIs for these variants can be used with EC2 GPU-equipped instance types such as: `p2`, `p3`, `p4`, `g4dn`, `g5` and `g5g`.

jpculp · 2022-04-18T23:05:53Z

QUICKSTART-EKS.md

@@ -383,3 +383,20 @@ You can install them in your cluster by following the `helm install` instruction
 The [GPU Operator](https://docs.nvidia.com/datacenter/cloud-native/gpu-operator/getting-started.html#install-nvidia-gpu-operator) can also be used to install these tools.
 However, it is cumbersome to select the right subset of features to avoid conflicts with the software included in the variant.
 Therefore we recommend installing the tools individually if they are required.
+
+In hosts with multiple GPUs (i.e. EC2 `g4dn` instances) you can assign a GPU per container by specifying the resource in the containers' spec as described in the [official kubernetes documentation](https://kubernetes.io/docs/tasks/manage-gpus/scheduling-gpus/):


Suggested change

-In hosts with multiple GPUs (i.e. EC2 `g4dn` instances) you can assign a GPU per container by specifying the resource in the containers' spec as described in the [official kubernetes documentation](https://kubernetes.io/docs/tasks/manage-gpus/scheduling-gpus/):
+In hosts with multiple GPUs (ex. EC2 `g4dn` instances) you can assign a GPU per container by specifying the resource in the containers' spec as described in the [official kubernetes documentation](https://kubernetes.io/docs/tasks/manage-gpus/scheduling-gpus/):

e.g. would also work.
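
For context, the per-container GPU assignment that this QUICKSTART-EKS sentence describes comes down to a resource limit in the container's spec. Below is a minimal sketch following the pattern in the linked Kubernetes documentation. The pod name, container name, and image are hypothetical placeholders, and the sketch assumes the NVIDIA device plugin is already advertising the `nvidia.com/gpu` resource on the node:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: gpu-example                               # hypothetical pod name
spec:
  restartPolicy: OnFailure
  containers:
    - name: cuda-workload                         # hypothetical container name
      image: example.com/cuda-workload:latest     # placeholder image, substitute your own
      resources:
        limits:
          nvidia.com/gpu: 1                       # ask the scheduler for one GPU for this container
```

GPUs are specified under `limits`; Kubernetes uses the limit as the request value when no request is given, and GPU counts must be whole numbers.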

arnaldo2792 · 2022-04-20T16:44:51Z

Forced push addresses the comments above

bcressey · 2022-04-21T17:47:28Z

README.md

@@ -740,6 +741,11 @@ There are a few important caveats about the provided kdump support:
 * The system kernel will reserve 256MB for the crash kernel, only when the host has at least 2GB of memory; the reserved space won't be available for processes running in the host
 * The crash kernel will only be loaded when the `crashkernel` parameter is present in the kernel's cmdline and if there is memory reserved for it

+### NVIDIA GPUs Support
+Bottlerocket's `nvidia` kubernetes variants include the required packages and configurations to leverage NVIDIA GPUs.


nit: maybe omit Kubernetes since we'll have ECS soon:

Suggested change

-Bottlerocket's `nvidia` kubernetes variants include the required packages and configurations to leverage NVIDIA GPUs.
+Bottlerocket's `nvidia` variants include the required packages and configurations to leverage NVIDIA GPUs.

bcressey · 2022-04-21T17:49:08Z

README.md

+### NVIDIA GPUs Support
+Bottlerocket's `nvidia` kubernetes variants include the required packages and configurations to leverage NVIDIA GPUs.
+The official AMIs for these variants can be used with EC2 GPU-equipped instance types such as: `p2`, `p3`, `p4`, `g4dn`, `g5` and `g5g`.
+Please refer to the [Amazon EKS quickstart](QUICKSTART-EKS.md#aws-k8s--nvidia-variants) for further details about these variants.


Elsewhere in this doc, we refer to the linked doc as QUICKSTART-EKS, so I'd like to keep using that name for consistency.

arnaldo2792 · 2022-04-21T21:23:30Z

Forced push addresses comments above

bcressey · 2022-04-22T23:11:13Z

README.md

+### NVIDIA GPUs Support
+Bottlerocket's `nvidia` variants include the required packages and configurations to leverage NVIDIA GPUs.
+The official AMIs for these variants can be used with EC2 GPU-equipped instance types such as: `p2`, `p3`, `p4`, `g4dn`, `g5` and `g5g`.
+Please see [QUICKSTART-EKS](QUICKSTART-EKS.md#aws-k8s--nvidia-variants) for further details about kubernetes variants.


Suggested change

-Please see [QUICKSTART-EKS](QUICKSTART-EKS.md#aws-k8s--nvidia-variants) for further details about kubernetes variants.
+Please see [QUICKSTART-EKS](QUICKSTART-EKS.md#aws-k8s--nvidia-variants) for further details about Kubernetes variants.

Now the documentation explicitly says that it is possible to use a GPU per orchestrated container, and references the official kubernetes documentation to schedule NVIDIA GPUs. Signed-off-by: Arnaldo Garcia Rincon <agarrcia@amazon.com>

arnaldo2792 · 2022-04-25T18:29:03Z

(Forced push fixes comment above)

arnaldo2792 requested review from bcressey, cbgbt and jpculp April 18, 2022 18:01

webern approved these changes Apr 18, 2022

jpculp reviewed Apr 18, 2022

arnaldo2792 force-pushed the k8s-gpu-docs branch from 5be3daf to a8c216f Compare April 20, 2022 16:44

arnaldo2792 requested a review from jpculp April 20, 2022 23:43

bcressey reviewed Apr 21, 2022

README: add NVIDIA GPUs section

2be38c4

This adds a new section for NVIDIA GPUs and lists what EC2 instance types are supported by the official Bottlerocket `nvidia` k8s AMIs. Signed-off-by: Arnaldo Garcia Rincon <agarrcia@amazon.com>

arnaldo2792 force-pushed the k8s-gpu-docs branch from a8c216f to 506e55a Compare April 21, 2022 21:23

arnaldo2792 requested a review from bcressey April 21, 2022 21:33

bcressey approved these changes Apr 22, 2022

arnaldo2792 force-pushed the k8s-gpu-docs branch from 506e55a to d398824 Compare April 25, 2022 18:28

arnaldo2792 merged commit cf82440 into bottlerocket-os:develop Apr 25, 2022

arnaldo2792 deleted the k8s-gpu-docs branch June 21, 2022 02:07
