NVIDIA Cloud Native Stack

NVIDIA Cloud Native Stack (formerly known as Cloud Native Core) is a collection of software to run cloud native workloads on NVIDIA GPUs. NVIDIA Cloud Native Stack is based on Ubuntu, Kubernetes, Helm and the NVIDIA GPU and Network Operator.

Interested in deploying NVIDIA Cloud Native Stack? This repository has install guides for manual installations and Ansible playbooks for automated installations.

Interested in a pre-provisioned NVIDIA Cloud Native Stack environment? NVIDIA LaunchPad provides pre-provisioned environments so that you can quickly get started.

Getting Started

Prerequisites

Please make sure the following prerequisites are met before installing Cloud Native Stack:

  • The system has direct internet access
  • The operating system is Ubuntu 20.04 or later, or RHEL 8.7
  • The system has adequate internet bandwidth
  • The DNS server is working properly on the system
  • The system can access the Google repo (for the k8s installation)
  • The system has only one network interface configured with internet access, and its IP address is static
  • UEFI secure boot is disabled
  • The root file system has at least 40GB capacity
  • The system has 2 CPUs and 4GB of memory
  • At least one NVIDIA GPU is attached to the system
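
Some of the prerequisites above can be checked from a shell before running the installer. The following is a minimal sketch, not part of the repository: the helper names (`meets_minimum`, `report`, `kb_to_gb`) are hypothetical, the CPU, memory and disk figures are assumed to be minimums, and sizes are rounded up to whole gigabytes because a machine with nominally 4GB of memory usually reports slightly less.

```shell
#!/usr/bin/env bash
# Preflight sketch for the prerequisites list. Helper names are hypothetical
# and do not come from the Cloud Native Stack repository.

MIN_CPUS=2       # "2 CPUs", assumed to be a minimum
MIN_MEM_GB=4     # "4GB of memory", assumed to be a minimum
MIN_ROOT_GB=40   # "at least 40GB capacity"

# Succeeds when ACTUAL >= REQUIRED (both non-negative integers).
meets_minimum() {
  [ "$1" -ge "$2" ]
}

# Prints one OK/FAIL line for a named resource.
report() {
  local name="$1" actual="$2" required="$3"
  if meets_minimum "$actual" "$required"; then
    echo "OK   $name: $actual (minimum $required)"
  else
    echo "FAIL $name: $actual (minimum $required)"
  fi
}

# Converts kilobytes to whole gigabytes, rounding up (an assumption made so
# that nominal sizes such as "4GB" are not rejected for reserved memory).
kb_to_gb() {
  echo $(( ($1 + 1048575) / 1048576 ))
}

cpus=$(getconf _NPROCESSORS_ONLN)
report "CPUs" "$cpus" "$MIN_CPUS"

if [ -r /proc/meminfo ]; then
  mem_kb=$(awk '/^MemTotal:/ {print $2}' /proc/meminfo)
  report "Memory (GB)" "$(kb_to_gb "$mem_kb")" "$MIN_MEM_GB"
fi

# Total size of the root file system in 1K blocks (POSIX df output format).
root_kb=$(df -Pk / | awk 'NR==2 {print $2}')
report "Root file system (GB)" "$(kb_to_gb "$root_kb")" "$MIN_ROOT_GB"

# GPU detection is skipped when lspci is not installed.
if command -v lspci >/dev/null 2>&1; then
  if lspci | grep -qi nvidia; then
    echo "OK   NVIDIA GPU detected"
  else
    echo "FAIL no NVIDIA GPU found by lspci"
  fi
fi
```

The script only reports; it never aborts, so every line of output can be reviewed in one pass. Internet access, DNS, the network interface setup and the secure boot state are not covered and still need to be checked by hand.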

Installation

Run the commands below to clone the NVIDIA Cloud Native Stack repository.

git clone https://github.com/NVIDIA/cloud-native-stack.git
cd cloud-native-stack/playbooks

Update the hosts file in the playbooks directory with the IP addresses of the master node and any worker nodes, along with the username and password, as shown below:

nano hosts

[master]
<master-IP> ansible_ssh_user=nvidia ansible_ssh_pass=nvidiapass ansible_sudo_pass=nvidiapass ansible_ssh_common_args='-o StrictHostKeyChecking=no'
[node]
<worker-IP> ansible_ssh_user=nvidia ansible_ssh_pass=nvidiapass ansible_sudo_pass=nvidiapass ansible_ssh_common_args='-o StrictHostKeyChecking=no'
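
If the hosts file needs to be produced by a script rather than edited in nano, the same layout can be generated from a few arguments. This is only a sketch: `render_hosts` is a hypothetical helper, it assumes the SSH and sudo passwords are identical (as in the example above), and the addresses are documentation-only placeholders.

```shell
#!/usr/bin/env bash
# render_hosts is a hypothetical helper, not part of the repository.
# Usage: render_hosts USER PASSWORD MASTER_IP [WORKER_IP...]
# Prints an inventory in the same layout as the hosts example.
render_hosts() {
  local user="$1" pass="$2" master_ip="$3"
  shift 3
  # Assumes the same password is used for SSH and sudo.
  local opts="ansible_ssh_user=$user ansible_ssh_pass=$pass ansible_sudo_pass=$pass ansible_ssh_common_args='-o StrictHostKeyChecking=no'"
  echo "[master]"
  echo "$master_ip $opts"
  echo "[node]"
  local ip
  for ip in "$@"; do
    echo "$ip $opts"
  done
}

# Example with placeholder addresses; output goes to stdout only.
render_hosts nvidia nvidiapass 192.0.2.10 192.0.2.11 192.0.2.12
```

Redirecting the output (for example `render_hosts ... > hosts`) from inside the playbooks directory would overwrite the existing hosts file, so it is worth reviewing the printed result first. Worker addresses are optional, matching the "if you have" wording in the step above.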

Install NVIDIA Cloud Native Stack by running the command below. A "Skipping" message in the Ansible output means the Kubernetes cluster is already up and running.

bash setup.sh install

For more information about customizing the values, please refer to Installation.
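
Once `bash setup.sh install` has finished, a quick way to confirm that the cluster came up is to look at the node status with `kubectl`. The sketch below assumes `kubectl` is available and configured on the node where it runs; `count_not_ready` is a hypothetical helper that simply counts nodes whose STATUS column is anything other than exactly `Ready`.

```shell
#!/usr/bin/env bash
# Post-install check sketch. count_not_ready is a hypothetical helper,
# not part of the Cloud Native Stack repository.

# Reads `kubectl get nodes --no-headers` output on stdin and prints the
# number of nodes whose STATUS (second column) is not exactly "Ready".
count_not_ready() {
  awk 'NF && $2 != "Ready" { n++ } END { print n + 0 }'
}

if command -v kubectl >/dev/null 2>&1; then
  if nodes=$(kubectl get nodes --no-headers 2>/dev/null); then
    echo "Nodes not in Ready state: $(printf '%s\n' "$nodes" | count_not_ready)"
  else
    echo "kubectl could not reach the cluster"
  fi
else
  echo "kubectl not found; run this on the control plane node after installation"
fi
```

A result of `0` suggests every node has registered and is ready. Note that a cordoned node (reported as `Ready,SchedulingDisabled`) is counted as not ready by this sketch.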

NVIDIA Cloud Native Stack Component Matrix

| Branch/Release | Version | Initial Release Date | Platform | OS | Containerd | CRI-O | K8s | Helm | NVIDIA GPU Operator | NVIDIA Network Operator | NVIDIA Data Center Driver |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 24.5.0/master | 13.0 | 14 May 2024 | NVIDIA Certified Server (x86 & arm64) | Ubuntu 22.04 LTS | 1.7.16 | 1.30.0 | 1.30.0 | 3.14.4 | 24.3.0 | 24.1.1 (x86 only) | 550.54.15 |
| 24.5.0/master | 13.0 | 14 May 2024 | NVIDIA Certified Server (x86 & arm64) | RHEL 8.9 | 1.7.16 | 1.30.0 | 1.30.0 | 3.14.4 | 24.3.0 | N/A | 550.54.15 |
| 24.5.0/master | 13.0 | 14 May 2024 | Jetson Devices (AGX, NX, Orin) | JetPack 5.1 and JetPack 5.0 | 1.7.16 | 1.30.0 | 1.30.0 | 3.14.4 | N/A | N/A | N/A |
| 24.5.0/master | 13.0 | 14 May 2024 | DGX Server | DGX OS 6.0 (Ubuntu 22.04 LTS) | 1.7.16 | 1.30.0 | 1.30.0 | 3.14.4 | 24.3.0 | N/A | N/A |
| 24.5.0/master | 12.1 | 14 May 2024 | NVIDIA Certified Server (x86 & arm64) | Ubuntu 22.04 LTS | 1.7.16 | 1.29.4 | 1.29.4 | 3.14.4 | 24.3.0 | 24.1.1 (x86 only) | 550.54.15 |
| 24.5.0/master | 12.1 | 14 May 2024 | NVIDIA Certified Server (x86 & arm64) | RHEL 8.9 | 1.7.16 | 1.29.4 | 1.29.4 | 3.14.4 | 24.3.0 | N/A | 550.54.15 |
| 24.5.0/master | 12.1 | 14 May 2024 | Jetson Devices (AGX, NX, Orin) | JetPack 5.1 and JetPack 5.0 | 1.7.16 | 1.29.4 | 1.29.4 | 3.14.4 | N/A | N/A | N/A |
| 24.5.0/master | 12.1 | 14 May 2024 | DGX Server | DGX OS 6.0 (Ubuntu 22.04 LTS) | 1.7.16 | 1.29.4 | 1.29.4 | 3.14.4 | 24.3.0 | N/A | N/A |
| 24.5.0/master | 11.2 | 14 May 2024 | NVIDIA Certified Server (x86 & arm64) | Ubuntu 22.04 LTS | 1.7.16 | 1.28.6 | 1.28.8 | 3.14.4 | 24.3.0 | 24.1.1 (x86 only) | 550.54.15 |
| 24.5.0/master | 11.2 | 14 May 2024 | NVIDIA Certified Server (x86 & arm64) | RHEL 8.9 | 1.7.16 | 1.28.6 | 1.28.8 | 3.14.4 | 24.3.0 | N/A | 550.54.15 |
| 24.5.0/master | 11.2 | 14 May 2024 | Jetson Devices (AGX, NX, Orin) | JetPack 5.1 and JetPack 5.0 | 1.7.16 | 1.28.6 | 1.28.8 | 3.14.4 | N/A | N/A | N/A |
| 24.5.0/master | 11.2 | 14 May 2024 | DGX Server | DGX OS 6.0 (Ubuntu 22.04 LTS) | 1.7.16 | 1.28.6 | 1.28.8 | 3.14.4 | 24.3.0 | N/A | N/A |
| 24.5.0/master | 10.5 | 14 May 2024 | NVIDIA Certified Server (x86 & arm64) | Ubuntu 22.04 LTS | 1.7.16 | 1.27.6 | 1.27.12 | 3.14.4 | 24.3.0 | 24.1.1 (x86 only) | 550.54.15 |
| 24.5.0/master | 10.5 | 14 May 2024 | NVIDIA Certified Server (x86 & arm64) | RHEL 8.9 | 1.7.16 | 1.27.6 | 1.27.12 | 3.14.4 | 24.3.0 | N/A | 550.54.15 |
| 24.5.0/master | 10.5 | 14 May 2024 | Jetson Devices (AGX, NX, Orin) | JetPack 5.1 and JetPack 5.0 | 1.7.16 | 1.27.6 | 1.27.12 | 3.14.4 | N/A | N/A | N/A |
| 24.5.0/master | 10.5 | 14 May 2024 | DGX Server | DGX OS 6.0 (Ubuntu 22.04 LTS) | 1.7.16 | 1.27.6 | 1.27.12 | 3.14.4 | 24.3.0 | N/A | N/A |

To find information on other CNS releases, please refer to the Cloud Native Stack Component Matrix.

NOTE: The CNS versions above are also available on the master branch, but it is recommended to use the specific branch for the respective release.

Cloud Native Stack Topologies

  • Cloud Native Stack supports deploying:
    • 1 node with both control plane and worker functionalities
    • 1 control plane node and any number of worker nodes

NOTE: Cloud Native Stack does not support deploying multiple control plane nodes.

Cloud Native Stack Features

Getting Help or Providing Feedback

Please open an issue on the GitHub project for any questions. Your feedback is appreciated.

Useful Links