Docker CUDA updates (#1119)
* Start work on updating CUDA but using Docker for 22.04+

* updates from test install on fresh install of Pop!OS 22.04

ran through the new instructions on getting versions of the CUDA library installed outside what is packaged with the Nvidia driver directly.
Updates to get docker.io installed before adding user to docker group
remove sudo on docker commands

* fix Bare URL used

* remove extra blank line

* replace then with than

* fix: Spelling, grammar, & phrasing in introduction paragraph

* cuda: Various formatting & content improvements

* fix(cuda): Correct commands for Docker tutorial

* remove "s" from 0_Introductions

* Fix space

* match directories in commands

* fix(cuda): Minor command adjustments

* fix(cuda): Run spell check

---------

Co-authored-by: Aaron Honeycutt <aaronhoneycutt@proton.me>
Co-authored-by: Randall White <N3M0-22@pm.me>
Co-authored-by: Jacob Kauffmann <jacob@system76.com>
4 people committed Mar 24, 2023
1 parent 2cc56ff commit 38a36f7
Showing 1 changed file with 135 additions and 11 deletions.
146 changes: 135 additions & 11 deletions content/cuda.md
@@ -16,19 +16,151 @@ tableOfContents: true

## Pop!\_OS 22.04 LTS

It is recommended to use [Tensorman](/articles/tensorman) as newer versions of CUDA are no longer packaged on their own.
Basic CUDA runtime functionality is installed automatically with the NVIDIA driver (in the `libnvidia-compute-*` and `nvidia-compute-utils-*` packages). The maximum CUDA version supported by the libraries included with the driver can be seen using the `nvidia-smi` command.

Additional tools for using and developing with CUDA can be installed with the `nvidia-cuda-toolkit` package:

```bash
sudo apt install nvidia-cuda-toolkit
```

The `nvidia-cuda-toolkit` package is [maintained by Ubuntu](https://packages.ubuntu.com/jammy/amd64/nvidia-cuda-toolkit), and may contain an older version of CUDA than what the driver supports.
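
To see whether the packaged toolkit lags behind the driver, the two versions can be compared directly. The following is a minimal sketch: the helper names `driver_cuda_version` and `toolkit_cuda_version` are hypothetical, and the patterns assume the `CUDA Version: X.Y` header printed by `nvidia-smi` and the `release X.Y,` line printed by `nvcc --version`.

```bash
# Hypothetical helper: read nvidia-smi output on stdin and print the
# maximum CUDA version supported by the driver libraries.
driver_cuda_version() {
    sed -n 's/.*CUDA Version: \([0-9][0-9.]*\).*/\1/p' | head -n 1
}

# Hypothetical helper: read `nvcc --version` output on stdin and print
# the CUDA release of the installed toolkit.
toolkit_cuda_version() {
    sed -n 's/.*release \([0-9][0-9.]*\),.*/\1/p' | head -n 1
}

# Only query the tools if both are installed.
if command -v nvidia-smi >/dev/null 2>&1 && command -v nvcc >/dev/null 2>&1; then
    echo "Driver supports up to CUDA $(nvidia-smi | driver_cuda_version)"
    echo "Installed toolkit is CUDA $(nvcc --version | toolkit_cuda_version)"
fi
```

If the toolkit version printed is lower than the driver version, the Docker-based approach in the next section is one way to get a newer toolkit.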

### Other Versions of CUDA

The `nvidia-container-toolkit` package uses Docker containers to allow alternate versions of the CUDA libraries to be installed alongside the one included with the NVIDIA driver. You can see the different Docker images that are published by NVIDIA here: <https://hub.docker.com/r/nvidia/cuda/>

This example installs a development environment with CUDA version 12.1.

#### Install Software

After making sure the system is up-to-date, install the NVIDIA container toolkit. In this example, Docker will also be installed using the `docker.io` package.

```bash
sudo apt update
sudo apt full-upgrade
sudo apt install nvidia-container-toolkit docker.io
```

The user account working with the Container Toolkit must be added to the `docker` group if that hasn't been done already:

```bash
sudo usermod -aG docker $USER
```
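
Whether the account is already listed in the group can be checked before or after running `usermod`. This is a small sketch; the function name `in_group` is hypothetical. Passing a user name to `id` makes it read the group database, so the result reflects the `usermod` change immediately, although a session that is already running only picks up the new group after logging in again.

```bash
# Hypothetical helper: succeed if user $1 is listed as a member of group $2.
in_group() {
    id -nG "$1" 2>/dev/null | tr ' ' '\n' | grep -qxF -- "$2"
}

# Fall back to `id -un` if $USER is not set in this shell.
current_user=${USER:-$(id -un)}

if in_group "$current_user" docker; then
    echo "$current_user is in the docker group"
else
    echo "$current_user is not in the docker group yet"
fi
```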

Finally, add a kernel parameter that disables the unified cgroup hierarchy (cgroup v2):

```bash
sudo kernelstub --add-options "systemd.unified_cgroup_hierarchy=0"
```

...and reboot so that the kernel parameter and the new group membership take effect.
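
After the reboot, the running kernel's command line can be inspected to confirm the option was applied. The sketch below reads `/proc/cmdline` by default; the function name `has_kernel_option` and its optional second argument (an alternate file to read) are hypothetical conveniences.

```bash
# Hypothetical helper: succeed if the kernel command line contains the
# exact option $1. Reads /proc/cmdline unless a file is given as $2.
has_kernel_option() {
    local option=$1 cmdline_file=${2:-/proc/cmdline}
    tr ' ' '\n' < "$cmdline_file" | grep -qxF -- "$option"
}

if [ -r /proc/cmdline ]; then
    if has_kernel_option "systemd.unified_cgroup_hierarchy=0"; then
        echo "cgroup option is active"
    else
        echo "cgroup option is not active; re-run kernelstub and reboot"
    fi
fi
```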

#### Configure the Docker daemon for the NVIDIA Container Runtime

Use the NVIDIA Container Toolkit CLI to configure Docker to use the NVIDIA libraries, then restart Docker:

```bash
sudo nvidia-ctk runtime configure --runtime=docker
sudo systemctl restart docker
```
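
The result of the configuration step can be sanity-checked before moving on. The sketch below assumes that `nvidia-ctk` records the runtime under an `"nvidia"` key in `/etc/docker/daemon.json`; that path and key are assumptions about the default setup, and the function name `daemon_has_nvidia_runtime` is hypothetical.

```bash
# Hypothetical helper: succeed if the Docker daemon config contains an
# "nvidia" key. Checks /etc/docker/daemon.json unless a file is given as $1.
daemon_has_nvidia_runtime() {
    local config_file=${1:-/etc/docker/daemon.json}
    [ -r "$config_file" ] && grep -q '"nvidia"[[:space:]]*:' "$config_file"
}

if daemon_has_nvidia_runtime; then
    echo "Docker is configured with an nvidia runtime entry"
else
    echo "No nvidia runtime entry found in the Docker daemon config"
fi
```

This is a plain text match rather than a JSON parse, so it only indicates that an entry is present, not that the entry is valid.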

#### Test Configuration

Run this command to check the Docker configuration for CUDA:

```bash
docker run --rm --runtime=nvidia --gpus all nvidia/cuda:12.1.0-devel-ubuntu22.04 nvidia-smi
```

The output displays the CUDA version supported by the container:

```
Thu Mar 23 14:43:51 2023
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 525.89.02    Driver Version: 525.89.02    CUDA Version: 12.1     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  NVIDIA GeForce ...  Off  | 00000000:01:00.0  On |                  N/A |
| 30%   37C    P5    N/A /  75W |    789MiB /  4096MiB |     16%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
+-----------------------------------------------------------------------------+
```
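
The image reference in the test command is built from a CUDA version, a flavor (`base`, `runtime`, or `devel`), and a distribution. To try a different CUDA version, only the tag needs to change. The helper below is a hypothetical sketch that composes such a reference; its name and default arguments are illustrative, and not every combination is necessarily published, so check the Docker Hub page for the tags that exist.

```bash
# Hypothetical helper: compose an nvidia/cuda image reference.
#   $1 = CUDA version (e.g. 12.1.0)
#   $2 = flavor: base, runtime, or devel (default: devel)
#   $3 = distribution (default: ubuntu22.04)
cuda_image() {
    local version=$1 flavor=${2:-devel} distro=${3:-ubuntu22.04}
    printf 'nvidia/cuda:%s-%s-%s\n' "$version" "$flavor" "$distro"
}

cuda_image 12.1.0
# prints nvidia/cuda:12.1.0-devel-ubuntu22.04

# Example use with the test command (not run here):
# docker run --rm --runtime=nvidia --gpus all "$(cuda_image 12.1.0)" nvidia-smi
```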

#### Run the Container

Start a shell within the container:

```bash
docker run -it --rm --runtime=nvidia --gpus all nvidia/cuda:12.1.0-devel-ubuntu22.04 bash
```

Commands can then be run with CUDA support:

```shell
root@5397e7ea7f57:/# nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2023 NVIDIA Corporation
Built on Tue_Feb__7_19:32:13_PST_2023
Cuda compilation tools, release 12.1, V12.1.66
Build cuda_12.1.r12.1/compiler.32415258_0
```

The container can be viewed and managed using `docker ps` in another terminal or tab:

```bash
system76@pop-os:~$ docker ps
CONTAINER ID   IMAGE                                  COMMAND                  CREATED         STATUS         PORTS     NAMES
5397e7ea7f57   nvidia/cuda:12.1.0-devel-ubuntu22.04   "/opt/nvidia/nvidia_…"   2 minutes ago   Up 2 minutes             boring_tesla
```

The container ID can be referenced to copy files into and out of the container:

```bash
system76@pop-os:~$ git clone https://github.com/NVIDIA/cuda-samples.git
system76@pop-os:~$ docker cp cuda-samples/ 5397e7ea7f57:/root/cuda-samples/
```
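
Rather than copying the container ID by hand, it can be looked up from the image name with the `ancestor` filter of `docker ps`. This is a minimal sketch; the function name `running_container_for_image` is hypothetical, and it simply takes the first match if several containers are running from the same image.

```bash
# Hypothetical helper: print the ID of the first running container that
# was started from image $1 (empty output if there is none).
running_container_for_image() {
    docker ps --quiet --filter "ancestor=$1" | head -n 1
}

# Example use (not run here):
# container_id=$(running_container_for_image nvidia/cuda:12.1.0-devel-ubuntu22.04)
# docker cp cuda-samples/ "$container_id":/root/cuda-samples/
```

Because the container was started with `--rm`, anything copied into it is discarded when the shell exits. Starting the container with a bind mount (`docker run -v`) is an alternative when the files need to persist on the host.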

Now, from within the container, an example project can be built:

```bash
root@5397e7ea7f57:/# cd /root/cuda-samples/Samples/0_Introduction/c++11_cuda/
root@5397e7ea7f57:~/cuda-samples/Samples/0_Introduction/c++11_cuda# make
```

The binary (`c++11_cuda`) is built:

```
root@5397e7ea7f57:~/cuda-samples/Samples/0_Introduction/c++11_cuda# ls -l
total 6108
-rw-rw-r-- 1 1000 1000   13679 Mar 24 16:45 Makefile
-rw-rw-r-- 1 1000 1000    2090 Mar 24 16:45 NsightEclipse.xml
-rw-rw-r-- 1 1000 1000    3556 Mar 24 16:45 README.md
-rwxr-xr-x 1 root root 1881448 Mar 24 16:48 c++11_cuda
...
```
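
The same build can also be started from the host without typing into the container's shell, using `docker exec`. The sketch below is one way to do that; the function name `build_sample_in_container` is hypothetical, and it assumes the samples were copied to the path used above.

```bash
# Hypothetical helper: run `make` for a sample directory inside a
# running container.
#   $1 = container ID or name
#   $2 = absolute path of the sample directory inside the container
build_sample_in_container() {
    local container_id=$1 sample_dir=$2
    docker exec "$container_id" make -C "$sample_dir"
}

# Example use (not run here):
# build_sample_in_container 5397e7ea7f57 \
#     /root/cuda-samples/Samples/0_Introduction/c++11_cuda
```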

## Pop!\_OS 20.04 LTS

### Install the Latest NVIDIA CUDA Toolkit

To install the CUDA toolkit, run this command:

```bash
sudo apt install system76-cuda-latest
```

To install the cuDNN library, run this command:

```bash
sudo apt install system76-cudnn-11.2
@@ -112,12 +244,4 @@ sudo apt-key adv --keyserver keyserver.ubuntu.com --recv-key 204DD8AEC33A7AFF
sudo apt update
```

### Ubuntu 21.04

```bash
echo "deb http://apt.pop-os.org/proprietary hirsute main" | sudo tee -a /etc/apt/sources.list.d/pop-proprietary.list
sudo apt-key adv --keyserver keyserver.ubuntu.com --recv-key 204DD8AEC33A7AFF
sudo apt update
```

The following [article](/articles/system76-driver) will go over installing the System76 NVIDIA driver.
