$ docker run --gpus all nvidia/cuda:10.0-base nvidia-smi
docker: Error response from daemon: failed to create shim: OCI runtime create failed: container_linux.go:380: starting container process caused: process_linux.go:545: container init caused: Running hook #0:: error running hook: exit status 1, stdout: , stderr: nvidia-container-cli: container error: cgroup subsystem devices not found: unknown.
ERRO[0000] error waiting for container: context canceled
It seems that this is an issue with cgroups v2 (googling for the error leads to quite a few issues already reported elsewhere; I will try to compile a list later), and the workaround (not a solution) seemed to be:
sudo kernelstub -a "systemd.unified_cgroup_hierarchy=0"
sudo update-initramfs -c -k all
sudo reboot
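After the reboot, a quick way to confirm which cgroup hierarchy is actually in effect (a generic check, not specific to Pop!_OS, and not part of the original report):

stat -fc %T /sys/fs/cgroup/
# prints cgroup2fs on a unified (v2) hierarchy and tmpfs on the legacy/hybrid (v1) layout
grep -o 'systemd.unified_cgroup_hierarchy=[01]' /proc/cmdline
# confirms whether the kernelstub parameter above made it onto the kernel command line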
Steps to reproduce (if you know):
Get Pop!_OS 21.10
Install nvidia-container-toolkit (and the other NVIDIA packages; a quick sanity check is sketched after this list)
Try to run a docker run --gpus all ... command
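As a sanity check that the toolkit's CLI component (the binary named in the error above) is installed and on the PATH, the following can be run. This is a generic check rather than part of the original report; nvidia-container-cli is normally pulled in as a dependency of nvidia-container-toolkit:

nvidia-container-cli --version
# prints the libnvidia-container version and build information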
Expected behavior:
It works fine, with output along the lines of:
docker run --gpus all nvidia/cuda:10.0-base nvidia-smi
Thu Nov 11 10:21:10 2021
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 470.63.01    Driver Version: 470.63.01    CUDA Version: 11.4     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  NVIDIA GeForce ...  Off  | 00000000:01:00.0  On |                  N/A |
|  0%   39C    P8     7W / 185W |   1486MiB /  7979MiB |     19%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
+-----------------------------------------------------------------------------+
Other Notes:
Happy to provide additional information. I had planned to reinstall my machine with 21.04, but decided to postpone that by a day or two in case you'd like more information about the problem or have some advice.
Note that only versions after v1.8.0 of the NVIDIA Container Toolkit (including libnvidia-container1) support cgroupv2. Please install a more recent version and see if this addresses your issue.
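In practice, moving to a toolkit version with cgroup v2 support would look something like the following. This is a generic apt sketch, not an official instruction; it assumes a repository carrying the newer packages is already configured, and uses the package names mentioned in this issue:

sudo apt-get update
sudo apt-get install --only-upgrade nvidia-container-toolkit libnvidia-container1
sudo systemctl restart docker
apt policy nvidia-container-toolkit   # confirm the installed version meets the requirement above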
How did you upgrade to 21.10? (Fresh install / Upgrade)
Upgrade from 21.04 (it was actually quite accidental, in the sense that I was not aware 21.10 was still in beta :))
Related Application and/or Package Version (run apt policy $PACKAGE NAME):

Issue/Bug Description:
The package nvidia-container-toolkit was missing. Previously it was provided from [...]. I did have to try to get it from older releases with [...], but then I was getting the error shown at the top of this issue.
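For completeness, the version check requested by the template would look like this (package names are the ones referenced in this issue; the output depends on which repositories are enabled):

apt policy nvidia-container-toolkit
apt policy libnvidia-container1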