gpu_passthrough

Notes on setting up GPU passthrough on Proxmox.

Goal: Allow Jellyfin container (running inside a Virtual Machine) to use an Nvidia GPU for transcoding video.

Host: Optiplex 9020 SFF running Proxmox 8.0.2
VM: Debian 12 Bookworm
GPU: Nvidia Quadro P400

In Proxmox

Enable IOMMU on the host
Blacklist the GPU driver modules so the host does not use the GPU, or configure the host BIOS so it does not use it
Create the VM. Just make sure you select 'q35' for Machine Type and 'OVMF (UEFI)' for BIOS
Pass the GPU through as a PCI device (Hardware -> Add PCI Device)
Start the VM, enter its BIOS and disable Secure Boot (F2 -> Device Manager -> Secure Boot Configuration)
Go through and complete the Debian installation. Nothing special required here.
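
The first two host steps are only named above. Below is a partial sketch of what they might look like on a GRUB-booted Intel host. The kernel parameters (intel_iommu=on iommu=pt), the helper names, the blacklist file name and the list of blacklisted modules are all assumptions, not taken from the original setup. Hosts that boot through systemd-boot keep the kernel command line elsewhere, and AMD hosts differ.

```shell
#!/bin/sh
# Sketch only: helper names, file names and module list are hypothetical.

# Append IOMMU flags to GRUB_CMDLINE_LINUX_DEFAULT in the given file
# (normally /etc/default/grub). Does nothing if the flag is already present.
add_iommu_flags() {
    grub_file="$1"
    if grep -q 'intel_iommu=on' "$grub_file"; then
        return 0
    fi
    sed -i 's/^\(GRUB_CMDLINE_LINUX_DEFAULT="[^"]*\)"/\1 intel_iommu=on iommu=pt"/' "$grub_file"
}

# Write a modprobe blacklist so the host does not bind a driver to the GPU.
# $1 is the modprobe.d directory (normally /etc/modprobe.d).
write_gpu_blacklist() {
    conf_dir="$1"
    printf 'blacklist %s\n' nouveau nvidia nvidiafb > "$conf_dir/blacklist-gpu.conf"
}

# Assumed invocation on the Proxmox host, as root:
#   add_iommu_flags /etc/default/grub && update-grub
#   write_gpu_blacklist /etc/modprobe.d && update-initramfs -u
#   reboot
#   dmesg | grep -e DMAR -e IOMMU    # look for a line saying the IOMMU is enabled
```

Both helpers take the target path as an argument so they can be tried against a copy of the file before touching the real one.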

Inside the VM

Initial Steps

First check whether the VM detects the attached GPU at all. If not, look back over the previous steps.

root@media:~$ lspci | grep -i p400
01:00.0 VGA compatible controller: NVIDIA Corporation GP107GL [Quadro P400] (rev a1)
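
If you want to script this check rather than eyeball it, something like the following could work. It is a minimal sketch: the function name is made up, and matching on the "VGA compatible controller" / "3D controller" class strings is an assumption about how lspci labels the card.

```shell
#!/bin/sh
# Succeeds if the lspci output on stdin lists an NVIDIA display device.
# Hypothetical helper, not part of any package.
has_nvidia_gpu() {
    grep -i 'nvidia' | grep -Eqi 'vga compatible controller|3d controller'
}

# Usage inside the VM:
#   lspci | has_nvidia_gpu && echo "GPU visible" || echo "GPU missing"
```

Once the driver is installed later on, lspci -nnk should also show which kernel driver is bound to the device.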

Install some useful packages
apt install htop curl wget vim rsync sudo
Add your account to the sudo group
usermod -aG sudo conor
From your workstation, copy your SSH key to the VM
ssh-copy-id conor@<VM-IP-ADDRESS>

Once that's done you can jump off the Proxmox console and just SSH to the VM from your workstation.

Install Nvidia drivers

Install kernel headers

sudo apt install linux-headers-amd64

Make sure these repositories are present in /etc/apt/sources.list

deb http://deb.debian.org/debian/ bookworm main contrib non-free non-free-firmware
deb-src http://deb.debian.org/debian/ bookworm main contrib non-free non-free-firmware
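
A quick way to confirm the components are enabled before installing the driver might look like this. It is a sketch under the assumption that the VM uses the classic one-line format in /etc/apt/sources.list; the function name is hypothetical.

```shell
#!/bin/sh
# Succeeds if the given sources file has a "deb" line for bookworm that
# enables both the non-free and non-free-firmware components.
has_nonfree_components() {
    sources_file="$1"
    grep -E '^deb[[:space:]].*[[:space:]]bookworm[[:space:]]' "$sources_file" \
        | grep -E '[[:space:]]non-free([[:space:]]|$)' \
        | grep -Eq '[[:space:]]non-free-firmware([[:space:]]|$)'
}

# Usage inside the VM:
#   has_nonfree_components /etc/apt/sources.list || echo "non-free components missing"
```

Remember to run sudo apt update after editing the file so apt picks up the new components.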

Now install the nvidia driver from those repos

sudo apt install nvidia-driver firmware-misc-nonfree

Reboot the VM

sudo systemctl reboot

Once the VM is back up, check that the nvidia kernel module is installed with modinfo nvidia-current (lsmod | grep nvidia shows whether it is actually loaded). You should see:

conor@gputest2:~$ sudo modinfo nvidia-current | head -15
filename:       /lib/modules/6.1.0-10-amd64/updates/dkms/nvidia-current.ko
firmware:       nvidia/525.125.06/gsp_tu10x.bin
firmware:       nvidia/525.125.06/gsp_ad10x.bin
alias:          char-major-195-*
version:        525.125.06
supported:      external
license:        NVIDIA
srcversion:     B91689BE2C82F40FE520B73
alias:          pci:v000010DEd*sv*sd*bc06sc80i00*
alias:          pci:v000010DEd*sv*sd*bc03sc02i00*
alias:          pci:v000010DEd*sv*sd*bc03sc00i00*
depends:        drm
retpoline:      Y
name:           nvidia
vermagic:       6.1.0-10-amd64 SMP preempt mod_unload modversions

Install some more useful packages

sudo apt install nvtop nvidia-detect

Then run nvidia-smi to confirm the driver can talk to the GPU. You should see the card listed:

conor@gputest2:~$ nvidia-smi
Sun Aug  6 21:30:43 2023
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 525.125.06   Driver Version: 525.125.06   CUDA Version: 12.0     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  Quadro P400         On   | 00000000:01:00.0 Off |                  N/A |
| 34%   44C    P8    N/A /  30W |      1MiB /  2048MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+
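
The version in the nvidia-smi banner should match the version reported by modinfo earlier. A rough sketch for comparing the two is below; the helper names are made up, and the parsing assumes the banner layout shown above.

```shell
#!/bin/sh
# Extract the driver version from the nvidia-smi banner read on stdin.
# Hypothetical helper; relies on the "Driver Version: X" banner text.
driver_version() {
    sed -n 's/.*Driver Version: \([0-9.]*\).*/\1/p' | head -n 1
}

# Succeeds when the two version strings are non-empty and equal.
versions_match() {
    [ -n "$1" ] && [ "$1" = "$2" ]
}

# Usage inside the VM:
#   smi=$(nvidia-smi | driver_version)
#   mod=$(sudo modinfo -F version nvidia-current)
#   versions_match "$smi" "$mod" || echo "driver mismatch: '$smi' vs '$mod'"
```

If the two differ after a package upgrade, rebooting the VM so the new module is loaded is the first thing to try.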

At this point, if you don't care about running Jellyfin in a container, you could install Jellyfin from the system packages and use the options in the WebGUI to enable transcoding on the GPU.

Docker

Install Docker and docker-compose (the Docker engine package on Debian is docker.io)

sudo apt install docker.io docker-compose -y

Install the Nvidia container toolkit. This will allow for GPU support in Docker. Run these commands as root (for example from a sudo -i shell), since they write under /etc.

curl -s -L https://nvidia.github.io/nvidia-docker/gpgkey > /etc/apt/keyrings/nvidia-docker.key
curl -s -L https://nvidia.github.io/nvidia-docker/debian11/nvidia-docker.list > /etc/apt/sources.list.d/nvidia-docker.list
sed -i -e "s/^deb/deb \[signed-by=\/etc\/apt\/keyrings\/nvidia-docker.key\]/g" /etc/apt/sources.list.d/nvidia-docker.list
apt update
apt install nvidia-container-toolkit
systemctl restart docker

Install nvidia-docker2. It registers the nvidia runtime with Docker, which you need in order to set 'runtime: nvidia' in the compose file.

sudo apt install nvidia-docker2
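
To confirm that the runtime was registered, you could check Docker's daemon configuration. This is a sketch that assumes the package writes an "nvidia" entry under "runtimes" in /etc/docker/daemon.json; the function name is hypothetical and the check is a plain text match, not a JSON parse.

```shell
#!/bin/sh
# Succeeds if the Docker daemon config mentions an "nvidia" runtime.
# $1 is the config path; defaults to /etc/docker/daemon.json.
has_nvidia_runtime() {
    daemon_json="${1:-/etc/docker/daemon.json}"
    [ -f "$daemon_json" ] && grep -q '"nvidia"' "$daemon_json"
}

# Usage inside the VM:
#   has_nvidia_runtime || echo "nvidia runtime not configured"
#   sudo systemctl restart docker      # pick up any config change
#   sudo docker info | grep -i runtimes
```

The docker info line should list nvidia alongside the default runc runtime once Docker has been restarted.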

Before looking at Jellyfin, test whether you can pass the GPU through to a container:
sudo docker run --gpus all nvidia/cuda:12.1.1-runtime-ubuntu22.04 nvidia-smi
This spins up a container from the base CUDA image and runs nvidia-smi in it. You should see this:

conor@media:~$ sudo docker run --gpus all nvidia/cuda:12.1.1-runtime-ubuntu22.04 nvidia-smi

==========
== CUDA ==
==========

CUDA Version 12.1.1

Container image Copyright (c) 2016-2023, NVIDIA CORPORATION & AFFILIATES. All rights reserved.

This container image and its contents are governed by the NVIDIA Deep Learning Container License.
By pulling and using the container, you accept the terms and conditions of this license:
https://developer.nvidia.com/ngc/nvidia-deep-learning-container-license

A copy of this license is made available in this container at /NGC-DL-CONTAINER-LICENSE for your convenience.

Mon Aug 14 16:49:33 2023
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 525.125.06   Driver Version: 525.125.06   CUDA Version: 12.1     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  Quadro P400         On   | 00000000:01:00.0 Off |                  N/A |
| 34%   38C    P8    N/A /  30W |      1MiB /  2048MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+

Jellyfin

I used the image from lscr.io

My docker compose file looks like this: (should really use a volume for the configs but whatever)

---
version: "2.3" # runtime: needs compose file format 2.3 or later
services:
  jellyfin:
    image: lscr.io/linuxserver/jellyfin:latest
    container_name: jellyfin
    environment:
      - PUID=1000
      - PGID=1000
      - TZ=Etc/UTC
      - JELLYFIN_PublishedServerUrl=192.168.0.223 #optional
      - NVIDIA_VISIBLE_DEVICES=all
    volumes:
      - /opt/docker/configs/jellyfin:/config
      - /mnt/media:/media
      - /mnt/media_2:/media_2
    ports:
      - 8096:8096
      - 8920:8920 #optional
      - 7359:7359/udp #optional
      - 1900:1900/udp #optional
    restart: unless-stopped
    runtime: nvidia
Bring that up with docker-compose up and check that you can access the WebGUI at http://<VM-IP-ADDRESS>:8096
Once in Jellyfin, navigate to Dashboard -> Playback and enable hardware acceleration (Nvidia NVENC for this card). That's it, done!

Testing

Here is the VM playing a ~20GB 4K HEVC video file. CPUs are pinned to 100% (and it's loud!) (screenshot: jellyfin_no_gpu.png)

After enabling Hardware Acceleration, the CPUs are much less stressed and the GPU is handling the video transcoding. (screenshot: jellyfin_with_gpu.png)
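
You can also check from the command line while a transcode is running. The sketch below assumes that Jellyfin's ffmpeg process shows up in the process table of nvidia-smi on the VM and that rows in that table start with "|" followed by the GPU index; both are assumptions, and the function name is made up.

```shell
#!/bin/sh
# Print the nvidia-smi process-table rows on stdin that mention ffmpeg.
# Succeeds if at least one such row exists.
ffmpeg_on_gpu() {
    grep -E '^\|[[:space:]]+[0-9]+[[:space:]].*ffmpeg'
}

# Usage inside the VM, while a video is being transcoded:
#   nvidia-smi | ffmpeg_on_gpu || echo "no ffmpeg process on the GPU"
```

nvtop, installed earlier, gives a live view of the same information.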

Troubleshooting

conor@gputest2:~$ nvidia-smi
NVIDIA-SMI has failed because it couldn't communicate with the NVIDIA driver

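Make sure you definitely disabled Secure Boot on the VM as described above. With Secure Boot enabled, the kernel will typically refuse to load the unsigned nvidia module that DKMS built.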
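
To check the Secure Boot state from inside the VM without rebooting into the firmware menu, something like this might help. It assumes the mokutil package is installed (it may not be by default) and that mokutil --sb-state prints "SecureBoot enabled" or "SecureBoot disabled"; the function name is hypothetical.

```shell
#!/bin/sh
# Succeeds if the mokutil --sb-state output on stdin reports Secure Boot as on.
secure_boot_enabled() {
    grep -qi 'SecureBoot enabled'
}

# Usage inside the VM:
#   mokutil --sb-state | secure_boot_enabled && echo "Secure Boot is still on"
#   sudo dmesg | grep -i nvidia     # look for module load errors
```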
