Describe the bug
GPUs not connecting to VM. The NVIDIA RTX A-Series GPUs in the Flemingsburg zone do not attach to the VMs smoothly. The other GPUs, namely the NVIDIA Quadro K420 and the NVIDIA GRID K1, are recognised by the UNIX system as PCI devices, but `nvidia-smi` does not recognise them due to a signing-key mismatch.
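A signing-key mismatch on Ubuntu is commonly caused by Secure Boot rejecting kernel modules that are unsigned or signed with an unenrolled key. This is an assumption about the cause, not something confirmed by the provider, but it is quick to check:

```shell
#!/bin/sh
# Hypothetical check: see whether Secure Boot is enforcing module signatures,
# which would explain nvidia-smi failing while the packages are installed.
if command -v mokutil >/dev/null 2>&1; then
  mokutil --sb-state            # prints e.g. "SecureBoot enabled"
else
  # mokutil may not be installed; the EFI directory is another hint
  echo "mokutil not installed; check for /sys/firmware/efi instead"
fi
```

If Secure Boot is enabled, the usual remedies are enrolling a Machine Owner Key for the driver modules or disabling Secure Boot in the VM firmware.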
To Reproduce
Steps to reproduce the behaviour:
- Create an Ubuntu VM in the Flemingsburg zone in the Gold tier.
- Attach any of the available GPUs (except the NVIDIA Quadro RTX 5000; this was not available during my testing, so its behaviour is inconclusive).
- After installing the latest production drivers, run the `nvidia-smi` command to see a failed output.
- Run `lspci | grep -i vga` and `lspci | grep -i nvidia`, which list the connected GPUs in the K420 and K1 cases but return empty output in the A-Series cases.
- Run `dmesg | grep -i nvidia` to see the exact details of the errors.
- Run `dpkg -l | grep -i nvidia` to see that the driver packages are installed, while `lsmod | grep -i nvidia` gives blank output, signalling that the kernel modules are not loaded. Manually loading the modules also fails.
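The diagnostic steps above can be collected into one script. This is a sketch: each command is guarded so the script completes even where a tool is missing or needs root, and the expected failures noted in the comments reflect this report, not guaranteed output.

```shell
#!/bin/sh
# Consolidated GPU diagnostics for the affected VMs (sketch).
echo "== nvidia-smi =="
nvidia-smi 2>&1 || echo "nvidia-smi failed (as reported on the affected VMs)"

echo "== PCI devices =="
# K420/K1 should appear here; A-Series reportedly does not
lspci 2>/dev/null | grep -i -e vga -e nvidia || echo "no NVIDIA PCI device visible"

echo "== kernel messages =="
dmesg 2>/dev/null | grep -i nvidia || echo "no NVIDIA messages (dmesg may need root)"

echo "== installed packages vs loaded modules =="
dpkg -l 2>/dev/null | grep -i nvidia || echo "no NVIDIA packages found"
lsmod 2>/dev/null | grep -i nvidia || echo "no NVIDIA modules loaded"
```

Attaching the output of this script to the issue would let the provider compare the A-Series and K420/K1 cases side by side.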
Expected behavior
Leasing a GPU through the web interface, installing the drivers, and rebooting the system should attach the GPU without issue. I understand there may be a scope of access for each GPU; if so, the web interface or the documentation should make clear which tier has access to which GPUs.
System configuration:
- OS: Ubuntu 24.04 LTS (noble)
- Kernel version: 6.8.0-36-generic