[BUG] Unable to enable the GPU's via the GUI #5793
Comments
Could you please confirm that the nvidia driver addon is enabled?
Are the |
Hi, I'm not sure how I can check this.
the From that screenshot it looks like this has not been edited, so no real driver has been installed on the underlying hosts.
Oh OK, that makes sense then. The host is not currently connected to the internet (it sits behind a proxy, and I am trying to locate all the URLs to whitelist in our proxy). Will internet access resolve this, or do I still need to find a location
The HTTP endpoint is supposed to be an internal HTTP server where you can host the drivers. You will need access to the NVIDIA portal to download the NVIDIA KVM drivers. These are different from the open-source drivers. Please refer to the docs: https://docs.harvesterhci.io/v1.3/advanced/addons/nvidiadrivertoolkit
OK, I downloaded the latest KVM drivers, put them on an internal web server, and updated it, but still no luck. I tested that I can hit the URL from my PC and the file starts downloading. I can also ping the host from the Harvester host.
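A ping only shows that the web server's host is up, and a download from a PC does not prove that the Harvester nodes can fetch the file over HTTP. The following is a minimal sketch of an HTTP-level check, assuming Python 3 is available on a machine in the same network segment as the nodes. The function name `check_driver_url` and the example URL are hypothetical and not part of Harvester. The sketch deliberately bypasses any proxy configured in the environment, on the assumption that the internal server should be reached directly.

```python
import urllib.error
import urllib.request
from typing import Tuple


def check_driver_url(url: str, timeout: float = 5.0) -> Tuple[bool, str]:
    """Send a HEAD request and report whether the server answered with 2xx.

    Hypothetical helper for illustration; it is not a Harvester tool.
    """
    # An empty ProxyHandler ignores http_proxy/https_proxy from the
    # environment (assumption: the internal server is reached directly).
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    request = urllib.request.Request(url, method="HEAD")
    try:
        with opener.open(request, timeout=timeout) as response:
            return 200 <= response.status < 300, f"HTTP {response.status}"
    except urllib.error.HTTPError as exc:
        # The server was reached but refused the request (404, 403, ...).
        return False, f"HTTP {exc.code}"
    except (urllib.error.URLError, OSError) as exc:
        # DNS failure, connection refused, timeout, and similar errors.
        return False, f"unreachable: {exc}"


if __name__ == "__main__":
    # Placeholder URL: replace with the location configured in the addon.
    print(check_driver_url("http://drivers.internal.example/nvidia-kvm.run"))
```

A result of `(False, "HTTP 404")` would point at the path on the web server, while an `unreachable` result would point at the network or DNS.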
Any chance I could have a support bundle to figure out what is going on? There would be messages in the nvidia driver toolkit container / pcidevices that would provide insight into what is happening.
Sure. What is the best way to provide it to you securely?
Please email the bundle to harvester-support-bundle@suse.com
Sent. Thank you.
The nvidia-driver-runtime image cannot be pulled by your nodes
This image is not shipped in the ISO and needs to be pulled into your private registry if your nodes do not have access to Docker Hub. Once the image is available, please update the image details in the addon to point to your private registry. This is mentioned in the docs: https://docs.harvesterhci.io/v1.3/advanced/addons/nvidiadrivertoolkit
Oh OK, I didn't realise that. I thought I just had to download the driver and host it on a web server, which I have done. Do I need to set up this private registry as well? Do I just deploy SUSE MicroOS and set it up as a private registry? Thanks
The private registry is a container registry. I do not think MicroOS contains a registry of its own. You could use something like goharbor to get started with a private registry.
Thanks for that, we will look into it. Interestingly, I whitelisted all the domains the system was trying to reach in our proxy and then configured the proxy in the Harvester UI, but still no luck. It should be able to get out now.
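To illustrate what "point to your private registry" means for an image reference, here is a small sketch that rewrites a reference so it names a private registry host instead of the default one. The function `retarget_image`, the registry host, and the repository name in the example are hypothetical. The actual image name and tag should be taken from the addon configuration and the linked docs.

```python
def retarget_image(image: str, registry: str) -> str:
    """Return the image reference rewritten to live under `registry`.

    Hypothetical helper for illustration. It follows the common container
    convention that the first path component is a registry host only if it
    contains a dot or a colon, or is exactly "localhost".
    """
    first, sep, rest = image.partition("/")
    has_registry = bool(sep) and ("." in first or ":" in first or first == "localhost")
    # Drop the existing registry host if there is one; keep repository and tag.
    path = rest if has_registry else image
    return f"{registry.rstrip('/')}/{path}"


if __name__ == "__main__":
    # Placeholder names: use the real image and your own registry host.
    print(retarget_image("example/nvidia-driver-runtime:v1",
                         "registry.internal.example:5000"))
```

The rewritten reference is what would go into the addon's image details after the image has been pushed to that registry.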
are you able to ssh to all your nodes and just run if the nodes can pull this image then the addon should work |
Describe the bug
Unable to enable the GPUs that are installed in the server from the UI.
Two GPUs are listed (2x NVIDIA L40); however, selecting one and clicking enable does nothing.
To Reproduce
Go to SR-IOV GPU Devices, select a listed GPU, and click the 3 dots to enable it.
Expected behavior
GPU should enable
Support bundle
Please reach out to request one securely
Environment
Additional context
Add any other context about the problem here.