Tesla P4 Not Used #2132
What is the output of …?
As I said in another issue, I have of course installed the SDK runtimes (and published a link there)! I think the problem is that GPT4All uses either a CPU or a GPU but not both (why not, if the RAM is small? Admittedly it is difficult to program such code), and not 'either of them AND a CUDA card' without a GPU.
I need the output of vulkaninfo --summary. Btw, CUDA is irrelevant here, as GPT4All does not use it. It uses Vulkan as a compute backend. Tesla cards are GPUs and can do graphics, they just don't have any video outputs. But GPT4All does not care whether the card has any video outputs - I use my Tesla P40 with GPT4All and it works just fine.
I have attached the output of vulkaninfo --summary > d:\sum2.txt.
So, this is not a GPT4All issue. Your installed NVIDIA driver is not providing the Vulkan API for your Tesla P4. One of these might do it: I'm using Linux, and the standard NVIDIA proprietary driver supports Vulkan on my Tesla P40.
No: I can't install the desktop driver because the P4 is not a desktop GPU card. I had installed the datacenter driver, and have now installed your proposed new version - with success - and then also the newest Vulkan runtime (Vulkan is now 1.3.275, though it is not documented as a requirement for GPT4All).
You have to switch the driver mode from TCC to WDDM to be able to use graphics APIs.
So, there's your answer (thanks sorasoras) - you can only use the Tesla P4 with GPT4All on Windows if you have a GRID license, and use nvidia-smi to switch to WDDM mode: https://docs.nvidia.com/nsight-visual-studio-edition/reference/index.html#setting-tcc-mode-for-tesla-products This is a limitation of Tesla devices on Windows. Unless GPT4All adds support for llama.cpp's CUDA backend, or you downgrade to an older driver that doesn't require a GRID license, there is no way around this.
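As a rough sketch of that switch (assuming the Tesla P4 is GPU index 0, an elevated command prompt, and that the licensing requirement above is met; a reboot is required afterwards):

    rem switch GPU 0 from TCC to WDDM (0 = WDDM, 1 = TCC), then reboot
    nvidia-smi -i 0 -dm 0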
Tesla CUDA devices like the P4 improve gaming speed on Windows without a special GRID license.
For the last time, the Tesla P4 is a fully featured GPU (minus the physical display connectors), not just a "CUDA card"; NVIDIA just treats Tesla GPUs differently from e.g. GeForce and Quadro on Windows. You should not need a vGPU, only WDDM mode, so that you can use Vulkan for compute. GPT4All does not use CUDA at all at the moment. I found this guide, maybe it helps.
Yes - but the desktop driver doesn't recognize the P4 and similar cards as a GPU. The P4 and the others are not GRID cards, which would require a license.
To finalize this discussion: with GRID driver 472.39 (the 500+ versions do not work), the NVIDIA WDDM driver is installed and the card appears in the dropdown. If I choose the P4, I get an 'out of VRAM' error (as expected). It seems that GPT4All uses either the CPU, or the on-chip GPU, or the P4 card, or a graphics card - not all of them together (as LM Studio does, which is therefore faster). This should be the next point of development (from my point of view). If LM Studio develops a LocalDocs extension, then you will have a problem.
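For reference, one way to confirm the active driver model and the total VRAM from a command prompt (assuming a reasonably recent nvidia-smi) is:

    rem show the card name, the driver model currently in use, and the total VRAM
    nvidia-smi --query-gpu=name,driver_model.current,memory.total --format=csv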
I believe that LM Studio uses the llama.cpp CUDA backend. The fastest way to use llama.cpp is always with a single GPU - the fastest one available. And your integrated Intel GPU certainly isn't supported by the CUDA backend, so LM Studio can't use it. Since the CPU is universally slower than the GPU for LLMs, you should only split computation between CPU and GPU if you would otherwise run out of VRAM - you can do this in GPT4All by adjusting the per-model "GPU layers" setting. The main reason that LM Studio would be faster than GPT4All when fully offloading is that the kernels in the llama.cpp CUDA backend are better optimized than the kernels in the Nomic Vulkan backend. This is something we intend to work on, but there are higher priorities at the moment.
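As an illustration of the same partial-offload idea using llama.cpp's own command-line tool rather than GPT4All's settings UI (the binary name and model path below are placeholders, not something from this thread):

    rem offload only the first 20 transformer layers to the GPU and keep the rest on the CPU
    llama-cli -m d:\models\model.gguf -ngl 20 -p "Hello"
    rem a large value such as -ngl 99 offloads all layers if VRAM allows; -ngl 0 stays CPU-only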
Originally posted by @gtbu in #1843 (comment)