Requested float16 compute type, but the target device or backend do not support efficient float16 computation. #42
Your GPU does not support FP16 execution. You can set […]
With […]
Can you try with […]?
Since your GPU does not support float16, you should set `"int8"` and not `"int8_float16"`.
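The compute-type advice in this thread can be sketched as a small helper (the function name and the downgrade table are mine, not part of the faster-whisper or CTranslate2 API):

```python
def effective_compute_type(requested: str, gpu_supports_fp16: bool) -> str:
    """Downgrade a requested compute type for a GPU without efficient FP16.

    Encodes the advice from this thread: on such a GPU, request "int8"
    instead of "int8_float16" (and plain "float16" likewise falls back
    to "int8").
    """
    if gpu_supports_fp16:
        return requested
    downgrade = {"int8_float16": "int8", "float16": "int8"}
    return downgrade.get(requested, requested)
```

For example, `effective_compute_type("int8_float16", gpu_supports_fp16=False)` returns `"int8"`, which is the value to pass as `compute_type` to `WhisperModel`.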
Now I received this error:

```
NVIDIA System Information report created on: 03/15/2023 09:45:48
[Display]
nvui.dll  8.17.15.3114  NVIDIA User Experience Driver Component
```
Did you install the required NVIDIA libraries as indicated in the README?
That didn't work. I checked the cuDNN DLLs as suggested by NVIDIA, and the CUDA Toolkit is version 12.1. No luck. Maybe the CUDA version is the problem, I don't know.
You need to install CUDA 11.x (not 12.x). Also make sure to configure the PATH environment variable accordingly.
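A quick way to check the PATH part of this advice from Python (a diagnostic sketch using only the standard library; the keyword matching is a heuristic of mine, not anything official):

```python
import os

def cuda_dirs_on_path() -> list[str]:
    """Return PATH entries that look like CUDA or cuDNN bin directories.

    After installing CUDA 11.x and cuDNN, their bin folders should show
    up here; an empty result suggests PATH was not updated.
    """
    entries = os.environ.get("PATH", "").split(os.pathsep)
    keywords = ("cuda", "cudnn")
    return [entry for entry in entries if any(k in entry.lower() for k in keywords)]
```

On Windows the matching entries would typically be something like `C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.x\bin` plus wherever you unpacked the cuDNN `bin` folder.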
I just want to mention that CUDA 11.2 has a bug where int8 did not work correctly on my RTX 2070 GPU. I use 11.1, which works in my case.
I'm closing this issue. The initial error has been explained, and there are other useful threads about Windows installation. See for example #85.
I am facing the same issue. I am using an NVIDIA MX350 (Pascal). My understanding is that int8 computation only works with the latest architecture (Ada Lovelace), but it seems to work with my graphics card as well. As per the information on Wikipedia about the Pascal and Ada Lovelace architectures, my graphics card should only support up to FP16 ('float16' in CTranslate2 terms) and not int8, but the opposite is true. @guillaumekln could you please explain how this is possible?
INT8 computation works on GPUs with Compute Capability 6.1 and above. Your GPU probably has CC 6.1, so it is compatible with this mode. Maybe you are thinking about FP8, which indeed requires the Ada architecture? Regarding FP16, your GPU could support it, but it does not have Tensor Cores. Currently we disable FP16 without Tensor Cores, as it has worse performance than FP32 in my experience. You can override this behavior by setting the environment variable […]
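The capability rules stated above can be written down as a tiny lookup (a sketch; the function and dictionary keys are my naming, and the FP8 threshold assumes Ada's Compute Capability of 8.9):

```python
def supported_modes(cc: tuple[int, int]) -> dict[str, bool]:
    """Map a CUDA Compute Capability (major, minor) to supported modes.

    Per the explanation above: INT8 needs CC >= 6.1, FP16 via Tensor
    Cores needs CC >= 7.0, and FP8 needs the Ada architecture
    (CC 8.9 here, an assumption on my part).
    """
    return {
        "int8": cc >= (6, 1),
        "float16_tensor_cores": cc >= (7, 0),
        "fp8": cc >= (8, 9),
    }
```

A Pascal card such as the MX350 or P40 (CC 6.1) then gets int8 but no Tensor-Core FP16, matching the behavior reported in this thread.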
Yes, I think I misread it in some documentation and was thinking that int8 was the addition in the new architecture, not FP8. That didn't make much sense to me, but I accepted it "as is" since this is not my active field of research. It's making more sense now. Thanks for the clarification!
Also not sure why the P40 is reported as not supporting FP16 when the datasheets for the GPU indicate that it definitely does; I needed to set the allow flag for it to use FP16. Will post benchmarks in a bit for FP32 vs. FP16 (with the forcing flag on). FP32 test on a ~45 min file, Tesla P40, batch size 16: […]
FP16 test on the same file, Tesla P40, batch size 16, environment variable set: […]
Much lower memory pressure as well. Transcription was the same quality; speed/performance about the same?
As explained above, FP16 is only enabled for GPUs with Tensor Cores (Compute Capability 7.0 and above). You can set the environment variable to bypass this check. int8 should work on the P40, which has Compute Capability 6.1. What error do you get?
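For reference, the override mentioned above is set before launching your script. In the CTranslate2 versions I have used, the variable is `CT2_CUDA_ALLOW_FP16`, but verify the name against the CTranslate2 documentation for your version:

```shell
# Assumption: the override flag is CT2_CUDA_ALLOW_FP16 (check your CTranslate2 docs).
# Force-enable FP16 on a GPU without Tensor Cores, e.g. a Tesla P40:
export CT2_CUDA_ALLOW_FP16=1
```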
I was getting a similar error when trying to run faster_whisper with my GPU and was able to figure out a solution, which I'll write here. (I was able to run it just fine with my CPU, but it's so much slower.) Error: whenever I would try to iterate over the segments using my GPU, I'd get the following error: […]
I downloaded the following CUDA Toolkit, cuDNN, and Zlib versions: […]
CUDA is just an .exe install, and I updated my Windows PATH variable with the cuDNN/bin folder and zlibwapi.dll/.lib (I don't think I need the .lib, but I'm covering my bases there). After all of this I was still getting the same error, then I ran across the `torch` package, and when I ran it I had to […]
And that solved my issue. I'm now able to run faster_whisper on my GPU. I'm still getting an error when it finishes/unloads the model, but that's a different issue that's already open in another thread: #85.
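Since the fix above went through the `torch` package, its view of the GPU is a handy way to confirm the setup (a diagnostic sketch; the function name is mine, and it degrades gracefully if torch is absent):

```python
def describe_gpu() -> str:
    """Summarize CUDA availability as seen by torch, if it is installed."""
    try:
        import torch
    except ImportError:
        return "torch not installed"
    if not torch.cuda.is_available():
        return "torch installed, but CUDA not available"
    major, minor = torch.cuda.get_device_capability()
    return f"CUDA available, compute capability {major}.{minor}"
```

On a working setup like the one described above, this should report CUDA as available; "CUDA not available" points back at the driver/toolkit/PATH steps.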
I recently tried this wonderful tool on the CPU of my Windows 10 machine and got quite good results. But when I tried on the GPU via

```python
model = WhisperModel(model_path, device="cuda", compute_type="float16")
```

I received the following error:

```
Requested float16 compute type, but the target device or backend do not support efficient float16 computation.
```

I have a GTX 1050 Ti and the main driver is 31.0.15.1694. How can I fix this error and run on my GPU card?