High CPU usage with cudaowl #13
Comments
The CUDA backend is no longer supported, sorry.
Closed
Hi,
I'm running cudaowl on Arch Linux with Cuda 9.20, on a GTX 960. The CPU usage of cudaowl stays close to 100% constantly.
I found that the CPU usage can be reduced significantly, to 2.6%, by adding a `cudaDeviceSynchronize();` call in CudaGpu.h, line 225. This is at the end of the for loop in `modSqLoop()`. I guess this has something to do with CUDA busy-waiting for the kernels to finish. With an explicit synchronization call, the CPU thread goes to sleep instead (you already set the `cudaDeviceScheduleBlockingSync` flag).
The synchronization has a small performance impact, however: time per iteration increases from 18.85 ms to 19.10 ms (about 1.3%). On the other hand, the power consumption of the whole computer drops by about 30 W (no other significant CPU load), so the trade-off seems worth it for me. I'm testing M(90000881), FFT 4860K, 18.08 bits/word.
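To make the change concrete, here is a minimal standalone sketch of the pattern, not the actual cudaowl code. The kernel `squareStep`, the buffer size, and the iteration count are hypothetical placeholders; only `cudaSetDeviceFlags`, `cudaDeviceScheduleBlockingSync`, and `cudaDeviceSynchronize` are the real CUDA runtime names discussed above. It assumes the flag is set before any call that initializes the device.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Report a CUDA runtime error and bail out of main().
#define CHECK(call)                                                   \
    do {                                                              \
        cudaError_t err = (call);                                     \
        if (err != cudaSuccess) {                                     \
            fprintf(stderr, "%s:%d: %s\n", __FILE__, __LINE__,        \
                    cudaGetErrorString(err));                         \
            return 1;                                                 \
        }                                                             \
    } while (0)

// Hypothetical stand-in for one squaring step; not cudaowl's real kernel.
__global__ void squareStep(double *data, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] = data[i] * data[i];
}

int main() {
    // Ask the runtime to block the host thread on synchronization
    // instead of spinning. Set this before the device is initialized.
    CHECK(cudaSetDeviceFlags(cudaDeviceScheduleBlockingSync));

    const int n = 1 << 20;          // placeholder problem size
    const int iterations = 1000;    // placeholder loop length
    double *d = nullptr;
    CHECK(cudaMalloc(&d, n * sizeof(double)));
    CHECK(cudaMemset(d, 0, n * sizeof(double)));

    for (int it = 0; it < iterations; ++it) {
        squareStep<<<(n + 255) / 256, 256>>>(d, n);
        // Explicit sync at the end of each loop iteration: with the
        // blocking-sync flag set, the host thread can sleep here.
        CHECK(cudaDeviceSynchronize());
    }

    CHECK(cudaFree(d));
    return 0;
}
```

Synchronizing on every iteration keeps the host from queuing work ahead of the GPU, which is the likely source of the small per-iteration slowdown reported above. Synchronizing every few iterations instead might recover some of that time, but I have not measured it.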
Greetings,
Fredrik