pycuda defaults to asking nvcc to use the maximum compute capability available on the GPU. This fails if the version of CUDA doesn't support the compute capability. For instance, if you're trying to use a GTX 1080 on CUDA 7.5 you get error messages like:
ExecError: error invoking 'nvcc --preprocess -arch sm_61 -Ifile.cu --compiler-options -P': [Errno 2] No such file or directory
The solution seems to be to use the highest compute capability supported by both the installed CUDA toolkit and the card, but I'm not sure of the best way to do that.
Is there an easy way to determine what the maximum supported compute capability of the linked version of CUDA is? Seems like we want to use an arch which is min(max supported by CUDA, max supported by device).
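One way to implement that min() would be a small lookup from toolkit version to the highest compute capability it can target. This is only a sketch: the mapping below is an assumption for illustration (consult the CUDA release notes for authoritative values), and `pick_arch` is a hypothetical helper, not pycuda API.

```python
# Illustrative mapping: highest compute capability each CUDA toolkit
# release can target. These values are assumptions for the sketch;
# check the CUDA release notes before relying on them.
MAX_CC_FOR_CUDA = {
    (7, 5): (5, 3),
    (8, 0): (6, 2),
    (9, 0): (7, 0),
}

def pick_arch(device_cc, cuda_version):
    """Return an nvcc arch string: min(device CC, toolkit's max CC).

    device_cc and cuda_version are (major, minor) tuples; tuple
    comparison gives the elementwise min we want.
    """
    toolkit_max = MAX_CC_FOR_CUDA.get(cuda_version, device_cc)
    cc = min(device_cc, toolkit_max)
    return "sm_%d%d" % cc

# A GTX 1080 (CC 6.1) on CUDA 7.5 would fall back to sm_53
print(pick_arch((6, 1), (7, 5)))  # -> sm_53
```

On a real system, the device tuple would come from `pycuda.driver.Device.compute_capability()` and the toolkit version from parsing `nvcc --version`.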
Having an environment variable like PYCUDA_DEFAULT_JIT_ARCH would be very useful.
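A minimal sketch of how such an override could be consulted. Note that PYCUDA_DEFAULT_JIT_ARCH is the name proposed in this thread; pycuda does not currently read any such variable.

```python
import os

def default_arch(fallback):
    """Let an environment variable override the arch pycuda would pick.

    PYCUDA_DEFAULT_JIT_ARCH is the variable proposed in this thread,
    not an existing pycuda feature.
    """
    return os.environ.get("PYCUDA_DEFAULT_JIT_ARCH", fallback)

os.environ["PYCUDA_DEFAULT_JIT_ARCH"] = "sm_52"
print(default_arch("sm_61"))  # prints sm_52
```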
For custom kernels you can indeed use the arch argument, but this is not possible for ElementwiseKernel or ReductionKernel (and I guess the parallel-scan kernels too, though I do not use those).
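For reference, passing an explicit arch to a custom kernel looks like the sketch below. `SourceModule` does accept an `arch` keyword; `compile_with_arch` and `cc_to_arch` are hypothetical helpers, and actually compiling of course requires a CUDA-capable GPU and a working nvcc.

```python
def cc_to_arch(cc):
    """Format a (major, minor) compute-capability tuple as an nvcc arch."""
    return "sm_%d%d" % cc

def compile_with_arch(source, arch):
    """Compile a custom kernel with an explicit arch instead of
    pycuda's default (the device's maximum compute capability).

    Needs a GPU and nvcc at call time, so the imports live here.
    """
    import pycuda.autoinit  # noqa: F401  (creates a context)
    from pycuda.compiler import SourceModule
    return SourceModule(source, arch=arch)

# e.g. compile_with_arch("__global__ void noop() {}", cc_to_arch((5, 2)))
```

ElementwiseKernel and ReductionKernel build their source internally and expose no equivalent keyword, which is why a global override (environment variable or toolkit-aware default) would help there.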