Since kernels can now call other kernels, I wanted to try calling cublas_v2 routines from inside my kernels, but I can't figure out how to get this to compile.
Using this SourceModule definition:
kernel_mod = SourceModule("""
#include <cublas_v2.h>
__global__ void deviceReduceCublas(double *in, double *out, int N) {
    cublasHandle_t cnpHandle;
    cublasStatus_t status = cublasCreate(&cnpHandle);
    status = cublasDasum(cnpHandle, N, in, 1, out);
    cublasDestroy(cnpHandle);
}""",
    options=['-lcublas', '-lcublas_device', '-lcudadevrt'],
    include_dirs=['/opt/cuda/include', '/opt/cuda/lib64'])
I get the error:
Traceback (most recent call last):
  File "chaotic_neural.py", line 129, in <module>
    include_dirs=['/opt/cuda/include', '/opt/cuda/lib64'])
  File "/usr/lib/python3.6/site-packages/pycuda/compiler.py", line 265, in __init__
    arch, code, cache_dir, include_dirs)
  File "/usr/lib/python3.6/site-packages/pycuda/compiler.py", line 255, in compile
    return compile_plain(source, options, keep, nvcc, cache_dir, target)
  File "/usr/lib/python3.6/site-packages/pycuda/compiler.py", line 137, in compile_plain
    stderr=stderr.decode("utf-8", "replace"))
pycuda.driver.CompileError: nvcc compilation of /tmp/tmp9po2la3f/kernel.cu failed
[command: nvcc --cubin -lcublas -lcublas_device -lcudadevrt -arch sm_50 -I/opt/cuda/include -I/opt/cuda/lib64 -I/usr/lib/python3.6/site-packages/pycuda/cuda kernel.cu]
[stderr:
ptxas fatal : Unresolved extern function 'cublasCreate_v2'
]
Any ideas how to get that working?
There were two problems with the above. 1) Nothing was linked against cuBLAS; adding -dlink fixes that. 2) After that, a lot of "conflicting declaration of C function" errors popped up. SourceModule wraps the whole source in an extern "C" block by default, so the #include ended up inside it. The fix is to pass no_extern_c=True, keep the #include outside the block, and write the extern "C" block around the kernel yourself:
kernel_mod = SourceModule("""
#include "cublas_v2.h"
extern "C" {
__global__ void deviceReduceCublas(double *in, double *out, int N) {
    cublasHandle_t cnpHandle;
    cublasStatus_t status = cublasCreate(&cnpHandle);
    status = cublasDasum(cnpHandle, N, in, 1, out);
    cublasDestroy(cnpHandle);
}
}""",
    options=['-lcublas', '-lcublas_device', '-lcudadevrt', '-dlink'],
    no_extern_c=True)
However, now a new problem appeared; is this related to name mangling?
% optirun python chaotic_neural.py
Traceback (most recent call last):
  File "chaotic_neural.py", line 130, in <module>
    options=['-lcublas', '-lcublas_device', '-lcudadevrt', '-dlink'], no_extern_c=True, keep=True)
  File "/usr/lib/python3.6/site-packages/pycuda/compiler.py", line 268, in __init__
    self.module = module_from_buffer(cubin)
pycuda._driver.LogicError: cuModuleLoadDataEx failed: named symbol not found -
It further turns out that the kernel above can be compiled into a cubin file by hand with nvcc, and that module then loads into PyCUDA without problems. I wonder why PyCUDA can't do that by itself?
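For anyone who wants to try the by-hand route, here is a rough sketch of what it could look like. The exact nvcc invocation is not given in this thread, so the flags below simply mirror the options already used above plus nvcc's -o; the file names kernel.cu and kernel.cubin and both helper functions are made up for illustration. Loading goes through pycuda.driver.module_from_file.

```python
import subprocess


def build_nvcc_command(source="kernel.cu", output="kernel.cubin", arch="sm_50"):
    """Return an nvcc argument list (assumed; mirrors the options used above)."""
    return [
        "nvcc", "--cubin", "-dlink",
        "-arch", arch,
        "-I/opt/cuda/include",
        "-lcublas", "-lcublas_device", "-lcudadevrt",
        "-o", output,   # hypothetical output file name
        source,         # hypothetical source file holding the kernel
    ]


def load_kernel(cubin_path="kernel.cubin", name="deviceReduceCublas"):
    """Load a precompiled cubin and look up one kernel by name."""
    # Imported lazily so the command builder can be used without a GPU.
    import pycuda.autoinit  # noqa: F401  (creates a CUDA context)
    import pycuda.driver as cuda

    module = cuda.module_from_file(cubin_path)
    return module.get_function(name)


def compile_and_load():
    """Run nvcc by hand, then load the resulting cubin into PyCUDA."""
    subprocess.run(build_nvcc_command(), check=True)
    return load_kernel()
```

Calling compile_and_load() needs nvcc on the PATH and a working CUDA device. The kernel name only resolves if the kernel sits inside an extern "C" block, as in the second SourceModule definition above.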