Support for curand #43

mondus · 2020-02-25T17:33:41Z

I have been making lots of good progress with jitify. Thanks for the excellent tool. One issue which I am currently unable to resolve however is the use of curand.

If I include curand in my jitify kernel (e.g. #include <curand_kernel.h>) and correctly set the compiler to add the cuda include directory (from CUDA_PATH) then there are a whole bunch of errors from cuda.h relating to ambiguous definitions of size_t

e.g.

cuda.h(1773): error: "size_t" is ambiguous

The text was updated successfully, but these errors were encountered:

benbarsdell · 2020-02-26T00:01:34Z

Thanks for the feedback!

I tried to reproduce the problem but ran into a different issue related to missing headers (which I'll submit a patch for).

Could you tell me what CUDA version you're using and provide a minimal reproducer?
*EDIT: Could you also provide what platform you're running on.

mondus · 2020-02-26T15:06:16Z

Hi @benbarsdell. Thanks for getting back to me. I saw your talk with Kate at GTC when you first introduced jitify and knew I would need it at some point!

I am using CUDA 10.1 on windows using dlfcn-win32 (see #42). A minimal example is to add the line

"#include <curand_kernel.h>\n"

at line 159

Note: This actually works on linux no problem. To reproduce on windows requires building with dlfcn-win32, setting a pre-porcessor macro of NOMINMAX , updating the hard coded include directory and pre-processing (with stringify) the my_header3.cuh file to generate to stringified version. I also had to create a disk version of my_header4.cuh as the callback function didn't seem to work.

I have noticed that one of the later tests also fail on windows. E.g. line 266 fails. Looking at the get_constant_ptr and demangle functions this is because it is expecting the name mangling to be different in visual studio. My windows ptx has the same mangling as in linux however. E.g.

.const .align 4 .u32 a;
.const .align 4 .u32 _ZN1b1aE;
.const .align 4 .u32 _ZN1c1b1aE;

Hence the lookup fails.

maddyscientist · 2020-02-26T17:40:03Z

That's interesting that you are not seeing the expected name mangling on Windows. Which compiler are you using? I guess we need to fix up the demangle detection for this.

mondus · 2020-02-26T18:05:27Z

Hi @maddyscientist it is Visual Studio 2019 with CUDA 10.1. I noticed that the demangle detection is easy to fix by changing a macro guard but the actual demangling would require cxxabi which is not so easy to come by on windows...

benbarsdell · 2020-02-27T10:18:21Z

I was able to reproduce your issues in Visual Studio 2019 with CUDA 10.2, and I have fixes for them. I'll put them into a PR probably tomorrow. I'll also add workarounds for the NOMINMAX and dlfcn problems to avoid those annoyances.

The size_t issue is because jitify provides a definition of it in a built-in header, but NVRTC already provides its own built-in definition, and on Windows these definitions conflict. The fix is to remove these lines:

jitify/jitify.hpp

Lines 1578 to 1579 in 8af928e

    
           "typedef unsigned long size_t;\n" 
        
           "typedef   signed long ptrdiff_t;\n"

The name mangling issue is because CUDA (PTX) always uses the Itanium mangling scheme even when compiling with Visual Studio. I was able to implement a simple demangler for variable names that avoids needing cxxabi.

With these fixes, all the tests in jitify_example.cpp pass. I didn't see anything related to the callbacks; maybe it was some kind of current-working-directory problem?

mondus · 2020-02-27T10:42:57Z

@benbarsdell Fantastic. Looking forward to looking at your Itanium demangler code! I will take another look at the callbacks issue. Probably user error.

benbarsdell · 2020-02-28T05:24:16Z

The fixes are available in #45. Let me know if anything doesn't work for you.

benbarsdell · 2020-03-04T03:08:51Z

@mondus @Robadob
I added the #include <cctype> fix to #45. Are you OK with it being merged?

Robadob · 2020-03-04T07:50:26Z

Building with the current status of #45 and VS2015, it works out the box for me. These are the remaining compile warnings I get (I think they're significantly reduce from what Mondus showed me the other week, though I haven't tried it in VS2020).

I haven't tested with Curand, I don't think Mondus shared that code with me. I've sent him an email and will try and chase up if he's in the office today.

Might be worth addressing them. Though, if it blocks our CI I can probably just wrap the include to reduce warning level.

C:\Users\Robadob\Documents\Visual Studio 2015\Projects\Jitify>"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v10.1\bin\nvcc.exe" -gencode=arch=compute_35,code=\"sm_35,compute_35\" --use-local-env -ccbin "C:\Program Files (x86)\Microsoft Visual Studio 14.0\VC\bin\x86_amd64" -x cu -rdc=true -I"C:\Program Files (x86)\dlfcn-win32\include" -I"C:\Program Files (x86)\Visual Leak Detector\include" -I"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v10.1\include" -I"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v10.1\include"  -G   --keep-dir x64\Debug -maxrregcount=0  --machine 64 --compile -cudart static  -g   -DWIN32 -DWIN64 -D_DEBUG -D_CONSOLE -D_MBCS -Xcompiler "/EHsc /W3 /nologo /Od /Fdx64\Debug\vc140.pdb /FS /Zi /RTC1 /MDd " -o x64\Debug\kernel.cu.obj "C:\Users\Robadob\Documents\Visual Studio 2015\Projects\Jitify\kernel.cu"
1>c:\program files\nvidia gpu computing toolkit\cuda\v10.1\include\vector_types.h(423): warning : calling a __host__ function("std::_Iterator_base12::_Iterator_base12") from a __host__ __device__ function("std::_Iterator_base12::_Iterator_base12 [subobject]") is not allowed
1>c:\program files\nvidia gpu computing toolkit\cuda\v10.1\include\vector_types.h(423): warning : calling a __host__ function("std::_Iterator_base12::~_Iterator_base12") from a __host__ __device__ function("std::_Iterator_base12::~_Iterator_base12 [subobject]") is not allowed
1>  kernel.cu
1>c:\users\robadob\documents\visual studio 2015\projects\jitify\jitify/jitify.hpp(2259): warning C4267: 'initializing': conversion from 'size_t' to 'int', possible loss of data
1>c:\users\robadob\documents\visual studio 2015\projects\jitify\jitify/jitify.hpp(2087): warning C4267: 'argument': conversion from 'size_t' to 'int', possible loss of data
1>c:\users\robadob\documents\visual studio 2015\projects\jitify\jitify/jitify.hpp(2793): warning C4267: 'argument': conversion from 'size_t' to 'unsigned int', possible loss of data
1>c:\users\robadob\documents\visual studio 2015\projects\jitify\jitify/jitify.hpp(3564): warning C4267: 'argument': conversion from 'size_t' to 'unsigned int', possible loss of data
1>c:\users\robadob\documents\visual studio 2015\projects\jitify\jitify/jitify.hpp(217): warning C4800: 'unsigned __int64': forcing value to bool 'true' or 'false' (performance warning)
1>  c:\users\robadob\documents\visual studio 2015\projects\jitify\jitify/jitify.hpp(217): note: while compiling class template member function 'bool jitify::ObjectCache<unsigned __int64,jitify::ProgramConfig>::contains(const unsigned __int64 &) const'
1>  c:\users\robadob\documents\visual studio 2015\projects\jitify\jitify/jitify.hpp(2894): note: see reference to function template instantiation 'bool jitify::ObjectCache<unsigned __int64,jitify::ProgramConfig>::contains(const unsigned __int64 &) const' being compiled
1>  c:\users\robadob\documents\visual studio 2015\projects\jitify\jitify/jitify.hpp(2357): note: see reference to class template instantiation 'jitify::ObjectCache<unsigned __int64,jitify::ProgramConfig>' being compiled

mondus · 2020-03-04T09:46:45Z

@benbarsdell Sorry for the delay. Other academic duties have moved me away form coding this week. I am sure your fix is fine but I will test this early next week and confirm.

benbarsdell · 2020-03-04T10:04:02Z

No worries, thanks for the feedback so far. I pushed some more fixes for conversion warnings. I'll merge #45 now (I have a follow-up PR to submit) and we can address any other issues that come up next week.

mondus · 2020-04-02T14:10:09Z

This works for me. Cheers.

benbarsdell mentioned this issue Feb 28, 2020

Fix a collection of issues related to Windows/MSVC #45

Merged

mondus closed this as completed Apr 2, 2020

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Support for curand #43

Support for curand #43

mondus commented Feb 25, 2020

benbarsdell commented Feb 26, 2020 •

edited

Loading

mondus commented Feb 26, 2020

maddyscientist commented Feb 26, 2020

mondus commented Feb 26, 2020

benbarsdell commented Feb 27, 2020

mondus commented Feb 27, 2020 •

edited

Loading

benbarsdell commented Feb 28, 2020

benbarsdell commented Mar 4, 2020

Robadob commented Mar 4, 2020 •

edited

Loading

mondus commented Mar 4, 2020

benbarsdell commented Mar 4, 2020

mondus commented Apr 2, 2020

Support for curand #43

Support for curand #43

Comments

mondus commented Feb 25, 2020

benbarsdell commented Feb 26, 2020 • edited Loading

mondus commented Feb 26, 2020

maddyscientist commented Feb 26, 2020

mondus commented Feb 26, 2020

benbarsdell commented Feb 27, 2020

mondus commented Feb 27, 2020 • edited Loading

benbarsdell commented Feb 28, 2020

benbarsdell commented Mar 4, 2020

Robadob commented Mar 4, 2020 • edited Loading

mondus commented Mar 4, 2020

benbarsdell commented Mar 4, 2020

mondus commented Apr 2, 2020

benbarsdell commented Feb 26, 2020 •

edited

Loading

mondus commented Feb 27, 2020 •

edited

Loading

Robadob commented Mar 4, 2020 •

edited

Loading