Support for curand #43
Comments
Thanks for the feedback! I tried to reproduce the problem but ran into a different issue related to missing headers (which I'll submit a patch for). Could you tell me what CUDA version you're using and provide a minimal reproducer?
Hi @benbarsdell. Thanks for getting back to me. I saw your talk with Kate at GTC when you first introduced jitify and knew I would need it at some point! I am using CUDA 10.1 on Windows with dlfcn-win32 (see #42). A minimal example is to add the line
at line 159. Note: this actually works on Linux with no problem. Reproducing it on Windows requires building with dlfcn-win32, setting a preprocessor macro of I have noticed that one of the later tests also fails on Windows, e.g. line 266 fails. Looking at the
Hence the lookup fails.
That's interesting that you are not seeing the expected name mangling on Windows. Which compiler are you using? I guess we need to fix up the
Hi @maddyscientist, it is Visual Studio 2019 with CUDA 10.1. I noticed that the demangle detection is easy to fix by changing a macro guard, but the actual demangling would require
I was able to reproduce your issues in Visual Studio 2019 with CUDA 10.2, and I have fixes for them. I'll put them into a PR probably tomorrow. I'll also add workarounds for the NOMINMAX and dlfcn problems to avoid those annoyances. The size_t issue is because jitify provides a definition of it in a built-in header, but NVRTC already provides its own built-in definition, and on Windows these definitions conflict. The fix is to remove these lines: Lines 1578 to 1579 in 8af928e
The name mangling issue is because CUDA (PTX) always uses the Itanium mangling scheme even when compiling with Visual Studio. I was able to implement a simple demangler for variable names that avoids needing cxxabi. With these fixes, all the tests in jitify_example.cpp pass. I didn't see anything related to the callbacks; maybe it was some kind of current-working-directory problem?
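To illustrate the idea of demangling variable names without cxxabi, here is a minimal sketch. It is not the code from the fix; `demangle_simple_variable` is a hypothetical name, and it assumes only the two simplest Itanium forms need handling: an unscoped name (`_Z3foo`) and a plain nested name (`_ZN2ns3fooE`). Anything else (templates, substitutions, functions) is returned unchanged.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical helper, NOT jitify's actual implementation.
// Handles only "_Z<len><name>" and "_ZN(<len><name>)+E"; any other input
// is returned unchanged so the caller can fall back to the mangled name.
std::string demangle_simple_variable(const std::string& mangled) {
  // Names without the "_Z" prefix are not mangled at all.
  if (mangled.compare(0, 2, "_Z") != 0) return mangled;
  std::size_t pos = 2;
  const bool nested = pos < mangled.size() && mangled[pos] == 'N';
  if (nested) ++pos;

  std::string result;
  while (pos < mangled.size() && mangled[pos] != 'E') {
    // Each component is a decimal length followed by that many characters.
    if (!std::isdigit(static_cast<unsigned char>(mangled[pos]))) return mangled;
    std::size_t len = 0;
    while (pos < mangled.size() &&
           std::isdigit(static_cast<unsigned char>(mangled[pos]))) {
      len = len * 10 + static_cast<std::size_t>(mangled[pos] - '0');
      if (len > mangled.size()) return mangled;  // implausible length
      ++pos;
    }
    if (len == 0 || pos + len > mangled.size()) return mangled;
    if (!result.empty()) result += "::";
    result.append(mangled, pos, len);
    pos += len;
    if (!nested) break;  // an unscoped name has exactly one component
  }

  if (nested) {
    // A nested name must be closed by 'E'.
    if (pos >= mangled.size() || mangled[pos] != 'E') return mangled;
    ++pos;
  }
  // Trailing characters mean this is something we do not understand
  // (for example a function signature), so give up.
  if (pos != mangled.size() || result.empty()) return mangled;
  return result;
}
```

With this sketch, `demangle_simple_variable("_ZN2ns3fooE")` yields `ns::foo`, while a function symbol such as `_Z3foov` is passed through untouched.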
@benbarsdell Fantastic. Looking forward to looking at your Itanium demangler code! I will take another look at the callbacks issue. Probably user error.
The fixes are available in #45. Let me know if anything doesn't work for you.
Building with the current status of #45 and VS2015, it works out of the box for me. These are the remaining compile warnings I get (I think they're significantly reduced from what Mondus showed me the other week, though I haven't tried it in VS2020). I haven't tested with curand; I don't think Mondus shared that code with me. I've sent him an email and will try to chase it up if he's in the office today. It might be worth addressing the warnings, though if they block our CI I can probably just wrap the include to reduce the warning level.
@benbarsdell Sorry for the delay. Other academic duties have moved me away from coding this week. I am sure your fix is fine, but I will test this early next week and confirm.
No worries, thanks for the feedback so far. I pushed some more fixes for conversion warnings. I'll merge #45 now (I have a follow-up PR to submit) and we can address any other issues that come up next week.
This works for me. Cheers.
I have been making lots of good progress with jitify. Thanks for the excellent tool. One issue that I am currently unable to resolve, however, is the use of curand.
If I include curand in my jitify kernel (e.g. #include <curand_kernel.h>) and correctly set the compiler to add the CUDA include directory (from CUDA_PATH), then there are a whole bunch of errors from cuda.h relating to ambiguous definitions of size_t, e.g.
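For readers trying to reproduce the setup described in the report, the following is a rough sketch of how the include option might be derived from CUDA_PATH and what a kernel source that pulls in curand could look like. The helper names (`make_cuda_include_option`, `cuda_include_option_from_env`) and the kernel are hypothetical and are not taken from the reporter's code; the only assumptions about the toolchain are that the runtime compiler accepts a `-I<dir>` option and that the first line of a jitify source string is the program name.

```cpp
#include <cstdlib>
#include <string>

// Hypothetical helper: turns a CUDA install root into a "-I<root>/include"
// compiler option. Returns an empty string if no root is given.
std::string make_cuda_include_option(const char* cuda_path) {
  if (cuda_path == nullptr || *cuda_path == '\0') return "";
  std::string path(cuda_path);
  // Drop trailing path separators so we do not emit "root//include".
  while (path.size() > 1 && (path.back() == '/' || path.back() == '\\')) {
    path.pop_back();
  }
  return "-I" + path + "/include";
}

// Hypothetical wrapper that reads the root from the CUDA_PATH environment
// variable, as described in the report.
std::string cuda_include_option_from_env() {
  return make_cuda_include_option(std::getenv("CUDA_PATH"));
}

// Illustrative kernel source only. The first line is assumed to be the
// program name; the kernel itself is made up for this example.
const char* const kCurandProgramSource =
    "curand_example\n"
    "#include <curand_kernel.h>\n"
    "__global__ void fill_uniform(float* out, unsigned long long seed) {\n"
    "  const int i = blockIdx.x * blockDim.x + threadIdx.x;\n"
    "  curandState state;\n"
    "  curand_init(seed, i, 0, &state);\n"
    "  out[i] = curand_uniform(&state);\n"
    "}\n";
```

The resulting option string would then be passed to the runtime compiler together with the source; how that is done depends on the jitify version in use.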