Support CUDA for LLVM 5 and 6 using LLVM's NVPTX backend #299
Conversation
Dude, you're awesome. Will try to get around to testing this in the next day or so. In the meantime, would it be possible to add LLVM 5 to the Travis tests (without CUDA), assuming that's a configuration that's supposed to work now? I tried LLVM 5 on the previous PR and it didn't quite seem to work.

All tests pass on Ubuntu 16.04 with CUDA 9.2 and LLVM 6.0. Very nice.
LLVM 5.0 does not build for me, getting errors like:
Is it hard to get LLVM 5 working? I'm not sure how much it matters, but if it's easy and you don't mind doing it, it would be nice to have.

Sorry, forgot to cherry-pick the commit into the PR branch :) Should be fixed now :)

LLVM 5.0 is producing segfaults on macOS. The reference counting logic is broken IMHO. Blacklisted for now.
Sorry for the delay, I'm hoping to get someone familiar with Regent CUDA support to look at this, and the people I've contacted are busy until later this week. (Regent is relevant here because it's one of the bigger users of Terra, and it happens to provide CUDA support.) At the moment, if I run a simple CUDA example with Regent, I get the following. It may be that this is entirely Regent's fault, but I'd just like to double check in case there might be some problem with how the new CUDA support works in Terra.
And the backtrace is:

@elliottslaughter I observed the same on Opt. It seems to be an issue with

@elliottslaughter Any idea? It seems to crash only when both
Also note the line marked `// HERE` below:

```cpp
StructType * CreateStruct(Obj * typ) {
    // check to see if it was initialized externally first
    if (typ->boolean("llvm_definingfunction")) {
        const char * name = typ->string("llvm_definingfunction");
        llvm::errs() << name << "\n"; // HERE
        printStackTrace();
        Function * df = CU->TT->external->getFunction(name);
        assert(df);
        int argpos = typ->number("llvm_argumentposition");
        StructType * st = cast<StructType>(
            df->getFunctionType()->getParamType(argpos)->getPointerElementType());
        assert(st);
        return st;
    }
    // ... rest of CreateStruct elided
```

What is the purpose of this
@zdevito I dumped the offending module:

And it is empty. Needs more investigation though.

For what it's worth, without CUDA the Regent test suite passes on LLVM 6. So I'm not finding any evidence of a problem without CUDA at the moment.
I finally found the cause of this issue! In

This is not needed at all, as the symbols we need may not be in the

Also, I tested Opt with CUDA on LLVM 3.9, which runs fine, but not for 5 and 6. This suggests a problem in LLVM's PTX backend.

After solving the df assert problem, the only (and biggest) outstanding problem surfaced: LLVM's NVPTX backend is not able to handle relocatable device code (RDC)! We would probably have to wait for upstream (hopefully LLVM 9).

See http://lists.llvm.org/pipermail/llvm-dev/2017-August/116868.html
Good news is that I got the NVVM backend to work with

The problem with NVCC is that, by default, all kernels are inlined, which makes them optimized out by NVVM, as there isn't anything that uses them. Awwwww.
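As a rough plain-C++ analogue of the dead-kernel problem above (illustrative only, not Terra's or NVVM's actual behavior; the function names are invented): an internal-linkage function with no callers may be discarded by the optimizer, while the GCC/Clang `used` attribute forces it to be kept.

```cpp
// Internal-linkage function with no callers: the compiler is free to
// drop it, much like an inlined CUDA kernel that nothing references.
static int maybe_dropped(int x) { return x * 2; }

// The `used` attribute tells GCC/Clang to emit the function even if
// nothing in this translation unit references it.
__attribute__((used)) static int always_kept(int x) { return x + 1; }
```

In the CUDA case, the analogous fix is to keep kernels externally visible (or otherwise referenced) so they are not eliminated before code generation.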
I'm trying to follow the discussion on this issue and I'm a bit lost. Do you mind explaining some things for me @ProfFan?

Regarding #299 (comment), I am not familiar with an NVCC backend for LLVM; could you please explain further what you're doing?

Regarding #299 (comment), I don't understand why LLVM's support for relocatable CUDA code matters for us, if we're not trying to link against externally compiled CUDA libraries. The particular function you're trying to link against,

Regarding #299 (comment), I remember coming across this code in the past. I had concluded that the reason for having

Edit: wording fix
@manopapad Sorry for the confusion in my wording :D

For your first question, it is actually the NVVM backend (edited). For the second question, it will be linked by the NVVM backend; when using the NVPTX backend, however, I am still looking for a way to do the linking. There are SO posts indicating that this is an LLVM issue, which made me a little bit confused. Sorry again for the bad phrasing. (edited)

For the last question, I understand the purpose of having an extern function that uses all symbols, to avoid the symbols being optimized out; however, that function is really just a no-op, so we only need it to be in the module, not at runtime, am I right?

Thank you for your kind comments.
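The keep-alive trick being discussed can be sketched in self-contained C++ (a hypothetical illustration; `Vec3` and `terra_keepalive` are invented names, not Terra's actual internals):

```cpp
// Invented struct standing in for a Terra-defined type.
struct Vec3 { float x, y, z; };

// A no-op "defining function": never called for its effect, but its
// parameter list anchors the struct types so they stay visible in the
// compiled module and can be recovered later by name and parameter
// position (cf. llvm_definingfunction / llvm_argumentposition).
extern "C" void terra_keepalive(Vec3* v0) {
    (void)v0; // intentionally does nothing at runtime
}
```

The point of contention above is that such a function only needs to exist in the module for type lookup; it never needs to execute.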
@manopapad So I rechecked and added the Linking pass. Now it works flawlessly :) (Tested with Optlang at current master, LLVM 6.0.)

In retrospect: I initially ignored this because Terra's test suite does not use the

@elliottslaughter Please test against other programs as well :)
@ProfFan Will test soon, but is there a simple test we could do to catch the cases that weren't being tested before? (Is it just testing anything in libdevice at all?) Thanks again for working on this!
I do see some warnings when I run some simple Regent programs with CUDA, but despite these everything seems to work.
Ubuntu 16.04, CUDA 9.2, LLVM 6.0.

I'm happy to merge this, or if there's an easy fix for the warnings, that's OK too. As far as the code goes, everything looks fine to me.

@elliottslaughter Yes, it is easy to fix and harmless :) I just want to first make sure that it works, though. I just need to add some math functions to the current cuda*.t tests. Will now start to fix the warnings. In the meanwhile, please test for performance regressions if possible.
Fixed in 7740d0f. Please test :)

Some performance results on
@manopapad Would you be willing to run Soleil and check to see if there are any performance regressions?

@ProfFan Thanks, that fixed the first warning. I still see the second one, but again this doesn't appear to prevent it from actually running anything.

@elliottslaughter This is because some libraries that use Terra actually link libdevice themselves by calling Terra's

We could add an optional param here indicating that the library being linked is a CUDA lib, or we can just add a

Given that it's just a warning, it's really not that big a deal. We can always fix it later if someone cares. Thanks again for working on this; this is a really big help to those of us who depend on Terra!
@ProfFan Now that you've done all this work, LLVM went and released version 7. You've already done more than enough for us, but would you have any interest in checking out 7? I figure it's better to work on it now, while everything is still fresh, rather than waiting to be 3 or 4 versions behind again. Either way, thanks again for all you've done.

@elliottslaughter It's my great pleasure :) And thanks to all the contributors to Terra; this is really an amazing piece of software! For LLVM 7.0, let's keep up the pace, and I will open a PR to get CI working first.
I just want to chime in (as the maintainer of Opt) and say that this is absolutely awesome, and to thank @ProfFan profusely.

@elliottslaughter: It'd be great for Opt if there were binary releases of Terra which can handle newer versions of CUDA. If I compiled a new version using LLVM 6.0 on the three major platforms, and wrote a summary of all changes since the last release, would you cut a new release? I'll have to track down an OS X machine with an NVIDIA graphics card...
@Mx7f I agree a release would be good. My initial inclination was to build and package the release directly from Travis. I don't think it's a requirement to have a GPU, since you can install the CUDA compiler without any physical hardware. I just haven't gotten around to figuring out how to have Travis do the packaging yet.

There is also the issue of building and running on Windows. I have an AppVeyor build, but it only works on LLVM 3.5 and Visual Studio 2013. I'd really love to figure out how to get it working on more LLVM and Visual Studio versions, but haven't been able to spend any time on it. Based on the regular stream of issues we've been collecting from Windows users, I don't think this is just an AppVeyor problem; I think it's a real issue with our Windows build.

I had made partial progress on a "one CMake build system to rule them all" over at this PR: #297. But it still has bugs I need to work out and, again, I keep running out of time to actually do things.

If you'd like to take up one or more of those tasks, I'd be all for it. Even if not, we could potentially think about a release anyway, but we'd have to decide what level of support we can afford to provide moving forward if we can't put in the time to make all the platforms work well.
The current releases have a fixed LLVM version number; having a full (Travis+AppVeyor) matrix set of options would be great, but I think having a canonical release (LLVM 6.0, VS2015) makes sense if you are fine with it. If that's the case, the main obstacle to release is an LLVM 6.0 VS2015 Windows build. I don't mind taking a crack at it on my home Windows machine.

I agree, and I'm fine with using LLVM 6.0 exclusively for the next binary release. Travis and AppVeyor are mostly conveniences so that no one has to maintain build machines, and so that the release process is documented and repeatable. If you're willing to look at LLVM 6.0 on Windows with any reasonably recent VS version, that would be a big step towards helping us get there.
@Mx7f Thank you for your appreciation :) Opt is a great piece of software and I have enjoyed it so much in my work. Cheers!
This PR should be able to enable CUDA for LLVM 5 and 6 :)
Tested on my Mac running: