Unit tests with Cuda - Maxwell #196
Comments
You may need to use the nvcc_wrapper; a couple of other flags may also help.
Which GCC version do you have loaded (i.e. what does g++ --version report)?
It's gcc 4.9.2. I'm about to try again with the additional cxx_flags and with the nvcc_wrapper as the compiler.
With those options added I get compile errors.
I run into similar errors when running make build-test with similar arguments to generate_makefile.bash.
build-test runs fine without the experimental view. There may be an issue with an attempted fix I submitted for compiler warnings about calling a device function from the host; I will look into it.
The test is only enabled if KOKKOS_USING_EXPERIMENTAL_VIEW is defined.
Fix merged into the develop branch: 6947a18.
After pulling the changes and rebuilding with the same options as before, the tests do compile, but a few fail in the first test case.
I wasn't able to reproduce the test failures. Could you send the setup you use to run the tests, in case something diverged between our setups? Here is mine: I had cleared out my testing directory for other tests, so I reran generate_makefile.bash, built the tests, and ran the tests (KokkosCore_UnitTest_Cuda specifically, for the reported failures).
I too am using the develop branch, gcc 4.9.2 and cuda 7.5. The same tests failed. I noticed some additional output when the tests started. Could those other GPUs be breaking anything, or does the problem lie elsewhere?
The default option must be Kepler35. Can you try adding --arch=Maxwell52 as an argument to generate_makefile.bash?
Yes, the default GPU architecture is Kepler35 because that's what all the production platforms have. I don't believe we can actually run on compute capability 2.0 (i.e. on Fermi) anymore. While Maxwell is theoretically supported, it is not tested, so that might cause problems. Let me test the Maxwell build. Btw., Nathan, the arch=Maxwell52 option was what caused the initial compiler error, I believe.
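For readers unfamiliar with the architecture names used here, the following is a minimal sketch of how they relate to nvcc targets. It assumes that the trailing digits of a name such as Kepler35 or Maxwell52 encode the CUDA compute capability (3.5 and 5.2). The function arch_to_sm is a hypothetical helper written for illustration; it is not part of Kokkos or of generate_makefile.bash.

```shell
# Hypothetical helper (not part of Kokkos): derive an nvcc -arch value
# from an architecture name, assuming the trailing digits of the name
# encode the CUDA compute capability.
arch_to_sm() {
  name=$1
  # Remove the longest prefix that ends in a non-digit, keeping the digits.
  digits=${name##*[!0-9]}
  if [ -z "$digits" ]; then
    echo "unknown"
    return 1
  fi
  echo "sm_${digits}"
}

arch_to_sm Kepler35    # prints sm_35
arch_to_sm Maxwell52   # prints sm_52
```

Under that assumption, a build left at the Kepler35 default and a build configured with --arch=Maxwell52 target different device architectures, which is why the setting matters on a Maxwell card.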
Yeah, my understanding was that Kokkos defaults to CUDA device 0, but I wanted to make sure there weren't other issues with multi-GPU systems that I was unaware of (especially when the other GPUs are unsupported). I added the option, but the test results are unchanged.
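One way to rule out interference from the other GPUs is to hide them from the CUDA runtime before running the tests. The sketch below uses the CUDA_VISIBLE_DEVICES environment variable, which the CUDA runtime reads at startup. The wrapper name run_on_gpu and the path to the test binary are assumptions made for illustration, and the device index refers to CUDA's own enumeration, which may not match the order shown by other tools.

```shell
# Hypothetical wrapper: run a command with only one GPU visible to CUDA.
# The selected GPU then appears to the program as device 0.
run_on_gpu() {
  gpu_index=$1
  shift
  CUDA_VISIBLE_DEVICES=$gpu_index "$@"
}

# Example (the binary location is an assumption about the local build tree):
# run_on_gpu 0 ./KokkosCore_UnitTest_Cuda
```

If the same tests still fail with a single visible device, the other GPUs are unlikely to be the cause.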
I tried compiling with --arch=Maxwell52 to see whether the original compile error reproduced, but it did not. I am unable to test on my current system (Kepler arch), so I will have to look into it further.
It looks like a fix has been merged into master, but I just pulled, rebuilt, and ran the tests, and the same tests are still failing on CUDA. Note that this is with arch=Maxwell52 on a Titan X GPU.
There seem to be some issues with Maxwell that cause the unit test failures you mentioned before; I still need to look further into this.
I renamed the issue to reflect that this is a problem with Maxwell GPUs.
Made this a bug again, since it is not actually fixed.
Any update on this, Nathan? I know other things got bumped up, but we should follow up.
Sorry for the delay, no update yet. I need to talk with you about possibly using/testing with the Maxwell GPU on your system. I also have a Maxwell GPU at home and will try to replicate the errors there.
Hi, FYI: the message is only issued when building in debug mode (not in release). I tried several build configurations with the error. I don't know whether or how this can be fixed.
Sounds like a compiler bug; we need to see if we can circumvent it. Btw., if you just want debug symbols for debugging, the option to use is "-lineinfo". "-G" will produce vastly different machine code and will even partially serialize execution.
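The distinction between the two nvcc options can be sketched as follows. This is only an illustration: nvcc_debug_flags is a hypothetical helper, the mode names are invented, kernel.cu is a placeholder file name, and the snippet prints a command line instead of invoking the compiler. How such flags are best passed through generate_makefile.bash or the nvcc_wrapper is not shown here.

```shell
# Hypothetical helper: pick nvcc flags for a given build mode.
nvcc_debug_flags() {
  case "$1" in
    # Optimized device code with source line information attached.
    lineinfo) echo "-O3 -lineinfo" ;;
    # Full device-side debug information; device optimizations are disabled.
    debug)    echo "-O0 -g -G" ;;
    # Plain optimized build.
    *)        echo "-O3" ;;
  esac
}

# Print the command that would be used (kernel.cu is a placeholder name).
echo "nvcc $(nvcc_debug_flags lineinfo) -arch=sm_52 -c kernel.cu"
```

With the lineinfo mode the generated device code stays close to a release build, so a failure that only appears with "-G" can be told apart from one that is present in optimized code as well.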
OK, I think I might have identified the issue (which might be a bug in CUDA); see issue #398.
I get much further in the tests when I build with [...]. However, I am now seeing [...]. It did fix the failure in the [...].
That error is something weird in the Makefiles, probably an extra space or whatnot, which for some reason doesn't affect the actual build. Do all the tests pass now (i.e. the scan tests etc.)? On issue #352 someone confirmed that all the Maxwell errors are gone.
Looks good, I don't see any test failures.
It also fixed the failure in the range_tag test on my Ubuntu machine at home with a Maxwell card, if it helps to have another data point.
I am trying to compile and run the unit tests for the CUDA build. The tests fail to compile with most of the options I passed to generate_makefile.bash. For the option combinations that failed, the build appeared to fail at the link step. I used
../generate_makefile.bash -dbg --with-cuda=/usr/local/cuda-7.5 --arch=Maxwell52 --cxxflags="-DKOKKOS_USING_EXPERIMENTAL_VIEW" --with-options=enable_lambda --kokkos-path=... --prefix=...
(as well as other similar option combinations) to obtain that failure.
However, when I used
../generate_makefile.bash --with-cuda=/usr/local/cuda-7.5
the tests compiled, but range_tag and shared_team failed from the first test case.
I tried to build the tests because I was running into compile issues with a parallel_scan.