Unit tests with Cuda - Maxwell #196

dholladay00 · 2016-02-22T20:42:05Z

I am trying to compile and run the unit tests for the CUDA build. The tests fail to compile with most of the option combinations I passed to generate_makefile.bash. For the combinations that failed, the build appeared to fail at the link step with the following error:

Error: Internal Compiler Error (codegen): "unexpected: typeref type encountered!"
make[2]: *** [TestCuda.o] Error 2

I used ../generate_makefile.bash -dbg --with-cuda=/usr/local/cuda-7.5 --arch=Maxwell52 --cxxflags="-DKOKKOS_USING_EXPERIMENTAL_VIEW" --with-options=enable_lambda --kokkos-path=... --prefix=... (as well as other similar option combinations) to obtain the above failure.

However, when I used ../generate_makefile.bash --with-cuda=/usr/local/cuda-7.5, the tests compiled, but range_tag and shared_team failed in the first test case.

I tried to build the tests because I was running into compile issues with a parallel_scan.

ndellingwood · 2016-02-22T20:51:09Z

You may need to use the nvcc_wrapper, also a couple other flags may help:
--compiler=[PATH_TO_DIR_WITH_KOKKOS]/kokkos/config/nvcc_wrapper --cxxflags="-expt-extended-lambda -DKOKKOS_CUDA_USE_LAMBDA -DKOKKOS_USING_EXPERIMENTAL_VIEW"

crtrott · 2016-02-22T20:55:33Z

Which GCC version do you have loaded (i.e. what does g++ --version report)?

dholladay00 · 2016-02-22T20:58:55Z

it's gcc 4.9.2

I'm about to attempt with the additional cxx_flags as well as with the nvcc_wrapper as the compiler.

dholladay00 · 2016-02-22T21:12:38Z

With those options added ../generate_makefile.bash --with-cuda=/usr/local/cuda-7.5 --compiler=/home/xrage/inlinlte/packages/kokkos/kokkos/config/nvcc_wrapper --cxxflags="-expt-extended-lambda -DKOKKOS_CUDA_USE_LAMBDA -DKOKKOS_USING_EXPERIMENTAL_VIEW"

I get compile errors:

/home/xrage/inlinlte/packages/kokkos/kokkos/core/unit_test/TestViewMapping.hpp(871): 
error: calling a __host__ function("testing::internal::scoped_ptr<std::basic_string<char, std::char_traits<char>, std::allocator<char> >> ::~scoped_ptr") from a __device__ function("_ZZN4Test17test_view_mappingIN6Kokkos4CudaEEEvvENKUnvdl1_PFvvEN4Test17test_view_mappingIN6Kokkos4CudaEEE1_N6Kokkos12Experimental4ViewIPiJN6Kokkos4CudaEEEEclEi") is not allowed

ndellingwood · 2016-02-22T21:57:05Z

I run into similar errors when running make build-test after passing similar arguments to generate_makefile.bash (Cuda 7.5, GCC 4.9.2):

~/kokkos/generate_makefile.bash --with-cuda --compiler=~/kokkos/config/nvcc_wrapper --cxxflags="-expt-extended-lambda -DKOKKOS_CUDA_USE_LAMBDA -DKOKKOS_USING_EXPERIMENTAL_VIEW"

Sample error: /ascldap/users/ndellin/kokkos/core/unit_test/TestViewMapping.hpp(868): error: calling a __host__ function("testing::internal::CmpHelperEQ<int, int> ") from a __device__ function("_ZZN4Test17test_view_mappingIN6Kokkos4CudaEEEvvENKUnvdl1_PFvvEN4Test17test_view_mappingIN6Kokkos4CudaEEE1_N6Kokkos12Experimental4ViewIPiJN6Kokkos4CudaEEEEclEi") is not allowed

ndellingwood · 2016-02-22T22:00:49Z

build-test is running fine without experimental view, i.e.
~/kokkos/generate_makefile.bash --with-cuda --compiler=~/kokkos/config/nvcc_wrapper --cxxflags="-expt-extended-lambda -DKOKKOS_CUDA_USE_LAMBDA"

There may be an issue with an attempted fix I submitted for compiler warnings about calling a device function from the host; I will look into it.

hcedwar · 2016-02-22T22:07:48Z

The test is only enabled if KOKKOS_USING_EXPERIMENTAL_VIEW is defined.
The test uses KOKKOS_LAMBDA with a host execution space. The problem is that when KOKKOS_CUDA_USE_LAMBDA is also defined, all host-device lambdas become device-only.
The solution is to guard the test with ' ! defined( KOKKOS_CUDA_USE_LAMBDA )'.
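
A minimal sketch of that guard condition, written as plain C++ so it compiles without Kokkos or CUDA. The helper name host_lambda_test_enabled is hypothetical and exists only to show the preprocessor logic; the two macro names are the ones from this thread, and the actual change in TestViewMapping.hpp may be structured differently.

```cpp
// Sketch only: host_lambda_test_enabled() is a hypothetical helper,
// not part of Kokkos. It mirrors the guard condition described above.

// Returns true when the host-lambda portion of the test would be compiled in.
constexpr bool host_lambda_test_enabled() {
#if defined(KOKKOS_USING_EXPERIMENTAL_VIEW) && !defined(KOKKOS_CUDA_USE_LAMBDA)
  // Experimental view is on and lambdas are still callable on the host,
  // so a KOKKOS_LAMBDA body run in a host execution space can be built.
  return true;
#else
  // Either the experimental view is off (the test is not enabled at all),
  // or KOKKOS_CUDA_USE_LAMBDA makes the lambdas device-only: skip the test.
  return false;
#endif
}
```

Compiled with -DKOKKOS_USING_EXPERIMENTAL_VIEW alone, the function returns true; adding -DKOKKOS_CUDA_USE_LAMBDA flips it to false, which is the combination that produced the host/device errors above.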

ndellingwood · 2016-02-22T23:26:50Z

Fix 6947a18 merged into the develop branch.

dholladay00 · 2016-02-23T00:01:40Z

After pulling the changes and rebuilding with the same options as before, the tests compile, but a few fail in the first test case:

[----------] Global test environment tear-down
[==========] 40 tests from 1 test case ran. (19993 ms total)
[  PASSED  ] 37 tests.
[  FAILED  ] 3 tests, listed below:
[  FAILED  ] cuda.range_tag
[  FAILED  ] cuda.shared_team
[  FAILED  ] cuda.lambda_shared_team

 3 FAILED TESTS

ndellingwood · 2016-02-23T17:05:05Z

I wasn't able to reproduce the test failures. Could you send the setup you use to run the tests, in case something differs between our setups?

Here is my setup:
kokkos develop branch ( 6947a18 )
module load cuda/7.5.18 gcc/4.9.2

I had cleared out my testing directory for other tests, so I reran generate_makefile.bash:

~/kokkos/generate_makefile.bash --with-cuda --compiler=~/kokkos/config/nvcc_wrapper --cxxflags="-expt-extended-lambda -DKOKKOS_CUDA_USE_LAMBDA -DKOKKOS_USING_EXPERIMENTAL_VIEW"

Built the tests:
make build-test

Ran the tests (KokkosCore_UnitTest_Cuda specifically, for the reported failures).

dholladay00 · 2016-02-23T17:36:36Z

I too am using the develop branch, gcc 4.9.2 and cuda 7.5, build options:

./generate_makefile.bash --with-cuda --compiler=/home/xrage/inlinlte/packages/kokkos/kokkos/config/nvcc_wrapper --cxxflags="-expt-extended-lambda -DKOKKOS_CUDA_USE_LAMBDA -DKOKKOS_USING_EXPERIMENTAL_VIEW"

The same tests failed. I noticed some additional output when the tests started:

macro  KOKKOS_HAVE_CUDA      : defined
macro  CUDA_VERSION          = 7050 = version 7.5
Kokkos::Cuda[ 0 ] GeForce GTX TITAN X capability 5.2, Total Global Memory: 12 G, Shared Memory per Block: 48 K
Kokkos::Cuda[ 1 ] Tesla C2075 capability 2.0, Total Global Memory: 5.249 G, Shared Memory per Block: 48 K
Kokkos::Cuda[ 2 ] Tesla C2075 capability 2.0, Total Global Memory: 5.249 G, Shared Memory per Block: 48 K
Kokkos::Cuda[ 3 ] Tesla C2075 capability 2.0, Total Global Memory: 5.249 G, Shared Memory per Block: 48 K
Kokkos::Cuda[ 4 ] Tesla C2075 capability 2.0, Total Global Memory: 5.249 G, Shared Memory per Block: 48 K
Kokkos::Cuda::initialize WARNING: running kernels compiled for compute capability 3.5 on device with compute capability 5.2 , this will likely reduce potential performance.

Could those other GPUs be breaking anything, or does the problem lie elsewhere?

ndellingwood · 2016-02-23T17:52:16Z

The default must be Kepler35. Can you try adding --arch=Maxwell52 as an argument to generate_makefile.bash?

./generate_makefile.bash --with-cuda --arch=Maxwell52 --compiler=/home/xrage/inlinlte/packages/kokkos/kokkos/config/nvcc_wrapper --cxxflags="-expt-extended-lambda -DKOKKOS_CUDA_USE_LAMBDA -DKOKKOS_USING_EXPERIMENTAL_VIEW"
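
The warning in the log above amounts to comparing the capability the kernels were built for against what the device reports. A rough sketch of that comparison follows, assuming arch names are a family name followed by capability digits (as in Kepler35 and Maxwell52 in this thread). Both function names are hypothetical, and how Kokkos performs the check internally is not shown here.

```cpp
#include <cctype>
#include <string>

// Hypothetical helper: turn an arch name such as "Kepler35" or "Maxwell52"
// into the compute capability times ten (35, 52).
// Returns -1 if the name does not end in digits.
int capability_from_arch(const std::string& arch) {
  std::size_t pos = arch.size();
  // Walk backwards over the trailing digits.
  while (pos > 0 && std::isdigit(static_cast<unsigned char>(arch[pos - 1]))) {
    --pos;
  }
  if (pos == arch.size()) return -1;  // no trailing digits found
  return std::stoi(arch.substr(pos));
}

// Hypothetical check: true when the build arch differs from the capability
// reported by the device (major.minor, e.g. 5.2 for a GTX TITAN X).
bool arch_mismatch(const std::string& build_arch, int device_major,
                   int device_minor) {
  return capability_from_arch(build_arch) != device_major * 10 + device_minor;
}
```

With these definitions, arch_mismatch("Kepler35", 5, 2) is true (the situation in the warning), while arch_mismatch("Maxwell52", 5, 2) is false.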

crtrott · 2016-02-23T18:26:30Z

Yes, the default GPU architecture is Kepler35 because that's what all the production platforms have. I don't believe we can actually run on capability 2.0 (i.e. Fermi) anymore. While Maxwell is theoretically supported, it is not tested, so that might cause problems. Let me test the Maxwell build. Btw, Nathan, I believe arch=Maxwell52 was what caused the initial compiler error.

dholladay00 · 2016-02-23T18:28:31Z

Yeah, my understanding was that Kokkos defaulted to using cuda device 0, but I just wanted to make sure that there weren't other issues (esp. when the other GPUs are unsupported) with multi-gpu systems that I was unaware of.

I added the option, but the test results are unchanged.

ndellingwood · 2016-02-23T20:35:44Z

I tried compiling with --arch=Maxwell52 to see whether the original compile error would reproduce, but it did not. I am unable to test on my current system (Kepler arch), so I will have to look into it further.

dholladay00 · 2016-03-02T20:26:03Z

It looks like a fix has been merged into master, but I just pulled, rebuilt, and ran the tests, and the same tests still fail on cuda.

Note that this is with arch=Maxwell52 on a Titan X GPU.

ndellingwood · 2016-03-02T20:29:27Z

There seem to be some issues with Maxwell that cause the unit test failures you mentioned before; I still need to look into this further.

crtrott · 2016-03-02T23:10:44Z

I renamed the issue to reflect that this is a problem with Maxwell GPUs.

crtrott · 2016-03-02T23:11:16Z

Made this a bug again, since it is not actually fixed.

crtrott · 2016-03-14T17:17:50Z

Any update on this, Nathan? I know other things got bumped up, but we should follow up.

ndellingwood · 2016-03-14T17:36:34Z

Sorry for the delay, no update yet. I need to talk with you about possibly using/testing with the Maxwell GPU on your system. I also have a Maxwell GPU at home and will try to replicate the errors there.

pkestene · 2016-06-28T20:58:12Z

Hi,

FYI:
I tested both the master and develop branches, and I'm running into the same error message as the one reported at the top of this issue:
Error: Internal Compiler Error (codegen): "unexpected: typeref type encountered!"
in a slightly different context:

the message is only issued when building in debug mode (not in release)
it occurs in core/unit_test/TestCuda_c.cpp
the device flag "-G" triggers the error

I also noticed that among all tests defined in this file (core/unit_test/TestCuda_c.cpp), the one that seems to trigger the error is "memory_pool".

I hit the error with several build configurations:

Ubuntu 16.04 / cuda 8.0, g++5.3 and g++ 4.9 (cuda_arch=Kepler30)
Centos 6.7 / cuda 7.5; g++ 4.9.3 (cuda_arch=Kepler35)

I don't know if and how this can be fixed.

crtrott · 2016-06-28T21:11:00Z

Sounds like a compiler bug; we need to see if we can circumvent it. Btw, if you just want debug symbols for debugging, the option to use is "-lineinfo". "-G" will produce vastly different machine code and will even partially serialize execution.

crtrott · 2016-08-29T17:11:50Z

OK, I think I might have identified the issue (which might be a bug in Cuda); see issue #398.
If you could check whether the latest development branch works now, that would be awesome. You have to compile for Maxwell (i.e. CC 5.0 or higher) to get the workaround.
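
As a rough illustration of the "CC 5.0 or higher" condition, the sketch below uses a hypothetical helper, reduction_workaround_enabled. It takes the target architecture in the major*100 + minor*10 form that nvcc exposes as __CUDA_ARCH__ (350 for Kepler35, 520 for Maxwell52). How the development branch actually selects the workaround is not shown in this thread, so treat this as an assumption.

```cpp
// Hypothetical sketch, not Kokkos code: decide whether the reduction
// workaround would apply for a given compile target.
// cuda_arch uses the __CUDA_ARCH__ convention: 350 = CC 3.5, 520 = CC 5.2.
constexpr bool reduction_workaround_enabled(int cuda_arch) {
  return cuda_arch >= 500;  // Maxwell (CC 5.0) or newer
}

// A default Kepler35 build would not pick up the workaround ...
static_assert(!reduction_workaround_enabled(350), "Kepler35: workaround off");
// ... while a build with --arch=Maxwell52 would.
static_assert(reduction_workaround_enabled(520), "Maxwell52: workaround on");
```

This is why a build without an explicit Maxwell arch flag would be expected to keep failing even on a fixed branch.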

dholladay00 · 2016-08-29T23:13:56Z

I get much further in the tests when I build with --arch=Maxwell52 vs. not using that flag!

However, I am now seeing "(standard_in) 1: syntax error" messages.

It did fix the failure in the range_tag test though.

crtrott · 2016-08-29T23:35:48Z

That error is something weird in the Makefiles, probably an extra space or what not, which for some reason doesn't affect the actual build. Do all the tests pass now (i.e. the scan tests etc)? On issue #352 someone confirmed that all the Maxwell errors are gone.

dholladay00 · 2016-08-29T23:46:50Z

Looks good, I don't see any test failures.

ndellingwood · 2016-08-30T02:48:23Z

It also fixed the failure in the range_tag test on my Ubuntu machine with a Maxwell card at home, if another data point helps.

ndellingwood self-assigned this Feb 22, 2016

hcedwar added the bug - fix pushed to develop branch label Feb 22, 2016

dholladay00 mentioned this issue Mar 1, 2016

parallel_scan on the device with a TeamThreadRange #45

Closed

crtrott changed the title ~~Unit tests with Cuda~~ Unit tests with Cuda - Maxwell Mar 2, 2016

crtrott added the Bug label ("Broken / incorrect code; it could be Kokkos' responsibility, or others’ (e.g., Trilinos)") and removed the bug - fix pushed to develop branch label Mar 2, 2016

hcedwar added this to the Backlog milestone Mar 17, 2016

ndellingwood mentioned this issue Jul 8, 2016

wrong results for a parallel_reduce with CUDA8 / Maxwell50 #352

Closed

crtrott mentioned this issue Aug 17, 2016

Failing Tests on NVIDIA Pascal GPUs #398

Closed

crtrott added a commit that referenced this issue Aug 29, 2016

Core: Bug workaround for Reduction on Pascal and maybe Maxwell

2452da1

This is a workaround which hopefully addresses the reduction issues we have seen and reported in issue #352, #398 and #196.

crtrott added the InDevelop label Sep 1, 2016

crtrott modified the milestones: Summer 2016, Backlog Sep 1, 2016

crtrott closed this as completed Sep 3, 2016

ndellingwood mentioned this issue Feb 13, 2018

Kokkos CUDA Debug Device Code Turned Off #1410

Closed
