Unit tests with Cuda - Maxwell #196

Closed
dholladay00 opened this issue Feb 22, 2016 · 28 comments
Assignees
Labels
Bug Broken / incorrect code; it could be Kokkos' responsibility, or others’ (e.g., Trilinos)
Milestone

Comments

@dholladay00

I am trying to compile and run the unit tests with the CUDA build. The tests fail to compile with most of the option combinations I passed to generate_makefile.bash. The combinations that failed appeared to fail at the link step with the following error:

Error: Internal Compiler Error (codegen): "unexpected: typeref type encountered!"
make[2]: *** [TestCuda.o] Error 2

I used ../generate_makefile.bash -dbg --with-cuda=/usr/local/cuda-7.5 --arch=Maxwell52 --cxxflags="-DKOKKOS_USING_EXPERIMENTAL_VIEW" --with-options=enable_lambda --kokkos-path=... --prefix=... (as well as other similar option combinations) to obtain the above failure.

However, when I used ../generate_makefile.bash --with-cuda=/usr/local/cuda-7.5, the tests compiled but range_tag and shared_team failed from the first test case.

I tried to build the tests because I was running into compile issues with a parallel_scan.

@ndellingwood
Contributor

You may need to use the nvcc_wrapper as the compiler. A couple of other flags may also help:
--compiler=[PATH_TO_DIR_WITH_KOKKOS]/kokkos/config/nvcc_wrapper --cxxflags="-expt-extended-lambda -DKOKKOS_CUDA_USE_LAMBDA -DKOKKOS_USING_EXPERIMENTAL_VIEW"

@crtrott
Member

crtrott commented Feb 22, 2016

Which GCC version do you have loaded (i.e., what does g++ --version report)?

@dholladay00
Author

It's gcc 4.9.2.

I'm about to try again with the additional cxxflags and with nvcc_wrapper as the compiler.

@dholladay00
Author

With those options added:

../generate_makefile.bash --with-cuda=/usr/local/cuda-7.5 --compiler=/home/xrage/inlinlte/packages/kokkos/kokkos/config/nvcc_wrapper --cxxflags="-expt-extended-lambda -DKOKKOS_CUDA_USE_LAMBDA -DKOKKOS_USING_EXPERIMENTAL_VIEW"

I get compile errors:

/home/xrage/inlinlte/packages/kokkos/kokkos/core/unit_test/TestViewMapping.hpp(871): 
error: calling a __host__ function("testing::internal::scoped_ptr<std::basic_string<char, std::char_traits<char>, std::allocator<char> >> ::~scoped_ptr") from a __device__ function("_ZZN4Test17test_view_mappingIN6Kokkos4CudaEEEvvENKUnvdl1_PFvvEN4Test17test_view_mappingIN6Kokkos4CudaEEE1_N6Kokkos12Experimental4ViewIPiJN6Kokkos4CudaEEEEclEi") is not allowed

@ndellingwood
Contributor

I run into similar errors when running make build-test after passing similar arguments to generate_makefile.bash (Cuda 7.5, GCC 4.9.2):

~/kokkos/generate_makefile.bash --with-cuda --compiler=~/kokkos/config/nvcc_wrapper --cxxflags="-expt-extended-lambda -DKOKKOS_CUDA_USE_LAMBDA -DKOKKOS_USING_EXPERIMENTAL_VIEW"

Sample error: /ascldap/users/ndellin/kokkos/core/unit_test/TestViewMapping.hpp(868): error: calling a __host__ function("testing::internal::CmpHelperEQ<int, int> ") from a __device__ function("_ZZN4Test17test_view_mappingIN6Kokkos4CudaEEEvvENKUnvdl1_PFvvEN4Test17test_view_mappingIN6Kokkos4CudaEEE1_N6Kokkos12Experimental4ViewIPiJN6Kokkos4CudaEEEEclEi") is not allowed

@ndellingwood
Contributor

build-test runs fine without the experimental view, i.e.
~/kokkos/generate_makefile.bash --with-cuda --compiler=~/kokkos/config/nvcc_wrapper --cxxflags="-expt-extended-lambda -DKOKKOS_CUDA_USE_LAMBDA"

There may be an issue with a fix I submitted earlier for compiler warnings about calling a device function from the host. I will look into it.

@hcedwar
Contributor

hcedwar commented Feb 22, 2016

The test is only enabled if KOKKOS_USING_EXPERIMENTAL_VIEW is defined.
The test uses KOKKOS_LAMBDA with a host execution space. The problem is that when KOKKOS_CUDA_USE_LAMBDA is also defined, all host-device lambdas become device-only, so the host-only gtest calls inside the lambda are rejected.
The solution is to guard the test with ' ! defined( KOKKOS_CUDA_USE_LAMBDA )'.
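
A minimal sketch of that kind of guard, written in plain C++ so it compiles without CUDA. Only the two macro names come from this thread; the helper `host_lambda_view_test_enabled` is hypothetical and is not the actual change made in Kokkos.

```cpp
// Mimics a build configured with -DKOKKOS_USING_EXPERIMENTAL_VIEW.
#define KOKKOS_USING_EXPERIMENTAL_VIEW
// Uncomment to mimic the failing configuration reported above.
// #define KOKKOS_CUDA_USE_LAMBDA

// Hypothetical helper: reports whether the host-lambda view test body
// would be compiled in under the current set of macros.
constexpr bool host_lambda_view_test_enabled() {
#if defined(KOKKOS_USING_EXPERIMENTAL_VIEW) && !defined(KOKKOS_CUDA_USE_LAMBDA)
  return true;   // host-device lambdas are still callable on the host
#else
  return false;  // lambdas are device-only, so the host test is skipped
#endif
}
```

With KOKKOS_CUDA_USE_LAMBDA left undefined the helper reports that the test is enabled; defining it compiles the test body out instead of producing the `__host__`/`__device__` errors.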

@ndellingwood ndellingwood self-assigned this Feb 22, 2016
@ndellingwood
Contributor

Fix merged into the develop branch (6947a18).

@dholladay00
Author

After pulling the changes and rebuilding with the same options as before, the tests compile, but a few fail in the first test case:

[----------] Global test environment tear-down
[==========] 40 tests from 1 test case ran. (19993 ms total)
[  PASSED  ] 37 tests.
[  FAILED  ] 3 tests, listed below:
[  FAILED  ] cuda.range_tag
[  FAILED  ] cuda.shared_team
[  FAILED  ] cuda.lambda_shared_team

 3 FAILED TESTS

@ndellingwood
Contributor

I wasn't able to reproduce the test failures. Could you send the setup you use to run the tests, in case something differs between our setups?

Here is my setup:
kokkos develop branch ( 6947a18 )
module load cuda/7.5.18 gcc/4.9.2

I had cleared out my testing directory for other tests, so I reran generate_makefile.bash:

~/kokkos/generate_makefile.bash --with-cuda --compiler=~/kokkos/config/nvcc_wrapper --cxxflags="-expt-extended-lambda -DKOKKOS_CUDA_USE_LAMBDA -DKOKKOS_USING_EXPERIMENTAL_VIEW"

Then I built the tests:
make build-test

and ran them (KokkosCore_UnitTest_Cuda specifically, for the failed tests reported).

@dholladay00
Author

I too am using the develop branch, gcc 4.9.2 and cuda 7.5, build options:

./generate_makefile.bash --with-cuda --compiler=/home/xrage/inlinlte/packages/kokkos/kokkos/config/nvcc_wrapper --cxxflags="-expt-extended-lambda -DKOKKOS_CUDA_USE_LAMBDA -DKOKKOS_USING_EXPERIMENTAL_VIEW"

The same tests failed. I noticed some additional output when the tests started:

macro  KOKKOS_HAVE_CUDA      : defined
macro  CUDA_VERSION          = 7050 = version 7.5
Kokkos::Cuda[ 0 ] GeForce GTX TITAN X capability 5.2, Total Global Memory: 12 G, Shared Memory per Block: 48 K
Kokkos::Cuda[ 1 ] Tesla C2075 capability 2.0, Total Global Memory: 5.249 G, Shared Memory per Block: 48 K
Kokkos::Cuda[ 2 ] Tesla C2075 capability 2.0, Total Global Memory: 5.249 G, Shared Memory per Block: 48 K
Kokkos::Cuda[ 3 ] Tesla C2075 capability 2.0, Total Global Memory: 5.249 G, Shared Memory per Block: 48 K
Kokkos::Cuda[ 4 ] Tesla C2075 capability 2.0, Total Global Memory: 5.249 G, Shared Memory per Block: 48 K
Kokkos::Cuda::initialize WARNING: running kernels compiled for compute capability 3.5 on device with compute capability 5.2 , this will likely reduce potential performance.

Could those other GPUs be breaking anything, or does the problem lie elsewhere?

@ndellingwood
Contributor

The default option must be Kepler35. Can you try adding --arch=Maxwell52 as an argument to generate_makefile.bash?

./generate_makefile.bash --with-cuda --arch=Maxwell52 --compiler=/home/xrage/inlinlte/packages/kokkos/kokkos/config/nvcc_wrapper --cxxflags="-expt-extended-lambda -DKOKKOS_CUDA_USE_LAMBDA -DKOKKOS_USING_EXPERIMENTAL_VIEW"

@crtrott
Member

crtrott commented Feb 23, 2016

Yes, the default GPU architecture is Kepler35 because that's what all the production platforms have. I don't believe we can actually run on capability 2.0 (i.e., Fermi) anymore. While Maxwell is theoretically supported, it is not tested, so that might cause problems. Let me test the Maxwell build. Btw., Nathan, arch=Maxwell52 was what caused the initial compiler error, I believe.
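
To make the mismatch in the log above concrete, here is a small plain-C++ sketch that compares the architecture the kernels were built for with what each device reports. The `Device` struct, the `ArchCheck` enum and `check_arch` are hypothetical names for illustration; this is not how Kokkos::Cuda::initialize is implemented, and the capability values are simply copied from the log in this thread.

```cpp
#include <string>

// Illustrative device record; capability is encoded as major * 10 + minor,
// e.g. 52 for "capability 5.2".
struct Device {
  std::string name;
  int capability;
};

enum class ArchCheck { Match, NewerDevice, OlderDevice };

// Hypothetical helper: classify a device relative to the compute capability
// the kernels were compiled for (35 for the default Kepler35 mentioned here).
ArchCheck check_arch(int compiled_for, const Device& dev) {
  if (dev.capability == compiled_for) return ArchCheck::Match;
  return dev.capability > compiled_for ? ArchCheck::NewerDevice
                                       : ArchCheck::OlderDevice;
}
```

Under these assumptions, `check_arch(35, {"GeForce GTX TITAN X", 52})` yields `NewerDevice`, which corresponds to the performance warning printed at startup, while `check_arch(35, {"Tesla C2075", 20})` yields `OlderDevice`, the case where code built for 3.5 cannot be expected to run at all.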

@dholladay00
Author

Yeah, my understanding was that Kokkos defaulted to using cuda device 0, but I just wanted to make sure that there weren't other issues (esp. when the other GPUs are unsupported) with multi-gpu systems that I was unaware of.

I added the option, but the test results are unchanged.

@ndellingwood
Contributor

I tried compiling with --arch=Maxwell52 to see if the original compile error reproduced but it did not. Unable to test on current system (Kepler arch), will have to look into it further.

@dholladay00
Author

It looks like a fix has been merged into master, but I just pulled, rebuilt, and reran the tests, and the same tests are still failing on cuda.

Note that this is with arch=Maxwell52 (Titan X) GPU.

@ndellingwood
Contributor

There seem to be some issues with Maxwell that cause the unit test failures you mentioned before. I still need to look further into this.

@crtrott crtrott changed the title Unit tests with Cuda Unit tests with Cuda - Maxwell Mar 2, 2016
@crtrott
Member

crtrott commented Mar 2, 2016

I renamed the issue to reflect that this is a problem with Maxwell GPUs.

@crtrott crtrott added Bug Broken / incorrect code; it could be Kokkos' responsibility, or others’ (e.g., Trilinos) and removed bug - fix pushed to develop branch labels Mar 2, 2016
@crtrott
Member

crtrott commented Mar 2, 2016

Made this a bug again, since it is not actually fixed.

@crtrott
Member

crtrott commented Mar 14, 2016

Any update on this Nathan? I know other things got bumped up but we should follow up.

@ndellingwood
Contributor

Sorry for the delay, no update yet. I need to talk with you about possibly testing with the Maxwell GPU on your system. I also have a Maxwell GPU at home and will try to replicate the errors there.

@hcedwar hcedwar added this to the Backlog milestone Mar 17, 2016
@pkestene
Contributor

Hi,

FYI:
I tested both the master and develop branches, and I'm running into the same error message as the one reported at the top of this issue:
Error: Internal Compiler Error (codegen): "unexpected: typeref type encountered!"
in a slightly different context:

  • the message is only issued when building in debug mode (not in release)
  • the affected file is core/unit_test/TestCuda_c.cpp
  • the device flag "-G" triggers the error
  • among all the tests defined in this file, the one that seems to trigger the error is "memory_pool"

I reproduced the error with several build configurations:

  • Ubuntu 16.04 / cuda 8.0, g++5.3 and g++ 4.9 (cuda_arch=Kepler30)
  • Centos 6.7 / cuda 7.5; g++ 4.9.3 (cuda_arch=Kepler35)

I don't know if and how this can be fixed.

@crtrott
Member

crtrott commented Jun 28, 2016

Sounds like a compiler bug; we need to see if we can circumvent it. Btw., if you just want debug symbols for debugging, the option to use is "-lineinfo". "-G" will produce vastly different machine code and will even partially serialize execution.

crtrott added a commit that referenced this issue Aug 29, 2016
This is a workaround which hopefully addresses the reduction issues
we have seen and reported in issue #352, #398 and #196.
@crtrott
Member

crtrott commented Aug 29, 2016

OK, I think I might have identified the issue (which might be a bug in Cuda); see issue #398.
If you could try and see if the latest development branch works now, that would be awesome. You have to compile for Maxwell (i.e. CC 5.0 or higher) to get the workaround.
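
As a rough illustration of the "CC 5.0 or higher" condition, the sketch below derives a capability number from an --arch value such as the ones used in this thread and checks it against 50. Both function names are hypothetical, and the thread does not show how Kokkos itself decides whether to enable the workaround, so treat this only as a way to reason about which builds qualify.

```cpp
#include <cctype>
#include <string>

// Hypothetical helper: extract the trailing digits of an --arch value,
// e.g. "Maxwell52" -> 52. Returns -1 if the name has no trailing digits.
int arch_capability(const std::string& arch) {
  std::size_t pos = arch.size();
  while (pos > 0 && std::isdigit(static_cast<unsigned char>(arch[pos - 1]))) {
    --pos;
  }
  if (pos == arch.size()) return -1;
  return std::stoi(arch.substr(pos));
}

// CC 5.0 is encoded as 50 here, matching the "Maxwell52" naming style.
bool gets_maxwell_workaround(const std::string& arch) {
  return arch_capability(arch) >= 50;
}
```

Under this reading, a build configured with --arch=Maxwell52 qualifies, while the default Kepler35 build does not, even when it runs on a Maxwell card.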

@dholladay00
Author

I get much further in the tests when I build with --arch=Maxwell52 vs. not using that flag!

However, I am now seeing "(standard_in) 1: syntax error" messages.

It did fix the failure in the range_tag test though.

@crtrott
Member

crtrott commented Aug 29, 2016

That error is something weird in the Makefiles, probably an extra space or the like, which for some reason doesn't affect the actual build. Do all the tests pass now (i.e., the scan tests etc.)? On issue #352 someone confirmed that all the Maxwell errors are gone.

@dholladay00
Author

Looks good, I don't see any test failures.

@ndellingwood
Contributor

ndellingwood commented Aug 30, 2016

It also fixed the failure in the range_tag test on my Ubuntu machine with a Maxwell card at home, if another data point helps.
