Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RULE_LAUNCH_COMPILE and RULE_LAUNCH_LINK system for nvcc_wrapper #3136

Merged
merged 21 commits into from
Sep 24, 2020

Conversation

jrmadsen
Copy link
Contributor

  • This enables Kokkos_ENABLE_CUDA=ON with a CMAKE_CXX_COMPILER to be set to a non-clang C++ compiler instead of nvcc_wrapper
  • This provides a way for Kokkos + CUDA to be compiled without forcing CMAKE_CXX_COMPILER to be nvcc_wrapper or clang

Proposed

  • kokkos_compilation macro
    • This will enable setting the launch rule globally and on projects, directories, targets, and source files
    • By default, find_package(Kokkos) will invoke kokkos_compilation(GLOBAL)
  • find_package(Kokkos COMPONENTS separable_compilation)
    • This will disable kokkos_compilation(GLOBAL) and the user will be responsible for doing:
      • kokkos_compilation(PROJECT)
      • kokkos_compilation(DIRECTORY <DIR>)
      • kokkos_compilation(TARGET <TARGETS...>)
      • kokkos_compilation(SOURCE <SOURCES...>)

@jrmadsen
Copy link
Contributor Author

jrmadsen commented Jul 1, 2020

@crtrott @jjwilke This is good to go as long unless the CI fails. It works based my various compilation tests locally.

@dalg24 we need to add not setting CXX to nvcc_wrapper somewhere in the Jenkins testing I think.

@jrmadsen
Copy link
Contributor Author

jrmadsen commented Jul 1, 2020

Oh and in all the CUDA tests, it will still be using the kokkos_launch_compiler command even if CXX=nvcc_wrapper.

@jrmadsen
Copy link
Contributor Author

jrmadsen commented Jul 2, 2020

@dalg24 This should now provide this exact same behavior if CMAKE_CXX_COMPILER=nvcc_wrapper. Thus, all the current changes should have zero effect on any of the testing.

Before this is merged, Jenkins needs to have new instances of all tests with -DKokkos_ENABLE_CUDA=ON that do not set the compiler to nvcc_wrapper.

@masterleinad
Copy link
Contributor

There still is

clang-8: error: unsupported option '--diag_suppress=esa_on_defaulted_function_ignored'

@jrmadsen
Copy link
Contributor Author

jrmadsen commented Jul 2, 2020

@masterleinad I really need to finished that PR for CDash... if only @dalg24 wasn't so damn picky... I have no idea where that error is in the log. Did that happen in the build tree or in the install tree?

@masterleinad
Copy link
Contributor

@jrmadsen
Copy link
Contributor Author

jrmadsen commented Jul 2, 2020

Thanks, I think just needed logic for Clang

@masterleinad
Copy link
Contributor

The architecture auto-detection fails (maybe unsurprisingly) with

/usr/bin/c++  -DSM_ONLY  -Werror    -std=c++11 -o CMakeFiles/cmTC_35a1f.dir/cuda_compute_capability.cc.o -c /tmp/ArborX/my_docker/kokkos/cmake/compile_tests/cuda_compute_capability.cc
/tmp/ArborX/my_docker/kokkos/cmake/compile_tests/cuda_compute_capability.cc: In function 'int main()':
/tmp/ArborX/my_docker/kokkos/cmake/compile_tests/cuda_compute_capability.cc:49:3: error: 'cudaDeviceProp' was not declared in this scope
   cudaDeviceProp device_properties;
   ^~~~~~~~~~~~~~
/tmp/ArborX/my_docker/kokkos/cmake/compile_tests/cuda_compute_capability.cc:50:9: error: 'cudaError_t' does not name a type; did you mean 'error_t'?
   const cudaError_t error = cudaGetDeviceProperties(&device_properties,
         ^~~~~~~~~~~
         error_t
/tmp/ArborX/my_docker/kokkos/cmake/compile_tests/cuda_compute_capability.cc:52:7: error: 'error' was not declared in this scope
   if (error != cudaSuccess) {
       ^~~~~
/tmp/ArborX/my_docker/kokkos/cmake/compile_tests/cuda_compute_capability.cc:52:7: note: suggested alternative: 'perror'
   if (error != cudaSuccess) {
       ^~~~~
       perror
/tmp/ArborX/my_docker/kokkos/cmake/compile_tests/cuda_compute_capability.cc:52:16: error: 'cudaSuccess' was not declared in this scope
   if (error != cudaSuccess) {
                ^~~~~~~~~~~
/tmp/ArborX/my_docker/kokkos/cmake/compile_tests/cuda_compute_capability.cc:53:36: error: 'cudaGetErrorString' was not declared in this scope
     std::cout << "CUDA error: " << cudaGetErrorString(error) << '\n';
                                    ^~~~~~~~~~~~~~~~~~
/tmp/ArborX/my_docker/kokkos/cmake/compile_tests/cuda_compute_capability.cc:57:7: error: 'device_properties' was not declared in this scope
       device_properties.major * 10 + device_properties.minor;
       ^~~~~~~~~~~~~~~~~

@jrmadsen
Copy link
Contributor Author

jrmadsen commented Jul 6, 2020

Oh, well this isn't too complicated of a fix.

IF(KOKKOS_ENABLE_CUDA AND NOT CUDA_ARCH_ALREADY_SPECIFIED)
  TRY_RUN(
    _RESULT
    _COMPILE_RESULT
    ${_BINARY_TEST_DIR}
    ${CMAKE_CURRENT_SOURCE_DIR}/cmake/compile_tests/cuda_compute_capability.cc
    COMPILE_DEFINITIONS -DSM_ONLY
    RUN_OUTPUT_VARIABLE _CUDA_COMPUTE_CAPABILITY)
ENDIF()

I don't see a way to tell cmake to use the launcher so just need to either:

  • add an if block around using try_run and then do it manually via execute_process
  • enable_language(CUDA) + CMAKE_TRY_COMPILE_TARGET_TYPE CUDA

@masterleinad @jjwilke Thoughts? Preferences?

@masterleinad
Copy link
Contributor

Currently, the auto-detection doesn't work with clang as a compiler either. It would be nice to come up with a solution that works for compilers different from nvcc_wrapper but that could be done separately and I don't think it's crucial for this pull request.

AFAICT, CMAKE_TRY_COMPILE_TARGET_TYPE can only differentiate between executables and libraries (https://cmake.org/cmake/help/latest/variable/CMAKE_TRY_COMPILE_TARGET_TYPE.html#variable:CMAKE_TRY_COMPILE_TARGET_TYPE) but can't be used to specify the language resp. the compiler to use. I guess the only way really is execute_process.

@jrmadsen
Copy link
Contributor Author

@masterleinad Been too busy lately to figure out the auto-detection but I agree it would be nice to have in the future.

.jenkins Outdated Show resolved Hide resolved
@masterleinad
Copy link
Contributor

Retest this please.

@jrmadsen
Copy link
Contributor Author

@masterleinad Hey just got back from vacation this week, is this ready for merging?

@masterleinad
Copy link
Contributor

I think we should wait for this until after the release to have sufficient time to test it out.

Copy link

@jjwilke jjwilke left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Iniital comments

bin/kokkos_launch_compiler Show resolved Hide resolved
bin/kokkos_launch_compiler Outdated Show resolved Hide resolved
cmake/KokkosConfig.cmake.in Outdated Show resolved Hide resolved
cmake/KokkosConfigCommon.cmake.in Show resolved Hide resolved
cmake/KokkosConfigCommon.cmake.in Show resolved Hide resolved
cmake/KokkosConfigCommon.cmake.in Show resolved Hide resolved
cmake/kokkos_functions.cmake Show resolved Hide resolved
@jrmadsen
Copy link
Contributor Author

@masterleinad I'm not sure I see the merit in delaying it. It doesn't change anything at all if CXX is set to nvcc_wrapper, it can be explicitly disabled, and merging it will help find edge cases much easier.

Copy link

@jjwilke jjwilke left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just add comment in launch script. This all seems fine to me. Merging this sooner than later might be good. It solves nasty problems with FetchContent and ExternalProject that we've been having.

bin/kokkos_launch_compiler Show resolved Hide resolved
cmake/KokkosConfigCommon.cmake.in Show resolved Hide resolved
cmake/KokkosConfigCommon.cmake.in Show resolved Hide resolved
bin/kokkos_launch_compiler Show resolved Hide resolved
Copy link

@jjwilke jjwilke left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm happy with my tests. A bit of cleanup on error messages/status messages and I'm basically ready to approve.

Curious about thoughts on changing default for the launcher script to be globally applied.

bin/kokkos_launch_compiler Outdated Show resolved Hide resolved
bin/kokkos_launch_compiler Outdated Show resolved Hide resolved
cmake/KokkosConfig.cmake.in Show resolved Hide resolved
cmake/KokkosConfig.cmake.in Show resolved Hide resolved
@masterleinad
Copy link
Contributor

Retest this please.

@jjwilke
Copy link

jjwilke commented Aug 2, 2020

Still thinking about this. My concern now is if 2 projects import Kokkos. One of them asks for separable_compilation. The other does not. You now won't get separable compilation because the global property has been set.

A counter-proposal would be this:

  1. Keep everything you have, except that you can now ONLY point to nvcc_wrapper from your Kokkos installation. These are not allowed to be inconsistent.
  2. Have Kokkos add a transitive flag like --kokkos-dependence to everything. The compiler launcher can actually look for this. If this flag does not appear, the compiler launcher can actually just pass the args through. If the flag does appear, use nvcc_wrapper.

This PR would be a better place to add the --kokkos-dependence flag.

Now we have the separable compilation problem fixed. No one needs to ask for it. If you depend on Kokkos, you go to the wrapper. If not, you're good to go. And you don't need to worry about two different Kokkos projects thrashing each other's global properties (of course, I could be misunderstanding global scope rules in find_package).

@jrmadsen
Copy link
Contributor Author

jrmadsen commented Aug 2, 2020

@jjwilke I like it. What about using a compiler definition instead of a compiler flag? E.g. -DKOKKOS_DEPENDENCE. The benefit is that it wouldn't have to be removed from the args in the launcher and it could actually help downstream projects (in the 2 project scenario you mentioned) detect whether Kokkos is being used by parent project.

…tional behavior

- Added some info in cmake about what kokkos_launch_compiler does and provides
- Added ability to set Kokkos_LAUNCH_COMPILER=OFF in downstream projects
- Added kokkos_compiler_is_nvcc to KokkosConfigCommon.cmake.in to avoid using kokkos_launch_compiler if CMAKE_CXX_COMPILER=nvcc_wrapper
- Fixed AND NOT AND NOT
- Created new KCL (Kokkos-Compiler-Launcher) tests
- specify CUDA arch
- Fixed environment to set CXX instead of NVCC_WRAPPER_DEFAULT_COMPILER
@crtrott
Copy link
Member

crtrott commented Sep 22, 2020

The CUDA Arch detection doesn't work without a CUDA compiler. And it looks like downstream you have to set CMAKE_POLICY CMP0057 explicitly down stream? We just worked with some folks in Germany which compile some other raw CUDA as well and this made it possible relatively straight forward. Just those two things didn't work (i.e. we had to set the architecture explicitly - which I think is fine for now and can be subsequently fixed) and we had to explicitly set CMAKE_POLICY CMP0057 in the application cmake file.

@crtrott
Copy link
Member

crtrott commented Sep 23, 2020

Retest this please.

@jrmadsen jrmadsen dismissed jjwilke’s stale review September 23, 2020 19:06

I think this was resolved since we did change the default for the launcher script to not be globally applied

bin/kokkos_launch_compiler Outdated Show resolved Hide resolved
Copy link
Member

@dalg24 dalg24 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this looks good

I moved the testing into the CUDA 11 build

kokkos/.jenkins

Lines 184 to 245 in 6bf9c87

stage('CUDA-11.0-NVCC-C++17-RDC') {
agent {
dockerfile {
filename 'Dockerfile.nvcc'
dir 'scripts/docker'
additionalBuildArgs '--pull --build-arg BASE=nvidia/cuda:11.0-devel --build-arg ADDITIONAL_PACKAGES="g++-8 gfortran" --build-arg CMAKE_VERSION=3.17.3'
label 'nvidia-docker && volta'
args '-v /tmp/ccache.kokkos:/tmp/ccache'
}
}
environment {
OMP_NUM_THREADS = 8
OMP_PLACES = 'threads'
OMP_PROC_BIND = 'spread'
NVCC_WRAPPER_DEFAULT_COMPILER = 'g++-8'
}
steps {
sh 'ccache --zero-stats'
sh '''rm -rf install && mkdir -p install && \
rm -rf build && mkdir -p build && cd build && \
cmake \
-DCMAKE_BUILD_TYPE=Release \
-DCMAKE_CXX_COMPILER=g++-8 \
-DCMAKE_CXX_FLAGS=-Werror \
-DCMAKE_CXX_STANDARD=17 \
-DKokkos_ENABLE_COMPILER_WARNINGS=ON \
-DKokkos_ENABLE_OPENMP=ON \
-DKokkos_ENABLE_CUDA=ON \
-DKokkos_ENABLE_CUDA_LAMBDA=OFF \
-DKokkos_ENABLE_CUDA_UVM=ON \
-DKokkos_ENABLE_CUDA_RELOCATABLE_DEVICE_CODE=ON \
-DKokkos_ARCH_VOLTA70=ON \
-DCMAKE_INSTALL_PREFIX=${PWD}/../install \
.. && \
make -j8 install && \
cd .. && \
rm -rf build-tests && mkdir -p build-tests && cd build-tests && \
export CMAKE_PREFIX_PATH=${PWD}/../install && \
cmake \
-DCMAKE_BUILD_TYPE=Release \
-DCMAKE_CXX_COMPILER_LAUNCHER=ccache \
-DCMAKE_CXX_COMPILER=$WORKSPACE/bin/nvcc_wrapper \
-DCMAKE_CXX_FLAGS=-Werror \
-DCMAKE_CXX_STANDARD=17 \
-DKokkos_INSTALL_TESTING=ON \
.. && \
make -j8 && ctest --output-on-failure && \
cd ../example/build_cmake_installed && \
rm -rf build && mkdir -p build && cd build && \
cmake \
-DCMAKE_CXX_COMPILER=g++-8 \
-DCMAKE_CXX_FLAGS=-Werror \
-DCMAKE_CXX_STANDARD=17 \
.. && \
make -j8 && ctest --output-on-failure'''
}
post {
always {
sh 'ccache --show-stats'
}
}
}

I purposefully use nvcc_wrapper directly in the "install testing" step.

@crtrott crtrott merged commit 7a4dd39 into kokkos:develop Sep 24, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants