Kokkos + KokkosKernels Promotion to 2.8.00 #4329

Merged
ndellingwood merged 7 commits into develop from kokkos-promotion on Feb 6, 2019

Conversation

ndellingwood
Contributor

This promotes Kokkos and KokkosKernels to Version 2.8.00.

@trilinos/kokkos @trilinos/kokkos-kernels

This update includes the following:

Kokkos

Volta fixes, C++14 support and testing, Capability to add environment variables for all command line arguments, Threads RangePolicy fix for offsets, RemoteMemorySpace support

KokkosKernels

BLAS TPL + build system fixes (Complex API, fortran mangling), C++14 support and testing, Batched GETRS


Kokkos Changelog

2.8.00 (2019-02-05)

Full Changelog

Implemented enhancements:

  • Capability, Tests: C++14 support and testing #1914
  • Capability: Add environment variables for all command line arguments #1798
  • Capability: --kokkos-ndevices not working for Slurm #1920
  • View: Undefined behavior when deep copying from and to an empty unmanaged view #1967
  • BuildSystem: nvcc_wrapper should stop immediately if nvcc is not in PATH #1861

Fixed bugs:

  • Cuda: Fix Volta Issues 1 Non-deterministic behavior on Volta, runs fine on Pascal #1949
  • Cuda: Fix Volta Issues 2 CUDA Team Scan gives wrong values on Volta with -G compile flag #1942
  • Cuda: illegal warp sync in parallel_reduce by functor on Turing 75 #1958
  • Threads: Pthreads backend does not handle RangePolicy with offset correctly #1976
  • Atomics: atomic_fetch_oper has no case for Kokkos::complex<double> or other 16-byte types #1951
  • MDRangePolicy: Fix zero-length range #1948
  • TeamThreadRange: TeamThreadRange MaxLoc reduce doesnt compile #1909

KokkosKernels Changelog

2.8.00 (2019-02-05)

Full Changelog

Implemented enhancements:

  • Capability, Tests: C++14 Support and Testing #351
  • Capability: Batched getrs #332
  • More Kernel Labels for KokkosBlas #239
  • Name all parallel kernels and regions #124

Fixed bugs:

  • BLAS TPL: BLAS underscore mangling #369
  • BLAS TPL, Complex: Promotion 2.7.24 broke MV unit tests in Tpetra with complex types #360
  • GEMM: GEMM uses wrong function for computing shared memory allocation size #368
  • BuildSystem: BLAS TPL macro not properly enabled with MKL BLAS #347
  • BuildSystem: make clean - errors #353
  • Compiler Workaround: Internal compiler error in KokkosBatched::Experimental::TeamGemm #349
  • KokkosBlas: Some KokkosBlas kernels assume default execution space #14

Changes to coordinate with kokkos/kokkos#1886.

Various ViewMapping partial specializations now require the final
template parameter to be the specialize tag, rather than explicit void.

These changes ensure Sacado and Stokhos compatibility with the Kokkos changes.

Test with Cuda_Serial build on waterman with cuda/9.2+gcc/7.2

 Changes to be committed:
	modified:   packages/sacado/src/KokkosExp_View_Fad.hpp
	modified:   packages/sacado/src/KokkosExp_View_Fad_Contiguous.hpp
	modified:   packages/sacado/src/Kokkos_DynRankView_Fad.hpp
	modified:   packages/sacado/src/Kokkos_DynRankView_Fad_Contiguous.hpp
	modified:   packages/sacado/test/UnitTests/Fad_KokkosTests_Cuda.cpp
	modified:   packages/stokhos/src/sacado/kokkos/pce/KokkosExp_View_UQ_PCE_Contiguous.hpp
	modified:   packages/stokhos/src/sacado/kokkos/vector/KokkosExp_View_MP_Vector_Contiguous.hpp
Remove additional argument to view ctor.
 Changes to be committed:
	modified:   stokhos/src/sacado/kokkos/Stokhos_Tpetra_Utilities.hpp
…549b57

From repository at git@github.com:kokkos/kokkos.git

At commit:
commit 5d6e7fb38e96aec88d2c514e1f9be1cf2b549b57
Merge: 9614f72 d1659d1
Author: Nathan Ellingwood <ndellin@sandia.gov>
Date:   Tue Feb 5 17:10:27 2019 -0700

    Merge branch 'develop' for 2.8.00

    Part of Kokkos C++ Performance Portability Programming EcoSystem 2.8
…1763439cb56039

From repository at git@github.com:kokkos/kokkos-kernels.git

At commit:
commit 4ee5f3c6dbd0981f6d8c7a9b2b1763439cb56039
Merge: 94456cf 6a79032
Author: Nathan Ellingwood <ndellin@sandia.gov>
Date:   Tue Feb 5 17:13:18 2019 -0700

    Merge branch 'develop' for 2.8.00

    Part of Kokkos C++ Performance Portability Programming EcoSystem 2.8
@trilinos-autotester
Contributor

Status Flag 'Pre-Test Inspection' - Auto Inspected - Inspection Is Not Necessary for this Pull Request.

@ndellingwood
Contributor Author

Cross-referencing the Kokkos promotion 2.8.00 issue, where testing is tracked: kokkos/kokkos#1981

@crtrott @srajama1 @ibaned

Other likely interested parties: @mhoemmen @ZUUL42 @william76

@trilinos-autotester
Contributor

Status Flag 'Pull Request AutoTester' - Testing Jenkins Projects:

Pull Request Auto Testing STARTING

Build Information

Test Name: Trilinos_pullrequest_gcc_4.8.4

  • Build Num: 2376
  • Status: STARTED

Jenkins Parameters

Parameter Name Value
COMPILER_MODULE sems-gcc/4.8.4
JENKINS_BUILD_TYPE Release
JENKINS_COMM_TYPE MPI
JENKINS_DO_COMPLEX OFF
JENKINS_JOB_TYPE Experimental
MPI_MODULE sems-openmpi/1.8.7
PULLREQUESTNUM 4329
TEST_REPO_ALIAS TRILINOS
TRILINOS_SOURCE_BRANCH kokkos-promotion
TRILINOS_SOURCE_REPO https://github.com/trilinos/Trilinos
TRILINOS_SOURCE_SHA c5c8608
TRILINOS_TARGET_BRANCH develop
TRILINOS_TARGET_REPO https://github.com/trilinos/Trilinos
TRILINOS_TARGET_SHA fa48062

Build Information

Test Name: Trilinos_pullrequest_intel_17.0.1

  • Build Num: 2177
  • Status: STARTED

Jenkins Parameters

Parameter Name Value
PULLREQUESTNUM 4329
TEST_REPO_ALIAS TRILINOS
TRILINOS_SOURCE_BRANCH kokkos-promotion
TRILINOS_SOURCE_REPO https://github.com/trilinos/Trilinos
TRILINOS_SOURCE_SHA c5c8608
TRILINOS_TARGET_BRANCH develop
TRILINOS_TARGET_REPO https://github.com/trilinos/Trilinos
TRILINOS_TARGET_SHA fa48062

Build Information

Test Name: Trilinos_pullrequest_gcc_4.9.3_SERIAL

  • Build Num: 668
  • Status: STARTED

Jenkins Parameters

Parameter Name Value
PULLREQUESTNUM 4329
TEST_REPO_ALIAS TRILINOS
TRILINOS_SOURCE_BRANCH kokkos-promotion
TRILINOS_SOURCE_REPO https://github.com/trilinos/Trilinos
TRILINOS_SOURCE_SHA c5c8608
TRILINOS_TARGET_BRANCH develop
TRILINOS_TARGET_REPO https://github.com/trilinos/Trilinos
TRILINOS_TARGET_SHA fa48062

Build Information

Test Name: Trilinos_pullrequest_gcc_7.2.0

  • Build Num: 295
  • Status: STARTED

Jenkins Parameters

Parameter Name Value
PULLREQUESTNUM 4329
TEST_REPO_ALIAS TRILINOS
TRILINOS_SOURCE_BRANCH kokkos-promotion
TRILINOS_SOURCE_REPO https://github.com/trilinos/Trilinos
TRILINOS_SOURCE_SHA c5c8608
TRILINOS_TARGET_BRANCH develop
TRILINOS_TARGET_REPO https://github.com/trilinos/Trilinos
TRILINOS_TARGET_SHA fa48062

Build Information

Test Name: Trilinos_pullrequest_cuda_9.2

  • Build Num: 80
  • Status: STARTED

Jenkins Parameters

Parameter Name Value
JENKINS_JOB_TYPE Experimental
PULLREQUESTNUM 4329
TEST_REPO_ALIAS TRILINOS
TRILINOS_SOURCE_BRANCH kokkos-promotion
TRILINOS_SOURCE_REPO https://github.com/trilinos/Trilinos
TRILINOS_SOURCE_SHA c5c8608
TRILINOS_TARGET_BRANCH develop
TRILINOS_TARGET_REPO https://github.com/trilinos/Trilinos
TRILINOS_TARGET_SHA fa48062

Using Repos:

Repo: TRILINOS (trilinos/Trilinos)
  • Branch: kokkos-promotion
  • SHA: c5c8608
  • Mode: TEST_REPO

Pull Request Author: ndellingwood

Contributor

@srajama1 left a comment

All the individual commits were reviewed on the Kokkos side. The integration tests pass. @ndellingwood Thanks for taking care of this quickly.

@trilinos-autotester
Contributor

Status Flag 'Pull Request AutoTester' - Jenkins Testing: all Jobs PASSED

Pull Request Auto Testing has PASSED

Build Information

Test Name: Trilinos_pullrequest_gcc_4.8.4

  • Build Num: 2376
  • Status: PASSED

Jenkins Parameters

Parameter Name Value
COMPILER_MODULE sems-gcc/4.8.4
JENKINS_BUILD_TYPE Release
JENKINS_COMM_TYPE MPI
JENKINS_DO_COMPLEX OFF
JENKINS_JOB_TYPE Experimental
MPI_MODULE sems-openmpi/1.8.7
PULLREQUESTNUM 4329
TEST_REPO_ALIAS TRILINOS
TRILINOS_SOURCE_BRANCH kokkos-promotion
TRILINOS_SOURCE_REPO https://github.com/trilinos/Trilinos
TRILINOS_SOURCE_SHA c5c8608
TRILINOS_TARGET_BRANCH develop
TRILINOS_TARGET_REPO https://github.com/trilinos/Trilinos
TRILINOS_TARGET_SHA fa48062

Build Information

Test Name: Trilinos_pullrequest_intel_17.0.1

  • Build Num: 2177
  • Status: PASSED

Jenkins Parameters

Parameter Name Value
PULLREQUESTNUM 4329
TEST_REPO_ALIAS TRILINOS
TRILINOS_SOURCE_BRANCH kokkos-promotion
TRILINOS_SOURCE_REPO https://github.com/trilinos/Trilinos
TRILINOS_SOURCE_SHA c5c8608
TRILINOS_TARGET_BRANCH develop
TRILINOS_TARGET_REPO https://github.com/trilinos/Trilinos
TRILINOS_TARGET_SHA fa48062

Build Information

Test Name: Trilinos_pullrequest_gcc_4.9.3_SERIAL

  • Build Num: 668
  • Status: PASSED

Jenkins Parameters

Parameter Name Value
PULLREQUESTNUM 4329
TEST_REPO_ALIAS TRILINOS
TRILINOS_SOURCE_BRANCH kokkos-promotion
TRILINOS_SOURCE_REPO https://github.com/trilinos/Trilinos
TRILINOS_SOURCE_SHA c5c8608
TRILINOS_TARGET_BRANCH develop
TRILINOS_TARGET_REPO https://github.com/trilinos/Trilinos
TRILINOS_TARGET_SHA fa48062

Build Information

Test Name: Trilinos_pullrequest_gcc_7.2.0

  • Build Num: 295
  • Status: PASSED

Jenkins Parameters

Parameter Name Value
PULLREQUESTNUM 4329
TEST_REPO_ALIAS TRILINOS
TRILINOS_SOURCE_BRANCH kokkos-promotion
TRILINOS_SOURCE_REPO https://github.com/trilinos/Trilinos
TRILINOS_SOURCE_SHA c5c8608
TRILINOS_TARGET_BRANCH develop
TRILINOS_TARGET_REPO https://github.com/trilinos/Trilinos
TRILINOS_TARGET_SHA fa48062

Build Information

Test Name: Trilinos_pullrequest_cuda_9.2

  • Build Num: 80
  • Status: PASSED

Jenkins Parameters

Parameter Name Value
JENKINS_JOB_TYPE Experimental
PULLREQUESTNUM 4329
TEST_REPO_ALIAS TRILINOS
TRILINOS_SOURCE_BRANCH kokkos-promotion
TRILINOS_SOURCE_REPO https://github.com/trilinos/Trilinos
TRILINOS_SOURCE_SHA c5c8608
TRILINOS_TARGET_BRANCH develop
TRILINOS_TARGET_REPO https://github.com/trilinos/Trilinos
TRILINOS_TARGET_SHA fa48062


CDash Test Results for PR# 4329.

@trilinos-autotester
Contributor

Status Flag 'Pre-Merge Inspection' - SUCCESS: The last commit to this Pull Request has been INSPECTED AND APPROVED by [ srajama1 ]!

@trilinos-autotester
Contributor

Status Flag 'Pull Request AutoTester' - AutoMerge IS ENABLED, but the Label AT: AUTOMERGE is not set. Either set Label AT: AUTOMERGE or manually merge the PR...

@ndellingwood
Contributor Author

Testing results from kokkos/kokkos#1981

Testing 1

  • Testing with scripts from kokkos-kernels/scripts/trilinos_integration

White Cuda Kepler Nodes

Trilinos pristine: develop branch SHA 80e2cb3
Trilinos updated: kokkos-promotion branch SHA 80e2cb3 (before snapshot)
kokkos develop branch SHA kokkos/kokkos@534b7f3
kokkos-kernels develop SHA kokkos/kokkos-kernels@52a6ba2

Summary

Compilation - No new build failures
Tests - One new failure
Stokhos_TpetraCrsMatrixMPVectorUnitTest_Serial_MPI_4

This test failure was fixed by PR kokkos/kokkos-kernels#381, which is patterned after an existing fix in Trilinos that tests this case.

Results of a rerun with that updated:

bash-4.2$ ctest -R Stokhos_TpetraCrsMatrixMPVectorUnitTest_Serial_MPI_4
Test project /ascldap/users/ndellin/IntegrationTests-White/cuda-kepler/build_updated
    Start 1683: Stokhos_TpetraCrsMatrixMPVectorUnitTest_Serial_MPI_4
1/1 Test #1683: Stokhos_TpetraCrsMatrixMPVectorUnitTest_Serial_MPI_4 ...   Passed    4.80 sec

White OpenMP

Trilinos pristine: develop branch SHA 80e2cb3
Trilinos updated: kokkos-promotion branch SHA 80e2cb3 (before snapshot)
kokkos develop branch SHA kokkos/kokkos@534b7f3
kokkos-kernels develop SHA kokkos/kokkos-kernels@52a6ba2

Summary

Compilation - No new build failures
Tests - No new test failures


White Cuda + Complex_Double Kepler Nodes

Trilinos pristine: develop branch SHA ac4762e
Trilinos updated: kokkos-promotion branch SHA 80e2cb3 (before snapshot)
kokkos develop branch SHA kokkos/kokkos@534b7f3
kokkos-kernels develop SHA kokkos/kokkos-kernels@67af7da

Summary

Compilation - No new build failures
Tests - No new test failures


Blake Serial

Trilinos pristine: develop branch SHA 80e2cb3
Trilinos updated: kokkos-promotion branch SHA 80e2cb3 (before snapshot)
kokkos develop branch SHA kokkos/kokkos@534b7f3
kokkos-kernels develop SHA kokkos/kokkos-kernels@52a6ba2

Summary

Compilation - No new build failures
Tests - One new failure
Stokhos_TpetraCrsMatrixMPVectorUnitTest_Serial_MPI_4

This test failure was fixed by PR kokkos/kokkos-kernels#381, which is patterned after an existing fix in Trilinos that tests this case.


Blake Pthreads

Trilinos pristine: develop branch SHA 80e2cb3
Trilinos updated: kokkos-promotion branch SHA 80e2cb3 (before snapshot)
kokkos develop branch SHA kokkos/kokkos@534b7f3
kokkos-kernels develop SHA kokkos/kokkos-kernels@52a6ba2

Summary

Compilation - No new build failures
Tests - One new failure
Stokhos_TpetraCrsMatrixMPVectorUnitTest_Serial_MPI_4

This test failure was fixed by PR kokkos/kokkos-kernels#381, which is patterned after an existing fix in Trilinos that tests this case.


Testing 2

  • Various tests using ATDM configurations and packages highly impacted by Kokkos modifications

White cuda-9.2-opt-Kepler37

Build 1: Updated Trilinos, deprecated code disabled

Trilinos updated: kokkos-promotion branch SHA 80e2cb3 (before snapshot)
kokkos develop branch SHA kokkos/kokkos@534b7f3
kokkos-kernels develop SHA kokkos/kokkos-kernels@67af7da

Reproducer instructions

module purge
source ${TRILINOS_DIR}/cmake/std/atdm/load-env.sh cuda-9.2-opt-Kepler37
module swap cmake/3.9.6 cmake/3.12.3

cmake \
 -GNinja \
 -DTrilinos_CONFIGURE_OPTIONS_FILE:STRING=cmake/std/atdm/ATDMDevEnv.cmake \
 -DTrilinos_ENABLE_TESTS=ON \
  -DKOKKOS_ENABLE_DEPRECATED_CODE:BOOL=OFF \
  -DKokkos_SOURCE_DIR_OVERRIDE:STRING=kokkos \
  -DKokkosKernels_SOURCE_DIR_OVERRIDE:STRING=kokkos-kernels \
  -DTrilinos_ENABLE_MueLu=ON \
    -DMueLu_ENABLE_Experimental:BOOL=ON \
    -DMueLu_ENABLE_Kokkos_Refactor:BOOL=ON \
    -DMueLu_ENABLE_Epetra:BOOL=OFF \
    -DXpetra_ENABLE_Experimental:BOOL=ON \
    -DMueLu_ENABLE_TESTS:BOOL=OFF \
    -DMueLu_ENABLE_EXAMPLES:BOOL=OFF \
  -DTrilinos_ENABLE_Tpetra=ON \
  -DTrilinos_ENABLE_Sacado=ON \
  -DTrilinos_ENABLE_Intrepid2=ON \
  -DTrilinos_ENABLE_Panzer=ON \
$TRILINOS_DIR

Summary

Compilation - No build failures
Tests - 4 new failures (all timeouts); each passed when rerun serially with ctest (no -j)

        867 - PanzerAdaptersSTK_STK_VolumeSideResponse_MPI_2 (Timeout) #NEW - passed with ctest with no -j
        868 - PanzerAdaptersSTK_ip_coordinates_MPI_2 (Timeout) #NEW - passed with ctest with no -j
        874 - PanzerAdaptersSTK_tDomainInterface_MPI_2 (Timeout) #NEW - passed with ctest with no -j
        875 - PanzerAdaptersSTK_node_normals_MPI_2 (Timeout) #NEW - passed with ctest with no -j
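
A minimal sketch of how the four timed-out tests listed above could be rerun serially. It only prints the command instead of running it, so it can be tried outside a build directory. The helper name build_ctest_regex is an assumption made for illustration and is not part of any Trilinos or Kokkos script. The sketch relies on ctest's -R option taking a regular expression and on ctest running tests one at a time when -j is not given.

```shell
#!/bin/sh
# Join the given test names into an anchored alternation: ^(a|b|c)$
# The function body is a subshell, so the IFS change does not leak out.
build_ctest_regex() (
  IFS='|'
  printf '^(%s)$' "$*"
)

# Test names copied from the timeout list in this comment.
regex="$(build_ctest_regex \
  PanzerAdaptersSTK_STK_VolumeSideResponse_MPI_2 \
  PanzerAdaptersSTK_ip_coordinates_MPI_2 \
  PanzerAdaptersSTK_tDomainInterface_MPI_2 \
  PanzerAdaptersSTK_node_normals_MPI_2)"

# Dry run: print the command. No -j flag, so the tests would run serially.
echo "ctest -R '${regex}' --output-on-failure"
```

Anchoring the expression with ^ and $ keeps ctest from also selecting other tests whose names merely contain one of these strings.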

Several pre-existing Panzer failures occur when deprecated code is disabled (they also occur in builds with the Trilinos VOTD develop branch). The failures indicate view(s) constructed with more arguments than the dynamic rank.


White cuda-9.2-debug-Kepler37

Build 2: Updated Trilinos, deprecated code disabled, debugging on

Trilinos updated: kokkos-promotion branch SHA 80e2cb3 (before snapshot)
kokkos develop branch SHA kokkos/kokkos@534b7f3
kokkos-kernels develop SHA kokkos/kokkos-kernels@67af7da

Reproducer instructions

module purge
source $TRILINOS_DIR/cmake/std/atdm/load-env.sh cuda-9.2-debug-Kepler37
module swap cmake/3.9.6 cmake/3.12.3

cmake \
 -GNinja \
 -DTrilinos_CONFIGURE_OPTIONS_FILE:STRING=cmake/std/atdm/ATDMDevEnv.cmake \
 -DTrilinos_ENABLE_TESTS=ON \
  -DTrilinos_ENABLE_Tpetra=ON \
  -DTrilinos_ENABLE_Sacado=ON \
  -DTrilinos_ENABLE_Stokhos=ON \
  -DKOKKOS_ENABLE_DEPRECATED_CODE=OFF \
  -DKokkos_SOURCE_DIR_OVERRIDE:STRING=kokkos \
  -DKokkosKernels_SOURCE_DIR_OVERRIDE:STRING=kokkos-kernels \
$TRILINOS_DIR

Summary

Compilation - No build failures
Tests - No new failures

Several existing Panzer failures when deprecated code disabled (View constructed with more arguments than dynamic rank)

Same as above


Waterman cuda-9.2-opt (Volta)

Build: Updated Trilinos, deprecated code disabled, debugging on

Trilinos updated: kokkos-promotion branch SHA 80e2cb3 (before snapshot)
kokkos develop branch SHA kokkos/kokkos@534b7f3
kokkos-kernels develop SHA kokkos/kokkos-kernels@67af7da

Reproducer instructions

module purge
source ${TRILINOS_DIR}/cmake/std/atdm/load-env.sh cuda-9.2-opt
module swap cmake/3.6.2 cmake/3.12.3

PACKAGE1=Tpetra
PACKAGE2=Sacado
PACKAGE3=Stokhos
PACKAGE4=MueLu
PACKAGE5=Intrepid2
PACKAGE6=Ifpack2
PACKAGE7=Panzer
PACKAGE8=Phalanx
PACKAGE9=Stratimikos
PACKAGE10=Belos

cmake \
 -GNinja \
 -DTrilinos_CONFIGURE_OPTIONS_FILE:STRING=cmake/std/atdm/ATDMDevEnv.cmake \
 -DTrilinos_ENABLE_TESTS=ON \
  -DTrilinos_ENABLE_${PACKAGE1}=ON \
  -DTrilinos_ENABLE_${PACKAGE2}=ON \
  -DTrilinos_ENABLE_${PACKAGE3}=ON \
  -DTrilinos_ENABLE_${PACKAGE4}=ON \
  -DTrilinos_ENABLE_${PACKAGE5}=ON \
  -DTrilinos_ENABLE_${PACKAGE6}=ON \
  -DTrilinos_ENABLE_${PACKAGE7}=ON \
  -DTrilinos_ENABLE_${PACKAGE8}=ON \
  -DTrilinos_ENABLE_${PACKAGE9}=ON \
  -DTrilinos_ENABLE_${PACKAGE10}=ON \
  -DKOKKOS_ENABLE_DEPRECATED_CODE=OFF \
  -DKokkos_SOURCE_DIR_OVERRIDE:STRING=kokkos \
  -DKokkosKernels_SOURCE_DIR_OVERRIDE:STRING=kokkos-kernels \
$TRILINOS_DIR
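
The numbered PACKAGE1 through PACKAGE10 variables above could also be generated from a single list. This is only a sketch of that alternative, not the script that was used for the testing reported here; the function name enable_flags and the packages variable are assumptions made for illustration.

```shell
#!/bin/sh
# Print one -DTrilinos_ENABLE_<pkg>=ON flag per package name given.
enable_flags() {
  for pkg in "$@"; do
    printf '%s\n' "-DTrilinos_ENABLE_${pkg}=ON"
  done
}

# Same ten packages as in the reproducer instructions above.
packages="Tpetra Sacado Stokhos MueLu Intrepid2 Ifpack2 Panzer Phalanx Stratimikos Belos"

# Word splitting of $packages is intentional: package names contain no spaces.
enable_flags $packages
```

The printed flags can then be substituted into the cmake call, for example with $(enable_flags $packages) in place of the ten explicit -DTrilinos_ENABLE_${PACKAGEn}=ON lines.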

Summary

Compilation - Build failures in Ifpack2 (deprecated code disabled) fixed in kokkos-promotion branch
Tests - No new failures

Several existing Panzer failures when deprecated code disabled (View constructed with more arguments than dynamic rank)

Same as above


Blake Pthreads Retesting

With PR kokkos/kokkos#1980

Trilinos pristine: develop branch SHA 967ff0a
Trilinos updated: kokkos-promotion branch SHA 05a91ff (merge commit b858cf8)
kokkos issue-1976b branch SHA kokkos/kokkos@f782e0a
kokkos-kernels develop SHA kokkos/kokkos-kernels@67af7da

Summary

No new build failures
No new test failures

Existing test failures for reference:

MueLu_ImportPerformance_Tpetra_MPI_4 (Failed)
MueLu_Q2Q1-Tpetra_MPI_1 (Failed)
ROL_adapters_tpetra_test_sol_TpetraSimulatedConstraintInterfaceCVaR_MPI_4 (Failed)
Stratimikos_test_single_belos_thyra_solver_driver_nos1_nrhs8_MPI_1 (Failed)
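
For reference, a minimal sketch of how "new" failures can be separated from existing ones by comparing the failing-test lists of the pristine and updated builds. The file names, the new_failures helper, and the extra test name Hypothetical_NewFailure_MPI_1 are all assumptions made for illustration; the scripts in kokkos-kernels/scripts/trilinos_integration may do this differently.

```shell
#!/bin/sh
# Print test names that fail in the updated build but not in the pristine one.
# $1: file with pristine failures, $2: file with updated failures,
# one test name per line. The body is a subshell, so variables stay local.
new_failures() (
  pristine_sorted="$(mktemp)"
  updated_sorted="$(mktemp)"
  LC_ALL=C sort -u "$1" > "$pristine_sorted"
  LC_ALL=C sort -u "$2" > "$updated_sorted"
  # comm -13 suppresses columns 1 and 3, leaving lines unique to the second file.
  LC_ALL=C comm -13 "$pristine_sorted" "$updated_sorted"
  rm -f "$pristine_sorted" "$updated_sorted"
)

workdir="$(mktemp -d)"

# Pristine list: the existing failures named above.
printf '%s\n' \
  MueLu_ImportPerformance_Tpetra_MPI_4 \
  MueLu_Q2Q1-Tpetra_MPI_1 \
  ROL_adapters_tpetra_test_sol_TpetraSimulatedConstraintInterfaceCVaR_MPI_4 \
  Stratimikos_test_single_belos_thyra_solver_driver_nos1_nrhs8_MPI_1 \
  > "$workdir/pristine.txt"

# Updated list: the same failures plus one made-up entry (illustration only).
cp "$workdir/pristine.txt" "$workdir/updated.txt"
printf '%s\n' Hypothetical_NewFailure_MPI_1 >> "$workdir/updated.txt"

new_failures "$workdir/pristine.txt" "$workdir/updated.txt"
rm -rf "$workdir"
```

An empty result corresponds to the "No new test failures" summary above.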

@ndellingwood
Contributor Author

PR testing successful, merging!

@ndellingwood merged commit 9573120 into develop on Feb 6, 2019
@ndellingwood deleted the kokkos-promotion branch on February 6, 2019 17:15