Clean up benchmarks/gups, add permuted index mode #6378

Merged
crtrott merged 2 commits into kokkos:develop on Sep 26, 2023

Conversation

cwpearson
Contributor

  • Use the default execution space / memory space
  • Use the C++ standard library where possible
  • Enable when Kokkos_ENABLE_BENCHMARKS is set
    • Convert from Makefile -> CMake

Sample executions

$ ./Kokkos_gups
-------------------------------------------------------------
Kokkos GUPS Benchmark
-------------------------------------------------------------
Kokkos::Cuda::initialize WARNING: running kernels compiled for compute capability 8.0 on device with compute capability 8.6 , this will likely reduce potential performance.
Reports fastest timing per kernel
Creating Views...
Memory Sizes:
- Elements:             33554432 (    268.4355 MB)
- Indices:                  8192 (      0.0655 MB)
 - Atomics:                   No
Benchmark kernels will be performed for 10 iterations.
-------------------------------------------------------------
Initializing Views...
Starting benchmarking...
-------------------------------------------------------------
GUP/s Random:                0.193954
-------------------------------------------------------------
$ ./Kokkos_gups --atomics
-------------------------------------------------------------
Kokkos GUPS Benchmark
-------------------------------------------------------------
Kokkos::Cuda::initialize WARNING: running kernels compiled for compute capability 8.0 on device with compute capability 8.6 , this will likely reduce potential performance.
Reports fastest timing per kernel
Creating Views...
Memory Sizes:
- Elements:             33554432 (    268.4355 MB)
- Indices:                  8192 (      0.0655 MB)
 - Atomics:                  Yes
Benchmark kernels will be performed for 10 iterations.
-------------------------------------------------------------
Initializing Views...
Starting benchmarking...
-------------------------------------------------------------
GUP/s Random:                0.188255
-------------------------------------------------------------
$ ./Kokkos_gups --pattern-permutation
-------------------------------------------------------------
Kokkos GUPS Benchmark
-------------------------------------------------------------
Kokkos::Cuda::initialize WARNING: running kernels compiled for compute capability 8.0 on device with compute capability 8.6 , this will likely reduce potential performance.
Reports fastest timing per kernel
Creating Views...
Memory Sizes:
- Elements:             33554432 (    268.4355 MB)
- Indices:                  8192 (      0.0655 MB)
 - Atomics:                   No
Benchmark kernels will be performed for 10 iterations.
-------------------------------------------------------------
Initializing Views...
Starting benchmarking...
-------------------------------------------------------------
GUP/s Random:                0.141398
-------------------------------------------------------------
$ ./Kokkos_gups --pattern-permutation --atomics
-------------------------------------------------------------
Kokkos GUPS Benchmark
-------------------------------------------------------------
Kokkos::Cuda::initialize WARNING: running kernels compiled for compute capability 8.0 on device with compute capability 8.6 , this will likely reduce potential performance.
Reports fastest timing per kernel
Creating Views...
Memory Sizes:
- Elements:             33554432 (    268.4355 MB)
- Indices:                  8192 (      0.0655 MB)
 - Atomics:                  Yes
Benchmark kernels will be performed for 10 iterations.
-------------------------------------------------------------
Initializing Views...
Starting benchmarking...
-------------------------------------------------------------
GUP/s Random:                0.139726
-------------------------------------------------------------
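
For readers interpreting the "fastest timing per kernel" and GUP/s lines above, here is a minimal sketch of how such a figure could be derived with the C++ standard library. It assumes GUP/s means updates performed divided by the fastest elapsed time, scaled by 1e9; the exact formula used in gups.cpp is not shown in this conversation. The names `fastest_seconds` and `gups` are hypothetical helpers, not part of Kokkos.

```cpp
#include <algorithm>
#include <chrono>
#include <cstddef>
#include <limits>

// Hypothetical helper: run `kernel` `repeats` times and return the fastest
// wall-clock time in seconds. On a device backend the kernel would also need
// to be fenced before the clock is stopped; that is omitted in this host sketch.
template <class Kernel>
double fastest_seconds(Kernel&& kernel, int repeats) {
  using clock = std::chrono::steady_clock;
  double best = std::numeric_limits<double>::infinity();
  for (int r = 0; r < repeats; ++r) {
    const auto start = clock::now();
    kernel();
    const std::chrono::duration<double> elapsed = clock::now() - start;
    best = std::min(best, elapsed.count());
  }
  return best;
}

// Assumed definition: giga-updates per second = updates / seconds / 1e9.
double gups(std::size_t updates, double seconds) {
  return static_cast<double>(updates) / seconds / 1.0e9;
}
```

Under this assumed definition, a kernel that performs one update per index would be reported as `gups(indexCount, fastest_seconds(kernel, repeats))`, where `indexCount` and `kernel` are placeholders for the benchmark's own values.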

@simongdg

int64_t data = 33554432;
int64_t repeats = 10;
bool useAtomics = false;
AccessPattern pattern = AccessPattern::random;
Member

Should the default be either atomics = false with pattern = permute, or atomics = true with pattern = random?

That would avoid race conditions in the default mode.

Contributor Author

The RandomAccess benchmark in the HPC Challenge tolerates some amount of errors, so it might not be surprising to people if there are race conditions in our default GUPS execution. I'm happy to change it though if anyone has a strong opinion.
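
To illustrate why the access pattern matters for races, here is a minimal host-side sketch using only the C++ standard library. It is not taken from gups.cpp; `random_indices`, `permuted_indices`, and `all_unique` are hypothetical helpers. Uniformly random indices can repeat, so two concurrent non-atomic updates may hit the same element, whereas indices taken from a permutation are distinct by construction.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <numeric>
#include <random>
#include <unordered_set>
#include <vector>

// Hypothetical helper: draw `count` indices uniformly from [0, dataSize).
// Duplicates are possible, so non-atomic parallel updates may race.
std::vector<std::int64_t> random_indices(std::size_t count,
                                         std::int64_t dataSize,
                                         std::uint64_t seed) {
  assert(dataSize > 0);
  std::mt19937_64 gen(seed);
  std::uniform_int_distribution<std::int64_t> dist(0, dataSize - 1);
  std::vector<std::int64_t> idx(count);
  for (auto& i : idx) i = dist(gen);
  return idx;
}

// Hypothetical helper: shuffle 0..dataSize-1 and keep the first `count`
// entries. Every index appears at most once, so updates cannot collide.
std::vector<std::int64_t> permuted_indices(std::size_t count,
                                           std::int64_t dataSize,
                                           std::uint64_t seed) {
  assert(dataSize > 0);
  std::vector<std::int64_t> all(static_cast<std::size_t>(dataSize));
  std::iota(all.begin(), all.end(), std::int64_t{0});
  std::mt19937_64 gen(seed);
  std::shuffle(all.begin(), all.end(), gen);
  all.resize(std::min<std::size_t>(count, all.size()));
  return all;
}

// Returns true when no index occurs more than once.
bool all_unique(const std::vector<std::int64_t>& idx) {
  std::unordered_set<std::int64_t> seen(idx.begin(), idx.end());
  return seen.size() == idx.size();
}
```

By the pigeonhole principle, `random_indices(1000, 10, seed)` must contain repeats for any seed, while `all_unique(permuted_indices(...))` always holds. This sketch materializes the full permutation on the host, which may differ from how the benchmark builds its index view.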

cmake/kokkos_tribits.cmake
@@ -51,6 +51,7 @@ MACRO(KOKKOS_PROCESS_SUBPACKAGES)
ADD_SUBDIRECTORY(simd)
if (NOT KOKKOS_HAS_TRILINOS)
ADD_SUBDIRECTORY(example)
ADD_SUBDIRECTORY(benchmarks)
Member

Fine, but just noting that we can refactor this in the future.
Instead of defining these KOKKOS_ADD_*_DIRECTORIES macros, we could just guard adding the directories here.

@cwpearson
Contributor Author

$ ./Kokkos_gups 
-------------------------------------------------------------
Kokkos GUPS Benchmark
-------------------------------------------------------------
Kokkos::Cuda::initialize WARNING: running kernels compiled for compute capability 8.0 on device with compute capability 8.6 , this will likely reduce potential performance.
Reports fastest timing per kernel
Creating Views...
Memory Sizes:
- Elements:             33554432 (    268.4355 MB)
- Indices:                  8192 (      0.0328 MB)
 - Atomics:                   No
Benchmark kernels will be performed for 10 iterations.
-------------------------------------------------------------
Initializing Data...
Starting benchmarking...
-------------------------------------------------------------
GUP/s Random:                0.482265
-------------------------------------------------------------
$ ./Kokkos_gups --atomics
-------------------------------------------------------------
Kokkos GUPS Benchmark
-------------------------------------------------------------
Kokkos::Cuda::initialize WARNING: running kernels compiled for compute capability 8.0 on device with compute capability 8.6 , this will likely reduce potential performance.
Reports fastest timing per kernel
Creating Views...
Memory Sizes:
- Elements:             33554432 (    268.4355 MB)
- Indices:                  8192 (      0.0328 MB)
 - Atomics:                  Yes
Benchmark kernels will be performed for 10 iterations.
-------------------------------------------------------------
Initializing Data...
Starting benchmarking...
-------------------------------------------------------------
GUP/s Random:                0.460090
-------------------------------------------------------------
$ ./Kokkos_gups --pattern-permutation
-------------------------------------------------------------
Kokkos GUPS Benchmark
-------------------------------------------------------------
Kokkos::Cuda::initialize WARNING: running kernels compiled for compute capability 8.0 on device with compute capability 8.6 , this will likely reduce potential performance.
Reports fastest timing per kernel
Creating Views...
Memory Sizes:
- Elements:             33554432 (    268.4355 MB)
- Indices:                  8192 (      0.0328 MB)
 - Atomics:                   No
Benchmark kernels will be performed for 10 iterations.
-------------------------------------------------------------
Initializing Data...
Starting benchmarking...
-------------------------------------------------------------
GUP/s Random:                0.237500
-------------------------------------------------------------
$ ./Kokkos_gups --pattern-permutation --atomics
-------------------------------------------------------------
Kokkos GUPS Benchmark
-------------------------------------------------------------
Kokkos::Cuda::initialize WARNING: running kernels compiled for compute capability 8.0 on device with compute capability 8.6 , this will likely reduce potential performance.
Reports fastest timing per kernel
Creating Views...
Memory Sizes:
- Elements:             33554432 (    268.4355 MB)
- Indices:                  8192 (      0.0328 MB)
 - Atomics:                  Yes
Benchmark kernels will be performed for 10 iterations.
-------------------------------------------------------------
Initializing Data...
Starting benchmarking...
-------------------------------------------------------------
GUP/s Random:                0.260215
-------------------------------------------------------------

* Use stdlib for time and random numbers
* Use Kokkos default spaces rather than explicit spaces
* Add a permutation mode
@cwpearson
Contributor Author

I made the optional changes and squashed into 2 commits

Contributor

@masterleinad masterleinad left a comment

Fine with me.

@crtrott
Member

crtrott commented Sep 26, 2023

Retest this please.

@crtrott crtrott merged commit 9250328 into kokkos:develop Sep 26, 2023
27 of 28 checks passed
@crtrott
Member

crtrott commented Sep 26, 2023

Failed tests due to disk full.

@cwpearson cwpearson mentioned this pull request Sep 26, 2023
4 participants