Add BabelStream flavors as thrust::transform benchmarks #1921
🟨 CI finished in 1h 31m: Pass: 99%/249 | Total: 1d 07h | Avg: 7m 30s | Max: 58m 28s | Hits: 99%/247587
| | Project |
|---|---|
| | CCCL Infrastructure |
| | libcu++ |
| | CUB |
| +/- | Thrust |
| | CUDA Experimental |

Modifications in project or dependencies?

| | Project |
|---|---|
| | CCCL Infrastructure |
| | libcu++ |
| +/- | CUB |
| +/- | Thrust |
| | CUDA Experimental |

🏃 Runner counts (total jobs: 249)

| # | Runner |
|---|---|
| 178 | linux-amd64-cpu16 |
| 40 | linux-amd64-gpu-v100-latest-1 |
| 16 | linux-arm64-cpu16 |
| 15 | windows-amd64-cpu16 |
@bernhardmgruber @gevtushenko idea: what do you think about adding a We may want to have multiple benchmarks for a downstream that cover different algorithms, and keeping those in one place might make maintenance easier, especially if some downstream benchmarks end up mixing multiple algorithms with
I would like that! I would then add the full set of BabelStream benchmarks, which also contain
We discussed this PR with @gevtushenko and concluded that the babelstream kernels are just variants of
Force-pushed from 2e58498 to 5fcfef7.
🟨 CI finished in 3h 51m: Pass: 93%/249 | Total: 3d 00h | Avg: 17m 35s | Max: 46m 43s | Hits: 87%/231914
🟩 CI finished in 3h 51m: Pass: 100%/249 | Total: 3d 00h | Avg: 17m 35s | Max: 46m 43s | Hits: 87%/248439
Regarding the tested element types, BabelStream uses @ahendriksen uses Alternatively, we could use
I like this idea.
Force-pushed from ccebd04 to 2aa0bd4.
See BabelStream Thrust implementation: https://github.com/UoB-HPC/BabelStream/blob/main/src/thrust/ThrustStream.cu Co-authored-by: Georgii Evtushenko <evtushenko.georgy@gmail.com>
Force-pushed from 2aa0bd4 to 737f380.
🟨 CI finished in 3h 58m: Pass: 99%/249 | Total: 1d 10h | Avg: 8m 23s | Max: 28m 57s | Hits: 98%/247587
| | Project |
|---|---|
| | CCCL Infrastructure |
| | libcu++ |
| +/- | CUB |
| +/- | Thrust |
| | CUDA Experimental |

Modifications in project or dependencies?

| | Project |
|---|---|
| | CCCL Infrastructure |
| | libcu++ |
| +/- | CUB |
| +/- | Thrust |
| | CUDA Experimental |
🟩 CI finished in 5h 28m: Pass: 100%/249 | Total: 1d 11h | Avg: 8m 31s | Max: 36m 34s | Hits: 98%/248439
The Thrust implementation of BabelStream uses a few versions of `thrust::transform`. Given the importance of this public benchmark, we should add these uses of `thrust::transform` (mul, add, triad and nstream) to our benchmarks as well. The copy and dot benchmarks are covered by existing Thrust benchmarks.
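
For readers unfamiliar with BabelStream, the four kernels named above can be sketched as element-wise transforms. The following is a minimal host-only illustration, not the code added by this PR: it uses `std::transform` as a stand-in for `thrust::transform` (which offers the same unary and binary overloads), and the function names `stream_mul`, `stream_add`, `stream_triad` and `stream_nstream` are hypothetical. The kernel definitions follow the usual BabelStream convention: mul is `b = scalar * c`, add is `c = a + b`, triad is `a = b + scalar * c`, and nstream is `a += b + scalar * c`.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

// Host-only sketch: std::transform stands in for thrust::transform.
// All function names below are hypothetical and not taken from the PR.

// mul: b[i] = scalar * c[i] (one input, unary transform)
template <class T>
void stream_mul(const std::vector<T>& c, std::vector<T>& b, T scalar)
{
  assert(b.size() == c.size());
  std::transform(c.begin(), c.end(), b.begin(),
                 [scalar](const T& ci) { return scalar * ci; });
}

// add: c[i] = a[i] + b[i] (two inputs, binary transform)
template <class T>
void stream_add(const std::vector<T>& a, const std::vector<T>& b, std::vector<T>& c)
{
  assert(a.size() == b.size() && b.size() == c.size());
  std::transform(a.begin(), a.end(), b.begin(), c.begin(),
                 [](const T& ai, const T& bi) { return ai + bi; });
}

// triad: a[i] = b[i] + scalar * c[i] (two inputs, binary transform)
template <class T>
void stream_triad(std::vector<T>& a, const std::vector<T>& b,
                  const std::vector<T>& c, T scalar)
{
  assert(a.size() == b.size() && b.size() == c.size());
  std::transform(b.begin(), b.end(), c.begin(), a.begin(),
                 [scalar](const T& bi, const T& ci) { return bi + scalar * ci; });
}

// nstream: a[i] += b[i] + scalar * c[i] (three inputs, one of them also the output).
// std::transform has no three-input overload, so this sketch transforms an index
// range instead and looks the operands up by index.
template <class T>
void stream_nstream(std::vector<T>& a, const std::vector<T>& b,
                    const std::vector<T>& c, T scalar)
{
  assert(a.size() == b.size() && b.size() == c.size());
  std::vector<std::size_t> idx(a.size());
  std::iota(idx.begin(), idx.end(), std::size_t{0});
  std::transform(idx.begin(), idx.end(), a.begin(),
                 [&a, &b, &c, scalar](std::size_t i) { return a[i] + b[i] + scalar * c[i]; });
}
```

In a Thrust version, the index vector in `stream_nstream` would typically be replaced by a `thrust::counting_iterator` or by zipping the inputs with `thrust::make_zip_iterator`, so that no extra memory traffic distorts the bandwidth measurement.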