Use TeamVectorRange for filling tiles in BruteForce implementation #616
I'd like to see some performance data for this patch.
Some data from my workstation (GTX1070):
$ for i in master_6794807d branch_45b6801c; do ./ArborX_BruteForce_$i --predicates 100 --primitives 100 | grep 'Time BF'; done
Time BF: 0.000546
Time BF: 0.000478
$ for i in master_6794807d branch_45b6801c; do ./ArborX_BruteForce_$i --predicates 500 --primitives 500| grep 'Time BF'; done
Time BF: 0.001253
Time BF: 0.000986
$ for i in master_6794807d branch_45b6801c; do ./ArborX_BruteForce_$i --predicates 10000 --primitives 10000| grep 'Time BF'; done
Time BF: 0.030061
Time BF: 0.024904
$ for i in master_6794807d branch_45b6801c; do ./ArborX_BruteForce_$i --predicates 50000 --primitives 50000| grep 'Time BF'; done
Time BF: 0.598086
Time BF: 0.514194
I similarly see
on |
Are these GPU-only results?
Yes, it's the default execution space, so |
My workstation (Intel E5-2620):
$ for k in 100 500 1000 5000 10000; do \
for i in master_host_6794807d branch_host_45b6801c; do \
./ArborX_BruteForce_$i --predicates $k --primitives $k | grep 'Time BF'; \
done; \
done
Time BF: 0.000262
Time BF: 0.000251
Time BF: 0.005522
Time BF: 0.005494
Time BF: 0.022902
Time BF: 0.021482
Time BF: 0.603481
Time BF: 0.571992
Time BF: 2.438251
Time BF: 2.291969
So it is at least as fast on host.
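To make the comparison easier to read, the timings quoted above can be turned into relative improvements. The sketch below is only an illustration: the struct and function names are hypothetical, the numbers are copied from the two comments (master printed first, branch second, following the loop order), and the unit of `Time BF` is not stated in the thread.

```cpp
#include <cstdio>
#include <vector>

// One master/branch pair as reported in the comments above.
// The names here are illustrative; they are not part of ArborX.
struct Timing {
  const char* device;
  int n;               // value passed to --predicates and --primitives
  double master_time;  // "Time BF" for master_6794807d
  double branch_time;  // "Time BF" for branch_45b6801c
};

// Time saved by the branch, as a percentage of the master time.
double percent_faster(double master_time, double branch_time) {
  return 100.0 * (master_time - branch_time) / master_time;
}

// Timings copied verbatim from the thread.
std::vector<Timing> reported_timings() {
  return {
      {"GTX1070", 100, 0.000546, 0.000478},
      {"GTX1070", 500, 0.001253, 0.000986},
      {"GTX1070", 10000, 0.030061, 0.024904},
      {"GTX1070", 50000, 0.598086, 0.514194},
      {"E5-2620", 100, 0.000262, 0.000251},
      {"E5-2620", 500, 0.005522, 0.005494},
      {"E5-2620", 1000, 0.022902, 0.021482},
      {"E5-2620", 5000, 0.603481, 0.571992},
      {"E5-2620", 10000, 2.438251, 2.291969},
  };
}

// Print one line per measurement pair.
void print_summary() {
  for (const Timing& t : reported_timings()) {
    std::printf("%-8s n=%-6d branch is %.1f%% faster\n", t.device, t.n,
                percent_faster(t.master_time, t.branch_time));
  }
}
```

Calling `print_summary()` lists every pair; each of them comes out positive, which matches the "at least as fast" reading. These are single runs, so the smaller differences may well be within noise.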
Considering reports that it is not slower both on the CPU and on the GPU
This fixes current problems in the Kokkos SYCL backend (see #614), but the restriction to do everything on the first team member always felt weird to me. Alternatively, we can special case for SYCL of course.
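For readers unfamiliar with the two approaches being contrasted, here is a plain C++ emulation of how the work of filling one tile is distributed. It deliberately does not use Kokkos: the "team" is a sequential loop over lanes, all names are hypothetical, and the strided assignment of elements to lanes is an assumption (the actual mapping used by `TeamVectorRange` depends on the backend). The ArborX code itself is not shown in this thread.

```cpp
#include <cstddef>
#include <vector>

// Result of filling one tile: the tile contents plus how many
// elements each emulated lane wrote.
struct FillResult {
  std::vector<int> tile;
  std::vector<std::size_t> writes_per_lane;
};

// Old scheme (as described in the comment): only the first team
// member copies the tile; the remaining lanes stay idle.
FillResult fill_single_lane(const std::vector<int>& src, std::size_t offset,
                            std::size_t tile_size, std::size_t team_size) {
  FillResult r{std::vector<int>(tile_size),
               std::vector<std::size_t>(team_size, 0)};
  for (std::size_t lane = 0; lane < team_size; ++lane) {
    if (lane != 0) continue;  // everyone but lane 0 does nothing
    for (std::size_t k = 0; k < tile_size; ++k) {
      r.tile[k] = src[offset + k];
      ++r.writes_per_lane[lane];
    }
  }
  return r;
}

// New scheme in spirit: the index range [0, tile_size) is shared
// among all lanes, here with a simple stride of team_size.
FillResult fill_all_lanes(const std::vector<int>& src, std::size_t offset,
                          std::size_t tile_size, std::size_t team_size) {
  FillResult r{std::vector<int>(tile_size),
               std::vector<std::size_t>(team_size, 0)};
  for (std::size_t lane = 0; lane < team_size; ++lane) {
    for (std::size_t k = lane; k < tile_size; k += team_size) {
      r.tile[k] = src[offset + k];
      ++r.writes_per_lane[lane];
    }
  }
  return r;
}
```

Both functions produce the same tile; they differ only in `writes_per_lane`, which is concentrated on lane 0 in the first case and spread evenly in the second. On a real device the lanes run concurrently, and a team barrier would be needed before the tile is read.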