New BVH::query() overload that only takes predicates and callback #329

Merged: 17 commits, Aug 7, 2020

@dalg24 dalg24 commented Jun 1, 2020

No description provided.

@dalg24 dalg24 force-pushed the new_query_overload branch 2 times, most recently from a672f6b to 5366adf Compare June 1, 2020 23:18
@dalg24 dalg24 marked this pull request as ready for review June 1, 2020 23:28

@aprokop aprokop left a comment

The atomic thing in the example can be done in a separate PR, if desired.

examples/callback/example_callback.cpp (outdated, resolved)

aprokop commented Jun 2, 2020

I tried implementing a benchmark example using this traversal type. It does not compile:

../src/details/ArborX_Callbacks.hpp(97): error: class "ArborX::Experimental::TraversalPolicy" has no member "value_type"
          detected during:
            instantiation of type "ArborX::Details::OutputFunctorHelper<ArborX::Experimental::TraversalPolicy>"
(117): here
            instantiation of "void ArborX::Details::check_valid_callback(const Callback &, const Predicates &, const OutputView &) [with Callback=Type1NearestCallback<Kokkos::OpenMP::device_type>, Predicates=Kokkos::View<ArborX::Nearest<ArborX::Point> *, Kokkos::OpenMP::device_type>, OutputView=ArborX::Experimental::TraversalPolicy]"
../src/details/ArborX_DetailsBoundingVolumeHierarchyImpl.hpp(333): here
            instantiation of "std::enable_if_t<<expression>, void> ArborX::Details::BoundingVolumeHierarchyImpl::check_valid_callback_if_first_argument_is_not_a_view(const Callback &, const Predicates &, const OutputView &) [with Callback=Type1NearestCallback<Kokkos::OpenMP::device_type>, Predicates=Kokkos::View<ArborX::Nearest<ArborX::Point> *, Kokkos::OpenMP::device_type>, OutputView=ArborX::Experimental::TraversalPolicy]"
../src/details/ArborX_DetailsBoundingVolumeHierarchyImpl.hpp(362): here
            instantiation of "void ArborX::Details::BoundingVolumeHierarchyImpl::query(const ExecutionSpace &, const BVH &, const Predicates &, CallbackOrView &&, View &&, Args &&...) [with ExecutionSpace=Kokkos::HostSpace::execution_space, BVH=ArborX::BoundingVolumeHierarchy<Kokkos::Serial::memory_space, void>, Predicates=Kokkos::View<ArborX::Nearest<ArborX::Point> *, Kokkos::OpenMP::device_type>, CallbackOrView=Type1NearestCallback<Kokkos::OpenMP::device_type> &, View=ArborX::Experimental::TraversalPolicy &, Args=<>]"
../src/ArborX_LinearBVH.hpp(68): here
            instantiation of "void ArborX::BoundingVolumeHierarchy<MemorySpace, Enable>::query(const ExecutionSpace &, const Predicates &, Args &&...) const [with MemorySpace=Kokkos::Serial::memory_space, Enable=void, ExecutionSpace=Kokkos::HostSpace::execution_space, Predicates=Kokkos::View<ArborX::Nearest<ArborX::Point> *, Kokkos::OpenMP::device_type>, Args=<Type1NearestCallback<Kokkos::OpenMP::device_type> &, ArborX::Experimental::TraversalPolicy &>]"
../src/ArborX_LinearBVH.hpp(137): here
            instantiation of "void ArborX::BoundingVolumeHierarchy<DeviceType, std::enable_if_t<Kokkos::Impl::is_device_helper<std::remove_cv<DeviceType>::type>::type::value, void>>::query(Args &&...) const [with DeviceType=Kokkos::OpenMP::device_type, Args=<const Kokkos::View<ArborX::Nearest<ArborX::Point> *, Kokkos::OpenMP::device_type> &, Type1NearestCallback<Kokkos::OpenMP::device_type> &, ArborX::Experimental::TraversalPolicy &>]"
../benchmarks/bvh_driver/bvh_driver.cpp(183): here
            instantiation of "void BM_knn_type1_search<TreeType>(benchmark::State &) [with TreeType=ArborX::BVH<Kokkos::OpenMP::device_type>]"
../benchmarks/bvh_driver/bvh_driver.cpp(472): here

Essentially, if an additional argument (such as TraversalPolicy) is specified, the wrong overload is chosen.
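
For illustration, here is a minimal sketch of one way to keep the two call shapes apart, so that a trailing policy is never deduced as an output view. It is not ArborX code and not necessarily how this PR resolves the problem: View, TraversalPolicy, DummyPredicates, DummyCallback, is_view, first_is_view and both query overloads are hypothetical stand-ins.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical stand-ins (assumptions, not ArborX types).
template <typename T> struct View {}; // plays the role of an output view
struct TraversalPolicy {};            // extra options, not an output container
struct DummyPredicates {};
struct DummyCallback {};

enum class Overload { CallbackOnly, CallbackWithOutput };

// Trait telling output views apart from everything else.
template <typename T> struct is_view : std::false_type {};
template <typename T> struct is_view<View<T>> : std::true_type {};

// True if the first type of the pack is a view; false for an empty pack.
template <typename... Ts> struct first_is_view : std::false_type {};
template <typename T, typename... Rest>
struct first_is_view<T, Rest...> : is_view<std::decay_t<T>> {};

// Overload 1: predicates and callback, optionally followed by non-view
// arguments such as a policy. Disabled when a view follows the callback.
template <typename Predicates, typename Callback, typename... Extra>
std::enable_if_t<!first_is_view<Extra...>::value, Overload>
query(Predicates const &, Callback const &, Extra &&...)
{
  return Overload::CallbackOnly;
}

// Overload 2: predicates, callback, then output views. Enabled only when
// the argument right after the callback really is a view.
template <typename Predicates, typename Callback, typename Out,
          typename... Extra>
std::enable_if_t<is_view<std::decay_t<Out>>::value, Overload>
query(Predicates const &, Callback const &, Out &&, Extra &&...)
{
  return Overload::CallbackWithOutput;
}
```

Because the two conditions are mutually exclusive, a call such as query(predicates, callback, policy) can only resolve to the first overload, while query(predicates, callback, indices, offset) can only resolve to the second. A compile-only test of exactly these call shapes would catch the regression reported here.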

@aprokop aprokop left a comment

Fix the overload selection. We definitely need a test that runs query(space, predicates, callback, traversal_policy) and chooses the correct overload. The test may just be that it compiles.

aprokop commented Jun 2, 2020

I pushed a benchmark for this type of callback. It does not compile yet, but it should once the issue with overloads is fixed.

@aprokop aprokop added the API User visible interface modifications label Jun 2, 2020

aprokop commented Jun 2, 2020

Well, maybe I was too fast to say my patch would compile. Needs a bit of tweaking.

@aprokop aprokop left a comment

I think permutation is broken for type 1 traversal. The callback(query_index, primitive_index) is called with a non-permuted query_index even though the queries are permuted. So either we need to wrap the callback, or we need to merge #305 first.

I discovered this issue through a quick modification of the halo finder example, trying to see how type 1 traversal affects its performance. It produced completely wrong halos.

Yeah, we need tests.
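
To make the "wrap the callback" option concrete, here is a toy sketch under stated assumptions. It uses plain std::vector instead of Kokkos views, and UnpermuteCallback, unpermute, apply_permutation and traverse are hypothetical names, not ArborX functions. The idea is that the traversal only knows a query's position in the sorted order, so a wrapper maps that position back to the user's original index before calling the user callback.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical wrapper: translates the position in the permuted (sorted)
// query order back to the original query index, then calls the user callback.
template <typename Callback>
struct UnpermuteCallback
{
  Callback callback;
  std::vector<int> const &permute; // permute[sorted_position] == original index

  void operator()(int sorted_position, int primitive_index) const
  {
    assert(sorted_position >= 0 &&
           static_cast<std::size_t>(sorted_position) < permute.size());
    callback(permute[sorted_position], primitive_index);
  }
};

template <typename Callback>
UnpermuteCallback<Callback> unpermute(Callback callback,
                                      std::vector<int> const &permute)
{
  return {std::move(callback), permute};
}

// Reorders values so that sorted[k] == values[permute[k]].
inline std::vector<int> apply_permutation(std::vector<int> const &values,
                                          std::vector<int> const &permute)
{
  std::vector<int> sorted(permute.size());
  for (std::size_t k = 0; k < permute.size(); ++k)
    sorted[k] = values[permute[k]];
  return sorted;
}

// Toy traversal over the sorted queries: the query at sorted position k
// "finds" the primitive whose index equals the query value.
template <typename Callback>
void traverse(std::vector<int> const &sorted_queries, Callback const &callback)
{
  for (int k = 0; k < static_cast<int>(sorted_queries.size()); ++k)
    callback(k, sorted_queries[k]);
}
```

With queries {2, 0, 1} and permute {1, 2, 0}, a callback that records result[query_index] = primitive_index reproduces the original queries only when it is passed through unpermute; passed directly, it records the results against the wrong queries, which matches the kind of silent corruption described for the halos.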

aprokop commented Jun 3, 2020

I did the following change to the example to try it out with HACC:

diff --git a/examples/halo_finder/ArborX_HaloFinder.hpp b/examples/halo_finder/ArborX_HaloFinder.hpp
index 7afb9d7..2c92dac 100644
--- a/examples/halo_finder/ArborX_HaloFinder.hpp
+++ b/examples/halo_finder/ArborX_HaloFinder.hpp
@@ -196,7 +196,6 @@ bool verifyCC(ExecutionSpace exec_space, IndicesView indices, OffsetView offset,
 template <typename MemorySpace>
 struct CCSCallback
 {
-  using tag = ArborX::Details::InlineCallbackTag;
   Kokkos::View<int *, MemorySpace> stat_;
 
   // Per [1]:
@@ -240,9 +239,8 @@ struct CCSCallback
     return curr;
   }
 
-  template <typename Query, typename Insert>
-  KOKKOS_FUNCTION void operator()(Query const &query, int j,
-                                  Insert const &) const
+  template <typename Query>
+  KOKKOS_FUNCTION void operator()(Query const &query, int j) const
   {
     int const i = ArborX::getData(query);
 
@@ -353,14 +351,11 @@ void findHalos(ExecutionSpace exec_space, Primitives const &primitives,
   // insert() will not be called
   start = clock::now();
   Kokkos::Profiling::pushRegion("ArborX:HaloFinder:ccs");
-  Kokkos::View<int *, MemorySpace> indices("indices", 0);
-  Kokkos::View<int *, MemorySpace> offset("offset", 0);
   Kokkos::View<int *, MemorySpace> stat(
       Kokkos::ViewAllocateWithoutInitializing("stat"), n);
   ArborX::iota(exec_space, stat);
   Kokkos::Profiling::pushRegion("ArborX:HaloFinder:ccs:query");
-  bvh.query(exec_space, predicates, CCSCallback<MemorySpace>{stat}, indices,
-            offset);
+  bvh.query(exec_space, predicates, CCSCallback<MemorySpace>{stat});
   Kokkos::Profiling::popRegion();
   // Per [1]:
   //
@@ -393,6 +388,8 @@ void findHalos(ExecutionSpace exec_space, Primitives const &primitives,
     start = clock::now();
     Kokkos::Profiling::pushRegion("ArborX:HaloFinder:verify");
 
+    Kokkos::View<int *, MemorySpace> indices("indices", 0);
+    Kokkos::View<int *, MemorySpace> offset("offset", 0);
     bvh.query(exec_space, predicates, indices, offset);
     auto passed = verifyCC(exec_space, indices, offset, ccs);
     printf("Verification %s\n", (passed ? "passed" : "failed"));

This does not compile with

<snip>/arborx/examples/halo_finder/ArborX_HaloFinder.hpp(245): error: 
no suitable conversion function from "const std::decay_t<ArborX::Details::PermutedIndices>" to "const int" exists
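
For readers unfamiliar with the pattern in the diff above, the sketch below shows the general shape of a callback that is used only for its side effect, with no indices/offset arrays involved. It is a simplified, serial stand-in and not the CCSCallback from the example: for_each_neighbor_pair is a hypothetical brute-force replacement for the tree query, and MergeLabels is a hypothetical union-find style callback.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Hypothetical brute-force stand-in for a spatial query: invokes
// callback(i, j) for every ordered pair of distinct 1D points that are
// within `radius` of each other. No result arrays are produced.
template <typename Callback>
void for_each_neighbor_pair(std::vector<double> const &x, double radius,
                            Callback const &callback)
{
  int const n = static_cast<int>(x.size());
  for (int i = 0; i < n; ++i)
    for (int j = 0; j < n; ++j)
      if (i != j && std::abs(x[i] - x[j]) <= radius)
        callback(i, j);
}

// Side-effect-only callback: merges the components of i and j in a
// label array (serial union-find without path compression).
struct MergeLabels
{
  std::vector<int> &label; // label[i] == i marks a representative

  int representative(int i) const
  {
    while (label[i] != i)
      i = label[i];
    return i;
  }

  void operator()(int i, int j) const
  {
    int const a = representative(i);
    int const b = representative(j);
    // Attach the larger representative to the smaller one.
    if (a < b)
      label[b] = a;
    else
      label[a] = b;
  }
};
```

Starting from label = {0, 1, 2, 3, 4} with points {0.0, 0.5, 1.0, 5.0, 5.4} and radius 0.6, the points fall into two groups whose representatives are 0 and 3. The version in the example runs in parallel and therefore needs atomics, which this serial sketch deliberately leaves out.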

dalg24 commented Jun 3, 2020

This does not compile with

<snip>/arborx/examples/halo_finder/ArborX_HaloFinder.hpp(245): error: 
no suitable conversion function from "const std::decay_t<ArborX::Details::PermutedIn

Will fix

aprokop commented Jun 4, 2020

With 7bf380d and the HACC patch, HACC's query runs about 7% faster. Summit results are pending.

aprokop commented Jun 4, 2020

Summit results (59cbe5c vs 7bf380d)

Serial

Benchmark                                                                                              Time             CPU      Time Old      Time New       CPU Old       CPU New
BM_knn_search<ArborX::BVH<Serial>>/10000/10000/10/1/0/2/manual_time_median                          +0.0082         +0.0081         50453             …         50449         50859
BM_knn_search<ArborX::BVH<Serial>>/100000/100000/10/1/0/2/manual_time_median                        +0.0093         +0.0093        528465        533378        528399        533297
BM_knn_search<ArborX::BVH<Serial>>/1000000/1000000/10/1/0/2/manual_time_median                      +0.0098         +0.0098       5631699       5686870       5631247       5686374
BM_knn_search<ArborX::BVH<Serial>>/10000/10000/10/1/1/3/manual_time_median                          +0.0060         +0.0060         49700         49997         49696         49993
BM_knn_search<ArborX::BVH<Serial>>/100000/100000/10/1/1/3/manual_time_median                        +0.0088         +0.0088        669399        675272        669325        675190
BM_knn_search<ArborX::BVH<Serial>>/1000000/1000000/10/1/1/3/manual_time_median                      +0.0097         +0.0097       9314484       9405010       9313685       9404177
BM_knn_search<ArborX::BVH<Serial>>/10000/10000/10/0/0/2/manual_time_median                          +0.0104         +0.0104         51710         52248         51707         52244
BM_knn_search<ArborX::BVH<Serial>>/100000/100000/10/0/0/2/manual_time_median                        +0.0089         +0.0089        555906        560844        555827        560776
BM_knn_search<ArborX::BVH<Serial>>/1000000/1000000/10/0/0/2/manual_time_median                      +0.0097         +0.0097       6300426       6361545       6298332       6359471
BM_knn_search<ArborX::BVH<Serial>>/10000/10000/10/0/1/3/manual_time_median                          +0.0097         +0.0097         56641         57192         56637         57188
BM_knn_search<ArborX::BVH<Serial>>/100000/100000/10/0/1/3/manual_time_median                        +0.0088         +0.0088        795034        802002        794931        801900
BM_knn_search<ArborX::BVH<Serial>>/1000000/1000000/10/0/1/3/manual_time_median                      +0.0086         +0.0086      11499850      11598358      11496259      11594670
BM_radius_search<ArborX::BVH<Serial>>/10000/10000/10/1/0/0/2/manual_time_median                     +0.0984         +0.0985         58985         64789         58972         64784
BM_radius_search<ArborX::BVH<Serial>>/100000/100000/10/1/0/0/2/manual_time_median                   +0.1019         +0.1019        613932        676464        613859        676395
BM_radius_search<ArborX::BVH<Serial>>/1000000/1000000/10/1/0/0/2/manual_time_median                 +0.1037         +0.1037       6477592       7149299       6477095       7148749
BM_radius_search<ArborX::BVH<Serial>>/10000/10000/10/1/0/1/3/manual_time_median                     +0.1189         +0.1189         14699         16447         14699         16447
BM_radius_search<ArborX::BVH<Serial>>/100000/100000/10/1/0/1/3/manual_time_median                   +0.1154         +0.1154        107809        120251        107800        120240
BM_radius_search<ArborX::BVH<Serial>>/1000000/1000000/10/1/0/1/3/manual_time_median                 +0.0985         +0.0985       1053840       1157598       1053596       1157379
BM_radius_search<ArborX::BVH<Serial>>/10000/10000/10/1/10/0/2/manual_time_median                    +0.0902         +0.0902         59800         65193         59795         65188
BM_radius_search<ArborX::BVH<Serial>>/100000/100000/10/1/10/0/2/manual_time_median                  +0.0944         +0.0945        621907        680641        621830        680570
BM_radius_search<ArborX::BVH<Serial>>/1000000/1000000/10/1/10/0/2/manual_time_median                +0.0953         +0.0953       6622094       7253181       6621454       7252495
BM_radius_search<ArborX::BVH<Serial>>/10000/10000/10/1/10/1/3/manual_time_median                    +0.1079         +0.1079         15043         16667         15044         16667
BM_radius_search<ArborX::BVH<Serial>>/100000/100000/10/1/10/1/3/manual_time_median                  +0.1076         +0.1076        110324        122198        110314        122186
BM_radius_search<ArborX::BVH<Serial>>/1000000/1000000/10/1/10/1/3/manual_time_median                +0.0917         +0.0917       1080900       1179967       1080655       1179731
BM_radius_search<ArborX::BVH<Serial>>/10000/10000/10/0/0/0/2/manual_time_median                     +0.0898         +0.0898         62636         68264         62631         68258
BM_radius_search<ArborX::BVH<Serial>>/100000/100000/10/0/0/0/2/manual_time_median                   +0.0923         +0.0923        676529        738992        676446        738911
BM_radius_search<ArborX::BVH<Serial>>/1000000/1000000/10/0/0/0/2/manual_time_median                 +0.0920         +0.0920       7775315       8490764       7772508       8487697
BM_radius_search<ArborX::BVH<Serial>>/10000/10000/10/0/0/1/3/manual_time_median                     +0.0880         +0.0880         18911         20574         18910         20574
BM_radius_search<ArborX::BVH<Serial>>/100000/100000/10/0/0/1/3/manual_time_median                   +0.1173         +0.1172        127638        142604        127626        142590
BM_radius_search<ArborX::BVH<Serial>>/1000000/1000000/10/0/0/1/3/manual_time_median                 +0.1285         +0.1285       1025627       1157439       1025478       1157258
BM_radius_search<ArborX::BVH<Serial>>/10000/10000/10/0/10/0/2/manual_time_median                    +0.0902         +0.0901         63268         68972         63263         68966
BM_radius_search<ArborX::BVH<Serial>>/100000/100000/10/0/10/0/2/manual_time_median                  +0.0860         +0.0860        683085        741805        682997        741716
BM_radius_search<ArborX::BVH<Serial>>/1000000/1000000/10/0/10/0/2/manual_time_median                +0.0858         +0.0859       7873517       8549444       7870710       8546430
BM_radius_search<ArborX::BVH<Serial>>/10000/10000/10/0/10/1/3/manual_time_median                    +0.0878         +0.0878         19253         20943         19252         20943
BM_radius_search<ArborX::BVH<Serial>>/100000/100000/10/0/10/1/3/manual_time_median                  +0.1132         +0.1132        130154        144889        130142        144876
BM_radius_search<ArborX::BVH<Serial>>/1000000/1000000/10/0/10/1/3/manual_time_median                +0.1260         +0.1260       1049087       1181226       1048894       1181015

OpenMP

Benchmark                                                                                              Time             CPU      Time Old      Time New       CPU Old       CPU New
BM_knn_search<ArborX::BVH<OpenMP>>/10000/10000/10/1/0/2/manual_time_median                          +0.0028         +0.0027          2368          2374          2370          2376
BM_knn_search<ArborX::BVH<OpenMP>>/100000/100000/10/1/0/2/manual_time_median                        +0.0072         +0.0079         15197         15306         14862         14979
BM_knn_search<ArborX::BVH<OpenMP>>/1000000/1000000/10/1/0/2/manual_time_median                      +0.0072         +0.0076        149715        150795        148587        149711
BM_knn_search<ArborX::BVH<OpenMP>>/10000/10000/10/1/1/3/manual_time_median                          +0.0020         +0.0020          2735          2741          2737          2742
BM_knn_search<ArborX::BVH<OpenMP>>/100000/100000/10/1/1/3/manual_time_median                        +0.0054         +0.0083         25710         25848         23932         24131
BM_knn_search<ArborX::BVH<OpenMP>>/1000000/1000000/10/1/1/3/manual_time_median                      +0.0091         +0.0067        372625        376006        318985        321133
BM_knn_search<ArborX::BVH<OpenMP>>/10000/10000/10/0/0/2/manual_time_median                          +0.0036         +0.0035          1838          1845          1841          1847
BM_knn_search<ArborX::BVH<OpenMP>>/100000/100000/10/0/0/2/manual_time_median                        +0.0025         +0.0034         14589         14626         13936         13984
BM_knn_search<ArborX::BVH<OpenMP>>/1000000/1000000/10/0/0/2/manual_time_median                      +0.0025         +0.0032        192912        193403        183780        184370
BM_knn_search<ArborX::BVH<OpenMP>>/10000/10000/10/0/1/3/manual_time_median                          +0.0031         +0.0030          1966          1972          1969          1975
BM_knn_search<ArborX::BVH<OpenMP>>/100000/100000/10/0/1/3/manual_time_median                        +0.0025         +0.0023         20433         20484         20429         20476
BM_knn_search<ArborX::BVH<OpenMP>>/1000000/1000000/10/0/1/3/manual_time_median                      +0.0031         +0.0027        314447        315437        313301        314152
BM_radius_search<ArborX::BVH<OpenMP>>/10000/10000/10/1/0/0/2/manual_time_median                     +0.0529         +0.0527          2433          2561          2435          2563
BM_radius_search<ArborX::BVH<OpenMP>>/100000/100000/10/1/0/0/2/manual_time_median                   +0.0824         +0.0802         17611         19061         17052         18420
BM_radius_search<ArborX::BVH<OpenMP>>/1000000/1000000/10/1/0/0/2/manual_time_median                 +0.0907         +0.0889        171404        186958        169504        184574
BM_radius_search<ArborX::BVH<OpenMP>>/10000/10000/10/1/0/1/3/manual_time_median                     +0.0660         +0.0260          1606          1712          1447          1485
BM_radius_search<ArborX::BVH<OpenMP>>/100000/100000/10/1/0/1/3/manual_time_median                   +0.0772         +0.0453          7686          8280          3529          3689
BM_radius_search<ArborX::BVH<OpenMP>>/1000000/1000000/10/1/0/1/3/manual_time_median                 +0.0645         +0.0376         65628         69863         29489         30598
BM_radius_search<ArborX::BVH<OpenMP>>/10000/10000/10/1/10/0/2/manual_time_median                    +0.0410         +0.0410          2584          2690          2586          2692
BM_radius_search<ArborX::BVH<OpenMP>>/100000/100000/10/1/10/0/2/manual_time_median                  +0.0743         +0.0722         17998         19336         17450         18710
BM_radius_search<ArborX::BVH<OpenMP>>/1000000/1000000/10/1/10/0/2/manual_time_median                +0.0801         +0.0775        177158        191345        175292        188868
BM_radius_search<ArborX::BVH<OpenMP>>/10000/10000/10/1/10/1/3/manual_time_median                    +0.0577         +0.0284          1749          1850          1559          1603
BM_radius_search<ArborX::BVH<OpenMP>>/100000/100000/10/1/10/1/3/manual_time_median                  +0.0669         +0.0400          7997          8531          3675          3822
BM_radius_search<ArborX::BVH<OpenMP>>/1000000/1000000/10/1/10/1/3/manual_time_median                +0.0591         +0.0374         68192         72224         30634         31781
BM_radius_search<ArborX::BVH<OpenMP>>/10000/10000/10/0/0/0/2/manual_time_median                     +0.0620         +0.0619          1913          2032          1915          2034
BM_radius_search<ArborX::BVH<OpenMP>>/100000/100000/10/0/0/0/2/manual_time_median                   +0.0850         +0.0849         17379         18856         17380         18856
BM_radius_search<ArborX::BVH<OpenMP>>/1000000/1000000/10/0/0/0/2/manual_time_median                 +0.0762         +0.0760        241639        260064        241073        259394
BM_radius_search<ArborX::BVH<OpenMP>>/10000/10000/10/0/0/1/3/manual_time_median                     +0.0484         +0.0483           835           875           837           877
BM_radius_search<ArborX::BVH<OpenMP>>/100000/100000/10/0/0/1/3/manual_time_median                   +0.0980         +0.0979          3704          4068          3706          4070
BM_radius_search<ArborX::BVH<OpenMP>>/1000000/1000000/10/0/0/1/3/manual_time_median                 +0.1094         +0.1127         26866         29805         26295         29259
BM_radius_search<ArborX::BVH<OpenMP>>/10000/10000/10/0/10/0/2/manual_time_median                    +0.0892         +0.0891          2033          2214          2035          2216
BM_radius_search<ArborX::BVH<OpenMP>>/100000/100000/10/0/10/0/2/manual_time_median                  +0.1187         +0.1187         17548         19632         17548         19631
BM_radius_search<ArborX::BVH<OpenMP>>/1000000/1000000/10/0/10/0/2/manual_time_median                +0.0998         +0.0991        243440        267731        242892        266962
BM_radius_search<ArborX::BVH<OpenMP>>/10000/10000/10/0/10/1/3/manual_time_median                    +0.0508         +0.0505           960          1009           962          1011
BM_radius_search<ArborX::BVH<OpenMP>>/100000/100000/10/0/10/1/3/manual_time_median                  +0.1103         +0.1102          3862          4288          3864          4290
BM_radius_search<ArborX::BVH<OpenMP>>/1000000/1000000/10/0/10/1/3/manual_time_median                +0.1306         +0.1292         27611         31217         27045         30540

Cuda

Benchmark                                                                                              Time             CPU      Time Old      Time New       CPU Old       CPU New
BM_knn_search<ArborX::BVH<Cuda>>/10000/10000/10/1/0/2/manual_time_median                            -0.0013         -0.0014          1542          1540          1635          1632
BM_knn_search<ArborX::BVH<Cuda>>/100000/100000/10/1/0/2/manual_time_median                          +0.0035         +0.0034          7416          7442          7883          7909
BM_knn_search<ArborX::BVH<Cuda>>/1000000/1000000/10/1/0/2/manual_time_median                        -0.0240         -0.0241         46550         45432         47441         46300
BM_knn_search<ArborX::BVH<Cuda>>/10000/10000/10/1/1/3/manual_time_median                            +0.0051         +0.0046          1578          1586          1668          1676
BM_knn_search<ArborX::BVH<Cuda>>/100000/100000/10/1/1/3/manual_time_median                          -0.0022         -0.0030          9322          9301          9790          9760
BM_knn_search<ArborX::BVH<Cuda>>/1000000/1000000/10/1/1/3/manual_time_median                        -0.0103         -0.0103         93619         92659         94447         93473
BM_knn_search<ArborX::BVH<Cuda>>/10000/10000/10/0/0/2/manual_time_median                            -0.0055         -0.0049          1151          1145          1243          1237
BM_knn_search<ArborX::BVH<Cuda>>/100000/100000/10/0/0/2/manual_time_median                          +0.0039         +0.0034          5836          5859          6306          6327
BM_knn_search<ArborX::BVH<Cuda>>/1000000/1000000/10/0/0/2/manual_time_median                        +0.0073         +0.0075         70277         70787         71152         71685
BM_knn_search<ArborX::BVH<Cuda>>/10000/10000/10/0/1/3/manual_time_median                            -0.0006         -0.0007          1195          1194          1285          1284
BM_knn_search<ArborX::BVH<Cuda>>/100000/100000/10/0/1/3/manual_time_median                          +0.0047         +0.0009         11391         11445         11861         11872
BM_knn_search<ArborX::BVH<Cuda>>/1000000/1000000/10/0/1/3/manual_time_median                        +0.0016         +0.0036        168022        168294        168577        169186
BM_radius_search<ArborX::BVH<Cuda>>/10000/10000/10/1/0/0/2/manual_time_median                       -0.0038         -0.0034           963           959          1053          1050
BM_radius_search<ArborX::BVH<Cuda>>/100000/100000/10/1/0/0/2/manual_time_median                     -0.0122         -0.0147          4626          4569          5100          5025
BM_radius_search<ArborX::BVH<Cuda>>/1000000/1000000/10/1/0/0/2/manual_time_median                   +0.0234         +0.0185         32469         33227         33372         33991
BM_radius_search<ArborX::BVH<Cuda>>/10000/10000/10/1/0/1/3/manual_time_median                       +0.0004         +0.0006           898           898           988           989
BM_radius_search<ArborX::BVH<Cuda>>/100000/100000/10/1/0/1/3/manual_time_median                     -0.0004         -0.0008          2610            2…          3082          3080
BM_radius_search<ArborX::BVH<Cuda>>/1000000/1000000/10/1/0/1/3/manual_time_median                   +0.0022         +0.0022          8448          8466          9298          9318
BM_radius_search<ArborX::BVH<Cuda>>/10000/10000/10/1/10/0/2/manual_time_median                      -0.0141         -0.0132          1100          1085          1191          1175
BM_radius_search<ArborX::BVH<Cuda>>/100000/100000/10/1/10/0/2/manual_time_median                    -0.0283         -0.0256          6253          6076          6722          6550
BM_radius_search<ArborX::BVH<Cuda>>/1000000/1000000/10/1/10/0/2/manual_time_median                  -0.0286         -0.0283         43059         41827         43971         42727
BM_radius_search<ArborX::BVH<Cuda>>/10000/10000/10/1/10/1/3/manual_time_median                      -0.0036         -0.0036          1018          1014          1110          1106
BM_radius_search<ArborX::BVH<Cuda>>/100000/100000/10/1/10/1/3/manual_time_median                    -0.0048         -0.0051          3745          3727          4214          4192
BM_radius_search<ArborX::BVH<Cuda>>/1000000/1000000/10/1/10/1/3/manual_time_median                  +0.0103         +0.0006         10522         10630         11366         11373
BM_radius_search<ArborX::BVH<Cuda>>/10000/10000/10/0/0/0/2/manual_time_median                       -0.0008         -0.0015           642           641           731           730
BM_radius_search<ArborX::BVH<Cuda>>/100000/100000/10/0/0/0/2/manual_time_median                     -0.0151         -0.0131          3449          3397          3926          3874
BM_radius_search<ArborX::BVH<Cuda>>/1000000/1000000/10/0/0/0/2/manual_time_median                   +0.0117         +0.0109         83217         84191         84121         85042
BM_radius_search<ArborX::BVH<Cuda>>/10000/10000/10/0/0/1/3/manual_time_median                       -0.0034         -0.0029           580           578           672           670
BM_radius_search<ArborX::BVH<Cuda>>/100000/100000/10/0/0/1/3/manual_time_median                     -0.0012         -0.0007          1356          1355          1827          1826
BM_radius_search<ArborX::BVH<Cuda>>/1000000/1000000/10/0/0/1/3/manual_time_median                   +0.0016         +0.0016          5377          5386          6230          6240
BM_radius_search<ArborX::BVH<Cuda>>/10000/10000/10/0/10/0/2/manual_time_median                      -0.0047         -0.0047           754           751           844           840
BM_radius_search<ArborX::BVH<Cuda>>/100000/100000/10/0/10/0/2/manual_time_median                    -0.0098         -0.0088          5023          4974          5497          5449
BM_radius_search<ArborX::BVH<Cuda>>/1000000/1000000/10/0/10/0/2/manual_time_median                  +0.0087         +0.0087         94115         94929         94973         95799
BM_radius_search<ArborX::BVH<Cuda>>/10000/10000/10/0/10/1/3/manual_time_median                      -0.0043         -0.0039           708           705           800           797
BM_radius_search<ArborX::BVH<Cuda>>/100000/100000/10/0/10/1/3/manual_time_median                    -0.0121         -0.0002          2475          2445          2914          2913
BM_radius_search<ArborX::BVH<Cuda>>/1000000/1000000/10/0/10/1/3/manual_time_median                  +0.0164         +0.0002          7643          7768          8494          8496

aprokop commented Jun 4, 2020

I'm not ready to accept this yet. I will go through the code to see if I can identify the reason for the slowdown and fix it. That will have to wait a few days, though.
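
As a reading aid for the tables above: the leading signed values appear to be relative changes, (new - old) / old, so a positive value means the newer commit is slower. This interpretation is inferred from the numbers themselves, not stated in the thread. A minimal sketch, where relative_change is a hypothetical helper name:

```cpp
#include <cassert>
#include <cmath>

// Relative change between two timings: (new - old) / old.
// Positive: the new run is slower. Negative: the new run is faster.
inline double relative_change(double old_time, double new_time)
{
  assert(old_time > 0.0);
  return (new_time - old_time) / old_time;
}
```

For the first Serial radius search row, relative_change(58985, 64789) is about 0.0984, matching the reported +0.0984, that is, roughly a 10% slowdown. For the Cuda knn row with 46550 and 45432, the result is about -0.0240.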

aprokop commented Aug 7, 2020

Current version (master ffed0fe vs this branch 7d4ebbe). Only posting radius search results, as construction and knn did not change (here are the raw logs: 202008071619_master_ffed0fe.txt, 202008071706_query_7d4ebbe.txt).

Serial

Benchmark                                                                                              Time             CPU      Time Old      Time New       CPU Old       CPU New
BM_radius_search<ArborX::BVH<Serial>>/10000/10000/10/1/0/0/2/manual_time_median                     +0.0895         +0.0895         58842            6…         58837         64104
BM_radius_search<ArborX::BVH<Serial>>/100000/100000/10/1/0/0/2/manual_time_median                   +0.0932         +0.0932        612526        669599        612436        669517
BM_radius_search<ArborX::BVH<Serial>>/1000000/1000000/10/1/0/0/2/manual_time_median                 +0.0960         +0.0961       6457790       7077978       6457182       7077438
BM_radius_search<ArborX::BVH<Serial>>/10000/10000/10/1/0/1/3/manual_time_median                     +0.1104         +0.1103         14664         16283         14665         16283
BM_radius_search<ArborX::BVH<Serial>>/100000/100000/10/1/0/1/3/manual_time_median                   +0.1102         +0.1102        107762        119641        107750        119630
BM_radius_search<ArborX::BVH<Serial>>/1000000/1000000/10/1/0/1/3/manual_time_median                 +0.0958         +0.0959       1050279       1150894       1050008       1150679
BM_radius_search<ArborX::BVH<Serial>>/10000/10000/10/1/10/0/2/manual_time_median                    +0.0860         +0.0859         59760         64899         59756         64890
BM_radius_search<ArborX::BVH<Serial>>/100000/100000/10/1/10/0/2/manual_time_median                  +0.0909         +0.0910        621044        677511        620957        677436
BM_radius_search<ArborX::BVH<Serial>>/1000000/1000000/10/1/10/0/2/manual_time_median                +0.0919         +0.0920       6613177       7221162       6612360       7220458
BM_radius_search<ArborX::BVH<Serial>>/10000/10000/10/1/10/1/3/manual_time_median                    +0.1029         +0.1028         15016         16561         15017         16561
BM_radius_search<ArborX::BVH<Serial>>/100000/100000/10/1/10/1/3/manual_time_median                  +0.1025         +0.1025        110398        121717        110387        121704
BM_radius_search<ArborX::BVH<Serial>>/1000000/1000000/10/1/10/1/3/manual_time_median                +0.0895         +0.0895       1077875       1174301       1077630       1174070
BM_radius_search<ArborX::BVH<Serial>>/10000/10000/10/0/0/0/2/manual_time_median                     +0.0914         +0.0914         62376         68080         62371         68073
BM_radius_search<ArborX::BVH<Serial>>/100000/100000/10/0/0/0/2/manual_time_median                   +0.0937         +0.0937        673098        736163        673024        736078
BM_radius_search<ArborX::BVH<Serial>>/1000000/1000000/10/0/0/0/2/manual_time_median                 +0.0930         +0.0932       7715046       8432782       7710663       8429667
BM_radius_search<ArborX::BVH<Serial>>/10000/10000/10/0/0/1/3/manual_time_median                     +0.0842         +0.0842         18839         20426         18839         20425
BM_radius_search<ArborX::BVH<Serial>>/100000/100000/10/0/0/1/3/manual_time_median                   +0.1103         +0.1103        126979        140983        126966        140969
BM_radius_search<ArborX::BVH<Serial>>/1000000/1000000/10/0/0/1/3/manual_time_median                 +0.1203         +0.1204       1025562       1148967       1025389       1148805
BM_radius_search<ArborX::BVH<Serial>>/10000/10000/10/0/10/0/2/manual_time_median                    +0.0984         +0.0984         62948         69144         62944         69138
BM_radius_search<ArborX::BVH<Serial>>/100000/100000/10/0/10/0/2/manual_time_median                  +0.1019         +0.1019        678347        747451        678257        747355
BM_radius_search<ArborX::BVH<Serial>>/1000000/1000000/10/0/10/0/2/manual_time_median                +0.0976         +0.0978       7811128       8573642       7806883       8570454
BM_radius_search<ArborX::BVH<Serial>>/10000/10000/10/0/10/1/3/manual_time_median                    +0.0917         +0.0917         19151         20907         19151         20907
BM_radius_search<ArborX::BVH<Serial>>/100000/100000/10/0/10/1/3/manual_time_median                  +0.1173         +0.1174        129464        144655        129450        144641
BM_radius_search<ArborX::BVH<Serial>>/1000000/1000000/10/0/10/1/3/manual_time_median                +0.1250         +0.1250       1049964       1181161       1049731       1180954

OpenMP

BM_radius_search<ArborX::BVH<OpenMP>>/10000/10000/10/1/0/0/2/manual_time_median                     +0.0384         +0.0384          2436          2530
    2439          2532
BM_radius_search<ArborX::BVH<OpenMP>>/100000/100000/10/1/0/0/2/manual_time_median                   +0.0692         +0.0710         17647         18868
   17046         18256
BM_radius_search<ArborX::BVH<OpenMP>>/1000000/1000000/10/1/0/0/2/manual_time_median                 +0.0764         +0.0752        171769        184899
  169378        182117
BM_radius_search<ArborX::BVH<OpenMP>>/10000/10000/10/1/0/1/3/manual_time_median                     +0.0549         +0.0186          1608          1697
    1448          1475
BM_radius_search<ArborX::BVH<OpenMP>>/100000/100000/10/1/0/1/3/manual_time_median                   +0.0683         +0.0420          7656          8179
    3500          3647
BM_radius_search<ArborX::BVH<OpenMP>>/1000000/1000000/10/1/0/1/3/manual_time_median                 +0.0593         +0.0442         65100         68963
   29015         30299
BM_radius_search<ArborX::BVH<OpenMP>>/10000/10000/10/1/10/0/2/manual_time_median                    +0.0373         +0.0373          2584          2681
    2586          2683
BM_radius_search<ArborX::BVH<OpenMP>>/100000/100000/10/1/10/0/2/manual_time_median                  +0.0688         +0.0718         18041         19283
   17437         18689
BM_radius_search<ArborX::BVH<OpenMP>>/1000000/1000000/10/1/10/0/2/manual_time_median                +0.0748         +0.0770        177629        190921
  175098        188575
BM_radius_search<ArborX::BVH<OpenMP>>/10000/10000/10/1/10/1/3/manual_time_median                    +0.0530         +0.0215          1747          1840
    1559          1592
BM_radius_search<ArborX::BVH<OpenMP>>/100000/100000/10/1/10/1/3/manual_time_median                  +0.0651         +0.0416          7963          8481
    3641          3792
BM_radius_search<ArborX::BVH<OpenMP>>/1000000/1000000/10/1/10/1/3/manual_time_median                +0.0578         +0.0429         67654         71567
   30171         31466
BM_radius_search<ArborX::BVH<OpenMP>>/10000/10000/10/0/0/0/2/manual_time_median                     +0.0586         +0.0584          1912          2024
    1914          2026
BM_radius_search<ArborX::BVH<OpenMP>>/100000/100000/10/0/0/0/2/manual_time_median                   +0.0814         +0.0813         17348         18759
   17349         18760
BM_radius_search<ArborX::BVH<OpenMP>>/1000000/1000000/10/0/0/0/2/manual_time_median                 +0.0714         +0.0751        241362        258593
  239550        257539
BM_radius_search<ArborX::BVH<OpenMP>>/10000/10000/10/0/0/1/3/manual_time_median                     +0.0450         +0.0449           834           872
     836           874
BM_radius_search<ArborX::BVH<OpenMP>>/100000/100000/10/0/0/1/3/manual_time_median                   +0.0864         +0.0863          3701          4021
    3704          4023
BM_radius_search<ArborX::BVH<OpenMP>>/1000000/1000000/10/0/0/1/3/manual_time_median                 +0.0998         +0.0926         26797         29470
   26506         28961
BM_radius_search<ArborX::BVH<OpenMP>>/10000/10000/10/0/10/0/2/manual_time_median                    +0.0622         +0.0620          2031          2158
    2034          2160
BM_radius_search<ArborX::BVH<OpenMP>>/100000/100000/10/0/10/0/2/manual_time_median                  +0.0867         +0.0866         17522         19041
   17523         19041
BM_radius_search<ArborX::BVH<OpenMP>>/1000000/1000000/10/0/10/0/2/manual_time_median                +0.0761         +0.0808        243227        261743
  241389        260903
BM_radius_search<ArborX::BVH<OpenMP>>/10000/10000/10/0/10/1/3/manual_time_median                    +0.0377         +0.0376           959           995
     961           997
BM_radius_search<ArborX::BVH<OpenMP>>/100000/100000/10/0/10/1/3/manual_time_median                  +0.0892         +0.0889          3858          4202
    3861          4204
BM_radius_search<ArborX::BVH<OpenMP>>/1000000/1000000/10/0/10/1/3/manual_time_median                +0.1125         +0.1039         27593         30697
   27215         30043

Cuda

BM_radius_search<ArborX::BVH<Cuda>>/10000/10000/10/1/0/0/2/manual_time_median                       -0.0111         -0.0116           975          [13/49302]
    1068          1055
BM_radius_search<ArborX::BVH<Cuda>>/100000/100000/10/1/0/0/2/manual_time_median                     +0.0062         +0.0047          4622          4650
    5099          5124
BM_radius_search<ArborX::BVH<Cuda>>/1000000/1000000/10/1/0/0/2/manual_time_median                   +0.0029         +0.0027         32525         32621
   33409         33501
BM_radius_search<ArborX::BVH<Cuda>>/10000/10000/10/1/0/1/3/manual_time_median                       -0.0109         -0.0114           908           898
    1001           989
BM_radius_search<ArborX::BVH<Cuda>>/100000/100000/10/1/0/1/3/manual_time_median                     -0.0071         -0.0061          2624          2606
    3092          3074
BM_radius_search<ArborX::BVH<Cuda>>/1000000/1000000/10/1/0/1/3/manual_time_median                   +0.0162         -0.0040          8482          8620
    9333          9296
BM_radius_search<ArborX::BVH<Cuda>>/10000/10000/10/1/10/0/2/manual_time_median                      -0.0072         -0.0078          1113          1105
    1206          1197
BM_radius_search<ArborX::BVH<Cuda>>/100000/100000/10/1/10/0/2/manual_time_median                    -0.0059         -0.0062          6292          6254
    6768          6726
BM_radius_search<ArborX::BVH<Cuda>>/1000000/1000000/10/1/10/0/2/manual_time_median                  +0.0026         +0.0021         43173         43287
   44055         44148
BM_radius_search<ArborX::BVH<Cuda>>/10000/10000/10/1/10/1/3/manual_time_median                      -0.0101         -0.0104          1030          1020
    1124          1112
BM_radius_search<ArborX::BVH<Cuda>>/100000/100000/10/1/10/1/3/manual_time_median                    +0.0007         -0.0048          3756          3759
    4225          4205
BM_radius_search<ArborX::BVH<Cuda>>/1000000/1000000/10/1/10/1/3/manual_time_median                  +0.0012         -0.0085         10564         10577
   11409         11311
BM_radius_search<ArborX::BVH<Cuda>>/10000/10000/10/0/0/0/2/manual_time_median                       -0.0158         -0.0160           645           635
     737           725
BM_radius_search<ArborX::BVH<Cuda>>/100000/100000/10/0/0/0/2/manual_time_median                     +0.0003         -0.0016          3450          3452
    3927          3921
BM_radius_search<ArborX::BVH<Cuda>>/1000000/1000000/10/0/0/0/2/manual_time_median                   +0.0041         +0.0106         83113         83450
   83428         84316
BM_radius_search<ArborX::BVH<Cuda>>/10000/10000/10/0/0/1/3/manual_time_median                       +0.0018         +0.0004           579           580
     672           672
BM_radius_search<ArborX::BVH<Cuda>>/100000/100000/10/0/0/1/3/manual_time_median                     -0.0027         -0.0066          1365          1361
    1836          1824
BM_radius_search<ArborX::BVH<Cuda>>/1000000/1000000/10/0/0/1/3/manual_time_median                   -0.0034         -0.0027          5420          5402
    6262          6245
BM_radius_search<ArborX::BVH<Cuda>>/10000/10000/10/0/10/0/2/manual_time_median                      -0.0126         -0.0128           762           752
     853           843
BM_radius_search<ArborX::BVH<Cuda>>/100000/100000/10/0/10/0/2/manual_time_median                    -0.0041         -0.0020          5035          5014
    5497          5487
BM_radius_search<ArborX::BVH<Cuda>>/1000000/1000000/10/0/10/0/2/manual_time_median                  +0.0094         +0.0088         93473         94349
   94379         95205
BM_radius_search<ArborX::BVH<Cuda>>/10000/10000/10/0/10/1/3/manual_time_median                      +0.0077         +0.0056           705           710
     798           803
BM_radius_search<ArborX::BVH<Cuda>>/100000/100000/10/0/10/1/3/manual_time_median                    -0.0069         -0.0067          2455          2439
    2924          2904
BM_radius_search<ArborX::BVH<Cuda>>/1000000/1000000/10/0/10/1/3/manual_time_median                  +0.0001         -0.0008          7675          7676
    8520          8513

So Serial is ~10% slower everywhere, OpenMP is ~6% slower on average, and Cuda is unaffected (all changes within about ±1.6%).
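For reference, the Time and CPU columns in the tables above appear to be relative changes, (new - old) / old. That is an assumption, but it is consistent with the printed rows. A minimal sketch with a hypothetical helper `relative_change` (not part of ArborX or the benchmark tooling):

```cpp
// Hypothetical helper, not part of ArborX or the benchmark scripts.
// Assumes the "Time"/"CPU" columns are computed as (new - old) / old.
double relative_change(double old_value, double new_value)
{
  return (new_value - old_value) / old_value;
}
```

For example, the first Serial row quoted above (15016 -> 16561) gives about 0.1029, and the last Cuda row (7675 -> 7676) gives about 0.0001, matching the printed values up to rounding of the displayed timings.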

I looked into this exhaustively and did not find a solution. The only thing that comes to mind is the way `attach` works overall and the extra copies it may introduce, though I'm not sure about that. I will not fix this here, as I do not want to hold this PR any longer.

I removed the new test from the benchmark during the rebase and will introduce it in a separate PR.

Contributor Author

@dalg24 dalg24 left a comment

LGTM

@aprokop aprokop merged commit 779f43d into arborx:master Aug 7, 2020
@aprokop aprokop added the performance Something is slower than it should be label Aug 7, 2020
@dalg24 dalg24 deleted the new_query_overload branch September 28, 2020 23:08
@aprokop aprokop mentioned this pull request Oct 1, 2020
Labels
API (User visible interface modifications), performance (Something is slower than it should be)