Implement stackless tree traversal using escape index (ropes) #364

Merged (12 commits) on Oct 3, 2020


@aprokop aprokop commented Aug 18, 2020

Here's a figure to help understand what ropes are:

[Figure: screenshot illustrating ropes; image not preserved]
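As a rough illustration of the idea behind ropes, here is a minimal sketch under stated assumptions. It is not ArborX's actual implementation: the `Node` layout, the 1D intervals standing in for bounding boxes, and the names `traverse`, `overlaps`, and `make_example_tree` are all hypothetical. Each node stores a rope (escape index), the node to visit next once its subtree is skipped or finished, so the traversal needs no stack.

```cpp
#include <cassert>
#include <vector>

// Hypothetical node layout for illustration only (not ArborX's actual Node).
struct Node
{
  float lo, hi;   // 1D interval standing in for a bounding box
  int left_child; // index of the left child, or -1 for a leaf
  int rope;       // next node once this subtree is skipped or finished, -1 ends
};

bool overlaps(Node const &node, float qlo, float qhi)
{
  return node.lo <= qhi && qlo <= node.hi;
}

// Collect the indices of all leaves whose interval overlaps [qlo, qhi].
// No traversal stack is kept: the next node is either the left child or the rope.
std::vector<int> traverse(std::vector<Node> const &nodes, float qlo, float qhi)
{
  std::vector<int> hits;
  int i = nodes.empty() ? -1 : 0; // start at the root (index 0)
  while (i != -1)
  {
    Node const &node = nodes[i];
    if (!overlaps(node, qlo, qhi))
    {
      i = node.rope; // skip the whole subtree
    }
    else if (node.left_child == -1)
    {
      hits.push_back(i); // leaf hit, then escape
      i = node.rope;
    }
    else
    {
      i = node.left_child; // descend; the right child is reached via the left child's rope
    }
  }
  return hits;
}

// A small hand-built tree with four leaves (indices 2, 3, 5, 6).
std::vector<Node> make_example_tree()
{
  return {
      {0.f, 7.f, 1, -1},  // 0: root, rope ends the traversal
      {0.f, 3.f, 2, 4},   // 1: internal, rope points to its sibling 4
      {0.f, 1.f, -1, 3},  // 2: leaf, rope points to its sibling 3
      {2.f, 3.f, -1, 4},  // 3: leaf, rope inherits the parent's rope (4)
      {4.f, 7.f, 5, -1},  // 4: internal, rope ends the traversal
      {4.f, 5.f, -1, 6},  // 5: leaf, rope points to its sibling 6
      {6.f, 7.f, -1, -1}, // 6: leaf, rope ends the traversal
  };
}
```

With this tree, calling `traverse(make_example_tree(), 2.5f, 4.5f)` visits nodes 0, 1, 2, 3, 4, 5, 6 in that order and reports leaves 3 and 5. A query that misses the root follows the root's rope and stops immediately.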

aprokop added the performance label Aug 18, 2020
aprokop requested a review from dalg24 August 18, 2020 13:58
aprokop changed the title from "Stackless rebased" to "Implement stackless tree traversal using ropes" Aug 18, 2020
aprokop marked this pull request as ready for review August 18, 2020 17:08

aprokop commented Aug 18, 2020

Summit results (master 8f1cc64 vs branch 9b9f577). Columns: relative time change, relative CPU change, old time, new time, old CPU, new CPU.

Serial

BM_construction<ArborX::BVH<Serial>>/10000/0/manual_time_median                                     +0.0347         +0.0335          1779          1840          1847          1909
BM_construction<ArborX::BVH<Serial>>/100000/0/manual_time_median                                    +0.0368         +0.0366         16037         16627         16106         16694
BM_construction<ArborX::BVH<Serial>>/1000000/0/manual_time_median                                   +0.0463         +0.0466        171383        179315        171759        179768
BM_construction<ArborX::BVH<Serial>>/10000/1/manual_time_median                                     +0.0356         +0.0343          1738          1800          1806          1868
BM_construction<ArborX::BVH<Serial>>/100000/1/manual_time_median                                    +0.0405         +0.0404         15607         16240         15674         16308
BM_construction<ArborX::BVH<Serial>>/1000000/1/manual_time_median                                   +0.0492         +0.0495        167327        175551        167700        175997
BM_knn_search<ArborX::BVH<Serial>>/10000/10000/10/1/0/2/manual_time_median                          -0.0057         -0.0057         50367         50078         50363         50075
BM_knn_search<ArborX::BVH<Serial>>/100000/100000/10/1/0/2/manual_time_median                        -0.0027         -0.0027        526133        524730        526071        524665
BM_knn_search<ArborX::BVH<Serial>>/1000000/1000000/10/1/0/2/manual_time_median                      -0.0009         -0.0009       5578910       5573748       5578397       5573250
BM_knn_search<ArborX::BVH<Serial>>/10000/10000/10/1/1/3/manual_time_median                          -0.0100         -0.0100         49051         48562         49048         48558
BM_knn_search<ArborX::BVH<Serial>>/100000/100000/10/1/1/3/manual_time_median                        -0.0081         -0.0081        650790        645529        650711        645456
BM_knn_search<ArborX::BVH<Serial>>/1000000/1000000/10/1/1/3/manual_time_median                      -0.0056         -0.0056       8917848       8867477       8917011       8866654
BM_knn_search<ArborX::BVH<Serial>>/10000/10000/10/0/0/2/manual_time_median                          +0.0176         +0.0176         51433         52341         51430         52337
BM_knn_search<ArborX::BVH<Serial>>/100000/100000/10/0/0/2/manual_time_median                        +0.0247         +0.0247        550643        564229        550568        564153
BM_knn_search<ArborX::BVH<Serial>>/1000000/1000000/10/0/0/2/manual_time_median                      +0.0299         +0.0299       6204356       6390111       6202311       6387803
BM_knn_search<ArborX::BVH<Serial>>/10000/10000/10/0/1/3/manual_time_median                          +0.0191         +0.0191         55884         56951         55879         56946
BM_knn_search<ArborX::BVH<Serial>>/100000/100000/10/0/1/3/manual_time_median                        +0.0298         +0.0298        776381        799491        776295        799393
BM_knn_search<ArborX::BVH<Serial>>/1000000/1000000/10/0/1/3/manual_time_median                      +0.0366         +0.0366      11086560      11492440      11082968      11488470
BM_radius_search<ArborX::BVH<Serial>>/10000/10000/10/1/0/0/2/manual_time_median                     +0.0314         +0.0314         64203         66220         64199         66214
BM_radius_search<ArborX::BVH<Serial>>/100000/100000/10/1/0/0/2/manual_time_median                   +0.0350         +0.0350        670175        693624        670112        693551
BM_radius_search<ArborX::BVH<Serial>>/1000000/1000000/10/1/0/0/2/manual_time_median                 +0.0322         +0.0322       7089267       7317474       7088609       7316843
BM_radius_search<ArborX::BVH<Serial>>/10000/10000/10/1/0/1/3/manual_time_median                     +0.0145         +0.0145         16324         16561         16324         16561
BM_radius_search<ArborX::BVH<Serial>>/100000/100000/10/1/0/1/3/manual_time_median                   +0.0156         +0.0156        119657        121520        119646        121507
BM_radius_search<ArborX::BVH<Serial>>/1000000/1000000/10/1/0/1/3/manual_time_median                 +0.0099         +0.0099       1153652       1165117       1153439       1164896
BM_radius_search<ArborX::BVH<Serial>>/10000/10000/10/1/10/0/2/manual_time_median                    +0.0304         +0.0304         64905         66881         64900         66876
BM_radius_search<ArborX::BVH<Serial>>/100000/100000/10/1/10/0/2/manual_time_median                  +0.0341         +0.0341        677546        700632        677467        700550
BM_radius_search<ArborX::BVH<Serial>>/1000000/1000000/10/1/10/0/2/manual_time_median                +0.0316         +0.0316       7223087       7451303       7222308       7450559
BM_radius_search<ArborX::BVH<Serial>>/10000/10000/10/1/10/1/3/manual_time_median                    +0.0097         +0.0097         16646         16807         16646         16807
BM_radius_search<ArborX::BVH<Serial>>/100000/100000/10/1/10/1/3/manual_time_median                  +0.0128         +0.0127        121792        123346        121780        123332
BM_radius_search<ArborX::BVH<Serial>>/1000000/1000000/10/1/10/1/3/manual_time_median                +0.0085         +0.0085       1175602       1185541       1175360       1185307
BM_radius_search<ArborX::BVH<Serial>>/10000/10000/10/0/0/0/2/manual_time_median                     +0.1190         +0.1190         67954         76039         67949         76032
BM_radius_search<ArborX::BVH<Serial>>/100000/100000/10/0/0/0/2/manual_time_median                   +0.1323         +0.1323        734705        831875        734623        831781
BM_radius_search<ArborX::BVH<Serial>>/1000000/1000000/10/0/0/0/2/manual_time_median                 +0.1313         +0.1312       8416671       9521665       8413562       9517825
BM_radius_search<ArborX::BVH<Serial>>/10000/10000/10/0/0/1/3/manual_time_median                     +0.0939         +0.0939         20492         22417         20492         22416
BM_radius_search<ArborX::BVH<Serial>>/100000/100000/10/0/0/1/3/manual_time_median                   +0.0927         +0.0927        142453        155660        142438        155646
BM_radius_search<ArborX::BVH<Serial>>/1000000/1000000/10/0/0/1/3/manual_time_median                 +0.0834         +0.0834       1156461       1252863       1156300       1252692
BM_radius_search<ArborX::BVH<Serial>>/10000/10000/10/0/10/0/2/manual_time_median                    +0.1055         +0.1055         69593         76935         69587         76928
BM_radius_search<ArborX::BVH<Serial>>/100000/100000/10/0/10/0/2/manual_time_median                  +0.1176         +0.1176        753227        841818        753146        841724
BM_radius_search<ArborX::BVH<Serial>>/1000000/1000000/10/0/10/0/2/manual_time_median                +0.1176         +0.1176       8632040       9647551       8628961       9643642
BM_radius_search<ArborX::BVH<Serial>>/10000/10000/10/0/10/1/3/manual_time_median                    +0.0784         +0.0784         21075         22727         21074         22727
BM_radius_search<ArborX::BVH<Serial>>/100000/100000/10/0/10/1/3/manual_time_median                  +0.0761         +0.0761        146781        157953        146766        157938
BM_radius_search<ArborX::BVH<Serial>>/1000000/1000000/10/0/10/1/3/manual_time_median                +0.0728         +0.0728       1190241       1276947       1190034       1276704

Everything looks good except the unsorted spatial traversals, which are about 7-13% slower. This is acceptable.

OpenMP

BM_construction<ArborX::BVH<OpenMP>>/10000/0/manual_time_median                                     +0.0087         +0.0084           326           329           336           339
BM_construction<ArborX::BVH<OpenMP>>/100000/0/manual_time_median                                    +0.0102         +0.0103          1544          1560          1559          1575
BM_construction<ArborX::BVH<OpenMP>>/1000000/0/manual_time_median                                   +0.0134         +0.0218         14653         14849         15046         15374
BM_construction<ArborX::BVH<OpenMP>>/10000/1/manual_time_median                                     +0.0103         +0.0100           340           343           350           353
BM_construction<ArborX::BVH<OpenMP>>/100000/1/manual_time_median                                    +0.0120         +0.0148          1707          1728          1717          1743
BM_construction<ArborX::BVH<OpenMP>>/1000000/1/manual_time_median                                   +0.0124         +0.0193         17356         17572         17749         18092
BM_knn_search<ArborX::BVH<OpenMP>>/10000/10000/10/1/0/2/manual_time_median                          +0.0055         +0.0055          1768          1778          1771          1780
BM_knn_search<ArborX::BVH<OpenMP>>/100000/100000/10/1/0/2/manual_time_median                        +0.0060         +0.0070         15177         15268         14834         14938
BM_knn_search<ArborX::BVH<OpenMP>>/1000000/1000000/10/1/0/2/manual_time_median                      +0.0046         +0.0119        156081        156793        153633        155465
BM_knn_search<ArborX::BVH<OpenMP>>/10000/10000/10/1/1/3/manual_time_median                          +0.0022         +0.0022          2116          2121          2119          2123
BM_knn_search<ArborX::BVH<OpenMP>>/100000/100000/10/1/1/3/manual_time_median                        -0.0002         -0.0022         25958         25953         24267         24215
BM_knn_search<ArborX::BVH<OpenMP>>/1000000/1000000/10/1/1/3/manual_time_median                      -0.0070         -0.0021        387182        384458        328597        327920
BM_knn_search<ArborX::BVH<OpenMP>>/10000/10000/10/0/0/2/manual_time_median                          -0.0100         -0.0098          1561          1546          1564          1548
BM_knn_search<ArborX::BVH<OpenMP>>/100000/100000/10/0/0/2/manual_time_median                        -0.0088         -0.0073         15027         14895         14353         14248
BM_knn_search<ArborX::BVH<OpenMP>>/1000000/1000000/10/0/0/2/manual_time_median                      +0.0225         +0.0169        199159        203640        189860        193066
BM_knn_search<ArborX::BVH<OpenMP>>/10000/10000/10/0/1/3/manual_time_median                          -0.0028         -0.0027          1687          1683          1689          1685
BM_knn_search<ArborX::BVH<OpenMP>>/100000/100000/10/0/1/3/manual_time_median                        +0.0012         +0.0022         21085         21110         21061         21108
BM_knn_search<ArborX::BVH<OpenMP>>/1000000/1000000/10/0/1/3/manual_time_median                      +0.0208         +0.0187        326276        333062        325147        331223
BM_radius_search<ArborX::BVH<OpenMP>>/10000/10000/10/1/0/0/2/manual_time_median                     +0.0828         +0.0828          2041          2211          2044          2213
BM_radius_search<ArborX::BVH<OpenMP>>/100000/100000/10/1/0/0/2/manual_time_median                   +0.0986         +0.0967         18473         20294         17791         19511
BM_radius_search<ArborX::BVH<OpenMP>>/1000000/1000000/10/1/0/0/2/manual_time_median                 +0.0985         +0.1028        184979        203195        181979        200696
BM_radius_search<ArborX::BVH<OpenMP>>/10000/10000/10/1/0/1/3/manual_time_median                     +0.0498         +0.0392          1250          1312           922           959
BM_radius_search<ArborX::BVH<OpenMP>>/100000/100000/10/1/0/1/3/manual_time_median                   +0.0565         +0.0403          7812          8253          3206          3335
BM_radius_search<ArborX::BVH<OpenMP>>/1000000/1000000/10/1/0/1/3/manual_time_median                 +0.0339         +0.0302         69361         71714         29962         30866
BM_radius_search<ArborX::BVH<OpenMP>>/10000/10000/10/1/10/0/2/manual_time_median                    +0.0571         +0.0570          2152          2274          2154          2277
BM_radius_search<ArborX::BVH<OpenMP>>/100000/100000/10/1/10/0/2/manual_time_median                  +0.0784         +0.0784         19135         20635         18391         19833
BM_radius_search<ArborX::BVH<OpenMP>>/1000000/1000000/10/1/10/0/2/manual_time_median                +0.0784         +0.0817        193658        208837        190625        206202
BM_radius_search<ArborX::BVH<OpenMP>>/10000/10000/10/1/10/1/3/manual_time_median                    +0.0345         +0.0218          1325          1371           955           976
BM_radius_search<ArborX::BVH<OpenMP>>/100000/100000/10/1/10/1/3/manual_time_median                  +0.0364         +0.0305          8089          8383          3282          3382
BM_radius_search<ArborX::BVH<OpenMP>>/1000000/1000000/10/1/10/1/3/manual_time_median                +0.0225         +0.0344         71794         73412         31016         32082
BM_radius_search<ArborX::BVH<OpenMP>>/10000/10000/10/0/0/0/2/manual_time_median                     +0.0877         +0.0879          1847          2009          1849          2011
BM_radius_search<ArborX::BVH<OpenMP>>/100000/100000/10/0/0/0/2/manual_time_median                   +0.1142         +0.1142         18627         20756         18628         20756
BM_radius_search<ArborX::BVH<OpenMP>>/1000000/1000000/10/0/0/0/2/manual_time_median                 +0.1206         +0.1168        259946        291299        258188        288338
BM_radius_search<ArborX::BVH<OpenMP>>/10000/10000/10/0/0/1/3/manual_time_median                     +0.0483         +0.0491           691           724           692           726
BM_radius_search<ArborX::BVH<OpenMP>>/100000/100000/10/0/0/1/3/manual_time_median                   +0.0712         +0.0711          3858          4132          3859          4134
BM_radius_search<ArborX::BVH<OpenMP>>/1000000/1000000/10/0/0/1/3/manual_time_median                 +0.0648         +0.0734         29307         31206         28680         30785
BM_radius_search<ArborX::BVH<OpenMP>>/10000/10000/10/0/10/0/2/manual_time_median                    +0.0644         +0.0646          1932          2057          1934          2059
BM_radius_search<ArborX::BVH<OpenMP>>/100000/100000/10/0/10/0/2/manual_time_median                  +0.0878         +0.0879         19293         20987         19292         20987
BM_radius_search<ArborX::BVH<OpenMP>>/1000000/1000000/10/0/10/0/2/manual_time_median                +0.1005         +0.0974        267470        294344        265944        291851
BM_radius_search<ArborX::BVH<OpenMP>>/10000/10000/10/0/10/1/3/manual_time_median                    +0.0356         +0.0361           736           762           738           765
BM_radius_search<ArborX::BVH<OpenMP>>/100000/100000/10/0/10/1/3/manual_time_median                  +0.0508         +0.0509          4030          4235          4032          4237
BM_radius_search<ArborX::BVH<OpenMP>>/1000000/1000000/10/0/10/1/3/manual_time_median                +0.0516         +0.0583         30868         32459         30181         31941

Radius traversals are about 2-12% slower. Acceptable.

Cuda

BM_construction<ArborX::BVH<Cuda>>/100000/0/manual_time_median                                      +0.0347         +0.0299          2142          2216          2533          2609
BM_construction<ArborX::BVH<Cuda>>/1000000/0/manual_time_median                                     +0.1531         +0.1426          6589          7597          7078          8087
BM_construction<ArborX::BVH<Cuda>>/10000/1/manual_time_median                                       +0.0205         +0.0204          1405          1434          1419          1448
BM_construction<ArborX::BVH<Cuda>>/100000/1/manual_time_median                                      +0.0383         +0.0335          2144          2226          2535          2620
BM_construction<ArborX::BVH<Cuda>>/1000000/1/manual_time_median                                     +0.1545         +0.1444          6595          7613          7080          8102
BM_knn_search<ArborX::BVH<Cuda>>/10000/10000/10/1/0/2/manual_time_median                            -0.0596         -0.0568          1544          1451          1635          1542
BM_knn_search<ArborX::BVH<Cuda>>/100000/100000/10/1/0/2/manual_time_median                          -0.0326         -0.0299          6447          6237          6910          6704
BM_knn_search<ArborX::BVH<Cuda>>/1000000/1000000/10/1/0/2/manual_time_median                        -0.0197         -0.0185         26885         26356         27767         27252
BM_knn_search<ArborX::BVH<Cuda>>/10000/10000/10/1/1/3/manual_time_median                            +0.0102         +0.0098          1591          1607          1681          1698
BM_knn_search<ArborX::BVH<Cuda>>/100000/100000/10/1/1/3/manual_time_median                          -0.0047         +0.0007          7233          7199          7659          7665
BM_knn_search<ArborX::BVH<Cuda>>/1000000/1000000/10/1/1/3/manual_time_median                        -0.0275         -0.0274         43780         42578         44665         43440
BM_knn_search<ArborX::BVH<Cuda>>/10000/10000/10/0/0/2/manual_time_median                            +0.0015         +0.0014          1155          1157          1246          1248
BM_knn_search<ArborX::BVH<Cuda>>/100000/100000/10/0/0/2/manual_time_median                          -0.0091         -0.0056          4655          4613          5109          5080
BM_knn_search<ArborX::BVH<Cuda>>/1000000/1000000/10/0/0/2/manual_time_median                        +0.0416         +0.0401         54009         56255         54891         57090
BM_knn_search<ArborX::BVH<Cuda>>/10000/10000/10/0/1/3/manual_time_median                            +0.0215         +0.0204          1209          1235          1299          1326
BM_knn_search<ArborX::BVH<Cuda>>/100000/100000/10/0/1/3/manual_time_median                          -0.0178         -0.0206          6819          6697          7282          7131
BM_knn_search<ArborX::BVH<Cuda>>/1000000/1000000/10/0/1/3/manual_time_median                        +0.0457         +0.0453         96188        100589         97083        101477
BM_radius_search<ArborX::BVH<Cuda>>/10000/10000/10/1/0/0/2/manual_time_median                       +0.0124         +0.0109           953           965          1045          1056
BM_radius_search<ArborX::BVH<Cuda>>/100000/100000/10/1/0/0/2/manual_time_median                     -0.3011         -0.2720          4526          3163          4996          3637
BM_radius_search<ArborX::BVH<Cuda>>/1000000/1000000/10/1/0/0/2/manual_time_median                   -0.4783         -0.4652         32385         16896         33282         17799
BM_radius_search<ArborX::BVH<Cuda>>/10000/10000/10/1/0/1/3/manual_time_median                       +0.0147         +0.0127           886           899           978           990
BM_radius_search<ArborX::BVH<Cuda>>/100000/100000/10/1/0/1/3/manual_time_median                     -0.0506         -0.0430          2585          2455          3051          2920
BM_radius_search<ArborX::BVH<Cuda>>/1000000/1000000/10/1/0/1/3/manual_time_median                   -0.0881         -0.0729          8457          7711          9234          8560
BM_radius_search<ArborX::BVH<Cuda>>/10000/10000/10/1/10/0/2/manual_time_median                      -0.0091         -0.0086          1092          1082          1184          1174
BM_radius_search<ArborX::BVH<Cuda>>/100000/100000/10/1/10/0/2/manual_time_median                    -0.3205         -0.2978          6177          4197          6645          4666
BM_radius_search<ArborX::BVH<Cuda>>/1000000/1000000/10/1/10/0/2/manual_time_median                  -0.4332         -0.4238         43002         24375         43871         25280
BM_radius_search<ArborX::BVH<Cuda>>/10000/10000/10/1/10/1/3/manual_time_median                      +0.0021         +0.0018          1007          1009          1099          1101
BM_radius_search<ArborX::BVH<Cuda>>/100000/100000/10/1/10/1/3/manual_time_median                    -0.0170         -0.0088          3731          3668          4169          4133
BM_radius_search<ArborX::BVH<Cuda>>/1000000/1000000/10/1/10/1/3/manual_time_median                  -0.0561         -0.0512         10452          9866         11284         10706
BM_radius_search<ArborX::BVH<Cuda>>/10000/10000/10/0/0/0/2/manual_time_median                       +0.0759         +0.0670           632           680           722           770
BM_radius_search<ArborX::BVH<Cuda>>/100000/100000/10/0/0/0/2/manual_time_median                     -0.3886         -0.3412          3370          2060          3838          2529
BM_radius_search<ArborX::BVH<Cuda>>/1000000/1000000/10/0/0/0/2/manual_time_median                   -0.2300         -0.2276         82558         63572         83460         64466
BM_radius_search<ArborX::BVH<Cuda>>/10000/10000/10/0/0/1/3/manual_time_median                       +0.0278         +0.0240           569           584           660           676
BM_radius_search<ArborX::BVH<Cuda>>/100000/100000/10/0/0/1/3/manual_time_median                     -0.0193         -0.0138          1336          1310          1801          1777
BM_radius_search<ArborX::BVH<Cuda>>/1000000/1000000/10/0/0/1/3/manual_time_median                   -0.0794         -0.0642          5387          4959          6208          5810
BM_radius_search<ArborX::BVH<Cuda>>/10000/10000/10/0/10/0/2/manual_time_median                      +0.0052         +0.0044           744           748           834           838
BM_radius_search<ArborX::BVH<Cuda>>/100000/100000/10/0/10/0/2/manual_time_median                    -0.3662         -0.3340          4950          3137          5418          3608
BM_radius_search<ArborX::BVH<Cuda>>/1000000/1000000/10/0/10/0/2/manual_time_median                  -0.2261         -0.2205         93925         72692         94300         73509
BM_radius_search<ArborX::BVH<Cuda>>/10000/10000/10/0/10/1/3/manual_time_median                      +0.0083         +0.0074           694           700           786           791
BM_radius_search<ArborX::BVH<Cuda>>/100000/100000/10/0/10/1/3/manual_time_median                    -0.0167         -0.0137          2417          2377          2880          2841
BM_radius_search<ArborX::BVH<Cuda>>/1000000/1000000/10/0/10/1/3/manual_time_median                  -0.0867         -0.0774          7602          6943          8443          7790

Construction is ~15% slower at the largest size, which is expected. Spatial traversals are much faster, and the improvement grows with problem size; some cases approach a 2x speedup.
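To relate the relative changes in the tables to speedup factors, a small sketch follows. The helper names `relative_change` and `speedup` are hypothetical, and the formula assumes the first column is (new - old) / old, which matches the listed old and new times. For the sorted radius search at 1000000 points on Cuda, the time goes from 32385 to 16896, a change of about -0.478, or a speedup of about 1.92x.

```cpp
#include <cassert>
#include <cmath>

// Relative change as it appears in the first table column: (new - old) / old.
double relative_change(double time_old, double time_new)
{
  return (time_new - time_old) / time_old;
}

// Speedup factor: old / new, which equals 1 / (1 + relative_change).
double speedup(double time_old, double time_new)
{
  return time_old / time_new;
}
```

A relative change of -0.5 would correspond to exactly 2x, so the best Cuda rows (around -0.43 to -0.48) sit just below that.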

aprokop commented Aug 20, 2020

@dalg24 Should have addressed most (all?) of your comments.

aprokop commented Aug 20, 2020

retest this please

aprokop commented Aug 20, 2020

Summit results (master 4834bff vs branch bd3eff0)

Serial

BM_construction<ArborX::BVH<Serial>>/10000/0/manual_time_median                                     +0.0921         +0.0919          1798          1964          1867          2038
BM_construction<ArborX::BVH<Serial>>/100000/0/manual_time_median                                    +0.0171         +0.0175         16008         16282         16076         16357
BM_construction<ArborX::BVH<Serial>>/1000000/0/manual_time_median                                   +0.0093         +0.0094        172590        174197        173064        174688
BM_construction<ArborX::BVH<Serial>>/10000/1/manual_time_median                                     +0.1004         +0.1006          1751          1926          1819          2002
BM_construction<ArborX::BVH<Serial>>/100000/1/manual_time_median                                    +0.0190         +0.0194         15572         15868         15640         15942
BM_construction<ArborX::BVH<Serial>>/1000000/1/manual_time_median                                   +0.0120         +0.0119        168582        170598        169062        171067
BM_knn_search<ArborX::BVH<Serial>>/10000/10000/10/1/0/2/manual_time_median                          +0.0089         +0.0089         50318         50764         50314         50761
BM_knn_search<ArborX::BVH<Serial>>/100000/100000/10/1/0/2/manual_time_median                        +0.0082         +0.0082        525442        529764        525381        529705
BM_knn_search<ArborX::BVH<Serial>>/1000000/1000000/10/1/0/2/manual_time_median                      +0.0079         +0.0079       5575179       5618957       5574665       5618499
BM_knn_search<ArborX::BVH<Serial>>/10000/10000/10/1/1/3/manual_time_median                          -0.0072         -0.0072         49165         48809         49162         48806
BM_knn_search<ArborX::BVH<Serial>>/100000/100000/10/1/1/3/manual_time_median                        -0.0017         -0.0017        651362        650227        651284        650147
BM_knn_search<ArborX::BVH<Serial>>/1000000/1000000/10/1/1/3/manual_time_median                      +0.0024         +0.0024       8931631       8952903       8930890       8952144
BM_knn_search<ArborX::BVH<Serial>>/10000/10000/10/0/0/2/manual_time_median                          +0.0062         +0.0062         51410         51728         51407         51724
BM_knn_search<ArborX::BVH<Serial>>/100000/100000/10/0/0/2/manual_time_median                        +0.0129         +0.0129        549630        556726        549559        556657
BM_knn_search<ArborX::BVH<Serial>>/1000000/1000000/10/0/0/2/manual_time_median                      +0.0168         +0.0168       6200409       6304834       6198556       6302754
BM_knn_search<ArborX::BVH<Serial>>/10000/10000/10/0/1/3/manual_time_median                          +0.0128         +0.0128         55811         56525         55807         56522
BM_knn_search<ArborX::BVH<Serial>>/100000/100000/10/0/1/3/manual_time_median                        +0.0208         +0.0208        776500        792679        776404        792574
BM_knn_search<ArborX::BVH<Serial>>/1000000/1000000/10/0/1/3/manual_time_median                      +0.0283         +0.0283      11097232      11411584      11094027      11407970
BM_radius_search<ArborX::BVH<Serial>>/10000/10000/10/1/0/0/2/manual_time_median                     +0.0220         +0.0221         64037         65449         64032         65444
BM_radius_search<ArborX::BVH<Serial>>/100000/100000/10/1/0/0/2/manual_time_median                   +0.0230         +0.0229        668389        683735        668328        683657
BM_radius_search<ArborX::BVH<Serial>>/1000000/1000000/10/1/0/0/2/manual_time_median                 +0.0225         +0.0225       7064364       7223542       7063934       7222822
BM_radius_search<ArborX::BVH<Serial>>/10000/10000/10/1/0/1/3/manual_time_median                     +0.0145         +0.0145         16314         16550         16314         16550
BM_radius_search<ArborX::BVH<Serial>>/100000/100000/10/1/0/1/3/manual_time_median                   +0.0078         +0.0078        119515        120448        119504        120436
BM_radius_search<ArborX::BVH<Serial>>/1000000/1000000/10/1/0/1/3/manual_time_median                 +0.0034         +0.0034       1153954       1157923       1153748       1157705
BM_radius_search<ArborX::BVH<Serial>>/10000/10000/10/1/10/0/2/manual_time_median                    +0.0493         +0.0493         64781         67974         64776         67970
BM_radius_search<ArborX::BVH<Serial>>/100000/100000/10/1/10/0/2/manual_time_median                  +0.0496         +0.0496        675623        709157        675547        709072
BM_radius_search<ArborX::BVH<Serial>>/1000000/1000000/10/1/10/0/2/manual_time_median                +0.0468         +0.0468       7201237       7537970       7200584       7537308
BM_radius_search<ArborX::BVH<Serial>>/10000/10000/10/1/10/1/3/manual_time_median                    +0.0301         +0.0301         16669         17171         16669         17171
BM_radius_search<ArborX::BVH<Serial>>/100000/100000/10/1/10/1/3/manual_time_median                  +0.0184         +0.0184        121670        123913        121658        123902
BM_radius_search<ArborX::BVH<Serial>>/1000000/1000000/10/1/10/1/3/manual_time_median                +0.0100         +0.0100       1176235       1188023       1176011       1187781
BM_radius_search<ArborX::BVH<Serial>>/10000/10000/10/0/0/0/2/manual_time_median                     +0.0054         +0.0053         68190         68555         68184         68549
BM_radius_search<ArborX::BVH<Serial>>/100000/100000/10/0/0/0/2/manual_time_median                   +0.0095         +0.0095        737165        744181        737082        744099
BM_radius_search<ArborX::BVH<Serial>>/1000000/1000000/10/0/0/0/2/manual_time_median                 +0.0153         +0.0153       8437554       8566361       8434677       8563310
BM_radius_search<ArborX::BVH<Serial>>/10000/10000/10/0/0/1/3/manual_time_median                     +0.0001         +0.0001         20558         20560         20558         20560
BM_radius_search<ArborX::BVH<Serial>>/100000/100000/10/0/0/1/3/manual_time_median                   -0.0072         -0.0072        142632        141608        142619        141595
BM_radius_search<ArborX::BVH<Serial>>/1000000/1000000/10/0/0/1/3/manual_time_median                 -0.0024         -0.0024       1158937       1156189       1158778       1156040
BM_radius_search<ArborX::BVH<Serial>>/10000/10000/10/0/10/0/2/manual_time_median                    -0.0176         -0.0176         70164         68929         70158         68923
BM_radius_search<ArborX::BVH<Serial>>/100000/100000/10/0/10/0/2/manual_time_median                  -0.0110         -0.0110        755592        747316        755506        747230
BM_radius_search<ArborX::BVH<Serial>>/1000000/1000000/10/0/10/0/2/manual_time_median                -0.0045         -0.0045       8653734       8615223       8650965       8612179
BM_radius_search<ArborX::BVH<Serial>>/10000/10000/10/0/10/1/3/manual_time_median                    -0.0177         -0.0177         21211         20836         21211         20836
BM_radius_search<ArborX::BVH<Serial>>/100000/100000/10/0/10/1/3/manual_time_median                  -0.0233         -0.0233        146983        143566        146970        143551
BM_radius_search<ArborX::BVH<Serial>>/1000000/1000000/10/0/10/1/3/manual_time_median                -0.0201         -0.0201       1195197       1171219       1194991       1171006

OpenMP

BM_construction<ArborX::BVH<OpenMP>>/10000/0/manual_time_median                                     +0.8025         +0.8323           333           601           344           630
BM_construction<ArborX::BVH<OpenMP>>/100000/0/manual_time_median                                    +0.1488         +0.1594          1551          1781          1565          1815
BM_construction<ArborX::BVH<OpenMP>>/1000000/0/manual_time_median                                   +0.0013         +0.0022         14832         14850         15357         15391
BM_construction<ArborX::BVH<OpenMP>>/10000/1/manual_time_median                                     +0.7503         +0.7809           348           608           358           638
BM_construction<ArborX::BVH<OpenMP>>/100000/1/manual_time_median                                    +0.1267         +0.1365          1726          1944          1740          1978
BM_construction<ArborX::BVH<OpenMP>>/1000000/1/manual_time_median                                   +0.0026         +0.0030         17525         17570         18050         18104
BM_knn_search<ArborX::BVH<OpenMP>>/10000/10000/10/1/0/2/manual_time_median                          +0.2486         +0.2483          1773          2213          1775          2216
BM_knn_search<ArborX::BVH<OpenMP>>/100000/100000/10/1/0/2/manual_time_median                        +0.0377         +0.0406         15177         15748         14828         15430
BM_knn_search<ArborX::BVH<OpenMP>>/1000000/1000000/10/1/0/2/manual_time_median                      +0.0169         +0.0176        155295        157920        153932        156645
BM_knn_search<ArborX::BVH<OpenMP>>/10000/10000/10/1/1/3/manual_time_median                          +0.2216         +0.2215          2126          2597          2128          2599
BM_knn_search<ArborX::BVH<OpenMP>>/100000/100000/10/1/1/3/manual_time_median                        +0.0524         +0.0469         25975         27336         24275         25415
BM_knn_search<ArborX::BVH<OpenMP>>/1000000/1000000/10/1/1/3/manual_time_median                      +0.0481         +0.0333        387400        406026        329099        340068
BM_knn_search<ArborX::BVH<OpenMP>>/10000/10000/10/0/0/2/manual_time_median                          +0.1406         +0.1408          1564          1783          1565          1785
BM_knn_search<ArborX::BVH<OpenMP>>/100000/100000/10/0/0/2/manual_time_median                        +0.0133         +0.0095         15018         15217         14345         14481
BM_knn_search<ArborX::BVH<OpenMP>>/1000000/1000000/10/0/0/2/manual_time_median                      +0.0181         +0.0151        198128        201720        188841        191687
BM_knn_search<ArborX::BVH<OpenMP>>/10000/10000/10/0/1/3/manual_time_median                          +0.1271         +0.1263          1690          1905          1693          1907
BM_knn_search<ArborX::BVH<OpenMP>>/100000/100000/10/0/1/3/manual_time_median                        +0.0070         +0.0072         21079         21226         21074         21226
BM_knn_search<ArborX::BVH<OpenMP>>/1000000/1000000/10/0/1/3/manual_time_median                      +0.0110         +0.0112        325045        328628        324158        327803
BM_radius_search<ArborX::BVH<OpenMP>>/10000/10000/10/1/0/0/2/manual_time_median                     +0.1638         +0.1634          2045          2380          2047          2382
BM_radius_search<ArborX::BVH<OpenMP>>/100000/100000/10/1/0/0/2/manual_time_median                   +0.0356         +0.0392         18481         19139         17794         18491
BM_radius_search<ArborX::BVH<OpenMP>>/1000000/1000000/10/1/0/0/2/manual_time_median                 +0.0214         +0.0243        185023        188977        182116        186536
BM_radius_search<ArborX::BVH<OpenMP>>/10000/10000/10/1/0/1/3/manual_time_median                     +0.2090         +0.3857          1256          1519           929          1287
BM_radius_search<ArborX::BVH<OpenMP>>/100000/100000/10/1/0/1/3/manual_time_median                   +0.0375         +0.0898          7820          8113          3210          3498
BM_radius_search<ArborX::BVH<OpenMP>>/1000000/1000000/10/1/0/1/3/manual_time_median                 +0.0028         +0.0019         69407         69602         30083         30140
BM_radius_search<ArborX::BVH<OpenMP>>/10000/10000/10/1/10/0/2/manual_time_median                    +0.1446         +0.1443          2152          2463          2155          2466
BM_radius_search<ArborX::BVH<OpenMP>>/100000/100000/10/1/10/0/2/manual_time_median                  +0.0013         +0.0099         19117         19142         18357         18539
BM_radius_search<ArborX::BVH<OpenMP>>/1000000/1000000/10/1/10/0/2/manual_time_median                -0.0122         -0.0077        193500        191135        190649        189184
BM_radius_search<ArborX::BVH<OpenMP>>/10000/10000/10/1/10/1/3/manual_time_median                    +0.2216         +0.4303          1332          1627           961          1375
BM_radius_search<ArborX::BVH<OpenMP>>/100000/100000/10/1/10/1/3/manual_time_median                  +0.0308         +0.0984          8086          8335          3282          3604
BM_radius_search<ArborX::BVH<OpenMP>>/1000000/1000000/10/1/10/1/3/manual_time_median                +0.0017         +0.0084         71887         72011         31257         31520
BM_radius_search<ArborX::BVH<OpenMP>>/10000/10000/10/0/0/0/2/manual_time_median                     +0.0890         +0.0889          1854          2019          1857          2022
BM_radius_search<ArborX::BVH<OpenMP>>/100000/100000/10/0/0/0/2/manual_time_median                   +0.0417         +0.0417         18647         19425         18648         19426
BM_radius_search<ArborX::BVH<OpenMP>>/1000000/1000000/10/0/0/0/2/manual_time_median                 +0.0494         +0.0465        257030        269716        256085        268000
BM_radius_search<ArborX::BVH<OpenMP>>/10000/10000/10/0/0/1/3/manual_time_median                     +0.1763         +0.1756           695           818           698           820
BM_radius_search<ArborX::BVH<OpenMP>>/100000/100000/10/0/0/1/3/manual_time_median                   +0.0537         +0.0537          3854          4062          3857          4064
BM_radius_search<ArborX::BVH<OpenMP>>/1000000/1000000/10/0/0/1/3/manual_time_median                 +0.0234         +0.0214         29365         30053         28795         29411
BM_radius_search<ArborX::BVH<OpenMP>>/10000/10000/10/0/10/0/2/manual_time_median                    +0.0773         +0.0773          1944          2094          1946          2097
BM_radius_search<ArborX::BVH<OpenMP>>/100000/100000/10/0/10/0/2/manual_time_median                  +0.0082         +0.0082         19295         19452         19296         19453
BM_radius_search<ArborX::BVH<OpenMP>>/1000000/1000000/10/0/10/0/2/manual_time_median                +0.0246         +0.0247        264474        270985        263082        269591
BM_radius_search<ArborX::BVH<OpenMP>>/10000/10000/10/0/10/1/3/manual_time_median                    +0.2030         +0.2023           743           894           746           896
BM_radius_search<ArborX::BVH<OpenMP>>/100000/100000/10/0/10/1/3/manual_time_median                  +0.0364         +0.0364          4018          4164          4020          4167
BM_radius_search<ArborX::BVH<OpenMP>>/1000000/1000000/10/0/10/1/3/manual_time_median                +0.0003         +0.0007         30946         30955         30234         30255

Cuda

BM_construction<ArborX::BVH<Cuda>>/10000/0/manual_time_median                                       +0.0066         +0.0061          1406          1415          1421          1430
BM_construction<ArborX::BVH<Cuda>>/100000/0/manual_time_median                                      +0.0074         +0.0065          2142          2158          2535          2552
BM_construction<ArborX::BVH<Cuda>>/1000000/0/manual_time_median                                     +0.0079         +0.0080          6601          6653          7090          7147
BM_construction<ArborX::BVH<Cuda>>/10000/1/manual_time_median                                       +0.0062         +0.0064          1406          1415          1421          1430
BM_construction<ArborX::BVH<Cuda>>/100000/1/manual_time_median                                      +0.0059         +0.0060          2145          2157          2536          2551
BM_construction<ArborX::BVH<Cuda>>/1000000/1/manual_time_median                                     +0.0077         +0.0076          6604          6655          7096          7150
BM_knn_search<ArborX::BVH<Cuda>>/10000/10000/10/1/0/2/manual_time_median                            +0.0305         +0.0303          1527          1574          1617          1666
BM_knn_search<ArborX::BVH<Cuda>>/100000/100000/10/1/0/2/manual_time_median                          +0.0086         +0.0084          6369          6424          6835          6892
BM_knn_search<ArborX::BVH<Cuda>>/1000000/1000000/10/1/0/2/manual_time_median                        -0.0062         -0.0055         26730         26565         27619         27467
BM_knn_search<ArborX::BVH<Cuda>>/10000/10000/10/1/1/3/manual_time_median                            +0.0381         +0.0378          1576          1636          1665          1728
BM_knn_search<ArborX::BVH<Cuda>>/100000/100000/10/1/1/3/manual_time_median                          +0.0089         +0.0087          7172          7235          7635          7701
BM_knn_search<ArborX::BVH<Cuda>>/1000000/1000000/10/1/1/3/manual_time_median                        -0.0081         -0.0078         43847         43491         44737         44386
BM_knn_search<ArborX::BVH<Cuda>>/10000/10000/10/0/0/2/manual_time_median                            +0.0460         +0.0448          1147          1200          1237          1293
BM_knn_search<ArborX::BVH<Cuda>>/100000/100000/10/0/0/2/manual_time_median                          +0.0064         +0.0062          4607          4637          5073          5104
BM_knn_search<ArborX::BVH<Cuda>>/1000000/1000000/10/0/0/2/manual_time_median                        +0.0380         +0.0374         54056         56113         54938         56995
BM_knn_search<ArborX::BVH<Cuda>>/10000/10000/10/0/1/3/manual_time_median                            +0.0387         +0.0381          1198          1244          1286          1335
BM_knn_search<ArborX::BVH<Cuda>>/100000/100000/10/0/1/3/manual_time_median                          +0.0142         +0.0138          6698          6794          7165          7264
BM_knn_search<ArborX::BVH<Cuda>>/1000000/1000000/10/0/1/3/manual_time_median                        +0.0296         +0.0303         96463         99315         97245        100190
BM_radius_search<ArborX::BVH<Cuda>>/10000/10000/10/1/0/0/2/manual_time_median                       +0.0359         +0.0349           945           979          1035          1071
BM_radius_search<ArborX::BVH<Cuda>>/100000/100000/10/1/0/0/2/manual_time_median                     -0.3020         -0.2725          4545          3173          5015          3649
BM_radius_search<ArborX::BVH<Cuda>>/1000000/1000000/10/1/0/0/2/manual_time_median                   -0.4761         -0.4633         32307         16926         33211         17826
BM_radius_search<ArborX::BVH<Cuda>>/10000/10000/10/1/0/1/3/manual_time_median                       +0.0309         +0.0304           884           911           974          1004
BM_radius_search<ArborX::BVH<Cuda>>/100000/100000/10/1/0/1/3/manual_time_median                     -0.0458         -0.0392          2584          2466          3050          2930
BM_radius_search<ArborX::BVH<Cuda>>/1000000/1000000/10/1/0/1/3/manual_time_median                   -0.0798         -0.0720          8393          7723          9236          8571
BM_radius_search<ArborX::BVH<Cuda>>/10000/10000/10/1/10/0/2/manual_time_median                      +0.0139         +0.0148          1081          1096          1171          1189
BM_radius_search<ArborX::BVH<Cuda>>/100000/100000/10/1/10/0/2/manual_time_median                    -0.3158         -0.2931          6165          4218          6635          4690
BM_radius_search<ArborX::BVH<Cuda>>/1000000/1000000/10/1/10/0/2/manual_time_median                  -0.4327         -0.4239         42959         24371         43853         25264
BM_radius_search<ArborX::BVH<Cuda>>/10000/10000/10/1/10/1/3/manual_time_median                      +0.0159         +0.0164          1006          1022          1097          1115
BM_radius_search<ArborX::BVH<Cuda>>/100000/100000/10/1/10/1/3/manual_time_median                    -0.0082         -0.0066          3709          3678          4172          4144
BM_radius_search<ArborX::BVH<Cuda>>/1000000/1000000/10/1/10/1/3/manual_time_median                  -0.0554         -0.0502         10447          9869         11282         10716
BM_radius_search<ArborX::BVH<Cuda>>/10000/10000/10/0/0/0/2/manual_time_median                       +0.1208         +0.1108           623           699           712           791
BM_radius_search<ArborX::BVH<Cuda>>/100000/100000/10/0/0/0/2/manual_time_median                     -0.3890         -0.3406          3376          2063          3847          2536
BM_radius_search<ArborX::BVH<Cuda>>/1000000/1000000/10/0/0/0/2/manual_time_median                   -0.2271         -0.2247         82569         63813         83469         64717
BM_radius_search<ArborX::BVH<Cuda>>/10000/10000/10/0/0/1/3/manual_time_median                       +0.0596         +0.0541           566           600           657           693
BM_radius_search<ArborX::BVH<Cuda>>/100000/100000/10/0/0/1/3/manual_time_median                     -0.0098         -0.0061          1338          1325          1803          1792
BM_radius_search<ArborX::BVH<Cuda>>/1000000/1000000/10/0/0/1/3/manual_time_median                   -0.0715         -0.0611          5349          4967          6196          5817
BM_radius_search<ArborX::BVH<Cuda>>/10000/10000/10/0/10/0/2/manual_time_median                      +0.0227         +0.0230           741           758           830           849
BM_radius_search<ArborX::BVH<Cuda>>/100000/100000/10/0/10/0/2/manual_time_median                    -0.3641         -0.3321          4955          3151          5424          3623
BM_radius_search<ArborX::BVH<Cuda>>/1000000/1000000/10/0/10/0/2/manual_time_median                  -0.2236         -0.2213         93355         72481         94247         73390
BM_radius_search<ArborX::BVH<Cuda>>/10000/10000/10/0/10/1/3/manual_time_median                      +0.0343         +0.0327           694           718           785           811
BM_radius_search<ArborX::BVH<Cuda>>/100000/100000/10/0/10/1/3/manual_time_median                    -0.0132         -0.0101          2422          2390          2885          2855
BM_radius_search<ArborX::BVH<Cuda>>/1000000/1000000/10/0/10/1/3/manual_time_median                  -0.0852         -0.0761          7598          6951          8442          7799

@aprokop
Contributor Author

aprokop commented Aug 20, 2020

So, with the latest patch, Serial and Cuda behave well: Serial is on par with master, and Cuda improves. But something is completely off with OpenMP. Not only did it not recover when switching back to stack-based traversal, it became much worse with it. No clue why.
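
For readers following the stack-based vs. stackless comparison, here is a minimal sketch of the two traversal styles on a toy 1D tree. This is not ArborX's code: the `Node` layout, the `rope` field name, the function names, and the example tree are all hypothetical, chosen only to illustrate the idea that a rope (escape index) points to the next node to visit once a subtree is finished or skipped.

```cpp
#include <cassert>
#include <vector>

// Hypothetical 1D node layout for illustration; not ArborX's actual node type.
struct Node
{
  float lo, hi; // bounding interval of the subtree
  int left;     // index of the left child, or -1 for a leaf
  int right;    // index of the right child (used only by the stack-based variant)
  int rope;     // escape index: next node once this subtree is done, -1 = finished
};

inline bool overlaps(Node const &n, float qlo, float qhi)
{
  return n.lo <= qhi && qlo <= n.hi;
}

// Stackless traversal: descend to the left child on a hit at an internal node,
// otherwise (miss, or leaf already processed) follow the rope.
std::vector<int> traverse_ropes(std::vector<Node> const &nodes, float qlo, float qhi)
{
  std::vector<int> hits;
  int i = nodes.empty() ? -1 : 0; // root is assumed to live at index 0
  while (i != -1)
  {
    assert(i >= 0 && i < static_cast<int>(nodes.size()));
    Node const &n = nodes[i];
    if (!overlaps(n, qlo, qhi))
      i = n.rope; // skip the whole subtree
    else if (n.left == -1)
    {
      hits.push_back(i); // leaf hit
      i = n.rope;
    }
    else
      i = n.left;
  }
  return hits;
}

// Stack-based traversal over the same tree, for comparison.
std::vector<int> traverse_stack(std::vector<Node> const &nodes, float qlo, float qhi)
{
  std::vector<int> hits;
  std::vector<int> stack;
  if (!nodes.empty())
    stack.push_back(0);
  while (!stack.empty())
  {
    int i = stack.back();
    stack.pop_back();
    Node const &n = nodes[i];
    if (!overlaps(n, qlo, qhi))
      continue;
    if (n.left == -1)
      hits.push_back(i);
    else
    {
      stack.push_back(n.right); // pushed first so the left child is visited first
      stack.push_back(n.left);
    }
  }
  return hits;
}

// Toy tree: 0 = root [0,4]; 1 = internal [0,2]; 2, 3, 4 = leaves.
// Ropes: 3 -> 4 (its sibling), 4 -> 2 (sibling of its parent), 1 -> 2, others -> -1.
std::vector<Node> make_example_tree()
{
  return {
      {0.f, 4.f, 1, 2, -1},
      {0.f, 2.f, 3, 4, 2},
      {3.f, 4.f, -1, -1, -1},
      {0.f, 1.f, -1, -1, 4},
      {1.5f, 2.f, -1, -1, 2},
  };
}
```

In this sketch both variants visit leaves in the same order; the rope version simply replaces the per-query stack with one extra index stored per node.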

@aprokop aprokop closed this Aug 20, 2020
@aprokop aprokop deleted the stackless_rebased branch August 20, 2020 21:43
@aprokop aprokop restored the stackless_rebased branch August 20, 2020 21:44
@dalg24 dalg24 reopened this Aug 21, 2020
@aprokop
Contributor Author

aprokop commented Aug 21, 2020

OK. Something I learned. If I revert 2754cf1, the results are reasonable. I'm not sure what in that patch causes the issue.
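
A note on reading the tables in this thread, under the assumption that the six numeric columns are: relative time change, relative CPU change, old time, new time, old CPU, new CPU. That reading is inferred from the numbers themselves (for example, 4545 -> 3173 matches the listed -0.3020), not stated anywhere in the output. The helper name below is made up for illustration.

```cpp
#include <cassert>
#include <cmath>

// Relative change of the branch measurement against master: (new - old) / old.
// Negative values mean the branch is faster; positive values mean it is slower.
inline double relative_change(double old_value, double new_value)
{
  assert(old_value > 0.0);
  return (new_value - old_value) / old_value;
}
```

Because the printed old/new values are rounded, recomputing the ratio from them can differ from the listed value in the third decimal place.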

Summit results (master 4834bff vs branch bd3eff0 with 2754cf1 reverted):

Serial

BM_construction<ArborX::BVH<Serial>>/10000/0/manual_time_median                                     +0.0506         +0.0432          1788          1879          1857          1937
BM_construction<ArborX::BVH<Serial>>/100000/0/manual_time_median                                    +0.0349         +0.0347         15997         16555         16065         16622
BM_construction<ArborX::BVH<Serial>>/1000000/0/manual_time_median                                   +0.0383         +0.0385        171758        178344        172232        178857
BM_construction<ArborX::BVH<Serial>>/10000/1/manual_time_median                                     +0.0460         +0.0446          1751          1832          1820          1901
BM_construction<ArborX::BVH<Serial>>/100000/1/manual_time_median                                    +0.0368         +0.0366         15573         16146         15641         16214
BM_construction<ArborX::BVH<Serial>>/1000000/1/manual_time_median                                   +0.0372         +0.0372        167815        174052        168289        174555
BM_knn_search<ArborX::BVH<Serial>>/10000/10000/10/1/0/2/manual_time_median                          +0.0111         +0.0112         50315         50875         50309         50871
BM_knn_search<ArborX::BVH<Serial>>/100000/100000/10/1/0/2/manual_time_median                        +0.0087         +0.0087        525791        530374        525715        530310
BM_knn_search<ArborX::BVH<Serial>>/1000000/1000000/10/1/0/2/manual_time_median                      +0.0088         +0.0088       5574543       5623830       5574036       5623357
BM_knn_search<ArborX::BVH<Serial>>/10000/10000/10/1/1/3/manual_time_median                          -0.0012         -0.0012         49114         49056         49110         49052
BM_knn_search<ArborX::BVH<Serial>>/100000/100000/10/1/1/3/manual_time_median                        -0.0002         -0.0002        651755        651596        651673        651514
BM_knn_search<ArborX::BVH<Serial>>/1000000/1000000/10/1/1/3/manual_time_median                      +0.0039         +0.0039       8932174       8966633       8931351       8965816
BM_knn_search<ArborX::BVH<Serial>>/10000/10000/10/0/0/2/manual_time_median                          +0.0077         +0.0077         51331         51728         51327         51724
BM_knn_search<ArborX::BVH<Serial>>/100000/100000/10/0/0/2/manual_time_median                        +0.0104         +0.0104        550553        556285        550480        556211
BM_knn_search<ArborX::BVH<Serial>>/1000000/1000000/10/0/0/2/manual_time_median                      +0.0175         +0.0175       6197176       6305795       6195210       6303831
BM_knn_search<ArborX::BVH<Serial>>/10000/10000/10/0/1/3/manual_time_median                          +0.0104         +0.0104         55837         56421         55833         56416
BM_knn_search<ArborX::BVH<Serial>>/100000/100000/10/0/1/3/manual_time_median                        +0.0185         +0.0185        777391        791738        777279        791632
BM_knn_search<ArborX::BVH<Serial>>/1000000/1000000/10/0/1/3/manual_time_median                      +0.0281         +0.0281      11093118      11404504      11089497      11400987
BM_radius_search<ArborX::BVH<Serial>>/10000/10000/10/1/0/0/2/manual_time_median                     +0.0228         +0.0228         64031         65492         64026         65486
BM_radius_search<ArborX::BVH<Serial>>/100000/100000/10/1/0/0/2/manual_time_median                   +0.0223         +0.0224        668705        683641        668621        683567
BM_radius_search<ArborX::BVH<Serial>>/1000000/1000000/10/1/0/0/2/manual_time_median                 +0.0215         +0.0215       7072016       7224165       7071436       7223654
BM_radius_search<ArborX::BVH<Serial>>/10000/10000/10/1/0/1/3/manual_time_median                     +0.0115         +0.0115         16316         16504         16316         16504
BM_radius_search<ArborX::BVH<Serial>>/100000/100000/10/1/0/1/3/manual_time_median                   +0.0062         +0.0062        119518        120255        119506        120242
BM_radius_search<ArborX::BVH<Serial>>/1000000/1000000/10/1/0/1/3/manual_time_median                 -0.0002         -0.0001       1156343       1156122       1156123       1155952
BM_radius_search<ArborX::BVH<Serial>>/10000/10000/10/1/10/0/2/manual_time_median                    +0.0061         +0.0064         64758         65155         64753         65168
BM_radius_search<ArborX::BVH<Serial>>/100000/100000/10/1/10/0/2/manual_time_median                  +0.0061         +0.0061        676914        681028        676818        680939
BM_radius_search<ArborX::BVH<Serial>>/1000000/1000000/10/1/10/0/2/manual_time_median                +0.0060         +0.0060       7209449       7252651       7208741       7251963
BM_radius_search<ArborX::BVH<Serial>>/10000/10000/10/1/10/1/3/manual_time_median                    +0.0005         +0.0005         16631         16638         16631         16638
BM_radius_search<ArborX::BVH<Serial>>/100000/100000/10/1/10/1/3/manual_time_median                  -0.0008         -0.0008        121729        121627        121715        121613
BM_radius_search<ArborX::BVH<Serial>>/1000000/1000000/10/1/10/1/3/manual_time_median                -0.0007         -0.0009       1178480       1177683       1178349       1177305
BM_radius_search<ArborX::BVH<Serial>>/10000/10000/10/0/0/0/2/manual_time_median                     +0.0057         +0.0057         68177         68566         68171         68560
BM_radius_search<ArborX::BVH<Serial>>/100000/100000/10/0/0/0/2/manual_time_median                   +0.0111         +0.0111        737100        745297        737019        745209
BM_radius_search<ArborX::BVH<Serial>>/1000000/1000000/10/0/0/0/2/manual_time_median                 +0.0161         +0.0161       8434763       8570804       8431660       8567769
BM_radius_search<ArborX::BVH<Serial>>/10000/10000/10/0/0/1/3/manual_time_median                     +0.0009         +0.0009         20563         20580         20562         20580
BM_radius_search<ArborX::BVH<Serial>>/100000/100000/10/0/0/1/3/manual_time_median                   +0.0018         +0.0018        142653        142911        142638        142897
BM_radius_search<ArborX::BVH<Serial>>/1000000/1000000/10/0/0/1/3/manual_time_median                 -0.0051         -0.0051       1162599       1156674       1162421       1156508
BM_radius_search<ArborX::BVH<Serial>>/10000/10000/10/0/10/0/2/manual_time_median                    -0.0139         -0.0139         69839         68871         69833         68864
BM_radius_search<ArborX::BVH<Serial>>/100000/100000/10/0/10/0/2/manual_time_median                  -0.0127         -0.0127        756122        746507        756024        746405
BM_radius_search<ArborX::BVH<Serial>>/1000000/1000000/10/0/10/0/2/manual_time_median                -0.0058         -0.0058       8649990       8599625       8646901       8596675
BM_radius_search<ArborX::BVH<Serial>>/10000/10000/10/0/10/1/3/manual_time_median                    -0.0119         -0.0117         21187         20935         21183         20935
BM_radius_search<ArborX::BVH<Serial>>/100000/100000/10/0/10/1/3/manual_time_median                  -0.0123         -0.0123        147046        145241        147040        145225
BM_radius_search<ArborX::BVH<Serial>>/1000000/1000000/10/0/10/1/3/manual_time_median                -0.0148         -0.0148       1199854       1182050       1199648       1181846

OpenMP

BM_construction<ArborX::BVH<OpenMP>>/10000/0/manual_time_median                                     +0.0194         +0.0185           339           346           350           356
BM_construction<ArborX::BVH<OpenMP>>/100000/0/manual_time_median                                    +0.0172         +0.0168          1550          1577          1565          1591
BM_construction<ArborX::BVH<OpenMP>>/1000000/0/manual_time_median                                   +0.0021         +0.0037         14817         14849         15332         15389
BM_construction<ArborX::BVH<OpenMP>>/10000/1/manual_time_median                                     +0.0394         +0.0384           349           362           359           373
BM_construction<ArborX::BVH<OpenMP>>/100000/1/manual_time_median                                    +0.0183         +0.0179          1724          1756          1739          1771
BM_construction<ArborX::BVH<OpenMP>>/1000000/1/manual_time_median                                   +0.0032         +0.0044         17549         17604         18060         18140
BM_knn_search<ArborX::BVH<OpenMP>>/10000/10000/10/1/0/2/manual_time_median                          +0.0432         +0.0432          1777          1853          1779          1856
BM_knn_search<ArborX::BVH<OpenMP>>/100000/100000/10/1/0/2/manual_time_median                        +0.0229         +0.0221         15158         15505         14826         15153
BM_knn_search<ArborX::BVH<OpenMP>>/1000000/1000000/10/1/0/2/manual_time_median                      +0.0152         +0.0188        155936        158313        154005        156904
BM_knn_search<ArborX::BVH<OpenMP>>/10000/10000/10/1/1/3/manual_time_median                          +0.0562         +0.0563          2126          2245          2127          2247
BM_knn_search<ArborX::BVH<OpenMP>>/100000/100000/10/1/1/3/manual_time_median                        +0.0444         +0.0392         25982         27135         24287         25240
BM_knn_search<ArborX::BVH<OpenMP>>/1000000/1000000/10/1/1/3/manual_time_median                      +0.0495         +0.0385        387122        406273        329536        342216
BM_knn_search<ArborX::BVH<OpenMP>>/10000/10000/10/0/0/2/manual_time_median                          +0.0208         +0.0208          1565          1598          1568          1600
BM_knn_search<ArborX::BVH<OpenMP>>/100000/100000/10/0/0/2/manual_time_median                        -0.0007         -0.0030         15023         15012         14351         14308
BM_knn_search<ArborX::BVH<OpenMP>>/1000000/1000000/10/0/0/2/manual_time_median                      +0.0194         +0.0179        197978        201812        188718        192103
BM_knn_search<ArborX::BVH<OpenMP>>/10000/10000/10/0/1/3/manual_time_median                          +0.0165         +0.0165          1690          1718          1692          1720
BM_knn_search<ArborX::BVH<OpenMP>>/100000/100000/10/0/1/3/manual_time_median                        -0.0066         -0.0063         21075         20936         21068         20936
BM_knn_search<ArborX::BVH<OpenMP>>/1000000/1000000/10/0/1/3/manual_time_median                      +0.0070         +0.0074        324859        327124        324133        326528
BM_radius_search<ArborX::BVH<OpenMP>>/10000/10000/10/1/0/0/2/manual_time_median                     +0.0279         +0.0277          2050          2107          2052          2109
BM_radius_search<ArborX::BVH<OpenMP>>/100000/100000/10/1/0/0/2/manual_time_median                   +0.0209         +0.0231         18480         18867         17795         18207
BM_radius_search<ArborX::BVH<OpenMP>>/1000000/1000000/10/1/0/0/2/manual_time_median                 +0.0204         +0.0213        185043        188810        182311        186195
BM_radius_search<ArborX::BVH<OpenMP>>/10000/10000/10/1/0/1/3/manual_time_median                     +0.0271         +0.0239          1256          1290           929           951
BM_radius_search<ArborX::BVH<OpenMP>>/100000/100000/10/1/0/1/3/manual_time_median                   +0.0090         +0.0039          7831          7901          3213          3225
BM_radius_search<ArborX::BVH<OpenMP>>/1000000/1000000/10/1/0/1/3/manual_time_median                 +0.0002         -0.0283         69619         69630         30167         29313
BM_radius_search<ArborX::BVH<OpenMP>>/10000/10000/10/1/10/0/2/manual_time_median                    +0.0271         +0.0271          2153          2212          2156          2214
BM_radius_search<ArborX::BVH<OpenMP>>/100000/100000/10/1/10/0/2/manual_time_median                  +0.0180         +0.0218         19098         19441         18335         18734
BM_radius_search<ArborX::BVH<OpenMP>>/1000000/1000000/10/1/10/0/2/manual_time_median                +0.0163         +0.0188        193495        196645        190566        194158
BM_radius_search<ArborX::BVH<OpenMP>>/10000/10000/10/1/10/1/3/manual_time_median                    +0.0512         +0.0253          1305          1372           969           994
BM_radius_search<ArborX::BVH<OpenMP>>/100000/100000/10/1/10/1/3/manual_time_median                  +0.0129         +0.0065          8089          8194          3277          3298
BM_radius_search<ArborX::BVH<OpenMP>>/1000000/1000000/10/1/10/1/3/manual_time_median                +0.0062         -0.0065         72043         72487         31271         31067
BM_radius_search<ArborX::BVH<OpenMP>>/10000/10000/10/0/0/0/2/manual_time_median                     +0.0430         +0.0429          1854          1934          1856          1936
BM_radius_search<ArborX::BVH<OpenMP>>/100000/100000/10/0/0/0/2/manual_time_median                   +0.0437         +0.0437         18650         19464         18649         19465
BM_radius_search<ArborX::BVH<OpenMP>>/1000000/1000000/10/0/0/0/2/manual_time_median                 +0.0553         +0.0513        256519        270698        255934        269063
BM_radius_search<ArborX::BVH<OpenMP>>/10000/10000/10/0/0/1/3/manual_time_median                     +0.0426         +0.0425           696           726           699           728
BM_radius_search<ArborX::BVH<OpenMP>>/100000/100000/10/0/0/1/3/manual_time_median                   +0.0313         +0.0313          3866          3988          3869          3990
BM_radius_search<ArborX::BVH<OpenMP>>/1000000/1000000/10/0/0/1/3/manual_time_median                 +0.0210         +0.0197         29383         30000         28930         29499
BM_radius_search<ArborX::BVH<OpenMP>>/10000/10000/10/0/10/0/2/manual_time_median                    +0.0127         +0.0128          1939          1964          1941          1966
BM_radius_search<ArborX::BVH<OpenMP>>/100000/100000/10/0/10/0/2/manual_time_median                  +0.0021         +0.0021         19298         19338         19298         19338
BM_radius_search<ArborX::BVH<OpenMP>>/1000000/1000000/10/0/10/0/2/manual_time_median                +0.0278         +0.0251        263885        271209        263059        269652
BM_radius_search<ArborX::BVH<OpenMP>>/10000/10000/10/0/10/1/3/manual_time_median                    +0.0273         +0.0272           743           763           746           766
BM_radius_search<ArborX::BVH<OpenMP>>/100000/100000/10/0/10/1/3/manual_time_median                  +0.0045         +0.0045          4027          4045          4029          4047
BM_radius_search<ArborX::BVH<OpenMP>>/1000000/1000000/10/0/10/1/3/manual_time_median                -0.0032         -0.0004         30959         30861         30372         30358

Cuda

BM_construction<ArborX::BVH<Cuda>>/10000/0/manual_time_median                                       +0.0143         +0.0139          1416          1437          1431          1451
BM_construction<ArborX::BVH<Cuda>>/100000/0/manual_time_median                                      +0.0324         +0.0261          2157          2227          2554          2620
BM_construction<ArborX::BVH<Cuda>>/1000000/0/manual_time_median                                     +0.1462         +0.1347          6646          7617          7141          8104
BM_construction<ArborX::BVH<Cuda>>/10000/1/manual_time_median                                       +0.0177         +0.0172          1416          1441          1431          1456
BM_construction<ArborX::BVH<Cuda>>/100000/1/manual_time_median                                      +0.0372         +0.0301          2159          2239          2556          2633
BM_construction<ArborX::BVH<Cuda>>/1000000/1/manual_time_median                                     +0.1459         +0.1381          6670          7643          7142          8128
BM_knn_search<ArborX::BVH<Cuda>>/10000/10000/10/1/0/2/manual_time_median                            -0.0501         -0.0471          1531          1454          1621          1545
BM_knn_search<ArborX::BVH<Cuda>>/100000/100000/10/1/0/2/manual_time_median                          -0.0132         -0.0125          6374          6290          6843          6757
BM_knn_search<ArborX::BVH<Cuda>>/1000000/1000000/10/1/0/2/manual_time_median                        +0.0086         +0.0082         26498         26727         27387         27612
BM_knn_search<ArborX::BVH<Cuda>>/10000/10000/10/1/1/3/manual_time_median                            +0.0102         +0.0098          1587          1603          1677          1694
BM_knn_search<ArborX::BVH<Cuda>>/100000/100000/10/1/1/3/manual_time_median                          +0.0019         +0.0025          7217          7231          7679          7698
BM_knn_search<ArborX::BVH<Cuda>>/1000000/1000000/10/1/1/3/manual_time_median                        +0.0167         +0.0166         42816         43530         43700         44428
BM_knn_search<ArborX::BVH<Cuda>>/10000/10000/10/0/0/2/manual_time_median                            +0.0088         +0.0091          1156          1166          1246          1257
BM_knn_search<ArborX::BVH<Cuda>>/100000/100000/10/0/0/2/manual_time_median                          +0.0030         +0.0026          4623          4637          5089          5103
BM_knn_search<ArborX::BVH<Cuda>>/1000000/1000000/10/0/0/2/manual_time_median                        +0.0442         +0.0442         54308         56706         54984         57414
BM_knn_search<ArborX::BVH<Cuda>>/10000/10000/10/0/1/3/manual_time_median                            +0.0227         +0.0216          1203          1230          1293          1321
BM_knn_search<ArborX::BVH<Cuda>>/100000/100000/10/0/1/3/manual_time_median                          +0.0202         +0.0182          6675          6809          7144          7274
BM_knn_search<ArborX::BVH<Cuda>>/1000000/1000000/10/0/1/3/manual_time_median                        +0.0197         +0.0203         97391         99308         98170        100164
BM_radius_search<ArborX::BVH<Cuda>>/10000/10000/10/1/0/0/2/manual_time_median                       +0.0184         +0.0173           954           971          1044          1062
BM_radius_search<ArborX::BVH<Cuda>>/100000/100000/10/1/0/0/2/manual_time_median                     -0.3014         -0.2727          4556          3183          5032          3660
BM_radius_search<ArborX::BVH<Cuda>>/1000000/1000000/10/1/0/0/2/manual_time_median                   -0.4740         -0.4611         32161         16917         33074         17825
BM_radius_search<ArborX::BVH<Cuda>>/10000/10000/10/1/0/1/3/manual_time_median                       +0.0136         +0.0124           890           903           981           994
BM_radius_search<ArborX::BVH<Cuda>>/100000/100000/10/1/0/1/3/manual_time_median                     -0.0531         -0.0466          2603          2465          3073          2930
BM_radius_search<ArborX::BVH<Cuda>>/1000000/1000000/10/1/0/1/3/manual_time_median                   -0.0824         -0.0733          8439          7744          9278          8597
BM_radius_search<ArborX::BVH<Cuda>>/10000/10000/10/1/10/0/2/manual_time_median                      +0.0021         +0.0023          1089          1091          1180          1182
BM_radius_search<ArborX::BVH<Cuda>>/100000/100000/10/1/10/0/2/manual_time_median                    -0.3188         -0.2964          6195          4220          6669          4693
BM_radius_search<ArborX::BVH<Cuda>>/1000000/1000000/10/1/10/0/2/manual_time_median                  -0.4323         -0.4228         42821         24309         43683         25212
BM_radius_search<ArborX::BVH<Cuda>>/10000/10000/10/1/10/1/3/manual_time_median                      +0.0003         +0.0009          1013          1013          1104          1105
BM_radius_search<ArborX::BVH<Cuda>>/100000/100000/10/1/10/1/3/manual_time_median                    -0.0182         -0.0110          3765          3697          4208          4162
BM_radius_search<ArborX::BVH<Cuda>>/1000000/1000000/10/1/10/1/3/manual_time_median                  -0.0639         -0.0532         10571          9896         11348         10745
BM_radius_search<ArborX::BVH<Cuda>>/10000/10000/10/0/0/0/2/manual_time_median                       +0.0891         +0.0811           631           687           720           779
BM_radius_search<ArborX::BVH<Cuda>>/100000/100000/10/0/0/0/2/manual_time_median                     -0.3878         -0.3413          3380          2069          3856          2540
BM_radius_search<ArborX::BVH<Cuda>>/1000000/1000000/10/0/0/0/2/manual_time_median                   -0.2226         -0.2218         82654         64258         83555         65019
BM_radius_search<ArborX::BVH<Cuda>>/10000/10000/10/0/0/1/3/manual_time_median                       +0.0453         +0.0414           566           592           657           684
BM_radius_search<ArborX::BVH<Cuda>>/100000/100000/10/0/0/1/3/manual_time_median                     -0.0152         -0.0125          1341          1321          1811          1789
BM_radius_search<ArborX::BVH<Cuda>>/1000000/1000000/10/0/0/1/3/manual_time_median                   -0.0798         -0.0663          5401          4970          6233          5819
BM_radius_search<ArborX::BVH<Cuda>>/10000/10000/10/0/10/0/2/manual_time_median                      +0.0002         +0.0010           745           746           835           836
BM_radius_search<ArborX::BVH<Cuda>>/100000/100000/10/0/10/0/2/manual_time_median                    -0.3652         -0.3333          4962          3150          5435          3624
BM_radius_search<ArborX::BVH<Cuda>>/1000000/1000000/10/0/10/0/2/manual_time_median                  -0.2186         -0.2199         93634         73162         94533         73750
BM_radius_search<ArborX::BVH<Cuda>>/10000/10000/10/0/10/1/3/manual_time_median                      +0.0266         +0.0256           691           710           782           802
BM_radius_search<ArborX::BVH<Cuda>>/100000/100000/10/0/10/1/3/manual_time_median                    -0.0186         -0.0159          2434          2389          2901          2855
BM_radius_search<ArborX::BVH<Cuda>>/1000000/1000000/10/0/10/1/3/manual_time_median                  -0.0951         -0.0807          7689          6958          8487          7803

@aprokop
Contributor Author

aprokop commented Aug 22, 2020

I confirmed that the resulting hierarchy does not change with the patch. The only thing that changes slightly (independent of the patch) is that some child nodes may be swapped with their neighbor. This happens when two primitives have the same Morton code. As BinSort is not guaranteed to be stable, the ordering of such nodes may vary slightly.
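
To illustrate why equal Morton codes leave the order of two leaves undetermined, here is a minimal sketch. It is not the ArborX implementation: `expandBits` and `morton2D` are hypothetical helpers, and the code is 2D on a 1024 x 1024 grid for brevity. Two distinct points that fall into the same grid cell get the same code, so an unstable sort may emit them in either order.

```cpp
#include <cassert>
#include <cstdint>

// Insert a zero bit between each of the lower 10 bits of x.
inline std::uint32_t expandBits(std::uint32_t x)
{
  assert(x < 1024u); // input must fit in 10 bits
  x = (x | (x << 8)) & 0x00ff00ffu;
  x = (x | (x << 4)) & 0x0f0f0f0fu;
  x = (x | (x << 2)) & 0x33333333u;
  x = (x | (x << 1)) & 0x55555555u;
  return x;
}

// 2D Morton code of a point in [0, 1)^2 (hypothetical helper).
// Coordinates are quantized to a 1024 x 1024 grid, then the bits are interleaved.
inline std::uint32_t morton2D(float x, float y)
{
  auto quantize = [](float v) {
    float const scaled = v * 1024.f;
    float const clamped =
        scaled < 0.f ? 0.f : (scaled > 1023.f ? 1023.f : scaled);
    return static_cast<std::uint32_t>(clamped);
  };
  return (expandBits(quantize(x)) << 1) | expandBits(quantize(y));
}
```

With this sketch, the points (0.5, 0.5) and (0.5003, 0.5004) land in the same cell and therefore compare equal under the sort key.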

Another fun thing I discovered: during one of the interactive sessions on Summit, running one executable produced different OpenMP timings depending on whether only OpenMP was run, or OpenMP together with other tests (i.e., after Serial). It also seems to depend on the node, as I was not always able to reproduce it.

@aprokop aprokop added the help wanted Extra attention is needed label Sep 5, 2020
@aprokop
Contributor Author

aprokop commented Sep 5, 2020

I'm stuck and not sure how to move forward here. Would appreciate any help.

@aprokop
Contributor Author

aprokop commented Oct 1, 2020

I rebased on master. I also implemented a slightly different approach. Now, we allow two different node types: one with two children, and one with a left child and a rope (this became possible due to #403). We choose the node type depending on whether we are running on Cuda or on the host, as previous results indicated that ropes seem to slow things down on the host.

Each node now carries a tag, which we use to construct and traverse hierarchies differently. The differences are minor (except for the introduction of rope-based traversal), so most of the code is shared.
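
As a rough illustration of the two node layouts and of a rope-based traversal, here is a self-contained sketch. All names (`TwoChildrenTag`, `LeftChildAndRopeTag`, `NodeFor`, `ropeTraversal`, and so on) are hypothetical and do not reflect the actual ArborX code. Bounding boxes are reduced to 1D intervals, and the `bool` template parameter stands in for the "is this a Cuda execution space" decision.

```cpp
#include <cassert>
#include <type_traits>
#include <vector>

// All names below are hypothetical and only illustrate the idea.
struct TwoChildrenTag {};
struct LeftChildAndRopeTag {};

constexpr int SENTINEL = -1; // "no node": marks leaves and the end of traversal

struct Interval // 1D stand-in for a bounding box
{
  float lo;
  float hi;
};

struct NodeWithTwoChildren
{
  using Tag = TwoChildrenTag;
  Interval box;
  int left_child;
  int right_child;
};

struct NodeWithLeftChildAndRope
{
  using Tag = LeftChildAndRopeTag;
  Interval box;
  int left_child; // SENTINEL for a leaf
  int rope;       // node to visit next once this subtree is done or skipped
};

// Pick the node layout per backend; the bool stands in for "is a Cuda space".
template <bool OnDevice>
using NodeFor = std::conditional_t<OnDevice, NodeWithLeftChildAndRope,
                                   NodeWithTwoChildren>;

inline bool overlaps(Interval a, Interval b)
{
  return a.lo <= b.hi && b.lo <= a.hi;
}

// Stackless traversal: returns indices of the leaf nodes overlapping the query.
inline std::vector<int>
ropeTraversal(std::vector<NodeWithLeftChildAndRope> const &nodes,
              Interval query)
{
  assert(!nodes.empty()); // node 0 is assumed to be the root
  std::vector<int> hits;
  int current = 0;
  while (current != SENTINEL)
  {
    auto const &node = nodes[current];
    bool const hit = overlaps(node.box, query);
    bool const is_leaf = (node.left_child == SENTINEL);
    if (hit && is_leaf)
      hits.push_back(current);
    // Descend only into overlapping internal nodes; otherwise follow the rope.
    current = (hit && !is_leaf) ? node.left_child : node.rope;
  }
  return hits;
}

// Four leaves [0,1], [2,3], [4,5], [6,7] under two internal nodes and a root.
inline std::vector<NodeWithLeftChildAndRope> makeExampleTree()
{
  return {
      {{0.f, 7.f}, 1, SENTINEL},        // 0: root
      {{0.f, 3.f}, 3, 2},               // 1: internal, rope -> right sibling
      {{4.f, 7.f}, 5, SENTINEL},        // 2: internal, rope inherited from root
      {{0.f, 1.f}, SENTINEL, 4},        // 3: leaf, rope -> right sibling
      {{2.f, 3.f}, SENTINEL, 2},        // 4: leaf, rope inherited from node 1
      {{4.f, 5.f}, SENTINEL, 6},        // 5: leaf, rope -> right sibling
      {{6.f, 7.f}, SENTINEL, SENTINEL}, // 6: leaf, end of traversal
  };
}
```

In this sketch a left child's rope points to its right sibling, and a right child inherits its parent's rope, so the loop needs no stack: a query of [2.5, 4.5] on the example tree visits nodes 0, 1, 3, 4, 2, 5, 6 and reports leaves 4 and 5.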

This is a bit unfinished at the moment, as there are at least a couple of things that require fixing:

  • The tree construction test needs to be updated to test both types of nodes
  • The tree visualization test needs an update

However, everything in tests/ passes.

There are a couple of reasons why I pushed this a bit early. First, I wanted to run and post results from Summit. Second, I wanted @dalg24 to take a look and see if there is anything he completely disagrees with.

My plan is to post the Summit results once that's completed, and work on finishing the tests I mentioned earlier. The core of this PR (assuming satisfactory Summit runs) is complete, from my perspective.

@aprokop aprokop added enhancement New feature or request and removed help wanted Extra attention is needed labels Oct 1, 2020
@aprokop
Contributor Author

aprokop commented Oct 1, 2020

Summit results: master (1d7ed60) vs stackless branch (6499e4b)

Serial (complete parity)

BM_construction<ArborX::BVH<Serial>>/10000/0/manual_time_median                                     -0.0130         -0.0133          1848          1823          1916          1891
BM_construction<ArborX::BVH<Serial>>/100000/0/manual_time_median                                    -0.0039         -0.0038         16297         16233         16364         16302
BM_construction<ArborX::BVH<Serial>>/1000000/0/manual_time_median                                   +0.0031         +0.0030        175846        176387        176330        176863
BM_construction<ArborX::BVH<Serial>>/10000/1/manual_time_median                                     -0.0076         -0.0072          1809          1795          1877          1864
BM_construction<ArborX::BVH<Serial>>/100000/1/manual_time_median                                    -0.0044         -0.0043         16008         15937         16075         16006
BM_construction<ArborX::BVH<Serial>>/1000000/1/manual_time_median                                   +0.0016         +0.0017        173111        173392        173584        173885
BM_knn_search<ArborX::BVH<Serial>>/10000/10000/10/1/0/2/manual_time_median                          -0.0040         -0.0040         53416         53203         53414         53200
BM_knn_search<ArborX::BVH<Serial>>/100000/100000/10/1/0/2/manual_time_median                        -0.0038         -0.0038        560307        558158        560243        558097
BM_knn_search<ArborX::BVH<Serial>>/1000000/1000000/10/1/0/2/manual_time_median                      -0.0040         -0.0040       5964819       5940813       5964431       5940380
BM_knn_search<ArborX::BVH<Serial>>/10000/10000/10/1/1/3/manual_time_median                          -0.0013         -0.0013         53226         53159         53223         53155
BM_knn_search<ArborX::BVH<Serial>>/100000/100000/10/1/1/3/manual_time_median                        -0.0001         -0.0001        718232        718154        718142        718076
BM_knn_search<ArborX::BVH<Serial>>/1000000/1000000/10/1/1/3/manual_time_median                      -0.0002         -0.0003       9969751       9967464       9968904       9966121
BM_knn_search<ArborX::BVH<Serial>>/10000/10000/10/0/0/2/manual_time_median                          -0.0019         -0.0019         55037         54931         55034         54927
BM_knn_search<ArborX::BVH<Serial>>/100000/100000/10/0/0/2/manual_time_median                        -0.0023         -0.0023        591145        589812        591070        589735
BM_knn_search<ArborX::BVH<Serial>>/1000000/1000000/10/0/0/2/manual_time_median                      +0.0000         +0.0000       6623067       6623157       6620874       6621054
BM_knn_search<ArborX::BVH<Serial>>/10000/10000/10/0/1/3/manual_time_median                          +0.0019         +0.0019         59515         59626         59511         59622
BM_knn_search<ArborX::BVH<Serial>>/100000/100000/10/0/1/3/manual_time_median                        +0.0031         +0.0031        833515        836098        833414        835983
BM_knn_search<ArborX::BVH<Serial>>/1000000/1000000/10/0/1/3/manual_time_median                      +0.0055         +0.0055      11938438      12004423      11934709      12000539
BM_radius_search<ArborX::BVH<Serial>>/10000/10000/10/1/0/0/2/manual_time_median                     -0.0034         -0.0034         64066         63847         64061         63842
BM_radius_search<ArborX::BVH<Serial>>/100000/100000/10/1/0/0/2/manual_time_median                   -0.0034         -0.0034        668962        666721        668887        666638
BM_radius_search<ArborX::BVH<Serial>>/1000000/1000000/10/1/0/0/2/manual_time_median                 -0.0034         -0.0035       7082775       7058379       7082295       7057825
BM_radius_search<ArborX::BVH<Serial>>/10000/10000/10/1/0/1/3/manual_time_median                     -0.0032         -0.0032         16319         16266         16319         16267
BM_radius_search<ArborX::BVH<Serial>>/100000/100000/10/1/0/1/3/manual_time_median                   -0.0000         -0.0000        119736        119731        119725        119720
BM_radius_search<ArborX::BVH<Serial>>/1000000/1000000/10/1/0/1/3/manual_time_median                 -0.0012         -0.0012       1158766       1157382       1158558       1157158
BM_radius_search<ArborX::BVH<Serial>>/10000/10000/10/1/10/0/2/manual_time_median                    +0.0094         +0.0094         64685         65291         64681         65286
BM_radius_search<ArborX::BVH<Serial>>/100000/100000/10/1/10/0/2/manual_time_median                  +0.0087         +0.0088        675039        680905        674945        680868
BM_radius_search<ArborX::BVH<Serial>>/1000000/1000000/10/1/10/0/2/manual_time_median                +0.0080         +0.0080       7211008       7268794       7210386       7268104
BM_radius_search<ArborX::BVH<Serial>>/10000/10000/10/1/10/1/3/manual_time_median                    +0.0031         +0.0031         16663         16715         16663         16715
BM_radius_search<ArborX::BVH<Serial>>/100000/100000/10/1/10/1/3/manual_time_median                  +0.0032         +0.0032        121968        122355        121954        122344
BM_radius_search<ArborX::BVH<Serial>>/1000000/1000000/10/1/10/1/3/manual_time_median                -0.0010         -0.0009       1181894       1180736       1181632       1180521
BM_radius_search<ArborX::BVH<Serial>>/10000/10000/10/0/0/0/2/manual_time_median                     -0.0037         -0.0037         68498         68247         68493         68242
BM_radius_search<ArborX::BVH<Serial>>/100000/100000/10/0/0/0/2/manual_time_median                   -0.0040         -0.0040        741850        738891        741766        738817
BM_radius_search<ArborX::BVH<Serial>>/1000000/1000000/10/0/0/0/2/manual_time_median                 -0.0043         -0.0043       8495313       8458736       8492539       8455710
BM_radius_search<ArborX::BVH<Serial>>/10000/10000/10/0/0/1/3/manual_time_median                     +0.0007         +0.0007         20578         20592         20577         20591
BM_radius_search<ArborX::BVH<Serial>>/100000/100000/10/0/0/1/3/manual_time_median                   +0.0001         +0.0001        142679        142687        142664        142674
BM_radius_search<ArborX::BVH<Serial>>/1000000/1000000/10/0/0/1/3/manual_time_median                 -0.0031         -0.0031       1159677       1156078       1159512       1155904
BM_radius_search<ArborX::BVH<Serial>>/10000/10000/10/0/10/0/2/manual_time_median                    -0.0139         -0.0139         70510         69533         70504         69527
BM_radius_search<ArborX::BVH<Serial>>/100000/100000/10/0/10/0/2/manual_time_median                  -0.0043         -0.0043        754097        750862        754006        750765
BM_radius_search<ArborX::BVH<Serial>>/1000000/1000000/10/0/10/0/2/manual_time_median                -0.0046         -0.0047       8645128       8605023       8642311       8601964
BM_radius_search<ArborX::BVH<Serial>>/10000/10000/10/0/10/1/3/manual_time_median                    -0.0054         -0.0054         21207         21093         21207         21092
BM_radius_search<ArborX::BVH<Serial>>/100000/100000/10/0/10/1/3/manual_time_median                  +0.0044         +0.0044        145956        146593        145941        146579
BM_radius_search<ArborX::BVH<Serial>>/1000000/1000000/10/0/10/1/3/manual_time_median                +0.0048         +0.0049       1188023       1193758       1187653       1193432

OpenMP (complete parity)

BM_construction<ArborX::BVH<OpenMP>>/10000/0/manual_time_median                                     +0.0027         +0.0026           316           316           326           326
BM_construction<ArborX::BVH<OpenMP>>/100000/0/manual_time_median                                    -0.0066         -0.0065          1550          1540          1565          1555
BM_construction<ArborX::BVH<OpenMP>>/1000000/0/manual_time_median                                   +0.0070         +0.0049         14801         14905         15350         15424
BM_construction<ArborX::BVH<OpenMP>>/10000/1/manual_time_median                                     +0.0016         +0.0014           328           328           338           338
BM_construction<ArborX::BVH<OpenMP>>/100000/1/manual_time_median                                    -0.0051         -0.0052          1720          1711          1734          1725
BM_construction<ArborX::BVH<OpenMP>>/1000000/1/manual_time_median                                   +0.0015         +0.0012         17541         17567         18087         18108
BM_knn_search<ArborX::BVH<OpenMP>>/10000/10000/10/1/0/2/manual_time_median                          -0.0018         -0.0018          1795          1792          1797          1794
BM_knn_search<ArborX::BVH<OpenMP>>/100000/100000/10/1/0/2/manual_time_median                        +0.0013         +0.0009         15383         15403         15037         15050
BM_knn_search<ArborX::BVH<OpenMP>>/1000000/1000000/10/1/0/2/manual_time_median                      -0.0003         +0.0007        157769        157729        156278        156389
BM_knn_search<ArborX::BVH<OpenMP>>/10000/10000/10/1/1/3/manual_time_median                          +0.0015         +0.0015          2174          2178          2176          2180
BM_knn_search<ArborX::BVH<OpenMP>>/100000/100000/10/1/1/3/manual_time_median                        -0.0017         -0.0032         26810         26764         25005         24926
BM_knn_search<ArborX::BVH<OpenMP>>/1000000/1000000/10/1/1/3/manual_time_median                      -0.0026         -0.0035        400609        399580        339430        338251
BM_knn_search<ArborX::BVH<OpenMP>>/10000/10000/10/0/0/2/manual_time_median                          +0.0006         +0.0005          1560          1561          1563          1564
BM_knn_search<ArborX::BVH<OpenMP>>/100000/100000/10/0/0/2/manual_time_median                        +0.0007         +0.0020         14963         14973         14225         14254
BM_knn_search<ArborX::BVH<OpenMP>>/1000000/1000000/10/0/0/2/manual_time_median                      +0.0015         +0.0023        198606        198904        188868        189293
BM_knn_search<ArborX::BVH<OpenMP>>/10000/10000/10/0/1/3/manual_time_median                          -0.0015         -0.0017          1673          1670          1675          1672
BM_knn_search<ArborX::BVH<OpenMP>>/100000/100000/10/0/1/3/manual_time_median                        +0.0012         +0.0016         20777         20803         20769         20803
BM_knn_search<ArborX::BVH<OpenMP>>/1000000/1000000/10/0/1/3/manual_time_median                      +0.0006         +0.0029        322095        322293        320790        321731
BM_radius_search<ArborX::BVH<OpenMP>>/10000/10000/10/1/0/0/2/manual_time_median                     +0.0052         +0.0052          2062          2073          2064          2075
BM_radius_search<ArborX::BVH<OpenMP>>/100000/100000/10/1/0/0/2/manual_time_median                   +0.0058         +0.0048         18577         18684         17880         17966
BM_radius_search<ArborX::BVH<OpenMP>>/1000000/1000000/10/1/0/0/2/manual_time_median                 +0.0059         +0.0046        185907        186995        183414        184260
BM_radius_search<ArborX::BVH<OpenMP>>/10000/10000/10/1/0/1/3/manual_time_median                     +0.0039         +0.0029          1258          1263           925           927
BM_radius_search<ArborX::BVH<OpenMP>>/100000/100000/10/1/0/1/3/manual_time_median                   -0.0031         -0.0063          7861          7837          3243          3223
BM_radius_search<ArborX::BVH<OpenMP>>/1000000/1000000/10/1/0/1/3/manual_time_median                 -0.0061         -0.0046         69662         69236         30008         29870
BM_radius_search<ArborX::BVH<OpenMP>>/10000/10000/10/1/10/0/2/manual_time_median                    +0.0055         +0.0055          2153          2165          2155          2167
BM_radius_search<ArborX::BVH<OpenMP>>/100000/100000/10/1/10/0/2/manual_time_median                  +0.0060         +0.0064         19159         19275         18412         18530
BM_radius_search<ArborX::BVH<OpenMP>>/1000000/1000000/10/1/10/0/2/manual_time_median                +0.0066         +0.0066        194013        195294        191210        192467
BM_radius_search<ArborX::BVH<OpenMP>>/10000/10000/10/1/10/1/3/manual_time_median                    +0.0049         +0.0004          1325          1332           957           957
BM_radius_search<ArborX::BVH<OpenMP>>/100000/100000/10/1/10/1/3/manual_time_median                  +0.0003         -0.0068          8105          8108          3315          3292
BM_radius_search<ArborX::BVH<OpenMP>>/1000000/1000000/10/1/10/1/3/manual_time_median                -0.0037         -0.0109         72120         71851         31482         31137
BM_radius_search<ArborX::BVH<OpenMP>>/10000/10000/10/0/0/0/2/manual_time_median                     -0.0064         -0.0064          1849          1837          1851          1840
BM_radius_search<ArborX::BVH<OpenMP>>/100000/100000/10/0/0/0/2/manual_time_median                   -0.0085         -0.0085         18716         18557         18717         18557
BM_radius_search<ArborX::BVH<OpenMP>>/1000000/1000000/10/0/0/0/2/manual_time_median                 -0.0063         -0.0094        260535        258881        259406        256967
BM_radius_search<ArborX::BVH<OpenMP>>/10000/10000/10/0/0/1/3/manual_time_median                     +0.0010         +0.0009           688           689           690           691
BM_radius_search<ArborX::BVH<OpenMP>>/100000/100000/10/0/0/1/3/manual_time_median                   -0.0032         -0.0031          3865          3853          3867          3855
BM_radius_search<ArborX::BVH<OpenMP>>/1000000/1000000/10/0/0/1/3/manual_time_median                 +0.0013         -0.0045         29399         29439         28912         28781
BM_radius_search<ArborX::BVH<OpenMP>>/10000/10000/10/0/10/0/2/manual_time_median                    -0.0122         -0.0122          1908          1885          1911          1887
BM_radius_search<ArborX::BVH<OpenMP>>/100000/100000/10/0/10/0/2/manual_time_median                  -0.0080         -0.0080         18894         18742         18894         18743
BM_radius_search<ArborX::BVH<OpenMP>>/1000000/1000000/10/0/10/0/2/manual_time_median                -0.0036         -0.0073        262910        261968        261930        260015
BM_radius_search<ArborX::BVH<OpenMP>>/10000/10000/10/0/10/1/3/manual_time_median                    -0.0052         -0.0050           729           725           731           727
BM_radius_search<ArborX::BVH<OpenMP>>/100000/100000/10/0/10/1/3/manual_time_median                  -0.0052         -0.0052          3966          3945          3968          3947
BM_radius_search<ArborX::BVH<OpenMP>>/1000000/1000000/10/0/10/1/3/manual_time_median                +0.0007         -0.0064         30462         30483         29948         29755

Cuda (complete parity in construction and knn, yay! in spatial)

BM_construction<ArborX::BVH<Cuda>>/10000/0/manual_time_median                                       -0.0050         +0.0013          1418          1411          1424          1426
BM_construction<ArborX::BVH<Cuda>>/100000/0/manual_time_median                                      -0.0094         +0.0040          2180          2159          2543          2553
BM_construction<ArborX::BVH<Cuda>>/1000000/0/manual_time_median                                     +0.0078         +0.0053          6887          6941          7376          7415
BM_construction<ArborX::BVH<Cuda>>/10000/1/manual_time_median                                       -0.0045         +0.0012          1417          1411          1424          1425
BM_construction<ArborX::BVH<Cuda>>/100000/1/manual_time_median                                      -0.0117         +0.0037          2186          2160          2543          2553
BM_construction<ArborX::BVH<Cuda>>/1000000/1/manual_time_median                                     +0.0131         +0.0052          6890          6980          7380          7418
BM_knn_search<ArborX::BVH<Cuda>>/10000/10000/10/1/0/2/manual_time_median                            +0.0077         +0.0074          1542          1553          1633          1645
BM_knn_search<ArborX::BVH<Cuda>>/100000/100000/10/1/0/2/manual_time_median                          +0.0078         +0.0069          6394          6444          6865          6912
BM_knn_search<ArborX::BVH<Cuda>>/1000000/1000000/10/1/0/2/manual_time_median                        -0.0028         +0.0004         26769         26693         27577         27587
BM_knn_search<ArborX::BVH<Cuda>>/10000/10000/10/1/1/3/manual_time_median                            +0.0138         +0.0136          1586          1608          1677          1700
BM_knn_search<ArborX::BVH<Cuda>>/100000/100000/10/1/1/3/manual_time_median                          +0.0045         +0.0109          7246          7278          7661          7745
BM_knn_search<ArborX::BVH<Cuda>>/1000000/1000000/10/1/1/3/manual_time_median                        +0.0026         +0.0011         43330         43443         44233         44283
BM_knn_search<ArborX::BVH<Cuda>>/10000/10000/10/0/0/2/manual_time_median                            +0.0065         +0.0061          1162          1169          1253          1261
BM_knn_search<ArborX::BVH<Cuda>>/100000/100000/10/0/0/2/manual_time_median                          +0.0080         +0.0066          4615          4652          5084          5117
BM_knn_search<ArborX::BVH<Cuda>>/1000000/1000000/10/0/0/2/manual_time_median                        +0.0420         +0.0425         54046         56316         54856         57188
BM_knn_search<ArborX::BVH<Cuda>>/10000/10000/10/0/1/3/manual_time_median                            +0.0135         +0.0134          1206          1222          1296          1313
BM_knn_search<ArborX::BVH<Cuda>>/100000/100000/10/0/1/3/manual_time_median                          -0.0002         +0.0018          6753          6752          7208          7221
BM_knn_search<ArborX::BVH<Cuda>>/1000000/1000000/10/0/1/3/manual_time_median                        +0.0336         +0.0338         96722         99975         97546        100841
BM_radius_search<ArborX::BVH<Cuda>>/10000/10000/10/1/0/0/2/manual_time_median                       +0.0218         +0.0208           952           972          1042          1064
BM_radius_search<ArborX::BVH<Cuda>>/100000/100000/10/1/0/0/2/manual_time_median                     -0.2966         -0.2609          4520          3179          4950          3659
BM_radius_search<ArborX::BVH<Cuda>>/1000000/1000000/10/1/0/0/2/manual_time_median                   -0.4681         -0.4551         31831         16932         32731         17836
BM_radius_search<ArborX::BVH<Cuda>>/10000/10000/10/1/0/1/3/manual_time_median                       +0.0170         +0.0165           890           905           981           997
BM_radius_search<ArborX::BVH<Cuda>>/100000/100000/10/1/0/1/3/manual_time_median                     -0.0377         -0.0422          2603          2504          3070          2941
BM_radius_search<ArborX::BVH<Cuda>>/1000000/1000000/10/1/0/1/3/manual_time_median                   -0.0845         -0.0715          8505          7786          9280          8617
BM_radius_search<ArborX::BVH<Cuda>>/10000/10000/10/1/10/0/2/manual_time_median                      +0.0038         +0.0042          1087          1091          1178          1183
BM_radius_search<ArborX::BVH<Cuda>>/100000/100000/10/1/10/0/2/manual_time_median                    -0.3142         -0.2912          6149          4217          6621          4693
BM_radius_search<ArborX::BVH<Cuda>>/1000000/1000000/10/1/10/0/2/manual_time_median                  -0.4269         -0.4174         42539         24377         43383         25276
BM_radius_search<ArborX::BVH<Cuda>>/10000/10000/10/1/10/1/3/manual_time_median                      +0.0053         +0.0057          1013          1018          1104          1110
BM_radius_search<ArborX::BVH<Cuda>>/100000/100000/10/1/10/1/3/manual_time_median                    -0.0101         -0.0089          3735          3697          4201          4164
BM_radius_search<ArborX::BVH<Cuda>>/1000000/1000000/10/1/10/1/3/manual_time_median                  -0.0583         -0.0499         10542          9928         11345         10779
BM_radius_search<ArborX::BVH<Cuda>>/10000/10000/10/0/0/0/2/manual_time_median                       +0.0780         +0.0700           634           684           724           774
BM_radius_search<ArborX::BVH<Cuda>>/100000/100000/10/0/0/0/2/manual_time_median                     -0.3820         -0.3340          3343          2066          3818          2543
BM_radius_search<ArborX::BVH<Cuda>>/1000000/1000000/10/0/0/0/2/manual_time_median                   -0.2245         -0.2214         82263         63798         83105         64704
BM_radius_search<ArborX::BVH<Cuda>>/10000/10000/10/0/0/1/3/manual_time_median                       +0.0349         +0.0307           571           591           663           683
BM_radius_search<ArborX::BVH<Cuda>>/100000/100000/10/0/0/1/3/manual_time_median                     -0.0280         -0.0118          1358          1320          1811          1790
BM_radius_search<ArborX::BVH<Cuda>>/1000000/1000000/10/0/0/1/3/manual_time_median                   -0.0738         -0.0619          5387          4989          6209          5825
BM_radius_search<ArborX::BVH<Cuda>>/10000/10000/10/0/10/0/2/manual_time_median                      +0.0119         +0.0114           743           752           833           843
BM_radius_search<ArborX::BVH<Cuda>>/100000/100000/10/0/10/0/2/manual_time_median                    -0.3593         -0.3275          4927          3156          5399          3631
BM_radius_search<ArborX::BVH<Cuda>>/1000000/1000000/10/0/10/0/2/manual_time_median                  -0.2229         -0.2202         93305         72510         94142         73414
BM_radius_search<ArborX::BVH<Cuda>>/10000/10000/10/0/10/1/3/manual_time_median                      +0.0111         +0.0102           700           708           792           800
BM_radius_search<ArborX::BVH<Cuda>>/100000/100000/10/0/10/1/3/manual_time_median                    -0.0155         -0.0122          2430          2392          2896          2861
BM_radius_search<ArborX::BVH<Cuda>>/1000000/1000000/10/0/10/1/3/manual_time_median                  -0.0903         -0.0780          7643          6953          8464          7803

The only slowdown on Cuda is +7% for the smallest 10K filled mesh for non-sorted queries. Not an issue.

Reran the HACC problem and confirmed the gain.

@dalg24 dalg24 left a comment

Looks good

src/details/ArborX_DetailsNode.hpp (resolved)
src/details/ArborX_DetailsTreeVisualization.hpp (outdated, resolved)
src/details/ArborX_DetailsTreeVisualization.hpp (outdated, resolved)
test/tstDetailsTreeConstruction.cpp (outdated, resolved)
test/tstDetailsTreeConstruction.cpp (resolved)

aprokop commented Oct 1, 2020

@dalg24 Three things:

  • Completely redid the tree construction test, so that it now tests both node types.
  • Split setRightChildOrRope in the functor into two functions, setRightChild and setRope.
    The combined version was breaking the abstraction layer for leaf nodes by explicitly setting the permutation index to the right child.
  • I only do visualization using the two-children nodes, as it runs on the host anyway.

There are now quite a few std::is_same<typename Node::Tag, BlahTag>{} checks. I don't really want to do anything about them, but if you feel strongly and/or have a clean way to trim them, I won't argue.
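
For illustration only, here is a minimal sketch of how two separate setters and a single tag trait could be arranged with tag dispatch. Every name in it (NodeWithTwoChildrenTag, NodeWithLeftChildAndRopeTag, HasRope, the node layouts, the detail namespace) is hypothetical and is not taken from the ArborX sources; it only shows one possible pattern, not the code merged here.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical tags and node layouts, named for illustration only.
struct NodeWithTwoChildrenTag {};
struct NodeWithLeftChildAndRopeTag {};

struct NodeWithTwoChildren
{
  using Tag = NodeWithTwoChildrenTag;
  int left_child = -1;
  int right_child = -1;
};

struct NodeWithLeftChildAndRope
{
  using Tag = NodeWithLeftChildAndRopeTag;
  int left_child = -1;
  int rope = -1;
};

// A single alias keeps the std::is_same check on Node::Tag in one place.
template <typename Node>
using HasRope = std::is_same<typename Node::Tag, NodeWithLeftChildAndRopeTag>;

namespace detail
{
// Each setter is a no-op for the node type that does not store that field.
template <typename Node>
void setRightChild(Node &node, int child, NodeWithTwoChildrenTag)
{
  node.right_child = child;
}
template <typename Node>
void setRightChild(Node &, int, NodeWithLeftChildAndRopeTag)
{
}
template <typename Node>
void setRope(Node &, int, NodeWithTwoChildrenTag)
{
}
template <typename Node>
void setRope(Node &node, int rope, NodeWithLeftChildAndRopeTag)
{
  node.rope = rope;
}
} // namespace detail

// Public setters dispatch on the node's tag.
template <typename Node>
void setRightChild(Node &node, int child)
{
  assert(child >= 0); // a child index must refer to an existing node
  detail::setRightChild(node, child, typename Node::Tag{});
}

template <typename Node>
void setRope(Node &node, int rope)
{
  // A negative rope is allowed here and stands for "no next node".
  detail::setRope(node, rope, typename Node::Tag{});
}
```

With this arrangement, a construction functor could call both setRightChild and setRope for every node and let the tag decide which call has an effect, so neither setter needs to know about the other node type's fields.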

The previous version broke the abstraction layer for leaf nodes by
explicitly setting permutation index to the right child inside tree
construction.
@aprokop aprokop changed the title Implement stackless tree traversal using ropes Implement stackless tree traversal using escape index (ropes) Oct 1, 2020
@aprokop aprokop mentioned this pull request Oct 1, 2020

@dalg24 dalg24 left a comment

diff --git a/test/tstDetailsTreeConstruction.cpp b/test/tstDetailsTreeConstruction.cpp
index 670e02d..cfc6be0 100644
--- a/test/tstDetailsTreeConstruction.cpp
+++ b/test/tstDetailsTreeConstruction.cpp
@@ -162,10 +162,10 @@ BOOST_AUTO_TEST_CASE_TEMPLATE(example_tree_construction, DeviceType,
       << "L3"
       << "I4"
       << "L4"
+      << "L6"
       << "I5"
       << "I6"
       << "L5"
-      << "L6"
       << "L7";
   std::cout << "ref=" << ref.str() << "\n";

@@ -217,4 +217,6 @@ BOOST_AUTO_TEST_CASE_TEMPLATE(example_tree_construction, DeviceType,
   std::cout << "sol=" << sol.str() << "\n";

   BOOST_TEST(sol.str().compare(ref.str()) == 0);
+  BOOST_TEST(sol.str() == ref.str());
+  BOOST_TEST(sol.str() == ref.str(), tt::per_element());
 }
<ArborX>/test/tstDetailsTreeConstruction.cpp:219: error: in "example_tree_construction<Kokkos__Device<Kokkos__OpenMP_ Kokkos__HostSpace>>": check sol.str().compare(ref.str()) == 0 has failed [-3 != 0]
<ArborX>/test/tstDetailsTreeConstruction.cpp:220: error: in "example_tree_construction<Kokkos__Device<Kokkos__OpenMP_ Kokkos__HostSpace>>": check sol.str() == ref.str() has failed [I0I3I1L0L1I2L2L3I4L4I5I6L5L6L7 != I0I3I1L0L1I2L2L3I4L4L6I5I6L5L7]
<ArborX>/test/tstDetailsTreeConstruction.cpp:221: error: in "example_tree_construction<Kokkos__Device<Kokkos__OpenMP_ Kokkos__HostSpace>>": check sol.str() == ref.str() has failed
  - mismatch at position 20: ['I' == 'L'] is false
  - mismatch at position 21: ['5' == '6'] is false
  - mismatch at position 23: ['6' == '5'] is false
  - mismatch at position 24: ['L' == 'I'] is false
  - mismatch at position 25: ['5' == '6'] is false
  - mismatch at position 27: ['6' == '5'] is false

*** 3 failures are detected in the test module "Master Test Suite"
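
The three messages above differ in how much they reveal: the compare() check reports only an integer, the == check prints both strings, and the per-element check lists the positions that differ. As a rough stand-alone sketch of that last idea, using only the standard library (mismatchPositions is a hypothetical helper, not part of Boost.Test or ArborX):

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Returns every index at which two equally long strings differ, similar in
// spirit to the positions listed by a per-element comparison.
std::vector<std::size_t> mismatchPositions(std::string const &a,
                                           std::string const &b)
{
  assert(a.size() == b.size()); // this sketch only handles equal lengths
  std::vector<std::size_t> positions;
  for (std::size_t i = 0; i < a.size(); ++i)
    if (a[i] != b[i])
      positions.push_back(i);
  return positions;
}
```

Applied to the two strings printed in the log, this helper yields positions 20, 21, 23, 24, 25 and 27, the same ones listed by the per-element check.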

std::ostringstream sol;
traverseRecursive(root, sol);
std::cout << "sol=" << sol.str() << "\n";
BOOST_TEST(sol.str().compare(ref.str()) == 0);

Suggested change
BOOST_TEST(sol.str().compare(ref.str()) == 0);
BOOST_TEST(sol.str() == ref.str());

traverse(leaf_nodes, internal_nodes, root, sol);
std::cout << "sol(node_with_left_child_and_rope) = " << sol.str() << "\n";

BOOST_TEST(sol.str().compare(ref.str()) == 0);

Suggested change
BOOST_TEST(sol.str().compare(ref.str()) == 0);
BOOST_TEST(sol.str() == ref.str());
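
Both test snippets above build a traversal string and compare it against the same reference, once with the recursive traversal and once with the left-child-and-rope nodes. The sketch below shows the general idea of a stackless pre-order traversal with ropes on a tiny tree. RopeNode, traverseWithRopes and makeExampleTree are hypothetical names, and the flat single-array layout is an assumption made for brevity; the test in this PR passes separate leaf_nodes and internal_nodes.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical node layout: a left child plus a rope (escape index).
// A negative left_child marks a leaf; a negative rope ends the traversal.
struct RopeNode
{
  std::string label;
  int left_child;
  int rope;
};

// Stackless pre-order traversal: descend into the left child when there is
// one, otherwise follow the rope to the next unvisited subtree.
std::string traverseWithRopes(std::vector<RopeNode> const &nodes, int root)
{
  assert(root >= 0 && root < static_cast<int>(nodes.size()));
  std::string out;
  int current = root;
  while (current >= 0)
  {
    RopeNode const &node = nodes[current];
    out += node.label;
    current = (node.left_child >= 0) ? node.left_child : node.rope;
  }
  return out;
}

// Example tree: I0 has children I1 and L2; I1 has children L0 and L1.
std::vector<RopeNode> makeExampleTree()
{
  return {
      {"I0", 1, -1},  // root: nothing left to visit after its subtree
      {"I1", 2, 4},   // left child of I0: rope points to its sibling L2
      {"L0", -1, 3},  // left child of I1: rope points to its sibling L1
      {"L1", -1, 4},  // right child of I1: inherits the rope of I1
      {"L2", -1, -1}, // right child of I0: inherits the rope of I0
  };
}
```

Calling traverseWithRopes(makeExampleTree(), 0) visits the nodes in the order I0, I1, L0, L1, L2, which is what a recursive pre-order traversal would give. In a search, skipping a subtree whose bounding volume fails the predicate amounts to following the rope instead of descending into the left child.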

@dalg24 dalg24 left a comment

You can merge once these are fixed.

aprokop commented Oct 3, 2020

Merging with the standard HIP failure.

@aprokop aprokop merged commit 5222818 into arborx:master Oct 3, 2020
@aprokop aprokop deleted the stackless_rebased branch October 3, 2020 14:21
Labels: enhancement, performance