Update to Thrust 1.16 #10489

Merged
merged 9 commits into from
Apr 1, 2022

@bdice bdice commented Mar 23, 2022

This PR updates Thrust from 1.15 to 1.16 (changelog). The update is needed to fix compilation with GCC 11: Thrust 1.15 emits warnings under GCC 11 that are treated as errors (such as this one from Thrust's copy of cub: NVIDIA/cub#418).

Notably, Thrust reduced the number of internal header inclusions:

#1572 Removed several unnecessary header includes. Downstream projects may need to update their includes if they were relying on this behavior.

This change exposed many missing includes in libcudf, so I added #include <thrust/...> for every Thrust feature used in each file (with help from a Python script).
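To illustrate the kind of check described above, here is a minimal sketch of a script that looks for `thrust::` symbols in a source file and lists the direct includes that are missing. It is not the script used for this PR: the function name `missing_thrust_includes` and the small symbol-to-header table are hypothetical, hand-written for illustration, and a real tool would need a much more complete mapping.

```python
import re
from typing import List

# Hypothetical, hand-written subset of a symbol -> header map (assumption:
# a real script would need a far more complete table than this).
THRUST_HEADERS = {
    "copy_if": "thrust/copy.h",
    "transform": "thrust/transform.h",
    "reduce": "thrust/reduce.h",
    "sort": "thrust/sort.h",
    "for_each": "thrust/for_each.h",
    "pair": "thrust/pair.h",
    "make_counting_iterator": "thrust/iterator/counting_iterator.h",
    "device_vector": "thrust/device_vector.h",
}

SYMBOL_RE = re.compile(r"\bthrust::(\w+)")
INCLUDE_RE = re.compile(r"#include\s*<(thrust/[^>]+)>")


def missing_thrust_includes(source: str) -> List[str]:
    """Return sorted #include lines for known Thrust symbols that are used
    in `source` but whose header is not included directly."""
    present = set(INCLUDE_RE.findall(source))
    needed = {
        THRUST_HEADERS[name]
        for name in SYMBOL_RE.findall(source)
        if name in THRUST_HEADERS
    }
    return [f"#include <{header}>" for header in sorted(needed - present)]


# A made-up source snippet: transform.h is included, the other two are not.
example = """
#include <thrust/transform.h>

void f() {
  thrust::transform(a, b, c, op);
  thrust::copy_if(a, b, c, pred);
  auto it = thrust::make_counting_iterator(0);
}
"""

missing = missing_thrust_includes(example)
print("\n".join(missing))
```

A regex scan like this only catches fully qualified `thrust::name` uses; symbols reached through `using` declarations or namespace aliases would need a real parser.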

The raw benchmarks I recorded are included below.
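When reading the table below, the Time and CPU columns appear to be relative changes, (new - old) / old, in the style of Google Benchmark's comparison output, so positive values mean the new build is slower. That interpretation is an assumption based on the column layout, and the helper name `relative_change` is hypothetical. A small sketch that checks it against one row:

```python
def relative_change(old: float, new: float) -> float:
    """Fractional change from old to new; positive means the new run is slower."""
    return (new - old) / old


# Values from the CsvWrite/string_file_output/23/0 row: Time Old 1421, Time New 979.
change = relative_change(1421, 979)
print(f"{change:+.4f}")
```

The result agrees with the table's -0.3111 to about three decimal places. The remaining difference presumably comes from the Old/New columns being rounded for display, which would also explain why rows printed as 0 and 0 still show a nonzero change and cannot be recomputed from the table alone.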

Benchmarks:
Benchmark                                                                                                                         Time             CPU      Time Old      Time New       CPU Old       CPU New
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
CopyIfElse/int16_no_nulls/4096/manual_time                                                                                     +0.0581         +0.0307             0             0             0             0
CopyIfElse/uint32_no_nulls/4096/manual_time                                                                                    +0.1308         +0.0463             0             0             0             0
CopyIfElse/uint32_no_nulls/32768/manual_time                                                                                   +0.1043         +0.0485             0             0             0             0
CopyIfElse/float64_no_nulls/4096/manual_time                                                                                   +0.0894         +0.0422             0             0             0             0
StringDateTime/from_days/32768/manual_time                                                                                     +0.0529         +0.0491            93            98           112           118
StringDateTime/to_days/1024/manual_time                                                                                        +0.0596         +0.0493            35            37            54            57
StringDateTime/to_days/32768/manual_time                                                                                       +0.0547         +0.0460            37            39            55            58
StringToDurations/to_durations_ms/1024/manual_time                                                                             +0.0516         +0.0426            30            31            49            51
StringToDurations/to_durations_ms/32768/manual_time                                                                            +0.0542         +0.0506            32            34            52            55
StringToDurations/to_durations_us/32768/manual_time                                                                            +0.0520         +0.0440            32            34            52            55
StringsFromFixedPoint/strings_from_decimal64/16384/manual_time                                                                 +0.0530         +0.0508            94            99           113           119
StringsToNumeric/strings_to_float32/1024/manual_time                                                                           +0.0521         +0.0451            31            32            50            52
StringsToNumeric/strings_to_float64/16384/manual_time                                                                          +0.0517         +0.0437            32            34            51            53
StringsToNumeric/strings_to_float64/65536/manual_time                                                                          +0.0505         +0.0496            35            36            53            56
StringsToNumeric/strings_to_uint8/4096/manual_time                                                                             +0.0559         +0.0466            24            25            43            45
StringsToNumeric/strings_to_uint8/65536/manual_time                                                                            +0.0563         +0.0458            26            27            44            46
StringCopy/gather/4096/32/manual_time                                                                                          +0.0652         +0.0574             0             0             0             0
StringCopy/gather/4096/128/manual_time                                                                                         +0.0706         +0.0615             0             0             0             0
StringCopy/gather/4096/512/manual_time                                                                                         +0.0547         +0.0476             0             0             0             0
StringCopy/gather/32768/32/manual_time                                                                                         +0.0538         +0.0492             0             0             0             0
StringCopy/gather/32768/128/manual_time                                                                                        +0.0540         +0.0477             0             0             0             0
StringCopy/scatter/4096/32/manual_time                                                                                         +0.0571         +0.0526             0             0             0             0
StringCopy/scatter/32768/32/manual_time                                                                                        +0.0541         +0.0509             0             0             0             0
StringFindScalar/find_multi/4096/32/manual_time                                                                                +0.0525         +0.0460             0             0             0             0
StringFindScalar/find_multi/32768/32/manual_time                                                                               +0.0538         +0.0489             0             0             0             0
StringFindScalar/contains/4096/32/manual_time                                                                                  +0.0502         +0.0471             0             0             0             0
StringFindScalar/starts_with/4096/32/manual_time                                                                               +0.0528         +0.0476             0             0             0             0
StringFindScalar/starts_with/4096/2048/manual_time                                                                             +0.0575         +0.0475             0             0             0             0
StringFindScalar/starts_with/4096/8192/manual_time                                                                             +0.0606         +0.0515             0             0             0             0
StringFindScalar/starts_with/32768/32/manual_time                                                                              +0.0690         +0.0592             0             0             0             0
StringFindScalar/starts_with/32768/128/manual_time                                                                             +0.0589         +0.0499             0             0             0             0
StringFindScalar/starts_with/32768/512/manual_time                                                                             +0.0567         +0.0521             0             0             0             0
StringFindScalar/starts_with/32768/2048/manual_time                                                                            +0.0517         +0.0501             0             0             0             0
StringFindScalar/starts_with/262144/32/manual_time                                                                             +0.0555         +0.0525             0             0             0             0
StringFindScalar/ends_with/4096/2048/manual_time                                                                               +0.0526         +0.0446             0             0             0             0
StringFindScalar/ends_with/4096/8192/manual_time                                                                               +0.0568         +0.0485             0             0             0             0
StringFindScalar/ends_with/32768/32/manual_time                                                                                +0.0654         +0.0567             0             0             0             0
StringFindScalar/ends_with/32768/512/manual_time                                                                               +0.0546         +0.0502             0             0             0             0
StringFindScalar/ends_with/262144/32/manual_time                                                                               +0.0523         +0.0517             0             0             0             0
RepeatStrings/scalar_times/256/16/manual_time                                                                                  +0.0555         +0.0501             0             0             0             0
RepeatStrings/scalar_times/1024/16/manual_time                                                                                 +0.0562         +0.0519             0             0             0             0
RepeatStrings/column_times/256/16/manual_time                                                                                  +0.0645         +0.0579             0             0             0             0
RepeatStrings/column_times/256/64/manual_time                                                                                  +0.0506         +0.0472             0             0             0             0
RepeatStrings/column_times/1024/16/manual_time                                                                                 +0.0643         +0.0578             0             0             0             0
RepeatStrings/column_times/4096/16/manual_time                                                                                 +0.0537         +0.0502             0             0             0             0
RepeatStrings/column_times/16384/16/manual_time                                                                                +0.0565         +0.0514             0             0             0             0
RepeatStrings/compute_output_strings_sizes/256/16/manual_time                                                                  +0.0626         +0.0490             0             0             0             0
RepeatStrings/compute_output_strings_sizes/256/64/manual_time                                                                  +0.0539         +0.0434             0             0             0             0
RepeatStrings/compute_output_strings_sizes/256/256/manual_time                                                                 +0.0694         +0.0525             0             0             0             0
RepeatStrings/compute_output_strings_sizes/1024/16/manual_time                                                                 +0.0526         +0.0422             0             0             0             0
RepeatStrings/compute_output_strings_sizes/1024/64/manual_time                                                                 +0.0630         +0.0493             0             0             0             0
RepeatStrings/compute_output_strings_sizes/1024/256/manual_time                                                                +0.0533         +0.0460             0             0             0             0
RepeatStrings/precomputed_sizes/256/16/manual_time                                                                             +0.0674         +0.0602             0             0             0             0
RepeatStrings/precomputed_sizes/1024/16/manual_time                                                                            +0.0544         +0.0488             0             0             0             0
RepeatStrings/precomputed_sizes/4096/16/manual_time                                                                            +0.0531         +0.0492             0             0             0             0
RepeatStrings/precomputed_sizes/16384/16/manual_time                                                                           +0.0522         +0.0470             0             0             0             0
StringReplace/slice/4096/32/manual_time                                                                                        +0.0559         +0.0534             0             0             0             0
StringReplace/slice/32768/32/manual_time                                                                                       +0.0509         +0.0472             0             0             0             0
StringSplit/split_ws/4096/32/manual_time                                                                                       +0.0507         +0.0493             0             0             0             0
StringSubstring/multi_position/4096/32/manual_time                                                                             +0.0560         +0.0515             0             0             0             0
StringSubstring/delimiter/4096/32/manual_time                                                                                  +0.0532         +0.0504             0             0             0             0
StringSubstring/delimiter/32768/128/manual_time                                                                                +0.0531         +0.0535             0             0             0             0
StringSubstring/multi_delimiter/4096/32/manual_time                                                                            +0.0544         +0.0522             0             0             0             0
CsvWrite/string_file_output/23/0/manual_time                                                                                   -0.3111         -0.0110          1421           979           842           833
Shift/shift_ten_percent_nullable_out/32768/manual_time                                                                         -0.0786         -0.0650             0             0             0             0
Shift/shift_full_nullable_out/1073741824/manual_time                                                                           +0.0511         +0.0510            11            11            11            11
TypeDispatcher/fp64_bandwidth_host/8/1024/1/manual_time                                                                        +0.1281         +0.0638         18970         21400         37938         40357
TypeDispatcher/fp64_bandwidth_host/4/2048/1/manual_time                                                                        +0.0928         +0.0345         11556         12629         30463         31513
TypeDispatcher/fp64_bandwidth_host/2/4096/1/manual_time                                                                        +0.0768         +0.0270          7421          7991         26234         26943
TypeDispatcher/fp64_bandwidth_host/1/8192/1/manual_time                                                                        +0.0729         +0.0209          5029          5396         24111         24615
TypeDispatcher/fp64_bandwidth_device/8/1024/1/manual_time                                                                      +0.1176         +0.0632         16518         18460         35703         37961
TypeDispatcher/fp64_bandwidth_device/4/2048/1/manual_time                                                                      +0.0787         +0.0457         14424         15559         33546         35079
TypeDispatcher/fp64_bandwidth_device/2/4096/1/manual_time                                                                      +0.0500         +0.0327         13594         14274         32740         33811
TypeDispatcher/fp64_bandwidth_no/2/1024/1/manual_time                                                                          +0.0590         +0.0131          5065          5364         23966         24281
TypeDispatcher/fp64_bandwidth_no/8/1024/1/manual_time                                                                          +0.2305         +0.0699          6912          8506         25803         27607
TypeDispatcher/fp64_bandwidth_no/1/2048/1/manual_time                                                                          +0.0574         +0.0120          4854          5133         23782         24067
TypeDispatcher/fp64_bandwidth_no/4/2048/1/manual_time                                                                          +0.1602         +0.0461          6010          6973         24906         26054
TypeDispatcher/fp64_bandwidth_no/2/4096/1/manual_time                                                                          +0.0949         +0.0330          5583          6113         24469         25275
TypeDispatcher/fp64_bandwidth_no/4/4096/1/manual_time                                                                          +0.0623         +0.0175          6991          7427         26088         26545
TypeDispatcher/fp64_bandwidth_no/8/4096/1/manual_time                                                                          +0.0521         +0.0173          8953          9419         28000         28484
TypeDispatcher/fp64_bandwidth_no/1/8192/1/manual_time                                                                          +0.0607         +0.0257          5225          5542         24107         24727
TypeDispatcher/fp64_bandwidth_no/2/8192/1/manual_time                                                                          +0.0588         +0.0115          5964          6315         25052         25341
TypeDispatcher/fp64_bandwidth_no/1/16384/1/manual_time                                                                         +0.0541         +0.0119          5443          5737         24515         24806
TextTokenize/ngrams/2097152/128/manual_time                                                                                    +0.0624         +0.0623            10            10            10            10
MultibyteSplitBenchmark/multibyte_split_simple/1/1/1/32768/manual_time                                                         +0.4019         +0.4024             8            12             8            12
MultibyteSplitBenchmark/multibyte_split_simple/2/1/1/32768/manual_time                                                         +0.4099         +0.4073             8            12             8            12
MultibyteSplitBenchmark/multibyte_split_simple/1/4/1/32768/manual_time                                                         +0.3999         +0.3961             8            12             8            12
MultibyteSplitBenchmark/multibyte_split_simple/2/4/1/32768/manual_time                                                         +0.3969         +0.3980             8            12             8            12
MultibyteSplitBenchmark/multibyte_split_simple/1/7/1/32768/manual_time                                                         +0.4107         +0.3971             8            12             8            12
MultibyteSplitBenchmark/multibyte_split_simple/2/7/1/32768/manual_time                                                         +0.3833         +0.3948             8            12             8            12
MultibyteSplitBenchmark/multibyte_split_simple/1/1/25/32768/manual_time                                                        +0.3807         +0.3772             9            12             9            12
MultibyteSplitBenchmark/multibyte_split_simple/2/1/25/32768/manual_time                                                        +0.3834         +0.3702             9            12             9            12
MultibyteSplitBenchmark/multibyte_split_simple/1/4/25/32768/manual_time                                                        +0.3646         +0.3661             9            12             9            12
MultibyteSplitBenchmark/multibyte_split_simple/2/4/25/32768/manual_time                                                        +0.3722         +0.3743             9            12             9            12
MultibyteSplitBenchmark/multibyte_split_simple/1/7/25/32768/manual_time                                                        +0.3575         +0.3664             9            12             9            12
MultibyteSplitBenchmark/multibyte_split_simple/2/7/25/32768/manual_time                                                        +0.3761         +0.3744             9            12             9            12
MultibyteSplitBenchmark/multibyte_split_simple/1/4/1/1073741824/manual_time                                                    -0.1017         -0.1040          1681          1510          1681          1506
MultibyteSplitBenchmark/multibyte_split_simple/2/4/1/1073741824/manual_time                                                    -0.1817         -0.1817          4102          3357          4101          3356
MultibyteSplitBenchmark/multibyte_split_simple/0/7/25/1073741824/manual_time                                                   -0.0704         -0.0704           345           320           345           320
OVERALL_GEOMEAN                                                                                                                +0.0974         +0.0970             0             0             0             0
Groupby/BasicSumScan/100000000/manual_time                                                                                     +0.2947         +0.2947           135           175           135           175
CsvRead/decimal_file_input/35/0/manual_time                                                                                    +0.0508         +0.0511           151           159           151           159
ReductionScan/double_nulls/100000/manual_time                                                                                  +0.0721         +0.0609         22874         24524         40726         43206
OrcWrite/integral_file_output/30/0/32/1/0/manual_time                                                                          -0.1923         -0.0371           913           738           763           735
OrcWrite/integral_file_output/30/0/1/0/0/manual_time                                                                           +0.2668         -0.0297           754           955           722           701
OrcWrite/integral_file_output/30/1000/1/0/0/manual_time                                                                        -0.1090         -0.0510           986           878           725           688
OrcWrite/integral_file_output/30/0/32/0/0/manual_time                                                                          +0.0594         -0.0575           981          1039           738           696
OrcWrite/integral_buffer_output/30/1000/32/1/1/manual_time                                                                     +0.0882         +0.0885            85            92            85            92
OrcWrite/integral_buffer_output/30/1000/32/0/1/manual_time                                                                     -0.0966         -0.0955            98            89            98            89
OrcWrite/floats_file_output/31/0/1/1/0/manual_time                                                                             +0.0600         -0.0538           737           781           737           697
OrcWrite/floats_file_output/31/0/32/1/0/manual_time                                                                            +0.0670         +0.0021          1203          1284           715           717
OrcWrite/floats_file_output/31/0/1/0/0/manual_time                                                                             -0.2406         -0.0605           865           657           698           656
OrcWrite/floats_file_output/31/1000/1/0/0/manual_time                                                                          -0.2006         -0.0642          1122           897           706           660
OrcWrite/floats_file_output/31/0/32/0/0/manual_time                                                                            -0.1759         -0.0563          1131           932           708           668
OrcWrite/floats_file_output/31/1000/32/0/0/manual_time                                                                         -0.1600         -0.0640          1095           919           702           657
OrcWrite/decimal_file_output/35/1000/1/0/0/manual_time                                                                         +0.1622         -0.0865          1110          1290           588           537
OrcWrite/timestamps_file_output/33/0/1/0/0/manual_time                                                                         +0.1884         -0.0494           552           657           552           524
OrcWrite/timestamps_file_output/33/1000/1/0/0/manual_time                                                                      +0.1409         +0.0064           650           742           541           544
OrcWrite/list_file_output/24/0/1/0/0/manual_time                                                                               -0.0723         -0.0788           713           661           711           655
OrcWrite/list_file_output/24/1000/1/0/0/manual_time                                                                            +0.0935         -0.0468           696           761           689           657
Concatenate/BM_concatenate_nullable_false/4096/2/manual_time                                                                   +0.1055         +0.0672             0             0             0             0
Concatenate/BM_concatenate_nullable_false/512/8/manual_time                                                                    +0.0548         +0.0379             0             0             0             0
Concatenate/BM_concatenate_nullable_true/32768/8/manual_time                                                                   +0.0501         +0.0415             0             0             0             0
Concatenate/BM_concatenate_nullable_true/64/64/manual_time                                                                     +0.0570         +0.0400             0             0             0             0
Concatenate/BM_concatenate_nullable_true/512/64/manual_time                                                                    +0.0894         +0.0606             0             0             0             0
Concatenate/BM_concatenate_tables_nullable_false/4096/2/2/manual_time                                                          +0.1086         +0.0771             0             0             0             0
Concatenate/BM_concatenate_tables_nullable_false/512/8/2/manual_time                                                           +0.0920         +0.0828             0             0             0             0
Concatenate/BM_concatenate_tables_nullable_false/4096/8/2/manual_time                                                          +0.0549         +0.0502             0             0             0             0
Concatenate/BM_concatenate_tables_nullable_false/256/32/2/manual_time                                                          +0.1036         +0.1009             1             1             1             1
Concatenate/BM_concatenate_tables_nullable_false/512/32/2/manual_time                                                          +0.0827         +0.0813             1             1             1             1
Concatenate/BM_concatenate_tables_nullable_false/4096/32/2/manual_time                                                         +0.0788         +0.0768             1             1             1             1
Concatenate/BM_concatenate_tables_nullable_false/256/8/64/manual_time                                                          +0.0525         +0.0490             0             0             0             0
ParquetRead/integral_buffer_input/29/1000/1/0/1/manual_time                                                                    +0.0929         +0.0928            46            50            46            50
ParquetRead/timestamps_file_input/33/0/32/0/0/manual_time                                                                      -0.0896         -0.0897           127           116           128           116
OrcRead/integral_buffer_input/30/1000/1/0/1/manual_time                                                                        +0.1087         +0.1087            88            97            88            97
OrcRead/floats_file_input/31/0/1/1/0/manual_time                                                                               +0.1528         +0.1526           134           155           134           155
OrcRead/floats_buffer_input/31/1000/1/0/1/manual_time                                                                          +0.1349         +0.1350            75            85            75            85
OrcRead/decimal_buffer_input/35/0/1/0/1/manual_time                                                                            -0.1137         -0.1137           264           234           264           234
OrcRead/string_file_input/23/0/1/0/0/manual_time                                                                               -0.0750         -0.0750           162           150           162           150
OrcRead/string_file_input/23/0/32/0/0/manual_time                                                                              -0.0963         -0.0963           163           147           163           147
OrcRead/string_buffer_input/23/0/32/0/1/manual_time                                                                            -0.1586         -0.0139           114            96            97            96
OrcRead/list_file_input/24/1000/1/0/0/manual_time                                                                              +0.0515         +0.0517           176           185           176           185
OrcRead/list_file_input/24/0/32/0/0/manual_time                                                                                +0.0925         +0.0922           173           189           173           189
OrcRead/list_buffer_input/24/0/1/1/1/manual_time                                                                               -0.1288         -0.1291           139           121           139           121
BINARYOP<int32_t, TreeType::IMBALANCED_LEFT, true>/binaryop_int32_imbalanced_reuse/100000/2/manual_time                        +0.0533         +0.0381             0             0             0             0
COMPILED_BINARYOP/NULL_MAX_decimal32_decimal32_decimal32/100000/manual_time                                                    +0.0509         +0.0320            13            14            32            33
COMPILED_BINARYOP/NULL_MIN_timestamp_D_timestamp_s_timestamp_s/10000/manual_time                                               +0.0509         +0.0374            11            12            30            31
ParquetWrite/integral_file_output/29/0/1/1/0/manual_time                                                                       +0.3011         +0.0605           726           945           726           770
ParquetWrite/integral_file_output/29/1000/1/1/0/manual_time                                                                    +0.0812         +0.0804           311           336           310           335
ParquetWrite/integral_file_output/29/0/32/1/0/manual_time                                                                      +0.3497         +0.0714           948          1279           734           786
ParquetWrite/integral_file_output/29/1000/32/1/0/manual_time                                                                   +0.0559         +0.0558            62            65            62            65
ParquetWrite/integral_file_output/29/0/1/0/0/manual_time                                                                       +0.1829         +0.0679           702           830           700           748
ParquetWrite/integral_file_output/29/1000/1/0/0/manual_time                                                                    +0.0829         +0.0852           284           307           283           307
ParquetWrite/integral_file_output/29/0/32/0/0/manual_time                                                                      -0.3273         +0.0451          1063           715           683           714
ParquetWrite/integral_file_output/29/1000/32/0/0/manual_time                                                                   +0.0835         +0.0834            58            63            58            63
ParquetWrite/integral_buffer_output/29/0/1/1/1/manual_time                                                                     +0.0608         +0.0609           874           927           874           927
ParquetWrite/floats_file_output/31/0/1/1/0/manual_time                                                                         +0.1916         +0.0634           694           827           693           737
ParquetWrite/floats_file_output/31/1000/1/1/0/manual_time                                                                      +0.0560         +0.0553           217           229           217           229
ParquetWrite/floats_file_output/31/0/32/1/0/manual_time                                                                        +0.0517         +0.0546          1020          1073           721           760
ParquetWrite/floats_file_output/31/1000/32/1/0/manual_time                                                                     +0.1149         +0.0631            45            50            39            42
ParquetWrite/floats_file_output/31/0/1/0/0/manual_time                                                                         +0.1165         +0.0471           880           983           664           695
ParquetWrite/floats_file_output/31/1000/1/0/0/manual_time                                                                      +0.3996         +0.0038           237           331           219           219
ParquetWrite/floats_file_output/31/0/32/0/0/manual_time                                                                        +0.3109         +0.0673           666           873           666           710
ParquetWrite/floats_file_output/31/1000/32/0/0/manual_time                                                                     +0.0798         +0.0790            38            41            38            41
ParquetWrite/floats_buffer_output/31/1000/1/1/1/manual_time                                                                    +0.0710         +0.0709           208           223           208           223
ParquetWrite/floats_buffer_output/31/0/32/1/1/manual_time                                                                      +0.0677         +0.0673           732           782           732           782
ParquetWrite/floats_buffer_output/31/0/1/0/1/manual_time                                                                       +0.0663         +0.0659           682           728           682           727
ParquetWrite/floats_buffer_output/31/1000/1/0/1/manual_time                                                                    +0.0785         +0.0780           188           203           188           203
ParquetWrite/decimal_file_output/35/0/1/1/0/manual_time                                                                        +0.0655         +0.0636           277           296           277           295
ParquetWrite/decimal_file_output/35/1000/1/1/0/manual_time                                                                     +0.0657         +0.0634           242           258           242           257
ParquetWrite/decimal_file_output/35/0/32/1/0/manual_time                                                                       +0.1194         +0.0577           291           325           290           307
ParquetWrite/decimal_file_output/35/1000/32/1/0/manual_time                                                                    +0.0852         +0.0836           170           185           170           184
ParquetWrite/decimal_file_output/35/0/1/0/0/manual_time                                                                        +0.3802         +0.0372           346           477           325           337
ParquetWrite/decimal_file_output/35/1000/1/0/0/manual_time                                                                     +0.8101         +0.1543           374           677           373           431
ParquetWrite/decimal_file_output/35/0/32/0/0/manual_time                                                                       +1.4742         +0.0541           328           812           327           344
ParquetWrite/decimal_file_output/35/1000/32/0/0/manual_time                                                                    +0.5398         +0.0463           391           603           390           409
ParquetWrite/decimal_buffer_output/35/0/1/1/1/manual_time                                                                      +0.0571         +0.0570           301           318           301           318
ParquetWrite/decimal_buffer_output/35/1000/1/1/1/manual_time                                                                   +0.1955         +0.1953           253           302           253           302
ParquetWrite/decimal_buffer_output/35/0/32/1/1/manual_time                                                                     +0.0655         +0.0641           306           326           306           325
ParquetWrite/decimal_buffer_output/35/0/1/0/1/manual_time                                                                      +0.0595         +0.0591           381           404           381           404
ParquetWrite/decimal_buffer_output/35/1000/1/0/1/manual_time                                                                   +0.0650         +0.0643           515           548           515           548
ParquetWrite/decimal_buffer_output/35/0/32/0/1/manual_time                                                                     +0.0595         +0.0591           386           409           386           409
ParquetWrite/decimal_buffer_output/35/1000/32/0/1/manual_time                                                                  +0.0595         +0.0590           517           547           516           547
ParquetWrite/timestamps_file_output/33/0/1/1/0/manual_time                                                                     +0.0566         +0.0580           724           765           721           762
ParquetWrite/timestamps_file_output/33/1000/1/1/0/manual_time                                                                  -0.6229         -0.0258           526           198           203           198
ParquetWrite/timestamps_file_output/33/0/32/1/0/manual_time                                                                    -0.0955         +0.0444           928           840           733           766
ParquetWrite/timestamps_file_output/33/1000/32/1/0/manual_time                                                                 +0.0794         +0.0725            36            39            36            39
ParquetWrite/timestamps_file_output/33/0/1/0/0/manual_time                                                                     +0.2140         +0.0788           626           760           626           676
ParquetWrite/timestamps_file_output/33/1000/1/0/0/manual_time                                                                  +0.0778         +0.0760           174           188           174           187
ParquetWrite/timestamps_file_output/33/0/32/0/0/manual_time                                                                    +0.4682         +0.0758           636           934           636           684
ParquetWrite/timestamps_file_output/33/1000/32/0/0/manual_time                                                                 +0.0938         +0.0929            34            38            34            38
ParquetWrite/timestamps_buffer_output/33/0/1/1/1/manual_time                                                                   +0.0559         +0.0559           837           884           837           884
ParquetWrite/timestamps_buffer_output/33/0/1/0/1/manual_time                                                                   +0.0612         +0.0612           714           758           714           758
ParquetWrite/timestamps_buffer_output/33/1000/1/0/1/manual_time                                                                -0.2022         -0.2021           229           183           229           183
ParquetWrite/timestamps_buffer_output/33/0/32/0/1/manual_time                                                                  +0.0609         +0.0596           721           765           721           764
ParquetWrite/string_file_output/23/0/1/1/0/manual_time                                                                         +0.1674         +0.1004          1231          1437           869           956
ParquetWrite/string_file_output/23/1000/1/1/0/manual_time                                                                      +0.0748         +0.0675           124           133           107           114
ParquetWrite/string_file_output/23/0/32/1/0/manual_time                                                                        +0.0497         +0.0541          1197          1256           893           942
ParquetWrite/string_file_output/23/1000/32/1/0/manual_time                                                                     +0.0822         +0.0551            38            41            34            35
ParquetWrite/string_file_output/23/0/1/0/0/manual_time                                                                         +0.3477         +0.0668           892          1202           828           883
ParquetWrite/string_file_output/23/1000/1/0/0/manual_time                                                                      +0.1446         +0.1474            98           113            98           113
ParquetWrite/string_file_output/23/1000/32/0/0/manual_time                                                                     +0.0596         +0.0590            33            35            33            35
ParquetWrite/string_buffer_output/23/1000/1/0/1/manual_time                                                                    +0.0598         +0.0594           104           110           104           110
ParquetWrite/string_void_output/23/1000/32/0/2/manual_time                                                                     -0.3901         +0.0015            34            21            21            21
ParquetWrite/list_file_output/24/0/1/0/0/manual_time                                                                           -0.1313         +0.0831          1033           897           828           897
ParquetWrite/list_file_output/24/1000/1/0/0/manual_time                                                                        +0.0559         +0.0537           521           550           521           549
ParquetWrite/list_file_output/24/0/32/0/0/manual_time                                                                          -0.1942         -0.0129          1183           954           888           877
ContiguousSplit/1Gb512ColsValidity/1073741824/512/256/1/iterations:8/manual_time                                               +0.0660         +0.0659            30            32            30            32
AST<int32_t, TreeType::IMBALANCED_LEFT, false, true>/ast_int32_imbalanced_unique_nulls/1000000/1/manual_time                   +0.0540         +0.0453             0             0             0             0
AST<int32_t, TreeType::IMBALANCED_LEFT, false, true>/ast_int32_imbalanced_unique_nulls/10000000/1/manual_time                  +0.0657         +0.0642             1             1             1             1
AST<int32_t, TreeType::IMBALANCED_LEFT, false, true>/ast_int32_imbalanced_unique_nulls/100000000/1/manual_time                 +0.0704         +0.0702             8             9             8             9
AST<int32_t, TreeType::IMBALANCED_LEFT, true, true>/ast_int32_imbalanced_reuse_nulls/1000000/1/manual_time                     +0.0549         +0.0473             0             0             0             0
AST<int32_t, TreeType::IMBALANCED_LEFT, true, true>/ast_int32_imbalanced_reuse_nulls/10000000/1/manual_time                    +0.0745         +0.0723             1             1             1             1
AST<int32_t, TreeType::IMBALANCED_LEFT, true, true>/ast_int32_imbalanced_reuse_nulls/100000000/1/manual_time                   +0.0758         +0.0755             7             8             7             8
AST<double, TreeType::IMBALANCED_LEFT, false, true>/ast_double_imbalanced_unique_nulls/10000000/1/manual_time                  +0.0534         +0.0522             1             1             1             1
AST<double, TreeType::IMBALANCED_LEFT, false, true>/ast_double_imbalanced_unique_nulls/10000000/10/manual_time                 +0.0610         +0.0606             3             3             3             3
AST<double, TreeType::IMBALANCED_LEFT, false, true>/ast_double_imbalanced_unique_nulls/100000000/1/manual_time                 +0.0538         +0.0537             9            10             9            10
AST<double, TreeType::IMBALANCED_LEFT, false, true>/ast_double_imbalanced_unique_nulls/100000000/10/manual_time                +0.0579         +0.0579            26            27            26            27
Rank/nulls/1024/manual_time                                                                                                    +0.7608         +0.6280             0             0             0             0
Rank/nulls/4096/manual_time                                                                                                    +0.2739         +0.2437             0             0             0             0
Rank/nulls/32768/manual_time                                                                                                   +0.1599         +0.1469             0             0             0             0
Rank/nulls/262144/manual_time                                                                                                  +0.0813         +0.0793             0             0             0             0
Rank/nulls/2097152/manual_time                                                                                                 -0.4178         -0.4162             5             3             5             3
Rank/nulls/16777216/manual_time                                                                                                -0.3688         -0.3686            45            28            45            28
Rank/nulls/67108864/manual_time                                                                                                -0.3576         -0.3576           181           117           181           117
Sort<false>/unstable_no_nulls/1024/8/manual_time                                                                               +0.2655         +0.2554             1             1             1             1
Sort<false>/unstable_no_nulls/4096/8/manual_time                                                                               +0.3212         +0.3081             0             1             1             1
Sort<false>/unstable_no_nulls/32768/8/manual_time                                                                              +0.1430         +0.1395             1             1             1             1
Sort<false>/unstable_no_nulls/262144/8/manual_time                                                                             +0.1080         +0.1064             1             1             1             2
Sort<false>/unstable_no_nulls/2097152/8/manual_time                                                                            -0.0740         -0.0740            15            14            15            14
Sort<false>/unstable_no_nulls/16777216/8/manual_time                                                                           -0.0882         -0.0882           215           196           215           196
Sort<false>/unstable_no_nulls/67108864/8/manual_time                                                                           -0.0848         -0.0848          1170          1071          1170          1071
Sort<true>/stable_no_nulls/1024/8/manual_time                                                                                  +0.2656         +0.2553             1             1             1             1
Sort<true>/stable_no_nulls/4096/8/manual_time                                                                                  +0.3215         +0.3081             0             1             1             1
Sort<true>/stable_no_nulls/32768/8/manual_time                                                                                 +0.1427         +0.1392             1             1             1             1
Sort<true>/stable_no_nulls/262144/8/manual_time                                                                                +0.1082         +0.1066             1             1             1             2
Sort<true>/stable_no_nulls/2097152/8/manual_time                                                                               -0.0737         -0.0735            15            14            15            14
Sort<true>/stable_no_nulls/16777216/8/manual_time                                                                              -0.0889         -0.0887           215           196           215           196
Sort<true>/stable_no_nulls/67108864/8/manual_time                                                                              -0.0848         -0.0846          1170          1071          1170          1071
Sort<false>/unstable/1024/1/manual_time                                                                                        +0.8698         +0.7017             0             0             0             0
Sort<false>/unstable/4096/1/manual_time                                                                                        +0.2846         +0.2506             0             0             0             0
Sort<false>/unstable/32768/1/manual_time                                                                                       +0.1640         +0.1492             0             0             0             0
Sort<false>/unstable/262144/1/manual_time                                                                                      +0.0818         +0.0794             0             0             0             0
Sort<false>/unstable/2097152/1/manual_time                                                                                     -0.4431         -0.4414             5             3             5             3
Sort<false>/unstable/16777216/1/manual_time                                                                                    -0.4282         -0.4280            38            22            38            22
Sort<false>/unstable/67108864/1/manual_time                                                                                    -0.4168         -0.4168           155            90           155            90
Sort<false>/unstable/1024/8/manual_time                                                                                        +0.2213         +0.2142             1             1             1             1
Sort<false>/unstable/4096/8/manual_time                                                                                        +0.2784         +0.2687             1             1             1             1
Sort<false>/unstable/32768/8/manual_time                                                                                       +0.1115         +0.1094             1             1             1             1
Sort<false>/unstable/262144/8/manual_time                                                                                      +0.1030         +0.1016             2             2             2             2
Sort<true>/stable/1024/1/manual_time                                                                                           +0.8684         +0.7016             0             0             0             0
Sort<true>/stable/4096/1/manual_time                                                                                           +0.2860         +0.2517             0             0             0             0
Sort<true>/stable/32768/1/manual_time                                                                                          +0.1638         +0.1497             0             0             0             0
Sort<true>/stable/262144/1/manual_time                                                                                         +0.0817         +0.0798             0             0             0             0
Sort<true>/stable/2097152/1/manual_time                                                                                        -0.4431         -0.4415             5             3             5             3
Sort<true>/stable/16777216/1/manual_time                                                                                       -0.4279         -0.4277            38            22            38            22
Sort<true>/stable/67108864/1/manual_time                                                                                       -0.4176         -0.4176           155            90           155            90
Sort<true>/stable/1024/8/manual_time                                                                                           +0.2211         +0.2138             1             1             1             1
Sort<true>/stable/4096/8/manual_time                                                                                           +0.2808         +0.2706             1             1             1             1
Sort<true>/stable/32768/8/manual_time                                                                                          +0.1117         +0.1096             1             1             1             1
Sort<true>/stable/262144/8/manual_time                                                                                         +0.1029         +0.1013             2             2             2             2
Sort/strings/262144/manual_time                                                                                                -0.0781         -0.0777             4             4             4             4
Scatter/double_coalesce_x/2048/2/manual_time                                                                                   +0.0614         +0.0472         27988         29705         46846         49057
Scatter/double_coalesce_x/32768/2/manual_time                                                                                  +0.0637         +0.0522         30209         32133         47991         50496
Scatter/double_coalesce_x/131072/2/manual_time                                                                                 +0.0558         +0.0444         37821         39932         54883         57321
Scatter/double_coalesce_x/1024/4/manual_time                                                                                   +0.0811         +0.0663         53699         58053         72617         77434
Scatter/double_coalesce_x/2048/4/manual_time                                                                                   +0.0535         +0.0468         56040         59038         74848         78348
Scatter/double_coalesce_x/4096/4/manual_time                                                                                   +0.0514         +0.0449         56187         59073         74930         78291
Scatter/double_coalesce_x/8192/4/manual_time                                                                                   +0.0516         +0.0452         56747         59674         75140         78533
Scatter/double_coalesce_x/16384/4/manual_time                                                                                  +0.0520         +0.0479         57412         60400         75292         78895
Scatter/double_coalesce_x/32768/4/manual_time                                                                                  +0.0610         +0.0544         58151         61699         75398         79499
Scatter/double_coalesce_x/1024/8/manual_time                                                                                   +0.0526         +0.0486        110089        115882        129032        135301
Scatter/double_coalesce_x/2048/8/manual_time                                                                                   +0.0546         +0.0506        110864        116921        129784        136352
Scatter/double_coalesce_x/4096/8/manual_time                                                                                   +0.0612         +0.0554        110733        117506        129306        136465
Scatter/double_coalesce_x/8192/8/manual_time                                                                                   +0.0635         +0.0579        111614        118703        129727        137233
Scatter/double_coalesce_x/16384/8/manual_time                                                                                  +0.0665         +0.0604        111918        119366        129458        137275
Scatter/double_coalesce_x/32768/8/manual_time                                                                                  +0.0545         +0.0543        114993        121260        131951        139113
Scatter/double_coalesce_x/65536/8/manual_time                                                                                  +0.0619         +0.0560        119167        126540        136092        143717
Scatter/double_coalesce_o/2048/2/manual_time                                                                                   +0.0542         +0.0418         29300         30889         48197         50211
Scatter/double_coalesce_o/32768/2/manual_time                                                                                  +0.0556         +0.0464         32069         33851         49914         52229
Scatter/double_coalesce_o/1024/4/manual_time                                                                                   +0.0684         +0.0569         56480         60346         75468         79761
Scatter/double_coalesce_o/8192/4/manual_time                                                                                   +0.0572         +0.0497         59554         62960         77958         81834
Scatter/double_coalesce_o/16384/4/manual_time                                                                                  +0.0572         +0.0525         59839         63260         77704         81781
Scatter/double_coalesce_o/32768/4/manual_time                                                                                  +0.0564         +0.0514         62493         66015         79779         83883
Scatter/double_coalesce_o/1024/8/manual_time                                                                                   +0.0566         +0.0515        112968        119360        131925        138723
Scatter/double_coalesce_o/2048/8/manual_time                                                                                   +0.0565         +0.0518        113151        119548        132028        138870
Scatter/double_coalesce_o/4096/8/manual_time                                                                                   +0.0594         +0.0545        114566        121374        133078        140333
Scatter/double_coalesce_o/8192/8/manual_time                                                                                   +0.0587         +0.0534        116146        122963        134282        141449
Scatter/double_coalesce_o/16384/8/manual_time                                                                                  +0.0663         +0.0597        116445        124161        134038        142046
Scatter/double_coalesce_o/32768/8/manual_time                                                                                  +0.0555         +0.0566        122258        129043        139016        146891
Scatter/double_coalesce_o/65536/8/manual_time                                                                                  +0.0553         +0.0498        133373        140749        150403        157896
Quantiles/no_nulls/65536/4/1/manual_time                                                                                       +0.1394         +0.1370             1             1             1             1
Quantiles/no_nulls/262144/4/1/manual_time                                                                                      +0.1372         +0.1348             1             1             1             1
Quantiles/no_nulls/1048576/4/1/manual_time                                                                                     -0.0944         -0.0943             6             5             6             5
Quantiles/no_nulls/4194304/4/1/manual_time                                                                                     -0.1068         -0.1070            35            32            35            32
Quantiles/no_nulls/16777216/4/1/manual_time                                                                                    -0.0882         -0.0884           210           191           210           191
Quantiles/no_nulls/67108864/4/1/manual_time                                                                                    -0.0855         -0.0858          1148          1050          1148          1050
Quantiles/no_nulls/65536/8/1/manual_time                                                                                       +0.1312         +0.1290             1             1             1             1
Quantiles/no_nulls/262144/8/1/manual_time                                                                                      +0.1058         +0.1044             1             2             1             2
Quantiles/no_nulls/4194304/8/1/manual_time                                                                                     -0.0982         -0.0984            37            33            37            33
Quantiles/no_nulls/16777216/8/1/manual_time                                                                                    -0.0886         -0.0888           215           196           215           196
Quantiles/no_nulls/67108864/8/1/manual_time                                                                                    -0.0866         -0.0868          1173          1071          1173          1071
Quantiles/no_nulls/65536/4/4/manual_time                                                                                       +0.1413         +0.1385             1             1             1             1
Quantiles/no_nulls/262144/4/4/manual_time                                                                                      +0.1355         +0.1332             1             1             1             1
Quantiles/no_nulls/1048576/4/4/manual_time                                                                                     -0.0944         -0.0943             6             5             6             5
Quantiles/no_nulls/4194304/4/4/manual_time                                                                                     -0.1061         -0.1063            35            32            35            32
Quantiles/no_nulls/16777216/4/4/manual_time                                                                                    -0.0877         -0.0879           210           191           210           191
Quantiles/no_nulls/67108864/4/4/manual_time                                                                                    -0.0863         -0.0865          1149          1050          1149          1049
Quantiles/no_nulls/65536/8/4/manual_time                                                                                       +0.1328         +0.1308             1             1             1             1
Quantiles/no_nulls/262144/8/4/manual_time                                                                                      +0.1058         +0.1047             1             2             1             2
Quantiles/no_nulls/4194304/8/4/manual_time                                                                                     -0.0970         -0.0970            37            33            37            33
Quantiles/no_nulls/16777216/8/4/manual_time                                                                                    -0.0886         -0.0888           215           196           215           196
Quantiles/no_nulls/67108864/8/4/manual_time                                                                                    -0.0863         -0.0865          1172          1071          1172          1071
Quantiles/no_nulls/65536/4/12/manual_time                                                                                      +0.1411         +0.1384             1             1             1             1
Quantiles/no_nulls/262144/4/12/manual_time                                                                                     +0.1360         +0.1338             1             1             1             1
Quantiles/no_nulls/1048576/4/12/manual_time                                                                                    -0.0953         -0.0952             6             5             6             5
Quantiles/no_nulls/4194304/4/12/manual_time                                                                                    -0.1054         -0.1056            35            32            35            32
Quantiles/no_nulls/16777216/4/12/manual_time                                                                                   -0.0871         -0.0873           210           191           210           191
Quantiles/no_nulls/67108864/4/12/manual_time                                                                                   -0.0858         -0.0860          1148          1050          1148          1049
Quantiles/no_nulls/65536/8/12/manual_time                                                                                      +0.1323         +0.1302             1             1             1             1
Quantiles/no_nulls/262144/8/12/manual_time                                                                                     +0.1060         +0.1047             1             2             1             2
Quantiles/no_nulls/1048576/8/12/manual_time                                                                                    -0.0702         -0.0703             6             6             6             6
Quantiles/no_nulls/4194304/8/12/manual_time                                                                                    -0.0971         -0.0973            37            33            37            33
Quantiles/no_nulls/16777216/8/12/manual_time                                                                                   -0.0885         -0.0887           215           196           215           196
Quantiles/no_nulls/67108864/8/12/manual_time                                                                                   -0.0865         -0.0866          1173          1071          1172          1071
Quantiles/nulls/65536/1/1/manual_time                                                                                          +0.0958         +0.0916             0             0             0             0
Quantiles/nulls/262144/1/1/manual_time                                                                                         +0.0750         +0.0728             0             0             0             0
Quantiles/nulls/1048576/1/1/manual_time                                                                                        -0.1901         -0.1874             2             1             2             1
Quantiles/nulls/4194304/1/1/manual_time                                                                                        -0.4297         -0.4288            10             5            10             6
Quantiles/nulls/16777216/1/1/manual_time                                                                                       -0.4270         -0.4268            38            22            38            22
Quantiles/nulls/67108864/1/1/manual_time                                                                                       -0.4151         -0.4152           155            90           155            90
Quantiles/nulls/65536/4/1/manual_time                                                                                          +0.1027         +0.1007             1             1             1             1
Quantiles/nulls/262144/4/1/manual_time                                                                                         +0.1119         +0.1103             1             1             1             1
Quantiles/nulls/65536/8/1/manual_time                                                                                          +0.1193         +0.1174             1             2             1             2
Quantiles/nulls/262144/8/1/manual_time                                                                                         +0.0973         +0.0963             2             2             2             2
Quantiles/nulls/65536/1/4/manual_time                                                                                          +0.0973         +0.0928             0             0             0             0
Quantiles/nulls/262144/1/4/manual_time                                                                                         +0.0759         +0.0731             0             0             0             0
Quantiles/nulls/1048576/1/4/manual_time                                                                                        -0.1906         -0.1879             2             1             2             1
Quantiles/nulls/4194304/1/4/manual_time                                                                                        -0.4296         -0.4287            10             5            10             5
Quantiles/nulls/16777216/1/4/manual_time                                                                                       -0.4278         -0.4277            38            22            38            22
Quantiles/nulls/67108864/1/4/manual_time                                                                                       -0.4153         -0.4154           155            90           155            90
Quantiles/nulls/65536/4/4/manual_time                                                                                          +0.1047         +0.1027             1             1             1             1
Quantiles/nulls/262144/4/4/manual_time                                                                                         +0.1116         +0.1100             1             1             1             1
Quantiles/nulls/65536/8/4/manual_time                                                                                          +0.1194         +0.1175             1             2             1             2
Quantiles/nulls/262144/8/4/manual_time                                                                                         +0.0975         +0.0964             2             2             2             2
Quantiles/nulls/65536/1/12/manual_time                                                                                         +0.0954         +0.0909             0             0             0             0
Quantiles/nulls/262144/1/12/manual_time                                                                                        +0.0779         +0.0749             0             0             0             0
Quantiles/nulls/1048576/1/12/manual_time                                                                                       -0.1873         -0.1848             2             1             2             1
Quantiles/nulls/4194304/1/12/manual_time                                                                                       -0.4304         -0.4295            10             5            10             5
Quantiles/nulls/16777216/1/12/manual_time                                                                                      -0.4277         -0.4276            38            22            38            22
Quantiles/nulls/67108864/1/12/manual_time                                                                                      -0.4144         -0.4145           154            90           154            90
Quantiles/nulls/65536/4/12/manual_time                                                                                         +0.1006         +0.0987             1             1             1             1
Quantiles/nulls/262144/4/12/manual_time                                                                                        +0.1120         +0.1104             1             1             1             1
Quantiles/nulls/65536/8/12/manual_time                                                                                         +0.1193         +0.1174             1             2             1             2
Quantiles/nulls/262144/8/12/manual_time                                                                                        +0.0953         +0.0942             2             2             2             2

Additional benchmarking from @randerzander and @GregoryKimball indicates that sort and quantile benchmarks improve for large data sizes, with as much as a 34% reduction in time for "Rank nulls 67108864." The benchmark "Quantiles nulls 67108864" shows roughly a 6% reduction in runtime. Small sizes sometimes showed slowdowns, like "Rank nulls 1024" going from 98 microseconds to 177 microseconds. However, these small data sizes are typically not the cases we are optimizing for. I discussed these results in detail with @GregoryKimball and we decided the benchmarks were "green light."
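
For readers checking the numbers, here is a minimal sketch of how a relative-change figure can be computed from a before/after pair. It assumes the `Time` and `CPU` columns in the table are fractional changes of the form `(new - old) / old`; the helper name `relative_change` is hypothetical and is not part of the benchmark tooling used for this PR.

```python
def relative_change(old: float, new: float) -> float:
    """Fractional change from old to new; negative means the new run is faster."""
    if old == 0:
        raise ValueError("old time must be nonzero")
    return (new - old) / abs(old)


# "Rank nulls 1024": 98 microseconds before, 177 microseconds after.
slowdown = relative_change(98, 177)
print(f"{slowdown:+.4f}")  # +0.8061

# Rounded times from the "Quantiles/nulls/67108864/1/1" row: 155 before, 90 after.
speedup = relative_change(155, 90)
print(f"{speedup:+.4f}")  # -0.4194
```

The second value is close to, but not the same as, the -0.4151 printed in the table, which is expected if the table's time columns are rounded to whole units while the ratio is computed from unrounded measurements.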

@bdice bdice added the improvement Improvement / enhancement to an existing function label Mar 23, 2022
@bdice bdice self-assigned this Mar 23, 2022
@github-actions github-actions bot added CMake CMake build issue libcudf Affects libcudf (C++/CUDA) code. labels Mar 23, 2022
@bdice bdice added the non-breaking Non-breaking change label Mar 23, 2022

codecov bot commented Mar 24, 2022

Codecov Report

Merging #10489 (a2a5daa) into branch-22.06 (73bc7d7) will not change coverage.
The diff coverage is n/a.

@@              Coverage Diff              @@
##           branch-22.06   #10489   +/-   ##
=============================================
  Coverage         86.33%   86.33%           
=============================================
  Files               140      140           
  Lines             22300    22300           
=============================================
  Hits              19253    19253           
  Misses             3047     3047           

Powered by Codecov. Last update 4775f11...a2a5daa. Read the comment docs.

rapids-bot bot pushed a commit to rapidsai/rmm that referenced this pull request Mar 29, 2022
## Description

This PR cleans up some `#include`s for Thrust. This is meant to help ease the transition to Thrust 1.16 when that is updated in rapids-cmake.

## Context

I opened a PR rapidsai/cudf#10489 that updates cuDF to Thrust 1.16. Notably, Thrust reduced the number of internal header inclusions:
> [#1572](NVIDIA/thrust#1572) Removed several unnecessary header includes. Downstream projects may need to update their includes if they were relying on this behavior.

I spoke with @robertmaynard and he recommended making similar changes to clean up includes ("include what we use," in essence) to make sure we have compatibility with future versions of Thrust across all RAPIDS libraries.

It looks like rmm may be able to build with Thrust 1.16 even without these changes, but I think this changeset may help prevent future problems arising from inconsistency and reliance on `detail` headers.

Authors:
  - Bradley Dice (https://github.com/bdice)

Approvers:
  - Mark Harris (https://github.com/harrism)
  - Vyas Ramasubramani (https://github.com/vyasr)
  - Conor Hoekstra (https://github.com/codereport)

URL: #1011
@bdice bdice marked this pull request as ready for review March 29, 2022 19:33
@bdice bdice requested review from a team as code owners March 29, 2022 19:34
@bdice bdice requested review from cwharris and vyasr March 29, 2022 19:34
Contributor

@vyasr vyasr left a comment

This looks good to me assuming that you don't find anything surprising in new benchmarks. Please request my review again if benchmarks reveal an unexpected issue.

@bdice
Contributor Author

bdice commented Mar 31, 2022

Java CI is failing. I see that TableTest.testSample compares to a set of fixed results for a randomly generated sample (source). We should refactor this test to make assertions that are not dependent on the exact random state, like asserting that the length of the result matches the expected length, rather than asserting that the specific values match an expectation. Even with a seed specified, I am not sure if we can guarantee the upstream behavior of this random generator to be stable across changing versions of libcudf's dependencies like Thrust.
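
To illustrate the kind of refactoring suggested above, here is a small sketch in Python rather than Java. The function `sample_rows` is a hypothetical stand-in for a seeded sampling API and is not the cuDF implementation; the sketch also assumes sampling without replacement. The point is only the shape of the assertions: they check properties of the result instead of pinning exact values.

```python
import random


def sample_rows(rows, n, seed):
    """Hypothetical stand-in for a seeded sampling API; not the cuDF implementation."""
    rng = random.Random(seed)
    return rng.sample(rows, n)


rows = list(range(100))
result = sample_rows(rows, 10, seed=42)

# Brittle: comparing against a hard-coded list pins the output of one generator version.
# Robust: check properties that hold for any correct sampler.
assert len(result) == 10                  # expected length
assert set(result) <= set(rows)           # every value comes from the input
assert len(set(result)) == len(result)    # no repeats (sampling without replacement)

# Determinism for a fixed seed can still be checked within a single environment.
assert sample_rows(rows, 10, seed=42) == result
```

Assertions of this form should keep passing if an upstream dependency changes which particular rows a given seed selects.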

@github-actions github-actions bot added the Java Affects Java cuDF API. label Mar 31, 2022
@bdice bdice requested a review from a team March 31, 2022 20:49
Member

@jlowe jlowe left a comment

Java approval

@bdice
Contributor Author

bdice commented Apr 1, 2022

@gpucibot merge

@rapids-bot rapids-bot bot merged commit ca952f8 into rapidsai:branch-22.06 Apr 1, 2022
rapids-bot bot pushed a commit that referenced this pull request Apr 4, 2022
Fixes `thrust.patch` to patch the CUB source for `sort` to minimize the inlining of the comparator functor. The build was updated to thrust-1.16 in #10489, which includes a change to Thrust's sort to use CUB's `DeviceMergeSort`. As a result, the previous patch no longer applies to the new thrust/cub source. This dramatically increased the build time for `sort.cu` and other related source files, as can be seen in this Build Metrics Report from #10489: https://gpuci.gpuopenanalytics.com/job/rapidsai/job/gpuci/job/cudf/job/prb/job/cudf-cpu-cuda-build/CUDA=11.5/8633/Build_20Metrics_20Report/

This PR moves the `pragma unroll` changes into the appropriate CUB source files, reducing the build time back to the previous levels (or close to it, I hope).

Authors:
  - David Wendt (https://github.com/davidwendt)

Approvers:
  - Bradley Dice (https://github.com/bdice)
  - Nghia Truong (https://github.com/ttnghia)
  - Vyas Ramasubramani (https://github.com/vyasr)

URL: #10577
rapids-bot bot pushed a commit that referenced this pull request Apr 5, 2022
PR #10489 updated from Thrust 1.15 to Thrust 1.16. However, this appears to be causing conflicts with other repositories -- [cuSpatial](rapidsai/cuspatial#511 (comment)) and cuGraph have reported issues where their builds are finding Thrust 1.16 from libcudf instead of Thrust 1.15 which is [currently pinned by rapids-cmake](https://github.com/rapidsai/rapids-cmake/blob/06a657281cdd83781e49afcdbb39abc491eeab17/rapids-cmake/cpm/versions.json#L26).

This PR is intended to unblock local builds and CI builds for other RAPIDS packages until we are able to identify the root cause (which may be due to CMake include path orderings or rapids-cmake).

Last time Thrust was updated, [rapids-cmake was updated](rapidsai/rapids-cmake#138) one day before [libcudf was updated](#9912). That may explain why we didn't notice this problem with the 1.15 update.

The plan I currently have in mind is:

1. Merge this PR to roll back libcudf to Thrust 1.15 (and revert the patch for Thrust 1.16 [10577](#10577)). This will hopefully unblock CI for cugraph and cuspatial.
2. Try to work out whatever issues with CMake / include paths may exist.
3. Prepare all rapids-cmake repos for Thrust 1.16 compatibility. I've [done this for RMM already](rapidsai/rmm#1011), and I am working on [PR 4675](rapidsai/cuml#4675) to cuML now. I am planning to make the same fixes for `#include`s in cuCollections, raft, cuSpatial, and cuGraph so they will be compatible with Thrust 1.16.
4. Try to upgrade libcudf to Thrust 1.16 again (and re-apply the updated patch). If (2) has been resolved, I hope we won't see any issues in other RAPIDS libraries.
5. Upgrade rapids-cmake to Thrust 1.16.

Authors:
  - Bradley Dice (https://github.com/bdice)

Approvers:
  - Vyas Ramasubramani (https://github.com/vyasr)
  - Mark Harris (https://github.com/harrism)

URL: #10586
abellina pushed a commit to abellina/cudf that referenced this pull request Apr 11, 2022
URL: rapidsai#10577
abellina pushed a commit to abellina/cudf that referenced this pull request Apr 11, 2022
URL: rapidsai#10586
rapids-bot bot pushed a commit to rapidsai/cugraph that referenced this pull request Jun 29, 2022
## Description

This PR cleans up some `#include`s for Thrust. This is meant to help ease the transition to Thrust 1.17 when that is updated in rapids-cmake.

## Context

I opened a PR rapidsai/cudf#10489 that updates cuDF to Thrust 1.16. Notably, version 1.16 of Thrust reduced the number of internal header inclusions:
> [#1572](NVIDIA/thrust#1572) Removed several unnecessary header includes. Downstream projects may need to update their includes if they were relying on this behavior.

I spoke with @robertmaynard and he recommended making similar changes to clean up includes ("include what we use," in essence) to make sure we have compatibility with future versions of Thrust across all RAPIDS libraries.

This changeset also makes it more obvious where cugraph depends on `thrust/detail` headers.

Authors:
  - Bradley Dice (https://github.com/bdice)

Approvers:
  - Brad Rees (https://github.com/BradReesWork)
  - Seunghwa Kang (https://github.com/seunghwak)

URL: #2310
rapids-bot bot pushed a commit to rapidsai/cuml that referenced this pull request Jul 7, 2022
## Description

This PR cleans up some `#include`s for Thrust. This is meant to help ease the transition to Thrust 1.17 when that is updated in rapids-cmake.

## Context

I opened a PR rapidsai/cudf#10489 that updates cuDF to Thrust 1.16. Notably, Thrust reduced the number of internal header inclusions:
> [#1572](NVIDIA/thrust#1572) Removed several unnecessary header includes. Downstream projects may need to update their includes if they were relying on this behavior.

I spoke with @robertmaynard and he recommended making similar changes to clean up includes ("include what we use," in essence) to make sure we have compatibility with future versions of Thrust across all RAPIDS libraries.

This changeset also removes dependence on `thrust/detail` headers.

Authors:
  - Bradley Dice (https://github.com/bdice)

Approvers:
  - William Hicks (https://github.com/wphicks)

URL: #4675
rapids-bot bot pushed a commit that referenced this pull request Aug 4, 2022
Thrust 1.16 removed internal header inclusions that libcudf relied on. This PR adds missing `#include`s that were found automatically by a script I wrote. See notes on #10489. This was previously applied in #10489 but the script became more sophisticated (and libcudf has changed) since I last applied it, so more missing `#include`s were found.

Required for #11437 to upgrade to Thrust 1.17. This change has been separated from #11437 to minimize that PR's diff. Some additional changes will be needed on that PR but we don't want to hold off on fixing these includes, as recommended by @davidwendt.
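
As a rough illustration of how such a check could work, here is a minimal sketch. It is not the script used for this PR: the name `missing_thrust_includes` and the tiny `SYMBOL_HEADERS` map are assumptions made for the example, and a real tool would need a far larger map and would have to deal with comments, macros, and aliases.

```python
import re

# Hypothetical, deliberately small symbol-to-header map for illustration only.
SYMBOL_HEADERS = {
    "sort": "thrust/sort.h",
    "transform": "thrust/transform.h",
    "copy_if": "thrust/copy.h",
    "device_vector": "thrust/device_vector.h",
    "make_counting_iterator": "thrust/iterator/counting_iterator.h",
}

# Matches uses such as thrust::sort and direct includes such as <thrust/sort.h>.
USE_RE = re.compile(r"\bthrust::(\w+)")
INCLUDE_RE = re.compile(r'^\s*#\s*include\s*[<"](thrust/[^>"]+)[>"]', re.MULTILINE)


def missing_thrust_includes(source):
    """Return the mapped Thrust headers that are used but not directly included."""
    included = set(INCLUDE_RE.findall(source))
    needed = {
        SYMBOL_HEADERS[name]
        for name in USE_RE.findall(source)
        if name in SYMBOL_HEADERS
    }
    return sorted(needed - included)


example = """
#include <thrust/sort.h>

void f(int* first, int* last) {
  thrust::sort(thrust::seq, first, last);
  auto it = thrust::make_counting_iterator(0);
}
"""
print(missing_thrust_includes(example))  # ['thrust/iterator/counting_iterator.h']
```

Symbols that are absent from the map, such as `thrust::seq` here, are silently ignored, so the sketch can under-report; it is meant only to show the "include what we use" idea of matching each used symbol to a header that the file includes directly.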

Authors:
  - Bradley Dice (https://github.com/bdice)

Approvers:
  - Karthikeyan (https://github.com/karthikeyann)
  - Nghia Truong (https://github.com/ttnghia)
  - Robert Maynard (https://github.com/robertmaynard)

URL: #11457
rapids-bot bot pushed a commit that referenced this pull request Aug 11, 2022
Updates the bundled version of Thrust to 1.17.0. I will run benchmarks and include results in a comment below.

Depends on #11457.

Supersedes #10489, #10577, #10586. Closes #10841. **This should be merged concurrently with rapidsai/rapids-cmake#231.**

Authors:
  - Bradley Dice (https://github.com/bdice)

Approvers:
  - David Wendt (https://github.com/davidwendt)
  - Nghia Truong (https://github.com/ttnghia)
  - Robert Maynard (https://github.com/robertmaynard)

URL: #11437
jakirkham pushed a commit to jakirkham/cuml that referenced this pull request Feb 27, 2023
URL: rapidsai#4675
Labels
CMake CMake build issue improvement Improvement / enhancement to an existing function Java Affects Java cuDF API. libcudf Affects libcudf (C++/CUDA) code. non-breaking Non-breaking change
3 participants