This repository has been archived by the owner on Mar 21, 2024. It is now read-only.

Tell gcc this sizeof division is intended ( -Wsizeof-array-div ) #418

Merged 2 commits on Jan 19, 2022


robertmaynard
Collaborator

No description provided.

@alliepiper
Collaborator

LGTM, I'll start testing soon.

@alliepiper alliepiper self-assigned this Jan 13, 2022
@alliepiper alliepiper added type: bug: functional Does not work as intended. P1: should have Necessary, but not critical. labels Jan 13, 2022
@alliepiper alliepiper added this to Inbox in PR Tracking via automation Jan 13, 2022
@alliepiper alliepiper added this to the 1.16.0 milestone Jan 13, 2022
@alliepiper alliepiper moved this from Inbox to Need Testing in PR Tracking Jan 13, 2022
@alliepiper alliepiper added the testing: gpuCI in progress Started gpuCI testing. label Jan 17, 2022
@alliepiper
Collaborator

gpuCI: NVIDIA/thrust#1590
DVS CL: 30883617

@alliepiper alliepiper added the testing: internal ci in progress Currently testing on internal NVIDIA CI (DVS). label Jan 17, 2022
@alliepiper alliepiper moved this from Need Testing to Tests Pending in PR Tracking Jan 17, 2022
@alliepiper
Collaborator

The warning is still emitted on gcc 11. I also tried just adding parens around the sizeof(DeviceWord) like the new warning suggests, but it still emits the diagnostic:

/workspace/dependencies/cub/cub/block/specializations/../../util_type.cuh:687:19: error: expression does not compute the number of elements in this array; element type is ‘longlong4’, not ‘cub::Uninitialized<longlong4 [33]>::DeviceWord’ {aka ‘ulonglong2’} [-Werror=sizeof-array-div]
  687 |         WORDS = (sizeof(T) / sizeof(DeviceWord))
      |         ~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~
/workspace/dependencies/cub/cub/block/specializations/../../util_type.cuh:687:21: note: add parentheses around ‘sizeof (cub::Uninitialized<longlong4 [33]>::DeviceWord)’ to silence this warning
  687 |         WORDS = (sizeof(T) / sizeof(DeviceWord))
      |                     ^~~~~~~~~~~~~~~~~~
      |                     (                 )

I have another workaround incoming.

The compiler suggests putting parentheses around the sizeof divisor to
suppress the warning, but this doesn't work on gcc 11. Refactoring
the expression a bit works, though.
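
For readers unfamiliar with the diagnostic, here is a minimal sketch of one way such a refactoring could look. It is not taken from the merged commit. `LongLong4Like`, `ULongLong2Like`, and `UninitializedSketch` are hypothetical stand-ins for `longlong4`, `ulonglong2`, and `cub::Uninitialized`, and the constant names are illustrative. The idea is that `-Wsizeof-array-div` is aimed at a division whose operands are both written as `sizeof`, so naming each size first should keep the division from matching that pattern.

```cpp
#include <cstddef>

// Hypothetical stand-ins for the CUDA vector types named in the diagnostic.
struct LongLong4Like  { long long x, y, z, w; };      // plays the role of longlong4
struct ULongLong2Like { unsigned long long x, y; };   // plays the role of ulonglong2

// Assumption: 8-byte long long, as on the common 64-bit targets.
static_assert(sizeof(LongLong4Like) == 32, "sketch assumes 8-byte long long");
static_assert(sizeof(ULongLong2Like) == 16, "sketch assumes 8-byte long long");

// Sketch of a storage wrapper that backs a T with an array of wider words.
template <typename T>
struct UninitializedSketch
{
    using DeviceWord = ULongLong2Like;

    // Writing `sizeof(T) / sizeof(DeviceWord)` directly is the form gcc 11
    // flagged when T is an array type. Here each size gets its own name,
    // so the division operates on two plain constants instead.
    static constexpr std::size_t DATA_SIZE = sizeof(T);
    static constexpr std::size_t WORD_SIZE = sizeof(DeviceWord);
    static constexpr std::size_t WORDS     = DATA_SIZE / WORD_SIZE;

    DeviceWord storage[WORDS];
};

// Same instantiation shape as in the error message: an array of 33 elements.
using Example = UninitializedSketch<LongLong4Like[33]>;
static_assert(Example::WORDS == 33 * 32 / 16, "word count covers the whole array");
```

The computed value is unchanged; only the spelling of the expression differs, which is why this kind of change is safe to make purely to quiet a warning.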
@alliepiper alliepiper merged commit d4e8d5c into NVIDIA:main Jan 19, 2022
PR Tracking automation moved this from Tests Pending to Done Jan 19, 2022
@robertmaynard robertmaynard deleted the remove_gcc11_warnings branch January 20, 2022 14:22
mfep pushed a commit to ROCm/hipCUB that referenced this pull request Mar 16, 2022
rapids-bot bot pushed a commit to rapidsai/cudf that referenced this pull request Apr 1, 2022
This PR updates the version of Thrust from 1.15 to 1.16 ([changelog](https://github.com/NVIDIA/thrust/blob/main/CHANGELOG.md#thrust-1160)). This update is needed to fix compilation with GCC 11, because of some warnings-as-errors present in Thrust 1.15 with GCC 11 (such as this one from Thrust's copy of cub: https://github.com/NVIDIA/cub/pull/418).

Notably, Thrust reduced the number of internal header inclusions:
> [#1572](https://github.com/NVIDIA/thrust/pull/1572) Removed several unnecessary header includes. Downstream projects may need to update their includes if they were relying on this behavior.

This change illuminated many missing includes in libcudf, so I added `#include <thrust/...>` for all thrust features used in each file (with help from a Python script).

I included raw benchmarks that I recorded below.

<details>
<summary>Benchmarks:</summary>

```
Benchmark                                                                                                                         Time             CPU      Time Old      Time New       CPU Old       CPU New
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
CopyIfElse/int16_no_nulls/4096/manual_time                                                                                     +0.0581         +0.0307             0             0             0             0
CopyIfElse/uint32_no_nulls/4096/manual_time                                                                                    +0.1308         +0.0463             0             0             0             0
CopyIfElse/uint32_no_nulls/32768/manual_time                                                                                   +0.1043         +0.0485             0             0             0             0
CopyIfElse/float64_no_nulls/4096/manual_time                                                                                   +0.0894         +0.0422             0             0             0             0
StringDateTime/from_days/32768/manual_time                                                                                     +0.0529         +0.0491            93            98           112           118
StringDateTime/to_days/1024/manual_time                                                                                        +0.0596         +0.0493            35            37            54            57
StringDateTime/to_days/32768/manual_time                                                                                       +0.0547         +0.0460            37            39            55            58
StringToDurations/to_durations_ms/1024/manual_time                                                                             +0.0516         +0.0426            30            31            49            51
StringToDurations/to_durations_ms/32768/manual_time                                                                            +0.0542         +0.0506            32            34            52            55
StringToDurations/to_durations_us/32768/manual_time                                                                            +0.0520         +0.0440            32            34            52            55
StringsFromFixedPoint/strings_from_decimal64/16384/manual_time                                                                 +0.0530         +0.0508            94            99           113           119
StringsToNumeric/strings_to_float32/1024/manual_time                                                                           +0.0521         +0.0451            31            32            50            52
StringsToNumeric/strings_to_float64/16384/manual_time                                                                          +0.0517         +0.0437            32            34            51            53
StringsToNumeric/strings_to_float64/65536/manual_time                                                                          +0.0505         +0.0496            35            36            53            56
StringsToNumeric/strings_to_uint8/4096/manual_time                                                                             +0.0559         +0.0466            24            25            43            45
StringsToNumeric/strings_to_uint8/65536/manual_time                                                                            +0.0563         +0.0458            26            27            44            46
StringCopy/gather/4096/32/manual_time                                                                                          +0.0652         +0.0574             0             0             0             0
StringCopy/gather/4096/128/manual_time                                                                                         +0.0706         +0.0615             0             0             0             0
StringCopy/gather/4096/512/manual_time                                                                                         +0.0547         +0.0476             0             0             0             0
StringCopy/gather/32768/32/manual_time                                                                                         +0.0538         +0.0492             0             0             0             0
StringCopy/gather/32768/128/manual_time                                                                                        +0.0540         +0.0477             0             0             0             0
StringCopy/scatter/4096/32/manual_time                                                                                         +0.0571         +0.0526             0             0             0             0
StringCopy/scatter/32768/32/manual_time                                                                                        +0.0541         +0.0509             0             0             0             0
StringFindScalar/find_multi/4096/32/manual_time                                                                                +0.0525         +0.0460             0             0             0             0
StringFindScalar/find_multi/32768/32/manual_time                                                                               +0.0538         +0.0489             0             0             0             0
StringFindScalar/contains/4096/32/manual_time                                                                                  +0.0502         +0.0471             0             0             0             0
StringFindScalar/starts_with/4096/32/manual_time                                                                               +0.0528         +0.0476             0             0             0             0
StringFindScalar/starts_with/4096/2048/manual_time                                                                             +0.0575         +0.0475             0             0             0             0
StringFindScalar/starts_with/4096/8192/manual_time                                                                             +0.0606         +0.0515             0             0             0             0
StringFindScalar/starts_with/32768/32/manual_time                                                                              +0.0690         +0.0592             0             0             0             0
StringFindScalar/starts_with/32768/128/manual_time                                                                             +0.0589         +0.0499             0             0             0             0
StringFindScalar/starts_with/32768/512/manual_time                                                                             +0.0567         +0.0521             0             0             0             0
StringFindScalar/starts_with/32768/2048/manual_time                                                                            +0.0517         +0.0501             0             0             0             0
StringFindScalar/starts_with/262144/32/manual_time                                                                             +0.0555         +0.0525             0             0             0             0
StringFindScalar/ends_with/4096/2048/manual_time                                                                               +0.0526         +0.0446             0             0             0             0
StringFindScalar/ends_with/4096/8192/manual_time                                                                               +0.0568         +0.0485             0             0             0             0
StringFindScalar/ends_with/32768/32/manual_time                                                                                +0.0654         +0.0567             0             0             0             0
StringFindScalar/ends_with/32768/512/manual_time                                                                               +0.0546         +0.0502             0             0             0             0
StringFindScalar/ends_with/262144/32/manual_time                                                                               +0.0523         +0.0517             0             0             0             0
RepeatStrings/scalar_times/256/16/manual_time                                                                                  +0.0555         +0.0501             0             0             0             0
RepeatStrings/scalar_times/1024/16/manual_time                                                                                 +0.0562         +0.0519             0             0             0             0
RepeatStrings/column_times/256/16/manual_time                                                                                  +0.0645         +0.0579             0             0             0             0
RepeatStrings/column_times/256/64/manual_time                                                                                  +0.0506         +0.0472             0             0             0             0
RepeatStrings/column_times/1024/16/manual_time                                                                                 +0.0643         +0.0578             0             0             0             0
RepeatStrings/column_times/4096/16/manual_time                                                                                 +0.0537         +0.0502             0             0             0             0
RepeatStrings/column_times/16384/16/manual_time                                                                                +0.0565         +0.0514             0             0             0             0
RepeatStrings/compute_output_strings_sizes/256/16/manual_time                                                                  +0.0626         +0.0490             0             0             0             0
RepeatStrings/compute_output_strings_sizes/256/64/manual_time                                                                  +0.0539         +0.0434             0             0             0             0
RepeatStrings/compute_output_strings_sizes/256/256/manual_time                                                                 +0.0694         +0.0525             0             0             0             0
RepeatStrings/compute_output_strings_sizes/1024/16/manual_time                                                                 +0.0526         +0.0422             0             0             0             0
RepeatStrings/compute_output_strings_sizes/1024/64/manual_time                                                                 +0.0630         +0.0493             0             0             0             0
RepeatStrings/compute_output_strings_sizes/1024/256/manual_time                                                                +0.0533         +0.0460             0             0             0             0
RepeatStrings/precomputed_sizes/256/16/manual_time                                                                             +0.0674         +0.0602             0             0             0             0
RepeatStrings/precomputed_sizes/1024/16/manual_time                                                                            +0.0544         +0.0488             0             0             0             0
RepeatStrings/precomputed_sizes/4096/16/manual_time                                                                            +0.0531         +0.0492             0             0             0             0
RepeatStrings/precomputed_sizes/16384/16/manual_time                                                                           +0.0522         +0.0470             0             0             0             0
StringReplace/slice/4096/32/manual_time                                                                                        +0.0559         +0.0534             0             0             0             0
StringReplace/slice/32768/32/manual_time                                                                                       +0.0509         +0.0472             0             0             0             0
StringSplit/split_ws/4096/32/manual_time                                                                                       +0.0507         +0.0493             0             0             0             0
StringSubstring/multi_position/4096/32/manual_time                                                                             +0.0560         +0.0515             0             0             0             0
StringSubstring/delimiter/4096/32/manual_time                                                                                  +0.0532         +0.0504             0             0             0             0
StringSubstring/delimiter/32768/128/manual_time                                                                                +0.0531         +0.0535             0             0             0             0
StringSubstring/multi_delimiter/4096/32/manual_time                                                                            +0.0544         +0.0522             0             0             0             0
CsvWrite/string_file_output/23/0/manual_time                                                                                   -0.3111         -0.0110          1421           979           842           833
Shift/shift_ten_percent_nullable_out/32768/manual_time                                                                         -0.0786         -0.0650             0             0             0             0
Shift/shift_full_nullable_out/1073741824/manual_time                                                                           +0.0511         +0.0510            11            11            11            11
TypeDispatcher/fp64_bandwidth_host/8/1024/1/manual_time                                                                        +0.1281         +0.0638         18970         21400         37938         40357
TypeDispatcher/fp64_bandwidth_host/4/2048/1/manual_time                                                                        +0.0928         +0.0345         11556         12629         30463         31513
TypeDispatcher/fp64_bandwidth_host/2/4096/1/manual_time                                                                        +0.0768         +0.0270          7421          7991         26234         26943
TypeDispatcher/fp64_bandwidth_host/1/8192/1/manual_time                                                                        +0.0729         +0.0209          5029          5396         24111         24615
TypeDispatcher/fp64_bandwidth_device/8/1024/1/manual_time                                                                      +0.1176         +0.0632         16518         18460         35703         37961
TypeDispatcher/fp64_bandwidth_device/4/2048/1/manual_time                                                                      +0.0787         +0.0457         14424         15559         33546         35079
TypeDispatcher/fp64_bandwidth_device/2/4096/1/manual_time                                                                      +0.0500         +0.0327         13594         14274         32740         33811
TypeDispatcher/fp64_bandwidth_no/2/1024/1/manual_time                                                                          +0.0590         +0.0131          5065          5364         23966         24281
TypeDispatcher/fp64_bandwidth_no/8/1024/1/manual_time                                                                          +0.2305         +0.0699          6912          8506         25803         27607
TypeDispatcher/fp64_bandwidth_no/1/2048/1/manual_time                                                                          +0.0574         +0.0120          4854          5133         23782         24067
TypeDispatcher/fp64_bandwidth_no/4/2048/1/manual_time                                                                          +0.1602         +0.0461          6010          6973         24906         26054
TypeDispatcher/fp64_bandwidth_no/2/4096/1/manual_time                                                                          +0.0949         +0.0330          5583          6113         24469         25275
TypeDispatcher/fp64_bandwidth_no/4/4096/1/manual_time                                                                          +0.0623         +0.0175          6991          7427         26088         26545
TypeDispatcher/fp64_bandwidth_no/8/4096/1/manual_time                                                                          +0.0521         +0.0173          8953          9419         28000         28484
TypeDispatcher/fp64_bandwidth_no/1/8192/1/manual_time                                                                          +0.0607         +0.0257          5225          5542         24107         24727
TypeDispatcher/fp64_bandwidth_no/2/8192/1/manual_time                                                                          +0.0588         +0.0115          5964          6315         25052         25341
TypeDispatcher/fp64_bandwidth_no/1/16384/1/manual_time                                                                         +0.0541         +0.0119          5443          5737         24515         24806
TextTokenize/ngrams/2097152/128/manual_time                                                                                    +0.0624         +0.0623            10            10            10            10
MultibyteSplitBenchmark/multibyte_split_simple/1/1/1/32768/manual_time                                                         +0.4019         +0.4024             8            12             8            12
MultibyteSplitBenchmark/multibyte_split_simple/2/1/1/32768/manual_time                                                         +0.4099         +0.4073             8            12             8            12
MultibyteSplitBenchmark/multibyte_split_simple/1/4/1/32768/manual_time                                                         +0.3999         +0.3961             8            12             8            12
MultibyteSplitBenchmark/multibyte_split_simple/2/4/1/32768/manual_time                                                         +0.3969         +0.3980             8            12             8            12
MultibyteSplitBenchmark/multibyte_split_simple/1/7/1/32768/manual_time                                                         +0.4107         +0.3971             8            12             8            12
MultibyteSplitBenchmark/multibyte_split_simple/2/7/1/32768/manual_time                                                         +0.3833         +0.3948             8            12             8            12
MultibyteSplitBenchmark/multibyte_split_simple/1/1/25/32768/manual_time                                                        +0.3807         +0.3772             9            12             9            12
MultibyteSplitBenchmark/multibyte_split_simple/2/1/25/32768/manual_time                                                        +0.3834         +0.3702             9            12             9            12
MultibyteSplitBenchmark/multibyte_split_simple/1/4/25/32768/manual_time                                                        +0.3646         +0.3661             9            12             9            12
MultibyteSplitBenchmark/multibyte_split_simple/2/4/25/32768/manual_time                                                        +0.3722         +0.3743             9            12             9            12
MultibyteSplitBenchmark/multibyte_split_simple/1/7/25/32768/manual_time                                                        +0.3575         +0.3664             9            12             9            12
MultibyteSplitBenchmark/multibyte_split_simple/2/7/25/32768/manual_time                                                        +0.3761         +0.3744             9            12             9            12
MultibyteSplitBenchmark/multibyte_split_simple/1/4/1/1073741824/manual_time                                                    -0.1017         -0.1040          1681          1510          1681          1506
MultibyteSplitBenchmark/multibyte_split_simple/2/4/1/1073741824/manual_time                                                    -0.1817         -0.1817          4102          3357          4101          3356
MultibyteSplitBenchmark/multibyte_split_simple/0/7/25/1073741824/manual_time                                                   -0.0704         -0.0704           345           320           345           320
OVERALL_GEOMEAN                                                                                                                +0.0974         +0.0970             0             0             0             0
Groupby/BasicSumScan/100000000/manual_time                                                                                     +0.2947         +0.2947           135           175           135           175
CsvRead/decimal_file_input/35/0/manual_time                                                                                    +0.0508         +0.0511           151           159           151           159
ReductionScan/double_nulls/100000/manual_time                                                                                  +0.0721         +0.0609         22874         24524         40726         43206
OrcWrite/integral_file_output/30/0/32/1/0/manual_time                                                                          -0.1923         -0.0371           913           738           763           735
OrcWrite/integral_file_output/30/0/1/0/0/manual_time                                                                           +0.2668         -0.0297           754           955           722           701
OrcWrite/integral_file_output/30/1000/1/0/0/manual_time                                                                        -0.1090         -0.0510           986           878           725           688
OrcWrite/integral_file_output/30/0/32/0/0/manual_time                                                                          +0.0594         -0.0575           981          1039           738           696
OrcWrite/integral_buffer_output/30/1000/32/1/1/manual_time                                                                     +0.0882         +0.0885            85            92            85            92
OrcWrite/integral_buffer_output/30/1000/32/0/1/manual_time                                                                     -0.0966         -0.0955            98            89            98            89
OrcWrite/floats_file_output/31/0/1/1/0/manual_time                                                                             +0.0600         -0.0538           737           781           737           697
OrcWrite/floats_file_output/31/0/32/1/0/manual_time                                                                            +0.0670         +0.0021          1203          1284           715           717
OrcWrite/floats_file_output/31/0/1/0/0/manual_time                                                                             -0.2406         -0.0605           865           657           698           656
OrcWrite/floats_file_output/31/1000/1/0/0/manual_time                                                                          -0.2006         -0.0642          1122           897           706           660
OrcWrite/floats_file_output/31/0/32/0/0/manual_time                                                                            -0.1759         -0.0563          1131           932           708           668
OrcWrite/floats_file_output/31/1000/32/0/0/manual_time                                                                         -0.1600         -0.0640          1095           919           702           657
OrcWrite/decimal_file_output/35/1000/1/0/0/manual_time                                                                         +0.1622         -0.0865          1110          1290           588           537
OrcWrite/timestamps_file_output/33/0/1/0/0/manual_time                                                                         +0.1884         -0.0494           552           657           552           524
OrcWrite/timestamps_file_output/33/1000/1/0/0/manual_time                                                                      +0.1409         +0.0064           650           742           541           544
OrcWrite/list_file_output/24/0/1/0/0/manual_time                                                                               -0.0723         -0.0788           713           661           711           655
OrcWrite/list_file_output/24/1000/1/0/0/manual_time                                                                            +0.0935         -0.0468           696           761           689           657
Concatenate/BM_concatenate_nullable_false/4096/2/manual_time                                                                   +0.1055         +0.0672             0             0             0             0
Concatenate/BM_concatenate_nullable_false/512/8/manual_time                                                                    +0.0548         +0.0379             0             0             0             0
Concatenate/BM_concatenate_nullable_true/32768/8/manual_time                                                                   +0.0501         +0.0415             0             0             0             0
Concatenate/BM_concatenate_nullable_true/64/64/manual_time                                                                     +0.0570         +0.0400             0             0             0             0
Concatenate/BM_concatenate_nullable_true/512/64/manual_time                                                                    +0.0894         +0.0606             0             0             0             0
Concatenate/BM_concatenate_tables_nullable_false/4096/2/2/manual_time                                                          +0.1086         +0.0771             0             0             0             0
Concatenate/BM_concatenate_tables_nullable_false/512/8/2/manual_time                                                           +0.0920         +0.0828             0             0             0             0
Concatenate/BM_concatenate_tables_nullable_false/4096/8/2/manual_time                                                          +0.0549         +0.0502             0             0             0             0
Concatenate/BM_concatenate_tables_nullable_false/256/32/2/manual_time                                                          +0.1036         +0.1009             1             1             1             1
Concatenate/BM_concatenate_tables_nullable_false/512/32/2/manual_time                                                          +0.0827         +0.0813             1             1             1             1
Concatenate/BM_concatenate_tables_nullable_false/4096/32/2/manual_time                                                         +0.0788         +0.0768             1             1             1             1
Concatenate/BM_concatenate_tables_nullable_false/256/8/64/manual_time                                                          +0.0525         +0.0490             0             0             0             0
ParquetRead/integral_buffer_input/29/1000/1/0/1/manual_time                                                                    +0.0929         +0.0928            46            50            46            50
ParquetRead/timestamps_file_input/33/0/32/0/0/manual_time                                                                      -0.0896         -0.0897           127           116           128           116
OrcRead/integral_buffer_input/30/1000/1/0/1/manual_time                                                                        +0.1087         +0.1087            88            97            88            97
OrcRead/floats_file_input/31/0/1/1/0/manual_time                                                                               +0.1528         +0.1526           134           155           134           155
OrcRead/floats_buffer_input/31/1000/1/0/1/manual_time                                                                          +0.1349         +0.1350            75            85            75            85
OrcRead/decimal_buffer_input/35/0/1/0/1/manual_time                                                                            -0.1137         -0.1137           264           234           264           234
OrcRead/string_file_input/23/0/1/0/0/manual_time                                                                               -0.0750         -0.0750           162           150           162           150
OrcRead/string_file_input/23/0/32/0/0/manual_time                                                                              -0.0963         -0.0963           163           147           163           147
OrcRead/string_buffer_input/23/0/32/0/1/manual_time                                                                            -0.1586         -0.0139           114            96            97            96
OrcRead/list_file_input/24/1000/1/0/0/manual_time                                                                              +0.0515         +0.0517           176           185           176           185
OrcRead/list_file_input/24/0/32/0/0/manual_time                                                                                +0.0925         +0.0922           173           189           173           189
OrcRead/list_buffer_input/24/0/1/1/1/manual_time                                                                               -0.1288         -0.1291           139           121           139           121
BINARYOP<int32_t, TreeType::IMBALANCED_LEFT, true>/binaryop_int32_imbalanced_reuse/100000/2/manual_time                        +0.0533         +0.0381             0             0             0             0
COMPILED_BINARYOP/NULL_MAX_decimal32_decimal32_decimal32/100000/manual_time                                                    +0.0509         +0.0320            13            14            32            33
COMPILED_BINARYOP/NULL_MIN_timestamp_D_timestamp_s_timestamp_s/10000/manual_time                                               +0.0509         +0.0374            11            12            30            31
ParquetWrite/integral_file_output/29/0/1/1/0/manual_time                                                                       +0.3011         +0.0605           726           945           726           770
ParquetWrite/integral_file_output/29/1000/1/1/0/manual_time                                                                    +0.0812         +0.0804           311           336           310           335
ParquetWrite/integral_file_output/29/0/32/1/0/manual_time                                                                      +0.3497         +0.0714           948          1279           734           786
ParquetWrite/integral_file_output/29/1000/32/1/0/manual_time                                                                   +0.0559         +0.0558            62            65            62            65
ParquetWrite/integral_file_output/29/0/1/0/0/manual_time                                                                       +0.1829         +0.0679           702           830           700           748
ParquetWrite/integral_file_output/29/1000/1/0/0/manual_time                                                                    +0.0829         +0.0852           284           307           283           307
ParquetWrite/integral_file_output/29/0/32/0/0/manual_time                                                                      -0.3273         +0.0451          1063           715           683           714
ParquetWrite/integral_file_output/29/1000/32/0/0/manual_time                                                                   +0.0835         +0.0834            58            63            58            63
ParquetWrite/integral_buffer_output/29/0/1/1/1/manual_time                                                                     +0.0608         +0.0609           874           927           874           927
ParquetWrite/floats_file_output/31/0/1/1/0/manual_time                                                                         +0.1916         +0.0634           694           827           693           737
ParquetWrite/floats_file_output/31/1000/1/1/0/manual_time                                                                      +0.0560         +0.0553           217           229           217           229
ParquetWrite/floats_file_output/31/0/32/1/0/manual_time                                                                        +0.0517         +0.0546          1020          1073           721           760
ParquetWrite/floats_file_output/31/1000/32/1/0/manual_time                                                                     +0.1149         +0.0631            45            50            39            42
ParquetWrite/floats_file_output/31/0/1/0/0/manual_time                                                                         +0.1165         +0.0471           880           983           664           695
ParquetWrite/floats_file_output/31/1000/1/0/0/manual_time                                                                      +0.3996         +0.0038           237           331           219           219
ParquetWrite/floats_file_output/31/0/32/0/0/manual_time                                                                        +0.3109         +0.0673           666           873           666           710
ParquetWrite/floats_file_output/31/1000/32/0/0/manual_time                                                                     +0.0798         +0.0790            38            41            38            41
ParquetWrite/floats_buffer_output/31/1000/1/1/1/manual_time                                                                    +0.0710         +0.0709           208           223           208           223
ParquetWrite/floats_buffer_output/31/0/32/1/1/manual_time                                                                      +0.0677         +0.0673           732           782           732           782
ParquetWrite/floats_buffer_output/31/0/1/0/1/manual_time                                                                       +0.0663         +0.0659           682           728           682           727
ParquetWrite/floats_buffer_output/31/1000/1/0/1/manual_time                                                                    +0.0785         +0.0780           188           203           188           203
ParquetWrite/decimal_file_output/35/0/1/1/0/manual_time                                                                        +0.0655         +0.0636           277           296           277           295
ParquetWrite/decimal_file_output/35/1000/1/1/0/manual_time                                                                     +0.0657         +0.0634           242           258           242           257
ParquetWrite/decimal_file_output/35/0/32/1/0/manual_time                                                                       +0.1194         +0.0577           291           325           290           307
ParquetWrite/decimal_file_output/35/1000/32/1/0/manual_time                                                                    +0.0852         +0.0836           170           185           170           184
ParquetWrite/decimal_file_output/35/0/1/0/0/manual_time                                                                        +0.3802         +0.0372           346           477           325           337
ParquetWrite/decimal_file_output/35/1000/1/0/0/manual_time                                                                     +0.8101         +0.1543           374           677           373           431
ParquetWrite/decimal_file_output/35/0/32/0/0/manual_time                                                                       +1.4742         +0.0541           328           812           327           344
ParquetWrite/decimal_file_output/35/1000/32/0/0/manual_time                                                                    +0.5398         +0.0463           391           603           390           409
ParquetWrite/decimal_buffer_output/35/0/1/1/1/manual_time                                                                      +0.0571         +0.0570           301           318           301           318
ParquetWrite/decimal_buffer_output/35/1000/1/1/1/manual_time                                                                   +0.1955         +0.1953           253           302           253           302
ParquetWrite/decimal_buffer_output/35/0/32/1/1/manual_time                                                                     +0.0655         +0.0641           306           326           306           325
ParquetWrite/decimal_buffer_output/35/0/1/0/1/manual_time                                                                      +0.0595         +0.0591           381           404           381           404
ParquetWrite/decimal_buffer_output/35/1000/1/0/1/manual_time                                                                   +0.0650         +0.0643           515           548           515           548
ParquetWrite/decimal_buffer_output/35/0/32/0/1/manual_time                                                                     +0.0595         +0.0591           386           409           386           409
ParquetWrite/decimal_buffer_output/35/1000/32/0/1/manual_time                                                                  +0.0595         +0.0590           517           547           516           547
ParquetWrite/timestamps_file_output/33/0/1/1/0/manual_time                                                                     +0.0566         +0.0580           724           765           721           762
ParquetWrite/timestamps_file_output/33/1000/1/1/0/manual_time                                                                  -0.6229         -0.0258           526           198           203           198
ParquetWrite/timestamps_file_output/33/0/32/1/0/manual_time                                                                    -0.0955         +0.0444           928           840           733           766
ParquetWrite/timestamps_file_output/33/1000/32/1/0/manual_time                                                                 +0.0794         +0.0725            36            39            36            39
ParquetWrite/timestamps_file_output/33/0/1/0/0/manual_time                                                                     +0.2140         +0.0788           626           760           626           676
ParquetWrite/timestamps_file_output/33/1000/1/0/0/manual_time                                                                  +0.0778         +0.0760           174           188           174           187
ParquetWrite/timestamps_file_output/33/0/32/0/0/manual_time                                                                    +0.4682         +0.0758           636           934           636           684
ParquetWrite/timestamps_file_output/33/1000/32/0/0/manual_time                                                                 +0.0938         +0.0929            34            38            34            38
ParquetWrite/timestamps_buffer_output/33/0/1/1/1/manual_time                                                                   +0.0559         +0.0559           837           884           837           884
ParquetWrite/timestamps_buffer_output/33/0/1/0/1/manual_time                                                                   +0.0612         +0.0612           714           758           714           758
ParquetWrite/timestamps_buffer_output/33/1000/1/0/1/manual_time                                                                -0.2022         -0.2021           229           183           229           183
ParquetWrite/timestamps_buffer_output/33/0/32/0/1/manual_time                                                                  +0.0609         +0.0596           721           765           721           764
ParquetWrite/string_file_output/23/0/1/1/0/manual_time                                                                         +0.1674         +0.1004          1231          1437           869           956
ParquetWrite/string_file_output/23/1000/1/1/0/manual_time                                                                      +0.0748         +0.0675           124           133           107           114
ParquetWrite/string_file_output/23/0/32/1/0/manual_time                                                                        +0.0497         +0.0541          1197          1256           893           942
ParquetWrite/string_file_output/23/1000/32/1/0/manual_time                                                                     +0.0822         +0.0551            38            41            34            35
ParquetWrite/string_file_output/23/0/1/0/0/manual_time                                                                         +0.3477         +0.0668           892          1202           828           883
ParquetWrite/string_file_output/23/1000/1/0/0/manual_time                                                                      +0.1446         +0.1474            98           113            98           113
ParquetWrite/string_file_output/23/1000/32/0/0/manual_time                                                                     +0.0596         +0.0590            33            35            33            35
ParquetWrite/string_buffer_output/23/1000/1/0/1/manual_time                                                                    +0.0598         +0.0594           104           110           104           110
ParquetWrite/string_void_output/23/1000/32/0/2/manual_time                                                                     -0.3901         +0.0015            34            21            21            21
ParquetWrite/list_file_output/24/0/1/0/0/manual_time                                                                           -0.1313         +0.0831          1033           897           828           897
ParquetWrite/list_file_output/24/1000/1/0/0/manual_time                                                                        +0.0559         +0.0537           521           550           521           549
ParquetWrite/list_file_output/24/0/32/0/0/manual_time                                                                          -0.1942         -0.0129          1183           954           888           877
ContiguousSplit/1Gb512ColsValidity/1073741824/512/256/1/iterations:8/manual_time                                               +0.0660         +0.0659            30            32            30            32
AST<int32_t, TreeType::IMBALANCED_LEFT, false, true>/ast_int32_imbalanced_unique_nulls/1000000/1/manual_time                   +0.0540         +0.0453             0             0             0             0
AST<int32_t, TreeType::IMBALANCED_LEFT, false, true>/ast_int32_imbalanced_unique_nulls/10000000/1/manual_time                  +0.0657         +0.0642             1             1             1             1
AST<int32_t, TreeType::IMBALANCED_LEFT, false, true>/ast_int32_imbalanced_unique_nulls/100000000/1/manual_time                 +0.0704         +0.0702             8             9             8             9
AST<int32_t, TreeType::IMBALANCED_LEFT, true, true>/ast_int32_imbalanced_reuse_nulls/1000000/1/manual_time                     +0.0549         +0.0473             0             0             0             0
AST<int32_t, TreeType::IMBALANCED_LEFT, true, true>/ast_int32_imbalanced_reuse_nulls/10000000/1/manual_time                    +0.0745         +0.0723             1             1             1             1
AST<int32_t, TreeType::IMBALANCED_LEFT, true, true>/ast_int32_imbalanced_reuse_nulls/100000000/1/manual_time                   +0.0758         +0.0755             7             8             7             8
AST<double, TreeType::IMBALANCED_LEFT, false, true>/ast_double_imbalanced_unique_nulls/10000000/1/manual_time                  +0.0534         +0.0522             1             1             1             1
AST<double, TreeType::IMBALANCED_LEFT, false, true>/ast_double_imbalanced_unique_nulls/10000000/10/manual_time                 +0.0610         +0.0606             3             3             3             3
AST<double, TreeType::IMBALANCED_LEFT, false, true>/ast_double_imbalanced_unique_nulls/100000000/1/manual_time                 +0.0538         +0.0537             9            10             9            10
AST<double, TreeType::IMBALANCED_LEFT, false, true>/ast_double_imbalanced_unique_nulls/100000000/10/manual_time                +0.0579         +0.0579            26            27            26            27
Rank/nulls/1024/manual_time                                                                                                    +0.7608         +0.6280             0             0             0             0
Rank/nulls/4096/manual_time                                                                                                    +0.2739         +0.2437             0             0             0             0
Rank/nulls/32768/manual_time                                                                                                   +0.1599         +0.1469             0             0             0             0
Rank/nulls/262144/manual_time                                                                                                  +0.0813         +0.0793             0             0             0             0
Rank/nulls/2097152/manual_time                                                                                                 -0.4178         -0.4162             5             3             5             3
Rank/nulls/16777216/manual_time                                                                                                -0.3688         -0.3686            45            28            45            28
Rank/nulls/67108864/manual_time                                                                                                -0.3576         -0.3576           181           117           181           117
Sort<false>/unstable_no_nulls/1024/8/manual_time                                                                               +0.2655         +0.2554             1             1             1             1
Sort<false>/unstable_no_nulls/4096/8/manual_time                                                                               +0.3212         +0.3081             0             1             1             1
Sort<false>/unstable_no_nulls/32768/8/manual_time                                                                              +0.1430         +0.1395             1             1             1             1
Sort<false>/unstable_no_nulls/262144/8/manual_time                                                                             +0.1080         +0.1064             1             1             1             2
Sort<false>/unstable_no_nulls/2097152/8/manual_time                                                                            -0.0740         -0.0740            15            14            15            14
Sort<false>/unstable_no_nulls/16777216/8/manual_time                                                                           -0.0882         -0.0882           215           196           215           196
Sort<false>/unstable_no_nulls/67108864/8/manual_time                                                                           -0.0848         -0.0848          1170          1071          1170          1071
Sort<true>/stable_no_nulls/1024/8/manual_time                                                                                  +0.2656         +0.2553             1             1             1             1
Sort<true>/stable_no_nulls/4096/8/manual_time                                                                                  +0.3215         +0.3081             0             1             1             1
Sort<true>/stable_no_nulls/32768/8/manual_time                                                                                 +0.1427         +0.1392             1             1             1             1
Sort<true>/stable_no_nulls/262144/8/manual_time                                                                                +0.1082         +0.1066             1             1             1             2
Sort<true>/stable_no_nulls/2097152/8/manual_time                                                                               -0.0737         -0.0735            15            14            15            14
Sort<true>/stable_no_nulls/16777216/8/manual_time                                                                              -0.0889         -0.0887           215           196           215           196
Sort<true>/stable_no_nulls/67108864/8/manual_time                                                                              -0.0848         -0.0846          1170          1071          1170          1071
Sort<false>/unstable/1024/1/manual_time                                                                                        +0.8698         +0.7017             0             0             0             0
Sort<false>/unstable/4096/1/manual_time                                                                                        +0.2846         +0.2506             0             0             0             0
Sort<false>/unstable/32768/1/manual_time                                                                                       +0.1640         +0.1492             0             0             0             0
Sort<false>/unstable/262144/1/manual_time                                                                                      +0.0818         +0.0794             0             0             0             0
Sort<false>/unstable/2097152/1/manual_time                                                                                     -0.4431         -0.4414             5             3             5             3
Sort<false>/unstable/16777216/1/manual_time                                                                                    -0.4282         -0.4280            38            22            38            22
Sort<false>/unstable/67108864/1/manual_time                                                                                    -0.4168         -0.4168           155            90           155            90
Sort<false>/unstable/1024/8/manual_time                                                                                        +0.2213         +0.2142             1             1             1             1
Sort<false>/unstable/4096/8/manual_time                                                                                        +0.2784         +0.2687             1             1             1             1
Sort<false>/unstable/32768/8/manual_time                                                                                       +0.1115         +0.1094             1             1             1             1
Sort<false>/unstable/262144/8/manual_time                                                                                      +0.1030         +0.1016             2             2             2             2
Sort<true>/stable/1024/1/manual_time                                                                                           +0.8684         +0.7016             0             0             0             0
Sort<true>/stable/4096/1/manual_time                                                                                           +0.2860         +0.2517             0             0             0             0
Sort<true>/stable/32768/1/manual_time                                                                                          +0.1638         +0.1497             0             0             0             0
Sort<true>/stable/262144/1/manual_time                                                                                         +0.0817         +0.0798             0             0             0             0
Sort<true>/stable/2097152/1/manual_time                                                                                        -0.4431         -0.4415             5             3             5             3
Sort<true>/stable/16777216/1/manual_time                                                                                       -0.4279         -0.4277            38            22            38            22
Sort<true>/stable/67108864/1/manual_time                                                                                       -0.4176         -0.4176           155            90           155            90
Sort<true>/stable/1024/8/manual_time                                                                                           +0.2211         +0.2138             1             1             1             1
Sort<true>/stable/4096/8/manual_time                                                                                           +0.2808         +0.2706             1             1             1             1
Sort<true>/stable/32768/8/manual_time                                                                                          +0.1117         +0.1096             1             1             1             1
Sort<true>/stable/262144/8/manual_time                                                                                         +0.1029         +0.1013             2             2             2             2
Sort/strings/262144/manual_time                                                                                                -0.0781         -0.0777             4             4             4             4
Scatter/double_coalesce_x/2048/2/manual_time                                                                                   +0.0614         +0.0472         27988         29705         46846         49057
Scatter/double_coalesce_x/32768/2/manual_time                                                                                  +0.0637         +0.0522         30209         32133         47991         50496
Scatter/double_coalesce_x/131072/2/manual_time                                                                                 +0.0558         +0.0444         37821         39932         54883         57321
Scatter/double_coalesce_x/1024/4/manual_time                                                                                   +0.0811         +0.0663         53699         58053         72617         77434
Scatter/double_coalesce_x/2048/4/manual_time                                                                                   +0.0535         +0.0468         56040         59038         74848         78348
Scatter/double_coalesce_x/4096/4/manual_time                                                                                   +0.0514         +0.0449         56187         59073         74930         78291
Scatter/double_coalesce_x/8192/4/manual_time                                                                                   +0.0516         +0.0452         56747         59674         75140         78533
Scatter/double_coalesce_x/16384/4/manual_time                                                                                  +0.0520         +0.0479         57412         60400         75292         78895
Scatter/double_coalesce_x/32768/4/manual_time                                                                                  +0.0610         +0.0544         58151         61699         75398         79499
Scatter/double_coalesce_x/1024/8/manual_time                                                                                   +0.0526         +0.0486        110089        115882        129032        135301
Scatter/double_coalesce_x/2048/8/manual_time                                                                                   +0.0546         +0.0506        110864        116921        129784        136352
Scatter/double_coalesce_x/4096/8/manual_time                                                                                   +0.0612         +0.0554        110733        117506        129306        136465
Scatter/double_coalesce_x/8192/8/manual_time                                                                                   +0.0635         +0.0579        111614        118703        129727        137233
Scatter/double_coalesce_x/16384/8/manual_time                                                                                  +0.0665         +0.0604        111918        119366        129458        137275
Scatter/double_coalesce_x/32768/8/manual_time                                                                                  +0.0545         +0.0543        114993        121260        131951        139113
Scatter/double_coalesce_x/65536/8/manual_time                                                                                  +0.0619         +0.0560        119167        126540        136092        143717
Scatter/double_coalesce_o/2048/2/manual_time                                                                                   +0.0542         +0.0418         29300         30889         48197         50211
Scatter/double_coalesce_o/32768/2/manual_time                                                                                  +0.0556         +0.0464         32069         33851         49914         52229
Scatter/double_coalesce_o/1024/4/manual_time                                                                                   +0.0684         +0.0569         56480         60346         75468         79761
Scatter/double_coalesce_o/8192/4/manual_time                                                                                   +0.0572         +0.0497         59554         62960         77958         81834
Scatter/double_coalesce_o/16384/4/manual_time                                                                                  +0.0572         +0.0525         59839         63260         77704         81781
Scatter/double_coalesce_o/32768/4/manual_time                                                                                  +0.0564         +0.0514         62493         66015         79779         83883
Scatter/double_coalesce_o/1024/8/manual_time                                                                                   +0.0566         +0.0515        112968        119360        131925        138723
Scatter/double_coalesce_o/2048/8/manual_time                                                                                   +0.0565         +0.0518        113151        119548        132028        138870
Scatter/double_coalesce_o/4096/8/manual_time                                                                                   +0.0594         +0.0545        114566        121374        133078        140333
Scatter/double_coalesce_o/8192/8/manual_time                                                                                   +0.0587         +0.0534        116146        122963        134282        141449
Scatter/double_coalesce_o/16384/8/manual_time                                                                                  +0.0663         +0.0597        116445        124161        134038        142046
Scatter/double_coalesce_o/32768/8/manual_time                                                                                  +0.0555         +0.0566        122258        129043        139016        146891
Scatter/double_coalesce_o/65536/8/manual_time                                                                                  +0.0553         +0.0498        133373        140749        150403        157896
Quantiles/no_nulls/65536/4/1/manual_time                                                                                       +0.1394         +0.1370             1             1             1             1
Quantiles/no_nulls/262144/4/1/manual_time                                                                                      +0.1372         +0.1348             1             1             1             1
Quantiles/no_nulls/1048576/4/1/manual_time                                                                                     -0.0944         -0.0943             6             5             6             5
Quantiles/no_nulls/4194304/4/1/manual_time                                                                                     -0.1068         -0.1070            35            32            35            32
Quantiles/no_nulls/16777216/4/1/manual_time                                                                                    -0.0882         -0.0884           210           191           210           191
Quantiles/no_nulls/67108864/4/1/manual_time                                                                                    -0.0855         -0.0858          1148          1050          1148          1050
Quantiles/no_nulls/65536/8/1/manual_time                                                                                       +0.1312         +0.1290             1             1             1             1
Quantiles/no_nulls/262144/8/1/manual_time                                                                                      +0.1058         +0.1044             1             2             1             2
Quantiles/no_nulls/4194304/8/1/manual_time                                                                                     -0.0982         -0.0984            37            33            37            33
Quantiles/no_nulls/16777216/8/1/manual_time                                                                                    -0.0886         -0.0888           215           196           215           196
Quantiles/no_nulls/67108864/8/1/manual_time                                                                                    -0.0866         -0.0868          1173          1071          1173          1071
Quantiles/no_nulls/65536/4/4/manual_time                                                                                       +0.1413         +0.1385             1             1             1             1
Quantiles/no_nulls/262144/4/4/manual_time                                                                                      +0.1355         +0.1332             1             1             1             1
Quantiles/no_nulls/1048576/4/4/manual_time                                                                                     -0.0944         -0.0943             6             5             6             5
Quantiles/no_nulls/4194304/4/4/manual_time                                                                                     -0.1061         -0.1063            35            32            35            32
Quantiles/no_nulls/16777216/4/4/manual_time                                                                                    -0.0877         -0.0879           210           191           210           191
Quantiles/no_nulls/67108864/4/4/manual_time                                                                                    -0.0863         -0.0865          1149          1050          1149          1049
Quantiles/no_nulls/65536/8/4/manual_time                                                                                       +0.1328…
stanleytsang-amd added a commit to ROCm/hipCUB that referenced this pull request Apr 12, 2022
* test_hipcub_device_radix_sort.cpp Correctly test -NaN.

* `test_utils::native_half` -NaN to `float` fix

* `hipcub::WarpExchange` interface to `::rocprim::warp_exchange`

* Fix after review

* Default CUDA architecture is 53 to fix __half

* Apply 1 suggestion(s) to 1 file(s)

* Added NVGPU_TARGETS to gitlab-ci

* Update .gitlab-ci.yml file

* Changes from [PR346](NVIDIA/cub#346)

* Add deprecation warnings.

* Update of deprecated statement.

* Adding constants from [PR418](NVIDIA/cub#418).

* Fix deprecation warnings.

* Fix a forgotten deprecation warning.

* Fix deprecation warnings.

* Fix deprecation warnings for nvcc.

* Replace '__host__ __device__' by 'HIPCUB_HOST_DEVICE'

* Added Cuda standard

* Bumped referenced CUB and thrust version to 1.16

* Download thrust in test/extra

* Added the interface for UniqueByKey

* Added test for UniqueByKey

* Added benchmark for UniqueByKey

* Add UniqueByKey interface

* Fix alignment of UniqueByKey parameters

* Use 'unsigned int' instead of a one element vector for selected_count_output in UniqueByKey benchmark

* Update interface

* Update tests, add test for int64_t size

* Update CUB interface

* Apply 1 suggestion(s) to 1 file(s)

* Add interfaces for subtract

* Ignore deprecation warnings from rocPRIM for flags API

* Add deprecation warnings for Flags API

* Ignore deprecation warnings for Flags API tests

* Fix Subtract interfaces

* Fix SubtractRightPartial not using the right method

* Add benchmark for AdjacentDifference (Subtract)

* Add test for AdjacentDifference (Subtract)

* Use 'HIPCUB_HOST_DEVICE' macro

* Fix a typo

* Fix interfaces of Subtract not matching the CUB one

* Update the tests and benchmarks to match the fixed Subtract interfaces

* Fix to use temp_storage_ in subtract call

* Fix the tests of Subtract to work with the CUB interfaces

* Add the macros to ignore warnings in config.hpp and remove them from the block_adjacent_difference file and from the tests

* Device adjacent difference CUB backend

* New thread operators [skip ci]

* Test device adjacent difference [skip ci]

* Device adjacent difference rocPRIM backend

* Added new headers to the hipcub.hpp-s

* Benchmark for device adjacent difference

* Added missing thread operators

* Updated changelog for CUB 1.16

* Updating changelog for hipCUB 1.16 in next release

Co-authored-by: Vince <vince@streamhpc.com>
Co-authored-by: Gergely Mészáros <gergely@streamhpc.com>
Co-authored-by: Théo Battrel <theo@streamhpc.com>
Co-authored-by: Balint Soproni <balint@streamhpc.com>
Co-authored-by: Stanley Tsang <stanley.tsang@amd.com>
Labels
P1: should have Necessary, but not critical. testing: gpuCI in progress Started gpuCI testing. testing: internal ci in progress Currently testing on internal NVIDIA CI (DVS). type: bug: functional Does not work as intended.