@sterrettm2 (Contributor) commented Nov 14, 2023

This replaces the specialized sorting networks used for argsort and argselect with generic ones.
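
For readers unfamiliar with the term, here is a minimal scalar sketch of what a "generic" sorting network for argsort can look like: one routine parameterized by length instead of one hand-written network per size. It is not the code from this PR. The names `compare_exchange` and `argsort_network` are hypothetical, the network shown is a plain bitonic network, and it assumes the input length is a power of two; the real implementation works on SIMD registers.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <numeric>
#include <utility>
#include <vector>

// Hypothetical helper (not from the PR): order one (key, index) pair so that
// lo_key <= hi_key, moving each index together with its key.
template <typename T>
void compare_exchange(T &lo_key, T &hi_key,
                      std::uint64_t &lo_idx, std::uint64_t &hi_idx)
{
    if (hi_key < lo_key) {
        std::swap(lo_key, hi_key);
        std::swap(lo_idx, hi_idx);
    }
}

// Scalar bitonic sorting network. The sequence of compare-exchange steps
// depends only on the length n, never on the data, which is what makes it a
// sorting network. Assumption: n is a power of two.
// Returns idx such that keys[idx[0]] <= keys[idx[1]] <= ... for the input keys.
template <typename T>
std::vector<std::uint64_t> argsort_network(std::vector<T> keys)
{
    const std::size_t n = keys.size();
    assert(n != 0 && (n & (n - 1)) == 0);

    std::vector<std::uint64_t> idx(n);
    std::iota(idx.begin(), idx.end(), std::uint64_t{0});

    for (std::size_t k = 2; k <= n; k *= 2) {         // size of merged blocks
        for (std::size_t j = k / 2; j > 0; j /= 2) {  // compare distance
            for (std::size_t i = 0; i < n; ++i) {
                const std::size_t l = i ^ j;          // partner position
                if (l <= i) continue;                 // visit each pair once
                if ((i & k) == 0) {
                    // ascending block: smaller key ends up at position i
                    compare_exchange(keys[i], keys[l], idx[i], idx[l]);
                } else {
                    // descending block: smaller key ends up at position l
                    compare_exchange(keys[l], keys[i], idx[l], idx[i]);
                }
            }
        }
    }
    return idx;
}
```

Calling `argsort_network(std::vector<int>{5, 1, 4, 2})` returns the positions of the input elements in ascending key order. Because the same routine serves every power-of-two length, no per-size network has to be written by hand.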

Benchmark                                                                Time             CPU      Time Old      Time New       CPU Old       CPU New
-----------------------------------------------------------------------------------------------------------------------------------------------------
[simdargsort vs. simdargsort]/smallrandom_128/int64_t                 -0.2268         -0.2268           979           757           979           757
[simdargsort vs. simdargsort]/smallrandom_256/int64_t                 -0.1771         -0.1771          2162          1779          2162          1779
[simdargsort vs. simdargsort]/smallrandom_512/int64_t                 +0.0221         +0.0221          4720          4824          4720          4824
[simdargsort vs. simdargsort]/smallrandom_1k/int64_t                  +0.0169         +0.0169         10394         10570         10394         10570
[simdargsort vs. simdargsort]/random_5k/int64_t                       +0.0441         +0.0441         64957         67823         64956         67821
[simdargsort vs. simdargsort]/random_100k/int64_t                     -0.0611         -0.0611       2104894       1976340       2104825       1976240
[simdargsort vs. simdargsort]/random_1m/int64_t                       -0.0297         -0.0297      36122543      35049727      36119086      35047376
[simdargsort vs. simdargsort]/random_10m/int64_t                      -0.0130         -0.0130     989874079     977017100     989793993     976911876
[simdargsort vs. simdargsort]/sorted_10k/int64_t                      +0.0448         +0.0447        127882        133610        127877        133598
[simdargsort vs. simdargsort]/constant_10k/int64_t                    +0.0061         +0.0060         10237         10300         10237         10299
[simdargsort vs. simdargsort]/reverse_10k/int64_t                     +0.0242         +0.0242        129422        132550        129417        132547
[simdargsort vs. simdargsort]/smallrandom_128/uint64_t                -0.2294         -0.2294           956           737           956           737
[simdargsort vs. simdargsort]/smallrandom_256/uint64_t                -0.1774         -0.1774          2163          1780          2163          1779
[simdargsort vs. simdargsort]/smallrandom_512/uint64_t                +0.0198         +0.0198          4727          4821          4727          4820
[simdargsort vs. simdargsort]/smallrandom_1k/uint64_t                 +0.0136         +0.0136         10398         10540         10398         10539
[simdargsort vs. simdargsort]/random_5k/uint64_t                      +0.0432         +0.0432         65002         67810         64999         67805
[simdargsort vs. simdargsort]/random_100k/uint64_t                    -0.0626         -0.0627       2119238       1986559       2119105       1986337
[simdargsort vs. simdargsort]/random_1m/uint64_t                      -0.0324         -0.0325      36010610      34842490      36009336      34839311
[simdargsort vs. simdargsort]/random_10m/uint64_t                     +0.0001         +0.0001     986263244     986389263     986203810     986289484
[simdargsort vs. simdargsort]/sorted_10k/uint64_t                     +0.0369         +0.0368        129103        133861        129097        133854
[simdargsort vs. simdargsort]/constant_10k/uint64_t                   +0.0005         +0.0004         10254         10258         10254         10258
[simdargsort vs. simdargsort]/reverse_10k/uint64_t                    +0.0223         +0.0223        129633        132523        129630        132515
[simdargsort vs. simdargsort]/smallrandom_128/double                  -0.3802         -0.3802          1067           661          1067           661
[simdargsort vs. simdargsort]/smallrandom_256/double                  -0.2691         -0.2691          2105          1539          2105          1539
[simdargsort vs. simdargsort]/smallrandom_512/double                  -0.2012         -0.2012          5305          4238          5305          4238
[simdargsort vs. simdargsort]/smallrandom_1k/double                   -0.2099         -0.2099         11220          8865         11220          8865
[simdargsort vs. simdargsort]/random_5k/double                        -0.0314         -0.0315         61942         59994         61942         59991
[simdargsort vs. simdargsort]/random_100k/double                      -0.1191         -0.1191       2124507       1871469       2124391       1871378
[simdargsort vs. simdargsort]/random_1m/double                        -0.0640         -0.0639      35722260      33437254      35719215      33435097
[simdargsort vs. simdargsort]/random_10m/double                       -0.0224         -0.0224     995601589     973342561     995503060     973238965
[simdargsort vs. simdargsort]/sorted_10k/double                       -0.0639         -0.0639        128381        120179        128375        120170
[simdargsort vs. simdargsort]/constant_10k/double                     -0.0098         -0.0098          9907          9810          9906          9809
[simdargsort vs. simdargsort]/reverse_10k/double                      -0.0716         -0.0716        127821        118672        127820        118661
[simdargsort vs. simdargsort]/smallrandom_128/int32_t                 -0.3001         -0.3001           865           605           865           605
[simdargsort vs. simdargsort]/smallrandom_256/int32_t                 -0.2997         -0.2997          2027          1419          2027          1419
[simdargsort vs. simdargsort]/smallrandom_512/int32_t                 +0.0103         +0.0103          3912          3953          3912          3952
[simdargsort vs. simdargsort]/smallrandom_1k/int32_t                  +0.0595         +0.0594          9350          9906          9350          9906
[simdargsort vs. simdargsort]/random_5k/int32_t                       +0.0372         +0.0372         52619         54577         52617         54575
[simdargsort vs. simdargsort]/random_100k/int32_t                     -0.0684         -0.0684       1794433       1671619       1794330       1671540
[simdargsort vs. simdargsort]/random_1m/int32_t                       -0.0530         -0.0530      28613481      27098352      28612343      27095195
[simdargsort vs. simdargsort]/random_10m/int32_t                      -0.0208         -0.0209     797978264     781393996     797925336     781259870
[simdargsort vs. simdargsort]/sorted_10k/int32_t                      +0.0227         +0.0226        108189        110644        108183        110631
[simdargsort vs. simdargsort]/constant_10k/int32_t                    -0.0000         -0.0000          9716          9716          9716          9716
[simdargsort vs. simdargsort]/reverse_10k/int32_t                     +0.0040         +0.0039        109137        109569        109133        109561
[simdargsort vs. simdargsort]/smallrandom_128/uint32_t                -0.3001         -0.3002           868           607           868           607
[simdargsort vs. simdargsort]/smallrandom_256/uint32_t                -0.3157         -0.3157          2062          1411          2062          1411
[simdargsort vs. simdargsort]/smallrandom_512/uint32_t                +0.0020         +0.0019          3933          3941          3933          3941
[simdargsort vs. simdargsort]/smallrandom_1k/uint32_t                 +0.0546         +0.0545          9375          9887          9375          9886
[simdargsort vs. simdargsort]/random_5k/uint32_t                      +0.0313         +0.0312         52852         54505         52851         54498
[simdargsort vs. simdargsort]/random_100k/uint32_t                    -0.0703         -0.0703       1799887       1673327       1799761       1673204
[simdargsort vs. simdargsort]/random_1m/uint32_t                      -0.0538         -0.0538      28622200      27082860      28618952      27079005
[simdargsort vs. simdargsort]/random_10m/uint32_t                     -0.0204         -0.0205     798566107     782265459     798509383     782179069
[simdargsort vs. simdargsort]/sorted_10k/uint32_t                     +0.0139         +0.0138        108912        110420        108909        110408
[simdargsort vs. simdargsort]/constant_10k/uint32_t                   +0.0061         +0.0061          9699          9758          9698          9757
[simdargsort vs. simdargsort]/reverse_10k/uint32_t                    +0.0018         +0.0018        109384        109580        109380        109576
[simdargsort vs. simdargsort]/smallrandom_128/float                   -0.3024         -0.3024           947           661           947           661
[simdargsort vs. simdargsort]/smallrandom_256/float                   -0.2949         -0.2949          2142          1510          2142          1510
[simdargsort vs. simdargsort]/smallrandom_512/float                   -0.0816         -0.0816          4527          4158          4527          4158
[simdargsort vs. simdargsort]/smallrandom_1k/float                    -0.1366         -0.1367         10853          9370         10853          9369
[simdargsort vs. simdargsort]/random_5k/float                         -0.1214         -0.1215         60037         52747         60035         52743
[simdargsort vs. simdargsort]/random_100k/float                       -0.1314         -0.1314       1997298       1734898       1997125       1734756
[simdargsort vs. simdargsort]/random_1m/float                         -0.0882         -0.0882      30312812      27639748      30310567      27635754
[simdargsort vs. simdargsort]/random_10m/float                        -0.0330         -0.0330     818804097     791797623     818754779     791744774
[simdargsort vs. simdargsort]/sorted_10k/float                        -0.0623         -0.0623        124070        116346        124065        116340
[simdargsort vs. simdargsort]/constant_10k/float                      +0.0050         +0.0050          9982         10031          9981         10031
[simdargsort vs. simdargsort]/reverse_10k/float                       -0.0842         -0.0842        125760        115176        125756        115169
OVERALL_GEOMEAN                                                       -0.0794         -0.0795             0             0             0             0
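
A note on reading the table: the layout matches the output of Google Benchmark's comparison tooling, so the sketch below assumes the Time and CPU columns are the relative change `(new - old) / old`, where negative values mean the new code is faster. The first row is consistent with that reading: `(757 - 979) / 979` is about `-0.2268`. The function names are hypothetical, and the geometric-mean helper is only one plausible way to arrive at a summary row like `OVERALL_GEOMEAN`.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Relative change between two timings; negative means the new run is faster.
double relative_change(double old_time, double new_time)
{
    assert(old_time > 0.0);
    return (new_time - old_time) / old_time;
}

// Geometric mean of the per-benchmark new/old ratios, expressed as a change
// relative to 1.0. Assumption: this mirrors how a geomean summary row is
// derived; it is not taken from the PR or from the tool's source.
double geomean_change(const std::vector<double> &old_times,
                      const std::vector<double> &new_times)
{
    assert(!old_times.empty() && old_times.size() == new_times.size());
    double log_sum = 0.0;
    for (std::size_t i = 0; i < old_times.size(); ++i) {
        assert(old_times[i] > 0.0 && new_times[i] > 0.0);
        log_sum += std::log(new_times[i] / old_times[i]);
    }
    return std::exp(log_sum / static_cast<double>(old_times.size())) - 1.0;
}
```

A geometric mean is the usual choice for a summary like this because it weights a 2x speedup and a 2x slowdown symmetrically, regardless of how long each benchmark runs in absolute terms.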

@r-devulap (Member)

Rebased onto the main branch.

@r-devulap (Member)

Rebased to add new CI coverage.

@r-devulap (Member) left a comment

LGTM! Thanks @sterrettm2 :)
