ARROW-17135: [C++] Reduce code size in compute/kernels/scalar_compare.cc (#13654)

This "leaner" implementation reduces the generated code size of this C++ file from 2307768 bytes to 1192608 bytes in gcc 10.3.0. The benchmarks are also faster (on my avx2 laptop):

before

```
-----------------------------------------------------------------------------------------------
Benchmark                                Time             CPU   Iterations UserCounters...
-----------------------------------------------------------------------------------------------
GreaterArrayArrayInt64/32768/10000    32.1 us         32.1 us        21533 items_per_second=1020.16M/s null_percent=0.01 size=32.768k
GreaterArrayArrayInt64/32768/100      32.1 us         32.1 us        21603 items_per_second=1019.27M/s null_percent=1 size=32.768k
GreaterArrayArrayInt64/32768/10       32.1 us         32.1 us        21479 items_per_second=1020.82M/s null_percent=10 size=32.768k
GreaterArrayArrayInt64/32768/2        32.0 us         32.0 us        21468 items_per_second=1023.12M/s null_percent=50 size=32.768k
GreaterArrayArrayInt64/32768/1        32.3 us         32.3 us        21720 items_per_second=1013.44M/s null_percent=100 size=32.768k
GreaterArrayArrayInt64/32768/0        31.6 us         31.6 us        21828 items_per_second=1036.94M/s null_percent=0 size=32.768k
GreaterArrayScalarInt64/32768/10000   20.8 us         20.8 us        33461 items_per_second=1.57238G/s null_percent=0.01 size=32.768k
GreaterArrayScalarInt64/32768/100     20.9 us         20.9 us        33625 items_per_second=1.56611G/s null_percent=1 size=32.768k
GreaterArrayScalarInt64/32768/10      20.8 us         20.8 us        33553 items_per_second=1.57338G/s null_percent=10 size=32.768k
GreaterArrayScalarInt64/32768/2       20.9 us         20.9 us        33348 items_per_second=1.5687G/s null_percent=50 size=32.768k
GreaterArrayScalarInt64/32768/1       20.9 us         20.9 us        33419 items_per_second=1.56879G/s null_percent=100 size=32.768k
GreaterArrayScalarInt64/32768/0       20.5 us         20.5 us        34116 items_per_second=1.59837G/s null_percent=0 size=32.768k
```

after

```
-----------------------------------------------------------------------------------------------
Benchmark                                Time             CPU   Iterations UserCounters...
-----------------------------------------------------------------------------------------------
GreaterArrayArrayInt64/32768/10000    18.1 us         18.1 us        38751 items_per_second=1.81199G/s null_percent=0.01 size=32.768k
GreaterArrayArrayInt64/32768/100      17.5 us         17.5 us        39374 items_per_second=1.86821G/s null_percent=1 size=32.768k
GreaterArrayArrayInt64/32768/10       19.0 us         19.0 us        33941 items_per_second=1.72066G/s null_percent=10 size=32.768k
GreaterArrayArrayInt64/32768/2        18.0 us         18.0 us        39589 items_per_second=1.81817G/s null_percent=50 size=32.768k
GreaterArrayArrayInt64/32768/1        18.1 us         18.1 us        39061 items_per_second=1.80719G/s null_percent=100 size=32.768k
GreaterArrayArrayInt64/32768/0        17.5 us         17.5 us        39813 items_per_second=1.87031G/s null_percent=0 size=32.768k
GreaterArrayScalarInt64/32768/10000   16.3 us         16.3 us        42281 items_per_second=2.01525G/s null_percent=0.01 size=32.768k
GreaterArrayScalarInt64/32768/100     16.5 us         16.5 us        42266 items_per_second=1.98195G/s null_percent=1 size=32.768k
GreaterArrayScalarInt64/32768/10      16.5 us         16.5 us        41872 items_per_second=1.98615G/s null_percent=10 size=32.768k
GreaterArrayScalarInt64/32768/2       16.3 us         16.3 us        42130 items_per_second=2.00447G/s null_percent=50 size=32.768k
GreaterArrayScalarInt64/32768/1       16.2 us         16.2 us        42391 items_per_second=2.02296G/s null_percent=100 size=32.768k
GreaterArrayScalarInt64/32768/0       15.9 us         15.9 us        43498 items_per_second=2.0614G/s null_percent=0 size=32.768k
```

Authored-by: Wes McKinney <wesm@apache.org>
Signed-off-by: Wes McKinney <wesm@apache.org>
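
For readers who want the headline ratios, the figures quoted in the commit message can be reduced to a size reduction and a per-benchmark speedup. The sketch below is only arithmetic on those quoted numbers (the `null_percent=0` rows); the helper names `reduction` and `speedup` are illustrative and not part of Arrow or of the benchmark tool.

```python
# Figures copied from the commit message; nothing here is measured independently.
size_before = 2_307_768  # bytes of generated code before the change
size_after = 1_192_608   # bytes of generated code after the change

# (before, after) time per iteration in microseconds, null_percent=0 rows.
times_us = {
    "GreaterArrayArrayInt64": (31.6, 17.5),
    "GreaterArrayScalarInt64": (20.5, 15.9),
}


def reduction(before, after):
    """Fractional decrease relative to the 'before' value."""
    return (before - after) / before


def speedup(before, after):
    """How many times faster 'after' is than 'before'."""
    return before / after


size_reduction = reduction(size_before, size_after)
speedups = {name: speedup(b, a) for name, (b, a) in times_us.items()}

print(f"code size reduction: {size_reduction:.1%}")
for name, s in speedups.items():
    print(f"{name}: {s:.2f}x faster")
```

On these numbers the generated code shrinks by roughly 48%, and the array-array comparison gains more (about 1.8x) than the array-scalar one (about 1.3x).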