@Dandandan
Contributor

@Dandandan Dandandan commented Jan 22, 2026

Which issue does this PR close?

Rationale for this change

StringViewArray comparison showed up as fairly hot in profiles.

lt scalar StringViewArray
                        time:   [18.997 ms 19.046 ms 19.122 ms]
                        change: [−46.019% −45.725% −45.407%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 2 outliers among 100 measurements (2.00%)
  1 (1.00%) high mild
  1 (1.00%) high severe

eq scalar StringViewArray 4 bytes
                        time:   [4.1758 ms 4.2382 ms 4.3220 ms]
                        change: [−65.309% −64.797% −64.047%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 5 outliers among 100 measurements (5.00%)
  1 (1.00%) high mild
  4 (4.00%) high severe

eq scalar StringViewArray 6 bytes
                        time:   [4.1705 ms 4.1841 ms 4.2011 ms]
                        change: [−65.384% −65.155% −64.926%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 2 outliers among 100 measurements (2.00%)
  1 (1.00%) high mild
  1 (1.00%) high severe

eq scalar StringViewArray 13 bytes
                        time:   [6.6653 ms 6.6951 ms 6.7368 ms]
                        change: [−45.014% −44.679% −44.284%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 5 outliers among 100 measurements (5.00%)
  3 (3.00%) high mild
  2 (2.00%) high severe

eq StringViewArray StringViewArray
                        time:   [9.0877 ms 9.1225 ms 9.1626 ms]
                        change: [−29.753% −29.416% −29.044%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 2 outliers among 100 measurements (2.00%)
  1 (1.00%) high mild
  1 (1.00%) high severe

eq StringViewArray StringViewArray inlined bytes
                        time:   [4.7619 ms 4.7771 ms 4.8014 ms]
                        change: [−63.522% −63.288% −63.041%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 19 outliers among 100 measurements (19.00%)
  12 (12.00%) high mild
  7 (7.00%) high severe

lt StringViewArray StringViewArray inlined bytes
                        time:   [12.219 ms 12.308 ms 12.445 ms]
                        change: [−24.196% −23.586% −22.697%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 36 outliers among 100 measurements (36.00%)
  2 (2.00%) low severe
  13 (13.00%) low mild
  2 (2.00%) high mild
  19 (19.00%) high severe

eq long same prefix strings StringViewArray
                        time:   [639.53 µs 642.16 µs 645.97 µs]
                        change: [−27.425% −26.845% −26.240%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 1 outliers among 100 measurements (1.00%)
  1 (1.00%) high severe

neq long same prefix strings StringViewArray
                        time:   [641.40 µs 642.99 µs 644.94 µs]
                        change: [−27.323% −26.696% −26.066%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 9 outliers among 100 measurements (9.00%)
  7 (7.00%) high mild
  2 (2.00%) high severe

lt long same prefix strings StringViewArray
                        time:   [674.98 µs 679.64 µs 687.89 µs]
                        change: [−22.103% −21.388% −20.647%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 3 outliers among 100 measurements (3.00%)
  1 (1.00%) high mild
  2 (2.00%) high severe
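
The benchmark names above distinguish 4-, 6-, and 13-byte scalars and "inlined bytes" because the Arrow Utf8View layout stores strings of at most 12 bytes directly inside the 16-byte view, zero-padded after a 4-byte length. The following is a minimal standalone sketch of why equality on such inlined values can reduce to one integer comparison. It is not the code from this PR, and the helper names `make_inline_view` and `inline_views_equal` are hypothetical.

```rust
/// Pack a short string (at most 12 bytes) into a 16-byte view, following the
/// Arrow Utf8View layout for inlined values: a 4-byte little-endian length
/// followed by the string bytes, zero-padded to 12 bytes.
/// Returns `None` for longer strings, which are stored out of line instead.
/// (Hypothetical helper for illustration only.)
fn make_inline_view(s: &[u8]) -> Option<u128> {
    if s.len() > 12 {
        return None;
    }
    let mut bytes = [0u8; 16];
    bytes[..4].copy_from_slice(&(s.len() as u32).to_le_bytes());
    bytes[4..4 + s.len()].copy_from_slice(s);
    Some(u128::from_le_bytes(bytes))
}

/// Two inlined views hold equal strings exactly when the whole 16-byte views
/// are equal, because the length is part of the view and the padding is zero.
/// (Hypothetical helper for illustration only.)
fn inline_views_equal(a: u128, b: u128) -> bool {
    a == b
}
```

A 13-byte string no longer fits inline, so comparing it generally requires following the view into a data buffer, which may be part of why the 13-byte case improves less than the 4- and 6-byte cases.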

What changes are included in this PR?

Are these changes tested?

Are there any user-facing changes?

@Dandandan
Contributor Author

run benchmark comparison_kernels

@alamb-ghbot

🤖 ./gh_compare_arrow.sh Running
Linux aal-dev 6.14.0-1018-gcp #19~24.04.1-Ubuntu SMP Wed Sep 24 23:23:09 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux
Comparing speedup_comparison_string_view (e08bff9) to ebace17 diff
BENCH_NAME=comparison_kernels
BENCH_COMMAND=cargo bench --features=arrow,async,test_common,experimental --bench comparison_kernels
BENCH_FILTER=
BENCH_BRANCH_NAME=speedup_comparison_string_view
Results will be posted here when complete

@github-actions bot added the "arrow" label (Changes to the arrow crate) on Jan 22, 2026
@alamb-ghbot

🤖: Benchmark completed


group                                                                                                    main                                   speedup_comparison_string_view
-----                                                                                                    ----                                   ------------------------------
StringArray: regexp_matches_utf8 scalar benchmarks/regexp_matches_utf8 scalar complex                    1.00      2.9±0.04ms        ? ?/sec    1.04      3.0±0.11ms        ? ?/sec
StringArray: regexp_matches_utf8 scalar benchmarks/regexp_matches_utf8 scalar contains                   1.01      3.3±0.08ms        ? ?/sec    1.00      3.2±0.07ms        ? ?/sec
StringArray: regexp_matches_utf8 scalar benchmarks/regexp_matches_utf8 scalar ends with                  1.00      2.7±0.10ms        ? ?/sec    1.27      3.4±1.49ms        ? ?/sec
StringArray: regexp_matches_utf8 scalar benchmarks/regexp_matches_utf8 scalar starts with                1.01      2.2±0.05ms        ? ?/sec    1.00      2.1±0.05ms        ? ?/sec
StringViewArray: regexp_matches_utf8view scalar benchmarks/regexp_matches_utf8view scalar complex        1.00      3.0±0.06ms        ? ?/sec    1.05      3.1±0.06ms        ? ?/sec
StringViewArray: regexp_matches_utf8view scalar benchmarks/regexp_matches_utf8view scalar contains       1.00      3.3±0.06ms        ? ?/sec    1.06      3.5±0.08ms        ? ?/sec
StringViewArray: regexp_matches_utf8view scalar benchmarks/regexp_matches_utf8view scalar ends with      1.00      2.7±0.12ms        ? ?/sec    1.36      3.6±0.47ms        ? ?/sec
StringViewArray: regexp_matches_utf8view scalar benchmarks/regexp_matches_utf8view scalar starts with    1.00      2.2±0.10ms        ? ?/sec    1.11      2.5±0.38ms        ? ?/sec
eq Float32                                                                                               1.00     44.3±0.44µs        ? ?/sec    1.00     44.4±0.33µs        ? ?/sec
eq Int32                                                                                                 1.00     44.7±1.26µs        ? ?/sec    1.00     44.5±1.11µs        ? ?/sec
eq MonthDayNano                                                                                          1.00     90.5±0.48µs        ? ?/sec    1.02     92.6±3.34µs        ? ?/sec
eq StringArray StringArray                                                                               1.00     29.2±0.48ms        ? ?/sec    1.00     29.2±0.40ms        ? ?/sec
eq StringViewArray StringViewArray                                                                       1.00     25.1±0.19ms        ? ?/sec    1.01     25.3±0.44ms        ? ?/sec
eq StringViewArray StringViewArray inlined bytes                                                         1.00     25.4±0.59ms        ? ?/sec    1.10     27.9±0.16ms        ? ?/sec
eq dictionary[10] string[4])                                                                             1.00    843.2±2.84µs        ? ?/sec    1.03    867.3±6.40µs        ? ?/sec
eq long same prefix strings StringArray                                                                  1.03    565.2±9.43µs        ? ?/sec    1.00    547.9±7.55µs        ? ?/sec
eq long same prefix strings StringViewArray                                                              1.22    969.7±5.31µs        ? ?/sec    1.00    792.7±5.62µs        ? ?/sec
eq scalar Float32                                                                                        1.00     44.3±0.55µs        ? ?/sec    1.00     44.3±0.38µs        ? ?/sec
eq scalar Int32                                                                                          1.01     44.6±1.01µs        ? ?/sec    1.00     44.4±0.57µs        ? ?/sec
eq scalar MonthDayNano                                                                                   1.23     71.3±0.27µs        ? ?/sec    1.00     58.2±1.32µs        ? ?/sec
eq scalar StringArray                                                                                    1.00     25.4±0.17ms        ? ?/sec    1.06     26.8±0.39ms        ? ?/sec
eq scalar StringViewArray 13 bytes                                                                       1.39     22.3±0.22ms        ? ?/sec    1.00     16.1±0.18ms        ? ?/sec
eq scalar StringViewArray 4 bytes                                                                        1.55     22.3±0.47ms        ? ?/sec    1.00     14.4±0.29ms        ? ?/sec
eq scalar StringViewArray 6 bytes                                                                        1.55     22.3±0.46ms        ? ?/sec    1.00     14.4±0.23ms        ? ?/sec
eq_dyn_utf8_scalar dictionary[10] string[4])                                                             1.00     77.8±1.04µs        ? ?/sec    1.00     77.8±0.81µs        ? ?/sec
gt Float32                                                                                               1.01     57.5±0.48µs        ? ?/sec    1.00     56.9±1.09µs        ? ?/sec
gt Int32                                                                                                 1.00     44.5±1.39µs        ? ?/sec    1.00     44.4±0.47µs        ? ?/sec
gt scalar Float32                                                                                        1.00     45.9±0.43µs        ? ?/sec    1.00     46.0±0.49µs        ? ?/sec
gt scalar Int32                                                                                          1.00     44.2±0.26µs        ? ?/sec    1.00     44.2±0.65µs        ? ?/sec
gt_eq Float32                                                                                            1.00     57.0±0.59µs        ? ?/sec    1.00     57.1±0.60µs        ? ?/sec
gt_eq Int32                                                                                              1.00     44.3±0.40µs        ? ?/sec    1.00     44.3±0.42µs        ? ?/sec
gt_eq scalar Float32                                                                                     1.00     46.6±1.06µs        ? ?/sec    1.00     46.5±0.42µs        ? ?/sec
gt_eq scalar Int32                                                                                       1.00     44.4±1.27µs        ? ?/sec    1.00     44.3±0.81µs        ? ?/sec
gt_eq_dyn_utf8_scalar scalar dictionary[10] string[4])                                                   1.00     78.0±0.72µs        ? ?/sec    1.00     78.0±0.65µs        ? ?/sec
ilike_utf8 scalar complex                                                                                1.01      3.7±0.09ms        ? ?/sec    1.00      3.7±0.08ms        ? ?/sec
ilike_utf8 scalar contains                                                                               1.03      4.7±0.12ms        ? ?/sec    1.00      4.5±0.11ms        ? ?/sec
ilike_utf8 scalar ends with                                                                              1.05  1181.2±81.40µs        ? ?/sec    1.00  1123.7±40.40µs        ? ?/sec
ilike_utf8 scalar equals                                                                                 1.09   707.7±35.81µs        ? ?/sec    1.00   649.1±29.66µs        ? ?/sec
ilike_utf8 scalar starts with                                                                            1.03  1040.1±56.03µs        ? ?/sec    1.00  1008.7±22.40µs        ? ?/sec
ilike_utf8_scalar_dyn dictionary[10] string[4])                                                          1.00     78.5±1.16µs        ? ?/sec    1.00     78.4±0.76µs        ? ?/sec
like_utf8 scalar complex                                                                                 1.00      2.9±0.06ms        ? ?/sec    1.01      2.9±0.07ms        ? ?/sec
like_utf8 scalar contains                                                                                1.00  1868.2±30.06µs        ? ?/sec    1.00  1865.9±28.71µs        ? ?/sec
like_utf8 scalar ends with                                                                               1.04   393.0±22.18µs        ? ?/sec    1.00    378.3±8.76µs        ? ?/sec
like_utf8 scalar equals                                                                                  1.00     93.3±2.29µs        ? ?/sec    1.00     92.9±0.64µs        ? ?/sec
like_utf8 scalar starts with                                                                             1.03   336.0±22.00µs        ? ?/sec    1.00   327.8±11.04µs        ? ?/sec
like_utf8_scalar_dyn dictionary[10] string[4])                                                           1.00     78.5±1.60µs        ? ?/sec    1.00     78.4±1.23µs        ? ?/sec
like_utf8view scalar complex                                                                             1.00    233.9±2.22ms        ? ?/sec    1.00    233.3±2.12ms        ? ?/sec
like_utf8view scalar contains                                                                            1.00    156.4±1.12ms        ? ?/sec    1.00    157.0±1.76ms        ? ?/sec
like_utf8view scalar ends with 13 bytes                                                                  1.00     43.2±0.29ms        ? ?/sec    1.01     43.5±0.40ms        ? ?/sec
like_utf8view scalar ends with 4 bytes                                                                   1.00     44.0±0.35ms        ? ?/sec    1.01     44.5±0.37ms        ? ?/sec
like_utf8view scalar ends with 6 bytes                                                                   1.00     43.7±0.53ms        ? ?/sec    1.01     44.2±0.24ms        ? ?/sec
like_utf8view scalar equals                                                                              1.00     32.6±0.42ms        ? ?/sec    1.01     32.9±0.68ms        ? ?/sec
like_utf8view scalar starts with 13 bytes                                                                1.00     44.6±0.75ms        ? ?/sec    1.00     44.6±0.50ms        ? ?/sec
like_utf8view scalar starts with 4 bytes                                                                 1.00     33.0±0.30ms        ? ?/sec    1.00     33.2±0.18ms        ? ?/sec
like_utf8view scalar starts with 6 bytes                                                                 1.04     47.4±4.70ms        ? ?/sec    1.00     45.6±0.28ms        ? ?/sec
long same prefix strings like_utf8 scalar complex                                                        1.00  1822.0±18.93µs        ? ?/sec    1.00  1817.8±30.55µs        ? ?/sec
long same prefix strings like_utf8 scalar contains                                                       1.00      4.4±0.05ms        ? ?/sec    1.00      4.4±0.02ms        ? ?/sec
long same prefix strings like_utf8 scalar ends with                                                      1.00  1974.8±55.42µs        ? ?/sec    1.00  1965.4±11.82µs        ? ?/sec
long same prefix strings like_utf8 scalar equals                                                         1.02   658.2±36.26µs        ? ?/sec    1.00    647.1±4.31µs        ? ?/sec
long same prefix strings like_utf8 scalar starts with                                                    1.00      2.3±0.05ms        ? ?/sec    1.00      2.3±0.05ms        ? ?/sec
long same prefix strings like_utf8view scalar complex                                                    1.00  1869.0±59.82µs        ? ?/sec    1.00  1866.0±18.10µs        ? ?/sec
long same prefix strings like_utf8view scalar contains                                                   1.00      4.4±0.19ms        ? ?/sec    1.09      4.8±0.07ms        ? ?/sec
long same prefix strings like_utf8view scalar ends with                                                  1.00  1991.0±36.72µs        ? ?/sec    1.00  1995.4±75.79µs        ? ?/sec
long same prefix strings like_utf8view scalar equals                                                     1.02   706.8±15.16µs        ? ?/sec    1.00    692.8±4.19µs        ? ?/sec
long same prefix strings like_utf8view scalar starts with                                                1.00      2.2±0.01ms        ? ?/sec    1.00      2.2±0.01ms        ? ?/sec
lt Float32                                                                                               1.00     56.4±1.38µs        ? ?/sec    1.01     57.2±1.52µs        ? ?/sec
lt Int32                                                                                                 1.01     44.9±2.41µs        ? ?/sec    1.00     44.4±0.25µs        ? ?/sec
lt StringViewArray StringViewArray inlined bytes                                                         1.03     31.8±0.43ms        ? ?/sec    1.00     31.0±0.13ms        ? ?/sec
lt long same prefix strings StringArray                                                                  1.00    682.4±4.86µs        ? ?/sec    1.00    679.7±5.19µs        ? ?/sec
lt long same prefix strings StringViewArray                                                              1.02    931.0±5.76µs        ? ?/sec    1.00   909.8±24.54µs        ? ?/sec
lt scalar Float32                                                                                        1.00     46.5±0.19µs        ? ?/sec    1.00     46.5±0.60µs        ? ?/sec
lt scalar Int32                                                                                          1.00     44.2±0.26µs        ? ?/sec    1.00     44.4±0.26µs        ? ?/sec
lt scalar StringArray                                                                                    1.01     41.7±0.30ms        ? ?/sec    1.00     41.5±0.27ms        ? ?/sec
lt scalar StringViewArray                                                                                1.00     52.2±0.22ms        ? ?/sec    1.12     58.3±0.35ms        ? ?/sec
lt_eq Float32                                                                                            1.00     56.4±1.48µs        ? ?/sec    1.02     57.4±2.04µs        ? ?/sec
lt_eq Int32                                                                                              1.00     44.3±0.34µs        ? ?/sec    1.01     44.6±0.81µs        ? ?/sec
lt_eq scalar Float32                                                                                     1.00     46.1±0.38µs        ? ?/sec    1.00     45.9±1.16µs        ? ?/sec
lt_eq scalar Int32                                                                                       1.00     44.2±0.39µs        ? ?/sec    1.00     44.4±1.11µs        ? ?/sec
neq Float32                                                                                              1.00     44.3±0.33µs        ? ?/sec    1.00     44.4±0.22µs        ? ?/sec
neq Int32                                                                                                1.01     44.7±0.80µs        ? ?/sec    1.00     44.5±0.55µs        ? ?/sec
neq long same prefix strings StringArray                                                                 1.02    565.9±7.01µs        ? ?/sec    1.00   555.4±23.12µs        ? ?/sec
neq long same prefix strings StringViewArray                                                             1.23    970.0±9.32µs        ? ?/sec    1.00    791.9±9.51µs        ? ?/sec
neq scalar Float32                                                                                       1.00     44.4±0.20µs        ? ?/sec    1.00     44.3±0.49µs        ? ?/sec
neq scalar Int32                                                                                         1.01     44.6±1.52µs        ? ?/sec    1.00     44.3±0.20µs        ? ?/sec
nilike_utf8 scalar complex                                                                               1.00      3.8±0.15ms        ? ?/sec    1.00      3.8±0.15ms        ? ?/sec
nilike_utf8 scalar contains                                                                              1.02      4.6±0.13ms        ? ?/sec    1.00      4.5±0.06ms        ? ?/sec
nilike_utf8 scalar ends with                                                                             1.00  1119.1±56.26µs        ? ?/sec    1.00  1123.1±51.45µs        ? ?/sec
nilike_utf8 scalar equals                                                                                1.06   665.0±31.13µs        ? ?/sec    1.00   625.2±15.91µs        ? ?/sec
nilike_utf8 scalar starts with                                                                           1.00  1013.6±45.74µs        ? ?/sec    1.03  1047.5±58.77µs        ? ?/sec
nlike_utf8 scalar complex                                                                                1.08      3.2±0.23ms        ? ?/sec    1.00      2.9±0.10ms        ? ?/sec
nlike_utf8 scalar contains                                                                               1.04  1943.1±42.20µs        ? ?/sec    1.00  1877.0±25.60µs        ? ?/sec
nlike_utf8 scalar ends with                                                                              1.00   383.8±10.76µs        ? ?/sec    1.00    384.6±7.86µs        ? ?/sec
nlike_utf8 scalar equals                                                                                 1.00     93.1±0.77µs        ? ?/sec    1.00     93.0±0.81µs        ? ?/sec
nlike_utf8 scalar starts with                                                                            1.08   358.3±21.36µs        ? ?/sec    1.00    332.0±9.92µs        ? ?/sec

@Dandandan
Contributor Author

eq scalar StringViewArray 13 bytes                                                                       1.39     22.3±0.22ms        ? ?/sec    1.00     16.1±0.18ms        ? ?/sec
eq scalar StringViewArray 4 bytes                                                                        1.55     22.3±0.47ms        ? ?/sec    1.00     14.4±0.29ms        ? ?/sec
eq scalar StringViewArray 6 bytes                                                                        1.55     22.3±0.46ms        ? ?/sec    1.00     14.4±0.23ms        ? ?/sec

Seems like a good direction.
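
For readers unfamiliar with the table format: each number in front of a timing is that run's time relative to the faster of the two columns, so 1.00 marks the faster side. A small sketch of that arithmetic, using the rounded timings shown in the rows above (the tool works from unrounded measurements, so the last digit can differ); the function names are hypothetical:

```rust
/// Express two timings as factors of the faster one, so the faster side
/// is 1.00 and the slower side shows how many times longer it took.
/// (Hypothetical helper for illustration only.)
fn relative_factors(main_ms: f64, branch_ms: f64) -> (f64, f64) {
    let fastest = main_ms.min(branch_ms);
    (main_ms / fastest, branch_ms / fastest)
}

/// Round to two decimal places, matching the precision shown in the table.
fn round2(x: f64) -> f64 {
    (x * 100.0).round() / 100.0
}
```

With the "13 bytes" row, `relative_factors(22.3, 16.1)` gives roughly (1.39, 1.00), and with the "4 bytes" row, `relative_factors(22.3, 14.4)` gives roughly (1.55, 1.00), matching the quoted lines.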

@alamb
Contributor

alamb commented Jan 22, 2026

@zhuqi-lucas as you have spent time optimizing string view comparison in other PRs, do you have time to help review this PR as well?

@Dandandan
Contributor Author

run benchmark comparison_kernels

@alamb-ghbot

🤖 ./gh_compare_arrow.sh Running
Linux aal-dev 6.14.0-1018-gcp #19~24.04.1-Ubuntu SMP Wed Sep 24 23:23:09 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux
Comparing speedup_comparison_string_view (32be328) to ebace17 diff
BENCH_NAME=comparison_kernels
BENCH_COMMAND=cargo bench --features=arrow,async,test_common,experimental,object_store --bench comparison_kernels
BENCH_FILTER=
BENCH_BRANCH_NAME=speedup_comparison_string_view
Results will be posted here when complete

@zhuqi-lucas
Contributor

> @zhuqi-lucas as you have spent time optimizing string view comparison in other PRs, do you have time to help review this PR as well?

Nice improvement. I plan to review this PR in a few days.

@alamb-ghbot

🤖: Benchmark completed


group                                                                                                    main                                   speedup_comparison_string_view
-----                                                                                                    ----                                   ------------------------------
StringArray: regexp_matches_utf8 scalar benchmarks/regexp_matches_utf8 scalar complex                    1.00      2.8±0.06ms        ? ?/sec    1.11      3.1±0.10ms        ? ?/sec
StringArray: regexp_matches_utf8 scalar benchmarks/regexp_matches_utf8 scalar contains                   1.00      3.1±0.03ms        ? ?/sec    1.04      3.2±0.08ms        ? ?/sec
StringArray: regexp_matches_utf8 scalar benchmarks/regexp_matches_utf8 scalar ends with                  1.00      2.6±0.05ms        ? ?/sec    1.98      5.1±1.22ms        ? ?/sec
StringArray: regexp_matches_utf8 scalar benchmarks/regexp_matches_utf8 scalar starts with                1.00      2.1±0.04ms        ? ?/sec    1.04      2.2±0.11ms        ? ?/sec
StringViewArray: regexp_matches_utf8view scalar benchmarks/regexp_matches_utf8view scalar complex        1.00      2.8±0.06ms        ? ?/sec    1.02      2.9±0.03ms        ? ?/sec
StringViewArray: regexp_matches_utf8view scalar benchmarks/regexp_matches_utf8view scalar contains       1.00      3.1±0.12ms        ? ?/sec    1.07      3.3±0.13ms        ? ?/sec
StringViewArray: regexp_matches_utf8view scalar benchmarks/regexp_matches_utf8view scalar ends with      1.00      2.6±0.07ms        ? ?/sec    1.02      2.7±0.03ms        ? ?/sec
StringViewArray: regexp_matches_utf8view scalar benchmarks/regexp_matches_utf8view scalar starts with    1.00      2.2±0.04ms        ? ?/sec    1.33      2.9±1.08ms        ? ?/sec
eq Float32                                                                                               1.00     44.3±0.30µs        ? ?/sec    1.00     44.2±0.24µs        ? ?/sec
eq Int32                                                                                                 1.00     44.4±0.75µs        ? ?/sec    1.00     44.2±0.22µs        ? ?/sec
eq MonthDayNano                                                                                          1.00     92.3±3.90µs        ? ?/sec    1.01     93.5±4.36µs        ? ?/sec
eq StringArray StringArray                                                                               1.00     31.5±0.32ms        ? ?/sec    1.08     34.0±0.37ms        ? ?/sec
eq StringViewArray StringViewArray                                                                       1.00     27.0±0.23ms        ? ?/sec    1.02     27.5±0.39ms        ? ?/sec
eq StringViewArray StringViewArray inlined bytes                                                         1.14     25.6±0.39ms        ? ?/sec    1.00     22.4±0.27ms        ? ?/sec
eq dictionary[10] string[4])                                                                             1.00    802.7±4.65µs        ? ?/sec    1.04    834.5±7.73µs        ? ?/sec
eq long same prefix strings StringArray                                                                  1.00    567.3±5.99µs        ? ?/sec    1.01   573.5±11.24µs        ? ?/sec
eq long same prefix strings StringViewArray                                                              1.17    982.4±5.31µs        ? ?/sec    1.00   843.2±16.19µs        ? ?/sec
eq scalar Float32                                                                                        1.00     44.1±0.12µs        ? ?/sec    1.00     44.2±0.30µs        ? ?/sec
eq scalar Int32                                                                                          1.00     44.2±0.18µs        ? ?/sec    1.00     44.2±0.20µs        ? ?/sec
eq scalar MonthDayNano                                                                                   1.00     71.3±0.48µs        ? ?/sec    1.01     71.7±0.74µs        ? ?/sec
eq scalar StringArray                                                                                    1.00     26.5±0.57ms        ? ?/sec    1.14     30.2±0.41ms        ? ?/sec
eq scalar StringViewArray 13 bytes                                                                       1.12     23.9±0.17ms        ? ?/sec    1.00     21.3±0.36ms        ? ?/sec
eq scalar StringViewArray 4 bytes                                                                        1.36     24.0±0.36ms        ? ?/sec    1.00     17.7±0.21ms        ? ?/sec
eq scalar StringViewArray 6 bytes                                                                        1.37     24.1±0.20ms        ? ?/sec    1.00     17.6±0.32ms        ? ?/sec
eq_dyn_utf8_scalar dictionary[10] string[4])                                                             1.00     77.8±0.79µs        ? ?/sec    1.00     78.0±0.95µs        ? ?/sec
gt Float32                                                                                               1.04     59.1±6.95µs        ? ?/sec    1.00     57.1±0.81µs        ? ?/sec
gt Int32                                                                                                 1.00     44.3±0.21µs        ? ?/sec    1.00     44.3±0.29µs        ? ?/sec
gt scalar Float32                                                                                        1.00     45.9±0.48µs        ? ?/sec    1.00     45.9±1.23µs        ? ?/sec
gt scalar Int32                                                                                          1.00     44.2±0.22µs        ? ?/sec    1.00     44.3±0.61µs        ? ?/sec
gt_eq Float32                                                                                            1.00     57.1±0.48µs        ? ?/sec    1.00     57.1±0.82µs        ? ?/sec
gt_eq Int32                                                                                              1.00     44.3±0.31µs        ? ?/sec    1.00     44.3±0.29µs        ? ?/sec
gt_eq scalar Float32                                                                                     1.00     46.4±0.35µs        ? ?/sec    1.00     46.4±0.39µs        ? ?/sec
gt_eq scalar Int32                                                                                       1.00     44.2±0.57µs        ? ?/sec    1.00     44.2±0.26µs        ? ?/sec
gt_eq_dyn_utf8_scalar scalar dictionary[10] string[4])                                                   1.00     77.8±0.39µs        ? ?/sec    1.00     77.9±0.45µs        ? ?/sec
ilike_utf8 scalar complex                                                                                1.00      3.6±0.09ms        ? ?/sec    2.18      7.8±0.28ms        ? ?/sec
ilike_utf8 scalar contains                                                                               1.00      4.3±0.04ms        ? ?/sec    1.31      5.7±0.07ms        ? ?/sec
ilike_utf8 scalar ends with                                                                              1.00  1110.0±90.92µs        ? ?/sec    1.54  1711.4±88.65µs        ? ?/sec
ilike_utf8 scalar equals                                                                                 1.00   621.6±46.45µs        ? ?/sec    1.70  1059.3±66.59µs        ? ?/sec
ilike_utf8 scalar starts with                                                                            1.00   969.9±44.01µs        ? ?/sec    1.68  1627.6±102.09µs        ? ?/sec
ilike_utf8_scalar_dyn dictionary[10] string[4])                                                          1.00     78.6±1.76µs        ? ?/sec    1.00     78.4±1.42µs        ? ?/sec
like_utf8 scalar complex                                                                                 1.00      2.9±0.05ms        ? ?/sec    2.27      6.5±0.08ms        ? ?/sec
like_utf8 scalar contains                                                                                1.00  1756.6±16.90µs        ? ?/sec    1.38      2.4±0.03ms        ? ?/sec
like_utf8 scalar ends with                                                                               1.00   407.6±15.26µs        ? ?/sec    1.15   469.6±28.04µs        ? ?/sec
like_utf8 scalar equals                                                                                  1.00    107.7±1.43µs        ? ?/sec    1.00    108.0±0.88µs        ? ?/sec
like_utf8 scalar starts with                                                                             1.00   349.0±19.50µs        ? ?/sec    1.08   378.6±14.58µs        ? ?/sec
like_utf8_scalar_dyn dictionary[10] string[4])                                                           1.00     78.4±0.68µs        ? ?/sec    1.00     78.2±1.11µs        ? ?/sec
like_utf8view scalar complex                                                                             1.00    235.2±2.49ms        ? ?/sec    1.01    237.1±2.68ms        ? ?/sec
like_utf8view scalar contains                                                                            1.00    161.3±0.76ms        ? ?/sec    1.01    163.1±1.71ms        ? ?/sec
like_utf8view scalar ends with 13 bytes                                                                  1.00     50.1±0.43ms        ? ?/sec    1.04     52.3±0.64ms        ? ?/sec
like_utf8view scalar ends with 4 bytes                                                                   1.00     51.6±0.29ms        ? ?/sec    1.05     54.4±0.54ms        ? ?/sec
like_utf8view scalar ends with 6 bytes                                                                   1.00     51.1±0.47ms        ? ?/sec    1.05     53.7±0.54ms        ? ?/sec
like_utf8view scalar equals                                                                              1.00     34.5±0.29ms        ? ?/sec    1.04     35.8±0.19ms        ? ?/sec
like_utf8view scalar starts with 13 bytes                                                                1.00     44.9±0.43ms        ? ?/sec    1.06     47.7±0.62ms        ? ?/sec
like_utf8view scalar starts with 4 bytes                                                                 1.00     28.1±0.15ms        ? ?/sec    1.04     29.3±0.23ms        ? ?/sec
like_utf8view scalar starts with 6 bytes                                                                 1.00     45.4±0.26ms        ? ?/sec    1.06     48.2±0.42ms        ? ?/sec
long same prefix strings like_utf8 scalar complex                                                        1.00  1724.7±20.71µs        ? ?/sec    1.03  1782.0±23.25µs        ? ?/sec
long same prefix strings like_utf8 scalar contains                                                       1.00      4.3±0.10ms        ? ?/sec    1.05      4.6±0.08ms        ? ?/sec
long same prefix strings like_utf8 scalar ends with                                                      1.00  1965.7±31.50µs        ? ?/sec    1.03      2.0±0.02ms        ? ?/sec
long same prefix strings like_utf8 scalar equals                                                         1.00   649.2±23.62µs        ? ?/sec    1.03   669.9±18.55µs        ? ?/sec
long same prefix strings like_utf8 scalar starts with                                                    1.00      2.2±0.05ms        ? ?/sec    1.10      2.4±0.07ms        ? ?/sec
long same prefix strings like_utf8view scalar complex                                                    1.00  1787.3±12.46µs        ? ?/sec    1.04  1862.1±41.26µs        ? ?/sec
long same prefix strings like_utf8view scalar contains                                                   1.00      4.4±0.02ms        ? ?/sec    1.05      4.6±0.05ms        ? ?/sec
long same prefix strings like_utf8view scalar ends with                                                  1.00  1999.1±23.01µs        ? ?/sec    1.03      2.1±0.03ms        ? ?/sec
long same prefix strings like_utf8view scalar equals                                                     1.00   695.3±11.65µs        ? ?/sec    1.01   702.4±11.13µs        ? ?/sec
long same prefix strings like_utf8view scalar starts with                                                1.03      2.3±0.02ms        ? ?/sec    1.00      2.3±0.02ms        ? ?/sec
lt Float32                                                                                               1.01     57.2±0.90µs        ? ?/sec    1.00     56.9±0.44µs        ? ?/sec
lt Int32                                                                                                 1.00     44.4±0.66µs        ? ?/sec    1.00     44.5±1.40µs        ? ?/sec
lt StringViewArray StringViewArray inlined bytes                                                         1.00     31.9±0.57ms        ? ?/sec    1.01     32.3±0.39ms        ? ?/sec
lt long same prefix strings StringArray                                                                  1.00    699.5±8.71µs        ? ?/sec    1.00   702.9±10.89µs        ? ?/sec
lt long same prefix strings StringViewArray                                                              1.13    981.9±7.37µs        ? ?/sec    1.00   866.9±13.45µs        ? ?/sec
lt scalar Float32                                                                                        1.00     46.5±0.61µs        ? ?/sec    1.00     46.5±0.26µs        ? ?/sec
lt scalar Int32                                                                                          1.00     44.3±0.37µs        ? ?/sec    1.00     44.4±1.06µs        ? ?/sec
lt scalar StringArray                                                                                    1.00     44.3±0.49ms        ? ?/sec    1.05     46.5±0.83ms        ? ?/sec
lt scalar StringViewArray                                                                                1.43     53.6±1.45ms        ? ?/sec    1.00     37.5±0.63ms        ? ?/sec
lt_eq Float32                                                                                            1.00     56.4±2.50µs        ? ?/sec    1.01     57.2±0.63µs        ? ?/sec
lt_eq Int32                                                                                              1.00     44.3±0.21µs        ? ?/sec    1.00     44.3±0.24µs        ? ?/sec
lt_eq scalar Float32                                                                                     1.00     45.8±0.30µs        ? ?/sec    1.00     46.0±1.02µs        ? ?/sec
lt_eq scalar Int32                                                                                       1.00     44.2±0.51µs        ? ?/sec    1.00     44.3±0.58µs        ? ?/sec
neq Float32                                                                                              1.00     44.4±0.79µs        ? ?/sec    1.00     44.3±0.84µs        ? ?/sec
neq Int32                                                                                                1.00     44.3±0.51µs        ? ?/sec    1.01     44.7±0.85µs        ? ?/sec
neq long same prefix strings StringArray                                                                 1.00    561.2±6.54µs        ? ?/sec    1.01    564.9±7.43µs        ? ?/sec
neq long same prefix strings StringViewArray                                                             1.15    984.5±9.31µs        ? ?/sec    1.00   855.6±21.02µs        ? ?/sec
neq scalar Float32                                                                                       1.00     44.1±0.26µs        ? ?/sec    1.00     44.3±0.38µs        ? ?/sec
neq scalar Int32                                                                                         1.00     44.2±0.42µs        ? ?/sec    1.00     44.2±0.17µs        ? ?/sec
nilike_utf8 scalar complex                                                                               1.00      3.6±0.08ms        ? ?/sec    1.10      3.9±0.20ms        ? ?/sec
nilike_utf8 scalar contains                                                                              1.00      4.4±0.05ms        ? ?/sec    1.27      5.5±0.08ms        ? ?/sec
nilike_utf8 scalar ends with                                                                             1.00  1098.8±51.73µs        ? ?/sec    1.06  1168.4±72.11µs        ? ?/sec
nilike_utf8 scalar equals                                                                                1.00   594.8±46.68µs        ? ?/sec    1.69  1006.8±55.65µs        ? ?/sec
nilike_utf8 scalar starts with                                                                           1.00  1003.5±47.32µs        ? ?/sec    1.12  1122.3±51.96µs        ? ?/sec
nlike_utf8 scalar complex                                                                                1.00      2.9±0.04ms        ? ?/sec    2.25      6.6±0.06ms        ? ?/sec
nlike_utf8 scalar contains                                                                               1.00  1754.1±26.40µs        ? ?/sec    1.38      2.4±0.04ms        ? ?/sec
nlike_utf8 scalar ends with                                                                              1.00   408.5±21.11µs        ? ?/sec    1.12   456.5±11.09µs        ? ?/sec
nlike_utf8 scalar equals                                                                                 1.00    108.5±1.92µs        ? ?/sec    1.00    108.4±1.46µs        ? ?/sec
nlike_utf8 scalar starts with                                                                            1.00   345.5±12.46µs        ? ?/sec    1.12    385.9±8.47µs        ? ?/sec

@Dandandan
Contributor Author

run benchmark comparison_kernels

@alamb-ghbot

🤖 ./gh_compare_arrow.sh gh_compare_arrow.sh Running
Linux aal-dev 6.14.0-1018-gcp #19~24.04.1-Ubuntu SMP Wed Sep 24 23:23:09 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux
Comparing speedup_comparison_string_view (fdf9829) to ebace17 diff
BENCH_NAME=comparison_kernels
BENCH_COMMAND=cargo bench --features=arrow,async,test_common,experimental,object_store --bench comparison_kernels
BENCH_FILTER=
BENCH_BRANCH_NAME=speedup_comparison_string_view
Results will be posted here when complete

@Dandandan
Contributor Author

run benchmark comparison_kernels

@Dandandan
Contributor Author

run benchmark arrow_reader_clickbench

@Dandandan Dandandan changed the title Speed up string view comparison Speed up string view comparison (up to 5x) Jan 23, 2026
@Dandandan Dandandan changed the title Speed up string view comparison (up to 5x) Speed up string view comparison (up to 3x) Jan 23, 2026
@Dandandan
Contributor Author

@zhuqi-lucas as you have spent time optimizing string view comparison in other PRs, do you have time to help review this PR as well?

Nice improvement, I plan to review this PR in a few days.

I found some more time for improvement 🚀

@alamb-ghbot

🤖: Benchmark completed

Details

group                                                                                                    main                                   speedup_comparison_string_view
-----                                                                                                    ----                                   ------------------------------
StringArray: regexp_matches_utf8 scalar benchmarks/regexp_matches_utf8 scalar complex                    1.00      2.8±0.05ms        ? ?/sec    1.11      3.2±0.13ms        ? ?/sec
StringArray: regexp_matches_utf8 scalar benchmarks/regexp_matches_utf8 scalar contains                   1.00      3.1±0.06ms        ? ?/sec    1.12      3.5±0.08ms        ? ?/sec
StringArray: regexp_matches_utf8 scalar benchmarks/regexp_matches_utf8 scalar ends with                  1.00      2.6±0.04ms        ? ?/sec    1.50      3.9±0.72ms        ? ?/sec
StringArray: regexp_matches_utf8 scalar benchmarks/regexp_matches_utf8 scalar starts with                1.00      2.2±0.03ms        ? ?/sec    1.14      2.5±0.37ms        ? ?/sec
StringViewArray: regexp_matches_utf8view scalar benchmarks/regexp_matches_utf8view scalar complex        1.00      2.8±0.04ms        ? ?/sec    1.15      3.2±0.07ms        ? ?/sec
StringViewArray: regexp_matches_utf8view scalar benchmarks/regexp_matches_utf8view scalar contains       1.00      3.1±0.03ms        ? ?/sec    1.09      3.4±0.09ms        ? ?/sec
StringViewArray: regexp_matches_utf8view scalar benchmarks/regexp_matches_utf8view scalar ends with      1.00      2.6±0.03ms        ? ?/sec    1.26      3.3±0.62ms        ? ?/sec
StringViewArray: regexp_matches_utf8view scalar benchmarks/regexp_matches_utf8view scalar starts with    1.00      2.2±0.03ms        ? ?/sec    1.19      2.6±0.21ms        ? ?/sec
eq Float32                                                                                               1.00     44.4±0.29µs        ? ?/sec    1.00     44.5±0.55µs        ? ?/sec
eq Int32                                                                                                 1.00     44.2±0.49µs        ? ?/sec    1.01     44.8±2.44µs        ? ?/sec
eq MonthDayNano                                                                                          1.00     90.6±1.08µs        ? ?/sec    1.01     91.9±3.17µs        ? ?/sec
eq StringArray StringArray                                                                               1.00     31.6±0.38ms        ? ?/sec    1.06     33.6±0.53ms        ? ?/sec
eq StringViewArray StringViewArray                                                                       1.00     27.1±0.18ms        ? ?/sec    1.02     27.5±0.36ms        ? ?/sec
eq StringViewArray StringViewArray inlined bytes                                                         1.15     25.4±0.23ms        ? ?/sec    1.00     22.2±0.20ms        ? ?/sec
eq dictionary[10] string[4])                                                                             1.00    806.9±7.73µs        ? ?/sec    1.03    834.8±4.84µs        ? ?/sec
eq long same prefix strings StringArray                                                                  1.01    569.3±8.27µs        ? ?/sec    1.00   565.2±10.56µs        ? ?/sec
eq long same prefix strings StringViewArray                                                              1.16   987.6±13.47µs        ? ?/sec    1.00   854.1±15.52µs        ? ?/sec
eq scalar Float32                                                                                        1.00     44.2±0.31µs        ? ?/sec    1.00     44.4±0.77µs        ? ?/sec
eq scalar Int32                                                                                          1.00     44.3±1.74µs        ? ?/sec    1.00     44.3±1.04µs        ? ?/sec
eq scalar MonthDayNano                                                                                   1.40     71.4±0.32µs        ? ?/sec    1.00     51.1±0.87µs        ? ?/sec
eq scalar StringArray                                                                                    1.00     27.1±0.48ms        ? ?/sec    1.06     28.8±0.24ms        ? ?/sec
eq scalar StringViewArray 13 bytes                                                                       1.07     24.0±0.24ms        ? ?/sec    1.00     22.5±0.29ms        ? ?/sec
eq scalar StringViewArray 4 bytes                                                                        1.44     24.0±0.13ms        ? ?/sec    1.00     16.6±0.18ms        ? ?/sec
eq scalar StringViewArray 6 bytes                                                                        1.44     24.0±0.14ms        ? ?/sec    1.00     16.6±0.28ms        ? ?/sec
eq_dyn_utf8_scalar dictionary[10] string[4])                                                             1.00     77.9±1.31µs        ? ?/sec    1.00     77.7±0.46µs        ? ?/sec
gt Float32                                                                                               1.00     57.0±1.02µs        ? ?/sec    1.00     57.0±1.30µs        ? ?/sec
gt Int32                                                                                                 1.00     44.5±0.82µs        ? ?/sec    1.00     44.3±0.51µs        ? ?/sec
gt scalar Float32                                                                                        1.00     45.6±0.43µs        ? ?/sec    1.01     46.0±0.68µs        ? ?/sec
gt scalar Int32                                                                                          1.00     44.2±0.49µs        ? ?/sec    1.00     44.2±0.25µs        ? ?/sec
gt_eq Float32                                                                                            1.01     57.2±1.34µs        ? ?/sec    1.00     56.6±1.30µs        ? ?/sec
gt_eq Int32                                                                                              1.01     44.5±0.99µs        ? ?/sec    1.00     44.2±0.17µs        ? ?/sec
gt_eq scalar Float32                                                                                     1.00     46.4±0.21µs        ? ?/sec    1.00     46.2±0.55µs        ? ?/sec
gt_eq scalar Int32                                                                                       1.00     44.4±1.36µs        ? ?/sec    1.00     44.5±1.75µs        ? ?/sec
gt_eq_dyn_utf8_scalar scalar dictionary[10] string[4])                                                   1.00     78.0±1.67µs        ? ?/sec    1.00     77.7±0.25µs        ? ?/sec
ilike_utf8 scalar complex                                                                                1.00      3.7±0.06ms        ? ?/sec    1.00      3.7±0.13ms        ? ?/sec
ilike_utf8 scalar contains                                                                               1.00      4.4±0.05ms        ? ?/sec    1.00      4.5±0.09ms        ? ?/sec
ilike_utf8 scalar ends with                                                                              1.06  1139.5±60.44µs        ? ?/sec    1.00  1079.9±49.48µs        ? ?/sec
ilike_utf8 scalar equals                                                                                 1.00   627.8±27.72µs        ? ?/sec    1.00   629.9±35.91µs        ? ?/sec
ilike_utf8 scalar starts with                                                                            1.00  1026.1±35.81µs        ? ?/sec    1.04  1063.6±55.56µs        ? ?/sec
ilike_utf8_scalar_dyn dictionary[10] string[4])                                                          1.00     78.3±0.54µs        ? ?/sec    1.00     78.4±0.96µs        ? ?/sec
like_utf8 scalar complex                                                                                 1.00      2.9±0.04ms        ? ?/sec    1.09      3.2±0.09ms        ? ?/sec
like_utf8 scalar contains                                                                                1.00  1777.9±11.13µs        ? ?/sec    1.03  1823.8±20.52µs        ? ?/sec
like_utf8 scalar ends with                                                                               1.01   414.5±15.82µs        ? ?/sec    1.00   410.9±12.50µs        ? ?/sec
like_utf8 scalar equals                                                                                  1.01    108.3±2.54µs        ? ?/sec    1.00    107.7±0.77µs        ? ?/sec
like_utf8 scalar starts with                                                                             1.00   346.1±14.90µs        ? ?/sec    1.06   368.0±15.78µs        ? ?/sec
like_utf8_scalar_dyn dictionary[10] string[4])                                                           1.00     78.1±0.30µs        ? ?/sec    1.00     78.3±0.78µs        ? ?/sec
like_utf8view scalar complex                                                                             1.00    235.8±1.69ms        ? ?/sec    1.01    237.2±3.08ms        ? ?/sec
like_utf8view scalar contains                                                                            1.00    161.5±1.80ms        ? ?/sec    1.02    165.0±1.21ms        ? ?/sec
like_utf8view scalar ends with 13 bytes                                                                  1.00     49.8±0.33ms        ? ?/sec    1.02     51.1±0.21ms        ? ?/sec
like_utf8view scalar ends with 4 bytes                                                                   1.00     51.5±0.44ms        ? ?/sec    1.03     52.9±0.35ms        ? ?/sec
like_utf8view scalar ends with 6 bytes                                                                   1.00     51.0±0.44ms        ? ?/sec    1.03     52.4±0.60ms        ? ?/sec
like_utf8view scalar equals                                                                              1.00     34.5±0.52ms        ? ?/sec    1.00     34.6±0.26ms        ? ?/sec
like_utf8view scalar starts with 13 bytes                                                                1.00     44.7±0.58ms        ? ?/sec    1.04     46.6±0.86ms        ? ?/sec
like_utf8view scalar starts with 4 bytes                                                                 1.00     28.0±0.17ms        ? ?/sec    1.04     29.2±0.21ms        ? ?/sec
like_utf8view scalar starts with 6 bytes                                                                 1.00     45.3±0.54ms        ? ?/sec    1.03     46.6±0.69ms        ? ?/sec
long same prefix strings like_utf8 scalar complex                                                        1.00  1727.2±27.11µs        ? ?/sec    1.00  1726.4±10.58µs        ? ?/sec
long same prefix strings like_utf8 scalar contains                                                       1.00      4.3±0.04ms        ? ?/sec    1.00      4.3±0.03ms        ? ?/sec
long same prefix strings like_utf8 scalar ends with                                                      1.00  1959.9±24.88µs        ? ?/sec    1.04      2.0±0.14ms        ? ?/sec
long same prefix strings like_utf8 scalar equals                                                         1.01   646.4±14.30µs        ? ?/sec    1.00   643.0±10.67µs        ? ?/sec
long same prefix strings like_utf8 scalar starts with                                                    1.00      2.2±0.01ms        ? ?/sec    1.00      2.2±0.05ms        ? ?/sec
long same prefix strings like_utf8view scalar complex                                                    1.00  1785.8±16.03µs        ? ?/sec    1.01  1803.1±26.24µs        ? ?/sec
long same prefix strings like_utf8view scalar contains                                                   1.02      4.4±0.04ms        ? ?/sec    1.00      4.4±0.11ms        ? ?/sec
long same prefix strings like_utf8view scalar ends with                                                  1.00  1988.7±22.63µs        ? ?/sec    1.01      2.0±0.06ms        ? ?/sec
long same prefix strings like_utf8view scalar equals                                                     1.00    693.8±7.01µs        ? ?/sec    1.00    694.1±6.53µs        ? ?/sec
long same prefix strings like_utf8view scalar starts with                                                1.01      2.2±0.04ms        ? ?/sec    1.00      2.2±0.01ms        ? ?/sec
lt Float32                                                                                               1.00     57.1±0.69µs        ? ?/sec    1.00     57.1±1.20µs        ? ?/sec
lt Int32                                                                                                 1.00     44.4±0.37µs        ? ?/sec    1.00     44.3±0.31µs        ? ?/sec
lt StringViewArray StringViewArray inlined bytes                                                         1.14     31.9±0.36ms        ? ?/sec    1.00     28.0±0.48ms        ? ?/sec
lt long same prefix strings StringArray                                                                  1.02    699.8±6.64µs        ? ?/sec    1.00    685.4±6.16µs        ? ?/sec
lt long same prefix strings StringViewArray                                                              1.36   980.8±12.37µs        ? ?/sec    1.00    723.7±7.15µs        ? ?/sec
lt scalar Float32                                                                                        1.00     46.3±0.77µs        ? ?/sec    1.00     46.2±0.50µs        ? ?/sec
lt scalar Int32                                                                                          1.00     44.2±0.36µs        ? ?/sec    1.00     44.2±0.44µs        ? ?/sec
lt scalar StringArray                                                                                    1.00     44.4±0.35ms        ? ?/sec    1.02     45.4±0.38ms        ? ?/sec
lt scalar StringViewArray                                                                                1.62     53.6±0.84ms        ? ?/sec    1.00     33.1±0.16ms        ? ?/sec
lt_eq Float32                                                                                            1.00     57.2±0.54µs        ? ?/sec    1.00     57.2±1.31µs        ? ?/sec
lt_eq Int32                                                                                              1.01     44.6±2.62µs        ? ?/sec    1.00     44.3±0.30µs        ? ?/sec
lt_eq scalar Float32                                                                                     1.00     45.8±0.83µs        ? ?/sec    1.00     45.9±0.55µs        ? ?/sec
lt_eq scalar Int32                                                                                       1.00     44.2±0.76µs        ? ?/sec    1.00     44.2±0.23µs        ? ?/sec
neq Float32                                                                                              1.01     44.5±1.76µs        ? ?/sec    1.00     44.3±0.30µs        ? ?/sec
neq Int32                                                                                                1.00     44.3±0.63µs        ? ?/sec    1.00     44.4±0.91µs        ? ?/sec
neq long same prefix strings StringArray                                                                 1.00    564.0±7.64µs        ? ?/sec    1.00    563.2±4.88µs        ? ?/sec
neq long same prefix strings StringViewArray                                                             1.16    984.4±4.82µs        ? ?/sec    1.00   846.8±10.62µs        ? ?/sec
neq scalar Float32                                                                                       1.00     44.3±0.17µs        ? ?/sec    1.00     44.4±0.94µs        ? ?/sec
neq scalar Int32                                                                                         1.00     44.2±0.44µs        ? ?/sec    1.00     44.2±0.25µs        ? ?/sec
nilike_utf8 scalar complex                                                                               1.00      3.6±0.11ms        ? ?/sec    1.08      3.9±0.28ms        ? ?/sec
nilike_utf8 scalar contains                                                                              1.00      4.4±0.09ms        ? ?/sec    1.06      4.7±0.13ms        ? ?/sec
nilike_utf8 scalar ends with                                                                             1.00  1116.5±57.35µs        ? ?/sec    1.03  1151.6±48.74µs        ? ?/sec
nilike_utf8 scalar equals                                                                                1.00   609.0±35.91µs        ? ?/sec    1.01   616.8±26.98µs        ? ?/sec
nilike_utf8 scalar starts with                                                                           1.00  1036.4±55.21µs        ? ?/sec    1.09  1131.5±73.76µs        ? ?/sec
nlike_utf8 scalar complex                                                                                1.00      2.9±0.05ms        ? ?/sec    1.03      3.0±0.07ms        ? ?/sec
nlike_utf8 scalar contains                                                                               1.00  1784.4±29.01µs        ? ?/sec    1.02  1812.2±50.10µs        ? ?/sec
nlike_utf8 scalar ends with                                                                              1.01   417.0±17.59µs        ? ?/sec    1.00   411.6±16.25µs        ? ?/sec
nlike_utf8 scalar equals                                                                                 1.00    109.0±2.73µs        ? ?/sec    1.00    108.5±1.99µs        ? ?/sec
nlike_utf8 scalar starts with                                                                            1.00   352.4±14.67µs        ? ?/sec    1.03   361.5±12.57µs        ? ?/sec

@alamb-ghbot

🤖 ./gh_compare_arrow.sh gh_compare_arrow.sh Running
Linux aal-dev 6.14.0-1018-gcp #19~24.04.1-Ubuntu SMP Wed Sep 24 23:23:09 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux
Comparing speedup_comparison_string_view (27b3097) to ebace17 diff
BENCH_NAME=comparison_kernels
BENCH_COMMAND=cargo bench --features=arrow,async,test_common,experimental,object_store --bench comparison_kernels
BENCH_FILTER=
BENCH_BRANCH_NAME=speedup_comparison_string_view
Results will be posted here when complete

@Dandandan Dandandan marked this pull request as ready for review January 23, 2026 08:53
@Dandandan
Contributor Author

On my machine (Apple M2) it makes more of a difference, but these results still show overall improvements. I also think the prefix change might make more of a difference in real life, because it avoids random access.
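For readers unfamiliar with why a prefix check avoids random access, here is a minimal sketch. It is not the code from this PR. It assumes the Arrow view layout (a 16-byte little-endian value: 4 bytes of length, then either up to 12 inlined bytes or a 4-byte prefix followed by a buffer index and offset). The helpers `make_view`, `view_len`, `view_prefix`, `compare_by_prefix`, and `compare_with_fallback` are hypothetical names used only for illustration.

```rust
use std::cmp::Ordering;

/// Hypothetical helper: builds a 16-byte view for `s`, assuming the Arrow layout.
/// Strings of up to 12 bytes are inlined; longer ones keep only a 4-byte prefix
/// plus a buffer index and offset.
fn make_view(s: &[u8], buffer_index: u32, offset: u32) -> u128 {
    let mut bytes = [0u8; 16];
    bytes[0..4].copy_from_slice(&(s.len() as u32).to_le_bytes());
    if s.len() <= 12 {
        bytes[4..4 + s.len()].copy_from_slice(s);
    } else {
        bytes[4..8].copy_from_slice(&s[..4]);
        bytes[8..12].copy_from_slice(&buffer_index.to_le_bytes());
        bytes[12..16].copy_from_slice(&offset.to_le_bytes());
    }
    u128::from_le_bytes(bytes)
}

/// Hypothetical helper: the string length stored in the low 32 bits of the view.
fn view_len(view: u128) -> u32 {
    view as u32
}

/// Hypothetical helper: the 4 bytes after the length, in memory order.
/// These are the first bytes of the string, zero-padded if it is shorter.
fn view_prefix(view: u128) -> [u8; 4] {
    ((view >> 32) as u32).to_le_bytes()
}

/// Tries to order two views using only the prefixes stored in the views.
/// Returns `None` when the prefixes are equal and the full data is needed.
fn compare_by_prefix(left: u128, right: u128) -> Option<Ordering> {
    // Reading the prefix as a big-endian integer makes integer order
    // match byte-wise lexicographic order.
    let lp = u32::from_be_bytes(view_prefix(left));
    let rp = u32::from_be_bytes(view_prefix(right));
    if lp != rp {
        Some(lp.cmp(&rp))
    } else {
        None
    }
}

/// Falls back to comparing the full bytes only when the prefixes tie.
/// In a real array the full bytes would be fetched from a data buffer,
/// which is the random access the prefix check tries to avoid.
fn compare_with_fallback(
    left: u128,
    right: u128,
    left_full: &[u8],
    right_full: &[u8],
) -> Ordering {
    compare_by_prefix(left, right).unwrap_or_else(|| left_full.cmp(right_full))
}
```

When most values differ within their first four bytes, a check like this can be answered from the views alone, so the data buffers are touched only for the ties.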

@alamb-ghbot

🤖: Benchmark completed

Details

group                                                                                                    main                                   speedup_comparison_string_view
-----                                                                                                    ----                                   ------------------------------
StringArray: regexp_matches_utf8 scalar benchmarks/regexp_matches_utf8 scalar complex                    1.00      2.8±0.06ms        ? ?/sec    1.06      3.0±0.10ms        ? ?/sec
StringArray: regexp_matches_utf8 scalar benchmarks/regexp_matches_utf8 scalar contains                   1.00      3.1±0.07ms        ? ?/sec    1.05      3.2±0.05ms        ? ?/sec
StringArray: regexp_matches_utf8 scalar benchmarks/regexp_matches_utf8 scalar ends with                  1.00      2.6±0.03ms        ? ?/sec    1.05      2.7±0.13ms        ? ?/sec
StringArray: regexp_matches_utf8 scalar benchmarks/regexp_matches_utf8 scalar starts with                1.01      2.2±0.08ms        ? ?/sec    1.00      2.1±0.03ms        ? ?/sec
StringViewArray: regexp_matches_utf8view scalar benchmarks/regexp_matches_utf8view scalar complex        1.00      2.8±0.03ms        ? ?/sec    1.23      3.4±0.09ms        ? ?/sec
StringViewArray: regexp_matches_utf8view scalar benchmarks/regexp_matches_utf8view scalar contains       1.00      3.1±0.04ms        ? ?/sec    1.17      3.6±0.07ms        ? ?/sec
StringViewArray: regexp_matches_utf8view scalar benchmarks/regexp_matches_utf8view scalar ends with      1.00      2.6±0.05ms        ? ?/sec    1.26      3.3±0.48ms        ? ?/sec
StringViewArray: regexp_matches_utf8view scalar benchmarks/regexp_matches_utf8view scalar starts with    1.00      2.2±0.03ms        ? ?/sec    1.21      2.6±0.24ms        ? ?/sec
eq Float32                                                                                               1.00     44.4±0.29µs        ? ?/sec    1.00     44.3±0.35µs        ? ?/sec
eq Int32                                                                                                 1.00     44.5±0.67µs        ? ?/sec    1.00     44.4±0.34µs        ? ?/sec
eq MonthDayNano                                                                                          1.02     93.2±4.60µs        ? ?/sec    1.00     91.3±1.11µs        ? ?/sec
eq StringArray StringArray                                                                               1.11     35.5±0.26ms        ? ?/sec    1.00     32.0±0.46ms        ? ?/sec
eq StringViewArray StringViewArray                                                                       1.07     28.7±0.21ms        ? ?/sec    1.00     26.8±0.20ms        ? ?/sec
eq StringViewArray StringViewArray inlined bytes                                                         1.25     27.5±0.34ms        ? ?/sec    1.00     22.0±0.24ms        ? ?/sec
eq dictionary[10] string[4])                                                                             1.03    842.0±8.61µs        ? ?/sec    1.00    818.6±4.18µs        ? ?/sec
eq long same prefix strings StringArray                                                                  1.02   577.0±10.98µs        ? ?/sec    1.00   563.9±11.45µs        ? ?/sec
eq long same prefix strings StringViewArray                                                              1.29  1001.1±10.47µs        ? ?/sec    1.00    773.9±8.07µs        ? ?/sec
eq scalar Float32                                                                                        1.00     44.3±0.13µs        ? ?/sec    1.00     44.3±0.48µs        ? ?/sec
eq scalar Int32                                                                                          1.00     44.4±0.42µs        ? ?/sec    1.00     44.3±0.52µs        ? ?/sec
eq scalar MonthDayNano                                                                                   1.41     71.8±0.50µs        ? ?/sec    1.00     50.9±1.12µs        ? ?/sec
eq scalar StringArray                                                                                    1.13     31.1±0.19ms        ? ?/sec    1.00     27.6±0.53ms        ? ?/sec
eq scalar StringViewArray 13 bytes                                                                       1.48     25.5±0.19ms        ? ?/sec    1.00     17.2±0.30ms        ? ?/sec
eq scalar StringViewArray 4 bytes                                                                        1.56     25.4±0.18ms        ? ?/sec    1.00     16.3±0.20ms        ? ?/sec
eq scalar StringViewArray 6 bytes                                                                        1.56     25.5±0.22ms        ? ?/sec    1.00     16.4±0.20ms        ? ?/sec
eq_dyn_utf8_scalar dictionary[10] string[4])                                                             1.00     77.8±0.84µs        ? ?/sec    1.01     78.6±3.52µs        ? ?/sec
gt Float32                                                                                               1.00     57.0±1.71µs        ? ?/sec    1.01     57.7±0.24µs        ? ?/sec
gt Int32                                                                                                 1.00     44.5±0.44µs        ? ?/sec    1.00     44.4±0.38µs        ? ?/sec
gt scalar Float32                                                                                        1.00     46.1±0.39µs        ? ?/sec    1.00     45.9±0.72µs        ? ?/sec
gt scalar Int32                                                                                          1.00     44.4±0.30µs        ? ?/sec    1.00     44.2±0.48µs        ? ?/sec
gt_eq Float32                                                                                            1.00     57.0±0.93µs        ? ?/sec    1.01     57.4±0.39µs        ? ?/sec
gt_eq Int32                                                                                              1.00     44.5±0.29µs        ? ?/sec    1.00     44.3±0.48µs        ? ?/sec
gt_eq scalar Float32                                                                                     1.01     46.6±0.12µs        ? ?/sec    1.00     46.3±0.32µs        ? ?/sec
gt_eq scalar Int32                                                                                       1.01     44.4±0.40µs        ? ?/sec    1.00     44.2±0.20µs        ? ?/sec
gt_eq_dyn_utf8_scalar scalar dictionary[10] string[4])                                                   1.00     78.0±1.46µs        ? ?/sec    1.00     78.3±1.17µs        ? ?/sec
ilike_utf8 scalar complex                                                                                1.02      3.6±0.09ms        ? ?/sec    1.00      3.5±0.07ms        ? ?/sec
ilike_utf8 scalar contains                                                                               1.02      4.4±0.07ms        ? ?/sec    1.00      4.4±0.05ms        ? ?/sec
ilike_utf8 scalar ends with                                                                              1.00  1052.6±43.42µs        ? ?/sec    1.00  1051.3±42.81µs        ? ?/sec
ilike_utf8 scalar equals                                                                                 1.00   597.5±38.11µs        ? ?/sec    1.13   673.4±37.33µs        ? ?/sec
ilike_utf8 scalar starts with                                                                            1.05  1012.7±64.69µs        ? ?/sec    1.00   967.1±33.59µs        ? ?/sec
ilike_utf8_scalar_dyn dictionary[10] string[4])                                                          1.00     79.1±5.36µs        ? ?/sec    1.00     79.0±0.36µs        ? ?/sec
like_utf8 scalar complex                                                                                 1.00      2.9±0.09ms        ? ?/sec    1.01      2.9±0.08ms        ? ?/sec
like_utf8 scalar contains                                                                                1.06  1864.6±43.97µs        ? ?/sec    1.00  1757.0±29.46µs        ? ?/sec
like_utf8 scalar ends with                                                                               1.00   415.8±18.45µs        ? ?/sec    1.01   421.3±17.94µs        ? ?/sec
like_utf8 scalar equals                                                                                  1.00    107.9±0.49µs        ? ?/sec    1.00    107.5±0.44µs        ? ?/sec
like_utf8 scalar starts with                                                                             1.01   363.5±19.30µs        ? ?/sec    1.00   360.6±16.06µs        ? ?/sec
like_utf8_scalar_dyn dictionary[10] string[4])                                                           1.00     78.1±0.33µs        ? ?/sec    1.01     78.8±0.97µs        ? ?/sec
like_utf8view scalar complex                                                                             1.00    235.3±2.02ms        ? ?/sec    1.02    240.2±2.47ms        ? ?/sec
like_utf8view scalar contains                                                                            1.01    161.5±2.41ms        ? ?/sec    1.00    159.4±1.19ms        ? ?/sec
like_utf8view scalar ends with 13 bytes                                                                  1.00     50.1±0.45ms        ? ?/sec    1.02     50.9±0.32ms        ? ?/sec
like_utf8view scalar ends with 4 bytes                                                                   1.00     51.9±0.32ms        ? ?/sec    1.02     52.8±0.50ms        ? ?/sec
like_utf8view scalar ends with 6 bytes                                                                   1.00     51.7±0.86ms        ? ?/sec    1.02     52.5±0.47ms        ? ?/sec
like_utf8view scalar equals                                                                              1.00     34.3±0.22ms        ? ?/sec    1.00     34.5±0.37ms        ? ?/sec
like_utf8view scalar starts with 13 bytes                                                                1.00     45.0±0.22ms        ? ?/sec    1.02     46.0±0.77ms        ? ?/sec
like_utf8view scalar starts with 4 bytes                                                                 1.00     27.7±0.27ms        ? ?/sec    1.05     29.0±0.15ms        ? ?/sec
like_utf8view scalar starts with 6 bytes                                                                 1.00     45.7±0.51ms        ? ?/sec    1.02     46.7±0.72ms        ? ?/sec
long same prefix strings like_utf8 scalar complex                                                        1.00  1729.7±26.00µs        ? ?/sec    1.00  1729.0±29.29µs        ? ?/sec
long same prefix strings like_utf8 scalar contains                                                       1.00      4.3±0.08ms        ? ?/sec    1.01      4.4±0.04ms        ? ?/sec
long same prefix strings like_utf8 scalar ends with                                                      1.00  1953.4±10.19µs        ? ?/sec    1.00  1956.4±15.90µs        ? ?/sec
long same prefix strings like_utf8 scalar equals                                                         1.00    642.2±8.82µs        ? ?/sec    1.00    644.6±9.96µs        ? ?/sec
long same prefix strings like_utf8 scalar starts with                                                    1.00      2.2±0.01ms        ? ?/sec    1.07      2.3±0.10ms        ? ?/sec
long same prefix strings like_utf8view scalar complex                                                    1.01  1802.4±52.29µs        ? ?/sec    1.00   1784.5±7.66µs        ? ?/sec
long same prefix strings like_utf8view scalar contains                                                   1.00      4.4±0.03ms        ? ?/sec    1.00      4.4±0.13ms        ? ?/sec
long same prefix strings like_utf8view scalar ends with                                                  1.00  1993.4±43.90µs        ? ?/sec    1.00  1990.6±21.81µs        ? ?/sec
long same prefix strings like_utf8view scalar equals                                                     1.00   696.4±26.86µs        ? ?/sec    1.00    694.9±4.49µs        ? ?/sec
long same prefix strings like_utf8view scalar starts with                                                1.01      2.2±0.05ms        ? ?/sec    1.00      2.2±0.05ms        ? ?/sec
lt Float32                                                                                               1.00     57.4±0.87µs        ? ?/sec    1.00     57.3±1.15µs        ? ?/sec
lt Int32                                                                                                 1.00     44.5±0.24µs        ? ?/sec    1.00     44.5±0.43µs        ? ?/sec
lt StringViewArray StringViewArray inlined bytes                                                         1.16     33.1±0.28ms        ? ?/sec    1.00     28.5±0.19ms        ? ?/sec
lt long same prefix strings StringArray                                                                  1.00    705.8±7.38µs        ? ?/sec    1.00   704.3±19.18µs        ? ?/sec
lt long same prefix strings StringViewArray                                                              1.38   994.3±13.88µs        ? ?/sec    1.00    718.1±5.36µs        ? ?/sec
lt scalar Float32                                                                                        1.00     46.6±0.54µs        ? ?/sec    1.00     46.4±0.46µs        ? ?/sec
lt scalar Int32                                                                                          1.00     44.3±0.15µs        ? ?/sec    1.00     44.3±0.44µs        ? ?/sec
lt scalar StringArray                                                                                    1.07     48.5±0.22ms        ? ?/sec    1.00     45.3±0.76ms        ? ?/sec
lt scalar StringViewArray                                                                                1.47     54.9±0.24ms        ? ?/sec    1.00     37.3±0.14ms        ? ?/sec
lt_eq Float32                                                                                            1.00     57.1±0.79µs        ? ?/sec    1.01     57.7±1.98µs        ? ?/sec
lt_eq Int32                                                                                              1.01     44.6±0.25µs        ? ?/sec    1.00     44.3±0.26µs        ? ?/sec
lt_eq scalar Float32                                                                                     1.00     46.1±0.47µs        ? ?/sec    1.00     46.1±0.29µs        ? ?/sec
lt_eq scalar Int32                                                                                       1.00     44.4±0.31µs        ? ?/sec    1.00     44.3±0.22µs        ? ?/sec
neq Float32                                                                                              1.00     44.4±0.16µs        ? ?/sec    1.00     44.4±0.50µs        ? ?/sec
neq Int32                                                                                                1.01     44.6±0.42µs        ? ?/sec    1.00     44.3±0.21µs        ? ?/sec
neq long same prefix strings StringArray                                                                 1.02   575.5±13.39µs        ? ?/sec    1.00    566.4±7.00µs        ? ?/sec
neq long same prefix strings StringViewArray                                                             1.30  1003.2±16.90µs        ? ?/sec    1.00    774.4±5.04µs        ? ?/sec
neq scalar Float32                                                                                       1.00     44.4±0.25µs        ? ?/sec    1.00     44.4±0.92µs        ? ?/sec
neq scalar Int32                                                                                         1.00     44.4±0.22µs        ? ?/sec    1.00     44.2±0.75µs        ? ?/sec
nilike_utf8 scalar complex                                                                               1.00      3.6±0.13ms        ? ?/sec    1.00      3.6±0.08ms        ? ?/sec
nilike_utf8 scalar contains                                                                              1.00      4.4±0.06ms        ? ?/sec    1.01      4.4±0.09ms        ? ?/sec
nilike_utf8 scalar ends with                                                                             1.00  1025.6±16.46µs        ? ?/sec    1.11  1141.5±54.78µs        ? ?/sec
nilike_utf8 scalar equals                                                                                1.00   574.7±22.97µs        ? ?/sec    1.17   671.5±33.88µs        ? ?/sec
nilike_utf8 scalar starts with                                                                           1.00   999.9±50.49µs        ? ?/sec    1.02  1022.8±58.89µs        ? ?/sec
nlike_utf8 scalar complex                                                                                1.00      2.9±0.10ms        ? ?/sec    1.08      3.1±0.13ms        ? ?/sec
nlike_utf8 scalar contains                                                                               1.00  1751.3±26.43µs        ? ?/sec    1.16      2.0±0.27ms        ? ?/sec
nlike_utf8 scalar ends with                                                                              1.00   406.8±17.88µs        ? ?/sec    1.00    406.3±7.67µs        ? ?/sec
nlike_utf8 scalar equals                                                                                 1.00    108.3±1.07µs        ? ?/sec    1.00    107.8±1.41µs        ? ?/sec
nlike_utf8 scalar starts with                                                                            1.00   348.5±12.49µs        ? ?/sec    1.09   378.7±10.79µs        ? ?/sec

@alamb-ghbot

🤖 ./gh_compare_arrow.sh Running
Linux aal-dev 6.14.0-1018-gcp #19~24.04.1-Ubuntu SMP Wed Sep 24 23:23:09 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux
Comparing speedup_comparison_string_view (27b3097) to ebace17 diff
BENCH_NAME=arrow_reader_clickbench
BENCH_COMMAND=cargo bench --features=arrow,async,test_common,experimental,object_store --bench arrow_reader_clickbench
BENCH_FILTER=
BENCH_BRANCH_NAME=speedup_comparison_string_view
Results will be posted here when complete

@Dandandan
Contributor Author

eq StringViewArray StringViewArray                                                                       1.07     28.7±0.21ms        ? ?/sec    1.00     26.8±0.20ms        ? ?/sec
eq StringViewArray StringViewArray inlined bytes                                                         1.25     27.5±0.34ms        ? ?/sec    1.00     22.0±0.24ms        ? ?/sec
eq long same prefix strings StringViewArray                                                              1.29  1001.1±10.47µs        ? ?/sec    1.00    773.9±8.07µs        ? ?/sec
eq scalar StringViewArray 13 bytes                                                                       1.48     25.5±0.19ms        ? ?/sec    1.00     17.2±0.30ms        ? ?/sec
eq scalar StringViewArray 4 bytes                                                                        1.56     25.4±0.18ms        ? ?/sec    1.00     16.3±0.20ms        ? ?/sec
eq scalar StringViewArray 6 bytes                                                                        1.56     25.5±0.22ms        ? ?/sec    1.00     16.4±0.20ms        ? ?/sec
lt StringViewArray StringViewArray inlined bytes                                                         1.16     33.1±0.28ms        ? ?/sec    1.00     28.5±0.19ms        ? ?/sec
lt long same prefix strings StringArray                                                                  1.00    705.8±7.38µs        ? ?/sec    1.00   704.3±19.18µs        ? ?/sec
lt long same prefix strings StringViewArray                                                              1.38   994.3±13.88µs        ? ?/sec    1.00    718.1±5.36µs        ? ?/sec
lt scalar StringViewArray                                                                                1.47     54.9±0.24ms        ? ?/sec    1.00     37.3±0.14ms        ? ?/sec
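
For readers unfamiliar with this table layout: on each row the faster side appears to be scaled to 1.00 and the other side shows how many times slower it is. A small sketch of that arithmetic follows, assuming the ratio is simply the quotient of the two mean times; the helper names are hypothetical and not part of any benchmark tool.

```rust
/// Hypothetical helper: how many times slower `slow` is than `fast`
/// (both given as mean times in the same unit).
fn relative_ratio(slow: f64, fast: f64) -> f64 {
    slow / fast
}

/// Hypothetical helper: percentage of runtime removed when going
/// from `before` to `after`.
fn percent_reduction(before: f64, after: f64) -> f64 {
    (1.0 - after / before) * 100.0
}
```

For the `eq scalar StringViewArray 4 bytes` row, `relative_ratio(25.4, 16.3)` rounds to the 1.56 shown in the table, and `percent_reduction(25.4, 16.3)` is roughly a 36% reduction in mean time.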

@alamb-ghbot

🤖: Benchmark completed

group                                main                                   speedup_comparison_string_view
-----                                ----                                   ------------------------------
arrow_reader_clickbench/async/Q1     1.04      2.4±0.04ms        ? ?/sec    1.00      2.3±0.03ms        ? ?/sec
arrow_reader_clickbench/async/Q10    1.34     16.3±0.21ms        ? ?/sec    1.00     12.1±0.20ms        ? ?/sec
arrow_reader_clickbench/async/Q11    1.27     17.5±0.36ms        ? ?/sec    1.00     13.7±0.19ms        ? ?/sec
arrow_reader_clickbench/async/Q12    1.18     28.0±0.25ms        ? ?/sec    1.00     23.7±0.57ms        ? ?/sec
arrow_reader_clickbench/async/Q13    1.15     33.6±0.36ms        ? ?/sec    1.00     29.2±0.31ms        ? ?/sec
arrow_reader_clickbench/async/Q14    1.15     30.3±0.32ms        ? ?/sec    1.00     26.3±0.66ms        ? ?/sec
arrow_reader_clickbench/async/Q19    1.12      5.8±0.12ms        ? ?/sec    1.00      5.2±0.07ms        ? ?/sec
arrow_reader_clickbench/async/Q20    1.38    154.6±1.86ms        ? ?/sec    1.00    112.1±1.52ms        ? ?/sec
arrow_reader_clickbench/async/Q21    1.01    164.5±1.34ms        ? ?/sec    1.00    163.5±2.24ms        ? ?/sec
arrow_reader_clickbench/async/Q22    1.08   333.4±13.44ms        ? ?/sec    1.00   308.2±19.56ms        ? ?/sec
arrow_reader_clickbench/async/Q23    1.01    408.0±3.49ms        ? ?/sec    1.00    403.4±5.84ms        ? ?/sec
arrow_reader_clickbench/async/Q24    1.11     35.7±0.40ms        ? ?/sec    1.00     32.0±0.46ms        ? ?/sec
arrow_reader_clickbench/async/Q27    1.01     98.1±0.45ms        ? ?/sec    1.00     97.1±1.68ms        ? ?/sec
arrow_reader_clickbench/async/Q28    1.02     97.6±0.56ms        ? ?/sec    1.00     95.7±0.92ms        ? ?/sec
arrow_reader_clickbench/async/Q30    1.09     31.0±0.53ms        ? ?/sec    1.00     28.5±0.51ms        ? ?/sec
arrow_reader_clickbench/async/Q36    1.01    106.8±1.13ms        ? ?/sec    1.00    105.5±0.91ms        ? ?/sec
arrow_reader_clickbench/async/Q37    1.01     84.1±1.12ms        ? ?/sec    1.00     83.4±1.25ms        ? ?/sec
arrow_reader_clickbench/async/Q38    1.01     32.3±0.66ms        ? ?/sec    1.00     31.9±0.25ms        ? ?/sec
arrow_reader_clickbench/async/Q39    1.02     45.6±0.70ms        ? ?/sec    1.00     44.6±0.51ms        ? ?/sec
arrow_reader_clickbench/async/Q40    1.03     27.5±0.33ms        ? ?/sec    1.00     26.8±0.29ms        ? ?/sec
arrow_reader_clickbench/async/Q41    1.03     22.4±0.41ms        ? ?/sec    1.00     21.8±0.23ms        ? ?/sec
arrow_reader_clickbench/async/Q42    1.00     10.8±0.12ms        ? ?/sec    1.00     10.8±0.14ms        ? ?/sec
arrow_reader_clickbench/sync/Q1      1.00      2.0±0.03ms        ? ?/sec    1.00      2.0±0.07ms        ? ?/sec
arrow_reader_clickbench/sync/Q10     1.12     10.1±0.36ms        ? ?/sec    1.00      9.1±0.10ms        ? ?/sec
arrow_reader_clickbench/sync/Q11     1.10     11.7±0.27ms        ? ?/sec    1.00     10.6±0.08ms        ? ?/sec
arrow_reader_clickbench/sync/Q12     1.06     33.2±0.60ms        ? ?/sec    1.00     31.2±0.53ms        ? ?/sec
arrow_reader_clickbench/sync/Q13     1.00     40.8±3.30ms        ? ?/sec    1.09     44.5±1.27ms        ? ?/sec
arrow_reader_clickbench/sync/Q14     1.00     35.8±0.49ms        ? ?/sec    1.15     41.1±0.67ms        ? ?/sec
arrow_reader_clickbench/sync/Q19     1.00      4.2±0.12ms        ? ?/sec    1.00      4.2±0.04ms        ? ?/sec
arrow_reader_clickbench/sync/Q20     1.00    171.4±1.44ms        ? ?/sec    1.00    172.0±1.12ms        ? ?/sec
arrow_reader_clickbench/sync/Q21     1.00    224.2±5.78ms        ? ?/sec    1.02    228.4±2.18ms        ? ?/sec
arrow_reader_clickbench/sync/Q22     1.00    466.8±3.43ms        ? ?/sec    1.00    465.3±2.94ms        ? ?/sec
arrow_reader_clickbench/sync/Q23     1.00   431.5±25.34ms        ? ?/sec    1.00   432.1±20.06ms        ? ?/sec
arrow_reader_clickbench/sync/Q24     1.07     46.1±1.11ms        ? ?/sec    1.00     42.9±1.62ms        ? ?/sec
arrow_reader_clickbench/sync/Q27     1.05    154.8±4.00ms        ? ?/sec    1.00    147.3±1.16ms        ? ?/sec
arrow_reader_clickbench/sync/Q28     1.05    150.2±2.94ms        ? ?/sec    1.00    143.5±1.47ms        ? ?/sec
arrow_reader_clickbench/sync/Q30     1.22     34.8±0.23ms        ? ?/sec    1.00     28.6±0.50ms        ? ?/sec
arrow_reader_clickbench/sync/Q36     1.08    160.0±2.40ms        ? ?/sec    1.00    147.6±1.27ms        ? ?/sec
arrow_reader_clickbench/sync/Q37     1.06     91.3±0.99ms        ? ?/sec    1.00     85.8±0.63ms        ? ?/sec
arrow_reader_clickbench/sync/Q38     1.07     30.1±0.33ms        ? ?/sec    1.00     28.0±0.26ms        ? ?/sec
arrow_reader_clickbench/sync/Q39     1.16     37.4±0.66ms        ? ?/sec    1.00     32.3±0.39ms        ? ?/sec
arrow_reader_clickbench/sync/Q40     1.14     28.9±0.53ms        ? ?/sec    1.00     25.5±0.22ms        ? ?/sec
arrow_reader_clickbench/sync/Q41     1.11     31.4±0.50ms        ? ?/sec    1.00     28.3±0.82ms        ? ?/sec
arrow_reader_clickbench/sync/Q42     1.06     12.8±0.17ms        ? ?/sec    1.00     12.1±0.13ms        ? ?/sec

@Dandandan
Contributor Author

Improvements on clickbench seem to correlate well with ClickBenchPredicate::not_empty

@jhorstmann
Contributor

Previous optimization attempt, mostly for the empty string case: #7767

@Dandandan
Contributor Author

run benchmark arrow_reader_clickbench

@alamb-ghbot

🤖 ./gh_compare_arrow.sh Running
Linux aal-dev 6.14.0-1018-gcp #19~24.04.1-Ubuntu SMP Wed Sep 24 23:23:09 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux
Comparing speedup_comparison_string_view (233fb27) to 3c6ca57 diff
BENCH_NAME=arrow_reader_clickbench
BENCH_COMMAND=cargo bench --features=arrow,async,test_common,experimental,object_store --bench arrow_reader_clickbench
BENCH_FILTER=
BENCH_BRANCH_NAME=speedup_comparison_string_view
Results will be posted here when complete

@alamb-ghbot

🤖: Benchmark completed

group                                main                                   speedup_comparison_string_view
-----                                ----                                   ------------------------------
arrow_reader_clickbench/async/Q1     1.01      2.3±0.04ms        ? ?/sec    1.00      2.3±0.02ms        ? ?/sec
arrow_reader_clickbench/async/Q10    1.09     11.6±0.26ms        ? ?/sec    1.00     10.6±0.16ms        ? ?/sec
arrow_reader_clickbench/async/Q11    1.08     13.2±0.16ms        ? ?/sec    1.00     12.3±0.21ms        ? ?/sec
arrow_reader_clickbench/async/Q12    1.07     24.0±0.35ms        ? ?/sec    1.00     22.4±0.76ms        ? ?/sec
arrow_reader_clickbench/async/Q13    1.05     29.4±0.43ms        ? ?/sec    1.00     28.0±0.51ms        ? ?/sec
arrow_reader_clickbench/async/Q14    1.05     26.6±0.33ms        ? ?/sec    1.00     25.4±0.56ms        ? ?/sec
arrow_reader_clickbench/async/Q19    1.00      5.3±0.07ms        ? ?/sec    1.04      5.5±0.11ms        ? ?/sec
arrow_reader_clickbench/async/Q20    1.02    155.9±1.79ms        ? ?/sec    1.00    153.3±1.39ms        ? ?/sec
arrow_reader_clickbench/async/Q21    1.00    164.4±2.39ms        ? ?/sec    1.04    171.3±1.38ms        ? ?/sec
arrow_reader_clickbench/async/Q22    1.05   315.5±28.74ms        ? ?/sec    1.00   299.6±46.85ms        ? ?/sec
arrow_reader_clickbench/async/Q23    1.00    410.0±2.42ms        ? ?/sec    1.01    415.0±2.31ms        ? ?/sec
arrow_reader_clickbench/async/Q24    1.03     33.0±0.43ms        ? ?/sec    1.00     32.1±0.87ms        ? ?/sec
arrow_reader_clickbench/async/Q27    1.00    100.0±0.56ms        ? ?/sec    1.01    101.1±0.98ms        ? ?/sec
arrow_reader_clickbench/async/Q28    1.01     99.5±2.36ms        ? ?/sec    1.00     98.9±0.91ms        ? ?/sec
arrow_reader_clickbench/async/Q30    1.04     29.4±0.58ms        ? ?/sec    1.00     28.1±0.66ms        ? ?/sec
arrow_reader_clickbench/async/Q36    1.00    108.5±0.69ms        ? ?/sec    1.02    110.2±1.07ms        ? ?/sec
arrow_reader_clickbench/async/Q37    1.00     84.7±0.80ms        ? ?/sec    1.01     85.6±0.50ms        ? ?/sec
arrow_reader_clickbench/async/Q38    1.00     32.9±0.52ms        ? ?/sec    1.02     33.7±0.30ms        ? ?/sec
arrow_reader_clickbench/async/Q39    1.00     46.0±0.69ms        ? ?/sec    1.02     46.9±0.63ms        ? ?/sec
arrow_reader_clickbench/async/Q40    1.00     27.7±0.48ms        ? ?/sec    1.04     28.8±0.41ms        ? ?/sec
arrow_reader_clickbench/async/Q41    1.00     22.1±0.26ms        ? ?/sec    1.05     23.2±0.46ms        ? ?/sec
arrow_reader_clickbench/async/Q42    1.00     10.7±0.13ms        ? ?/sec    1.03     11.0±0.13ms        ? ?/sec
arrow_reader_clickbench/sync/Q1      1.00      2.1±0.05ms        ? ?/sec    1.00      2.0±0.02ms        ? ?/sec
arrow_reader_clickbench/sync/Q10     1.12      8.7±0.07ms        ? ?/sec    1.00      7.7±0.16ms        ? ?/sec
arrow_reader_clickbench/sync/Q11     1.11     10.2±0.15ms        ? ?/sec    1.00      9.2±0.06ms        ? ?/sec
arrow_reader_clickbench/sync/Q12     1.00     31.8±2.23ms        ? ?/sec    1.02     32.6±2.42ms        ? ?/sec
arrow_reader_clickbench/sync/Q13     1.35     44.5±1.09ms        ? ?/sec    1.00     33.1±1.44ms        ? ?/sec
arrow_reader_clickbench/sync/Q14     1.27     39.0±0.78ms        ? ?/sec    1.00     30.8±0.24ms        ? ?/sec
arrow_reader_clickbench/sync/Q19     1.00      4.1±0.05ms        ? ?/sec    1.01      4.2±0.04ms        ? ?/sec
arrow_reader_clickbench/sync/Q20     1.01    177.7±1.18ms        ? ?/sec    1.00    175.6±1.04ms        ? ?/sec
arrow_reader_clickbench/sync/Q21     1.00    231.9±5.08ms        ? ?/sec    1.01    234.9±1.82ms        ? ?/sec
arrow_reader_clickbench/sync/Q22     1.00    476.5±3.74ms        ? ?/sec    1.00    476.3±3.62ms        ? ?/sec
arrow_reader_clickbench/sync/Q23     1.03   442.2±21.44ms        ? ?/sec    1.00   427.3±14.59ms        ? ?/sec
arrow_reader_clickbench/sync/Q24     1.07     42.4±0.88ms        ? ?/sec    1.00     39.5±0.54ms        ? ?/sec
arrow_reader_clickbench/sync/Q27     1.01    153.8±1.43ms        ? ?/sec    1.00    152.9±1.39ms        ? ?/sec
arrow_reader_clickbench/sync/Q28     1.01    148.7±1.38ms        ? ?/sec    1.00    147.7±1.73ms        ? ?/sec
arrow_reader_clickbench/sync/Q30     1.08     29.1±0.36ms        ? ?/sec    1.00     27.1±0.36ms        ? ?/sec
arrow_reader_clickbench/sync/Q36     1.00    152.8±1.31ms        ? ?/sec    1.00    152.4±1.18ms        ? ?/sec
arrow_reader_clickbench/sync/Q37     1.00     86.6±1.42ms        ? ?/sec    1.00     86.5±1.20ms        ? ?/sec
arrow_reader_clickbench/sync/Q38     1.00     28.9±0.28ms        ? ?/sec    1.00     28.9±0.58ms        ? ?/sec
arrow_reader_clickbench/sync/Q39     1.01     33.5±0.39ms        ? ?/sec    1.00     33.1±0.58ms        ? ?/sec
arrow_reader_clickbench/sync/Q40     1.00     25.9±0.30ms        ? ?/sec    1.00     25.9±0.30ms        ? ?/sec
arrow_reader_clickbench/sync/Q41     1.00     28.5±0.24ms        ? ?/sec    1.00     28.6±0.28ms        ? ?/sec
arrow_reader_clickbench/sync/Q42     1.00     12.2±0.07ms        ? ?/sec    1.00     12.2±0.21ms        ? ?/sec

Contributor

@zhuqi-lucas zhuqi-lucas left a comment

LGTM, great work @Dandandan. The benchmark results are promising, and the code is cleaner.

/// - The combination ensures that a single `u128` comparison correctly orders by inline data
/// first (lexicographically), then by length (numerically).
#[inline(always)]
pub fn inline_key_fast(raw: u128) -> u128 {
Contributor

DataFusion also seems to use this function, so we should run DataFusion against this change as well to make sure nothing breaks.
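
For context on what callers of this function rely on, here is a minimal sketch of the ordering property the doc comment describes. It assumes the Arrow view layout for short strings (a 4-byte little-endian length followed by up to 12 inline bytes, zero-padded). `make_inline_view` and `inline_key_sketch` are hypothetical names used for illustration; this is not the arrow-rs implementation.

```rust
/// Hypothetical helper: pack a short string (at most 12 bytes) into a
/// view-like `u128`: bytes 0..4 hold the length (little-endian),
/// bytes 4..16 hold the inline data, zero-padded.
fn make_inline_view(data: &[u8]) -> u128 {
    assert!(data.len() <= 12, "only inline-sized values are supported");
    let mut bytes = [0u8; 16];
    bytes[0..4].copy_from_slice(&(data.len() as u32).to_le_bytes());
    bytes[4..4 + data.len()].copy_from_slice(data);
    u128::from_le_bytes(bytes)
}

/// Sketch of a comparison key: inline data goes in the most significant
/// bytes (so it is compared first, lexicographically) and the length in
/// the least significant bytes (so it only breaks ties).
fn inline_key_sketch(raw: u128) -> u128 {
    let raw_bytes = raw.to_le_bytes();
    // Truncating keeps the low 32 bits, which hold the length.
    let length = raw as u32;

    let mut buf = [0u8; 16];
    buf[0..12].copy_from_slice(&raw_bytes[4..16]); // inline data, original order
    buf[12..16].copy_from_slice(&length.to_be_bytes()); // length as tie-breaker
    u128::from_be_bytes(buf)
}
```

Under these assumptions, comparing `inline_key_sketch(make_inline_view(b"ab"))` with `inline_key_sketch(make_inline_view(b"abc"))` gives the same ordering as comparing the byte slices directly. A downstream project could check the real function against slice comparison in the same way.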

if l_len == 0 && r_len == 0 {

// Both are empty
if l_len == 0 {
Contributor

I found that this ends up compiling to the same code, but it is cleaner this way, I agree.

Contributor Author

Yeah...
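
A minimal sketch of why the two forms are equivalent, assuming (this is not visible in the excerpt above) that the lengths have already been checked for equality before this branch is reached. The function names and the slice-based signatures are hypothetical and are not the arrow-rs code.

```rust
/// Verbose form: checks both lengths for zero.
fn eq_verbose(l: &[u8], r: &[u8]) -> bool {
    let (l_len, r_len) = (l.len(), r.len());
    if l_len != r_len {
        return false;
    }
    // Redundant: l_len == r_len already holds at this point.
    if l_len == 0 && r_len == 0 {
        return true;
    }
    l == r
}

/// Concise form: one zero check is enough once the lengths are known equal.
fn eq_concise(l: &[u8], r: &[u8]) -> bool {
    let (l_len, r_len) = (l.len(), r.len());
    if l_len != r_len {
        return false;
    }
    // Both are empty
    if l_len == 0 {
        return true;
    }
    l == r
}
```

Because the second condition is implied by the first, an optimizer can drop it, which would be consistent with the observation that both versions compile to the same code.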

@Dandandan Dandandan merged commit 0258ec0 into apache:main Jan 23, 2026
26 checks passed
@alamb
Contributor

alamb commented Jan 23, 2026

I have a few more tests I would like to propose:

Dandandan pushed a commit that referenced this pull request Jan 24, 2026
# Which issue does this PR close?

- Follow on to #9250

# Rationale for this change

While (posthumously) reviewing
#9250 from @Dandandan and
@zhuqi-lucas I noticed that some of the special case branches are not
covered.

# What changes are included in this PR?

Add some more tests to cover all the special cases

# Are these changes tested?

Yes, only tests

# Are there any user-facing changes?

<!--
If there are user-facing changes then we may require documentation to be
updated before approving the PR.

If there are any breaking changes to public APIs, please call them out.
-->

Labels

arrow Changes to the arrow crate

Development

Successfully merging this pull request may close these issues.

Speed up string view comparison

5 participants