perf: Optimize lower, upper for sliced arrays #21814

Merged
comphead merged 1 commit into apache:main from neilconway:neilc/perf-lower-upper-sliced-arrays
Apr 24, 2026

@neilconway
Contributor

Which issue does this PR close?

Rationale for this change

case_conversion_ascii_array operates directly on the underlying values buffer, but it does not restrict itself to the bytes within the visible slice. For a sliced array, it can therefore process bytes that belong only to the parent array, which can mean substantial unnecessary work.

What changes are included in this PR?

  • Optimize case_conversion_ascii_array for sliced arrays
  • Add a unit test
  • Add a benchmark. We can make the "sliced array" case arbitrarily extreme, so the raw benchmark number here is less important; it is more important that this benchmark confirms that the work we do scales with the visible size of a sliced array, which it does.
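
For readers unfamiliar with the layout, here is a minimal sketch of the idea in plain Rust. It is not the code in this PR: `SlicedStrings`, `lower_ascii_visible`, and `demo` are hypothetical names, and the struct is only a stand-in for an offsets-plus-values string array. It assumes the offsets window is non-empty and that all visible bytes are ASCII.

```rust
/// Hypothetical stand-in for a sliced string array. `values` is the
/// parent's full byte buffer; `offsets` is the visible window of offsets
/// (number of visible rows + 1 entries) indexing into `values`.
struct SlicedStrings<'a> {
    values: &'a [u8],
    offsets: &'a [i32],
}

/// Lowercase only the bytes the slice can see, and rebase the offsets so
/// the output buffer starts at 0. Assumes `offsets` has at least one entry.
fn lower_ascii_visible(arr: &SlicedStrings) -> (Vec<u8>, Vec<i32>) {
    let first = arr.offsets[0];
    let last = arr.offsets[arr.offsets.len() - 1];

    // Only this byte range belongs to the visible rows.
    let visible = &arr.values[first as usize..last as usize];
    let out_values = visible.to_ascii_lowercase();

    // Shift offsets so they index into the new, smaller buffer.
    let out_offsets: Vec<i32> = arr.offsets.iter().map(|o| o - first).collect();
    (out_values, out_offsets)
}

/// Usage: a parent with four rows, sliced to the two middle rows.
fn demo() -> (Vec<u8>, Vec<i32>) {
    let values: &[u8] = b"AAABBBCCCDDD";
    let parent_offsets: [i32; 5] = [0, 3, 6, 9, 12];
    // Equivalent to slice(offset = 1, length = 2): rows "BBB" and "CCC".
    let sliced = SlicedStrings {
        values,
        offsets: &parent_offsets[1..4],
    };
    lower_ascii_visible(&sliced)
}
```

In this sketch the conversion touches 6 bytes rather than the parent's 12, so the work is proportional to the visible slice, which is the property the benchmark is meant to confirm.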

Are these changes tested?

Yes.

Are there any user-facing changes?

No.

@github-actions github-actions Bot added the functions Changes to functions implementation label Apr 23, 2026
Contributor

@comphead comphead left a comment

Thanks @neilconway
It would have been great if benchmark details had been added.

@neilconway
Contributor Author

@comphead

  Sliced ASCII
  - parent=8192, slice=128, str_len=32: main 11.83 µs → branch 344 ns — ~34× faster (−97.2%)
  - parent=65536, slice=128, str_len=32: main 93 µs → branch 337 ns — ~275× faster (−99.65%)
  - parent=65536, slice=1024, str_len=32: main 94 µs → branch 1.66 µs — ~57× faster (−98.3%)

  Non-sliced ASCII (lower_all_values_are_ascii)
  - size 1024: main 1.68 µs → branch 1.62 µs — −4.1%
  - size 4096: main 6.27 µs → branch 6.27 µs — −2.0%
  - size 8192: main 12.28 µs → branch 12.21 µs — −0.5%

  Non-sliced non-ASCII, first row non-ASCII (lower_the_first_value_is_nonascii)
  - size 1024: main 20.89 µs → branch 19.79 µs — −5.3%
  - size 4096: main 84.54 µs → branch 79.90 µs — −5.5%
  - size 8192: main 156.10 µs → branch 160.77 µs — +2.5%

  Non-sliced non-ASCII, middle row non-ASCII (lower_the_middle_value_is_nonascii)
  - size 1024: main 20.72 µs → branch 20.30 µs — +0.06% (noise)
  - size 4096: main 84.68 µs → branch 85.42 µs — +1.3%
  - size 8192: main 167.11 µs → branch 165.70 µs — +0.8% (noise)

@comphead comphead added this pull request to the merge queue Apr 24, 2026
Merged via the queue into apache:main with commit 7d5ddca Apr 24, 2026
31 checks passed
@neilconway neilconway deleted the neilc/perf-lower-upper-sliced-arrays branch April 24, 2026 17:28


Development

Successfully merging this pull request may close these issues.

lower, upper is inefficient for sliced arrays
