Sort: improve some of the performances #11412

Merged: cakebaker merged 3 commits into uutils:main from sylvestre:sort-perf on Mar 20, 2026
@sylvestre
Contributor

Closes: #11258

Expose ICU4X's `write_sort_key_utf8_to` via a public helper that appends
a collation sort key to a caller-supplied buffer. This enables computing
each line's sort key once (O(n) key computations in total) and then
comparing cheap byte arrays during sorting, instead of calling
`compare_utf8` on every comparison.

Add `collation_key_buffer` (arena) and `collation_key_ends` (offsets) to
`LineData` and `RecycledChunk`, with a `collation_key()` accessor. All sort
keys for a chunk are stored in a single `Vec<u8>` to avoid millions of
small heap allocations.
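
The arena layout described above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: the `KeyArena` type, its field names, and the `push_with` method are hypothetical, chosen only to mirror the roles of `collation_key_buffer`, `collation_key_ends`, and `collation_key()`.

```rust
/// Hypothetical arena (not the PR's actual types): every key is stored
/// back to back in one buffer, so there is one allocation per chunk
/// rather than one per line.
struct KeyArena {
    /// Concatenated key bytes for all lines.
    buffer: Vec<u8>,
    /// ends[i] is the offset one past the last byte of key i.
    ends: Vec<usize>,
}

impl KeyArena {
    fn new() -> Self {
        Self { buffer: Vec::new(), ends: Vec::new() }
    }

    /// Let the caller append one key directly to the shared buffer,
    /// then record where that key ends.
    fn push_with(&mut self, write: impl FnOnce(&mut Vec<u8>)) {
        write(&mut self.buffer);
        self.ends.push(self.buffer.len());
    }

    /// Borrow key `index` as a byte slice; key i starts where key i-1 ends.
    fn key(&self, index: usize) -> &[u8] {
        let start = if index == 0 { 0 } else { self.ends[index - 1] };
        &self.buffer[start..self.ends[index]]
    }
}
```

Storing only end offsets is enough because each key starts where the previous one ends; an empty key is represented by two equal consecutive offsets.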

Add a `fast_locale_collation` path that pre-computes ICU sort keys during
line parsing (O(n)), then uses cheap byte-array comparison during
sorting (O(n log n)). This replaces per-comparison ICU `compare_utf8`
calls for the common case of plain `sort` with a UTF-8 locale.
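
The general idea of pre-computing keys and then sorting on raw bytes can be illustrated with a small self-contained sketch. Everything here is an assumption for illustration: `append_toy_key` is a hypothetical stand-in for a real collation key (it only lowercases ASCII and appends the original bytes as a tie-breaker), and `sort_by_precomputed_keys` is not a function from uutils. Real ICU sort keys are considerably more involved.

```rust
use std::ops::Range;

/// Hypothetical stand-in for a collation sort key: ASCII-lowercased bytes,
/// a 0 separator, then the original bytes as a tie-breaker.
/// (Assumes lines contain no NUL bytes; a real collator has no such limit.)
fn append_toy_key(line: &str, out: &mut Vec<u8>) {
    out.extend(line.bytes().map(|b| b.to_ascii_lowercase()));
    out.push(0);
    out.extend_from_slice(line.as_bytes());
}

/// Compute each key exactly once, then sort using plain byte comparison.
fn sort_by_precomputed_keys<'a>(lines: &[&'a str]) -> Vec<&'a str> {
    let mut keys: Vec<u8> = Vec::new();
    let mut entries: Vec<(Range<usize>, &'a str)> = Vec::with_capacity(lines.len());

    // Pass 1: one key computation per line, all written into one buffer.
    for &line in lines {
        let start = keys.len();
        append_toy_key(line, &mut keys);
        entries.push((start..keys.len(), line));
    }

    // Pass 2: the comparator only compares byte slices.
    entries.sort_by(|a, b| keys[a.0.clone()].cmp(&keys[b.0.clone()]));
    entries.into_iter().map(|(_, line)| line).collect()
}
```

With the toy key, `["banana", "apple", "Cherry"]` sorts as `apple`, `banana`, `Cherry`, whereas a plain byte sort would put `Cherry` first. The expensive step runs once per line, and the O(n log n) comparisons touch only bytes.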

With 1M lines and LC_ALL=en_US.UTF-8, this is ~2.6x faster than
GNU sort (314ms vs 822ms).
@codspeed-hq

codspeed-hq bot commented Mar 19, 2026

Merging this PR will degrade performance by 45.43%

⚡ 1 improved benchmark
❌ 2 regressed benchmarks
✅ 295 untouched benchmarks
⏩ 48 skipped benchmarks (see footnote 1)

⚠️ Please fix the performance issues or acknowledge them on CodSpeed.

Performance Changes

| Mode | Benchmark | BASE | HEAD | Efficiency |
|---|---|---|---|---|
| Memory | sort_german_de_locale | 1.8 MB | 3.3 MB | -45.43% |
| Simulation | sort_ascii_utf8_locale | 15.2 ms | 16.1 ms | -5.87% |
| Simulation | sort_german_de_locale | 640.5 ms | 131.6 ms | ×4.9 |

Comparing sylvestre:sort-perf (d6baba4) with main (0b437e9)

Footnotes

  1. 48 benchmarks were skipped, so the baseline results were used instead. If they were deleted from the codebase, click here and archive them to remove them from the performance reports.

@sylvestre sylvestre marked this pull request as ready for review March 19, 2026 20:46
@sylvestre sylvestre requested a review from cakebaker March 19, 2026 20:46
@sylvestre
Contributor Author

I am fine trading memory for speed :)

@cakebaker cakebaker merged commit d830e84 into uutils:main Mar 20, 2026
158 of 160 checks passed
@sylvestre sylvestre deleted the sort-perf branch March 20, 2026 14:39
Development

Successfully merging this pull request may close these issues.

Test case where uutils sort is 25 times slower than GNU sort