sort: Replace malloc and 0 fill with huge reserve & min 0 fill #10975

Open
oech3 wants to merge 1 commit into uutils:main from oech3:sort-malloc0
@oech3 oech3 commented Feb 16, 2026

Found at #10954

codspeed-hq bot commented Feb 16, 2026

Merging this PR will improve performance by 64.53%

⚡ 22 improved benchmarks
✅ 266 untouched benchmarks
⏩ 38 skipped benchmarks [1]

Performance Changes

| Mode | Benchmark | BASE | HEAD | Efficiency |
|---|---|---|---|---|
| Simulation | sort_german_c_locale | 37.3 ms | 36.2 ms | +3.1% |
| Simulation | sort_ascii_utf8_locale | 17.6 ms | 15.8 ms | +11.51% |
| Simulation | sort_ascii_c_locale | 18.2 ms | 16 ms | +13.91% |
| Simulation | sort_unique_utf8_locale | 37.3 ms | 36.2 ms | +3.24% |
| Simulation | sort_long_line[160000] | 1,434.1 µs | 871.6 µs | +64.53% |
| Memory | sort_general_numeric[200000] | 23.6 MB | 22.8 MB | +3.3% |
| Memory | sort_mixed_utf8_locale | 2.6 MB | 2.3 MB | +11.26% |
| Memory | sort_dictionary_order[500000] | 22 MB | 20.5 MB | +7.08% |
| Memory | sort_long_line[160000] | 702 KB | 604.4 KB | +16.16% |
| Memory | sort_mixed_c_locale | 2.6 MB | 2.3 MB | +11.26% |
| Memory | sort_german_de_locale | 2.7 MB | 1.8 MB | +49.14% |
| Memory | sort_reverse_locale[500000] | 21.6 MB | 20.5 MB | +5.27% |
| Memory | sort_unique_utf8_locale | 3.7 MB | 3.5 MB | +7.53% |
| Memory | sort_mixed_data[500000] | 21.9 MB | 20.5 MB | +7.1% |
| Memory | sort_ascii_only[500000] | 21.6 MB | 20.4 MB | +5.58% |
| Memory | sort_unique_locale[500000] | 33 MB | 31.9 MB | +3.38% |
| Memory | sort_numeric[500000] | 44.8 MB | 43.3 MB | +3.34% |
| Memory | sort_accented_data[500000] | 21.6 MB | 20.5 MB | +5.27% |
| Memory | sort_ascii_utf8_locale | 5.1 MB | 4.6 MB | +10.16% |
| Memory | sort_german_c_locale | 2.7 MB | 1.8 MB | +49.14% |
| ... | ... | ... | ... | ... |

ℹ️ Only the first 20 benchmarks are displayed.

Comparing oech3:sort-malloc0 (635e73f) with main (298e147)

Footnotes

  1. 38 benchmarks were skipped, so the baseline results were used instead. If they were deleted from the codebase, archive them to remove them from the performance reports.

@oech3 oech3 marked this pull request as ready for review February 16, 2026 12:20

} else if estimated == 1 {
    const LINE_LEN_HINT: usize = 32;
    estimated = (read.len() / LINE_LEN_HINT).max(1);
    const LINE_LEN_HINT: usize = 128;

Contributor

@mattsu2020 how did you decide to use 32 for LINE_LEN_HINT in the existing code?

Contributor Author

We can define a single const MAYBE_LN_CACHE_SIZE: usize and use it everywhere.
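
The estimate discussed in this thread divides the number of bytes read by an assumed average line length, so a hint of 32 predicts four times as many lines as a hint of 128 for the same input. A minimal sketch of that heuristic follows; the helper name `estimate_line_count` and the test input are hypothetical and are not taken from the PR.

```rust
/// Assumed average line length, used only to guess how many lines a buffer holds.
const LINE_LEN_HINT: usize = 128;

/// Guess the number of lines in `read`; the result is always at least 1.
/// (Hypothetical helper, named here for illustration only.)
fn estimate_line_count(read: &[u8]) -> usize {
    (read.len() / LINE_LEN_HINT).max(1)
}

fn main() {
    let input = vec![b'a'; 64 * 1024];
    let estimated = estimate_line_count(&input);

    // Pre-size the vector of line slices so that pushing rarely reallocates.
    let mut lines: Vec<&[u8]> = Vec::with_capacity(estimated);
    lines.extend(input.chunks(LINE_LEN_HINT));

    println!("estimated = {estimated}, stored = {}", lines.len());
}
```

A larger hint reserves less up front and relies on the vector growing when lines are short; a smaller hint reserves more and may over-allocate when lines are long.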

@oech3 oech3 marked this pull request as draft February 16, 2026 13:49
@oech3 oech3 force-pushed the sort-malloc0 branch 4 times, most recently from e875422 to 204214a February 16, 2026 14:37
oech3 commented Feb 16, 2026

I think a mimalloc benchmark is useful for investigating malloc cost, even if we decide not to adopt it.
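
As a dependency-free alternative for looking at allocation cost, one can count allocator calls with a wrapper around the system allocator. This is a sketch of that swapped-in technique, not a mimalloc benchmark; `CountingAlloc` and `alloc_calls_during` are hypothetical names, and the counter is process-wide, so allocations from other threads are included.

```rust
use std::alloc::{GlobalAlloc, Layout, System};
use std::sync::atomic::{AtomicUsize, Ordering};

/// Counts calls to `alloc` and forwards all work to the system allocator.
struct CountingAlloc;

static ALLOC_CALLS: AtomicUsize = AtomicUsize::new(0);

unsafe impl GlobalAlloc for CountingAlloc {
    unsafe fn alloc(&self, layout: Layout) -> *mut u8 {
        ALLOC_CALLS.fetch_add(1, Ordering::Relaxed);
        unsafe { System.alloc(layout) }
    }

    unsafe fn dealloc(&self, ptr: *mut u8, layout: Layout) {
        unsafe { System.dealloc(ptr, layout) }
    }
    // `realloc` is left at its default, which calls `alloc`, so every
    // growth step of a vector is counted as well.
}

#[global_allocator]
static GLOBAL: CountingAlloc = CountingAlloc;

/// Number of `alloc` calls observed while `f` runs.
fn alloc_calls_during<F: FnOnce()>(f: F) -> usize {
    let before = ALLOC_CALLS.load(Ordering::Relaxed);
    f();
    ALLOC_CALLS.load(Ordering::Relaxed) - before
}

fn main() {
    // Growing from empty triggers repeated reallocation.
    let grown = alloc_calls_during(|| {
        let mut v: Vec<u8> = Vec::new();
        for i in 0..100_000u32 {
            v.push(i as u8);
        }
        let _ = std::hint::black_box(&v);
    });

    // Reserving up front needs a single allocation for the same data.
    let reserved = alloc_calls_during(|| {
        let mut v: Vec<u8> = Vec::with_capacity(100_000);
        for i in 0..100_000u32 {
            v.push(i as u8);
        }
        let _ = std::hint::black_box(&v);
    });

    println!("alloc calls: grown = {grown}, reserved = {reserved}");
}
```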

@github-actions

GNU testsuite comparison:

Skipping an intermittent issue tests/tail/inotify-dir-recreate (passes in this run but fails in the 'main' branch)
Note: The gnu test tests/pr/bounded-memory is now being skipped but was previously passing.

@oech3 oech3 marked this pull request as ready for review February 16, 2026 15:17

if buffer.len() < carry_over.len() {
    buffer.resize(carry_over.len() + 10 * 1024, 0);
    buffer.resize(carry_over.len(), 0);
    let new_len = (carry_over.len() * 2)

Contributor

please add a comment explaining why

Contributor Author

added
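
For readers following this thread, one way to combine a minimal zero fill with a larger reservation using the standard `Vec` API is sketched below. The helper name `grow_for_carry_over` and the `extra` parameter are hypothetical, and the sizing logic of the PR itself is not reproduced here.

```rust
/// Hypothetical helper: make `buffer` hold `carry_over`, zero-filling only the
/// bytes that are overwritten immediately and reserving the rest uninitialized.
fn grow_for_carry_over(buffer: &mut Vec<u8>, carry_over: &[u8], extra: usize) {
    if buffer.len() < carry_over.len() {
        // `resize` writes zeros into every new element, so grow only as far
        // as the carried-over data requires.
        buffer.resize(carry_over.len(), 0);
        // `reserve` guarantees capacity for `extra` more bytes without
        // initializing them.
        buffer.reserve(extra);
    }
    buffer[..carry_over.len()].copy_from_slice(carry_over);
}

fn main() {
    let mut buffer: Vec<u8> = Vec::new();
    grow_for_carry_over(&mut buffer, b"leftover line", 10 * 1024);
    println!("len = {}, capacity >= {}", buffer.len(), buffer.len() + 10 * 1024);
}
```

The difference from `buffer.resize(carry_over.len() + 10 * 1024, 0)` is that the extra 10 KiB is never written until real data arrives.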

GeneralBigDecimalParseResult, GlobalSettings, Line, SortMode, numeric_str_cmp::NumInfo,
};

const MAYBE_LN_CACHE_SIZE: usize = 64 * 1024;
Contributor

what does the variable name MAYBE_LN_CACHE_SIZE mean?

Contributor Author

The CPU's L1 cache size.

Contributor Author

renamed

@oech3 oech3 force-pushed the sort-malloc0 branch 3 times, most recently from 3be05e9 to 635e73f February 17, 2026 06:34
@github-actions

GNU testsuite comparison:

GNU test failed: tests/cut/bounded-memory. tests/cut/bounded-memory is passing on 'main'. Maybe you have to rebase?
Skip an intermittent issue tests/pr/bounded-memory (fails in this run but passes in the 'main' branch)

3 participants