Replace sort implementations #124032
- `slice::sort` -> driftsort https://github.com/Voultapher/sort-research-rs/blob/main/writeup/driftsort_introduction/text.md
- `slice::sort_unstable` -> ipnsort https://github.com/Voultapher/sort-research-rs/blob/main/writeup/ipnsort_introduction/text.md

Replaces the sort implementations with tailor-made ones that strike a balance of run-time, compile-time and binary-size, yielding run-time and compile-time improvements. Binary-size regresses for `slice::sort` and improves for `slice::sort_unstable`. The new implementations uphold the existing soft and hard safety guarantees and even extend the soft guarantees: strict weak ordering violations are detected with high probability and reported to users via a panic.

In addition, the implementation of `select_nth_unstable` is adapted, as it uses `slice::sort_unstable` internals.
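To make the panic-on-violation guarantee concrete, here is a minimal sketch of what it can look like from the caller's side. The names `broken_cmp` and `sort_with_broken_cmp` are hypothetical and exist only for this illustration. Whether a particular input actually triggers the panic is not guaranteed (detection is probabilistic and depends on the toolchain), so the sketch handles both outcomes.

```rust
use std::cmp::Ordering;
use std::panic;

/// Hypothetical, deliberately inconsistent comparator: it claims every
/// element is less than every other one, so `a < b` and `b < a` both hold.
fn broken_cmp(_a: &u32, _b: &u32) -> Ordering {
    Ordering::Less
}

/// Sorts a copy of `input` with the broken comparator.
/// Returns `Some(result)` if the sort returned normally and `None` if it
/// panicked (assumes the default `panic = "unwind"` strategy).
fn sort_with_broken_cmp(input: &[u32]) -> Option<Vec<u32>> {
    let mut v = input.to_vec();
    panic::catch_unwind(move || {
        v.sort_by(broken_cmp);
        v
    })
    .ok()
}

fn main() {
    let input: Vec<u32> = (0..1_000).rev().collect();
    match sort_with_broken_cmp(&input) {
        // No panic: the order is unspecified, but no element is lost.
        Some(v) => println!("sort returned with {} elements", v.len()),
        // Panic: the implementation noticed the inconsistent ordering.
        None => println!("sort panicked on the inconsistent ordering"),
    }
}
```

Either way the operation stays memory safe; the panic is an extra diagnostic, not something callers should rely on.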
r? thomcc |
@bors try @rust-timer queue |
@bors try (looks like the try build didn't get through) |
Replace sort implementations

This PR replaces the sort implementations with tailor-made ones that strike a balance of run-time, compile-time and binary-size, yielding run-time and compile-time improvements. Binary-size regresses for `slice::sort` and improves for `slice::sort_unstable`. The new implementations uphold the existing soft and hard safety guarantees and even extend the soft guarantees: strict weak ordering violations are detected with high probability and reported to users via a panic.

* `slice::sort` -> driftsort [design document](https://github.com/Voultapher/sort-research-rs/blob/main/writeup/driftsort_introduction/text.md)
* `slice::sort_unstable` -> ipnsort [design document](https://github.com/Voultapher/sort-research-rs/blob/main/writeup/ipnsort_introduction/text.md)

#### Why should we change the sort implementations?

In the [2023 Rust survey](https://blog.rust-lang.org/2024/02/19/2023-Rust-Annual-Survey-2023-results.html#challenges), one of the questions was: "In your opinion, how should work on the following aspects of Rust be prioritized?". The second place was "Runtime performance" and the third one "Compile Times". This PR aims to improve both.

#### Why is this one big PR and not multiple?

* The current documentation gives performance recommendations for `slice::sort` and `slice::sort_unstable`. If, for example, only one of them were to be changed, this advice might be misleading for some Rust versions. By replacing them atomically, the advice remains largely unchanged, and users don't have to change their code.
* driftsort and ipnsort share a substantial part of their implementations.
* The implementation of `select_nth_unstable` uses internals of `slice::sort_unstable`, which makes it impractical to split changes.

---

This PR is a collaboration with `@orlp`.
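For readers less familiar with the three public APIs affected here, the following is a small sketch of how they differ in use. The helper names `stable_sorted`, `unstable_sorted` and `nth_smallest` are hypothetical wrappers written for this illustration; only the `slice` methods they call are part of the standard library. The sketch shows documented behavior and says nothing about the internals of driftsort or ipnsort.

```rust
/// `slice::sort_by_key` is stable: records with equal keys keep their
/// original relative order.
fn stable_sorted(mut records: Vec<(u32, char)>) -> Vec<(u32, char)> {
    records.sort_by_key(|&(key, _)| key);
    records
}

/// `slice::sort_unstable_by_key` may reorder records with equal keys,
/// so only the key order is guaranteed.
fn unstable_sorted(mut records: Vec<(u32, char)>) -> Vec<(u32, char)> {
    records.sort_unstable_by_key(|&(key, _)| key);
    records
}

/// `slice::select_nth_unstable` places the element that would be at index
/// `n` in sorted order at index `n`, without fully sorting the slice.
/// Panics if `n` is out of bounds.
fn nth_smallest(mut values: Vec<i32>, n: usize) -> i32 {
    let (_smaller, nth, _larger) = values.select_nth_unstable(n);
    *nth
}

fn main() {
    let records = vec![(2, 'a'), (1, 'b'), (2, 'c'), (1, 'd')];

    // Stable: 'b' stays before 'd', and 'a' stays before 'c'.
    let stable = stable_sorted(records.clone());
    assert_eq!(stable, vec![(1, 'b'), (1, 'd'), (2, 'a'), (2, 'c')]);

    // Unstable: only check the keys, the tie order is unspecified.
    let keys: Vec<u32> = unstable_sorted(records).iter().map(|&(k, _)| k).collect();
    assert_eq!(keys, vec![1, 1, 2, 2]);

    // Sorted order would be [1, 2, 3, 7, 8, 9], so index 3 holds 7.
    assert_eq!(nth_smallest(vec![9, 3, 7, 1, 8, 2], 3), 7);
}
```

Because the documented contracts stay the same, code like this should not need to change when the implementations are swapped.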
☀️ Try build successful - checks-actions |
Finished benchmarking commit (a4132fa): comparison URL.

Overall result: ❌ regressions - ACTION NEEDED

Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. While you can manually mark this PR as fit for rollup, we strongly recommend not doing so since this PR may lead to changes in compiler perf.

Next Steps: If you can justify the regressions found in this try perf run, please indicate this with @bors rollup=never

Instruction count: This is a highly reliable metric that was used to determine the overall result at the top of this comment.

Max RSS (memory usage): This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

Cycles: This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

Binary size: This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.
Bootstrap: 679.119s -> 683.166s (0.60%) |
Also includes small doc fixes.
The binary-size results are in line with what we expected, given that Regarding compile-times, we looked at those in our analysis here and here. The instruction count metric does not account for gains via parallelizability, right? |
One thing that I find particularly strange is a 4.31% regression in |
@orlp Nearly all the benchmarks represent the measurement on |
@Urgau Yes, I'm aware; nevertheless a 4.3% regression in compile-time for This indicates that there is likely a hidden sort call embedded in this program that needs extra time to be compiled. I conjecture this is due to the backtrace routine containing a sort call. Also, when I say 'time' I must concur with @Voultapher in that we found that our new sort implementation does use more resources to compile, but may not actually be slower in real-time as more can be compiled in parallel. |
In the past I looked at how much run-time rustc spends inside sort #108005 (comment) and it's less than 0.1% of the total runtime. The regressions are caused by more complex implementations that may take longer to compile, and for example due to backtrace-rs every program contains at least one call to |
It would be nice to add some (even very trivial) runtime benchmark that uses sorting. I know that our runtime benchmark measurement process is far from ideal, but it would be good to see some green result on this PR to confirm that it is an improvement in some basic real world usage. Also, could you please post results of benchmarks from stdlib (IIRC there are some microbenchmarks for sorting?). (Great job with the new sort implementation, btw!) |
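A very trivial runtime benchmark of the kind asked for above could be sketched roughly as follows. This is only an illustration under stated assumptions: `pseudo_random` and `time_sort` are hypothetical helpers, the xorshift generator stands in for a real input distribution, and a single wall-clock measurement is far noisier than what rustc-perf or the rustc-sort-bench runner would record.

```rust
use std::time::{Duration, Instant};

/// Hypothetical helper: small xorshift64 generator so the sketch needs no
/// external crates. `state` must be non-zero.
fn pseudo_random(len: usize, mut state: u64) -> Vec<u64> {
    (0..len)
        .map(|_| {
            state ^= state << 13;
            state ^= state >> 7;
            state ^= state << 17;
            state
        })
        .collect()
}

/// Hypothetical helper: runs `sort` once on a copy of `input` and returns
/// the elapsed wall-clock time together with the sorted data.
fn time_sort(input: &[u64], sort: impl Fn(&mut [u64])) -> (Duration, Vec<u64>) {
    let mut data = input.to_vec();
    let start = Instant::now();
    sort(&mut data);
    (start.elapsed(), data)
}

fn main() {
    let input = pseudo_random(1_000_000, 0x9E37_79B9_7F4A_7C15);

    let (stable_time, stable) = time_sort(&input, |v| v.sort());
    let (unstable_time, unstable) = time_sort(&input, |v| v.sort_unstable());

    // For plain integers both sorts must produce the same result.
    assert_eq!(stable, unstable);

    println!("slice::sort:          {stable_time:?}");
    println!("slice::sort_unstable: {unstable_time:?}");
}
```

Running the same binary with the old and the new toolchain, in release mode, would give a rough before/after comparison for random `u64` inputs only.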
Yeah, I meant adding it to rustc-perf. But maybe it's not needed, given that you already have such an extensive suite. |
@Kobzol as discussed on Zulip, this stand-alone benchmark runner and analysis would be a good fit for the rustc-perf suite https://github.com/Voultapher/sort-research-rs/tree/main/util/rustc-sort-bench. I think we should add it to notice future regressions and improvements. |
Yeah, that would be nice, although I still haven't found time to look at it, sorry (was quite busy lately, and will still be in the upcoming weeks). This is not something that has to block progress on this PR though! |
I concur, let's not block this PR on extending the rustc-perf suite. |
@Kobzol here are the microbenchmark results you asked for on my Zen 3 machine:
I suggest we also run the rustc-perf suite. I would only expect large changes if any of the simulated workloads spend a substantial amount of time in |
We already did. |
I thought those were the compile-time benchmarks, and not the run-time suite. Or are they together in the same suite? |
Wait, never mind I just found the tab in the results page. |
Yeah, they are together. You can see the runtime results in the Runtime tab. But there's nothing much to see here.. :) |