Vec's swap_remove is needlessly slow #52150
Comments
Thanks a lot for looking into this! Comparing the assembly for the two implementations in nightly on godbolt reveals that the only substantial difference is that
@rkruppe Do you know what change causes nightly to be significantly better at reasoning about the bounds check than stable, and whether that functionality is coming to stable soon? Regardless, the benchmarks I posted above were done on nightly, so there still seems to be a performance benefit.
Note that my implementation, which doesn't use
We backport serious bug fixes but not performance fixes. So any performance improvements in nightly get into beta in a six week cycle, and that beta becomes the next stable another six weeks later.
Yes, there is a performance benefit to be had by replacing the current implementation in std. I'm only talking about how much
@rkruppe I think I managed to reduce the unsafe code to a minimum with the following implementation:
    pub fn swap_remove(&mut self, index: usize) -> T {
        unsafe {
            // Indexing is the only bounds check; it panics if index >= len.
            let hole: *mut T = &mut self[index];
            // The check above guarantees len >= 1, so pop() cannot return None.
            std::ptr::replace(hole, self.pop().unwrap())
        }
    }
This still compiles to essentially the same assembly on stable.
That's great! Do you want to add comments describing the motivation and why this unsafe code is sound and submit this optimization as a pull request?
Performance improvement of Vec's swap_remove. The old implementation *literally* swapped and then removed, which resulted in unnecessary move instructions. The new implementation does use unsafe code, but it is easy to see that it is correct. Fixes #52150.
Popping the vector subtracts one from the length, suggesting that the subtraction in the above code may not be required. If you want to access an item in a vector before removing it, that is easy to do, suggesting that returning the item in the above code may not be required. So,
Currently Vec's swap_remove has the following implementation:
This is needlessly slow. The swap does a bounds check, and then pop does another bounds check that never fails. Furthermore, there is an actual swap right before the pop - this results in a lot of useless moves that don't (seem to) get optimized out.
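For reference, the swap-then-pop shape described above can be sketched as a free function. This is a reconstruction from the description, not a verbatim copy of the std source, and the name swap_remove_naive is hypothetical:

```rust
/// Hypothetical free-function sketch of a swap-then-pop removal.
/// Panics if `index` is out of bounds or the vector is empty.
pub fn swap_remove_naive<T>(v: &mut Vec<T>, index: usize) -> T {
    // On an empty vector this subtraction underflows: a panic in debug
    // builds, a wrapped value in release builds that `swap` then rejects.
    let last = v.len() - 1;
    // First bounds check: `swap` verifies both indices.
    v.swap(index, last);
    // Second check: `pop` tests for emptiness, which cannot fail here.
    v.pop().unwrap()
}
```

The full swap writes the removed element into the last slot only for pop to read it straight back out, which is where the extra moves come from.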
It's possible to write an implementation with a safe interface that does only one bounds check and uses two moves, at the cost of a little unsafe code:
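A minimal sketch of such an implementation, written as a free function over Vec<T> so it compiles outside std. The body is an assumption about how the idea can be realized (one explicit check, one read, one copy); it is not necessarily the exact code proposed in this issue:

```rust
use std::ptr;

/// Sketch: remove `v[index]` and fill the hole with the last element.
/// Panics if `index >= v.len()`.
pub fn swap_remove_opt<T>(v: &mut Vec<T>, index: usize) -> T {
    let len = v.len();
    // The single bounds check. It also rules out an empty vector.
    assert!(index < len, "index {} out of bounds for length {}", index, len);
    unsafe {
        let base = v.as_mut_ptr();
        // Move the element out of its slot, leaving a logical hole.
        let removed = ptr::read(base.add(index));
        // Move the last element into the hole. `ptr::copy` tolerates
        // overlap, so `index == len - 1` (a self-copy) is fine.
        ptr::copy(base.add(len - 1), base.add(index), 1);
        // Shrink the length so the stale last slot is never dropped.
        v.set_len(len - 1);
        removed
    }
}
```

Nothing between the read and set_len can panic, so the vector is never observable in a state where one element would be dropped twice.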
I found it quite hard to benchmark this, but on my machine the following benchmark showed the new implementation to be faster:
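A rough way to compare the two on stable Rust is a hand-rolled timing loop. This is only a sketch and not the benchmark from this report, which relied on the unstable test crate; the helper time_drain and the copy of swap_remove_opt are illustrative assumptions, and std::hint::black_box needs a reasonably recent toolchain:

```rust
use std::hint::black_box;
use std::ptr;
use std::time::{Duration, Instant};

/// Same sketch of the optimized removal as above (assumed, not std's code).
pub fn swap_remove_opt<T>(v: &mut Vec<T>, index: usize) -> T {
    let len = v.len();
    assert!(index < len, "index {} out of bounds for length {}", index, len);
    unsafe {
        let base = v.as_mut_ptr();
        let removed = ptr::read(base.add(index));
        ptr::copy(base.add(len - 1), base.add(index), 1);
        v.set_len(len - 1);
        removed
    }
}

/// Hypothetical helper: builds a vector of `n` integers, empties it by
/// repeatedly removing index 0 with `remove`, and returns the elapsed
/// time plus the sum of the removed values as a sanity check.
pub fn time_drain<F>(n: usize, mut remove: F) -> (Duration, u64)
where
    F: FnMut(&mut Vec<u64>, usize) -> u64,
{
    let mut v: Vec<u64> = (0..n as u64).collect();
    let mut sum = 0u64;
    let start = Instant::now();
    while !v.is_empty() {
        // black_box keeps the optimizer from eliding the removals.
        sum = sum.wrapping_add(black_box(remove(black_box(&mut v), 0)));
    }
    (start.elapsed(), sum)
}
```

Calling time_drain(1_000_000, |v, i| v.swap_remove(i)) and time_drain(1_000_000, |v, i| swap_remove_opt(v, i)) several times in a release build gives two timings to compare. If the std implementation on the toolchain in use already contains this optimization, the two should come out close.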
As I said, it's quite hard to benchmark - lots of overhead and a small difference to be observed. Regardless, running it multiple times consistently shows swap_remove_opt to be faster.
Although I can't run the above benchmark on stable due to test being unstable, I conjecture that on stable Rust the difference is much bigger. A quick look at the disassembly should show why: https://godbolt.org/g/fU4Edu
Nightly fares a lot better in the disassembly but, as seen above, still loses out in the benchmark.