-
Notifications
You must be signed in to change notification settings - Fork 12.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
core: check pointer equality when comparing byte slices #33892
Conversation
(rust_highfive has picked a reviewer for you, use r? to override) |
Do you have some benchmarks or such which show this as a performance improvement? Ideally something more than a microbenchmark as well if possible? |
I cannot reliably compile new rustc's myself, so I did some testing in hyper using if req.method().as_ref() == "GET" {
Next::write()
} else {
panic!("only can handle GET")
} vs let s = req.method().as_ref();
// interestingly, using the literal `"GET"` here had a different pointer address.
// i assume since the examples are compiled as a separate crate, they each had
// their own static copy of the string
if s.len() == 3 && s.as_ptr() == ::hyper::Get.as_ref().as_ptr() {
Next::write()
} else {
panic!("only can handle GET")
} Compiled with release, using 1 thread, on a basic linode box, and then ran Without pointer comparison: Requests/sec 29,885 This is versus a comparison of a 3 byte string. I imagine performance would be even more noticeable when the string is a more likely size, like 10 or 20 bytes (I'm thinking header names, personally). |
@seanmonstar Nice catch! I have to admit that I just assumed that this was done for the performance gains. Do we know if this change will apply to strings as well? |
@shepmaster it does. The str_eq lang item calls this function. |
Great! One more easy question: why does this optimization not apply in the non- |
I don't know. Maybe it does. I wasn't sure I knew the implications of altering the non-bitwise equality impl, but I did know I wanted to improve string literals. If it could also improve other slices, we can open a second PR? |
⌛ Testing commit 6af17e6 with merge 6e13e51... |
💔 Test failed - auto-linux-64-cross-armhf |
@bors: retry On Tue, May 31, 2016 at 3:40 PM, bors notifications@github.com wrote:
|
…chton core: check pointer equality when comparing byte slices If pointer address and length are the same, it should be the same slice. In experiments, I've seen that this doesn't happen as often in debug builds, but release builds seem to optimize to using a single pointer more often.
core: check for pointer equality when comparing Eq slices Because `Eq` types must be reflexively equal, an equal-length slice to the same memory location must be equal. This is related to rust-lang#33892 (and rust-lang#32699) answering this comment from that PR: > Great! One more easy question: why does this optimization not apply in the non-BytewiseEquality implementation directly above? Because slices of non-reflexively equal types (like `f64`) are not equal even if it's the same slice. But if the types are `Eq`, we can use this same-address optimization, which this PR implements. Obviously this changes behavior if types violate the reflexivity condition of `Eq`, because their impls of `PartialEq` will no longer be called per-item, but 🤷♂ . It's not clear how often this optimization comes up in the real world outside of the same-`&str` case covered by rust-lang#33892, so **I'm requesting a perf run** (on MacOS today, so can't run `rustc_perf` myself). I'm going ahead and making the PR on the basis of being surprised things didn't already work this way. This is my first time hacking rust itself, so as a perf sanity check I ran `./x.py bench --stage 0 src/lib{std,alloc}`, but the differences were noisy. To make the existing specialization for `BytewiseEquality` explicit, it's now a supertrait of `Eq + Copy`. `Eq` should be sufficient, but `Copy` was included for clarity.
If pointer address and length are the same, it should be the same slice.
In experiments, I've seen that this doesn't happen as often in debug builds, but release builds seem to optimize to using a single pointer more often.