Join GitHub today
GitHub is home to over 31 million developers working together to host and review code, manage projects, and build software together.
Sign upAdd unsafe Option::unwrap_unchecked() #1095
Conversation
dschatzberg
referenced this pull request
Apr 28, 2015
Closed
Add unsafe Option::unwrap_unchecked() #24905
lilyball
reviewed
Apr 28, 2015
| bike-shedding). It is implemented quite simply by using the | ||
| `unreachable` intrinsic to indicate that the `Option` cannot be `None` | ||
| and then calling `unwrap()`. I expect that the compiler can them | ||
| optimize out the conditional. A prototype implement can be found |
This comment has been minimized.
This comment has been minimized.
lilyball
Apr 28, 2015
Contributor
Have you verified that the compiler does in fact optimize it away?
This comment has been minimized.
This comment has been minimized.
dschatzberg
Apr 28, 2015
Author
This is dependent on a lot of different cases. If I build a library as follows:
#![feature(core)]
use std::intrinsics;
pub fn my_unwrap(o: Option<i32>) -> i32 {
match o {
Some(v) => v,
None => panic!("panic")
}
}
pub unsafe fn my_unwrap_unsafe(o: Option<i32>) -> i32 {
match o {
Some(v) => v,
None => intrinsics::unreachable()
}
}
$ rustc -O --crate-type=lib unwrap.rs
$ objdump -d libunwrap.rlib
0000000000000000 <_ZN16my_unwrap_unsafe20h8f4545557bc4e296MaaE>:
0: 48 c1 ef 20 shr $0x20,%rdi
4: 89 f8 mov %edi,%eax
6: c3 retq
You can see no conditionals in the unwrap_unsafe asm. Further testing on the actual standard library implementation would be needed to confirm, but at least one method of implementing it seems to work as I expect.
ghost
reviewed
Apr 28, 2015
| if self.nelem == 0 { | ||
| return None; | ||
| } | ||
| self.head.as_ref().map(|head| { |
This comment has been minimized.
This comment has been minimized.
ghost
Apr 28, 2015
Well, this is a curious one. It seems like the first if isn't necessary at all and is there either by accident or it's an optimisation itself.
I believe the community's consensus is that in principle unwrap() is not a good idiom, and in all likelihood neither would unwrap_unchecked(). If there's an implicit contract about an Option value always being a Some or None based on some other state (or, in general, an enum value always being of a certain variant based on some other state), then the state machine is perhaps not represented properly (the two semi-related enumerated datums should in those cases be consolidated into one).
This comment has been minimized.
This comment has been minimized.
dschatzberg
Apr 28, 2015
Author
The first if is necessary because the iterator is double ended. Without checking it, next will only return None at the tail. The argument against unwrap() is about the fact that it panics (as far as I can tell). This is not a concern for this proposal.
I'd be curious to see your recommendation as to how we modify the linked_list and its iterator to consolidate the semi-related datums. Otherwise, I see unwrap_unchecked or an alternative as the right way to do it.
This comment has been minimized.
This comment has been minimized.
EdorianDark
May 14, 2015
The argument against
unwrap()is about the fact that it panics (as far as I can tell). This is not a concern for this proposal.
A unwrap() would lead to a panic, unwrap_unchecked() would lead to memory override or other corruption in case of a bug. Memory corruption sounds worse than panic to me. Therefore i would think, that to use unwrap_unchecked() is a even worse idiom than pure unwrap().
ghost
reviewed
Apr 28, 2015
| } | ||
|
|
||
| I feel that this is less desirable as it causes right shift in the | ||
| code and has poor ergonomics. Forcing developers down this path will |
This comment has been minimized.
This comment has been minimized.
ghost
Apr 28, 2015
Ergonomics are only really important for common use cases, by definition. It seems like it's not a very compelling argument given that I believe this is nonetheless a rather uncommon one.
This comment has been minimized.
This comment has been minimized.
dschatzberg
Apr 28, 2015
Author
Ergonomics are about efficiency (by definition). A common use case or not, all else being equal I would prefer something to be more efficient to type than not. I also presented an example from the standard library where it seems that the developer either forgot this optimization was possible or avoided it due to the ugliness involved.
This comment has been minimized.
This comment has been minimized.
bluss
May 1, 2015
I know one such place, the chars iterator. I've checked it again and I didn't make any performance gains by trying to do this kind of unchecked unwrap.
This comment has been minimized.
This comment has been minimized.
ruuda
May 7, 2015
I think we agree that the situation where the programmer has extra information to prove that the option is not None is relatively uncommon, but suppose the programmer has this information. Then the case where avoiding the check will have a performance benefit will be even more uncommon. (If this is called only once, why bother?) But suppose there is a performance benefit here. Then the case where the programmer cares about performance will be even more uncommon. (Just that Rust is a systems language does not mean all code is performance-critical.)
If you are in this really specific situation where you have extra information, there is a performance benefit, you have measured it, and it is critical, then it wouldn’t be so bad to manually do the match, right?
This comment has been minimized.
This comment has been minimized.
bluss
May 7, 2015
I agree @ruud-v-a. And there are more enums than just Option that you want to do this with.
This comment has been minimized.
This comment has been minimized.
tbu-
May 7, 2015
Contributor
@ruud-v-a Consider this article, it advocates knowing where the performance-critical parts of programs are and not requiring measuring to allow optimizing.
This comment has been minimized.
This comment has been minimized.
dschatzberg
May 7, 2015
Author
@ruud-v-a I disagree with the assertion that the case where avoiding the check will have a binary performance benefit or not and I also disagree that the amount a programmer cares is similarly a binary value.
Our goal is to make efficient code easy to write. I think in a vacuum, having this method makes it more likely that developers will do this optimization when possible. The drawback is that it's yet another method of Option (of which there are many). So we must ask, does the advantage gained outweigh the disadvantage? I don't see it being simplified any further.
@bluss Yes, there are more cases to apply this. I felt that the RFC should cover a small case where I have a concrete example to have this debate.
This comment has been minimized.
This comment has been minimized.
seanmonstar
May 7, 2015
Contributor
The cases where it could make no difference is when llvm is able to optimize 2 checks into 1 (such as you already check is_some() earlier in the function).
This comment has been minimized.
This comment has been minimized.
|
Other possible alternative: If we added branch weights, such as |
This comment has been minimized.
This comment has been minimized.
reem
commented
Apr 30, 2015
|
One benefit of adding this is that it would allow stable users to (unsafely) express unreachability without needing the unstable |
This comment has been minimized.
This comment has been minimized.
|
I've definitely wanted this before for expressing the invariants in data structures (stuff like "if I'm at this point in the program the root/parent must not be null"). Not for any particularly profiled reason, though. Just a vague sense of "I don't like having the code think it can panic here". However if the intrinsic was properly exposed I believe this could just be written as |
This comment has been minimized.
This comment has been minimized.
|
I'm interested to see specific code in std where this optimization helps. |
This comment has been minimized.
This comment has been minimized.
|
@brson How is what you're asking different from what is already in the RFC? The example I provided is from std. |
This comment has been minimized.
This comment has been minimized.
|
You can also use |
alexcrichton
added
the
T-libs
label
May 14, 2015
This comment has been minimized.
This comment has been minimized.
|
As far as I can tell this can be done in a library outside of |
This comment has been minimized.
This comment has been minimized.
|
@SimonSapin That can be said for all things in the standard library. :) |
This comment has been minimized.
This comment has been minimized.
bluss
commented
May 28, 2015
|
I actually think the explicit match is desireable. It would be much better if we suggested this pattern: let inner_value = match opt {
Some(x) => x,
None => debug_assert_unreachable("This is a bug"),
};I.e. we encourage the user to check their assumptions in the code with debug assertions, this particular macro or function after reem's model would behave like debug_assert!() in debug builds and be the unreachable intrinsic in release builds. |
This comment has been minimized.
This comment has been minimized.
|
@bluss One could make a functions which does exactly this to encourage the pattern. :) |
Gankro
self-assigned this
Jun 5, 2015
alexcrichton
added
the
final-comment-period
label
Jun 10, 2015
This comment has been minimized.
This comment has been minimized.
|
This RFC is now entering the final-comment-period for one week. |
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
|
@SimonSapin yeah, I was just thinking that we'd want to pursue stabilizing that intrinsic (in a different location) instead of stabilizing a one-off method on |
This comment has been minimized.
This comment has been minimized.
|
I am unconvinced that I'd even want to do this optimization at all. It's such a tiny gain (avoids a single, trivially predictable branch), and the consequences for getting it wrong (i.e., the value being I am even less convinced that this optimization needs to be codified in a method in I'd appreciate hard data (i.e., well-done benchmarks) showing a measurable performance gain in any non-pathological code --- ideally, a real library or application. But even with that I'd be inclined to say "well that's nice but you can do the same thing yourself, so why bake it into std for all eternity?" Overall, |
This comment has been minimized.
This comment has been minimized.
reem
commented
Jun 10, 2015
|
Also of note: you can now have an unreachable optimization hint in stable code through the |
This comment has been minimized.
This comment has been minimized.
Uh, that looks crazy magic to me. Does the complier really eliminate branches that cause transmuting to |
This comment has been minimized.
This comment has been minimized.
reem
commented
Jun 10, 2015
|
@SimonSapin it works because rustc emits an unreachable hint after all calls to diverging functions. |
This comment has been minimized.
This comment has been minimized.
bluss
commented
Jun 10, 2015
|
@SimonSapin rustc's part of the magic is this, it simply emits unreachable when matching an empty enum: fn unreachable(v: Void) -> ! {
match v { /* empty */ }
} |
This comment has been minimized.
This comment has been minimized.
|
I see, thanks for the explanation! |
This comment has been minimized.
This comment has been minimized.
|
Thanks again for the RFC @dschatzberg! The consensus of the library subteam is to not merge this RFC at this time due to the worries about composability and how stabilizing an |
alexcrichton
closed this
Jun 16, 2015
This comment has been minimized.
This comment has been minimized.
|
Thanks for the response, @alexcrichton. Could you clarify what the worries about |
This comment has been minimized.
This comment has been minimized.
|
Ah it's what I meant in a previous comment about how an |
dschatzberg commentedApr 28, 2015
No description provided.