Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add unsafe Option::unwrap_unchecked() #1095

Closed
wants to merge 1 commit into from

Conversation

Projects
None yet
@dschatzberg
Copy link

dschatzberg commented Apr 28, 2015

No description provided.

bike-shedding). It is implemented quite simply by using the
`unreachable` intrinsic to indicate that the `Option` cannot be `None`
and then calling `unwrap()`. I expect that the compiler can them
optimize out the conditional. A prototype implement can be found

This comment has been minimized.

@lilyball

lilyball Apr 28, 2015

Contributor

Have you verified that the compiler does in fact optimize it away?

This comment has been minimized.

@dschatzberg

dschatzberg Apr 28, 2015

Author

This is dependent on a lot of different cases. If I build a library as follows:

#![feature(core)]

use std::intrinsics;

pub fn my_unwrap(o: Option<i32>) -> i32 {
    match o {
        Some(v) => v,
        None => panic!("panic")
    }
}

pub unsafe fn my_unwrap_unsafe(o: Option<i32>) -> i32 {
    match o {
        Some(v) => v,
        None => intrinsics::unreachable()
    }
}


$ rustc -O --crate-type=lib unwrap.rs
$ objdump -d libunwrap.rlib
0000000000000000 <_ZN16my_unwrap_unsafe20h8f4545557bc4e296MaaE>:
   0:   48 c1 ef 20             shr    $0x20,%rdi
   4:   89 f8                   mov    %edi,%eax
   6:   c3                      retq   

You can see no conditionals in the unwrap_unsafe asm. Further testing on the actual standard library implementation would be needed to confirm, but at least one method of implementing it seems to work as I expect.

if self.nelem == 0 {
return None;
}
self.head.as_ref().map(|head| {

This comment has been minimized.

@ghost

ghost Apr 28, 2015

Well, this is a curious one. It seems like the first if isn't necessary at all and is there either by accident or it's an optimisation itself.

I believe the community's consensus is that in principle unwrap() is not a good idiom, and in all likelihood neither would unwrap_unchecked(). If there's an implicit contract about an Option value always being a Some or None based on some other state (or, in general, an enum value always being of a certain variant based on some other state), then the state machine is perhaps not represented properly (the two semi-related enumerated datums should in those cases be consolidated into one).

This comment has been minimized.

@dschatzberg

dschatzberg Apr 28, 2015

Author

The first if is necessary because the iterator is double ended. Without checking it, next will only return None at the tail. The argument against unwrap() is about the fact that it panics (as far as I can tell). This is not a concern for this proposal.

I'd be curious to see your recommendation as to how we modify the linked_list and its iterator to consolidate the semi-related datums. Otherwise, I see unwrap_unchecked or an alternative as the right way to do it.

This comment has been minimized.

@EdorianDark

EdorianDark May 14, 2015

The argument against unwrap() is about the fact that it panics (as far as I can tell). This is not a concern for this proposal.

A unwrap() would lead to a panic, unwrap_unchecked() would lead to memory override or other corruption in case of a bug. Memory corruption sounds worse than panic to me. Therefore i would think, that to use unwrap_unchecked() is a even worse idiom than pure unwrap().

}

I feel that this is less desirable as it causes right shift in the
code and has poor ergonomics. Forcing developers down this path will

This comment has been minimized.

@ghost

ghost Apr 28, 2015

Ergonomics are only really important for common use cases, by definition. It seems like it's not a very compelling argument given that I believe this is nonetheless a rather uncommon one.

This comment has been minimized.

@dschatzberg

dschatzberg Apr 28, 2015

Author

Ergonomics are about efficiency (by definition). A common use case or not, all else being equal I would prefer something to be more efficient to type than not. I also presented an example from the standard library where it seems that the developer either forgot this optimization was possible or avoided it due to the ugliness involved.

This comment has been minimized.

@bluss

bluss May 1, 2015

I know one such place, the chars iterator. I've checked it again and I didn't make any performance gains by trying to do this kind of unchecked unwrap.

This comment has been minimized.

@ruuda

ruuda May 7, 2015

I think we agree that the situation where the programmer has extra information to prove that the option is not None is relatively uncommon, but suppose the programmer has this information. Then the case where avoiding the check will have a performance benefit will be even more uncommon. (If this is called only once, why bother?) But suppose there is a performance benefit here. Then the case where the programmer cares about performance will be even more uncommon. (Just that Rust is a systems language does not mean all code is performance-critical.)

If you are in this really specific situation where you have extra information, there is a performance benefit, you have measured it, and it is critical, then it wouldn’t be so bad to manually do the match, right?

This comment has been minimized.

@bluss

bluss May 7, 2015

I agree @ruud-v-a. And there are more enums than just Option that you want to do this with.

This comment has been minimized.

@tbu-

tbu- May 7, 2015

Contributor

@ruud-v-a Consider this article, it advocates knowing where the performance-critical parts of programs are and not requiring measuring to allow optimizing.

This comment has been minimized.

@dschatzberg

dschatzberg May 7, 2015

Author

@ruud-v-a I disagree with the assertion that the case where avoiding the check will have a binary performance benefit or not and I also disagree that the amount a programmer cares is similarly a binary value.

Our goal is to make efficient code easy to write. I think in a vacuum, having this method makes it more likely that developers will do this optimization when possible. The drawback is that it's yet another method of Option (of which there are many). So we must ask, does the advantage gained outweigh the disadvantage? I don't see it being simplified any further.

@bluss Yes, there are more cases to apply this. I felt that the RFC should cover a small case where I have a concrete example to have this debate.

This comment has been minimized.

@seanmonstar

seanmonstar May 7, 2015

Contributor

The cases where it could make no difference is when llvm is able to optimize 2 checks into 1 (such as you already check is_some() earlier in the function).

@seanmonstar

This comment has been minimized.

Copy link
Contributor

seanmonstar commented Apr 28, 2015

Other possible alternative: If we added branch weights, such as llvm.expect, we could expect the Some case, which would help the compiler to make it the fast case.

@reem

This comment has been minimized.

Copy link

reem commented Apr 30, 2015

One benefit of adding this is that it would allow stable users to (unsafely) express unreachability without needing the unstable intrinsics API.

@Gankro

This comment has been minimized.

Copy link
Contributor

Gankro commented Apr 30, 2015

I've definitely wanted this before for expressing the invariants in data structures (stuff like "if I'm at this point in the program the root/parent must not be null"). Not for any particularly profiled reason, though. Just a vague sense of "I don't like having the code think it can panic here".

However if the intrinsic was properly exposed I believe this could just be written as foo.unwrap_or_else(intrinsics::unreachable). This is easy for someone to make a function/extension trait for if they want better semantics. As such it seems like a very minor ergonomic benefit to expose a dedicated method for.

@brson

This comment has been minimized.

Copy link
Contributor

brson commented May 5, 2015

I'm interested to see specific code in std where this optimization helps.

@dschatzberg

This comment has been minimized.

Copy link
Author

dschatzberg commented May 5, 2015

@brson How is what you're asking different from what is already in the RFC? The example I provided is from std.

@tbu-

This comment has been minimized.

Copy link
Contributor

tbu- commented May 6, 2015

You can also use .unwrap_or_else(|| intrinsics::unreachable())

@alexcrichton alexcrichton added the T-libs label May 14, 2015

@SimonSapin

This comment has been minimized.

Copy link
Contributor

SimonSapin commented May 28, 2015

As far as I can tell this can be done in a library outside of std, with the only downside that you’d have to use a trait where you use it.

@tbu-

This comment has been minimized.

Copy link
Contributor

tbu- commented May 28, 2015

@SimonSapin That can be said for all things in the standard library. :)

@bluss

This comment has been minimized.

Copy link

bluss commented May 28, 2015

I actually think the explicit match is desireable. It would be much better if we suggested this pattern:

let inner_value = match opt {
    Some(x) => x,
    None => debug_assert_unreachable("This is a bug"),
};

I.e. we encourage the user to check their assumptions in the code with debug assertions, this particular macro or function after reem's model would behave like debug_assert!() in debug builds and be the unreachable intrinsic in release builds.

@tbu-

This comment has been minimized.

Copy link
Contributor

tbu- commented May 29, 2015

@bluss One could make a functions which does exactly this to encourage the pattern. :)

@Gankro Gankro self-assigned this Jun 5, 2015

@alexcrichton

This comment has been minimized.

Copy link
Member

alexcrichton commented Jun 10, 2015

This RFC is now entering the final-comment-period for one week.

@alexcrichton

This comment has been minimized.

Copy link
Member

alexcrichton commented Jun 10, 2015

I personally see this as a minor convenience that doesn't necessarily pull its weight as a method, I'd prefer either @tbu-'s or @bluss's suggestion which would involve stabilizing this form of "unsafe unreachable" in a different manner.

@SimonSapin

This comment has been minimized.

Copy link
Contributor

SimonSapin commented Jun 10, 2015

I'd prefer either @tbu-'s or @bluss's suggestion which would involve stabilizing this form of "unsafe unreachable" in a different manner.

Isn’t that std::intrinsics::unreachable?

@alexcrichton

This comment has been minimized.

Copy link
Member

alexcrichton commented Jun 10, 2015

@SimonSapin yeah, I was just thinking that we'd want to pursue stabilizing that intrinsic (in a different location) instead of stabilizing a one-off method on Option.

@rkruppe

This comment has been minimized.

Copy link
Member

rkruppe commented Jun 10, 2015

I am unconvinced that I'd even want to do this optimization at all. It's such a tiny gain (avoids a single, trivially predictable branch), and the consequences for getting it wrong (i.e., the value being None) are pretty dire, up to and including interpreting arbitrary bits as as pointer. That's a pretty terribly ratio of potential danger to potential benefit.

I am even less convinced that this optimization needs to be codified in a method in std, especially when it's trivial to express otherwise on nightly. I don't think there is any pressing need to get the possibility of this optimization into stable: There are already many unstable optimizations (e.g., various collection APIs, other intrinsics) that are far more widely applicable, and unreachable will be stabilized in some form or another at some point in the future. If anyone wants this optimization on stable, working to stabilize std::intrinsics::unreachable would probably be more effective as @alexcrichton already said.

I'd appreciate hard data (i.e., well-done benchmarks) showing a measurable performance gain in any non-pathological code --- ideally, a real library or application. But even with that I'd be inclined to say "well that's nice but you can do the same thing yourself, so why bake it into std for all eternity?"

Overall, 👎

@reem

This comment has been minimized.

Copy link

reem commented Jun 10, 2015

Also of note: you can now have an unreachable optimization hint in stable code through the unreachable crates.io crate, which provides the same interface as std::intrinsincs::unreachable but doesn't need any unstable code.

@SimonSapin

This comment has been minimized.

Copy link
Contributor

SimonSapin commented Jun 10, 2015

the unreachable crates.io crate

Uh, that looks crazy magic to me. Does the complier really eliminate branches that cause transmuting to Void? Reliably?

@reem

This comment has been minimized.

Copy link

reem commented Jun 10, 2015

@SimonSapin it works because rustc emits an unreachable hint after all calls to diverging functions.

@bluss

This comment has been minimized.

Copy link

bluss commented Jun 10, 2015

@SimonSapin rustc's part of the magic is this, it simply emits unreachable when matching an empty enum:

fn unreachable(v: Void) -> ! {
    match v { /* empty */ }
}
@SimonSapin

This comment has been minimized.

Copy link
Contributor

SimonSapin commented Jun 11, 2015

I see, thanks for the explanation!

@alexcrichton

This comment has been minimized.

Copy link
Member

alexcrichton commented Jun 16, 2015

Thanks again for the RFC @dschatzberg! The consensus of the library subteam is to not merge this RFC at this time due to the worries about composability and how stabilizing an unreachable-like function in the standard library (or using @reem's crate) may be the best way to go here.

@dschatzberg

This comment has been minimized.

Copy link
Author

dschatzberg commented Jun 16, 2015

Thanks for the response, @alexcrichton. Could you clarify what the worries about
composability are?

@alexcrichton

This comment has been minimized.

Copy link
Member

alexcrichton commented Jun 16, 2015

Ah it's what I meant in a previous comment about how an unreachable function of some form would be more composable instead of only having this method on Option itself.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.