Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

null fn check incorrect on release builds #40913

Closed
rillian opened this issue Mar 29, 2017 · 20 comments
Closed

null fn check incorrect on release builds #40913

rillian opened this issue Mar 29, 2017 · 20 comments
Labels
regression-from-stable-to-nightly Performance or correctness regression from stable to nightly. T-lang Relevant to the language team, which will review and decide on the PR/issue.

Comments

@rillian
Copy link
Contributor

rillian commented Mar 29, 2017

Testing Firefox against rustc 1.17.0-nightly (ccce2c6 2017-03-27) I have a unit test failure where it looks like a nullptr check on a function pointer in a struct passed over FFI doesn't work correctly. The failing test is supposed to check for a null callback function pointer and return null in that case. Instead it returns an allocated context object.

One can verify with ./mach gtest rust.MP4MetadataEmpty with ac_add_options --enable-optimize in mozconfig.

See also https://bugzilla.mozilla.org/show_bug.cgi?id=1351497

@rillian
Copy link
Contributor Author

rillian commented Mar 29, 2017

Here is a simplified testcase: https://play.rust-lang.org/?gist=2438d07552b441017c9a870e347edef2

This code should print "Ok!" and does on debug builds, and stable release builds, but not on nightly release builds.

type ReadFn = extern fn(*mut u8, usize) -> isize;

#[repr(C)]
#[derive(Clone)]
pub struct mp4parse_io {
    pub read: ReadFn,
}

unsafe extern fn mp4parse_new(io: *const mp4parse_io) -> *mut mp4parse_io {
    if ((*io).read as *mut std::os::raw::c_void).is_null() {
        return std::ptr::null_mut();
    }
    let parser = Box::new( (*io).clone() );

    Box::into_raw(parser)
}

fn boom() {
    let parser = unsafe {
        let io = mp4parse_io {
            read: std::mem::transmute::<*const (), ReadFn>(std::ptr::null()),
        };
        mp4parse_new(&io)
    };
    assert!(parser.is_null());
}

fn main() {
    println!("Testing null read callback... ");
    boom();
    println!("Ok!");
}

Note: I'm willing to believe the code is bad, but I don't understand how, so looking for guidance either way. :)

@steveklabnik steveklabnik added regression-from-stable-to-nightly Performance or correctness regression from stable to nightly. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. I-nominated labels Mar 29, 2017
@steveklabnik
Copy link
Member

Tagging and nominating.

@rillian
Copy link
Contributor Author

rillian commented Mar 29, 2017

I did a bit more experimentation. The function call between creating the struct and the null check is important. The .clone() and #[repr(C)] are not.

@jonas-schievink
Copy link
Contributor

That transmute seems like UB, no?

Anyways, nightly produces this for the null check:

  %4 = load i64 (i8*, i64)*, i64 (i8*, i64)** %3, !dbg !275, !nonnull !49
  %5 = bitcast i64 (i8*, i64)* %4 to i8*, !dbg !275
  %6 = call zeroext i1 @"_ZN4core3ptr31_$LT$impl$u20$$BP$mut$u20$T$GT$7is_null17hdd9ae6e9ddd6f4bcE"(i8* %5), !dbg !275

...and Beta this:

  %4 = load i64 (i8*, i64)*, i64 (i8*, i64)** %3, !dbg !254
  %5 = bitcast i64 (i8*, i64)* %4 to i8*, !dbg !254
  %6 = call zeroext i1 @"_ZN4core3ptr31_$LT$impl$u20$$BP$mut$u20$T$GT$7is_null17hdd9ae6e9ddd6f4bcE"(i8* %5), !dbg !254

So it's pretty clear what's happening :)

@rillian
Copy link
Contributor Author

rillian commented Mar 29, 2017

The transmute is to represent receiving a null function pointer over FFI from C, just to make the testcase easier to run.

@Diggsey
Copy link
Contributor

Diggsey commented Mar 29, 2017

@rillian function pointers are not allowed to be null in rust. Your transmute is therefore UB.

If you want to represent nullable function pointers, use Option<ReadFn> instead - it is guaranteed to be layout compatible.

@retep998
Copy link
Member

Winapi consistently uses Option<unsafe extern "something" fn(...) -> ...> across the board because it has a compatible representation with C function pointers, gracefully handles null pointers, and the unsafe bit ensures that code knows what it is doing before it invokes a function pointer provided by the system.

Rust function pointers are like Rust references, they can never be null. There is also no raw pointer equivalent of a Rust function pointer, so Option is the only way to handle nullable function pointers.

kinetiknz added a commit to kinetiknz/mp4parse-rust that referenced this issue Mar 30, 2017
This addresses https://bugzilla.mozilla.org/show_bug.cgi?id=1351497, which
was filed as a rustc bug in rust-lang/rust#40913,
but revealed to be undefined behaviour that happened to work with earlier
compiler versions.

As the comment removed in this patch describes, the only reason this code
existed was to work around a limitation of rusty-cheddar.
kinetiknz added a commit to kinetiknz/mp4parse-rust that referenced this issue Mar 30, 2017
This addresses https://bugzilla.mozilla.org/show_bug.cgi?id=1351497, which
was filed as a rustc bug in rust-lang/rust#40913,
but revealed to be undefined behaviour that happened to work with earlier
compiler versions.

As the comment removed in this patch describes, the only reason this code
existed was to work around a limitation of rusty-cheddar.
@arielb1
Copy link
Contributor

arielb1 commented Mar 30, 2017

I marked function pointer loads as nonnull in one of the commits in #39628 (I thought I would be using that codepath for vtables, where we have nonnull today I wanted to preserve - not that I think it helps anybody - but eventually ended up not using it).

However, null function pointers are UB since, well, forever (see Option<fn()>).

@rillian
Copy link
Contributor Author

rillian commented Mar 30, 2017

Thanks @arielb1. Do you think it would be reasonable to revert the nonnull change and maintain the current behaviour?

That's three opinions for this being a new optimization, instead of a regression, but the null ptr optimization of Option<extern fn ...> being stable ABI. We were aware of that possibility, and didn't use it just because of missing support in our binding generator (Sean1708/rusty-cheddar#30). However, there is an advantage to being able to call is_null() on function pointers.

In our code the function pointer isn't nullable semantically. It's an error for it to be invalid, which is what the unit test was verifying. Since the struct is filled out by unsafe C code, the value in the struct can certainly be null, but I like being able to verify that once at initialization time instead of checking at every callsite. If the compiler is going to assume function pointers are nonnull even in unsafe code, then I would need two different structs to express that, one with an Option<extern fn> for ffi and another with just extern fn for the Rust side of the code, but otherwise identical.

@retep998
Copy link
Member

retep998 commented Mar 30, 2017

Because Option<fn(...)> has had the right representation for at least as long as Rust has been stable, and Rust has held the position that function pointers should never be null for at least that long as well, I'd be in favor of simply keeping the nonnull change and focusing on fixing the ecosystem to ensure people don't make this mistake. If anything, this change forced at least one project to realize they were using function pointers correctly, so maybe we can do more to detect this scenario at compile time, perhaps via a lint that detects transmutes from *const T to fn(...) and warns you that you should be doing *const T to Option<fn(...)> instead.

@Diggsey
Copy link
Contributor

Diggsey commented Mar 30, 2017

@rillian If the C library is filling in the struct then you must use Option<fn(...)>, or else the construction of the struct on the rust side with null is UB before you even call the C library.

Also, if returning a non-null function pointer is part of the contract of the C library, then it would make more sense to verify that in the test suite for the C library, rather than on the rust side, which avoids the problem you are having.

Unsafe doesn't mean that all of the constraints, like the aliasing rules are allowed to be broken - anything guaranteed to the compiler in safe rust is also guaranteed in unsafe rust. Unsafe rust just means that the compiler will allow you to do things which it can't prove don't invalidate those constraints. Actually invalidating them would still be UB.

@pnkfelix pnkfelix added the T-lang Relevant to the language team, which will review and decide on the PR/issue. label Mar 30, 2017
@pnkfelix
Copy link
Member

(nominating for lang team discussion, mostly to check whether lang team is satisfied with status quo of null fn ptr's being undefined behavior.)

@rillian
Copy link
Contributor Author

rillian commented Mar 30, 2017

Thanks pnkfelix.

@retep998
Copy link
Member

retep998 commented Mar 30, 2017

If we made null function pointers legal then None == Some(func) where func is a null function pointer would have to evaluate to true, otherwise that would break existing representation guarantees, but in turn it would break invariants about enum variants.

@pnkfelix
Copy link
Member

(to be clear, my inclination is to leave null fn ptr's as undefined behavior... i just wanted to make sure the lang team was aware of the matter.)

kinetiknz added a commit to kinetiknz/mp4parse-rust that referenced this issue Mar 31, 2017
This addresses https://bugzilla.mozilla.org/show_bug.cgi?id=1351497, which
was filed as a rustc bug in rust-lang/rust#40913,
but revealed to be undefined behaviour that happened to work with earlier
compiler versions.

As the comment removed in this patch describes, the only reason this code
existed was to work around a limitation of rusty-cheddar.
@arielb1
Copy link
Contributor

arielb1 commented Mar 31, 2017

Can we close this?

@pnkfelix
Copy link
Member

@arielb1 the Lang team only addresses nominates issues every other week, so we will look at it in a week

(However it might suffice if I just got the members to weigh in outside of the weekly mtg format )

@pnkfelix
Copy link
Member

pnkfelix commented Apr 6, 2017

(removing T-compiler flag since this is currently sitting in T-lang queue alone.)

@pnkfelix pnkfelix removed the T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. label Apr 6, 2017
@pnkfelix
Copy link
Member

pnkfelix commented Apr 6, 2017

consensus of lang team is that we are not going to change the behavior here. fn pointer types are non-null (in terms of the Rust type system), and the official way to express a nullable fn pointer is, as noted above, via an Option<fn> type.

So decision is that this is at most documentation issue, not a bug in the compiler/language.

@rillian
Copy link
Contributor Author

rillian commented Apr 6, 2017

Awesome. Thanks for taking a look at the issue. We've migrated our code to Option<fn>, so all's good.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
regression-from-stable-to-nightly Performance or correctness regression from stable to nightly. T-lang Relevant to the language team, which will review and decide on the PR/issue.
Projects
None yet
Development

No branches or pull requests

7 participants