Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Specifying linkage on externs silently removes indirection #31508

Open
jethrogb opened this Issue Feb 9, 2016 · 3 comments

Comments

Projects
None yet
5 participants
@jethrogb
Copy link
Contributor

jethrogb commented Feb 9, 2016

When compiling the following code:

// externs-c.c
unsigned char myarr[10]={1,2,3,4,5,6,7,8,9,10};
unsigned char (*implicitvar)[10]=&myarr;
unsigned char (*explicitvar)[10]=&myarr;
// externs-rust.rs
#![feature(linkage)]

extern {
    static implicitvar: *const [u8;10];
    // Should have no effect, external linkage is the default in an extern block
    #[linkage="external"]
    static explicitvar: *const [u8;10];
}

fn as_option(p: *const [u8;10]) -> Option<&'static [u8;10]> {
    unsafe{std::mem::transmute(p)}
}

fn main() {
    println!("implicitvar = {:?}",as_option(implicitvar));
    println!("explicitvar = {:?}",as_option(explicitvar));
}

using

clang -c externs-c.c && rustc externs-rust.rs -C link-args=./externs.o

running ./externs will output something like the following:

implicitvar = Some([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])
explicitvar = Some([168, 4, 122, 85, 85, 85, 0, 0, 0, 0])

Wat.

Taking a look at the IR:

; externs-c.ll
@myarr = global [10 x i8] c"\01\02\03\04\05\06\07\08\09\0A", align 1
@implicitvar = global [10 x i8]* @myarr, align 8
@explicitvar = global [10 x i8]* @myarr, align 8
; externs-rust.ll
@implicitvar = external global [10 x i8]*
@explicitvar = external global [10 x i8]
@_rust_extern_with_linkage_explicitvar = internal global [10 x i8]* @explicitvar

So, Rust removes a layer of indirection defining static explicitvar: [u8;10] and adding a new variable static _rust_extern_with_linkage_explicitvar: *const [u8;10]=&explicitvar. All mentions of explicitvar in Rust source code get replaced with _rust_extern_with_linkage_explicitvar. This results in the C version and this new Rust version not having the same type! To get “correct” behavior in the example above, you would need to define static explicitvar: *const *const [u8;10] instead.

This weird assymmetry between the types associated with symbols in Rust and in C is a source of great confusion and can easily lead to bugs. In the example above, we just read 2 bytes past some pointer by interpreting it as a 10-byte array.

This weird behavior was introduced in #12556 (see also #11978), the rationale being weak linkage and the fact that some pointers can't be null in the Rust typesystem. While true, I don't think that's sufficient rationale to add this layer of indirection. I think the layer of indirection should be removed completely. For weak linkage, a restriction can be added to allow only zeroable types.

@Aatch

This comment has been minimized.

Copy link
Contributor

Aatch commented Feb 9, 2016

This seems like very strange behaviour to me. I'm not sure if "some pointers can't be null" is a sensible reason for this behaviour at all.

@alexcrichton

This comment has been minimized.

Copy link
Member

alexcrichton commented Feb 10, 2016

The supported added in #12556 (the indirection here) was only really intended for weak symbols. Weak symbols are currently required to be pointers as otherwise this is not memory safe:

extern {
    #[linkage = "extern_weak"]
    static SYMBOL: extern fn();
}

because fn types aren't nullable. I believe it was intended that this linkage restriction was detected at compile-time in some typechecking pass, although I'm not sure if that was ever implemented.

@jethrogb

This comment has been minimized.

Copy link
Contributor Author

jethrogb commented Feb 10, 2016

I guess this bug in particular is about the non-weak pointer case. However, I think the weak pointer case also warrants more discussion, for which I've opened a topic on internals.r-l.o.

Here's the code in question btw: https://github.com/rust-lang/rust/blob/3e9589c/src/librustc_trans/trans/foreign.rs#L139-L145 . It does not distinguish at all between different types of linkage.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.