tuple structs cause FFI segfaults on 32-bit Linux #39394

Closed
Wilfred opened this Issue Jan 29, 2017 · 8 comments

@Wilfred
Contributor
Wilfred commented Jan 29, 2017

Given the following Rust code, compiled as a staticlib:

extern crate libc;

#[repr(C)]
pub struct LispObject(libc::c_int);

#[no_mangle]
pub extern "C" fn fcdr(list: LispObject) -> LispObject {
    println!("fcdr: {:?}", list.0);
    list
}

And the following C program to call it:

#include <stdio.h>

int fcdr(int);

int main() {
  printf("call fcdr with 0\n");
  fcdr(0);
  return 0;
}

Running the compiled program on 32-bit Linux does not pass the int correctly:

$ ./example
call fcdr with 0
fcdr: -1077451660
Segmentation fault (core dumped)

64-bit Linux works fine.

I have a full example repo here: https://github.com/Wilfred/rust_struct_test

I'm not sure if this is a bug with Rust itself, or in my code.

@Wilfred Wilfred referenced this issue in Wilfred/remacs Jan 29, 2017
Closed

Linux 32 bits build is broken #95

@Wilfred
Contributor
Wilfred commented Jan 29, 2017 edited

Bluss has kindly pointed out rust-lang/rfcs#1758.

@nagisa
Contributor
nagisa commented Jan 29, 2017 edited

Your code is wrong.

#[repr(C)] struct LispObject(c_int);

is not the same as

type LispObject = c_int;

ABI-wise. One is a structure and the other is a scalar, and they may be handled differently if the ABI specifies so. In this case, the 32-bit SysV ABI specifies that

fn fcdr(list: LispObject) -> LispObject

is actually

fn fcdr(sret: *mut LispObject, arg1: *const LispObject)

under the covers, whereas

fn fcdr(list: c_int) -> c_int

is

fn fcdr(c_int) -> c_int

Now it's easy to see that the sret pointer is NULL (the C caller's 0 lands in the slot where the callee expects that pointer), and in order to return the value, a store to this null pointer happens.
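If the C prototype has to stay `int fcdr(int);`, one option is to expose the scalar at the boundary and keep the wrapper type on the Rust side only. This is a minimal sketch under that assumption, not the fix adopted in this thread: `fcdr_impl` is a hypothetical helper name, `std::os::raw::c_int` stands in for `libc::c_int` so the snippet needs no external crate, and the export attribute is left out so it compiles as an ordinary program.

```rust
use std::os::raw::c_int;

// Wrapper used inside Rust only; it never crosses the FFI boundary here.
#[repr(C)]
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct LispObject(c_int);

// Hypothetical internal helper that works on the wrapper type.
fn fcdr_impl(list: LispObject) -> LispObject {
    list
}

// Boundary function: the signature matches the C prototype `int fcdr(int);`.
// In the staticlib this would also carry #[no_mangle], as in the original report.
pub extern "C" fn fcdr(list: c_int) -> c_int {
    fcdr_impl(LispObject(list)).0
}
```

With this shape both sides agree that a scalar goes in and a scalar comes out, so no hidden return pointer is involved.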

@jeandudey

The rust-bindgen project has the same problem (see servo/rust-bindgen#439).

I think this behaviour should be documented in the Nomicon.

@emilio
Contributor
emilio commented Jan 30, 2017

Yeah, bindgen's issue is not caused by bindgen per se, but by clang-sys.

In any case, as I commented in the rust-bindgen issue linked above, I've found this several times in different FFI-related libraries.

I believe this kind of usage is going to be hard to exterminate from the ecosystem. If it is not possible to special-case newtypes in the compiler so that they interact with the ABI in the same way as their underlying type, this is going to require a fair amount of docs.

@emilio
Contributor
emilio commented Jan 30, 2017

Oh, or repr(transparent), of course.
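A minimal sketch of what that would look like, assuming a toolchain that supports the attribute (when this comment was written it was only the proposal in rust-lang/rfcs#1758). `std::os::raw::c_int` stands in for `libc::c_int` so the snippet needs no external crate, and the export attribute is left out so it compiles as an ordinary program.

```rust
use std::os::raw::c_int;

// repr(transparent) asks for the wrapper to be passed exactly like its
// single field, so it is meant to be compatible with `int fcdr(int);` in C.
#[repr(transparent)]
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct LispObject(c_int);

// Same shape as the function in the report; only the repr on the type differs.
// In the staticlib this would also carry #[no_mangle].
pub extern "C" fn fcdr(list: LispObject) -> LispObject {
    list
}
```

The Rust code keeps its newtype, and the C side keeps declaring a plain `int`.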

@jmesmon
Contributor
jmesmon commented Jan 30, 2017 edited

So the correct way to call it from C would be:

#include <stdio.h>
struct foo {
        int c;
};
struct foo fcdr(struct foo);
int main(void) {
        printf("call fcdr with 0\n");
        struct foo x = { 0 };
        fcdr(x);
        return 0;
}

(This works fine for me locally using i686-unknown-linux-gnu and gcc -m32 on a multilib system.)
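The `struct foo` declaration works because it mirrors the `#[repr(C)]` tuple struct field for field. As a rough sketch, the layout half of that can be checked from the Rust side; `same_layout_as_c_int` is a hypothetical helper name, and `std::os::raw::c_int` stands in for `libc::c_int`.

```rust
use std::mem::{align_of, size_of};
use std::os::raw::c_int;

// Same definition as in the report.
#[repr(C)]
pub struct LispObject(c_int);

/// Returns true when the wrapper has the same size and alignment as a bare c_int.
pub fn same_layout_as_c_int() -> bool {
    size_of::<LispObject>() == size_of::<c_int>()
        && align_of::<LispObject>() == align_of::<c_int>()
}
```

Identical size and alignment say nothing about how the value is passed to or returned from a function, which is why declaring the parameter as a bare `int` on the C side still breaks on this target.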

@emilio emilio referenced this issue in rust-lang/rfcs Jan 30, 2017
Open

Specify #[repr(transparent)] #1758

@Wilfred
Contributor
Wilfred commented Jan 31, 2017

I don't want to leave an issue that doesn't have a clear next step.

Should I close this in favour of the RFC? Is this a docs issue that I should leave open?

@nagisa
Contributor
nagisa commented Jan 31, 2017

I’m not sure there’s anything to be documented here. The ABI between caller and callee was violated and that’s it. I’m not sure what the expectations in this case were, but the assumptions which caused this bug seem trivially wrong to me.

Tuple structs are also documented in the reference.

@Wilfred Wilfred closed this Jan 31, 2017