Cannot pass small struct by value across FFI on linux-gnu #59187

Open
annedrewhu opened this issue Mar 14, 2019 · 5 comments
Labels
A-ffi Area: Foreign Function Interface (FFI) C-bug Category: This is a bug. O-linux Operating system: Linux O-x86_64 Target: x86-64 processors (like x86_64-*) T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.

Comments

@annedrewhu

I am creating an FFI function that takes a tagged union as a parameter. It corresponds to a Rust enum with a data-carrying variant. The tagged union definition for the C++ code was generated with cbindgen. The problem is that Rust's extern "C" calling convention and the actual C calling convention don't match, so the function receives faulty values for the argument.

This looks like it's similar to #5744.

I tried this code:

Rust:

#[repr(C)]
#[no_mangle]
pub enum SLCArgs {
  Add(*const c_char), // borrowed
  Count,
  Print,
}

#[no_mangle]
// works if changed to extern "Rust"
pub unsafe extern "C" fn StringListContainer_do(slc_p : *mut StringListContainer, m : SLCArgs) {
  let slc = &mut *(slc_p);
  match m {
    SLCArgs::Count => println!("{}", slc.Count()),
    SLCArgs::Add(cstr) => StringListContainer_Add(slc_p, cstr),
    SLCArgs::Print => slc.print(),
  }
} 

bindings.h:

struct StringListContainer;

struct SLCArgs {
  enum class Tag {
    Add,
    Count,
    Print,
  };

  struct Add_Body {
    const char *_0;
  };

  Tag tag;
  union {
    Add_Body add;
  };
};

extern "C" {
void StringListContainer_do(StringListContainer *slc_p, SLCArgs m);
//...
} // extern "C"

C++:

int main() {
  StringListContainer* slc = new_StringListContainer();

  StringListContainer_do(slc, {SLCArgs::Tag::Add, {"top text"}});
  StringListContainer_do(slc, {SLCArgs::Tag::Add, {"middle text"}});
  StringListContainer_do(slc, {SLCArgs::Tag::Add, {"bottom text"}});
  StringListContainer_do(slc, {SLCArgs::Tag::Count, {}});
  StringListContainer_do(slc, {SLCArgs::Tag::Print, {}});
}

I expected to see this happen:

3
top text
middle text
bottom text

Instead, this happened:

It prints nothing because Rust believes the struct was placed in the argument build area of the stack, just above the return address. In reality, since this is a small struct, it was passed in the second and third argument registers, %rsi and %rdx.

According to this interpretation of the System V AMD64 ABI, structs between 2-4 words are passed sequentially through registers. Although, according to some experimentation, it looks like any struct over 2 words is also passed on the stack.

Looking at the assembly, the first SLCArgs happened to be initialized on the stack in the argument build area, so the value Rust reads when it looks for the struct is Add("top text"). However, Rust reads that same value on every call to StringListContainer_do, so we never print anything; we just keep adding the first string.

However, this does work on Windows-MSVC. It also works on Linux-GNU, but only if you mark StringListContainer_do as extern "Rust". Plain extern and extern "C" will not work.
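One way to sidestep the by-value classification entirely is to pass the enum by pointer, since a pointer is a single integer-class argument on every target. The sketch below is only an illustration and is not taken from the linked repo: the function name `slc_args_kind` and its return codes are hypothetical, and the enum is re-declared so that the snippet stands alone.

```rust
use std::ffi::CStr;
use std::os::raw::c_char;

#[repr(C)]
pub enum SLCArgs {
    Add(*const c_char), // borrowed
    Count,
    Print,
}

/// Hypothetical by-pointer entry point (not part of the original code).
/// Returns the string length for `Add`, -1 for `Count`, -2 for `Print`.
///
/// Safety: `m` must point to a valid `SLCArgs`, and an `Add` payload must
/// point to a NUL-terminated string.
pub unsafe extern "C" fn slc_args_kind(m: *const SLCArgs) -> i32 {
    match unsafe { &*m } {
        SLCArgs::Add(cstr) => unsafe { CStr::from_ptr(*cstr) }.to_bytes().len() as i32,
        SLCArgs::Count => -1,
        SLCArgs::Print => -2,
    }
}
```

On the C++ side the matching declaration would take `const SLCArgs *`, so the caller builds the struct in its own memory and passes its address.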

The repo for this code is here, if that helps

Meta

rustc --version --verbose:

rustc 1.33.0 (2aa4c46cf 2019-02-28)
binary: rustc
commit-hash: 2aa4c46cfdd726e97360c2734835aa3515e8c858
commit-date: 2019-02-28
host: x86_64-unknown-linux-gnu
release: 1.33.0
LLVM version: 8.0
sfackler added the T-compiler and O-linux labels on Mar 14, 2019
@cuviper
Member

cuviper commented Mar 19, 2019

Although, according to some experimentation, it looks like any struct over 2 words is also passed on the stack.

This looks supported by the ABI: Section 3.2.3, Classification 5. (c) If the size of the aggregate exceeds two eightbytes and the first eightbyte isn’t SSE or any other eightbyte isn’t SSEUP, the whole argument is passed in memory.
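As a rough illustration of rule 5(c) quoted above, here is a toy version of that single post-merge check. It is a sketch, not rustc's implementation: the names `Class`, `Passing` and `apply_rule_5c` are made up for this example, and every other classification rule (MEMORY, X87, the overall size cap, and so on) is left out.

```rust
/// Simplified eightbyte classes; the real ABI has more.
#[derive(Clone, Copy, PartialEq, Debug)]
pub enum Class {
    Integer,
    Sse,
    SseUp,
}

#[derive(PartialEq, Debug)]
pub enum Passing {
    Registers,
    Memory,
}

/// Applies only rule 5(c): an aggregate larger than two eightbytes goes to
/// memory unless it is one SSE eightbyte followed solely by SSEUP eightbytes.
pub fn apply_rule_5c(eightbytes: &[Class]) -> Passing {
    if eightbytes.len() > 2 {
        let is_vector = eightbytes[0] == Class::Sse
            && eightbytes[1..].iter().all(|c| *c == Class::SseUp);
        if !is_vector {
            return Passing::Memory;
        }
    }
    Passing::Registers
}
```

Under this check alone, a two-eightbyte aggregate of integer class stays in registers, while a three-eightbyte integer aggregate is passed in memory, which matches the experiment described in the report.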

Rust implements this condition here:

if n > 2 {
    if cls[0] != Some(Class::Sse) {
        return Err(Memory);
    }

Your enum should be just two eightbytes, but I think it's getting cut out before that:

abi::Variants::Tagged { .. } |
abi::Variants::NicheFilling { .. } => return Err(Memory),

I suspect #[repr(C)] enum ought to be classified more like a plain struct (i.e. Variants::Single).
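The "two eightbytes" claim can be checked from Rust itself. The sketch below assumes a 64-bit target such as x86_64-unknown-linux-gnu; the helper `eightbytes_needed` is a name invented here, and the enum is copied from the report so the snippet stands alone.

```rust
use std::mem::size_of;
use std::os::raw::c_char;

#[repr(C)]
pub enum SLCArgs {
    Add(*const c_char), // borrowed
    Count,
    Print,
}

/// Number of 8-byte chunks needed to hold `T` (hypothetical helper).
pub fn eightbytes_needed<T>() -> usize {
    (size_of::<T>() + 7) / 8
}
```

On a 64-bit target the tag is padded up to the pointer's alignment and followed by the 8-byte payload, so `eightbytes_needed::<SLCArgs>()` comes out as 2, which is within the limit for passing in registers.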

@cuviper
Member

cuviper commented Mar 19, 2019

AFAICS none of the other targets check Variants at all, so I'm labeling this for x86_64 only.

cuviper added the O-x86_64 and C-bug labels on Mar 19, 2019
@nbdd0121
Contributor

I think I ran into the same issue today. Here is a minimal snippet:

#[repr(C)]
#[derive(Clone, Copy)]
pub union MyResult {
    ok: MyOk,
    err: MyErr,
}

#[repr(usize)]
#[derive(Clone, Copy)]
enum MyResultTag { Ok, Err }

#[repr(C)]
#[derive(Clone, Copy)]
pub struct MyOk(MyResultTag, usize);

#[repr(C)]
#[derive(Clone, Copy)]
pub struct MyErr(MyResultTag, isize);

#[repr(usize)]
#[derive(Clone, Copy)]
pub enum MyResult2 {
    Ok(usize),
    Err(isize),
}

pub fn size() -> usize {
    std::mem::size_of::<MyResult>()
}

pub fn size2() -> usize {
    std::mem::size_of::<MyResult2>()
}

extern {
    fn t(_: MyResult);
    fn t2(_: MyResult2);
}

pub fn try() {
    unsafe { t(MyResult{ok:MyOk(MyResultTag::Ok, 1)}) };
}

pub fn try2() {
    unsafe { t2(MyResult2::Ok(1)) };
}

According to https://github.com/rust-lang/rfcs/blob/master/text/2195-really-tagged-unions.md, they should be identical for layout and FFI purposes. However, Rust passes MyResult in registers (correct) and MyResult2 in memory (wrong), which breaks interop with C/C++. As a result, try and try2 in the above snippet generate different code (you can see the result directly here: https://godbolt.org/z/pGWeFA).
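The layout half of that equivalence can be observed directly, even though the calling-convention half is what this issue is about. The following is a minimal sketch: `Tag`, `OkBody`, `ErrBody`, `Raw`, `same_layout` and `tag_of` are names chosen for this example, mirroring the union-of-structs shape from the snippet above.

```rust
use std::mem::{align_of, size_of, transmute};

#[repr(usize)]
#[derive(Clone, Copy)]
pub enum MyResult2 {
    Ok(usize),
    Err(isize),
}

#[repr(usize)]
#[derive(Clone, Copy, PartialEq, Debug)]
pub enum Tag {
    Ok,
    Err,
}

#[repr(C)]
#[derive(Clone, Copy)]
pub struct OkBody(pub Tag, pub usize);

#[repr(C)]
#[derive(Clone, Copy)]
pub struct ErrBody(pub Tag, pub isize);

/// Hand-written union-of-structs form that RFC 2195 describes for MyResult2.
#[repr(C)]
#[derive(Clone, Copy)]
pub union Raw {
    pub ok: OkBody,
    pub err: ErrBody,
}

/// True when the enum and the hand-written union agree on size and alignment.
pub fn same_layout() -> bool {
    size_of::<MyResult2>() == size_of::<Raw>() && align_of::<MyResult2>() == align_of::<Raw>()
}

/// Reads the tag through the union view; both bodies start with the tag,
/// so reading `ok.0` is valid for either variant.
pub fn tag_of(v: MyResult2) -> Tag {
    let raw: Raw = unsafe { transmute::<MyResult2, Raw>(v) };
    unsafe { raw.ok.0 }
}
```

Since the two types agree in memory, any difference in how they are passed to an extern function comes from argument classification rather than from layout.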

@annedrewhu
Author

It also looks like any tagged argument will be passed on the stack, even if there is only one tag (which could be optimized away) and the contents are only one byte.

Relevant godbolt: https://www.godbolt.org/z/xjTrXA

@Enselic
Member

Enselic commented Jul 24, 2023

@rustbot label A-ffi

rustbot added the A-ffi label on Jul 24, 2023
6 participants