Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Wrong offset of CxxString field when preceded by padding #18

Closed
dtolnay opened this issue Oct 11, 2020 · 4 comments · Fixed by #22
Closed

Wrong offset of CxxString field when preceded by padding #18

dtolnay opened this issue Oct 11, 2020 · 4 comments · Fixed by #22

Comments

@dtolnay
Copy link
Contributor

dtolnay commented Oct 11, 2020

For this input header, autocxx is compiling the following generated Rust code, which puts the s field at offset 4 in S while the C++ string field is actually at offset 8 on my machine.

#pragma once
#include <cstdint>
#include <string>
struct S {
  explicit S(uint32_t i);
  uint32_t i;
  std::string s;
};
mod ffi {
    pub type __uint32_t = ::std::os::raw::c_uint;
    unsafe impl cxx::ExternType for bindgen::S {
        type Id = cxx::type_id!("S");
        type Kind = cxx::kind::Opaque;
    }
    mod bindgen {
        #[repr(C)]
        pub struct S {
            pub i: u32,
            pub s: ::cxx::CxxString,
            pub __bindgen_padding_0: [u64; 3usize],
        }
        impl S {
            pub fn make_unique(i: u32) -> cxx::UniquePtr<Self> {
                super::cxxbridge::S_make_unique(i)
            }
        }
    }
    #[cxx::bridge]
    pub mod cxxbridge {
        impl UniquePtr<S> {}
        extern "C" {
            pub fn S_make_unique(arg0: u32) -> UniquePtr<S>;
            include!("input.h");
            include!("autocxxgen.h");
            type S = super::bindgen::S;
        }
    }
}
@adetaylor
Copy link
Collaborator

Right, the problem here is that I (stupidly) hadn't realized field access was even possible via UniquePtr. Oops. I was expecting field access to opaque types solely to be possible only via accessor methods in C++ which I have yet to generate.

(The solution is probably for autocxx to generate a blank struct for things which are opaque, such that there are no fields to have the wrong offsets.)

@dtolnay
Copy link
Contributor Author

dtolnay commented Oct 11, 2020

The solution is probably for autocxx to generate a blank struct for things which are opaque, such that there are no fields to have the wrong offsets.

Public fields should be fine as long as there is a way to find out the presence of padding from bindgen. (I have never looked into what information bindgen has; you might already have considered this and it's not possible or would be too platform-specific.)

The Rust struct I would expect generated for that input would be something like:

#[repr(C)]
pub struct S {
    pub i: u32,
    _padding0: [u8; 4],
    pub s: ::cxx::CxxString,
    _actual_s: [u64; 3],
}

Notice not all fields are pub which makes the struct not constructible from Rust with an S { ... } struct literal expression, but that is fine because it shouldn't be constructible by value in Rust anyway.

This way we're still able to take references to s for passing to C++ or calling e.g. the Debug impl of CxxString.

@dtolnay
Copy link
Contributor Author

dtolnay commented Oct 11, 2020

Oh here is a way that would only use information that the current implementation is already using:

#[repr(C)]
pub struct S {
    pub i: u32,
    _align_s: [[u64; 3]; 0],  // [T; 0] where T is the type of _actual_s determined through bindgen
    pub s: ::cxx::CxxString,
    _actual_s: [u64; 3],
}

@adetaylor
Copy link
Collaborator

as long as there is a way to find out the presence of padding from bindgen. (I have never looked into what information bindgen has; you might already have considered this and it's not possible or would be too platform-specific.)

Unfortunately, there be doom here.

The relevant limitation on bindgen is lack of support for template specialization. It seems from my explorations that any real STL type except std::unique_ptr involves partial template specialization and therefore the layout cannot be accurately calculated by bindgen. In fact, in autocxx, we replace std::string with a nonsense fictional type as it simply failed to deal with the real std::string at all. (I can't remember the exact symptoms, but they seemed definitively doomed, and expected).

This leads to an interesting and worrying question though.

In cxx, we have:

  • Opaque
  • Trivial - no non-trivial move constructor, no destructor

For the purposes of field access, we need something slightly different:

  • "Plain old data" - bindgen can calculate field offsets correctly
  • Not POD - bindgen can't due to std::string or other type that might involve template specialization.

These things have been 100% correlated so far, but they're not the same and I'll need to be aware of this. #17 is therefore possibly just wrong.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants