Wrong ABI used for small C++ classes on Linux. #778
Comments
Yeah, this is somewhat annoying, but passing non-trivial arguments by value needs rustc support, and last time I checked with them they really weren't interested in implementing some kind of
(The same happens for arguments, btw; we've run into this before, see https://bugzilla.mozilla.org/show_bug.cgi?id=1366247)
emilio added the bug label Jun 23, 2017
In my specific case, the struct was requested to be opaque, so maybe bindgen could use some other "filler" type? Looking at cabi_x86_64.rs, it seems that some kind of weird enum (one that ends up with StructWrappedNullablePointer layout) might do the trick. Or, maybe there should be a sub-option of
Overall, I think it would be preferable to emit an error rather than generate bindings that are known to produce incorrect code.
Looks like this does the trick (i.e. some struct member must be misaligned):

#[repr(C, packed)]
struct Bar {
    data1: u8,
    data2: u16,
    data3: [u8; 5],
}

Would only be useful for opaque structs, of course...
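For reference, here is a small sketch of what that filler looks like to the Rust compiler. The System V AMD64 classification rules put an aggregate with unaligned fields in the MEMORY class, which is presumably why the trick works. The checks below only confirm the layout (8 bytes, alignment 1, `data2` at an odd offset); they do not prove how any particular rustc version passes the type. The helper function names are illustrative, not part of bindgen.

```rust
use std::mem::{align_of, size_of};
use std::ptr::addr_of;

/// Opaque filler from the comment above. `packed` removes padding, so
/// `data2` lands at offset 1 and is misaligned for a `u16`.
#[repr(C, packed)]
#[allow(dead_code)]
pub struct Bar {
    data1: u8,
    data2: u16,
    data3: [u8; 5],
}

/// Size of the filler in bytes (1 + 2 + 5, no padding).
pub fn bar_size() -> usize {
    size_of::<Bar>()
}

/// Alignment of the filler; `packed` forces it down to 1.
pub fn bar_align() -> usize {
    align_of::<Bar>()
}

/// Byte offset of `data2` inside the struct, computed without ever
/// creating a reference to the misaligned field.
pub fn data2_offset() -> usize {
    let b = Bar { data1: 0, data2: 0, data3: [0; 5] };
    let base = addr_of!(b) as usize;
    let field = addr_of!(b.data2) as usize;
    field - base
}
```

As the comment says, this only makes sense for opaque types, since the fields no longer mirror the real C++ members.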
On second thought, this is pretty much the distinction between
comex commented Jun 24, 2017
Huh, seems very similar to an MSVC ABI issue. And here I thought the madness of changing ABI based on the presence of methods was Windows-specific...
Please no. This would wreak so much havoc when ABIs mysteriously change due to the addition or removal of
As I've said in the IRLO thread, I think there are two ways we can go here:
@emilio, do you have an opinion on the best way to proceed?
Right now the instant solution which works on stable is to transform the bindings to pass certain types by pointer and handle certain return values correctly. In the future, however, Rust definitely should add more calling conventions and attributes so that it can perform these transformations internally.

For example, in the Windows API there is a COM method that looks like this in C++:

D3D12_HEAP_PROPERTIES GetCustomHeapProperties(
    UINT nodeMask,
    D3D12_HEAP_TYPE heapType
);

In order to get the ABI correct I have to make it look like this in Rust. Note that this already works in stable Rust, so it is a solution people can implement immediately in their projects.

fn GetCustomHeapProperties(
    nodeMask: UINT,
    heapType: D3D12_HEAP_TYPE,
    __ret_val: *mut D3D12_HEAP_PROPERTIES,
) -> *mut D3D12_HEAP_PROPERTIES

However, in the future I'd really love for Rust to have calling conventions for C++ methods as well as attributes to indicate that a type is not POD, so in some future version I could simplify the signature down to:

fn GetCustomHeapProperties(
    nodeMask: UINT,
    heapType: D3D12_HEAP_TYPE,
) -> D3D12_HEAP_PROPERTIES

So basically, do the bindgen transform regardless, because it is needed for correctness now. In the future Rust really should add attributes and calling conventions to facilitate this, and bindgen can be updated later to take advantage of those language features if and when they are implemented and stabilized.
@retep998: It seems to me that you are describing a different problem. Is there already an open issue for it? Let's take the discussion there.
@vadimcn The two Rust issues are:
This issue, by contrast, is specifically about bindgen and what it should do about functions whose ABI is changed slightly by things that Rust is currently incapable of representing.
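bindgen could also fix up function signatures if it generated a small wrapper function. For example:

use std::mem::MaybeUninit;

extern "C" {
    #[link_name = "..."]
    // Where the return-slot pointer sits relative to `this` is ABI-specific.
    fn raw_ClassMethod(this: *mut Class, ret_value: *mut SomeStruct) -> *mut SomeStruct;
}

unsafe fn ClassMethod(this: *mut Class) -> SomeStruct {
    // The callee writes its result into this caller-allocated slot.
    let mut s = MaybeUninit::<SomeStruct>::uninit();
    raw_ClassMethod(this, s.as_mut_ptr());
    s.assume_init()
}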
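To try the wrapper pattern without a C++ toolchain, the sketch below simulates the callee in Rust. `SomeStruct`, `raw_make` and `make` are hypothetical names chosen for illustration. In real bindings `raw_make` would be an `extern "C"` declaration resolved at link time, and the position of the return-slot pointer among the arguments would follow the target ABI.

```rust
use std::mem::MaybeUninit;

/// Hypothetical stand-in for a small C++ class that is returned in memory.
#[repr(C)]
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct SomeStruct {
    pub data: usize,
}

/// Simulated callee with the "fixed-up" signature: the caller supplies the
/// return slot and gets the same pointer back. This is an assumption made
/// for the demo; a real binding would only declare the function.
extern "C" fn raw_make(ret_value: *mut SomeStruct, value: usize) -> *mut SomeStruct {
    // SAFETY: the caller promises `ret_value` points to writable storage
    // large enough for one `SomeStruct`.
    unsafe { ret_value.write(SomeStruct { data: value }) };
    ret_value
}

/// By-value wrapper that hides the out-pointer from Rust callers.
pub fn make(value: usize) -> SomeStruct {
    let mut slot = MaybeUninit::<SomeStruct>::uninit();
    let returned = raw_make(slot.as_mut_ptr(), value);
    // The callee is expected to hand back the slot it was given.
    debug_assert_eq!(returned, slot.as_mut_ptr());
    // SAFETY: `raw_make` fully initialized the slot above.
    unsafe { slot.assume_init() }
}
```

Note that the final `assume_init` moves the value out of the slot, which is only sound for types that tolerate being moved with a plain byte copy.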
That would also yield problems similar to #607, though.
In this case SomeStruct is a POD. For true C++ classes, you might consider some sort of "assume_trivially_movable" annotation, because in practice, 95% of them are. ...On the other hand, one can always do that in the safe idiomatic-Rust wrapper, so maybe not worth it...
fitzgen added A-C++ I-ABI-bug labels Jul 21, 2017
Another user ran into this today while wrapping a very simple C++ file: https://github.com/ssloy/tinyrenderer/blob/master/tgaimage.cpp (The call to
doppioandante commented Aug 7, 2017
Actually, the tgaimage.cpp that I was using is here (I didn't notice that this version was different from the one in the repo, which actually passes TGAColor as a reference).
Another option that we discussed with some of the Rust dev tools team folks was to emit C++
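clang certainly does have the logic to figure out which types must be passed by-reference. Unfortunately, this is not exposed in libclang. IMO, the options we have here are:
1. What @fitzgen suggests.
2. Expose the required API in libclang (of course, bindgen would then require at least that version of libclang).
3. Try to compute the same info ourselves (could be very labor-intensive or even impossible, as libclang's API seems to omit many of the finer C++ details).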
|
On 08/09/2017 10:41 PM, Vadim Chugunov wrote:
> clang certainly does have the logic
> <https://github.com/llvm-mirror/clang/blob/master/lib/CodeGen/TargetInfo.cpp>
> to figure out which types must be passed by-reference. Unfortunately,
> this is not exposed in libclang.
> IMO, the options we have here are:
> 1. What @fitzgen <https://github.com/fitzgen> suggests.

It's not clear to me how easy this would be, and whether it's acceptable
for every use-case, though it sounds fine to me.

> 2. Expose the required API in libclang (of course, bindgen would then
> require at least that version of libclang).

Well, this is fair... Thanks for pointing this out!
I added the CXTargetInfo API to eventually get target triple and pointer
width from clang, and we already load the functions dynamically (thus
can fall back), so it shouldn't be terribly hard to add an API that does
this.
I could try to write a patch soon-ish :)

> 3. Try to compute the same info ourselves (could be very
> labor-intensive or even impossible, as libclang's API seems to omit
> many of the finer c++ details).

This looks like a no-go... I don't think we should aim for this one;
maintaining this looks like a pain from miles away :)
I gave it a try, but the information we need seems to be locked in the CodeGen layer. I could not see a way to get it out without instantiating a CodeGenModule and all that...
Even if we get the information from libclang, we don't have a way to convey it to rustc yet. Did I miss that development?
It's "just" another pass over our IR, this time emitting C++ instead of Rust. Not terribly difficult, but not exactly trivial either ;) As for acceptability, I suspect it is probably fine for most cases. I expect the cases where performance really matters enough that the function call overhead is unacceptable would already be porting the C++ methods by hand to Rust so that they can be inlined (we do this with
bindgen would just explicitly emit this parameter/return value as by-reference in the signature of the "C" stub. One wrinkle in this fine plan is, of course, that by-value/by-reference rules differ by platform (e.g. Linux vs Windows). Not sure what the best way to deal with that is... Can we do this? Of course, this goes against the very idea that the address of such objects is significant. ...Or should we do the opposite, i.e. convert such parameters to by-ref, as long as any one platform would pass them by-ref? ...Or just tell wrapper writers to use #[cfg($platform)] when calling such functions?
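A minimal sketch of the last option, with the platform split hidden inside one wrapper so that callers see a single signature. Everything here is hypothetical: `raw_make_bar` stands in for a generated binding and is simulated in Rust, and the choice of `target_os = "linux"` for the by-pointer shape is an assumption for illustration only, not a statement about which platforms actually differ for a given type.

```rust
/// Hypothetical small class whose passing convention differs by target.
#[repr(C)]
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct Bar {
    pub data: usize,
}

/// Simulated binding, by-pointer shape (assumed here for Linux).
#[cfg(target_os = "linux")]
extern "C" fn raw_make_bar(ret: *mut Bar, data: usize) -> *mut Bar {
    // SAFETY: the caller passes a valid, writable slot.
    unsafe { ret.write(Bar { data }) };
    ret
}

/// Simulated binding, by-value shape (assumed here for every other target).
#[cfg(not(target_os = "linux"))]
extern "C" fn raw_make_bar(data: usize) -> Bar {
    Bar { data }
}

/// Uniform wrapper for targets using the by-pointer shape.
#[cfg(target_os = "linux")]
pub fn make_bar(data: usize) -> Bar {
    let mut slot = std::mem::MaybeUninit::<Bar>::uninit();
    raw_make_bar(slot.as_mut_ptr(), data);
    // SAFETY: the callee initialized the slot.
    unsafe { slot.assume_init() }
}

/// Uniform wrapper for targets using the by-value shape.
#[cfg(not(target_os = "linux"))]
pub fn make_bar(data: usize) -> Bar {
    raw_make_bar(data)
}
```

The cost of this approach is the one raised above: each such function needs two hand-written paths, and the bindings have to be generated per target to know which path applies.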
This requires having a working C++ compiler as well as adding a C++ build step to your build script. On the other hand, this would nicely solve the issue with inline functions...
Ah ok, understood.
We would rely on libclang giving us the correct information for the configured target platform. This is the same as what we do for size/align/field offset/etc. I agree with what @emilio said: maintaining our own reimplementation of different ABI parameter passing rules would be hairy and is something I'd also like to avoid unless we have no other options. Even in a scenario where we were without other options, we would need someone to step up, really own this bit of code, and test it very thoroughly for me to be comfortable landing it. I would really like to avoid this situation.
None of these options are great. If only we had placement new... :-/ Given the world we live in, however, I think the second option is the best choice available.
It's easy to have sizeof(T) as a platform-dependent variable in an expression. Not so much with a function signature, which you'd have to use a different syntax to invoke.
This would mean, among other things, having to re-run clang on headers for all possible targets :( Well, ok, we can probably figure out which are the ones that cause trouble and run only those. Hmm... the more I think about this, the more appealing the idea of using a C++ compiler to generate "C" wrappers around a C++ API sounds. I actually quite like the approach rust-cpp takes on that.
DemiMarie commented Sep 11, 2017
@vadimcn I agree. Just invoke the C++ compiler, create an
vadimcn commented Jun 23, 2017
Input C/C++ Header
Bindgen Invocation
Actual Results
Expected Results
Not sure...
Discussion
MakeFoo and MakeBar look very similar; however, in C++ they use different ABIs, at least on Linux. Here's a quote from the System V AMD64 ABI spec (page 20):
This means that while Foo gets returned by value in the rax register, Bar must be returned by pointer to caller-allocated memory. To the Rust compiler, however, these look identical, so it expects both to be returned in rax.
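A sketch of why the two look identical from the Rust side. The original header did not survive in this copy of the issue, so the definitions below are assumptions: `Foo` stands for a plain struct with one pointer-sized field, and `Bar` for a class with the same field plus something on the C++ side (for example a user-provided destructor) that makes it non-trivial for the purposes of calls. Nothing in a `#[repr(C)]` declaration can record that difference.

```rust
use std::mem::{align_of, size_of};

/// Assumed shape of `Foo`: trivially copyable in C++.
#[repr(C)]
pub struct Foo {
    pub data: usize,
}

/// Assumed shape of `Bar`: same field, but non-trivial in C++.
/// That property has no representation in this declaration.
#[repr(C)]
pub struct Bar {
    pub data: usize,
}

/// True when both types have the same size and alignment, which is all
/// the layout information the Rust compiler has to choose an ABI from.
pub fn same_layout() -> bool {
    size_of::<Foo>() == size_of::<Bar>() && align_of::<Foo>() == align_of::<Bar>()
}
```

Since `same_layout()` holds, a compiler that classifies by layout alone will treat a function returning `Bar` exactly like one returning `Foo`, which is the mismatch described above.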