FFI and union #5492

Closed
sanxiyn opened this Issue Mar 22, 2013 · 28 comments

@sanxiyn
Member

sanxiyn commented Mar 22, 2013

How would one call C functions involving union with Rust FFI?

SpiderMonkey's jsval is one example.

@thestinger

Contributor

thestinger commented Mar 24, 2013

There could be unsafe enum with the layout defined to be the same as C for interoperability. The only other way to deal with it would be finding the alignof and sizeof of the union in C for each platform and then translating that to Rust.

@sanxiyn

Member Author

sanxiyn commented Apr 24, 2013

Referencing Aatch/rust-xcb#2.

@yichoi

Contributor

yichoi commented Apr 25, 2013

referencing servo/servo#398

referencing servo/rust-mozjs#9

@Aatch

Contributor

Aatch commented May 9, 2013

The unsafe enum idea appeals to me, since I thought about it as an option when trying to solve the union issue in rust-xcb, but decided that relying on the representation of enums was too "hacky" and fragile.

@pnkfelix

Member

pnkfelix commented Jul 1, 2013

brson mentions in the description for #6346 that a "macro based solution" would be appropriate here, though I do not currently know what that would entail. (It sounds to me like a potential alternative to the grammar changes for adding unsafe enum that have been discussed here.)

@pnkfelix

Member

pnkfelix commented Jul 1, 2013

Nominating for milestone 3, feature complete.

@cmr

Member

cmr commented Jul 31, 2013

I don't think a "macro-based solution" would be appropriate, as you need to restrict the valid range of values at the site of usage, which macros cannot do.

@graydon

Contributor

graydon commented Aug 8, 2013

An attribute on an enum that makes it have no discriminant and makes any match on the variant-part succeed, should be sufficient. Not pretty but neither are C union semantics.

@graydon

Contributor

graydon commented Aug 8, 2013

accepted for feature-complete milestone

@Skrylar

Skrylar commented Dec 13, 2013

I ran into this problem recently as well; Allegro makes use of unions for passing events around in C, which turns out to be a pain to deal with in Rust.

@pnkfelix

Member

pnkfelix commented Feb 13, 2014

We do want to solve this problem eventually, but it need not block 1.0. Assigning P-low.

@pnkfelix pnkfelix added P-low and removed P-high-untriaged labels Feb 13, 2014

@alxkolm

alxkolm commented Nov 3, 2014

What is the status of this?

@alexchandel

alexchandel commented Jan 21, 2015

What's the recommended way to do FFI-compatible unions?

@jdm

Contributor

jdm commented Jan 21, 2015

I believe structs containing a field which is at least as big as the largest type the union can represent and manual transmutes is the state of the art right now.

@mzabaluev

Contributor

mzabaluev commented Jan 21, 2015

> I believe structs containing a field which is at least as big as the largest type the union can represent and manual transmutes is the state of the art right now.

Make sure you get the alignments right. The struct should have #[repr(C)], and the field posing as the union (or the inner type, in case the newtype struct emulates the union itself) must have the alignment of the most-aligned variant.

@alexchandel

alexchandel commented Jan 21, 2015

@jdm Even when variants are different sizes? transmute errors when T and U have different sizes, and transmute_copy is just as dangerous since it copies sizeof(U) bytes, triggering "undefined behavior".
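A minimal sketch of the size concern above, assuming a hypothetical union with a 4-byte and an 8-byte member (the names `Storage`, `as_small` and `as_big` are made up for illustration). `transmute` between the two sizes would not compile, but casting a pointer to the storage and reading the smaller variant stays in bounds, because the storage is at least as large and as aligned as either variant.

```rust
use std::mem::{align_of, size_of};

/// Hypothetical stand-in for `union { uint32_t small; uint64_t big; }`.
/// The single field is as large and as aligned as the biggest variant.
#[repr(C)]
pub struct Storage {
    raw: u64,
}

impl Storage {
    pub fn from_big(v: u64) -> Storage {
        Storage { raw: v }
    }

    /// View the storage as the smaller variant.
    /// Unsafe because the caller must know which variant is live.
    pub unsafe fn as_small(&self) -> &u32 {
        // In bounds: u32 is no larger and no more aligned than Storage.
        unsafe { &*(self as *const Storage as *const u32) }
    }

    /// View the storage as the larger variant.
    pub unsafe fn as_big(&self) -> &u64 {
        &self.raw
    }
}

fn main() {
    assert!(size_of::<u32>() <= size_of::<Storage>());
    assert!(align_of::<u32>() <= align_of::<Storage>());

    // An all-zero value reads the same through both views on any endianness.
    let s = Storage::from_big(0);
    unsafe {
        assert_eq!(*s.as_small(), 0);
        assert_eq!(*s.as_big(), 0);
    }
}
```

Going the other way (viewing a small value as the large variant) is where `transmute_copy` reads past the end, which is why the storage here is always sized for the largest member.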

@mzabaluev

Contributor

mzabaluev commented Jan 22, 2015

Also, the overall size of the union is a multiple of the alignment of its most-aligned variant. This union has the size of 8:

union A {
    int32_t intval;
    char chars[5];
};

Which would require a Rust representation like:

#[repr(C)]
struct A {
    union_data: [i32; 2]
}

So yes, representing unions is not for the unwary.
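The layout claim can be checked directly. This is only a sketch: the accessor names (`from_int`, `intval`, `chars`) are hypothetical, and C's `char` is modelled as `u8` for simplicity.

```rust
use std::mem::{align_of, size_of};

/// Rust stand-in for the C `union A { int32_t intval; char chars[5]; }`.
#[repr(C)]
pub struct A {
    union_data: [i32; 2],
}

impl A {
    pub fn from_int(v: i32) -> A {
        A { union_data: [v, 0] }
    }

    /// Read the storage as the `intval` member.
    pub unsafe fn intval(&self) -> i32 {
        self.union_data[0]
    }

    /// View the first five bytes of the storage as the `chars` member.
    pub unsafe fn chars(&self) -> &[u8; 5] {
        unsafe { &*(self as *const A as *const [u8; 5]) }
    }
}

fn main() {
    // Two i32 values: 8 bytes, with the alignment of i32.
    assert_eq!(size_of::<A>(), 8);
    assert_eq!(align_of::<A>(), align_of::<i32>());

    let a = A::from_int(0);
    unsafe {
        assert_eq!(a.intval(), 0);
        assert_eq!(a.chars(), &[0u8; 5]);
    }
}
```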

@alexchandel

alexchandel commented Jan 22, 2015

@mzabaluev For a C union like this:

struct INPUT {
  DWORD type;
  union {
    MOUSEINPUT    mi;
    KEYBDINPUT    ki;
    HARDWAREINPUT hi;
  };
};

I use a struct field rather than bytes. It's easier because the size and alignment change between platforms, and you can't do [u8; size_of::<MOUSEINPUT>()].

#[repr(C)]
pub struct MOUSEINPUT { ... }
#[repr(C)]
pub struct KEYBDINPUT { ... }
#[repr(C)]
pub struct HARDWAREINPUT { ... }

#[repr(C)]
pub struct INPUT {
    pub tag_: DWORD,
    pub union_: MOUSEINPUT, // MOUSEINPUT largest and most aligned
}
@mzabaluev

Contributor

mzabaluev commented Jan 23, 2015

@alexchandel Good when it works, but sometimes the largest variant is not the most aligned, like in my example above.
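To make the pitfall concrete, here is a small sketch using the two members from the earlier `union A` example (`IntVal`, `Chars` and `union_size` are illustrative names; the numbers assume a typical target where `i32` is 4-byte aligned). The largest member and the most-aligned member are different types, so picking either one alone as the stand-in field gives the wrong layout.

```rust
use std::mem::{align_of, size_of};

type IntVal = i32;    // int32_t intval
type Chars = [u8; 5]; // char chars[5]

/// Size a C compiler would give a union of the two members:
/// the largest member size rounded up to the strictest alignment.
pub fn union_size() -> usize {
    let size = size_of::<IntVal>().max(size_of::<Chars>());
    let align = align_of::<IntVal>().max(align_of::<Chars>());
    (size + align - 1) / align * align
}

fn main() {
    // Largest member: Chars (5 bytes, alignment 1).
    assert!(size_of::<Chars>() > size_of::<IntVal>());
    // Most-aligned member: IntVal (4 bytes, alignment 4 on typical targets).
    assert!(align_of::<IntVal>() > align_of::<Chars>());
    // Neither member alone has the union's layout (size 8, alignment 4).
    assert_eq!(union_size(), 8);
}
```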

@niconiconico9

niconiconico9 commented Jul 11, 2015

Is there a reason why this bug is tagged as "P-low"?
The alternatives that have been proposed, and that I guess are currently in use, require great care to handle alignment properly.
The last example of how this can be worked around without any language addition is a perfect example of how the language promotes writing incorrect code because it doesn't provide a proper solution.

@Daggerbot

Daggerbot commented Aug 24, 2015

I don't know how feasible it would be to implement, but an example usage could be:

#[repr(union)]
pub struct XEvent {
  pub type_: c_int,
  pub xany: XAnyEvent,
  // ...
  pub pad: [c_long; 24],
}

Like C unions, each field would start at the beginning of the struct, and the size of the struct would be that of its longest field. This wouldn't require adding union as a language keyword. The only limitation I can think of would be that accessing a field in the union would require unsafe, which is already used often when interfacing with C libraries.

A macro based solution could look something like:

union! {
  pub union XEvent {
    pub type_: c_int,
    pub xany: XAnyEvent,
    // ...
    pub pad: [c_long; 24],
  }
}

// functions generated by macro:
impl XEvent {
  pub unsafe fn type_<'a> (&'a self) -> &'a c_int { ::std::mem::transmute(self) }
  pub unsafe fn type__mut<'a> (&'a mut self) -> &'a mut c_int { ::std::mem::transmute(self) }
  pub unsafe fn xany<'a> (&'a self) -> &'a XAnyEvent { ::std::mem::transmute(self) }
  pub unsafe fn xany_mut<'a> (&'a mut self) -> &'a mut XAnyEvent { ::std::mem::transmute(self) }
  // ...
  pub unsafe fn pad<'a> (&'a self) -> &'a [c_long; 24] { ::std::mem::transmute(self) }
  pub unsafe fn pad_mut<'a> (&'a mut self) -> &'a mut [c_long; 24] { ::std::mem::transmute(self) }
}

The only thing that prevented me from writing this macro is the inability to determine the size of the union at compile time. The best workaround I could come up with is providing a guess of the size of the largest field and making the union generate tests to verify this.

union! {
  pub union XEvent : [c_long; 24] {
    pub type_: c_int,
    pub xany: XAnyEvent,
    // ...
    pub pad: [c_long; 24],
  }
}

// test generated by macro:
#[test]
fn test_union_size_XEvent () {
  use std::cmp::max;
  use std::mem::size_of;
  let sizes = [
    size_of::<c_int>(),
    size_of::<XAnyEvent>(),
    // ...
    size_of::<[c_long; 24]>(),
  ];
  assert!(sizes.iter().fold(0, |a, b| max(a, *b)) == size_of::<[c_long; 24]>());
}

Of course, it would be much easier on developers of language bindings to have unions available as a language feature.
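For illustration, a rough sketch of how such a macro might be written with `macro_rules!`. Everything here is hypothetical: the macro name `c_union!`, the `Event` type, and the convention of spelling out both accessor names (`macro_rules!` cannot concatenate identifiers on its own). It also assumes a compiler recent enough to evaluate `assert!` in a `const`, which replaces the generated size test with a compile-time check.

```rust
use std::os::raw::{c_int, c_long};

/// Hypothetical macro: wraps a caller-supplied storage type and generates
/// unsafe accessors that reinterpret it as each listed member type.
macro_rules! c_union {
    (
        pub union $name:ident : $storage:ty {
            $( $get:ident / $get_mut:ident : $ty:ty ),* $(,)?
        }
    ) => {
        #[repr(C)]
        pub struct $name {
            storage: $storage,
        }

        impl $name {
            pub fn from_storage(storage: $storage) -> Self {
                $name { storage }
            }
            $(
                pub unsafe fn $get(&self) -> &$ty {
                    unsafe { &*(self as *const Self as *const $ty) }
                }
                pub unsafe fn $get_mut(&mut self) -> &mut $ty {
                    unsafe { &mut *(self as *mut Self as *mut $ty) }
                }
            )*
        }

        // Compile-time check that the storage guess covers every member.
        $(
            const _: () = assert!(
                ::std::mem::size_of::<$ty>() <= ::std::mem::size_of::<$storage>()
                    && ::std::mem::align_of::<$ty>() <= ::std::mem::align_of::<$storage>()
            );
        )*
    };
}

c_union! {
    pub union Event : [c_long; 24] {
        type_ / type_mut : c_int,
        pad / pad_mut : [c_long; 24],
    }
}

fn main() {
    let mut ev = Event::from_storage([0; 24]);
    unsafe {
        // Write through one view, read it back through the same view.
        *ev.type_mut() = 7;
        assert_eq!(*ev.type_(), 7);
    }
}
```

The storage type still has to be guessed by hand, as in the comment above; the difference is only that a wrong guess fails the build instead of a test.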

@retep998

Member

retep998 commented Aug 24, 2015

winapi would benefit massively from unions as part of the core language. I currently use a macro to make do, but it's just not the same.

@joshtriplett

Member

joshtriplett commented Sep 21, 2015

I'm interested in unions as well, for several Linux kernel APIs. The proposal of having an "unsafe union", guaranteed to match the C layout, would work perfectly; almost any non-trivial instance of such a C union only makes sense to access in an unsafe block, given its trivial equivalence to the unsafe std::mem::transmute.

@serprex

serprex commented Oct 25, 2015

Most unions in C have a descriptor field, so there are two cases to handle (has-descriptor and has-no-descriptor). Being able to specify a struct-unique enum with a custom type descriptor and the fields' corresponding values would allow Rust to use the union in a type-safe manner while still interoperating with C APIs.

Essentially something like

#[enum_explicit_descriptor(t)]
#[enum_explicit_values = "I: 0, N: 1"]
unsafe struct TValue{
  t: u8,
  val: unsafe enum IntOrFloat{
    I(i32),
    N(f32),
  },
}

unsafe struct would handle cases where the type descriptor isn't adjacent to the union. Even then, something could be done like

#[enum_explicit_descriptor_type(u8)]
#[enum_explicit_descriptor_typeoffset(-1)] // This could be behind-the-struct by default
#[enum_explicit_values = "I: 0, N: 1"]
enum IntOrFloat{
  I(i32),
  N(f32),
}

Then there'd need to be compile-time machinery that makes sure there's a valid u8 behind the enum in definitions, though user code would access a struct TValue{ t:u8, val: IntOrFloat }

The issue of having typeoffset could be resolved by requiring that explicit enums only be contained in structs and that enum_explicit_layout_typeoffset be specified by the struct. This would require a bit more strictness, though, since one wouldn't know how to find the descriptor of an &IntOrFloat parameter.

@mzabaluev

Contributor

mzabaluev commented Oct 25, 2015

@serprex: I don't think it's worthwhile to add language support for external descriptors of unions, even in cases where there is a 1:1 match between a single descriptor field value and a union variant. The code using unions is expected to be close to FFI, where unsafe is the norm; so variant matching can be always unsafe, and the burden of ensuring the correct variant would be completely on the programmer, as it is in C.

@joshtriplett

Member

joshtriplett commented Oct 26, 2015

@mzabaluev I agree. For a first pass, at least, we just need an unsafe construct to access fields of a C union in a C-compatible, interoperable way. We can always produce a safe wrapper around that, and even produce macros to generate such wrappers for common cases.
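A sketch of what "an unsafe construct plus a safe wrapper" could look like for the `TValue` example from earlier in the thread, written without any new language feature. The tag values (0 for `i32`, 1 for `f32`), the `u32` storage field and all method names are assumptions made for illustration.

```rust
/// Hypothetical C-compatible layout: a one-byte descriptor followed by
/// 4 bytes of storage shared by an `i32` and an `f32`.
#[repr(C)]
pub struct TValue {
    t: u8,    // descriptor: 0 = i32, 1 = f32 (assumed encoding)
    val: u32, // storage with the size and alignment of both members
}

/// Safe, idiomatic view of the same data.
#[derive(Debug, PartialEq)]
pub enum IntOrFloat {
    I(i32),
    N(f32),
}

impl TValue {
    pub fn new_int(v: i32) -> TValue {
        TValue { t: 0, val: v as u32 }
    }

    pub fn new_float(v: f32) -> TValue {
        TValue { t: 1, val: v.to_bits() }
    }

    /// Build a value as it might arrive from C, with an unchecked tag.
    pub fn from_raw(t: u8, val: u32) -> TValue {
        TValue { t, val }
    }

    // Unsafe layer: the caller promises the matching member is live.
    unsafe fn as_i32(&self) -> i32 {
        unsafe { *(&self.val as *const u32 as *const i32) }
    }

    unsafe fn as_f32(&self) -> f32 {
        unsafe { *(&self.val as *const u32 as *const f32) }
    }

    /// Safe layer: consult the descriptor before touching the storage.
    pub fn get(&self) -> Option<IntOrFloat> {
        match self.t {
            0 => Some(IntOrFloat::I(unsafe { self.as_i32() })),
            1 => Some(IntOrFloat::N(unsafe { self.as_f32() })),
            _ => None, // unknown descriptor: refuse to interpret
        }
    }
}

fn main() {
    assert_eq!(TValue::new_int(-3).get(), Some(IntOrFloat::I(-3)));
    assert_eq!(TValue::new_float(1.5).get(), Some(IntOrFloat::N(1.5)));
    assert_eq!(TValue::from_raw(9, 0).get(), None);
}
```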

@joshtriplett

Member

joshtriplett commented Dec 6, 2015

I posted a preliminary proposal using #[repr(C,union)] struct { ... } (requiring unsafe blocks for field accesses, assignments, or initializations) to https://internals.rust-lang.org/t/pre-rfc-unsafe-enums/2873/23.
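For readers arriving later: the sketch below does not use the `#[repr(C,union)] struct` syntax from this proposal. It uses the `union` item that stable Rust gained afterwards (Rust 1.19 and later), which has a similar flavor. One difference from the proposal as described here is that only reads of a field need `unsafe`; initializing a union is safe. The size check assumes a typical target where `i32` is 4-byte aligned, and C's `char` is modelled as `u8`.

```rust
use std::mem::{align_of, size_of};

/// Counterpart of the C `union A { int32_t intval; char chars[5]; }`.
#[repr(C)]
pub union A {
    pub intval: i32,
    pub chars: [u8; 5],
}

fn main() {
    // The compiler computes the C layout: size 8, alignment of i32.
    assert_eq!(size_of::<A>(), 8);
    assert_eq!(align_of::<A>(), align_of::<i32>());

    // Initialization is safe; reading a field is not.
    let a = A { intval: 0 };
    let int_view = unsafe { a.intval };
    let first_byte = unsafe { a.chars[0] };
    assert_eq!(int_view, 0);
    assert_eq!(first_byte, 0);
}
```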

@huonw

Member

huonw commented Jan 5, 2016

Closing in favour of rust-lang/rfcs#877.
