PhantomData fields in repr(C) structs change ABI on aarch64 #56877

Open
glandium opened this Issue Dec 16, 2018 · 40 comments

glandium commented Dec 16, 2018

Take this testcase:

#[repr(C)]
pub struct Foo {
    pub a: f32,
    pub b: f32,
}

#[no_mangle]
pub extern "C" fn foo(f: Foo) -> bool {
    f.a != f.b
}

#[repr(C)]
pub struct Bar {
    pub a: f32,
    pub b: f32,
    pub _unit: std::marker::PhantomData<()>,
}

#[no_mangle]
pub extern "C" fn bar(f: Bar) -> bool {
    f.a != f.b
}

Compile with:

rustc +nightly --target=aarch64-pc-windows-msvc --emit asm test.rs --crate-type staticlib -C opt-level=2

And check the output test.s file:

	.text
	.section	.text,"xr",one_only,foo
	.globl	foo
	.p2align	2
foo:
.seh_proc foo
	fcmp	s0, s1
	cset	w0, ne
	ret
	.section	.xdata,"dr",associative,foo
	.seh_handlerdata
	.section	.text,"xr",one_only,foo
	.seh_endproc

	.section	.text,"xr",one_only,bar
	.globl	bar
	.p2align	2
bar:
.seh_proc bar
	lsr	x8, x0, #32
	fmov	s0, w0
	fmov	s1, w8
	fcmp	s0, s1
	cset	w0, ne
	ret
	.section	.xdata,"dr",associative,bar
	.seh_handlerdata
	.section	.text,"xr",one_only,bar
	.seh_endproc

One would expect foo and bar to have the same code, but that's not the case here.
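As a quick sanity check, the sketch below confirms that the two structs from the testcase have the same size and alignment, so any difference has to come from how the value is passed, not from its memory layout. The helper name `layouts` is made up for this illustration.

```rust
use std::marker::PhantomData;
use std::mem::{align_of, size_of};

#[repr(C)]
pub struct Foo {
    pub a: f32,
    pub b: f32,
}

#[repr(C)]
pub struct Bar {
    pub a: f32,
    pub b: f32,
    pub _unit: PhantomData<()>,
}

/// Hypothetical helper: returns (size, alignment) for `Foo` and for `Bar`.
pub fn layouts() -> ((usize, usize), (usize, usize)) {
    (
        (size_of::<Foo>(), align_of::<Foo>()),
        (size_of::<Bar>(), align_of::<Bar>()),
    )
}
```

Both pairs should come out equal, since `PhantomData<()>` is zero-sized with alignment 1.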

This is the root cause of https://bugzilla.mozilla.org/show_bug.cgi?id=1512519

glandium commented Dec 16, 2018

Here is the non-optimized llvm-ir:

target datalayout = "e-m:w-p:64:64-i32:32-i64:64-i128:128-n32:64-S128"
target triple = "aarch64-pc-windows-msvc"

%Bar = type { [0 x i32], float, [0 x i32], float, [0 x i8], %"core::marker::PhantomData<()>", [0 x i8] }
%"core::marker::PhantomData<()>" = type {}

; Function Attrs: nounwind uwtable
define zeroext i1 @foo([2 x float]) unnamed_addr #0 {
start:
  %abi_cast = alloca [2 x float], align 4
  %f = alloca { float, float }, align 4
  store [2 x float] %0, [2 x float]* %abi_cast, align 4
  %1 = bitcast { float, float }* %f to i8* 
  %2 = bitcast [2 x float]* %abi_cast to i8* 
  call void @llvm.memcpy.p0i8.p0i8.i64(i8* align 4 %1, i8* align 4 %2, i64 8, i1 false)
  %3 = bitcast { float, float }* %f to float*
  %4 = load float, float* %3, align 4
  %5 = getelementptr inbounds { float, float }, { float, float }* %f, i32 0, i32 1
  %6 = load float, float* %5, align 4
  %7 = fcmp une float %4, %6
  ret i1 %7
}

; Function Attrs: nounwind uwtable
define zeroext i1 @bar(i64) unnamed_addr #0 {
start:
  %abi_cast = alloca i64, align 8
  %f = alloca %Bar, align 4
  store i64 %0, i64* %abi_cast, align 8
  %1 = bitcast %Bar* %f to i8* 
  %2 = bitcast i64* %abi_cast to i8* 
  call void @llvm.memcpy.p0i8.p0i8.i64(i8* align 4 %1, i8* align 8 %2, i64 8, i1 false)
  %3 = bitcast %Bar* %f to float*
  %4 = load float, float* %3, align 4
  %5 = getelementptr inbounds %Bar, %Bar* %f, i32 0, i32 3
  %6 = load float, float* %5, align 4
  %7 = fcmp une float %4, %6
  ret i1 %7
}

For reference, the llvm-ir for the same source code, for x86_64:

target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128"
target triple = "x86_64-unknown-linux-gnu"

%Bar = type { [0 x i32], float, [0 x i32], float, [0 x i8], %"core::marker::PhantomData<()>", [0 x i8] }
%"core::marker::PhantomData<()>" = type {}

; Function Attrs: nounwind nonlazybind uwtable
define zeroext i1 @foo(double) unnamed_addr #0 {
start:
  %abi_cast = alloca double, align 8
  %f = alloca { float, float }, align 4
  store double %0, double* %abi_cast, align 8
  %1 = bitcast { float, float }* %f to i8* 
  %2 = bitcast double* %abi_cast to i8* 
  call void @llvm.memcpy.p0i8.p0i8.i64(i8* align 4 %1, i8* align 8 %2, i64 8, i1 false)
  %3 = bitcast { float, float }* %f to float*
  %4 = load float, float* %3, align 4
  %5 = getelementptr inbounds { float, float }, { float, float }* %f, i32 0, i32 1
  %6 = load float, float* %5, align 4
  %7 = fcmp une float %4, %6
  ret i1 %7
}

; Function Attrs: nounwind nonlazybind uwtable
define zeroext i1 @bar(double) unnamed_addr #0 {
start:
  %abi_cast = alloca double, align 8
  %f = alloca %Bar, align 4
  store double %0, double* %abi_cast, align 8
  %1 = bitcast %Bar* %f to i8* 
  %2 = bitcast double* %abi_cast to i8* 
  call void @llvm.memcpy.p0i8.p0i8.i64(i8* align 4 %1, i8* align 8 %2, i64 8, i1 false)
  %3 = bitcast %Bar* %f to float*
  %4 = load float, float* %3, align 4
  %5 = getelementptr inbounds %Bar, %Bar* %f, i32 0, i32 3
  %6 = load float, float* %5, align 4
  %7 = fcmp une float %4, %6
  ret i1 %7
}

The notable difference is i64 vs. double. Note that the above is the output of --emit llvm-ir with no opt-level. I don't know if that matches what rustc passes to llvm. (I don't remember what the option to output the llvm-ir before/after all passes is)

glandium commented Dec 16, 2018

It's interesting to note that x86-64 takes the two floats as a single double, while aarch64 takes two floats, but the main problem here is that bar is taking an i64 instead of two floats on aarch64...

glandium commented Dec 16, 2018

BTW, this also happens on aarch64-unknown-linux-gnu, and I've now confirmed that the IR before the first optimization pass is the same.

parched commented Dec 17, 2018

This happens because the PCS (the AArch64 Procedure Call Standard) treats homogeneous aggregates differently from inhomogeneous aggregates. Whether this is a bug depends on how zero-sized types are defined to behave in #[repr(C)] types. Should they be completely ignored?

emilio commented Dec 17, 2018

Yes, or at least that's the assumption that all of bindgen / cbindgen / the improper_ctypes lint make.

nikomatsakis commented Dec 18, 2018

Nominating: this is likely to be a very high priority for FF over the next month or so, and if we can get a fix in now that will be a big win.

parched commented Dec 18, 2018

FWIW the code to fix is here. The question is, should just PhantomData be ignored, or all ZSTs, or just ZSTs with alignment=1?

EDIT: or actually here but perhaps PhantomData should be stripped out of the type much earlier.

glandium commented Dec 18, 2018

There's nothing platform-dependent in that code, how come this only affects aarch64?

Edit: Oh, right #56877 (comment)

pnkfelix commented Dec 20, 2018

discussed at T-compiler meeting. P-high. assigning to self to make sure it doesn't get lost. (@nikomatsakis says "maybe we're close to a fix", which I assume is based on the comment thread here).

@pnkfelix pnkfelix self-assigned this Dec 20, 2018

@pnkfelix pnkfelix added P-high and removed I-nominated labels Dec 20, 2018

jrmuizel commented Dec 20, 2018

FWIW, cbindgen explicitly ignores PhantomData. Other ZST are not ignored. However, that seems like a bug in cbindgen: eqrion/cbindgen#262

arielb1 commented Dec 21, 2018

FWIW the code to fix is here. The question is, should just PhantomData be ignored, or all ZSTs, or just ZSTs with alignment=1?

If the struct has actual padding, then it won't be a homogeneous aggregate. Therefore, the only risk here is for structs whose alignment is increased by a ZST, but only up to the size of the type itself. I don't think there's a hazard in regarding these as homogeneous aggregates.

For example,

#[repr(C)]
struct Foo {
    a: f32,
    b: f32,
    c: [f64; 0]
}
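To make the alignment point concrete, here is a small sketch comparing the struct above with a plain two-float struct. The names `Plain`, `foo_layout` and `plain_layout` are invented for the example, and the 8-byte alignment of `f64` is an assumption that holds on common 64-bit targets such as x86_64 and aarch64.

```rust
use std::mem::{align_of, size_of};

#[repr(C)]
pub struct Foo {
    pub a: f32,
    pub b: f32,
    pub c: [f64; 0], // zero-sized, but carries the alignment of f64
}

#[repr(C)]
pub struct Plain {
    pub a: f32,
    pub b: f32,
}

/// Hypothetical helper: (size, alignment) of the struct with the ZST field.
pub fn foo_layout() -> (usize, usize) {
    (size_of::<Foo>(), align_of::<Foo>())
}

/// Hypothetical helper: (size, alignment) of the struct without it.
pub fn plain_layout() -> (usize, usize) {
    (size_of::<Plain>(), align_of::<Plain>())
}
```

The zero-length array raises the struct's alignment to that of `f64` without adding any padding, because the two `f32` fields already fill 8 bytes.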

EDIT: or actually here but perhaps PhantomData should be stripped out of the type much earlier.

No. self.field has to return fields in source order for things to work.

parched commented Dec 21, 2018

Therefore, the only risk here is for structs whose alignment is increased by a ZST, but only up to the size of the type itself. I don't think there's a hazard in regarding these as homogeneous aggregates.

Unfortunately I don't think that's the case. In C

struct X {
    float a;
    float b;
    int x []; // or int x[0]
};

is not a homogeneous aggregate. I think we have to have a special case just for PhantomData.

arielb1 commented Dec 21, 2018

I think we have to have a special case just for PhantomData.

Or some other kind of emptyness-tracking that treats [T; 0] as being ABI-relevant while not treating PhantomData as being ABI-relevant.

@nagisa nagisa assigned nikomatsakis and unassigned pnkfelix Jan 3, 2019

nagisa commented Jan 3, 2019

Discussed at the T-compiler. It feels as if this needs a strategy of some sort before any implementation work can proceed.

nikomatsakis commented Jan 3, 2019

Looking over the homogeneous-aggregate code, it already checks the following conditions:

  • Each field maps to same "unit" (register, etc) of size U
  • The offset of field F is equal to U*F (i.e., no padding between the fields)
  • The total size of the struct is equal to U*N where N is the number of fields (i.e., no padding at the end)

It seems like we could basically keep all of these conditions, but filter the field list to those with non-zero size, and everything would be fine. In particular, the concerns that @arielb1 raised here regarding alignment are being checked I believe.

So roughly speaking the definition of a "homogeneous aggregate" would be something like:

  • Let F be a list of fields whose types have non-zero size and N be the number of such fields
    • Let type(F[x]) be the type of the field with index x
  • Let U(T) be the "unit" used to pass a value of type T
    • I'm not entirely sure how to define a unit, but it's a concept pre-existing in the code
  • There must be some unit U0 used to pass each field f in F
    • that is, for all f in F, U(f) = U0
  • The offset of each field with index i must be i * sizeof(U0)
  • The total size of the aggregate must be N * sizeof(U0)

If all conditions are met, the type is a "homogeneous aggregate".
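The definition above can be sketched as a small standalone check. This is only an illustration under simplifying assumptions: `Unit`, `Field` and `homogeneous_unit` are hypothetical names, not rustc's actual types, and every non-zero-sized field is assumed to be a scalar occupying exactly one unit.

```rust
/// Hypothetical stand-in for the "unit" (register class) used to pass a value.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum Unit {
    Float32,
    Float64,
    Int32,
    Int64,
}

impl Unit {
    /// Size of the unit in bytes.
    pub fn size(self) -> u64 {
        match self {
            Unit::Float32 | Unit::Int32 => 4,
            Unit::Float64 | Unit::Int64 => 8,
        }
    }
}

/// Hypothetical field description; `unit` is `None` for zero-sized fields.
#[derive(Clone, Copy, Debug)]
pub struct Field {
    pub offset: u64,
    pub size: u64,
    pub unit: Option<Unit>,
}

/// Returns the common unit if the aggregate is homogeneous under the
/// proposed rule, ignoring all zero-sized fields.
pub fn homogeneous_unit(fields: &[Field], total_size: u64) -> Option<Unit> {
    // F: the fields whose types have non-zero size.
    let sized: Vec<&Field> = fields.iter().filter(|f| f.size != 0).collect();
    // U0: the unit of the first such field; an aggregate with none is rejected.
    let u0 = sized.first()?.unit?;
    for (i, f) in sized.iter().enumerate() {
        // Same unit for every field, and no padding between fields.
        if f.unit != Some(u0) || f.offset != i as u64 * u0.size() {
            return None;
        }
    }
    // No padding at the end.
    if total_size != sized.len() as u64 * u0.size() {
        return None;
    }
    Some(u0)
}
```

With this filter, a two-float struct with a trailing zero-sized field is classified the same way as the plain two-float struct.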

nikomatsakis commented Jan 3, 2019

Does that sound about right? This seems like it wouldn't be too hard to implement.

arielb1 commented Jan 3, 2019

@nikomatsakis

This does not handle the C case with an empty array

struct X {
    float a;
    float b;
    int x []; // or int x[0]
};

We need some way of defining empty arrays as "homogeneous aggregate breaking" while defining PhantomData as not.

nikomatsakis commented Jan 4, 2019

@arielb1

I see, I missed that subtlety.

It comes down to what the "filter" is that we apply to the types -- I specified "non-zero-size", but it could easily be "exclude empty structs". Or, perhaps, exclude anything that is zero-sized except for arrays whose length is non-zero (or whose element types are not zero-sized).

It seems to come down to whether we want a "whitelist" or a "blacklist" here.

It occurs to me that there is one other concern:

It is possible to define zero-sized structs in C with suitable compiler extensions (see this section of the nascent unsafe code guidelines for examples). Can anyone validate whether a struct like struct Foo { f64 x; f64 y; Bar bar; }; struct Bar { }; in C would be treated as a homogeneous aggregate? I created this godbolt example but I can't quite tell how to interpret it https://godbolt.org/z/hkaEda =)

nikomatsakis commented Jan 4, 2019

It seems clear though that if you have a #[repr(C)] empty struct that is embedded into another #[repr(C)] struct, it should be an aggregate iff the C compiler would consider it to be one.

We have some freedom to do otherwise for structs and types that are not #[repr(C)].

arielb1 commented Jan 4, 2019

I used return b.y to read things out instead of doing an addition (https://godbolt.org/z/eQHDrX). I first thought that an empty C struct behaves like an empty array, but that was an error: it does not. I think that an empty #[repr(Rust)] struct should behave "like PhantomData". I'll note that this is already sort of "improper_ctypes" territory: we shouldn't define it, but it feels like a better idea than special-casing PhantomData.

struct Foo {};

struct Baz {
    float x;
    float y;
    struct Foo b;
};

arielb1 commented Jan 4, 2019

Can someone who knows the code look at it? I think C compilers specifically have a problem with zero-length arrays and structs, but not with positive-length arrays or structs.

More examples:

// still a homogeneous aggregate, despite having a non-0-sized struct and float.
struct Foo2 { float t; };
struct Baz2 {
    struct Foo2 x;
    float y[1];
};

float sizeof_baz2(struct Baz2 b) {
    return b.y[0];
}

arielb1 commented Jan 4, 2019

i.e., I prefer the rule "if a type has a zero-sized repr(C) struct or zero-length array, it is not a homogeneous aggregate". I think the best way to put that is in a flag in Abi::Aggregate, because both of these cases will always be Abi::Aggregate.

EDIT 2019-01-09: the repr(C) struct bit was inaccurate.

cc @eddyb

arielb1 commented Jan 4, 2019

This is getting more baroque with unions:

struct Empty {};
struct Empty2 { struct Empty e; };
struct Empty3 { float z[0]; };
struct Empty4 { struct Empty3 e; };

union U1 {
    struct Empty s;
};

union U2 {
    struct Empty2 s;
};

union U3 {
    struct Empty3 s;
};

union U4 {
    struct Empty4 s;
};

// is an aggregate
struct Baz1 {
    float x;
    float y;
    union U1 u;
};

// is an aggregate
struct Baz2 {
    float x;
    float y;
    union U2 u;
};

// not an aggregate
struct Baz3 {
    float x;
    float y;
    union U3 u;
};

// not an aggregate
struct Baz4 {
    float x;
    float y;
    union U4 u;
};

arielb1 commented Jan 4, 2019

Actually, reading the code, I think it's mainly after "VLAs", which does make some sense.

arielb1 commented Jan 4, 2019

So from the code, the rule seems to be that having a zero-sized array member makes a struct not a VLA. I suppose we should just implement that using a flag in Abi::Aggregate.

nikomatsakis commented Jan 9, 2019

@arielb1 I am a bit confused by your last few comments. For example, you write:

// is an aggregate
struct Baz1 {
    float x;
    float y;
    union U1 u; // contains an empty struct
};

but you also wrote (emphasis mine)

I prefer the rule "if a type has a zero-sized repr(C) struct or zero-length array, it is not a homogeneous aggregate"

in this case, the type has a zero-sized C struct, and yet it is an aggregate. So I guess you are refining that rule? Or is there a distinction between structs that directly embed a zero-sized struct and those that do so via a union? (That seems like a bug, if so.)

Otherwise, I guess the rule is that:

  • zero-sized types are filtered out, as I originally wrote...
  • ...except for those that contain a zero-length array (transitively) (which is considered a VLA)?

I suppose we should just implement that using a flag in Abi::Aggregate.

I don't fully understand this but I'll have to review the code. Is the purpose of this flag to check whether the type contains a zero-sized array member (i.e., a VLA)?

nikomatsakis added a commit to nikomatsakis/rust that referenced this issue Jan 9, 2019

ignore zero-sized fields for homogeneous aggr tests except for VLA
This seems to match the behavior of clang as discussed here:

rust-lang#56877 (comment)

nikomatsakis commented Jan 9, 2019

I figured I'd try to get the ball rolling on something. I started messing around with a branch here:

https://github.com/nikomatsakis/rust/tree/issue-56877-abi-aggregates

The rule I was trying to implement is something like this:

  • Propagate a has_vla flag indicating if this is a #[repr(C)] that ends with a zero-length array
    • the flag is "transitive" meaning that the zero-length array may be buried in a substruct
  • Things with this flag are never considered homogeneous aggregates

Otherwise, we ignore ZST in the homogeneous aggregate calculation.
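The flag propagation described above might look roughly like the sketch below. The `Ty` model and the `has_vla` function are hypothetical simplifications written for this illustration, not the code on the branch; "ends with" is interpreted here as "the last field is, or transitively ends with, a zero-length array".

```rust
/// Hypothetical, simplified type model; not rustc's layout types.
pub enum Ty {
    Scalar { size: u64 },
    Array { elem: Box<Ty>, len: u64 },
    /// A #[repr(C)] struct with its fields in source order.
    ReprC(Vec<Ty>),
    /// A default-repr struct, e.g. PhantomData (modelled as having no fields).
    ReprRust(Vec<Ty>),
}

/// True if `ty` is a #[repr(C)] struct that ends with a zero-length array,
/// either directly or through a trailing #[repr(C)] substruct.
pub fn has_vla(ty: &Ty) -> bool {
    match ty {
        Ty::ReprC(fields) => match fields.last() {
            // The struct ends directly with a zero-length array.
            Some(Ty::Array { len: 0, .. }) => true,
            // Otherwise the flag is inherited from the last field.
            Some(last) => has_vla(last),
            None => false,
        },
        // Scalars, arrays on their own, and repr(Rust) structs never set it.
        _ => false,
    }
}
```

Under this rule a type with the flag set would never be treated as a homogeneous aggregate, while a trailing `PhantomData` leaves the flag unset and is simply skipped.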

This was my sort of best interpretation of what I think @arielb1 was saying, but I've not dug into the clang code. I'm also not that familiar with the ABI code so I may have messed things up. I've also not figured out yet how to write tests for this. =)

arielb1 commented Jan 10, 2019

I prefer the rule "if a type has a zero-sized repr(C) struct or zero-length array, it is not a homogeneous aggregate"

That was from before I saw the clang impl. The real clang impl treats only VLAs as bad.

nikomatsakis commented Jan 15, 2019

OK, so let me try to open up this branch as a PR and we'll take it from there. I have to figure out the testing, most of all.

nikomatsakis added a commit to nikomatsakis/rust that referenced this issue Jan 15, 2019

ignore zero-sized fields for homogeneous aggr tests except for VLA
This seems to match the behavior of clang as discussed here:

rust-lang#56877 (comment)

@nikomatsakis nikomatsakis referenced a pull request that will close this issue Jan 15, 2019

Open

abi aggregates #57645

bors added a commit that referenced this issue Jan 16, 2019

Auto merge of #57645 - nikomatsakis:issue-56877-abi-aggregates, r=<try>
abi aggregates

Ignore zero-sized types when computing whether something is a homogeneous aggregate, except be careful of VLA.

Fixes #56877

r? @arielb1
cc @eddyb

parched commented Jan 16, 2019

@nikomatsakis

It comes down to what the "filter" is that we apply to the types

The filter at the high level is "all fields that are removed from the rust struct declaration when you convert it to a C struct declaration", e.g. PhantomData, but more generally it's all repr(Rust) fields because they can't be declared in C. As a consequence of the Rust struct and C struct needing to be compatible they must also be zero-sized, non-alignment inducing and non-padding inducing. However, there's no real need to check for that in the homogeneity calculation because those must always be improper C types as the struct size wouldn't be the same.

The homogeneous_aggregate function is fine as is for non-repr(Rust) types, even for

#[repr(C)]
struct Foo {
    a: f32,
    b: f32,
    c: [i32; 0]
}

To be clear, there are three potential problematic array fields in C to deal with

  • flexible array members e.g., int x[]
  • zero length arrays e.g., int x[0]
  • variable length arrays (VLAs) e.g., int x[n]

The last of these I don't think can even be passed by value in Rust currently, e.g.,

#[repr(C)]
struct X {
    x : [i32],
}

pub extern "C" fn use_x(x : X) {
}
   |
11 | pub extern "C" fn use_x(x : X) {
   |                         ^ doesn't have a size known at compile-time
   |

The first two look to be handled differently on x86 😟 https://godbolt.org/z/o0z52H (although my x86 assembly isn't very good so maybe someone else can verify).


Something I found interesting in https://github.com/rust-rfcs/unsafe-code-guidelines/blob/9c9840297ca47d3085876cec7b59bb92d8554591/reference/src/layout/structs-and-tuples.md#function-call-abi-compatibility

#[repr(transparent)] can only be applied to structs with a single
field whose type T has non-zero size, along with some number of
other fields whose types are all zero-sized (typically
std::marker::PhantomData fields). The struct then takes on the "ABI
behavior" of the type T that has non-zero size.

This seems like the kind of filter we want here too, is there code for that somewhere we can reuse?

Alternatively that got me thinking, can the use case of adding PhantomData to repr(C) structs be covered using repr(transparent)?
e.g., instead of

#[repr(C)]
pub struct Bar {
    pub a: f32,
    pub b: f32,
    pub _unit: std::marker::PhantomData<()>,
}

do

#[repr(C)]
pub struct BarC {
    pub a: f32,
    pub b: f32,
}

#[repr(transparent)]
pub struct Bar {
    pub c: BarC,
    pub _unit: std::marker::PhantomData<()>,
}

then make PhantomData an improper C type again and the existing rustc code works?
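A compilable sketch of that wrapper idea follows. The `BarC` name matches what the wrapper's field refers to; the function `bar_differs` is a hypothetical stand-in for the original `bar`, and the sketch only shows that the wrapper has the same layout as the inner struct, not how any particular compiler version passes it.

```rust
use std::marker::PhantomData;

/// The C-visible struct: only fields that exist in the C declaration.
#[repr(C)]
#[derive(Clone, Copy)]
pub struct BarC {
    pub a: f32,
    pub b: f32,
}

/// Rust-side wrapper: exactly one non-zero-sized field plus the marker,
/// so it is intended to take on the ABI behavior of `BarC`.
#[repr(transparent)]
#[derive(Clone, Copy)]
pub struct Bar {
    pub c: BarC,
    pub _unit: PhantomData<()>,
}

/// Hypothetical counterpart of `bar` from the testcase.
pub extern "C" fn bar_differs(f: Bar) -> bool {
    f.c.a != f.c.b
}
```

The cost of this approach is the extra level of field access (`f.c.a` instead of `f.a`) on the Rust side.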

parched commented Jan 17, 2019

The first two look to be handled differently on x86 😟 https://godbolt.org/z/o0z52H (although my x86 assembly isn't very good so maybe someone else can verify).

Actually, it looks like the x86 ABI for flexible array members changed in GCC 4.4, but clang still uses the old behavior. Am I interpreting that right? Does Rust claim that flexible array members are representable in Rust as zero-sized arrays? If so, there might be an issue; if not, it's a potential foot-gun if someone assumes so.

nagisa commented Jan 18, 2019

By the way, I’ve found just today that

struct banana {
    int peach[0];
};

and

struct banana {
    int peach[];
};

are not equivalent. Namely the following code is UB with the first structure but fine with the second, suggesting that only the second structure is variable-length...

int foo(struct banana x) {
    return x.peach[0];
}

EDIT: I’m surprised I’ve found it only today as codebase at my $dayjob is littered with the former variant for VL aggregates...

nikomatsakis commented Jan 18, 2019

@parched

The filter at the high level is "all fields that are removed from the rust struct declaration when you convert it to a C struct declaration",...but more generally it's all #[repr(Rust)] fields because they can't be declared in C.

This doesn't sound right. If a #[repr(Rust)] struct has non-zero-size, it can't just be removed when converted to C -- rather, the type itself cannot be represented in C. But that data is still there and can't be ignored.

Does rust claim that flexible array members are representable in rust as zero sized arrays

I don't think we have another way to declare it, really, so presently the answer is yes. I presume here that "flexible array member" means T[] vs T[0]? (As @nagisa noted, I always considered foo[0] and foo[] equivalent in C and have definitely seen lots of code that uses foo[0]...)

nikic commented Jan 18, 2019

@nagisa int peach[0] is not legal C, but commonly accepted as a compiler extension from pre-C99 times. Similarly int peach[1] is generally excluded from optimizations that depend on the array length, because it is a common flexible array member idiom in pre-C99 code (or for that matter, code that needs to interoperate with C++).

Edit: That is, if reading OOB of a zero or one sized array in tail position results in a miscompile, that's usually a compiler bug, even though it's technically UB. Your particular example is a bit odd because the struct is passed by value, and the usual guarantees for the struct hack patterns probably don't apply there.

nikomatsakis commented Jan 18, 2019

Some notes:

  • First off, when @arielb1 wrote "VLA" I was presuming they meant what @parched calls "flexible array members". =) But perhaps we should adopt the "flexible" terminology.
  • Second, we probably need to address (at the lang level) the fact that we can't draw all the distinctions that C can (e.g., flexible vs zero-length etc). But that will take some time.

I am wondering if we can agree on a pragmatic compromise to get us going forward, and leave some amount of work for later. To some extent, I think my PR represents a decent shot at such a compromise, since it seems to match clang (maybe?) and older gcc, and it allows one to represent flexible arrays using zero-length arrays. It's not perfect if the C code uses zero-length arrays in the newer sense (not the older, flexible sense), though you can add a "dummy" member after the ZLA (e.g., PhantomData<()>) as a workaround.

A more conservative rule might be to just filter out zero-sized, repr(Rust) structs (which would certainly cover phantomdata). This would not match the C behavior for zero-sized C structs or arrays, but those are quite unusual. On the other hand, it feels kind of strictly worse than my current PR, because it is basically just wrong more often (put another way, my PR does match C behavior for zero-sized structs).

At least, this is how I understand it now.

nikomatsakis commented Jan 18, 2019

(I should probably try to draw up some concrete examples to make my points here)

parched commented Jan 21, 2019

@parched

The filter at the high level is "all fields that are removed from the rust struct declaration when you convert it to a C struct declaration",...but more generally it's all #[repr(Rust)] fields because they can't be declared in C.

This doesn't sound right. If a #[repr(Rust)] struct has non-zero-size, it can't just be removed when converted to C -- rather, the type itself cannot be represented in C. But that data is still there and can't be ignored.

Yes, but what I meant in the sentence following that was that we don't need to worry about it in the ABI computation, because code like that would be ill-formed regardless of the function call ABI, i.e., their memory layouts wouldn't be compatible.


To be clear, what I gather from the disassembly here is

struct X {
    float a;
    float b;
    int c [];
};

struct Y {
    float a;
    float b;
    int c[0];
};

are treated the same on:

  • AArch64 GCC
  • AArch64 Clang
  • x86_64 GCC >= 4.4

but differently on:

  • x86_64 GCC < 4.4
  • x86_64 Clang

IMO Rust should match the second one (struct Y) where they differ because it doesn't really make sense to pass a struct with a flexible array member by value because it will get sliced, or does it? Whereas I can imagine cases where you do want to pass a struct with a zero-sized array by value, like:

#define NUM_THINGS 5 // change this to suit, might be zero

struct Y {
    float a;
    float b;
    int things[NUM_THINGS];
};

A more conservative rule might be to just filter out zero-sized, repr(Rust) structs (which would certainly cover phantomdata). This would not match the C behavior for zero-sized C structs or arrays, but those are quite unusual.

I think this is the way to go. As I understand it, it would match the C behaviour in those cases, no?

nikomatsakis commented Jan 22, 2019

@parched

If I understand you, you are arguing that we should not consider [T; 0] to be a "flexible array member". In effect, we would basically be saying that we have no way (in Rust) to express that. I find this logic in particular persuasive:

it doesn't really make sense to pass a struct with a flexible array member by value because it will get sliced

However, that leaves me in a bit of a quandary. In particular, what set of ZST are we going to exclude when performing the aggregate test? At minimum, that set should include #[repr(Rust)] structs (which includes PhantomData).

I'm not sure what #[repr(C)] types it should include -- it sounds to me like you are arguing it should include [T; 0] and probably all zero-sized structs. This would then match the first set of compilers you gave:

  • AArch64 GCC
  • AArch64 Clang
  • x86_64 GCC >= 4.4

but it would not match:

  • x86_64 GCC < 4.4
  • x86_64 Clang

Given that GCC changed behavior here, I think matching the new behavior makes sense. And obviously matching AArch64 is our real goal here (I'm not sure how important this test is for x86_64).

I think this is the way to go. As I understand it, it would match the C behaviour in those cases, no?

I'm not entirely sure what you meant by this. My "conservative" proposal is just to filter out repr-rust things, but in that case we would NOT match the C behavior, since a struct with a zero-sized repr-C thing would not be considered a homogeneous aggregate. This seems like an ok intermediate step but obviously not the final goal, which ought to be compatibility with C compilers.

nikomatsakis commented Jan 22, 2019

I'd like to land something ASAP. I'd be happy to modify my PR to one of two things:

  • Only filter out repr-rust and phantomdata types
  • Filter out all zero-sized types

Either fixes the immediate problem. The latter seems to me to be more compatible with C-Rust interop. The question boils down to what you consider the Rust type [T; 0] to map to in C -- is it T[0] or T[]?
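The difference between the two options can be written down as a predicate over fields. This is a rough sketch with invented names (`Kind`, `FieldInfo`, `Filter`, `is_ignored`); it only models which fields each option would skip in the homogeneous-aggregate test, not whether the result matches any particular C compiler.

```rust
/// Hypothetical classification of a field's type.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum Kind {
    ReprRustStruct, // e.g. PhantomData
    ReprCStruct,
    Array,
    Scalar,
}

/// Hypothetical field summary: its kind and its size in bytes.
#[derive(Clone, Copy, Debug)]
pub struct FieldInfo {
    pub kind: Kind,
    pub size: u64,
}

/// The two candidate rules from the comment above.
#[derive(Clone, Copy, Debug)]
pub enum Filter {
    OnlyReprRust,
    AllZst,
}

/// True if `field` would be skipped by the homogeneous-aggregate test.
pub fn is_ignored(filter: Filter, field: FieldInfo) -> bool {
    let zero_sized = field.size == 0;
    match filter {
        // Skip only zero-sized repr(Rust) structs such as PhantomData.
        Filter::OnlyReprRust => zero_sized && field.kind == Kind::ReprRustStruct,
        // Skip every zero-sized field, including [T; 0] and empty repr(C) structs.
        Filter::AllZst => zero_sized,
    }
}
```

Both rules skip `PhantomData`, which is why either one fixes the testcase in this issue; they differ only for zero-sized repr(C) structs and zero-length arrays.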

If we think about compatibility with newer GCC (>= 4.4), then I think the two options work out like this:

Pattern | Filter only rust | Filter all ZST
repr(C) struct with phantomdata in Rust decl | |
repr(C) struct with empty repr(C) struct | |
repr(C) struct with T[0] in C decl and [T; 0] in Rust decl | |
repr(C) struct with T[] in C decl and [T; 0] in Rust decl | |

The first line corresponds to this issue. However, as @parched points out, the final line may be a moot point, since such types ought not to be passed by value.

(Note: I wonder if we want something like FlexibleArray<T> as a special marker type for translating T[] to Rust?)
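For illustration only, such a marker could be as small as the sketch below. Nothing like `FlexibleArray<T>` exists in the standard library; the type, its methods, and the `Packet` example are all hypothetical, and the sketch says nothing about how the compiler would treat the marker for ABI purposes.

```rust
/// Hypothetical marker for a C flexible array member (`T x[]`).
/// It is zero-sized but keeps the alignment of `T`.
#[repr(C)]
pub struct FlexibleArray<T>([T; 0]);

impl<T> FlexibleArray<T> {
    /// Creates the (empty) marker value.
    pub const fn new() -> Self {
        FlexibleArray([])
    }

    /// Pointer to where the trailing elements would begin.
    pub fn as_ptr(&self) -> *const T {
        self.0.as_ptr()
    }
}

/// Hypothetical Rust mirror of `struct packet { uint32_t len; uint8_t data[]; };`.
#[repr(C)]
pub struct Packet {
    pub len: u32,
    pub data: FlexibleArray<u8>,
}
```

A dedicated type like this would let the compiler distinguish `T[]` from `T[0]`, which a plain `[T; 0]` field cannot express.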

nikomatsakis added a commit to nikomatsakis/rust that referenced this issue Jan 22, 2019

ignore zero-sized fields for homogeneous aggr tests except for VLA
This seems to match the behavior of clang as discussed here:

rust-lang#56877 (comment)

nikomatsakis commented Jan 22, 2019

After some discussion on Zulip, I am currently leaning towards "just filter rust types" as a simple, "conservative" step. I think likely we will want to make further steps here eventually but this should solve the immediate problem and seems uncontroversial.
