Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

C interop: returning structs by value from imported C functions looses content #3211

Closed
floooh opened this issue Sep 10, 2019 · 13 comments
Closed
Labels
bug Observed behavior contradicts documented or intended behavior miscompilation The compiler reports success but produces semantically incorrect code. stage1 The process of building from source via WebAssembly and the C backend.
Milestone

Comments

@floooh
Copy link
Contributor

floooh commented Sep 10, 2019

I stumbled over a curious problem today while continuing with the Zig bindings for my C headers. Some of my C functions return structs by value, and those returned structs don't arrive with their content intact on the Zig side.

I wrote a minimal reproduction test case:

https://github.com/floooh/zig-test

When you do a zig build run in there you should see output like this:

C struct:
  .instancing: true
  .origin_top_left: true
  .multiple_render_targets: true
  .msaa_render_targets: true
  .imagetype_3d: true
  .imagetype_array: true
  .image_clamp_to_border: true
Zig struct:
  .instancing: false
  .origin_top_left: false
  .multiple_render_targets: false
  .msaa_render_targets: false
  .imagetype_3d: false
  .imagetype_array: false
  .image_clamp_to_border: false

Note how the values under C struct are different from the values under Zig struct, they should be the same (all should be "true")

The project consists of a C header/source pair test.h/.c and the Zig main file test.zig.

The C part mainly defines a function query_features which returns a struct by value made entirely of bools, and all bools are set to true:

https://github.com/floooh/zig-test/blob/d8a409e621e1f635cad07e08ac829ccb22971ff3/src/test.c#L5-L16

There's also another C function which calls the query_features function and prints the content of the returned struct. This is used to check that returning the struct works in C:

https://github.com/floooh/zig-test/blob/d8a409e621e1f635cad07e08ac829ccb22971ff3/src/test.c#L19-L29

The Zig main file imports the C header and calls both functions

https://github.com/floooh/zig-test/blob/d8a409e621e1f635cad07e08ac829ccb22971ff3/src/test.zig#L1-L20

Here's the problem:

https://github.com/floooh/zig-test/blob/d8a409e621e1f635cad07e08ac829ccb22971ff3/src/test.zig#L11

The returned Zig struct imported from C seems to have lost its content. In this test I always get a struct back that's entirely filled with "false", but in more interesting cases (my C headers) the content changes from run to run, as if the struct contains junk.

Another interesting effect is that when I change all the bools in the C struct to int, then the expected content is returned (all 1's).

The Zig version is compiled from HEAD from a few days ago, OS is macOS.

@floooh
Copy link
Contributor Author

floooh commented Sep 11, 2019

PS: when building with the stable zig 0.4.0+ version, I get the "junk" behaviour where the result differs between runs, e.g. one run is:

C struct:
  .instancing: true
  .origin_top_left: true
  .multiple_render_targets: true
  .msaa_render_targets: true
  .imagetype_3d: true
  .imagetype_array: true
  .image_clamp_to_border: true
Zig struct:
  .instancing: false
  .origin_top_left: false
  .multiple_render_targets: false
  .msaa_render_targets: true
  .imagetype_3d: true
  .imagetype_array: false
  .image_clamp_to_border: false

And another run is:

C struct:
  .instancing: true
  .origin_top_left: true
  .multiple_render_targets: true
  .msaa_render_targets: true
  .imagetype_3d: true
  .imagetype_array: true
  .image_clamp_to_border: true
Zig struct:
  .instancing: false
  .origin_top_left: false
  .multiple_render_targets: true
  .msaa_render_targets: false
  .imagetype_3d: true
  .imagetype_array: false
  .image_clamp_to_border: false

(note how for instance .msaa_render_targets is true in one run and false in the next.

@andrewrk andrewrk added this to the 0.6.0 milestone Sep 11, 2019
@andrewrk andrewrk added the bug Observed behavior contradicts documented or intended behavior label Sep 11, 2019
@andrewrk
Copy link
Member

Thanks for the report! I don't think there is enough time to get this in for 0.5.0 but definitely want to fix this.

@floooh
Copy link
Contributor Author

floooh commented Sep 11, 2019

No worries!

@gustavolsson
Copy link
Contributor

gustavolsson commented Oct 1, 2019

This seems to work fine on 0.5.0 and windows :)

C struct:
  .instancing: true
  .origin_top_left: true
  .multiple_render_targets: false
  .msaa_render_targets: true
  .imagetype_3d: false
  .imagetype_array: true
  .image_clamp_to_border: true
Zig struct:
  .instancing: true
  .origin_top_left: true
  .multiple_render_targets: false
  .msaa_render_targets: true
  .imagetype_3d: false
  .imagetype_array: true
  .image_clamp_to_border: true

@floooh
Copy link
Contributor Author

floooh commented Oct 1, 2019

I'll give it another try in the evening (a few hours from now) on macOS (I'll test both with the 0.5.0 release and the current HEAD version).

@floooh
Copy link
Contributor Author

floooh commented Oct 2, 2019

Ok, I just tested the 0.5.0 release on macOS (Cataline Beta) and I still see the faulty behavior.

Strange that it would behave differently between Mac and Windows. I used the precompiled tar.xz Zig distribution instead of the homebrew version, because compiling the LLVM dependency fails for some reason on the macOS Catalina Beta.

@jacereda
Copy link

The C function compiles to:

_query_features:
00000001000007a0	movabsq	$0x1010101010101, %rax  ## imm = 0x1010101010101
00000001000007aa	retq

So, it's returning short structs in registers.

The call site:

00000001000009e0	callq	_query_features
00000001000009e5	movb	0x16(%rsp), %al
00000001000009e9	movb	%al, 0xf(%rsp)
00000001000009ed	movb	0x15(%rsp), %al
00000001000009f1	movb	%al, 0xe(%rsp)
00000001000009f5	movb	0x14(%rsp), %al
00000001000009f9	movb	%al, 0xd(%rsp)
00000001000009fd	movb	0x13(%rsp), %al
0000000100000a01	movb	%al, 0xc(%rsp)
0000000100000a05	movb	0x12(%rsp), %r13b
0000000100000a0a	movb	0x10(%rsp), %r14b
0000000100000a0f	movb	0x11(%rsp), %r12b

So, it's expecting the struct to be on the stack.

@jacereda
Copy link

zig targets reports:

C ABIs:
  none
  gnu (native)
  gnuabin32
  gnuabi64
  gnueabi
  gnueabihf
  gnux32
  code16
  eabi
  eabihf
  elfv1
  elfv2
  android
  musl
  musleabi
  musleabihf
  msvc
  itanium
  cygnus
  coreclr
  simulator
  macabi

Should gnu be the native ABI on Darwin?

@andrewrk andrewrk added the stage1 The process of building from source via WebAssembly and the C backend. label Feb 13, 2020
@Sobeston
Copy link
Sponsor Contributor

Just cloned it now. Had to add exe.linkLibC(); for it to build for stdio.h, but I'm getting correct behavior with windows, x86-64 in that everything is true. Using a new master build.

Changing the build script for .x86_64, .linux, .gnu means I'm getting all false under Zig struct (with all true under C struct). This behavior is consistent between runs.

@andrewrk andrewrk added the miscompilation The compiler reports success but produces semantically incorrect code. label Mar 4, 2020
@andrewrk andrewrk modified the milestones: 0.6.0, 0.7.0 Apr 5, 2020
@andrewrk andrewrk modified the milestones: 0.7.0, 0.8.0 Oct 17, 2020
@andrewrk andrewrk modified the milestones: 0.8.0, 0.9.0 Nov 6, 2020
@floooh
Copy link
Contributor Author

floooh commented Jan 2, 2021

Some new info from my side (since I noticed some more strange behaviour while continuing on my Sokol-header Zig bindings):

The returned struct size matters: the problem only appears if the return-struct is <= 16 bytes. If the struct size is above 16 bytes, the returned struct is not corrupted (tested on Linux and Mac).

To verify just add an int x[4]; to the end of the test struct here:

https://github.com/floooh/zig-test/blob/6b97b330eafd0515ba22a555c5e31c19aa39dedb/src/test.h#L10

...then do another zig build run and the result looks correct.

I also found out that once such a "return value corruption" happens, the function parameters of a function which returns such a <16 bytes struct may also get corrupted.

This 16-bytes cutoff looks to me like it's an ABI issue.

@ifreund
Copy link
Member

ifreund commented Jan 2, 2021

This 16-bytes cutoff looks to me like it's an ABI issue.

Indeed, it's a known limitation of stage1 but it looks like nobody's linked #1481 yet.

@andrewrk andrewrk modified the milestones: 0.9.0, 0.10.0 May 19, 2021
@floooh
Copy link
Contributor Author

floooh commented Aug 5, 2021

Just a heads up, it seems this issue is fixed in the current zig 0.9.0 dev version. I no longer need to pad small C struct args and return values to get above 16 bytes on macOS.

@ifreund
Copy link
Member

ifreund commented Aug 5, 2021

Indeed, looks like this particular case of #1481 has been fixed by #9443.

@ifreund ifreund closed this as completed Aug 5, 2021
@ifreund ifreund removed this from the 0.10.0 milestone Aug 5, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Observed behavior contradicts documented or intended behavior miscompilation The compiler reports success but produces semantically incorrect code. stage1 The process of building from source via WebAssembly and the C backend.
Projects
None yet
Development

No branches or pull requests

6 participants