Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

More extern type validation #7477

Closed
wants to merge 2 commits into from
Closed

Conversation

LemonBoy
Copy link
Contributor

Spotted in #7336 by @andrewrk.

@Vexu Vexu added the breaking Implementing this issue could cause existing code to no longer compile or have different behavior. label Dec 17, 2020
@tadeokondrak
Copy link
Contributor

Isn't this intentional? #6700 (comment)

@LemonBoy
Copy link
Contributor Author

Isn't this intentional? #6700 (comment)

If you're exporting a pointer to some object you can't use in C I'd expect opaque{} to be used, you can easily get from/to the original type by writing some small helpers like I did for std.Progress and std.Progress.Node.
This is also the only case where the child type is not validated, arrays are also passed by reference but the compiler checks the element type.

@andrewrk
Copy link
Member

My apologies, please see #7336 (comment)

@LemonBoy
Copy link
Contributor Author

I do like this extra check, it prevents a nasty footgun like

const S = struct { ... }; // oh no! I skipped the extern!
extern fn foo(x: *S) void; // oh no!

and makes the language more "regular", this is the only case where we don't check the child type.

Validate the pointer child type when checking if a given type is valid
for an extern function.
@ifreund
Copy link
Member

ifreund commented Dec 28, 2020

19:54 <TheLemonMan> andrewrk, what's your stance on #7477 ?
19:55 <andrewrk> where's the footgun in that last comment there?
19:56 <pjz> any chance on reconsidering #447 at least wrt non-pub functions ?
19:56 <TheLemonMan> the missing `extern`
19:56 <TheLemonMan> that's even worse when you're dealing with C APIs using *c_void
19:58 <andrewrk> I see, because the layout won't match the C version
20:05 <TheLemonMan> yeah, that's a source of hard-to-debug problems
20:11 <andrewrk> TheLemonMan, my stance is that ABI boundaries are unavoidable footguns. A single typo can result in this kind of bug. Also `packed` vs `extern` or a missing `align`. So I don't find this argument compelling. I think pointers to things should be OK in the C ABI boundary
20:11 <TheLemonMan> it's a pointer to something you can't use from the C side
20:12 <TheLemonMan> that's something you'd use @opaque for, and I think that's also what -femit-h does
20:12 <andrewrk> you use opaque for when the C code has the definition; you just use a pointer when your own code has the definition and the C side doesn't
20:12 <andrewrk> doesn't your patch introduce pointer casts that previously weren't there?
20:13 <andrewrk> yep there it is
20:13 <andrewrk> no good; this introduces a footgun while promising to take one away
20:13 <TheLemonMan> the casts are done on the Zig side because we're crossing the boundary twice
20:14 <andrewrk> the zig code has no boundary to cross
20:14 <fengb> It’d just be a forward declared struct in C
20:15 <fengb> And we could avoid defining an extra export function just for C
20:15 <TheLemonMan> the front-end is written in Zig, the backend is written in C++ and calls into the Zig code
20:15 <TheLemonMan> so the Progress objects go Zig -> C++ -> Zig, that's why you need the ptrcast
20:18 <andrewrk> the zig code is all one object; the Progress struct in both cases is part of the same compilation
20:21 <TheLemonMan> yes? I'm not following anymore
20:23 <TheLemonMan> I think @opaque is the solution whenever you don't want/you can't leak the definition to the C side, like we do for the Progress structure
20:24 <TheLemonMan> like a *c_void, but type-safe
20:26 <andrewrk> it's the other way around, opaque is the solution when you can't leak the definition to the Zig side
20:26 <andrewrk> the type is opaque *to zig*
20:27 <andrewrk> opaque is the wrong type for std.Progress
20:27 <ifreund> I've had crashes because I forgot an extern before while writing wlroots bindings and this check would have prevented that code from compiling 
20:28 <andrewrk> I recognize that as one of many footguns at the ABI layer. I don't think it stands out as any more precarious than any other one, and it introduces footguns to take this feature away (see the new pointer casts in #7477)
20:28 <ifreund> I think requiring an explicit @ptrCast() on the ABI boundry when passing a non-extern struct is a good tradeoff
20:29 <TheLemonMan> well, the opaqueness is valid on both sides of the ABI boundary, why not take advantage of the "other half" ?
20:29 <ifreund> I may be missing something but I just see more verbosity not new footguns
20:29 <andrewrk> ifreund, I disagree. That same logic could be used to argue that extern function declarations should have no types at all, and you should have to pointer cast to get any type out of an external symbol. At which point we've reinvented extern fn declarations
20:30 <andrewrk> the fact is that an ABI declaration must be correct and there is no type safety available to check your work. a pointer to a zig struct is a valid type of parameter in a C ABI function
20:31 <TheLemonMan> how is that specific @ptrCast a footgun? every single @ptrCast is a footgun if you're not careful enough
20:31 <TheLemonMan> and I'd expect people to know what they're doing when playing with this kind of non-trivial C/Zig interop
20:31 <andrewrk> that's my point. if we treat all footguns as equal you've removed one and you've added one, so net zero, while making the code more complex
20:34 <TheLemonMan> I wouldn't say it's more complex, there are only 3-4 @ptrCast in the whole codebase (compiler+runtime+stdlib) that are going away once the stage1 gets the boot
20:34 <TheLemonMan> and, as I wrote on the ticket, this is the only irregular case where the child type isn't checked
20:35 <TheLemonMan> if you declare the parameter as a C array (that eventually decays into a pointer anyway) you get a cold hard compile error
20:36 <andrewrk> I don't understand that argument - it's also the only case where it's a pointer, so the meaning of a child type is different
20:36 <ifreund> That's a good point, one of these foot guns is much more common in practice 
20:38 <andrewrk> what do you mean to declare the parameter as a C array? zig doesn't have "C array" types
20:38 <andrewrk> zig arrays have no corresponding C type, hence compile error
20:40 <andrewrk> as a reminder we also have https://github.com/ziglang/zig/issues/168 which includes a partial shuffling of fields for debug builds to catch issues such as this
20:40 <ifreund> I think TheLemonMan was saying that you aren't currently allowed to pass an array of non C ABI types over they ABI boundry without a pointer cast
20:40 <andrewrk> right because there is no C type corresponding to that. But if it were a pointer to such a thing, there would be a C type for that - a forward declared struct
20:41 <TheLemonMan> in iter_function_params_c_abi arrays are passed as pointers and type_allowed_in_extern does allow arrays
20:42 <TheLemonMan> from the C POV there's no difference between the two, so why the difference?
20:43 <andrewrk> https://clbin.com/ZSPEk
20:44 <andrewrk> this is correct, if you find an example where it allows arrays, that's an implementation bug
20:44 <TheLemonMan> alright, I stand corrected
20:46 <andrewrk> I'm pretty confident in this design decision
20:46 <TheLemonMan> but it still feels like a weird special case, arrays not being allowed aside
20:46 <ifreund> I don't hate the status quo tbh. Eventually someone will probably write a static analysis tool to give me warnings for this
20:47 <TheLemonMan> if we take river as an example, this change would introduce zero @ptrCast, right?
20:48 <andrewrk> in zig code we model external things strictly from zig's perspective
20:48 <ifreund> TheLemonMan: I disagree that it feels like a special case. The pointer itself has a well defined representation on the ABI boundry, so it's allowed.
20:48 <ifreund> It feels different from how C headers work for sure though
20:48 <andrewrk> the other side of C ABI might not even be C. It could be any other language, even zig
20:49 <TheLemonMan> a pointer to... something you can't declare! isn't that an opaque type?
20:49 <ifreund> it would be in a C header declaring this function
20:49 <andrewrk> but we can declare it. the definition is right over there: struct Foo { a: T, b: T}
20:49 <TheLemonMan> yeah, if you twist the meaning of opaque to mean "opaque for Zig" then it makes sense
20:50 <TheLemonMan> I meant on the C side
20:50 <ifreund> that's how I interpert opaque currently at least
20:50 <ifreund> extern definitons aren't an API spec though
20:50 <ifreund> if you want someone to use your library you'd write a C header and use an opaque struct
20:51 <andrewrk> right. and with a zig-emitted .h file it would emit a forward declared struct
20:51 <TheLemonMan> my interpretation is a bit wider, I'm using that for types I want to keep on one or the other side
20:52 <ifreund> TheLemonMan: you're right though, if we look only at river and it's dependencies/bindings this change would be a strict improvement to safety
20:52 <ifreund> river is not a good example though for this use case
20:52 <TheLemonMan> I'll make my own fork of Zig with blackjack and hookers and @opaque types
20:52 <ifreund> I expect this to come up a lot more when writing Zig libraries exporting a C ABI
20:52 <companion_cube> you could call it zen++
20:52 <companion_cube> or zune
20:53 <TheLemonMan> well river is heavy on handcrafted C bindings
20:53 <TheLemonMan> zen++, I like how that sounds
20:53 <companion_cube> river.zenpp
20:54 <ifreund> TheLemonMan: indeed, but you won't find a single export statement
20:54 <andrewrk> hmmmm. in theory I think we could make it a compile error if the type is never exported
20:55 <TheLemonMan> I think the biggest library written in Zig is the compiler_rt, that's full of @export
20:55 <andrewrk> it's only because we export the type that it could ever be passed in an extern fn
20:55 <andrewrk> the difference between export / extern is a nice observation
20:57 <ifreund> TheLemonMan: if I were to implement e.g. libwayland's API in zig I would want to use non-extern structs behind all the opaque pointers for example
20:57 <ifreund> alright andrewrk's convinced me that the status quo is better
20:58 <TheLemonMan> under the "opaque for zig" light that makes sense
21:00 <TheLemonMan> I'm still not sold tho, next time you get a nasty crash because of a missing extern please think of me :>
21:01 <andrewrk> ok, deal
21:01 <ifreund> TheLemonMan: I still want a static analysis tool to spit warnings at me :D
21:01 <ifreund> but that can come way down the road
21:01 <ifreund> hmm, should probably paste this chat log as a comment on the PR...

@SpexGuy
Copy link
Contributor

SpexGuy commented Dec 28, 2020

Another note on this: if we remove the result type from casts, this becomes a much bigger footgun - changing the type of a parameter would not cause compile errors at call sites because of the ptrCast.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
breaking Implementing this issue could cause existing code to no longer compile or have different behavior.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants