New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove confusing comment about ideally using `!` for `c_void` #56594

Open
wants to merge 2 commits into
base: master
from

Conversation

Projects
None yet
6 participants
@sdroege
Contributor

sdroege commented Dec 7, 2018

Using ! for c_void would have the problem that pointers and
potentially references to an uninhabited type would be created, and at
least for references this is UB.

In addition document that newtype wrappers around c_void can be used
safely in place of extern type until the latter is stabilized.


I'm not 100% sure about the usage for opaque types as the nomicon still recommends using #[repr(C)] pub struct Foo { _private: [u8; 0] } but it seems like these two should be equivalent in the end? Also the #[repr(C)] (in both cases) should be unneeded because such types never being passed by value, never being dereferenced but only passed around as pointer or reference, so the representation of (values of) the type itself should not matter at all?

Also in context of c_void and ! the second unresolved question in the extern type stabilization ticket seems relevant

In std's source, it is mentioned that LLVM expects i8* for C's void*.
We'd need to continue to hack this for the two c_voids in std and libc.
But perhaps this should be done across-the-board for all extern types?
Somebody should check what Clang does.

Please correct me if my understanding is wrong and everything's actually fine as is.

@rust-highfive

This comment has been minimized.

Collaborator

rust-highfive commented Dec 7, 2018

r? @TimNN

(rust_highfive has picked a reviewer for you, use r? to override)

@sdroege

This comment has been minimized.

Contributor

sdroege commented Dec 7, 2018

This might be something @RalfJung could know and clarify :)

@RalfJung

This comment has been minimized.

Member

RalfJung commented Dec 8, 2018

at least for references this is UB.

Maybe. That's not decided yet (and I think it shouldn't be).

But &*x on x: ! is UB. And for mutable references, the compiler will insert such reborrows automatically in lots of places. References to uninhabited types are certainly a huge footgun waiting to go off, and almost impossible to create/use UB-free.

@RalfJung

This comment has been minimized.

Member

RalfJung commented Dec 8, 2018

So concerning this PR, I basically agree with this change, but there are some comments here that are still dangerously misleading. Namely, the comments say that the two variants shouldn't have to exist. But that would be fatal! An enum with no variants behaves exactly like ! as far as UB is concerned.

So the non-doc-comments here should also be fixed to explain that we need at least one variant to avoid UB.

I'm not 100% sure about the usage for opaque types as the nomicon still recommends using #[repr(C)] pub struct Foo { _private: [u8; 0] } but it seems like these two should be equivalent in the end?

Cc @Gankro

The nomicon pattern is designed to avoid the non-FFI-safe-types warning. Does this enum also avoid that?

@sdroege

This comment has been minimized.

Contributor

sdroege commented Dec 10, 2018

Thanks a lot for the clarifications!

So the non-doc-comments here should also be fixed to explain that we need at least one variant to avoid UB.

Thanks, I've updated the comment and reason for the variants being unstable

at least for references this is UB.

Maybe. That's not decided yet (and I think it shouldn't be).

Ok, intuitively it would seem like such references should never exist similar to how references to NULL should never exist as they simply don't point to a valid value of the given type. But I guess NULL-references are also only UB once they are dereferenced?

The nomicon pattern is designed to avoid the non-FFI-safe-types warning. Does this enum also avoid that?

There don't seem to be any warnings, no. See https://play.rust-lang.org/?version=stable&mode=debug&edition=2018&gist=7c3c81c0b724dafbb2ed11d6ee943697

use std::ffi::c_void;

pub struct Foo(c_void);

pub extern "C" fn meh1(m: *mut Foo) -> *mut Foo { m }

#[repr(C)]
pub struct Bar {
    v: *mut Foo,
}

pub extern "C" fn meh2(m: *mut Bar) -> *mut Bar { m }
@RalfJung

This comment has been minimized.

Member

RalfJung commented Dec 10, 2018

But I guess NULL-references are also only UB once they are dereferenced?

No, references must never be NULL. That's part of their validity invariant.

Ok, intuitively it would seem like such references should never exist similar to how references to NULL should never exist as they simply don't point to a valid value of the given type.

Uninhabitedness is not a special property when it comes to UB. I think "is having a &! UB" should have the same answer as "is it UB to have an &bool that points to 2". That's a discussion that the UCG WG is going to have soon, but there are quite a few use-cases for references to uninitialized data, so I don't think we should require a reference to always point to valid data.

There don't seem to be any warnings

That's great! @SimonSapin do you remember why you suggested this empty-array pattern in rust-lang-nursery/nomicon#44 instead of a newtype around c_void?

@RalfJung

This comment has been minimized.

Member

RalfJung commented Dec 10, 2018

There don't seem to be any warnings

Actually, it seems there never are warnings for exporting functions.

But this also does not warn (we do need the repr(C) on the newtype though):

use std::ffi::c_void;

#[repr(C)]
pub struct Foo(c_void);

extern "C" {
  fn meh1(m: *mut Foo) -> *mut Foo;
}
@SimonSapin

This comment has been minimized.

Contributor

SimonSapin commented Dec 10, 2018

why you suggested this empty-array pattern

To avoid uninhabited types, it seemed like a zero-size type would be the next closest thing. To avoid some misuses this type should not be trivial to construct, so it should have a private field. () would be the most natural type for that field, but that causes warning: `extern` block uses type `()` which is not FFI-safe: tuples have unspecified layout. An empty array was the only think I could think of that would be both FFI-safe (according to this warning) and zero-size.

c_void has size 1.

@SimonSapin

This comment has been minimized.

Contributor

SimonSapin commented Dec 10, 2018

the only think I could think of

That would fit in a one-liner.

@Gankro

This comment has been minimized.

Contributor

Gankro commented Dec 10, 2018

(gonna assume Simon has this issue covered)

@sdroege

This comment has been minimized.

Contributor

sdroege commented Dec 10, 2018

But this also does not warn (we do need the repr(C) on the newtype though):

Why is the #[repr(C)] needed btw? The repr attribute is about value representation, but for these types we only ever look at pointers/references so how the value is represented should not really matter

An empty array was the only think I could think of that would be both FFI-safe (according to this warning) and zero-size.

c_void has size 1.

Is passing of ZST over FFI well-defined? A pointer to one, sure

Is there any advantage for having a ZST here over c_void, which looks less magic and semantically closer to what one would expect (in the end, in C you'd have a void *). And the comment above the c_void definition about LLVM wanting *mut i8 for void * sounds like it would also be potentially more future-proof as there seem to be some assumptions built-in to LLVM in this area.

@RalfJung

This comment has been minimized.

Member

RalfJung commented Dec 10, 2018

Ah right, the point was to make it a ZST! Thanks for reminding me.

Is there any advantage for having a ZST here over c_void,

Yes: If you ever accidentally dereference the pointer, it's a NOP if it points to a ZST.

@SimonSapin

This comment has been minimized.

Contributor

SimonSapin commented Dec 10, 2018

Yes, that was the point but I’m not sure how important it is. When a type does not implement any trait (in particular not Copy or Clone), the only way to dereference a pointer to it is doing something like ptr::read, ptr::write, mem::swap, etc. So you sort of have to go out of your way, it’s not easy to do accidentally.

@sdroege

This comment has been minimized.

Contributor

sdroege commented Dec 10, 2018

Yes: If you ever accidentally dereference the pointer, it's a NOP if it points to a ZST.

Ok, that makes sense but also doesn't really make a difference in practice compared to reading a byte or does it? Unless it was an dangling pointer but in that case...

So I guess,

  1. How should we move forward with this PR, what would you prefer to have changed if anything at all? I believe the parts about ! and the enum currently having to contain two variants are quite clear?
  2. Is #[repr(C)] needed for such newtype wrappers for opaque pointers if they are only ever used by reference/pointer?
  3. Should a newtype wrapper around c_void be documented as a valid option, mostly for reasons for intuitiveness and looking less magic compared to the empty-array option?
@RalfJung

This comment has been minimized.

Member

RalfJung commented Dec 12, 2018

Ok, that makes sense but also doesn't really make a difference in practice compared to reading a byte or does it? Unless it was an dangling pointer but in that case...

Given that this is a pointer to an opaque type, the opaque type might have size 0, and in that case the read would be a problem.

How should we move forward with this PR, what would you prefer to have changed if anything at all? I believe the parts about ! and the enum currently having to contain two variants are quite clear?

I think this PR is an improvement over the status quo certainly, and as such I'm in favor of landing it. However, it'd be nice to be consistent with the FFI section in the nomicon -- we should recommend the same thing for the same job.

So, I'd prefer if instead of recommending to use this for opaque extern types, it would point to the nomicon.

Is #[repr(C)] needed for such newtype wrappers for opaque pointers if they are only ever used by reference/pointer?

Yes. Struct field ordering otherwise may differ between C and Rust (and even between different Rust versions).

Should a newtype wrapper around c_void be documented as a valid option, mostly for reasons for intuitiveness and looking less magic compared to the empty-array option?

I'd prefer to keep recommending the empty array. Eventually (TM), we'll have extern types and then the answer about what to use is quite clear.

@sdroege

This comment has been minimized.

Contributor

sdroege commented Dec 12, 2018

Thanks for the clarifications :)

Yes. Struct field ordering otherwise may differ between C and Rust (and even between different Rust versions).

But as this is an opaque type we don't even know its fields. That was my point, you only have a pointer of reference but don't know what's behind it anyway.

So, I'd prefer if instead of recommending to use this for opaque extern types, it would point to the nomicon.

OK, I'll update it accordingly later and also add a link to the extern type RFC I guess

Thanks again for all the explanations :)

@sdroege

This comment has been minimized.

Contributor

sdroege commented Dec 12, 2018

Updated

@sdroege sdroege force-pushed the sdroege:c_void-is-not-never branch 2 times, most recently from 392db0f to 7a4d32f Dec 12, 2018

@RalfJung

This comment has been minimized.

Member

RalfJung commented Dec 12, 2018

But as this is an opaque type we don't even know its fields. That was my point, you only have a pointer of reference but don't know what's behind it anyway.

I guess then it strictly speaking doesn't matter. But there also is no reason not to use repr(C), and it is good practice to use that attribute whenever you are making assumptions about how field layout works or using things for FFI.

Show resolved Hide resolved src/libcore/ffi.rs Outdated
@sdroege

This comment has been minimized.

Contributor

sdroege commented Dec 13, 2018

But as this is an opaque type we don't even know its fields. That was my point, you only have a pointer of reference but don't know what's behind it anyway.

I guess then it strictly speaking doesn't matter. But there also is no reason not to use repr(C), and it is good practice to use that attribute whenever you are making assumptions about how field layout works or using things for FFI.

Sure, this was just to complete my understanding. I agree that marking things as #[repr(C)] is also a nice way of marking them as being passed via FFI.

sdroege added some commits Dec 7, 2018

Remove confusing comment about ideally using `!` for `c_void`
Using `!` for `c_void` would have the problem that pointers and
potentially references to an uninhabited type would be created, and at
least for references this is UB.

Also document in addition that newtype wrappers around `c_void` are not
recommended for representing opaque types (as a workaround for `extern
type` not being stable) but instead refer to the Nomicon.
Update code comments of `c_void` to explain the reasoning for its cur…
…rent implementation

We need at least two variants of the enum as otherwise the compiler
complains about the #[repr(u8)] attribute and we also need at least one
variant as otherwise the enum would be uninhabitated and dereferencing
pointers to it would be UB.

As such, mark the variants not unstable because they should not actually
exist but because they are temporary implementation details until
`extern type` is stable and can be used instead.

@sdroege sdroege force-pushed the sdroege:c_void-is-not-never branch from 7a4d32f to 8de8880 Dec 13, 2018

@RalfJung

This comment has been minimized.

Member

RalfJung commented Dec 13, 2018

This patch looks good to me now. @SimonSapin what do you think?

@SimonSapin

This comment has been minimized.

Contributor

SimonSapin commented Dec 13, 2018

The diff says temporary implementation detail about the variants of c_void. How are they temporary? What do you expect to the replace them with?

Note that extern { type c_void; } would be a breaking change because extern types are !Sized, but c_void has size 1. (This matters in particular for <*mut T>::offset.)

@sdroege

This comment has been minimized.

Contributor

sdroege commented Dec 14, 2018

The diff says temporary implementation detail about the variants of c_void. How are they temporary? What do you expect to the replace them with?

The idea was that it could be replaced with extern type, also in the context of the second unresolved question of the extern type stabilization RFC. It seems like llvm bakes in the assumption that void *-like pointers are always represented as int8_t * for various functions, so the same might be necessary for extern type to not break any assumptions inside llvm:

In std's source, it is mentioned that LLVM expects i8* for C's void*.
We'd need to continue to hack this for the two c_voids in std and libc.
But perhaps this should be done across-the-board for all extern types?
Somebody should check what Clang does.

Moving c_void to extern type is also mentioned in the extern type RFC:

C's "pointer void" (not (), but the void used in void* and similar) is currently defined in two official places: std::os::raw::c_void and libc::c_void. Unifying these is out of scope for this RFC, but this feature should be used in their definition instead of the current tricks. Strictly speaking, this is a breaking change, but the std docs explicitly say that void shouldn't be used without indirection. And libc can, in the worst-case, make a breaking change.

And this documentation/comments update here would bring all these things in sync again.

However your point about Sized is a problem indeed and I don't have an answer to that.

@SimonSapin

This comment has been minimized.

Contributor

SimonSapin commented Dec 14, 2018

Moving c_void to extern type is also mentioned in the extern type RFC

I assume we didn’t realize the !Sized problem at the time that was written.

@Gankro

This comment has been minimized.

Contributor

Gankro commented Dec 14, 2018

also extern types have been stalled precisely because how size is supposed to work was punted on, and no one can agree on how to proceed

@sdroege

This comment has been minimized.

Contributor

sdroege commented Dec 15, 2018

I see, I didn't realize that this is still an open issue. So what to do here then? Feel free to simply close the PR if you think there's no value in this before everything else is cleared up elsewhere.

IMHO c_void and extern type should behave exactly the same but obviously that's a problem if the latter is a ZST while the former currently has size 1, so changing c_void would potentially break existing code in non-obvious ways. From my feeling, both should behave in case of pointer/reference to them like a *mut i8 (alignment, LLVM, ...) but have unsized values, but it's apparently not that easy.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment