Clarify `Box<T>` representation and its use in FFI #62514

Open
wants to merge 1 commit into
base: master
from


@stephaneyfx
Contributor

commented Jul 9, 2019

This officializes what was only shown as a code example in the unsafe code guidelines and follows the discussion in the corresponding repository.

It is also related to the issue regarding marking Box<T> #[repr(transparent)].

If the statement this PR adds is incorrect or a more in-depth discussion is warranted, I apologize. Should it be the case, the example in the unsafe code guidelines should be amended and some document should make it clear that it is not sound/supported.

@rust-highfive

Collaborator

commented Jul 9, 2019

r? @dtolnay

(rust_highfive has picked a reviewer for you, use r? to override)

@@ -63,6 +63,27 @@
//! T` obtained from `Box::<T>::into_raw` may be deallocated using the
//! [`Global`] allocator with `Layout::for_value(&*value)`.
//!
//! `Box<SomethingSized>` has the same representation as `*mut SomethingSized`.

@CryZe

CryZe Jul 9, 2019

Contributor

They would always have the same representation, not just if T is Sized.

@stephaneyfx

stephaneyfx Jul 9, 2019

Author Contributor

Right, sorry. I was just so focused on the FFI aspect of it that started this PR in the first place that I phrased that poorly. A fix is on the way. Thank you!
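
The point that the pointer-sized layout is not limited to `T: Sized` can be checked directly. The following is a minimal sketch; the helper name `box_matches_raw_pointer` is made up for illustration, and the check only compares size and alignment, not validity rules.

```rust
use std::mem::{align_of, size_of};

/// Hypothetical helper: true when `Box<T>` and `*mut T` agree in size and alignment.
fn box_matches_raw_pointer<T: ?Sized>() -> bool {
    size_of::<Box<T>>() == size_of::<*mut T>()
        && align_of::<Box<T>>() == align_of::<*mut T>()
}

fn main() {
    // Thin pointer: `T` is `Sized`.
    assert!(box_matches_raw_pointer::<u32>());
    // Fat pointers: a slice or `str` carries a length, a trait object a vtable pointer.
    assert!(box_matches_raw_pointer::<[u8]>());
    assert!(box_matches_raw_pointer::<str>());
    assert!(box_matches_raw_pointer::<dyn std::fmt::Debug>());
}
```

Only the thin-pointer case is meaningful for C FFI, since C has no counterpart to Rust's fat pointers.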

@stephaneyfx stephaneyfx changed the title from "State that `Box<SomethingSized>` and `*mut SomethingSized` have the same representation" to "Clarify `Box<T>` representation and its use in FFI" Jul 9, 2019

@stephaneyfx stephaneyfx force-pushed the stephaneyfx:box-ffi branch from e3671f6 to ecbca77 Jul 9, 2019

@stephaneyfx
Contributor Author

left a comment

I should probably change the parameter type of bar to Option<Box<Foo>> so that the Rust code can handle C code calling this function with a null pointer. With a plain Box<Foo> parameter, a null argument would be UB. I will update the PR tonight.
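
A minimal sketch of that idea, assuming a hypothetical `Foo` with a single field and a hypothetical return convention of `-1` for a null argument (neither is taken from the PR's diff):

```rust
/// Hypothetical payload type standing in for the `Foo` of the PR's example.
pub struct Foo {
    value: i32,
}

/// Hypothetical FFI entry point. On the C side this would be declared roughly as
/// `int32_t bar(struct Foo *foo);`. A real library would also export the symbol
/// under an unmangled name; that attribute is omitted to keep the sketch short.
pub extern "C" fn bar(foo: Option<Box<Foo>>) -> i32 {
    match foo {
        // Ownership arrived from the caller; the box is freed at the end of this arm.
        Some(foo) => foo.value,
        // A null pointer from C shows up as `None` instead of causing UB.
        None => -1,
    }
}

fn main() {
    assert_eq!(bar(Some(Box::new(Foo { value: 42 }))), 42);
    assert_eq!(bar(None), -1);
}
```

The sketch calls `bar` from Rust only; a C caller would pass either a pointer previously handed out by Rust or NULL.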

Review thread on src/liballoc/boxed.rs (resolved, outdated)

@stephaneyfx stephaneyfx force-pushed the stephaneyfx:box-ffi branch from ecbca77 to 318c5d6 Jul 10, 2019

Clarify `Box<T>` representation and its use in FFI
This officializes what was only shown as a code example in [the unsafe code guidelines](https://rust-lang.github.io/unsafe-code-guidelines/layout/function-pointers.html?highlight=box#use) and follows [the discussion](rust-lang/unsafe-code-guidelines#157) in the corresponding repository.

It is also related to [the issue](#52976) regarding marking `Box<T>` `#[repr(transparent)]`.
@dtolnay

Member

commented Jul 12, 2019

I am on board with documenting this as a guarantee. Let's get a set of lang team eyes as well:
r? @joshtriplett

@Alexendoo

Member

commented Jul 24, 2019

Ping from triage @joshtriplett, any updates?

@Centril Centril added the I-nominated label Jul 24, 2019

@gnzlbg

Contributor

commented Jul 24, 2019

This officializes what was only shown as a code example in the unsafe code guidelines and follows the discussion in the corresponding repository.

There is a PR to propose guaranteeing this as part of the UCGs RFC, but that PR has not been merged yet: rust-lang/unsafe-code-guidelines#164

@gnzlbg

Contributor

commented Jul 24, 2019

@stephaneyfx

Contributor Author

commented Jul 24, 2019

rust-lang/unsafe-code-guidelines#164 is more general and does not address Box in particular. Given that the field of Box is private, other fields could theoretically be added (however unlikely that is), so this UCG RFC is not enough for users to rely on the representation of Box being the same as a C pointer. I think that this PR makes a stronger, more straightforward commitment that FFI writers can rely on.

@joshtriplett

Member

commented Jul 25, 2019

Sorry for the delayed review.

I have a concern here regarding semantics, rather than representation.

If you actually pass a Box<T> to an external function, that passes ownership of it: Rust no longer knows about it and won't free it. Does that work for handing responsibility for freeing it over to C? Conversely, if C allocates something and your Rust allocator can free C objects, can you accept a return value of Box<T> to take ownership of that heap object?

(Also, this should not work for any T with a destructor.)

This seems consistent with the language you've used here. I just want to raise the point explicitly and ask if this will actually work in all cases. If it will, then by all means let's specify this, and it seems like a very useful way to specify ownership transfer of a heap object.

@joshtriplett

Member

commented Jul 25, 2019

@rfcbot merge

Based on discussion in the lang team meeting, we're confident that this will work, and does indeed represent ownership transfer of a heap object across an FFI boundary.
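
The ownership round trip can be sketched as follows. Everything here is an assumption made for illustration: the names `foo_new` and `foo_delete`, the `Foo` type, and the drop counter are not part of this thread, and the "C side" is simulated in Rust by holding only a raw pointer.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

/// Counts destructor runs so the sketch can observe when `Foo` is freed.
static DROPS: AtomicUsize = AtomicUsize::new(0);

/// Hypothetical heap object whose ownership crosses the FFI boundary.
pub struct Foo {
    id: u32,
}

impl Drop for Foo {
    fn drop(&mut self) {
        DROPS.fetch_add(1, Ordering::SeqCst);
    }
}

/// Hypothetical constructor: hands ownership of a new `Foo` to the caller.
pub extern "C" fn foo_new(id: u32) -> Box<Foo> {
    Box::new(Foo { id })
}

/// Hypothetical destructor: takes ownership back and frees the object.
pub extern "C" fn foo_delete(foo: Option<Box<Foo>>) {
    drop(foo);
}

fn main() {
    // Simulated C side: after the call it holds nothing but an opaque pointer.
    let raw: *mut Foo = Box::into_raw(foo_new(7));
    assert_eq!(unsafe { (*raw).id }, 7);
    assert_eq!(DROPS.load(Ordering::SeqCst), 0); // Rust has not freed anything yet

    // The pointer comes back; Rust owns the object again and runs its destructor.
    foo_delete(Some(unsafe { Box::from_raw(raw) }));
    assert_eq!(DROPS.load(Ordering::SeqCst), 1);

    // A null pointer is a no-op.
    foo_delete(None);
    assert_eq!(DROPS.load(Ordering::SeqCst), 1);
}
```

In this sketch the memory is allocated and freed by Rust; C only carries the pointer in between. Freeing it with C's `free`, or boxing memory that C allocated, is a separate question that depends on the allocators involved.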

@rfcbot


commented Jul 25, 2019

Team member @joshtriplett has proposed to merge this. The next step is review by the rest of the tagged team members:

No concerns currently listed.

Once a majority of reviewers approve (and at most 2 approvals are outstanding), this will enter its final comment period. If you spot a major issue that hasn't been raised at any point in this process, please speak up!

See this document for info about what commands tagged team members can give me.

@nikomatsakis

Contributor

commented Jul 25, 2019

(Tagging @pnkfelix and @withoutboats as they are on leave.)

@Centril Centril added the relnotes label Jul 25, 2019

@Centril Centril added this to the 1.38 milestone Jul 25, 2019

@Centril Centril modified the milestones: 1.38, 1.39 Aug 13, 2019

@Centril

Member

commented Aug 13, 2019

Odd... I thought I had reviewed this...

@rfcbot reviewed

@rfcbot


commented Aug 13, 2019

🔔 This is now entering its final comment period, as per the review above. 🔔

@@ -63,6 +63,28 @@
//! T` obtained from `Box::<T>::into_raw` may be deallocated using the
//! [`Global`] allocator with `Layout::for_value(&*value)`.
//!
//! `Box<T>` has the same representation as `*mut T`. In particular, when

@RalfJung

RalfJung Aug 14, 2019

Member

Is "the same representation" precise enough? Size and alignment are the same, but the niche is not. Producing a NULL or unaligned Box<T> is insta-UB, unlike for *mut T.

Also see these UCG bikesheds.
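
The niche difference is observable through `Option`. A small sketch, with a made-up helper name `option_is_free`; the result for `*mut u32` reflects how current compilers lay out `Option` around a type without a niche and is an observation rather than a documented guarantee.

```rust
use std::mem::size_of;

/// Hypothetical helper: true when wrapping `T` in `Option` costs no extra space,
/// which means `T` has a niche (an invalid bit pattern) that can encode `None`.
fn option_is_free<T>() -> bool {
    size_of::<Option<T>>() == size_of::<T>()
}

fn main() {
    // Size agrees between the two pointer types.
    assert_eq!(size_of::<Box<u32>>(), size_of::<*mut u32>());

    // `Box<u32>` is never null, so `None` can reuse the null bit pattern.
    assert!(option_is_free::<Box<u32>>());

    // Every bit pattern is a valid `*mut u32`, so `Option` needs a separate tag.
    assert!(!option_is_free::<*mut u32>());
}
```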

@gnzlbg

gnzlbg Aug 15, 2019

Contributor

The size, alignment, niches, and call ABI of Box<T> match that of &mut T.

The validity invariant of Box might match that of &mut T as well, see the UCG issue for the validity of Box.

I think the value representation relation of Box matches that of &mut T as well.

I'm not sure how relevant it is that Box has a pointer field and &mut T has no fields, given that their call ABI is the same and both are Scalar::Pointer, but this can be an aspect of Layout where they differ.

Either way, I'd be more comfortable with relating Box to preferably &mut T or maybe also NonNull<T>? here. E.g. from the point-of-view of performing a transmute, one can transmute an Option<U>::Some to a U if U is a Box<T>, &mut T, NonNull<T>, but this is not the case if U is a *mut T.
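
The transmute argument can be sketched like this. The function name `unwrap_by_transmute` is invented for the example, and the code is meant to illustrate the layout relationship, not to recommend transmuting over `Option::unwrap`.

```rust
use std::mem::transmute;
use std::ptr::NonNull;

/// Hypothetical helper: turns `Some(box)` into the box by reinterpreting the bits.
/// Panics on `None`, because a null `Box` would be undefined behavior.
fn unwrap_by_transmute(value: Option<Box<u32>>) -> Box<u32> {
    assert!(value.is_some());
    // Compiles only because both types have the same size.
    unsafe { transmute::<Option<Box<u32>>, Box<u32>>(value) }
}

fn main() {
    let boxed = unwrap_by_transmute(Some(Box::new(7)));
    assert_eq!(*boxed, 7);

    // The same relationship holds for `NonNull<T>`.
    let mut x = 1u32;
    let some_ptr: Option<NonNull<u32>> = Some(NonNull::from(&mut x));
    let ptr: NonNull<u32> = unsafe { transmute(some_ptr) };
    assert_eq!(unsafe { *ptr.as_ref() }, 1);

    // The equivalent line for `*mut u32` would not compile on current compilers,
    // since `Option<*mut u32>` is larger than `*mut u32`.
}
```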

@gnzlbg

gnzlbg Aug 15, 2019

Contributor

Wait, the whole point of this change is to guarantee that Box can be used in C FFI.

Guaranteeing that Box<T> has the same call ABI as *mut T would be enough to solve that problem. For example:

Box<T: Sized> has the same call ABI as *mut T which allows using Box in C FFI: ...example...

Upholding the validity invariant of Box would then be the responsibility of the FFI wrapper, but the fact that a Box cannot be null, must point to a properly aligned allocation, etc. is already mentioned somewhere else.

@RalfJung

RalfJung Aug 15, 2019

Member

Guaranteeing that Box has the same call ABI as *mut T would be enough to solve that problem.

Well for passing things in-memory it also needs to have the same "memory ABI" if that's a thing.

@RalfJung

RalfJung Aug 15, 2019

Member

Do we have similar wording for references anywhere? We should probably sync that up. But I couldn't find it in the reference type docs?

@gnzlbg

gnzlbg Aug 15, 2019

Contributor

Well for passing things in-memory it also needs to have the same "memory ABI" if that's a thing.

True. At this point we might just want to say that Box<T> has the same ABI as a &mut T (not just call ABI, but niches, and anything else that might matter).

@RalfJung

RalfJung Aug 15, 2019

Member

we might just want to say that Box has the same ABI as a &mut T

Yeah, I like that. Everything should be equal for those two.

Then, can we link to some other document talking about the use of references for FFI?

@gnzlbg

gnzlbg Aug 15, 2019

Contributor

The API docs of references mention, e.g.,

In fact, Option<&T> has the same memory representation as a nullable pointer, and can be passed across FFI boundaries as such.

but that's all I've been able to find written down anywhere :/
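
For comparison, the quoted guarantee for references can be sketched as below. The function `read_or_default` and its fallback value of `0` are hypothetical.

```rust
use std::mem::size_of;

/// Hypothetical FFI entry point. On the C side this would be declared roughly as
/// `int32_t read_or_default(const int32_t *value);`, with NULL allowed.
pub extern "C" fn read_or_default(value: Option<&i32>) -> i32 {
    // NULL from C arrives as `None`.
    value.copied().unwrap_or(0)
}

fn main() {
    // `Option<&i32>` takes exactly the space of a nullable pointer.
    assert_eq!(size_of::<Option<&i32>>(), size_of::<*const i32>());

    let x = 5;
    assert_eq!(read_or_default(Some(&x)), 5);
    assert_eq!(read_or_default(None), 0);
}
```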

@eddyb

eddyb Aug 15, 2019

Member

IMO the only difference between Box<T> and &'static mut T (or more accurately, &'a mut T where T: 'a by virtue of 'a being the shortest lifetime in T, so sort of for<'a> &'a mut T) is the set of operations you can do to it, which is strictly larger for Box<T> (Box::leak effectively letting you get the &mut T, but nothing safe goes in the other direction).

So, yeah, I wouldn't mind guaranteeing that they have identical ABI.
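
The asymmetry described here, safe in one direction and `unsafe` in the other, can be sketched as follows. The function name `leak_then_reclaim` is made up for the example.

```rust
/// Hypothetical helper: leaks a box, mutates through the resulting reference,
/// then rebuilds the box from it.
fn leak_then_reclaim(value: u32) -> Box<u32> {
    // Safe direction: `Box::leak` gives up ownership and yields a `&'static mut`.
    let leaked: &'static mut u32 = Box::leak(Box::new(value));
    *leaked += 1;

    // Reverse direction: no safe API exists. `Box::from_raw` is `unsafe` because
    // the caller must promise the pointer came from a `Box` and is not reused.
    unsafe { Box::from_raw(leaked as *mut u32) }
}

fn main() {
    let reclaimed = leak_then_reclaim(1);
    assert_eq!(*reclaimed, 2);
    // `reclaimed` is dropped here, so nothing stays leaked.
}
```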

@Centril

Centril Aug 15, 2019

Member

@stephaneyfx Want to update the PR based on the conversation ^---? :)
