Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Stabilization] Pin APIs #55766

Closed
withoutboats opened this issue Nov 7, 2018 · 213 comments
Closed

[Stabilization] Pin APIs #55766

withoutboats opened this issue Nov 7, 2018 · 213 comments

Comments

@withoutboats
Copy link
Contributor

@withoutboats withoutboats commented Nov 7, 2018

@rfcbot fcp merge
Feature name: pin
Stabilization target: 1.32.0
Tracking issue: #49150
Related RFCs: rust-lang/rfcs#2349

This is a proposal to stabilize the pin library feature, making the "pinning"
APIs for manipulating pinned memory usable on stable.

(I've tried to write this proposal as a comprehensive "stabilization report.")

Stabilized feature or APIs

[std|core]::pin::Pin

This stabilizes the Pin type in the pin submodule of std/core. Pin is
a fundamental, transparent wrapper around a generic type P, which is intended
to be a pointer type (for example, Pin<&mut T> and Pin<Box<T>> are both
valid, intended constructs). The Pin wrapper modifies the pointer to "pin"
the memory it refers to in place, preventing the user from moving objects out
of that memory.

The usual way to use the Pin type is to construct a pinned variant of some
kind of owning pointer (Box, Rc, etc). The std library owning pointers all
provide a pinned constructor which returns this. Then, to manipulate the
value inside, all of these pointers provide a way to degrade toward Pin<&T>
and Pin<&mut T>. Pinned pointers can deref, giving you back &T, but cannot
safely mutably deref: this is only possible using the unsafe get_mut
function.

As a result, anyone mutating data through a pin will be required to uphold the
invariant that they never move out of that data. This allows other code to
safely assume that the data is never moved, allowing it to contain (for
example) self references.

The Pin type will have these stabilized APIs:

impl<P> Pin<P> where P: Deref, P::Target: Unpin

  • fn new(pointer: P) -> Pin<P>

impl<P> Pin<P> where P: Deref

  • unsafe fn new_unchecked(pointer: P) -> Pin<P>
  • fn as_ref(&self) -> Pin<&P::Target>

impl<P> Pin<P> where P: DerefMut

  • fn as_mut(&mut self) -> Pin<&mut P::Target>
  • fn set(&mut self, P::Target);

impl<'a, T: ?Sized> Pin<&'a T>

  • unsafe fn map_unchecked<U, F: FnOnce(&T) -> &U>(self, f: F) -> Pin<&'a U>
  • fn get_ref(self) -> &'a T

impl<'a, T: ?Sized> Pin<&'a mut T>

  • fn into_ref(self) -> Pin<&'a T>
  • unsafe fn get_unchecked_mut(self) -> &'a mut T
  • unsafe fn map_unchecked_mut<U, F: FnOnce(&mut T) -> &mut U>(self, f: F) -> Pin<&'a mut U>

impl<'a, T: ?Sized> Pin<&'a mut T> where T: Unpin

  • fn get_mut(self) -> &'a mut T

Trait impls

Most of the trait impls on Pin are fairly rote, these two are important to
its operation:

  • impl<P: Deref> Deref for Pin<P> { type Target = P::Target }
  • impl<P: DerefMut> DerefMut for Pin<P> where P::Target: Unpin { }

std::marker::Unpin

Unpin is a safe auto trait which opts out of the guarantees of pinning. If the
target of a pinned pointer implements Unpin, it is safe to mutably
dereference to it. Unpin types do not have any guarantees that they will not
be moved out of a Pin.

This makes it as ergonomic to deal with a pinned reference to something that
does not contain self-references as it would be to deal with a non-pinned
reference. The guarantees of Pin only matter for special case types like
self-referential structures: those types do not implement Unpin, so they have
the restrictions of the Pin type.

Notable implementations of Unpin in std:

  • impl<'a, T: ?Sized> Unpin for &'a T
  • impl<'a, T: ?Sized> Unpin for &'a mut T
  • impl<T: ?Sized> Unpin for Box<T>
  • impl<T: ?Sized> Unpin for Rc<T>
  • impl<T: ?Sized> Unpin for Arc<T>

These codify the notion that pinnedness is not transitive across pointers. That
is, a Pin<&T> only pins the actual memory block represented by T in a
place. Users have occassionally been confused by this and expected that a type
like Pin<&mut Box<T>> pins the data of T in place, but it only pins the
memory the pinned reference actually refers to: in this case, the Box's
representation, which a pointer into the heap.

std::marker::Pinned

The Pinned type is a ZST which does not implement Unpin; it allows you to
supress the auto-implementation of Unpin on stable, where !Unpin impls
would not be stable yet.

Smart pointer constructors

Constructors are added to the std smart pointers to create pinned references:

  • Box::pinned(data: T) -> Pin<Box<T>>
  • Rc::pinned(data: T) -> Pin<Rc<T>>
  • Arc::pinned(data: T) -> Pin<Arc<T>>

Notes on pinning & safety

Over the last 9 months the pinning APIs have gone through several iterations as
we have investigated their expressive power and also the soundness of their
guarantees. I would now say confidently that the pinning APIs stabilized here
are sound and close enough to the local maxima in ergonomics and
expressiveness; that is, ready for stabilization.

One of the trickier issues of pinning is determining when it is safe to perform
a pin projection: that is, to go from a Pin<P<Target = Foo>> to a
Pin<P<Target = Bar>>, where Bar is a field of Foo. Fortunately, we have
been able to codify a set of rules which can help users determine if such a
projection is safe:

  1. It is only safe to pin project if (Foo: Unpin) implies (Bar: Unpin): that
    is, if it is never the case that Foo (the containing type) is Unpin while
    Bar (the projected type) is not Unpin.
  2. It is only safe if Bar is never moved during the destruction of Foo,
    meaning that either Foo has no destructor, or the destructor is carefully
    checked to make sure that it never moves out of the field being projected to.
  3. It is only safe if Foo (the containing type) is not repr(packed),
    because this causes fields to be moved around to realign them.

Additionally, the std APIs provide no safe way to pin objects to the stack.
This is because there is no way to implement that safely using a function API.
However, users can unsafely pin things to the stack by guaranteeing that they
never move the object again after creating the pinned reference.

The pin-utils crate on crates.io contains macros to assist with both stack
pinning and pin projection. The stack pinning macro safely pins objects to the
stack using a trick involving shadowing, whereas a macro for projection exists
which is unsafe, but avoids you having to write the projection boilerplate in
which you could possibly introduce other incorrect unsafe code.

Implementation changes prior to stabilization

  • Export Unpin from the prelude, remove pin::Unpin re-export

As a general rule, we don't re-export things from multiple places in std unless
one is a supermodule of the real definition (e.g. shortening
std::collections::hash_map::HashMap to std::collections::HashMap). For this
reason, the re-export of std::marker::Unpin from std::pin::Unpin is out of
place.

At the same time, other important marker traits like Send and Sync are included
in the prelude. So instead of re-exporting Unpin from the pin module, by
putting in the prelude we make it unnecessary to import std::marker::Unpin,
the same reason it was put into pin.

  • Change associated functions to methods

Currently, a lot of the associated function of Pin do not use method syntax.
In theory, this is to avoid conflicting with derefable inner methods. However,
this rule has not been applied consistently, and in our experience has mostly
just made things more inconvenient. Pinned pointers only implement immutable
deref, not mutable deref or deref by value, limiting the ability to deref
anyway. Moreover, many of these names are fairly unique (e.g. map_unchecked)
and unlikely to conflict.

Instead, we prefer to give the Pin methods their due precedence; users who
need to access an interior method always can using UFCS, just as they would be
required to to access the Pin methods if we did not use method syntax.

  • Rename get_mut_unchecked to get_unchecked_mut

The current ordering is inconsistent with other uses in the standard library.

  • Remove impl<P> Unpin for Pin<P>

This impl is not justified by our standard justification for unpin impls: there is no pointer direction between Pin<P> and P. Its usefulness is covered by the impls for pointers themselves.

This futures impl will need to change to add a P: Unpin bound.

  • Mark Pin as repr(transparent)

Pin should be a transparent wrapper around the pointer inside of it, with the same representation.

Connected features and larger milestones

The pin APIs are important to safely manipulating sections of memory which can
be guaranteed not to be moved out. If the objects in that memory do not
implement Unpin, their address will never change. This is necessary for
creating self-referential generators and asynchronous functions. As a result,
the Pin type appears in the standard library future APIs and will soon
appear in the APIs for generators as well (#55704).

Stabilizing the Pin type and its APIs is a necessary precursor to stabilizing
the Future APIs, which is itself a necessary precursor to stabilizing the
async/await syntax and moving the entire futures 0.3 async IO ecosystem
onto stable Rust.

cc @cramertj @RalfJung

@withoutboats withoutboats added the T-libs label Nov 7, 2018
@withoutboats
Copy link
Contributor Author

@withoutboats withoutboats commented Nov 7, 2018

@rfcbot fcp merge

@rfcbot
Copy link

@rfcbot rfcbot commented Nov 7, 2018

Team member @withoutboats has proposed to merge this. The next step is review by the rest of the tagged team members:

Concerns:

Once a majority of reviewers approve (and at most 2 approvals are outstanding), this will enter its final comment period. If you spot a major issue that hasn't been raised at any point in this process, please speak up!

See this document for info about what commands tagged team members can give me.

@alexcrichton
Copy link
Member

@alexcrichton alexcrichton commented Nov 7, 2018

Thanks for the detailed writeup here @withoutboats! I've also sort of been historically confused by the various guarantees of Pin, and I've currently got a few questions about the APIs being stabilized here wrt the safety guarantees. To help sort this out in my own head though I figured I'd try to write these things down.

In trying to start writing this down though I keep running up against a wall of "what is Unpin?" I'm sort of confused by what that is and the various guarantees around it. Can you say again what it means for T to and also to not implement Unpin? Also, if Unpin is a safe trait to implement it naively seems like it could be used to easily undermine the unsafe guarantees of Pin<T>, but I'm surely missing something

@tinaun
Copy link
Contributor

@tinaun tinaun commented Nov 7, 2018

if i have got this correct, every type that is not self referential (ie: not a generator) is Unpin

@Nemo157
Copy link
Member

@Nemo157 Nemo157 commented Nov 7, 2018

It's not just self-referentiality, there are some other use-cases for stable memory addresses that Pin can also support. They are relatively few and far-between though.

How I understand Unpin being safe to implement is that by implementing it you may violate an invariant required of other unsafe code you have written (crucially, only you may write this unsafe code, no external code can be relying on whether you have implemented Unpin). There is nothing you can do with Pin's safe API that will cause unsoundness whether or not you have implemented Unpin. By opting in to using some of Pin's unsafe API you are guaranteeing that you will only implement Unpin when it is safe to do so. That is covered by point 1 of the "Notes on pinning & safety" section above.

@alexcrichton
Copy link
Member

@alexcrichton alexcrichton commented Nov 7, 2018

Hm I still don't really understand Unpin. I'm at first just trying to understand what it means to implement or not to impelment Unpin.

First off, it's probably helpful to know what types implement Unpin automatically! It's mentioned above that common pointer types (Arc/Rc/Box/references) implement Unpin, but I think that's it? If this is an auto trait, does that mean that a type MyType automatically implements Unpin if it only contains pointers? Or do no other types automatically implement Unpin?

I keep trying to summarize or state what Unpin guarantees and such, but I'm finding it really difficult to do so. Can someone help out by reiterating again what it means to implement Unpin as well as what it means to not implement Unpin?

I think I understand the guarantees of Pin<P> where you can't move out of any of the inline members of P::Target, but is that right?

@withoutboats
Copy link
Contributor Author

@withoutboats withoutboats commented Nov 7, 2018

@alexcrichton Thanks for the questions, I'm sure the pinning APIs can be a bit confusing for people who haven't been part of the group focusing on them.

First off, it's probably helpful to know what types implement Unpin automatically!

Unpin is an auto trait like Send or Sync, so most types implement it automatically. Generators and the types of async functions are !Unpin. Types that could contain a generator or an async function body (in other words, types with type parameters) are also not automatically Unpin unless their type parameters are.

The explicit impls for pointer types is to make the Unpin even if their type parameters are not. The reason for this will hopefully be clearer by the end of this comment.

Can someone help out by reiterating again what it means to implement Unpin as well as what it means to not implement Unpin?

Here is the sort of fundamental idea of the pinning APIs. First, given a pointer type P, Pin<P> acts like P except that unless P::Target implements Unpin, it is unsafe to mutably dereference Pin<P>. Second, there are two basic invariants unsafe code related to Pin has to maintain:

  • If you unsafely get an &mut P::Target from a Pin<P>, you must never move P::Target.
  • If you can construct a Pin<P>, it must be guaranteed that you'll never be able to get an unpinned pointer to the data that pointer points to until the destructor runs.

The implication of all of this is that if you construct a Pin<P>, the value pointed to by P will never move again, which is the guarantee we need for self-referential structs as well as intrusive collections. But you can opt out of this guarantee by just implementing Unpin for your type.

So if you implement Unpin for a type, you're saying that the type opts out of any additional safety guarantees of Pin - its possible to mutably dereference the pointers pointing to it. This means you're saying the type does not need to be immovable to be used safely.

Moving a pointer type like Rc<T> doesn't move T because T is behind a pointer. Similarly, pinning a pointer to an Rc<T> (as in Pin<Box<Rc<T>>) doesn't actually pin T, it only pins that particular pointer. This is why anything which keeps its generics behind a pointer can implement Unpin even when their generics don't.

@withoutboats
Copy link
Contributor Author

@withoutboats withoutboats commented Nov 7, 2018

Also, if Unpin is a safe trait to implement it naively seems like it could be used to easily undermine the unsafe guarantees of Pin, but I'm surely missing something

This was one of the trickiest parts of the pinning API, and we got it wrong at first.

Unpin means "even once something has been put into a pin, it is safe to get a mutable reference to it." There is another trait that exists today that gives you the same access: Drop. So what we figured out was that since Drop is safe, Unpin must also be safe. Does this undermine the entire system? Not quite.

To actually implement a self-referential type would require unsafe code - in practice, the only self-referential types anyone cares about are those that the compiler generates for you: generators and async function state machines. These explicitly say they don't implement Unpin and they don't have a Drop implementation, so you know, for these types, once you have a Pin<&mut T>, they will never actually get a mutable reference, because they're an anonymous type that we know doesn't implement Unpin or Drop.

The problem emerges once you have a struct containing one of these anonymous types, like a future combinator. In order go from a Pin<&mut Fuse<Fut>> to a Pin<&mut Fut>, you have to perform a "pin projection." This is where you can run into trouble: if you pin project to the future field of a combinator, but then implement Drop for the combinator, you can move out of a field that is supposed to be pinned.

For this reason, pin projection is unsafe! In order to perform a pin projection without violating the pinning invariants, you have to guarantee that you never do several things, which I listed in the stabilization proposal.

So, the tl;dr: Drop exists, so Unpin must be safe. But this doesn't ruin the whole thing, it just means that pin projection is unsafe and anyone who wants to pin project needs to uphold a set of invariants.

@Matthias247
Copy link
Contributor

@Matthias247 Matthias247 commented Nov 8, 2018

generators and async function state machines. These explicitly say they don't implement Unpin and they don't have a Drop implementation, so you know, for these types, once you have a Pin<&mut T>, they will never actually get a mutable reference, because they're an anonymous type that we know doesn't implement Unpin or Drop.

Shouldn't an async state machine have a Drop implementation? Things that are on the "stack" of the async function (which probably equals fields in the state machine) need to get destructed when the async function completes or gets cancelled. Or does this happen otherwise?

@SimonSapin
Copy link
Contributor

@SimonSapin SimonSapin commented Nov 8, 2018

I guess what matters in this context is whether an impl Drop for Foo {…} item exists, which would run code with &mut Foo which could use for example mem::replace to "exfiltrate" and move the Foo value.

This is not the same as "drop glue", which can be called through ptr::drop_in_place. Drop glue for a given Foo type will call Drop::drop if it’s implemented, then recursively call the drop glue for each field. But those recursive calls never involve &mut Foo.

@Nemo157
Copy link
Member

@Nemo157 Nemo157 commented Nov 8, 2018

Also, while a generator (and therefore an async state machine) has custom drop glue, it's just to drop the correct set of fields based on the current state, it promises to never move any of the fields during drop.

@withoutboats
Copy link
Contributor Author

@withoutboats withoutboats commented Nov 8, 2018

The terminology I use (though I don't think there's any standard): "drop glue" is the compiler generated recursive walking of fields, calling their destructors; "Drop implementation" is an implementation of the Drop trait, and "destructor" is the combination of the drop glue and the Drop implementation. The drop glue never moves anything around, so we only care about Drop implementations.

@Gankra
Copy link
Contributor

@Gankra Gankra commented Nov 8, 2018

Is someone on the hook to write a nomicon chapter on futures? It seems incredibly necessary given how subtle this is. In particular I think examples, especially examples of buggy/bad implementations and how they're unsound, would be illuminating.

@cramertj
Copy link
Member

@cramertj cramertj commented Nov 8, 2018

@gankro Sure, you can put me down for that.

@Matthias247
Copy link
Contributor

@Matthias247 Matthias247 commented Nov 9, 2018

Thanks for the explanation everyone!

I personally am super new to async Rust and the Pin APIs, but I have played a bit around with it during the last days (rust-lang/futures-rs#1315 - where I tried to exploit pinning for intrusive collections in async code).

During experimentation I got some concerns with these APIs:

  • The concept of pinning seems very hard to fully understand, even if one knows that stable memory addresses are required. Some challenges that I faced:
    • Error messages like:

      within impl core::future::future::Future+futures_core::future::FusedFuture, the trait std::marker::Unpin is not implemented for std::marker::Pinned
      This one sounds pretty recursive and contradicting (why should something that is pinned be not pinned?)

    • How can I access mutable fields in my method that takes a Pin as a parameter. It took a bit of search, learning about new terms (Pin projection), trying to copy the methods from pin-utils, until I came to a solution that uses 2 unsafe methods to do what I needed.
    • How can I actually use my future in an async method and a select case. It turned out I needed to pin_mut!() it to futures stack. Which isn't really a stack, which makes it confusing.
    • Why does my drop() method now again take &mut self, instead of a Pin.
    • Overall it was a bit of a feeling of randomly using unsafe pinning related APIs in the hope of getting things to compile, which definitely shouldn't be the ideal situation. I suspect others might try to do similar things.
  • The concept propagates a lot throughout the codebase, which seems to force lots of people to understand what Pin and Unpin actually are. This seems like a bit a leak of implementation details for some more special use-cases.
  • Pin seems to be somewhat related to lifetimes, but also somewhat orthogonal. Basically a Pin tells us we might see an object again with exactly the same address until drop() is called. Which gives it some kind of virtual lifetime between the current scope and 'static. It seems confusing to have 2 concepts which are related, but still completely different. I wondered whether it can be modelled by something like 'pinned.

While getting things to compile, I often thought about C++, where most of these issues are avoided: Types can be declared unmovable by deleting the move constructor and assignment operator. When a type is not moveable, types which contain that type are also not. Thereby the property and requirements flow natively through the type hierarchy and get checked by the compiler, instead of forwarding the property inside some calls (not all, e.g. not drop()). I think this is a lot easier to understand. But maybe not applicable to Rust, since Futures must currently often be moved before something starts polling them? But on the other hand that might be fixable with a new definition with that trait, or separating movement and polling phase.

Regarding Alex Unpin comment: I'm slowly grasping what Unpin means, but I agree that it's hard to understand. Maybe another name helps, but I can't find a concise one right now. Something along ThingsInsideDontBreakIfObjectGetsMoved, DoesntRequireStableAddress, PinDoesntMatter, etc.

What I haven't yet fully understood is, why getting &mut self from Pin<&mut self> can't be safe for all types. If Pin would just mean the address of the object itself is stable, then this should hold true for most typical Rust types. It seems there is another concern integrated into Pin, which is that also self-referential types inside it don't break. For those types manipulating them after having a Pin is always unsafe. But I think in most cases those will be compiler-generated, and unsafety or even raw pointers there should be fine. For a manually future combinators that need to forward pinned calls to subfields there would obviously need to be unsafe calls to create pins to the fields before polling those. But in that case I see the unsafety more related to the place where it actually matters (is the pin still valid?) than to accessing other fields for which it doesn't matter at all.

Another thought that I had is whether it's possible to get a simpler system if some limitations are applied. E.g. if things that require pinning could only be used inside async methods and not with future combinators. Maybe that could simplify things to one of Pin or Unpin. Considering that for lots of code async/await methods with select/join support might be favorable to combinators, this might not be the hugest loss. But I haven't really put enough thoughts into whether this really helps.

On the positive side: Being able to write async/await code with "stack" borrows is pretty cool and much needed! And ability to use pinning for other use-cases, like intrusive collections, will help performance as well as targets like embedded systems or kernels. So I'm really looking forward to a solution on this.

@glaebhoerl
Copy link
Contributor

@glaebhoerl glaebhoerl commented Nov 9, 2018

Minor nit w.r.t. the stabilization report:

fn get_unchecked_mut(self) -> &'a mut T

I'm guessing this is actually an unsafe fn?

@tikue
Copy link
Contributor

@tikue tikue commented Nov 9, 2018

@Matthias247 you can't safely get &mut T from Pin<P<T>> if T isn't Unpin because then you could mem::swap to move T out of the Pin, which would defeat the purpose of pinning things.

One thing I have trouble explaining to myself is what makes Future fundamentally different from other traits such that Pin needs to be part of its API. I mean, I know intuitively it's because async/await requires pinning, but does that say something specifically about Futures that's different from, say, Iterators?

Could poll take &mut self and only implement Future for Pin<P> types or types that are Unpin?

@Nemo157
Copy link
Member

@Nemo157 Nemo157 commented Nov 9, 2018

How can I access mutable fields in my method that takes a Pin as a parameter.

This actually makes me wonder whether there's a couple of missing methods on Pin,

impl<'a, T: ?Sized> Pin<&'a T> {
    fn map<U: Unpin, F: FnOnce(&T) -> &U>(self, f: F) -> Pin<&'a U>
}

impl<'a, T: ?Sized> Pin<&'a mut T> {
    fn map<U: Unpin, F: FnOnce(&mut T) -> &mut U>(self, f: F) -> Pin<&'a mut U>
}

These could be used in the pin_utils::unsafe_unpinned! macro.

I'm trying to work out why this macro claims to be unsafe. If we are !Unpin and have a Unpin field, why would it be unsafe to project to this field?

The only situation where I can see it being unsound is implementing a custom !Unpin type, taking a raw pointer to an Unpin field of self (and relying on it having a stable address/pointing to the same instance), then acquiring an &mut to the same field and passing it to an external function. This seems to fall under the same reasoning of why implementing Unpin is safe though, by taking a raw pointer to a field of a !Unpin you are opting out of being able to call some of the safe APIs.

To make the previous situation safer there could be a wrapper built on Pinned for marking which normally Unpin fields of a struct are actually !Unpin instead of just adding a Pinned to the struct as a whole:

pub struct MustPin<T: Unpin>(T, Pinned);

impl<T: Unpin> MustPin<T> {
    pub const fn new(t: T) -> Self { ... }
    pub fn get(self: Pin<&Self>) -> *const T { ... }
    pub fn get_mut(self: Pin<&mut Self>) -> *mut T { ... }
}

This all seems backwards compatible with the current API, it could remove some of the unsafety from futures-rs combinators (e.g. it would give safe access to extra fields like this one), but isn't necessary for the current usecases. We could probably experiment with some APIs like this for implementing custom !Unpin types (like intrusive collections) and add them later on.

@withoutboats
Copy link
Contributor Author

@withoutboats withoutboats commented Nov 9, 2018

@Nemo157 Those map functions are unsafe because I could mem::swap on the &mut T the function passes me before returning an &mut U. (well the mutable one is, the immutable one might not be unsafe)

EDIT: Also the pin-utils macro is different, unsafe_unpinned doesnt have to do with the target type being Unpin, its just an "unpinned projection" - a projection to a &mut Field. Its safe to use as long as you never pin project to that field, even if the field is !Unpin.

@withoutboats
Copy link
Contributor Author

@withoutboats withoutboats commented Nov 9, 2018

One thing I have trouble explaining to myself is what makes Future fundamentally different from other traits such that Pin needs to be part of its API. I mean, I know intuitively it's because async/await requires pinning, but does that say something specifically about Futures that's different from, say, Iterators?

There's no theoretical difference, but there are pragmatic ones. For one, Iterator is stable. But as a result, unless we figure out something very clever, you'll never be able to run a for loop over a self-referential generator without a double indirection, even though that should be completely safe without it (because the for loop will consume and never move the generator).

Another important pragmatic difference is that the code patterns between Iterator and Future are quite different. You can't go 10 minutes without wanting to borrow across an await point in a future, which is why on futures 0.1 you see these Arcs appearing all over the place so you can use the same variable in two different and_then calls. But we've gotten quite far without being able to express self-referential iterators at all, it's just not that important of a use case.

@Nemo157
Copy link
Member

@Nemo157 Nemo157 commented Nov 9, 2018

Those map functions are unsafe because I could mem::swap on the &mut T the function passes me before returning an &mut U

Ah, damn, forgot about that part of it 😦

@RalfJung
Copy link
Member

@RalfJung RalfJung commented Nov 9, 2018

fn as_mut(&mut self) -> Pin<&mut P::Target> fn into_ref(self) -> Pin<&'a T>`

Should these be called as_pinned_mut and into_pinned_ref, to avoid confusing them with methods that actually return references?

Notable implementations of Unpin in std:

We also have impl<P> Unpin for Pin<P> {}. Should this maybe become impl<P: Unpin> Unpin for Pin<P> {}? For the types we use this has no effect and it seems a bit safer? After all, Pin itself does not add a ptr indirection, so the argument about pinning not being structural does not apply.
Never mind, you have that on your list. ;)

Drop guarantee

I think we have to codify the Drop guarantee before stabilization, or else it will be too late:

It is illegal to invalidate the storage of a pinned object (the target of a Pin reference) without calling drop on the object.

"Invalidate" here can mean "deallocation", but also "repurposing": When changing x from Ok(foo) to Err(bar), the storage of foo gets invalidated.

This has the consequence, for example, that it becomes illegal to deallocate a Pin<Box<T>>, Pin<Rc<T>> or Pin<Arc<T>> without first calling drop::<T>.

Repurposing Deref, DerefMut

I still feel slightly uneasy about how this repurposes the Deref trait to mean "this is a smart pointer". We use Deref for other things as well, e.g. for "inheritance". That might be an anti-pattern, but it is still in fairly common use and, frankly, it is useful. :D

I think this is not a soundness issue though.

Understanding Unpin

Shameless plug: I wrote two blog posts on this, that might help or not depending on how much you enjoy reading (semi-)formal syntax. ;) The APIs have changed since then, but the basic idea has not.

Safe pin projections

@Matthias247 I think one problem you are running into is that building abstractions that involve pinning currently almost always requires unsafe code. Using those abstractions is fine, but e.g. defining a future combinator in safe code won't work. The reason for this is that "pin projections" have constraints that we could only safely check with compiler changes. For example, you ask

Why does my drop() method now again take &mut self, instead of a Pin.

Well, drop() is old -- it exists since Rust 1.0 -- so we cannot change it. We'd love to just make it take Pin<&mut Self>, and then Unpin types could get their &mut like they do now, but that's a non-backwards-compatible change.

To make writing future combinators safe, we'd need safe pin projections, and for that we have to change the compiler, this cannot be done in a library. We could almost do it in a derive proc_macro, except that we'd need a way to assert "this type does not have any implementation of Drop or Unpin".

I think it'd be worth the effort to figure out how to get such a safe API for pin projections. However, that does not have to block stabilization: The API we stabilize here should be forward compatible with safe pin projections. (Unpin might have to become a lang item to implement the above assertion, but that doesn't seem too bad.)

Pinning is not !Move

I often thought about C++, where most of these issues are avoided: Types can be declared unmovable by deleting the move constructor and assignment operator. When a type is not moveable, types which contain that type are also not.

There were several attempts at defining unmovable types for Rust. It gets very complicated very quickly.

One important thing to understand is that pinning does not provide unmovable types! All types remain movable in Rust. If you have a T, you can move it anywhere you want. Instead of unmovable types, we make use of Rust's ability to define new encapsulated APIs, as we define a pointer type from which you cannot move out. All the magic is in the pointer type (Pin<&mut T>), not in the pointee (T). There is no way for a type to say "I cannot get moved ever". However, there is a way for a type to say "If I got pinned, don't move me again". So a MyFuture that I own will always be movable, but Pin<&mut MyFuture> is a pointer to an instance of MyFuture that cannot get moved any more.

This is a subtle point, and we probably should spend some time fleshing out the docs here to help avoid this very common misunderstanding.

@SimonSapin
Copy link
Contributor

@SimonSapin SimonSapin commented Nov 9, 2018

But we've gotten quite far without being able to express self-referential iterators at all, it's just not that important of a use case.

This is because, so far, all iterators are defined using a type and an impl Iterator for … block. When state needs to be kept between two iterations, there is no choice but to stash it in fields of the iterator type and work with &mut self.

However, even if this is not the primary motivation for including generators in the language, it would be very nice to eventually be able to use generator yield syntax to define something that can be used with a for loop. As soon as that exists, borrowing across yield because just as important for generator-iterators as it is for generator-futures.

@Nemo157
Copy link
Member

@Nemo157 Nemo157 commented Nov 9, 2018

(Unpin might have to become a lang item to implement the above assertion, but that doesn't seem too bad.)

Unpin and Pin already have to be lang items to support safe immovable generators.

@alexcrichton
Copy link
Member

@alexcrichton alexcrichton commented Nov 9, 2018

Ok thanks for the explanations all! I agree with @gankro that a nomicon-style chapter about Pin and such would be massively useful here. I suspect that there's a lot of development history for why various safe methods exist and why unsafe methods are unsafe and such.

In an effort to help myself understand this though I wanted to try again to write down why each function is safe or why it's unsafe. With the understanding from above I've got all the following, but there's a few questions here and there. If others can help me fill this in that'd be great! (or help point out where my thinking is wrong)

  • fn new(P) -> Pin<P> where P: Deref, P::Target: Unpin

    • This is a safe method to construct Pin<P>, and while the existence of
      Pin<P> normally means that P::Target will never be moved again in the
      program, P::Target implements Unpin which is read "the guarantees of
      Pin no longer hold". As a result, the safety here is because there's no
      guarantees to uphold, so everything can proceed as usual.
  • unsafe fn new_unchecked(P) -> Pin<P> where P: Deref

    • Unlike the previous, this function is unsafe because P doesn't
      necessarily implement Unpin, so Pin<P> must uphold the guarantee that
      P::Target will never move ever again after Pin<P> is created.

      A trivial way to violate this guarantee, if this function is safe, looks
      like:

      fn foo<T>(mut a: T, b: T) {
          Pin::new_unchecked(&mut a); // should mean `a` can never move again
          let a2 = mem::replace(&mut a, b);
          // the address of `a` changed to `a2`'s stack slot
      }
      

      Consequently it's up to the user to guarantee that Pin<P> indeed means
      P::Target is never moved again after construction, so it's unsafe!

  • fn as_ref(&Pin<P>) -> Pin<&P::Target> where P: Deref

    • Given Pin<P> we have a guarantee that P::Target will never move. That's
      part of the contract of Pin. As a result it trivially means that
      &P::Target, another "smart pointer" to P::Target will provide the same
      guarantee, so &Pin<P> can be safely translate to Pin<&P::Target>.

      This is a generic method to go from Pin<SmartPointer<T>> to Pin<&T>

  • fn as_mut(&mut Pin<P>) -> Pin<&mut P::Target> where P: DerefMut

    • I believe the safety here is all roughly the same as the as_ref above.
      We're not handing out mutable access, just a Pin, so nothing can be easily
      violated yet.

      Question: what about a "malicious" DerefMut impl? This is a safey way
      to call a user-provided DerefMut which crates &mut P::Target natively,
      presumably allowing it to modify it as well. How's this safe?

  • fn set(&mut Pin<P>, P::Target); where P: DerefMut

    • Question: Given Pin<P> (and the fact that we don't know anything about
      Unpin), shouldn't this guarantee that P::Target never moves? If we can
      reinitialize it with another P::Target, though, shouldn't this be unsafe?
      Or is this somehow related to destructors and all that?
  • unsafe fn map_unchecked<U, FnOnce(&T) -> &U>(Pin<&'a T>, f: F) -> Pin<&'a U>

    • This is an unsafe function so the main question here is "why isn't this
      safe"? An example of violating guarantees if this were safe looks like:

      ...

      Question:: what's the counter-example here? If this were safe, what's
      the example that shows violating the guarantees of Pin?

  • fn get_ref(Pin<&'a T>) -> &'a T

    • The guarantee of Pin<&T> means that T will never move. Returning &T
      doesn't allow mutation of T, so it should be safe to do while upholding
      this guarantee.

      One "maybe gotcha" here is interior mutability, what if T were
      RefCell<MyType>? This, however, doesn't violate the guarantees of
      Pin<&T> because the guarantee only applies to T as a whole, not the
      interior field MyType. While interior mutability can move around
      internals, it still fundamentally can't move the entire structure behind a
      & reference.

  • fn into_ref(Pin<&'a mut T>) -> Pin<&'a T>

    • Pin<&mut T> means that T will never move. As a result it means Pin<&T>
      provides the same guarantee. Shouldn't be much issue with this conversion,
      it's mostly switching types.
  • unsafe fn get_unchecked_mut(Pin<&'a mut T>) -> &'a mut T

    • Pin<&mut T> means that T must never move, so this is trivially unsafe
      because you can use mem::replace on the result to move T (safely). The
      unsafe here "although I am giving you &mut T, you're not allowed to ever
      move T".
  • unsafe fn map_unchecked_mut<U, F: FnOnce(&mut T) -> &mut U>(Pin<&'a mut T>, f: F) -> Pin<&'a mut U>

    • I think the unsafe here is basically at least the same as above, we're
      handing out &mut T safely (can't require an unsafe closure) which could
      easily be used with mem::replace. There's probably other unsafety here too
      with the projection, but it seems reasonable that it's at least unsafe
      because of that.
  • fn get_mut(Pin<&'a mut T>) -> &'a mut T where T: Unpin

    • By implementing Unpin a type says "Pin<&mut T> has no guarantees, it's
      just a newtype wrapper of &mut T". As a result, with no guarantees to
      uphold, we can return &mut T safely
  • impl<P: Deref> Deref for Pin<P> { type Target = P::Target }

    • This can be safely implemented with as_ref followed by get_ref, so the
      safety of this impl follows from the above.
  • impl<P: DerefMut> DerefMut for Pin<P> where T::Target: Unpin { }

    • This can be safely implemented with as_mut followed by get_mut, so the
      safety of this impl follows from the above.
  • impl<T: ?Sized> Unpin for Box<T> (and other pointer-related implementations)

    • If no other action were taken, Box<T> would implement the Unpin trait
      only if T implemented Unpin. This implementation here codifies that even
      if T explicitly doesn't implement Unpin, Box<T> implements Unpin.

      Question: What's an example for what this is fundamentally empowering?
      For example, what safe could would otherwise require unsafe if this impl
      did not exist.

@tikue
Copy link
Contributor

@tikue tikue commented Dec 10, 2018

Is it safe to create a Pin<P<T>> with a T: !Unpin and then immediately unpin the value, if it's guaranteed that nothing observed the pinning? Is it only undefined behavior if the pinned value is moved after it has had a poll-like method called on it?

@cramertj
Copy link
Member

@cramertj cramertj commented Dec 10, 2018

@HeroicKatora Unfortunately, it's possible today to implement Drop for a type that could be !Unpin due to generics, e.g.:

struct S<T>(T); // `!Unpin` if `T: !Unpin`
impl<T> Drop for S<T> { ... }
@HeroicKatora
Copy link
Contributor

@HeroicKatora HeroicKatora commented Dec 10, 2018

@cramertj Thank you, just realized that as well.

@HeroicKatora
Copy link
Contributor

@HeroicKatora HeroicKatora commented Dec 10, 2018

@cramertj So we default to T: ?Unpin on generic bounds? Is there another reason behind not defaulting to T: Unpin for existing types like Sized? It would give a few instances where that would be annoyingly strict but can that additional auto bound cause regressions?

@cramertj
Copy link
Member

@cramertj cramertj commented Dec 10, 2018

@HeroicKatora That would require sprinkling ?Unpin bounds on every type in the standard library and in a bunch of code that doesn't care about Unpin at all.

@withoutboats
Copy link
Contributor Author

@withoutboats withoutboats commented Dec 10, 2018

It's also safe to add a PhantomPinned field, as another way of creating a Drop + !Unpin type.

Is there another reason behind not defaulting to T: Unpin for existing types?

Unpin is just an auto trait like Send and Sync, this proposal involves no new language features. So it has the same semantics as auto traits are already defined to have, which is that they are not applied to generics by default (unlike Sized, which is a built in trait to the compiler, not an auto trait).

Is it only undefined behavior if the pinned value is moved after it has had a poll-like method called on it?

We should be clear here that undefined behavior in this context is not really the traditional sort of UB. Rather, its that code, observing a type in a Pin, can make assumptions about the ownership semantics of that value (namely, that if it does not implement Unpin it will not be moved again or invalidated until after its destructor has run). It's not UB in the sense that it is assumed by the compiler never to happen (the compiler doesnt know what a Pin is).

So yes, in the sense that you can verify that no code is relying on the guarantee you have provided. But also what you have done is certainly pointless, since no code could observe that the type was pinned.

@HeroicKatora
Copy link
Contributor

@HeroicKatora HeroicKatora commented Dec 10, 2018

@cramertj I see, that would be a bit unfortunate. I'm a bit uncomfortable with Pin interaction with Drop in the specification but not in the language. Quite conflicting between interface correctness, usability and releasing it in the near future. Can I get clarification on Edit 2 above:

Rc allows one to observe Pin<&T> but for T: Unpin still not have drop called on the original memory location. Keep a reference alive outside the pin, then after the pin was dropped you can move out with Rc::try_unwrap. https://play.rust-lang.org/?version=nightly&mode=debug&edition=2018&gist=dec6f6c6d2c0903d87a4a9cefe50a0ca

@withoutboats
Copy link
Contributor Author

@withoutboats withoutboats commented Dec 10, 2018

@HeroicKatora the problem with your code is that Mem<'a>: Unpin, but the line of unsafe code you use relies on assumptions that only apply to types that don't implement Unpin. I've edited your gist to include proof that Mem<'a>: Unpin: https://play.rust-lang.org/?version=nightly&mode=debug&edition=2018&gist=201d1ae382f590be8c5cac13af279aff

You rely on the Unpin bound when you call Pin::new, which can only be called on pointers whose target implements Unpin.

The loophole you thought you found has been considered, which is why there is no way to go from an Rc<T: ?Unpin> to a Pin<Rc<T>>. You have to construct the Rc in the pin with Rc::pinned.

@HeroicKatora
Copy link
Contributor

@HeroicKatora HeroicKatora commented Dec 10, 2018

@withoutboats Just wanted to get that confirmed, that T: Unpin indeed also opts-out of the drop call before invalidation. That begs the question, should there not also be

fn into_inner(Pin<P>) -> P where P: Deref, P::Target: Unpin;

since it does not protect any part of the interface, and unwrapping a Pin<Box<T>> into Box<T> is potentially useful (for T: Unpin of course).

@withoutboats
Copy link
Contributor Author

@withoutboats withoutboats commented Dec 10, 2018

@HeroicKatora you're correct that that function would be safe. I don't mind adding it but I'd like to avoid expanding the API we're stabilizing in this moment, since this thread is already hundreds of comments long. We can always add it later as the need arises just as we do with all std APIs.

@HeroicKatora
Copy link
Contributor

@HeroicKatora HeroicKatora commented Dec 10, 2018

I would just say that both naming and functionality of Unpin makes a lot more perceived sense with that converse method. And knowing that the Pin guarantees are indeed restricted to T: !Unpin. That would also be directly demonstrated by the existence of such a method, instead of constructing escape hatches to show the possibility 😄 . And its documentation would be a perfect place to explain (or link once more to) the restrictions. (One might even consider naming it unpin instead of into_inner).

Edit: It would mostly just generalize what's already there.

impl<P> Pin<P> where P: Deref, P::Target: Unpin

  • fn unpin(self) -> P

has the instantiation for P=&'a mut T, that is equivalent to the proposed

impl<'a, T: ?Sized> Pin<&'a mut T> where T: Unpin

  • fn get_mut(self) -> &'a mut T
bors added a commit that referenced this issue Dec 12, 2018
Expand std::pin module docs and rename std::pin::Pinned to PhantomPinned

cc #49150, #55766

r? @withoutboats
Mark-Simulacrum added a commit to Mark-Simulacrum/rust that referenced this issue Dec 21, 2018
…richton

Pin stabilization

This implements the changes suggested in rust-lang#55766 (comment) and stabilizes the `pin` feature. @alexcrichton also listed several "blockers" in that issue, but then in [this comment](rust-lang#55766 (comment)) mentioned that they're more "TODO items":
>  In that vein I think it's fine for a stabilization PR to be posted at any time now with FCP lapsed for a week or so now. The final points about self/pin/pinned can be briefly discussed there (if even necessary, they could be left as the proposal above).

Let's settle these last bits here and get this thing stabilized! :)

r? @alexcrichton
cc @withoutboats
Centril added a commit to Centril/rust that referenced this issue Dec 22, 2018
…richton

Pin stabilization

This implements the changes suggested in rust-lang#55766 (comment) and stabilizes the `pin` feature. @alexcrichton also listed several "blockers" in that issue, but then in [this comment](rust-lang#55766 (comment)) mentioned that they're more "TODO items":
>  In that vein I think it's fine for a stabilization PR to be posted at any time now with FCP lapsed for a week or so now. The final points about self/pin/pinned can be briefly discussed there (if even necessary, they could be left as the proposal above).

Let's settle these last bits here and get this thing stabilized! :)

r? @alexcrichton
cc @withoutboats
Centril added a commit to Centril/rust that referenced this issue Dec 23, 2018
…richton

Pin stabilization

This implements the changes suggested in rust-lang#55766 (comment) and stabilizes the `pin` feature. @alexcrichton also listed several "blockers" in that issue, but then in [this comment](rust-lang#55766 (comment)) mentioned that they're more "TODO items":
>  In that vein I think it's fine for a stabilization PR to be posted at any time now with FCP lapsed for a week or so now. The final points about self/pin/pinned can be briefly discussed there (if even necessary, they could be left as the proposal above).

Let's settle these last bits here and get this thing stabilized! :)

r? @alexcrichton
cc @withoutboats
@Centril
Copy link
Contributor

@Centril Centril commented Jan 29, 2019

Now that the stabilization has gone through, the point of the PFCP seems moot, so therefore:

@rfcbot cancel

@Centril Centril closed this Jan 29, 2019
@Centril Centril reopened this Jan 29, 2019
@RalfJung
Copy link
Member

@RalfJung RalfJung commented Jan 29, 2019

Is there any tracking to make sure the comments developed starting at #55766 (comment) get turned into docs? Seems we are already too late for the release as beta got branched off... even though that was explicitly called out here to happen before stabilization :/

@RalfJung
Copy link
Member

@RalfJung RalfJung commented Apr 14, 2019

Is there any reason that this is still open? @Centril you closed and reopened this, was that deliberate?

@Centril
Copy link
Contributor

@Centril Centril commented Apr 14, 2019

@RalfJung I tried to cancel the FCP but it didn't listen; @alexcrichton can you cancel the FCP?

@alexcrichton
Copy link
Member

@alexcrichton alexcrichton commented Apr 15, 2019

This is stable, so closing.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Linked pull requests

Successfully merging a pull request may close this issue.

None yet