-
Notifications
You must be signed in to change notification settings - Fork 1.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
RFC for DerefMove #2439
RFC for DerefMove #2439
Changes from all commits
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,325 @@ | ||
- Feature Name: `deref_move` | ||
- Start Date: 2018-05-05 | ||
- RFC PR: (leave this empty) | ||
- Rust Issue: (leave this empty) | ||
|
||
# Summary | ||
[summary]: #summary | ||
|
||
Add a new `DerefMove` trait that allows consuming a smart pointer to move its | ||
contents, as in `let x = *p;`. | ||
|
||
For more background and discussion, see #178 and #997. | ||
|
||
# Motivation | ||
[motivation]: #motivation | ||
|
||
Currently, two smart pointer traits provide access to values contained in smart | ||
pointers: `Deref` and `DerefMut`. These traits allow overloading of the | ||
dereferencing (unary `*`) operator, and by extension participate in autoderef in | ||
a number of places, such as method calls and index expressions. | ||
|
||
These two traits, however, only dereference references to produce references to | ||
the contained value. They do not offer the option to dereference by value, | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This sentence is a mouthful. Perhaps:
|
||
consuming the smart pointer to produce the owned value inside. As a special | ||
case, `Box<T>` has compiler support allowing it to be dereferenced (explicitly | ||
or implicitly) to produce the inner `T`. Because this is special-cased in the | ||
compiler, this functionality is not available for other smart pointer types. | ||
|
||
Other smart pointers may support this consuming operation, but it must be done | ||
explicitly due to the lack of language support. For instance, in the stdlib, the | ||
following methods both consume a smart pointer to produce the inner value: | ||
|
||
* `ManuallyDrop<T>::into_inner()` | ||
* `binary_heap::PeakMut<'a, T>::pop()` | ||
* `Cow<'a, B>::into_owned()` | ||
|
||
I have not made any attempt to survey into the larger ecosystem for more | ||
examples, but there are probably many more. | ||
|
||
Note that removing `FnBox` is explicitly *not* part of the motivation for this | ||
RFC. It has been suggested in a few places, such as in discussion of | ||
rust-lang/rust#28796, that `DerefMove` would allow `Box<FnOnce()>` to be called | ||
without needing the `FnBox` workaround. This is not the case, however: the inner | ||
value is dynamically-sized and thus cannot be moved onto the stack. Indeed, one | ||
can call a `Box<T>` where `T` is a concrete type implementing `FnOnce()`. | ||
`DerefMove` would allow `Box<FnOnce()>` to continue not working, just without a | ||
special case in the compiler to allow it to do so. | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. These last two sentences are jumbled up, and I'm not actually 100% clear what they're trying to express. Maybe just:
And it may be worth noting that if by-value DSTs ever get implemented (which is theoretically possible and has long been desireable), then the combination of that feature with DerefMove would in fact be sufficient to make There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. By-value DSTs are sufficient and necessary for I'll try to clarify this. |
||
|
||
This will not remove all of `Box`'s special cases from the compiler, but | ||
represents progress towards that goal. | ||
|
||
# Guide-level explanation | ||
[guide-level-explanation]: #guide-level-explanation | ||
|
||
The following trait is defined in `std::ops`: | ||
|
||
```rust | ||
trait DerefMove : DerefMut { | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. It's not obvious to me why this should require There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Yes, I agree. Actually, the There is an implication the other way ( trait DerefMove {
type Target;
// ...
}
impl<T> DerefMove for T where T: Deref, <T as Deref>::Target: Copy {
// We can avoid consuming ourselves *and* fork over the deref target.
type Target = (T, <T as Deref>::Target);
// [insert copying implementation using `Deref` on `self`]
} Actually using this implementation would be irritating, but it gets the idea across (namely, how the There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. The rationale is below, but I had not considered this use case. The ergonomic drawbacks may be worth it if this use case is worth supporting. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. The behaviour of ignoring This would also complicate autoderef and in particular its relationship with overload resolution, probably unbearably so. I think it would be so hard to use a type with different There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Hmm, I suppose something like That would need that "split a trait into two" RFC, I think, to introduce There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Small nit: this trait requires a There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. @scottmcm The There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. @pythonesque Good point! I forget that pointers get to cheat 🙂 |
||
fn deref_move(self) -> Self::Target; | ||
} | ||
``` | ||
|
||
This trait allows moving out of the result of a derefence. This can happen | ||
either explicitly, as in `let x = *p;`, or implicitly, as in `p.into_foo()`. | ||
`Box<T>` from the standard library implements `DerefMove`, allowing you to move | ||
out of it. | ||
|
||
```rust | ||
let b = Box::new("s".to_owned()); | ||
let s = *b; | ||
// ERROR: b has already been moved on the previous line. | ||
// let t = *b; | ||
``` | ||
|
||
If the `Target` type is `Copy`, then `DerefMove` will not be used, and instead | ||
`Deref` will be used and the resulting referenced value will be copied: | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I think understand why this is necessary after staring at it for a while (since otherwise existing code that expects to be able to copy out of There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This wouldn't be sufficient: fn foo(b: Box<(String, i32)>) {
let r = (*b).1; // copy!
drop(b);
} |
||
|
||
```rust | ||
let b = Box::new(0i32); | ||
// Since i32 is Copy, this is equivalent to i = b.deref(). | ||
let i = *b; | ||
// That makes this legal, since b was not moved. | ||
let j = *b; | ||
``` | ||
|
||
# Reference-level explanation | ||
[reference-level-explanation]: #reference-level-explanation | ||
|
||
`DerefMove` is a lang item that, like `Deref` and `DerefMut`, plays a role in | ||
dereferencing, both explicitly with the unary `*` operator and implicitly with | ||
autoderef. | ||
|
||
`deref_move` is only called in contexts where a value is needed to be moved out. | ||
This can occur either because an explicit dereference is used in a context where | ||
a value is to be moved, or because autoderef selects a method taking `self` (or | ||
`Box<self>`) as a parameter. In both cases, when evaluating whether to call | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I forget, is There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This is sort of getting fixed with arbitrary self types, right? Or is this specifically about the There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Yes, once arbitrary receiver types are accepted then |
||
`deref_move`, the compiler will look at whether `Target` implements `Copy`. If | ||
it does not, then `deref_move` is called to move the target out of the smart | ||
pointer. If it does, then instead the compiler will only call `deref` and then | ||
copy out of the returned reference. For generic types, the compiler only checks | ||
if a `Copy` bound is available at the use site; it is not done during | ||
monomorphization. | ||
|
||
For explicit dereference, the changes are straightforward. Any explicit | ||
dereference of a `DerefMove` type becomes a movable expression, just as | ||
dereference of `Box` is today. | ||
|
||
For autodereference: | ||
|
||
* In a function call expression, the callee operand may be dereferenced and | ||
moved if the `Target` implements `FnOnce` and no reference-based call trait. | ||
* In a method call expression, the receiver operand may be dereferenced and | ||
moved if the function selected by overload resolution takes `self` or | ||
`Box<self>` as a parameter (in the latter case, this would require a smart | ||
pointer containing a `Box<Self>`, of course). | ||
* In a field access expression, the left-hand operand may be dereferenced and | ||
moved if the result of the expression is moved. In this case, the move is | ||
always entire; partial move is not supported when a dereference is necessary. | ||
* An index expression can never move its left operand, so it is unaffected. | ||
* A borrow expression can never move its operand, so it is unaffected. | ||
|
||
No change is made to overload resolution of method calls. This means that calls | ||
which are invalid today due to preferring `T` over `&T` may now be legal, | ||
selecting to move out of the smart pointer rather than giving an error. | ||
|
||
There is an ambiguity in a method call expression when `Target = &mut T` and | ||
`&mut T` has a method taking `self` by move. In this case, the compiler could | ||
either call `deref_mut` to get a `&mut &mut T`, then derefence to get the `&mut | ||
T` to pass into the method, or it could call `deref_move` to get a `&mut T` | ||
directly. The former is chosen because it avoids moving the pointer. The same | ||
ambiguity arises with `&T`, but is already accounted for by the `Copy` rule | ||
since shared references are `Copy`. | ||
|
||
# Drawbacks | ||
[drawbacks]: #drawbacks | ||
|
||
## Nobody understands autoderef already and this makes it worse | ||
|
||
Or at least, it would if `Box` wasn't already complicating it in the same way. | ||
|
||
## Copy behaviour is unintuitive in generic cases | ||
|
||
If a type has nontrivial/side-effectful `deref_move`, it may be slightly | ||
confusing why it is not used for `Copy` types in non-generic code, but is used | ||
in generic code. | ||
|
||
## Breaking change to an edge case of `Box<&mut T>` | ||
|
||
Currently, a `Box<&mut T>` variable [can be | ||
dereferenced](https://play.rust-lang.org/?gist=5b3f046ecf29da8f00fde3bd1d2a3df4&version=nightly&mode=debug) | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I would prefer if you inlined this example, so that these documents are less prone to bitrot. |
||
in at least some cases into a `&mut T` without either requiring that the | ||
variable be mutable (as would be the case for any other `DerefMut` type), or | ||
moving out of the `Box`. This behaviour would not be maintained if its magical | ||
dereferencing were replaced with `DerefMove`, as `DerefMut` requires a mutable | ||
variable. | ||
|
||
This is an edge case, as it requires `Box<&mut T>`, which is an unlikely type, | ||
and in most cases it can be trivially fixed by making the variable mutable. It's | ||
possible there are more complex cases (such as if the `Box` itself is stored in | ||
another smart pointer) where this is not easily fixed, but I would be | ||
unsurprised if there were few, if any, instances of this in extant code. | ||
Moreover, the behaviour is arguably a bug because it is taking a mutable borrow | ||
of an immutable variable (and one can modify the above example to get an error | ||
message implying exactly that), though the semantics are more like the unique | ||
immutable borrows used by closures. | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Agreed that this feels like a bugfix. |
||
|
||
As a result, I propose that this breaking change be accepted after verifying its | ||
effect on Cargo packages. | ||
|
||
# Rationale and alternatives | ||
[alternatives]: #alternatives | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. A more comprehensive solution like #1646 should be mentioned as an alternative. |
||
|
||
## Copy handling | ||
|
||
The handling of copies is the real complexity of `DerefMove`. Currently, one can | ||
copy out of a `Box` without consuming it, so this desirable behaviour needs to | ||
be preserved. The proposal is to do so by special casing in the compiler. | ||
However, it could also be done via, for instance, a separate method in the Deref | ||
trait definition: | ||
|
||
```rust | ||
fn deref_copy(&self) -> Self::Target where Self::Target : Copy { | ||
return self.deref(); | ||
} | ||
``` | ||
|
||
This still requires special-casing in the compiler, but it is less magical. One | ||
could even envision a `DerefCopy` trait with similar purpose. | ||
|
||
These approaches would, however, make diverging implementations (that is, | ||
implementations for which the behaviour of the various deref traits is | ||
unintuitive because they don't actually refer to the same underlying value) more | ||
difficult to write, and we want to avoid that. And they would not fundamentally | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Do you mean easier to write? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Yes, sorry. See my above comment that if we did this we may actually need to enforce that they match. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
Can you elaborate on what you mean by "fail"? I assume all that would happen is that the Deref target and implementation would be used on autoderef, which seems no different than what would happen if the targets were the same.
Wouldn't it always be unambiguous which implementation to use in any given context, in the same way that |
||
solve the special casing in the compiler; to do that, we would really need a | ||
more advanced type system allowing us to condition the type of `self` in | ||
`deref_move` on whether or not `Target : Copy`. | ||
|
||
With negative trait bounds, we could also bound `DerefMove` on `Target : !Copy`, | ||
eliminating the problem, at the cost of requiring every implementation to also | ||
bound on `!Copy`. | ||
|
||
## Status quo | ||
|
||
The status quo is disfavoured because it requires special-casing `Box` in the | ||
compiler, privileging it over other smart pointers. While there's no obvious | ||
call in the stdlib for `DerefMove` other than `Box` (see below), it prevents | ||
other library authors from writing similar code. | ||
|
||
## `IndexMove` trait | ||
|
||
One option considered in the early discussion was an `IndexMove` trait, similar | ||
to `DerefMove`, which would consume a collection and return a single element. I | ||
have not included this in this RFC because no compelling use case was presented, | ||
and the behaviour of `let s = v[0];` actually consuming `v` seems likely to add | ||
cognitive overhead to the language. | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Oh i just thought of a use-case; arrays. IndexMove is necessary to get e.g. a Box out of an Array. But in that case into_iter or slice patterns are probably better. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Well, there's definitely a use case for an Index trait which returns things by value. For example, consider a type that wants to expose an API similar to There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. IndexGet is the intended solution to that problem, but yes it could just be provided through IndexMove. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I initially worked on all versions proposed by #997, but quickly abandoned that when I realize that the |
||
|
||
Compared to other methods of moving out of a collection, `IndexMove` may provide | ||
slight micro-optimization in some cases, but it's not clear without benchmarks | ||
that this is actually meaningfully better than using, say, | ||
`.into_inter().nth(n)`. Given that the `IndexMove` was proposed largely from | ||
parallel construction from `DerefMove` without any specific use cases, it does | ||
not seem worthwhile to move further with it at this time. | ||
|
||
## Weakening `DerefMut` bound | ||
|
||
The `DerefMut` bound on `DerefMove` and, in particular, the `Deref` bound that | ||
it implies, constrain implementors somewhat. It requires that `Deref` be | ||
implementable, so the implementation is not allowed to overload dereferencing in | ||
such a way that it produces a new value each time. This would not be useful for | ||
non-`Copy` types, since they would be consumed, but a `Copy` type could in this | ||
fashion allow itself to be copied and then produce a new value on each | ||
derefence. This would be useful to avoid redefining methods on the outer type. | ||
|
||
This very much goes against the intent of the `Deref` traits to be restricted | ||
smart pointers, however. `Deref` is currently the only way for methods of one | ||
type to be callable on another, but it is deliberately limited in scope. For | ||
instance, one might be tempted to implement `Deref` on a newtype for this | ||
purpose, but this is considered unidiomatic. Additionally, this is a bandaid for | ||
the fact that newtypes would also often like to be able to inherent trait | ||
implementations, and inhereting methods is not sufficient for that. | ||
|
||
The additional requirement is to avoid complication of the following case: | ||
|
||
```rust | ||
struct RefMethod(); | ||
|
||
trait Consume : Sized { | ||
fn consume(self) {} | ||
} | ||
|
||
impl<'a> Consume for &'a mut RefMethod {} | ||
|
||
struct BadDeref<'a> (&'a mut RefMethod); | ||
|
||
impl<'a> Deref for BadDeref<'a> { | ||
type Target = RefMethod; | ||
fn deref(&self) -> &Self::Target { &self.0 } | ||
} | ||
|
||
impl<'a> DerefMove for BadDeref<'a> { | ||
fn deref_move(self) -> Self::Target { self.0 } | ||
} | ||
|
||
fn main() { | ||
let mut r = RefMethod{}; | ||
let b = BadDeref(&mut r); | ||
b.consume(); | ||
b.consume(); | ||
} | ||
``` | ||
|
||
In this case, the first `b.consume()` is legal, but the second `b.consume()` is | ||
not since `b` was already moved out of. But if `BadDeref` also implements | ||
`DerefMut`, then the first `b.consume()` becomes illegal because `deref_mut` | ||
becomes preferred and `b` is immutable. By requiring that `DerefMove : | ||
DerefMut`, this edge case cannot be encountered: this code will always be | ||
illegal unless `b` is made mutable. | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I sort of understand why this is a bad edge case, but as was pointed out above there are useful situations where you can't feasibly implement There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. The behaviour of |
||
|
||
# Prior art | ||
[prior-art]: #prior-art | ||
|
||
The compiler special case for `Box` is the most compelling prior art. This RFC | ||
is written with that behaviour in mind; one of the most important criteria here | ||
is that `Box` should remain completely backwards-compatible. In case of any | ||
conflict between current `Box` behaviour and the behaviour specified in this | ||
RFC that isn't explicitly called out as being different, it should be considered | ||
a bug in the RFC and either this RFC should be corrected or more discussion | ||
should ensue. | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Rust allows this: struct A<T: ?Sized> {
x: String,
y: T,
}
fn foo(b: Box<A<dyn Send>>) {
let r = (*b).x; // explicit deref for clarity
} There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Also moving from struct D(i32);
impl Drop for D {
fn drop(&mut self) {
println!("drop({})", self.0);
}
}
fn main() {
let mut b = Box::new((D(0), D(1)));
drop(b.0);
println!("Hi!");
b = Box::new((D(2), D(3))); // old value of b is dropped.
println!("Hi again!");
} Output:
|
||
|
||
# Unresolved questions | ||
[unresolved]: #unresolved-questions | ||
|
||
## Implementors | ||
|
||
It's not clear to me which, if any, of the smart pointers above should implement | ||
`DerefMove` other than `Box`. It requires discussion of whether inadvertently | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. That's a good point. I will revise the RFC to discuss the interaction with by-value DSTs more, since that seems very fundamental. |
||
moving out of the pointer could cause issues. I think that it's reasonable to | ||
start with only `Box` for now: | ||
|
||
* `Cow` does not implement `DerefMut`, so it is ineligible. In any case, because | ||
`into_owned()` is potentially expensive, it probably should not be a | ||
`DerefMove` implementation. | ||
* `PeekMut`'s `pop` has side effects on the containing collection, and so it | ||
seems best to require it to be moved explicitly. | ||
* Moving out of a `ManualDrop` causes the newly-moved value to regain normal | ||
drop semantics. This seems like the least dangerous, given the mechanics of | ||
`ManualDrop`, but it still probably deserves separate consideration. | ||
|
||
Additional implementations could, of course, be done as separate proposals for | ||
the libs team. | ||
|
||
## Preferring to copy the pointer | ||
|
||
Some earlier discussion proposed that when a type implements both `DerefMove` | ||
and `Copy`, then the special case of a `Copy` target does not apply. In other | ||
words, given `let x = *p;`, if both `p` and `p`'s target type are `Copy`, then | ||
rather than calling `deref` and copying the result, this assignment would call | ||
`deref_mut`, copying `p` to use as the receiver. | ||
|
||
Because of the above decision to require `DerefMut`, `p` must also be able to | ||
get a mutable reference to its target. If it were `Copy`, then, it almost | ||
certainly owns its target, meaning that copying `p` is going to copy `*p` | ||
anyway. So copying only `*p` is a lesser copy, and simplifies the rules by | ||
eliminating a special case. But we may decide that we want to make `DerefMove` a | ||
bit more general, in which case we might want to consider this behaviour. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nix the first instance of "smart pointer", I think ("two traits provide...")