Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Custom self types #2362

Open
wants to merge 4 commits into
base: master
from

Conversation

@withoutboats
Copy link
Contributor

commented Mar 14, 2018

Extends support for new self types, such as Rc<Self> and Arc<Self>.

Rendered

@Centril Centril added the T-lang label Mar 14, 2018

@sfackler

This comment has been minimized.

Copy link
Member

commented Mar 14, 2018

How are conflicting method names resolved?

pub struct Ptr<T>(T);

impl<T> Deref<T> {
    type Target = T;

    fn deref(&self) -> &T {
        &self.0
    }
}

impl<T> Ptr<T> {
    pub fn foo(&self) {}
}

pub struct Bar;

impl Bar {
    fn foo(self: &Ptr<Self>) {}
}

However, we also feel that `CoerceUnsized` is not ready to stabilize without
further consideration of the trade offs. For that reason, defining your own
arbitrary method receivers may not be stabilized as quickly.

This comment has been minimized.

Copy link
@cramertj

cramertj Mar 14, 2018

Member

I think this division makes sense. It shouldn't even require an extra feature gate, since CoerceUnsized is already unstable, correct? In other words, the delayed stabilization described here doesn't require delaying stabilization of any part of this RFC-- it just specifies that CoerceUnsized, an existing unstable feature, will remain unstable.

@cramertj

This comment has been minimized.

Copy link
Member

commented Mar 14, 2018

How will conflicts between methods on e.g. Arc<T> and self: Arc<MyType> be handled? For example:

// In an upstream crate
impl<T> CleverPtr<T> {
    fn foo(self) { ... }
}

// In a downstream crate:
struct MyType;
impl MyType {
    fn foo(self: CleverPtr<MyType>) { ... }
}

fn main() {
    let x = CleverPtr::new(MyType);
    x.foo(); // ERROR: conflicting methods.
    // What do I write to solve this? UFCs doesn't help:
    <CleverPtr<MyType>>::foo(x) // Which method is this?
}

Edit: whoops, looks like @sfackler beat me to it :)
Edit 2: see below, UFCs does solve the problem

@withoutboats

This comment has been minimized.

Copy link
Contributor Author

commented Mar 14, 2018

@sfackler Good question! Its already implemented, so they must be resolved somehow. Let's check... (playpen)

It's currently a name collision.

(@cramertj I think your question is the same.)

@sfackler

This comment has been minimized.

Copy link
Member

commented Mar 14, 2018

Ah cool, and you can disambiguate with Ptr::foo and Bar::foo.

@cramertj

This comment has been minimized.

Copy link
Member

commented Mar 14, 2018

Ah, I see-- the type in UFCs is the type of the impl block, not the type of the method receiver.

(In other words, impl Foo { fn bar(self: Arc<Self>) {...} } is called like Foo::bar(x), not <Arc<Foo>>::bar(x).)

@withoutboats

This comment has been minimized.

Copy link
Contributor Author

commented Mar 14, 2018

@sfackler Right! It seems analogous to two traits containing the same method name.

I could see us introducing a hierarchy some day so that impls defined where Self is the type of the variable get precedence (so that Ptr<T>'s foo comes first), similar to how inherent methods get precedence over traits. But the name collision is forward compatible with that change, and I think I'd like to get experience using this and seeing how often its a problem before deciding whether or not to do that (I can add it as an unresolved question).

# Summary
[summary]: #summary

Allow types that implement `Deref` targeting `Self` to be the receiver of a

This comment has been minimized.

Copy link
@Centril

Centril Mar 14, 2018

Member

I think this is clearer:

Allow types that implement Deref<Target = Self> ...

feature.

This feature is increasingly relevant because of the role of special pointer
types to constraint self-referential types, such as generators containing

This comment has been minimized.

Copy link
@Centril

Centril Mar 14, 2018

Member

typo: s/constraint/constrain

[guide-level-explanation]: #guide-level-explanation

When declaring a method, users can declare the type of the `self` receiver to
be any type `T` where `T: Deref<Target = Self>`. Shorthand exists, so that

This comment has been minimized.

Copy link
@Centril

Centril Mar 14, 2018

Member

s/Shorthand exists/Shorthands exist/

fn by_rc(self: Rc<Self>);
fn by_arc(self: Arc<Self>);
}
```

This comment has been minimized.

Copy link
@Centril

Centril Mar 14, 2018

Member

What about HKTs in the future (assuming we get HKTs which we might not)?
Consider:

trait Functor<Self : * -> *> { // strawman syntax for encoding the kind of Self : * -> *
    fn fmap<A, B, F>(self: Self<A>, mapper: F) -> Self<B>
    where F: Fn(A) -> B;
}

This comment has been minimized.

Copy link
@arielb1

arielb1 Mar 14, 2018

Contributor

If we have a bound where Self<A> derefs to Self, I don't see why not.

This comment has been minimized.

Copy link
@Centril

Centril Mar 15, 2018

Member

But Self is not of the kind of values, it has kind * -> *, so you shouldn't be able to construct a value of Selfat all in this case (hence, you can't deref Self<A> to Self).

I guess you could have a separate rule for higher kinded types and say that this rule only applies to *-kinded types.


To support object-safety, we must have a way of obtaining a vtable from the
reference type, and then passing the correct receiver type to the function in
the vtable.

This comment has been minimized.

Copy link
@Centril

Centril Mar 14, 2018

Member

Have you considered interactions with multiple trait bounds in the future?

This comment has been minimized.

Copy link
@withoutboats

withoutboats Mar 14, 2018

Author Contributor

I don't think there's an interaction. This RFC doesn't introduce any constraint that doesn't already exist today.

@glaebhoerl

This comment has been minimized.

Copy link
Contributor

commented Mar 14, 2018

Two sortof-related questions:

  • Are trait objects the reason why Deref is necessary, or is there something else? Writing a free fn taking a Foo<T> as argument doesn't (of course) require it to be Deref. Is it because formulating a "the receiver must [somehow] be related to Self" rule would otherwise require HKT, and/or because we want to support non-generic receivers (e.g. String: Deref<Target=str>)?

  • How come DerefMut isn't relevant? (We don't really care about the methods only the Target type?)

@withoutboats

This comment has been minimized.

Copy link
Contributor Author

commented Mar 14, 2018

@glaebhoerl I don't really understand your questions. Its true that object safety requires Deref to find the vtable, and also true that we'd want to keep the constraint anyway. I think the answer to your first question is basically "yes," we don't want something like:

impl String {
    fn this_is_a_string_method(self: i32) { }
}

// These two expressions do the same thing:
0.this_is_a_string_method()
String::this_is_a_string_method(0)

I don't see how DerefMut could come into things at all.

@glaebhoerl

This comment has been minimized.

Copy link
Contributor

commented Mar 14, 2018

(I think both of them boil down to trying to clarify for myself why symmetries which typically manifest -- between the requirements of free fns and methods, and between pairs of things which work with Deref resp. DerefMut -- in this case don't.)

@arielb1

This comment has been minimized.

Copy link
Contributor

commented Mar 14, 2018

@glaebhoerl

At least when I wrote the proposal, Deref is required for the same reason that importing traits is - type-based resolution should not depend on the universe - in other words, adding a new sibling crate while not using it in the current scope should not be able to cause ambiguities that need UFCS to be resolved where they did not exist before.

For this reason, DerefMut does not matter - if we have a Deref-chain, we already know that the method is "in our universe".

I'm not sure how useful is that property, because adding a new dependency can worsen type inference, but it felt like a useful thing to keep.

@arielb1

This comment has been minimized.

Copy link
Contributor

commented Mar 14, 2018

@withoutboats

When I last looked on this, I remember that the current checking of CoerceUnsized did not quite have the right invariant that we needed for packing vtables.

However, CoerceUnsized is unstable and ill-documented, so I think the right idea is to:

  1. Figure out the right rules for CoerceUnsized to make autosizing and autounsizing sound, while making sure that there are sane rules that don't break anything important.
  2. Document these rules and change the implementation of CoerceUnsized to use them.
  3. Maybe think about stabilizing it.
@clarfon

This comment has been minimized.

Copy link
Contributor

commented Mar 14, 2018

I assume that self: &Box<T> will always be the appropriate bound, and not &self: Box<T>?

@F001

This comment has been minimized.

Copy link
Contributor

commented Mar 15, 2018

Is it possible to allow trait object as the type of self? I mean:

trait Bar {
    fn baz(self: &dyn Deref<Target=Self>);
}
@F001

This comment has been minimized.

Copy link
Contributor

commented Mar 15, 2018

And I have another question which might be off topic. Does this RFC solve the issue rust-lang/rust#28796? My understanding is that it is possible to allow Box<FnOnce()> just work. Or it is an orthogonal problem?

@withoutboats

This comment has been minimized.

Copy link
Contributor Author

commented Mar 15, 2018

cc @mikeyhew who will hopefully be spearheading the implementation work on this :)

@nikomatsakis

This comment has been minimized.

Copy link
Contributor

commented Mar 16, 2018

@withoutboats

With respect to the trait object rule, there's something a bit implicit in there I want to better understand. The RFC states:

Given a reference type Ptr<dyn Trait>, the compiler must be able to prove that T: Unsize<dyn Trait> implies Ptr<T>: CoerceUnsized<Ptr<dyn Trait>>.

It seems like we would require a "HKT-like" feature here to decide what Ptr is, right? It feels like we should say something about this. Presumably the rule is that, looking at the self type in the trait definition, the Self parameter must appear exactly once, so that we can instantiate it with Self = dyn Trait. Maybe we want to be more restrictive, in fact, and say that it must an ADT with one type parameter (Self)? That would fit e.g. Rc<Self> and friends; we probably want to support Ref<'_, Self> too.

But, for example, I imagine we don't want anyone to write this:

trait Foo {
    fn method(self: Deref1<Self, Self>) { .. }
}

struct Deref1<A: ?Sized, B: ?Sized> { }
impl<A: ?Sized, B: ?Sized> Deref for Deref1<A,B> {
  type Target = A;
}
@nikomatsakis

This comment has been minimized.

Copy link
Contributor

commented Mar 16, 2018

Similarly, the way the RFC is written, it is fine (in inherent impls, anyway) to have a receiver type derefs to Self but doesn't mention Self, right?

struct Foo { }
struct Bar { }
impl Deref for Foo {
   type Target = Bar;
}

impl Bar {
    fn on_foo(self: Foo) { } // `Foo: Deref<Bar>` is ok
}
@nikomatsakis

This comment has been minimized.

Copy link
Contributor

commented Mar 16, 2018

I think it'd be useful to dig a bit more into how method dispatch on a trait object will work. Maybe I'm just slow, but the text in the RFC is a bit terse. =)

So, let me try to spell it out for my own edification. We start out with a Rc<dyn Trait> and we see an invocation of a method foo, defined like so:

trait Trait {
    fn foo(self: Rc<Self>);
}

We consider Trait to be "dyn capable", as we have previously checked that (in Chalk terms):

forall<T> {
  if (T: Unsize<dyn Trait>) {
    Rc<T>: CoerceUnsized<Rc<dyn Trait>>
  }
}

Then, at runtime, we can deref Rc<Self> to attain &self and extract the vtable. Given the current limitations on the CoerceUnsized trait, this implies that Rc must effectively be just a newtype'd pointer (which, indeed, it is):

For custom types, the coercion here works by coercing Foo<T> to Foo<U> provided an impl of CoerceUnsized<Foo<U>> for Foo<T> exists. Such an impl can only be written if Foo<T> has only a single non-phantomdata field involving T. If the type of that field is Bar<T>, an implementation of CoerceUnsized<Bar<U>> for Bar<T> must exist. The coercion will work by coercing the Bar<T> field into Bar<U> and filling in the rest of the fields from Foo<T> to create a Foo<U>. This will effectively drill down to a pointer field and coerce that.

Assuming these docs are accurate (I should check the code), then it seems like we already know that actually we can just strip the vtable from the Rc<dyn Trait> and pass the pointer -- i.e., we don't really need to extract the type info out of the vtable in particular, right? (Am I missing something?)

@arielb1 I'm curious what constraints you think are insufficient.

(If we were to generalize CoerceUnsized -- e.g., to allow user-specified computation as was originally proposed -- we might have to do more.)

@withoutboats

This comment has been minimized.

Copy link
Contributor Author

commented Mar 16, 2018

@nikomatsakis You're not slow, I'm just uncertainly cribbing from the notes that you and @mikeyhew gave me. :)

Everything you've said seems accurate. I think we should restrict the self type in traits to having the Self parameter appear exactly once, but no more restrictions (e.g. having other parameters seems fine to me, as long as it can be implemented).

@mikeyhew

This comment has been minimized.

Copy link

commented Mar 17, 2018

@nikomatsakis

it seems like we would require a "HKT-like" feature here to decide what Ptr is, right?

I agree with this notion. I'm wary of stabilizing the way that we determine if a self type is object-safe until we have more trait-system features.

I don't think that the CoerceUnsized requirement is the ultimate solution here, I think it's more of a hack that we can use now (and keep unstable) until we have a better solution.

Given the current limitations on the CoerceUnsized trait, this implies that Rc must effectively be just a newtype'd pointer (which, indeed, it is)

It doesn't have to be a newtype'd pointer, it just has to have a field that is a pointer (it can have other fields too).

This is why the `CoerceUnsized` bound is necessary for object-safe receiver
types. Using the type ID stored in the vtable, we can downcast the `Self` type
to the correct concrete type to pass to the function pointer for the method
we're calling, effectively reversing the unsizing coercion.

This comment has been minimized.

Copy link
@mikeyhew

mikeyhew Mar 17, 2018

Are you sure that "type ID" and "downcast" is the right terminology here? My understanding was that the compiler looks up the method in the vtable, coerces the Rc<Foo> to Rc<WhateverTypeThatImplementsFoo> without actually knowing what that type is, and calls the method using the coerced Rc as the self argument

This comment has been minimized.

Copy link
@withoutboats

withoutboats Mar 17, 2018

Author Contributor

You're right.

@nikomatsakis

This comment has been minimized.

Copy link
Contributor

commented Mar 17, 2018

@mikeyhew

It doesn't have to be a newtype'd pointer, it just has to have a field that is a pointer (it can have other fields too).

However, I believe those fields have to be phantom-data -- or at least zero-sized? That's what the description says, anyway.

@newpavlov

This comment has been minimized.

Copy link

commented Mar 20, 2018

Will this RFC bring us closer to self-consuming object safe trait methods? Ideally I would like to be able to write the following code:

trait Foo {
    fn consume(self: Consumed<Self>) {
        // `Consumed<Self>` implements `DerefMut<Self>`
        // but to create it you'll have to move your data into it
    }
}

// `T` is a concrete type which implements `Foo`
fn bar1(val: T) {
    // this line desugars into `Consumed::new(val).consume()`
    val.consume()
}

fn bar2(val: Box<Foo>) {
    // `val` can not be used after this line
    val.consume()
}
@cramertj

This comment has been minimized.

Copy link
Member

commented Mar 21, 2018

@newpavlov No, this RFC is an orthogonal feature. Self-consuming object-safe methods dependent upon being able to generate a vtable which consumes Self via a pointer, similar to the take_closure function allowed by the unsized rvalues RFC.

@gralpli

This comment has been minimized.

Copy link

commented Mar 21, 2018

Does this work for self: Rc<RefCell<Self>>, too?

@MSleepyPanda

This comment has been minimized.

Copy link

commented Mar 21, 2018

@gralpli Probably not, because RefCell does not implement Deref

Allow types that implement Deref targeting Self to be the receiver of a method.

Edit:
There is an existing pattern, access the object via this: Rc<RefCell<Self>> instead of self, though it isn't as ergonomic as it could be because you have to call it by Foo::bar(that) instead of that.bar()

@nikomatsakis

This comment has been minimized.

Copy link
Contributor

commented Mar 21, 2018

@arielb1

However, this code is silly and ICEs, so the better option is probably to fix CoerceUnsized checking to prevent this sort of foolery.

Thanks for the example! We should indeed try to write out the CoerceUnsized rules in some kind of more careful way and ensu. AFAIK the true rules are -- right now -- encoded only in the compiler.

@bill-myers

This comment has been minimized.

Copy link

commented Apr 2, 2018

Why does this need Deref at all?

If we have a Foo<dyn Bar>, doesn't that type already include a vtable for trait Bar?

If the concern is that we don't want to add methods to random types, then we could support not having Deref, but only allow UFCS to call the method in that case.

be implemented. Given a reference type `Ptr<dyn Trait>`, the compiler must be
able to prove that `T: Unsize<dyn Trait>` implies `Ptr<T>:
CoerceUnsized<Ptr<dyn Trait>>`. If the compiler can prove this, methods with
these receivers are object safe (how they object safe conversion is implemented

This comment has been minimized.

Copy link
@parasyte

parasyte Jun 8, 2018

nit: typo they -> the

@Nemo157

This comment has been minimized.

Copy link
Contributor

commented Jul 21, 2018

Something not covered in the RFC and I don't recall seeing discussed is lifetime elision, in the futures codebase there is a function signature:

pub fn get_pin_mut(self: PinMut<'a, Self>) -> PinMut<'a, Si>

If arbitrary self types supported the same elision as normal references this could be:

pub fn get_pin_mut(self: PinMut<Self>) -> PinMut<'_, Si>

But that seems a bit terse, and loses out on the "all types have either & or '_ to denote they contain a lifetime" property of the '_ return lifetime. Maybe this could be allowed to be written

pub fn get_pin_mut(self: PinMut<'_, Self>) -> PinMut<'_, Si>

to remove the need to give a name to this very transient lifetime.

(I found this because Clippy's needless_lifetimes lint has a false positive on this line with the current arbitrary_self_types implementation).

@mikeyhew

This comment has been minimized.

Copy link

commented Jul 21, 2018

@Nemo157 I think going with '_ makes a lot of sense. As far as I'm concerned, the fact that it's not supported right now is a bug.

@Laaas

This comment has been minimized.

Copy link

commented Aug 30, 2018

so.... will this be merged?

@mikeyhew

This comment has been minimized.

Copy link

commented Aug 30, 2018

Before this gets merged, I think we should either remove the part about object safety, or say that the way object-safety is determined will remain unstable for a while, and will probably need a new RFC for a final decision to be made. Right now, in the implementation, we are using a new, unstable trait called CoerceSized that has to be implemented in order for a receiver type to be considered object-safe. While I think it probably doesn't need to be decoupled from CoerceUnsized, doing so is the more conservative choice, and so that's what we're doing for now.

Another change that I'm mostly in favour of, is calling the self argument the method's receiver, to keep from confusing lowercase self and uppercase Self. Under this logic, "custom method receivers" would be a better name for this feature.

@nikomatsakis

This comment has been minimized.

Copy link
Contributor

commented Aug 27, 2019

Nominating this for @rust-lang/lang -- I think it just got overlooked? AFAIK we stabilized some subset of this functionality!

@nikomatsakis

This comment has been minimized.

Copy link
Contributor

commented Aug 29, 2019

We discussed this briefly in the @rust-lang/lang meeting today and decided to postpone. I was mistaken in saying that we stabilized the functionality in here -- in truth, what we stabilized was extending the set of "recognized self types" to include Pin<&mut Self> and a few other selected types. We didn't really work out the general case (e.g., Rc<Self>) and it seems like it's sufficiently complex that we don't think doing so makes sense at this point (there's nobody with time to concentrate on it). Therefore, I am moving to postpone the RFC until we do have such bandwidth.

(Note that the RFC content would obviously be a useful starting point for future discussion when we do revisit the topic.)

@rfcbot postpone

@rfcbot

This comment has been minimized.

Copy link

commented Aug 29, 2019

Team member @nikomatsakis has proposed to postpone this. The next step is review by the rest of the tagged team members:

No concerns currently listed.

Once a majority of reviewers approve (and at most 2 approvals are outstanding), this will enter its final comment period. If you spot a major issue that hasn't been raised at any point in this process, please speak up!

See this document for info about what commands tagged team members can give me.

@mikeyhew

This comment has been minimized.

Copy link

commented Aug 31, 2019

For the record, we stabilized Rc and Arc and Pin, but did not stabilize the mechanism to make user-defined smart pointer types usable as method receivers.

@comex

This comment has been minimized.

Copy link

commented Sep 3, 2019

That’s is a really unfortunate state to leave things in indefinitely. std is not suitable for all use cases and we should avoid giving it special privileges if at all possible.

@Centril

This comment has been minimized.

Copy link
Member

commented Sep 3, 2019

That’s is a really unfortunate state to leave things in indefinitely.

This is not intended as the end state of things. We should get back to arbitrary_self_types eventually, but the language and compiler teams have limited bandwidths that I believe should be focused on finishing a smaller set of tasks better.

std is not suitable for all use cases and we should avoid giving it special privileges if at all possible.

I believe we agree overall.

@burdges

This comment has been minimized.

Copy link

commented Sep 3, 2019

Are there interesting complexities in the mutable cases like Box, MutexGuard, and RwLockReadGuard? RefCell cannot happen because it does not implement Deref?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.