New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Custom self types #2362

Open
wants to merge 4 commits into
base: master
from

Conversation

Projects
None yet
@withoutboats
Copy link
Contributor

withoutboats commented Mar 14, 2018

Extends support for new self types, such as Rc<Self> and Arc<Self>.

Rendered

withoutboats added some commits Feb 20, 2018

@Centril Centril added the T-lang label Mar 14, 2018

@sfackler

This comment has been minimized.

Copy link
Member

sfackler commented Mar 14, 2018

How are conflicting method names resolved?

pub struct Ptr<T>(T);

impl<T> Deref<T> {
    type Target = T;

    fn deref(&self) -> &T {
        &self.0
    }
}

impl<T> Ptr<T> {
    pub fn foo(&self) {}
}

pub struct Bar;

impl Bar {
    fn foo(self: &Ptr<Self>) {}
}

However, we also feel that `CoerceUnsized` is not ready to stabilize without
further consideration of the trade offs. For that reason, defining your own
arbitrary method receivers may not be stabilized as quickly.

This comment has been minimized.

@cramertj

cramertj Mar 14, 2018

Member

I think this division makes sense. It shouldn't even require an extra feature gate, since CoerceUnsized is already unstable, correct? In other words, the delayed stabilization described here doesn't require delaying stabilization of any part of this RFC-- it just specifies that CoerceUnsized, an existing unstable feature, will remain unstable.

@cramertj

This comment has been minimized.

Copy link
Member

cramertj commented Mar 14, 2018

How will conflicts between methods on e.g. Arc<T> and self: Arc<MyType> be handled? For example:

// In an upstream crate
impl<T> CleverPtr<T> {
    fn foo(self) { ... }
}

// In a downstream crate:
struct MyType;
impl MyType {
    fn foo(self: CleverPtr<MyType>) { ... }
}

fn main() {
    let x = CleverPtr::new(MyType);
    x.foo(); // ERROR: conflicting methods.
    // What do I write to solve this? UFCs doesn't help:
    <CleverPtr<MyType>>::foo(x) // Which method is this?
}

Edit: whoops, looks like @sfackler beat me to it :)
Edit 2: see below, UFCs does solve the problem

@withoutboats

This comment has been minimized.

Copy link
Contributor

withoutboats commented Mar 14, 2018

@sfackler Good question! Its already implemented, so they must be resolved somehow. Let's check... (playpen)

It's currently a name collision.

(@cramertj I think your question is the same.)

@sfackler

This comment has been minimized.

Copy link
Member

sfackler commented Mar 14, 2018

Ah cool, and you can disambiguate with Ptr::foo and Bar::foo.

@cramertj

This comment has been minimized.

Copy link
Member

cramertj commented Mar 14, 2018

Ah, I see-- the type in UFCs is the type of the impl block, not the type of the method receiver.

(In other words, impl Foo { fn bar(self: Arc<Self>) {...} } is called like Foo::bar(x), not <Arc<Foo>>::bar(x).)

@withoutboats

This comment has been minimized.

Copy link
Contributor

withoutboats commented Mar 14, 2018

@sfackler Right! It seems analogous to two traits containing the same method name.

I could see us introducing a hierarchy some day so that impls defined where Self is the type of the variable get precedence (so that Ptr<T>'s foo comes first), similar to how inherent methods get precedence over traits. But the name collision is forward compatible with that change, and I think I'd like to get experience using this and seeing how often its a problem before deciding whether or not to do that (I can add it as an unresolved question).

# Summary
[summary]: #summary

Allow types that implement `Deref` targeting `Self` to be the receiver of a

This comment has been minimized.

@Centril

Centril Mar 14, 2018

Contributor

I think this is clearer:

Allow types that implement Deref<Target = Self> ...

feature.

This feature is increasingly relevant because of the role of special pointer
types to constraint self-referential types, such as generators containing

This comment has been minimized.

@Centril

Centril Mar 14, 2018

Contributor

typo: s/constraint/constrain

[guide-level-explanation]: #guide-level-explanation

When declaring a method, users can declare the type of the `self` receiver to
be any type `T` where `T: Deref<Target = Self>`. Shorthand exists, so that

This comment has been minimized.

@Centril

Centril Mar 14, 2018

Contributor

s/Shorthand exists/Shorthands exist/

fn by_rc(self: Rc<Self>);
fn by_arc(self: Arc<Self>);
}
```

This comment has been minimized.

@Centril

Centril Mar 14, 2018

Contributor

What about HKTs in the future (assuming we get HKTs which we might not)?
Consider:

trait Functor<Self : * -> *> { // strawman syntax for encoding the kind of Self : * -> *
    fn fmap<A, B, F>(self: Self<A>, mapper: F) -> Self<B>
    where F: Fn(A) -> B;
}

This comment has been minimized.

@arielb1

arielb1 Mar 14, 2018

Contributor

If we have a bound where Self<A> derefs to Self, I don't see why not.

This comment has been minimized.

@Centril

Centril Mar 15, 2018

Contributor

But Self is not of the kind of values, it has kind * -> *, so you shouldn't be able to construct a value of Selfat all in this case (hence, you can't deref Self<A> to Self).

I guess you could have a separate rule for higher kinded types and say that this rule only applies to *-kinded types.


To support object-safety, we must have a way of obtaining a vtable from the
reference type, and then passing the correct receiver type to the function in
the vtable.

This comment has been minimized.

@Centril

Centril Mar 14, 2018

Contributor

Have you considered interactions with multiple trait bounds in the future?

This comment has been minimized.

@withoutboats

withoutboats Mar 14, 2018

Contributor

I don't think there's an interaction. This RFC doesn't introduce any constraint that doesn't already exist today.

@glaebhoerl

This comment has been minimized.

Copy link
Contributor

glaebhoerl commented Mar 14, 2018

Two sortof-related questions:

  • Are trait objects the reason why Deref is necessary, or is there something else? Writing a free fn taking a Foo<T> as argument doesn't (of course) require it to be Deref. Is it because formulating a "the receiver must [somehow] be related to Self" rule would otherwise require HKT, and/or because we want to support non-generic receivers (e.g. String: Deref<Target=str>)?

  • How come DerefMut isn't relevant? (We don't really care about the methods only the Target type?)

@withoutboats

This comment has been minimized.

Copy link
Contributor

withoutboats commented Mar 14, 2018

@glaebhoerl I don't really understand your questions. Its true that object safety requires Deref to find the vtable, and also true that we'd want to keep the constraint anyway. I think the answer to your first question is basically "yes," we don't want something like:

impl String {
    fn this_is_a_string_method(self: i32) { }
}

// These two expressions do the same thing:
0.this_is_a_string_method()
String::this_is_a_string_method(0)

I don't see how DerefMut could come into things at all.

@glaebhoerl

This comment has been minimized.

Copy link
Contributor

glaebhoerl commented Mar 14, 2018

(I think both of them boil down to trying to clarify for myself why symmetries which typically manifest -- between the requirements of free fns and methods, and between pairs of things which work with Deref resp. DerefMut -- in this case don't.)

@arielb1

This comment has been minimized.

Copy link
Contributor

arielb1 commented Mar 14, 2018

@glaebhoerl

At least when I wrote the proposal, Deref is required for the same reason that importing traits is - type-based resolution should not depend on the universe - in other words, adding a new sibling crate while not using it in the current scope should not be able to cause ambiguities that need UFCS to be resolved where they did not exist before.

For this reason, DerefMut does not matter - if we have a Deref-chain, we already know that the method is "in our universe".

I'm not sure how useful is that property, because adding a new dependency can worsen type inference, but it felt like a useful thing to keep.

@arielb1

This comment has been minimized.

Copy link
Contributor

arielb1 commented Mar 14, 2018

@withoutboats

When I last looked on this, I remember that the current checking of CoerceUnsized did not quite have the right invariant that we needed for packing vtables.

However, CoerceUnsized is unstable and ill-documented, so I think the right idea is to:

  1. Figure out the right rules for CoerceUnsized to make autosizing and autounsizing sound, while making sure that there are sane rules that don't break anything important.
  2. Document these rules and change the implementation of CoerceUnsized to use them.
  3. Maybe think about stabilizing it.
@clarcharr

This comment has been minimized.

Copy link
Contributor

clarcharr commented Mar 14, 2018

I assume that self: &Box<T> will always be the appropriate bound, and not &self: Box<T>?

@F001

This comment has been minimized.

Copy link
Contributor

F001 commented Mar 15, 2018

Is it possible to allow trait object as the type of self? I mean:

trait Bar {
    fn baz(self: &dyn Deref<Target=Self>);
}
@F001

This comment has been minimized.

Copy link
Contributor

F001 commented Mar 15, 2018

And I have another question which might be off topic. Does this RFC solve the issue rust-lang/rust#28796? My understanding is that it is possible to allow Box<FnOnce()> just work. Or it is an orthogonal problem?

@withoutboats

This comment has been minimized.

Copy link
Contributor

withoutboats commented Mar 15, 2018

cc @mikeyhew who will hopefully be spearheading the implementation work on this :)

@nikomatsakis

This comment has been minimized.

Copy link
Contributor

nikomatsakis commented Mar 16, 2018

@withoutboats

With respect to the trait object rule, there's something a bit implicit in there I want to better understand. The RFC states:

Given a reference type Ptr<dyn Trait>, the compiler must be able to prove that T: Unsize<dyn Trait> implies Ptr<T>: CoerceUnsized<Ptr<dyn Trait>>.

It seems like we would require a "HKT-like" feature here to decide what Ptr is, right? It feels like we should say something about this. Presumably the rule is that, looking at the self type in the trait definition, the Self parameter must appear exactly once, so that we can instantiate it with Self = dyn Trait. Maybe we want to be more restrictive, in fact, and say that it must an ADT with one type parameter (Self)? That would fit e.g. Rc<Self> and friends; we probably want to support Ref<'_, Self> too.

But, for example, I imagine we don't want anyone to write this:

trait Foo {
    fn method(self: Deref1<Self, Self>) { .. }
}

struct Deref1<A: ?Sized, B: ?Sized> { }
impl<A: ?Sized, B: ?Sized> Deref for Deref1<A,B> {
  type Target = A;
}
@nikomatsakis

This comment has been minimized.

Copy link
Contributor

nikomatsakis commented Mar 16, 2018

Similarly, the way the RFC is written, it is fine (in inherent impls, anyway) to have a receiver type derefs to Self but doesn't mention Self, right?

struct Foo { }
struct Bar { }
impl Deref for Foo {
   type Target = Bar;
}

impl Bar {
    fn on_foo(self: Foo) { } // `Foo: Deref<Bar>` is ok
}
@nikomatsakis

This comment has been minimized.

Copy link
Contributor

nikomatsakis commented Mar 16, 2018

I think it'd be useful to dig a bit more into how method dispatch on a trait object will work. Maybe I'm just slow, but the text in the RFC is a bit terse. =)

So, let me try to spell it out for my own edification. We start out with a Rc<dyn Trait> and we see an invocation of a method foo, defined like so:

trait Trait {
    fn foo(self: Rc<Self>);
}

We consider Trait to be "dyn capable", as we have previously checked that (in Chalk terms):

forall<T> {
  if (T: Unsize<dyn Trait>) {
    Rc<T>: CoerceUnsized<Rc<dyn Trait>>
  }
}

Then, at runtime, we can deref Rc<Self> to attain &self and extract the vtable. Given the current limitations on the CoerceUnsized trait, this implies that Rc must effectively be just a newtype'd pointer (which, indeed, it is):

For custom types, the coercion here works by coercing Foo<T> to Foo<U> provided an impl of CoerceUnsized<Foo<U>> for Foo<T> exists. Such an impl can only be written if Foo<T> has only a single non-phantomdata field involving T. If the type of that field is Bar<T>, an implementation of CoerceUnsized<Bar<U>> for Bar<T> must exist. The coercion will work by coercing the Bar<T> field into Bar<U> and filling in the rest of the fields from Foo<T> to create a Foo<U>. This will effectively drill down to a pointer field and coerce that.

Assuming these docs are accurate (I should check the code), then it seems like we already know that actually we can just strip the vtable from the Rc<dyn Trait> and pass the pointer -- i.e., we don't really need to extract the type info out of the vtable in particular, right? (Am I missing something?)

@arielb1 I'm curious what constraints you think are insufficient.

(If we were to generalize CoerceUnsized -- e.g., to allow user-specified computation as was originally proposed -- we might have to do more.)

@withoutboats

This comment has been minimized.

Copy link
Contributor

withoutboats commented Mar 16, 2018

@nikomatsakis You're not slow, I'm just uncertainly cribbing from the notes that you and @mikeyhew gave me. :)

Everything you've said seems accurate. I think we should restrict the self type in traits to having the Self parameter appear exactly once, but no more restrictions (e.g. having other parameters seems fine to me, as long as it can be implemented).

@mikeyhew

This comment has been minimized.

Copy link

mikeyhew commented Mar 17, 2018

@nikomatsakis

it seems like we would require a "HKT-like" feature here to decide what Ptr is, right?

I agree with this notion. I'm wary of stabilizing the way that we determine if a self type is object-safe until we have more trait-system features.

I don't think that the CoerceUnsized requirement is the ultimate solution here, I think it's more of a hack that we can use now (and keep unstable) until we have a better solution.

Given the current limitations on the CoerceUnsized trait, this implies that Rc must effectively be just a newtype'd pointer (which, indeed, it is)

It doesn't have to be a newtype'd pointer, it just has to have a field that is a pointer (it can have other fields too).

This is why the `CoerceUnsized` bound is necessary for object-safe receiver
types. Using the type ID stored in the vtable, we can downcast the `Self` type
to the correct concrete type to pass to the function pointer for the method
we're calling, effectively reversing the unsizing coercion.

This comment has been minimized.

@mikeyhew

mikeyhew Mar 17, 2018

Are you sure that "type ID" and "downcast" is the right terminology here? My understanding was that the compiler looks up the method in the vtable, coerces the Rc<Foo> to Rc<WhateverTypeThatImplementsFoo> without actually knowing what that type is, and calls the method using the coerced Rc as the self argument

This comment has been minimized.

@withoutboats

withoutboats Mar 17, 2018

Contributor

You're right.

@nikomatsakis

This comment has been minimized.

Copy link
Contributor

nikomatsakis commented Mar 17, 2018

@mikeyhew

It doesn't have to be a newtype'd pointer, it just has to have a field that is a pointer (it can have other fields too).

However, I believe those fields have to be phantom-data -- or at least zero-sized? That's what the description says, anyway.

@mikeyhew

This comment has been minimized.

Copy link

mikeyhew commented Mar 17, 2018

@nikomatsakis that description must be out of date, then. Here is an example of a pointer with an additional, non-zero-sized field:

https://play.rust-lang.org/?gist=4df5fa3c12db2d4545edd283ae76e76a&version=nightly

@mikeyhew

This comment has been minimized.

Copy link

mikeyhew commented Mar 17, 2018

@clarcharr

I assume that self: &Box will always be the appropriate bound, and not &self: Box?

Sorry, I don't understand the question. self: &Box<T> isn't a trait bound, and &self: Box<T> isn't valid syntax for a method's self argument.

@F001

Is it possible to allow trait object as the type of self? I mean:

trait Bar {
   fn baz(self: &dyn Deref<Target=Self>);
}

That actually works right now with the arbitrary_self_types feature. I don't think it's very useful though, since generic self types also work and are much better in my opinion. Here's a playpen with both of them. Unfortunately, generic self can't be made object-safe for the same reason that all generic methods aren't object-safe.

EDIT: updated the link, thanks @glaebhoerl!

And I have another question which might be off topic. Does this RFC solve the issue rust-lang/rust#28796? My understanding is that it is possible to allow Box<FnOnce()> just work. Or it is an orthogonal problem?

That is unrelated. The issue is that FnOnce::call_once takes self by value and arbitrary_self_types doesn't solve the problem that taking self by value isn't object-safe. That is theoretically solved by RFC 1909 (unsized rvalues); in fact that was one of its biggest selling points. The RFC hasn't been implemented yet though afaik

@glaebhoerl

This comment has been minimized.

Copy link
Contributor

glaebhoerl commented Mar 17, 2018

(@mikeyhew you accidentally linked the same playpen again as in your previous comment)

@withoutboats

This comment has been minimized.

Copy link
Contributor

withoutboats commented Mar 17, 2018

I don't think that the CoerceUnsized requirement is the ultimate solution here, I think it's more of a hack that we can use now (and keep unstable) until we have a better solution.

Agreed! My vision is that we stabilize using the object safe method receivers in std immediately as a generalization of self: Box (that is, Rc and Arc, and - in a week or two under the pin feature flag - Pin and PinBox). However, we keep non-object safe method receivers and coerce_unsized unstable. Basically, we'll have expanded the special case for Box to the other "box-like" types in std, and we can keep iterating on coerce_unsized and trait objects in general before allowing users to define their own custom self types.

@nikomatsakis

This comment has been minimized.

Copy link
Contributor

nikomatsakis commented Mar 19, 2018

@mikeyhew

that description must be out of date, then.

I had a feeling that might be the case.

I don't think that the CoerceUnsized requirement is the ultimate solution here, I think it's more of a hack that we can use now (and keep unstable) until we have a better solution.

I agree, but I think that the right solution feels like it might still be in this general direction: that is, basically we are saying that CoerceUnsized is now defining a "bidirectional" relationship. You can coerce from a Ptr<T> to a Ptr<dyn Trait> and then back again, whereas before we had only the one direction. But being able to go bidirectionally here seems to make sense to me, and it's not clear that we would ever not want that.

@arielb1

This comment has been minimized.

Copy link
Contributor

arielb1 commented Mar 19, 2018

@nikomatsakis

@arielb1 I'm curious what constraints you think are insufficient.

The current definition of CoerceUnsized, together with a constraint like you suggested (forall<T> T: Unsize<dyn Trait> => Rc<T>: CoerceUnsized<Rc<dyn Trait>>) does not imply unsizing coercion always occurs at the "right" vtable type.

That's it, you can write code such as the following (I was not aware of this particular example back then, only suspected that it existed):

#![feature(dyn_trait, coerce_unsized)]

use std::fmt;
use std::ops::CoerceUnsized;

trait Xyz {
    type Output: ?Sized;
}

impl<T> Xyz for T {
    type Output = u32;
}

impl Xyz for dyn fmt::Debug {
    type Output = fmt::Debug;
}

struct Foo<T: Xyz + ?Sized>(Box<<T as Xyz>::Output>);

impl<T> CoerceUnsized<Foo<dyn fmt::Debug>> for Foo<T> {
}

fn main() {
    let x: Foo<()> = Foo(Box::new(2));
    let x: Foo<dyn fmt::Debug> = x;
    println!("{:?}", x.0);
}

This type-checks, and the only way this works is if the vtable in Foo<dyn fmt::Debug> is the vtable for u32, rather than for fmt::Debug. That would obviously pose a problem if we want to "size" it again to Foo<SomethingSpecific>.

However, this code is silly and ICEs, so the better option is probably to fix CoerceUnsized checking to prevent this sort of foolery.

@arielb1

This comment has been minimized.

Copy link
Contributor

arielb1 commented Mar 19, 2018

So on my second look, this looks more like a poor definition of the unstable CoerceUnsized than anything else, and the solution should be to find a better definition for it where this case can no longer occur.

@newpavlov

This comment has been minimized.

Copy link

newpavlov commented Mar 20, 2018

Will this RFC bring us closer to self-consuming object safe trait methods? Ideally I would like to be able to write the following code:

trait Foo {
    fn consume(self: Consumed<Self>) {
        // `Consumed<Self>` implements `DerefMut<Self>`
        // but to create it you'll have to move your data into it
    }
}

// `T` is a concrete type which implements `Foo`
fn bar1(val: T) {
    // this line desugars into `Consumed::new(val).consume()`
    val.consume()
}

fn bar2(val: Box<Foo>) {
    // `val` can not be used after this line
    val.consume()
}
@cramertj

This comment has been minimized.

Copy link
Member

cramertj commented Mar 21, 2018

@newpavlov No, this RFC is an orthogonal feature. Self-consuming object-safe methods dependent upon being able to generate a vtable which consumes Self via a pointer, similar to the take_closure function allowed by the unsized rvalues RFC.

@gralpli

This comment has been minimized.

Copy link

gralpli commented Mar 21, 2018

Does this work for self: Rc<RefCell<Self>>, too?

@MSleepyPanda

This comment has been minimized.

Copy link

MSleepyPanda commented Mar 21, 2018

@gralpli Probably not, because RefCell does not implement Deref

Allow types that implement Deref targeting Self to be the receiver of a method.

Edit:
There is an existing pattern, access the object via this: Rc<RefCell<Self>> instead of self, though it isn't as ergonomic as it could be because you have to call it by Foo::bar(that) instead of that.bar()

@nikomatsakis

This comment has been minimized.

Copy link
Contributor

nikomatsakis commented Mar 21, 2018

@arielb1

However, this code is silly and ICEs, so the better option is probably to fix CoerceUnsized checking to prevent this sort of foolery.

Thanks for the example! We should indeed try to write out the CoerceUnsized rules in some kind of more careful way and ensu. AFAIK the true rules are -- right now -- encoded only in the compiler.

@bill-myers

This comment has been minimized.

Copy link

bill-myers commented Apr 2, 2018

Why does this need Deref at all?

If we have a Foo<dyn Bar>, doesn't that type already include a vtable for trait Bar?

If the concern is that we don't want to add methods to random types, then we could support not having Deref, but only allow UFCS to call the method in that case.

be implemented. Given a reference type `Ptr<dyn Trait>`, the compiler must be
able to prove that `T: Unsize<dyn Trait>` implies `Ptr<T>:
CoerceUnsized<Ptr<dyn Trait>>`. If the compiler can prove this, methods with
these receivers are object safe (how they object safe conversion is implemented

This comment has been minimized.

@parasyte

parasyte Jun 8, 2018

nit: typo they -> the

@Nemo157

This comment has been minimized.

Copy link
Contributor

Nemo157 commented Jul 21, 2018

Something not covered in the RFC and I don't recall seeing discussed is lifetime elision, in the futures codebase there is a function signature:

pub fn get_pin_mut(self: PinMut<'a, Self>) -> PinMut<'a, Si>

If arbitrary self types supported the same elision as normal references this could be:

pub fn get_pin_mut(self: PinMut<Self>) -> PinMut<'_, Si>

But that seems a bit terse, and loses out on the "all types have either & or '_ to denote they contain a lifetime" property of the '_ return lifetime. Maybe this could be allowed to be written

pub fn get_pin_mut(self: PinMut<'_, Self>) -> PinMut<'_, Si>

to remove the need to give a name to this very transient lifetime.

(I found this because Clippy's needless_lifetimes lint has a false positive on this line with the current arbitrary_self_types implementation).

@mikeyhew

This comment has been minimized.

Copy link

mikeyhew commented Jul 21, 2018

@Nemo157 I think going with '_ makes a lot of sense. As far as I'm concerned, the fact that it's not supported right now is a bug.

@Laaas

This comment has been minimized.

Copy link

Laaas commented Aug 30, 2018

so.... will this be merged?

@mikeyhew

This comment has been minimized.

Copy link

mikeyhew commented Aug 30, 2018

Before this gets merged, I think we should either remove the part about object safety, or say that the way object-safety is determined will remain unstable for a while, and will probably need a new RFC for a final decision to be made. Right now, in the implementation, we are using a new, unstable trait called CoerceSized that has to be implemented in order for a receiver type to be considered object-safe. While I think it probably doesn't need to be decoupled from CoerceUnsized, doing so is the more conservative choice, and so that's what we're doing for now.

Another change that I'm mostly in favour of, is calling the self argument the method's receiver, to keep from confusing lowercase self and uppercase Self. Under this logic, "custom method receivers" would be a better name for this feature.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment