Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Tracking issue for `arbitrary_self_types` #44874

Open
arielb1 opened this Issue Sep 26, 2017 · 86 comments

Comments

Projects
None yet
@arielb1
Copy link
Contributor

arielb1 commented Sep 26, 2017

Tracking issue for #![feature(arbitrary_self_types)].

This needs an RFC before stabilization, and also requires the following issues to be handled:

  • figure out the object safety situation
  • figure out the handling of inference variables behind raw pointers
  • decide whether we want safe virtual raw pointer methods

Object Safety

See #27941 (comment)

Handling of inference variables

Calling a method on *const _ could now pick impls of the form

impl RandomType {
    fn foo(*const Self) {}
}

Because method dispatch wants to be "limited", this won't really work, and as with the existing situation on &_ we should be emitting an "the type of this value must be known in this context" error.

This feels like fairly standard inference breakage, but we need to check the impact of this before proceeding.

Safe virtual raw pointer methods

e.g. this is UB, so we might want to force the call <dyn Foo as Foo>::bar to be unsafe somehow - e.g. by not allowing dyn Foo to be object safe unless bar was an unsafe fn

trait Foo {
    fn bar(self: *const Self);
}

fn main() {
    // creates a raw pointer with a garbage vtable
    let foo: *const dyn Foo = unsafe { mem::transmute([0usize, 0x1000usize]) };
    // and call it
    foo.bar(); // this is UB
}

However, even today you could UB in safe code with mem::size_of_val(foo) on the above code, so this might not be actually a problem.

More information

There's no reason the self syntax has to be restricted to &T, &mut T and Box<T>, we should allow for more types there, e.g.

trait MyStuff {
    fn do_async_task(self: Rc<Self>);
}

impl MyStuff for () {
    fn do_async_task(self: Rc<Self>) {
        // ...
    }
}

Rc::new(()).do_async_stuff();

This doesn't have an RFC, but we want to experiment on this without one.

See #27941.

@arielb1 arielb1 changed the title Allow methods with arbitrary self-types Allow trait methods with arbitrary self-types Sep 26, 2017

@porky11

This comment has been minimized.

Copy link

porky11 commented Sep 28, 2017

Why would you need this?
Why wouldn't you write an impl like this:

impl MyStuff for Rc<()> {
    fn do_async_task(self) {
        // ...
    }
}

I'd rather define the trait different. Maybe like this:

trait MyStuff: Rc {
    fn do_async_task(self);
}

In this case, Rc would be a trait type. If every generic type implemented a specific trait (this could be implemented automatically for generic types) this seems more understandable to me.

@cuviper

This comment has been minimized.

Copy link
Member

cuviper commented Sep 28, 2017

This could only be allowed for trait methods, right?

For inherent methods, I can't impl Rc<MyType>, but if impl MyType can add methods with self: Rc<Self>, it seems like that would enable weird method shadowing.

@arielb1

This comment has been minimized.

Copy link
Contributor Author

arielb1 commented Oct 1, 2017

@cuviper

This is still pending lang team decisions (I hope there will be at least 1 RFC) but I think it will only be allowed for trait method impls.

@arielb1

This comment has been minimized.

Copy link
Contributor Author

arielb1 commented Oct 1, 2017

@porky11

You can't implement anything for Rc<YourType> from a crate that does not own the trait.

@arielb1

This comment has been minimized.

Copy link
Contributor Author

arielb1 commented Oct 2, 2017

So changes needed:

  • remove the current error message for trait methods only, but still have a feature gate.
  • make sure fn(self: Rc<Self>) doesn't accidentally become object-safe
  • make sure method dispatch woks for Rc<Self> methods
  • add tests

@arielb1 arielb1 added E-mentor and removed E-needs-mentor labels Oct 2, 2017

@SimonSapin

This comment has been minimized.

Copy link
Contributor

SimonSapin commented Oct 2, 2017

I’ll look into this.

@arielb1

This comment has been minimized.

Copy link
Contributor Author

arielb1 commented Oct 2, 2017

Note that this is only supported to work with trait methods (and trait impl methods), aka

trait Foo {
    fn foo(self: Rc<Self>);
}
impl Foo for () {
    fn foo(self: Rc<Self>) {}
}

and is NOT supposed to work for inherent impl methods:

struct Foo;
impl Foo {
    fn foo(self: Rc<Self>) {}
}
@SimonSapin

This comment has been minimized.

Copy link
Contributor

SimonSapin commented Oct 6, 2017

I got caught in some more Stylo work that's gonna take a while, so if someone else wants to work on this in the meantime feel free.

@kennytm

This comment has been minimized.

Copy link
Member

kennytm commented Oct 6, 2017

Is this supposed to allow any type as long as it involves Self? Or must it impl Deref<Target=Self>?

trait MyStuff {
    fn a(self: Option<Self>);
    fn b(self: Result<Self, Self>);
    fn c(self: (Self, Self, Self));
    fn d(self: Box<Box<Self>>);
}

impl MyStuff for i32 {
   ...
}

Some(1).a();  // ok?
Ok(2).b();  // ok?
(3, 4, 5).c(); // ok?
(box box 6).d(); // ok?

kennytm added a commit to kennytm/rust that referenced this issue Oct 10, 2017

Rollup merge of rust-lang#45099 - mikeyhew:fix-astconv-self-type-comm…
…ents, r=nikomatsakis

Update comments referring to old check_method_self_type

I was browsing the code base, trying to figure out how rust-lang#44874 could be implemented, and noticed some comments that were out of date and a bit misleading (`check_method_self_type` has since been renamed to `check_method_receiver`). Thought it would be an easy first contribution to Rust!

kennytm added a commit to kennytm/rust that referenced this issue Oct 10, 2017

Rollup merge of rust-lang#45099 - mikeyhew:fix-astconv-self-type-comm…
…ents, r=nikomatsakis

Update comments referring to old check_method_self_type

I was browsing the code base, trying to figure out how rust-lang#44874 could be implemented, and noticed some comments that were out of date and a bit misleading (`check_method_self_type` has since been renamed to `check_method_receiver`). Thought it would be an easy first contribution to Rust!
@mikeyhew

This comment has been minimized.

Copy link
Contributor

mikeyhew commented Oct 10, 2017

I've started working on this issue. You can see my progress on this branch

@mikeyhew

This comment has been minimized.

Copy link
Contributor

mikeyhew commented Nov 2, 2017

@arielb1 You seem adamant that this should only be allowed for traits and not structs. Aside from method shadowing, are there other concerns?

@arielb1

This comment has been minimized.

Copy link
Contributor Author

arielb1 commented Nov 2, 2017

@mikeyhew

inherent impl methods are loaded based on the type. You shouldn't be able to add a method to Rc<YourType> that is usable without any use statement.

@arielb1

This comment has been minimized.

Copy link
Contributor Author

arielb1 commented Nov 2, 2017

That's it, if you write something like

trait Foo {
    fn bar(self: Rc<Self>);
}

Then it can only be used if the trait Foo is in-scope. Even if you do something like fn baz(self: u32); that only works for modules that use the trait.

If you write an inherent impl, then it can be called without having the trait in-scope, which means we have to be more careful to not allow these sorts of things.

@mikeyhew

This comment has been minimized.

Copy link
Contributor

mikeyhew commented Nov 3, 2017

@arielb1 Can you give an example of what we want to avoid? I'm afraid I don't really see what the issue is. A method you define to take &self will still be callable on Rc<Self>, the same as if you define it to take self: Rc<Self>. And the latter only affectsRc<MyStruct>, not Rc<T> in general.

@mikeyhew

This comment has been minimized.

Copy link
Contributor

mikeyhew commented Nov 4, 2017

I've been trying to figure out how we can support dynamic dispatch with arbitrary self types. Basically we need a way to take a CustomPointer<Trait>, and do two things: (1) extract the vtable, so we can call the method, and (2) turn it into a CustomPointer<T> without knowing T.

(1) is pretty straightforward: call Deref::deref and extract the vtable from that. For (2), we'll effectively need to do the opposite of how unsized coercions are implemented for ADTs. We don't know T, but we can can coerce to CustomPointer<()>, assuming CustomPointer<()> has the same layout as CustomPointer<T> for all T: Sized. (Is that true?)

The tough question is, how do we get the type CustomPointer<()>? It looks simple in this case, but what if CustomPointer had multiple type parameters and we had a CustomPointer<Trait, Trait>? Which type parameter do we switch with ()? In the case of unsized coercions, it's easy, because the type to coerce to is given to us. Here, though, we're on our own.

@arielb1 @nikomatsakis any thoughts?

@nikomatsakis

This comment has been minimized.

Copy link
Contributor

nikomatsakis commented Nov 4, 2017

@arielb1

and is NOT supposed to work for inherent impl methods:

Wait, why do you not want it work for inherent impl methods? Because of scoping? I'm confused. =)

@nikomatsakis

This comment has been minimized.

Copy link
Contributor

nikomatsakis commented Nov 4, 2017

@mikeyhew

I've been trying to figure out how we can support dynamic dispatch with arbitrary self types.

I do want to support that, but I expected it to be out of scope for this first cut. That is, I expected that if a trait uses anything other than self, &self, &mut self, or self: Box<Self> it would be considered no longer object safe.

@mikeyhew

This comment has been minimized.

Copy link
Contributor

mikeyhew commented Nov 4, 2017

@nikomatsakis

I do want to support that, but I expected it to be out of scope for this first cut.

I know, but I couldn't help looking into it, it's all very interesting to me :)

@arielb1

This comment has been minimized.

Copy link
Contributor Author

arielb1 commented Nov 5, 2017

Wait, why do you not want it work for inherent impl methods? Because of scoping? I'm confused. =)

We need some sort of "orphan rule" to at least prevent people from doing things like this:

struct Foo;
impl Foo {
    fn frobnicate<T>(self: Vec<T>, x: Self) { /* ... */ }
}

Because then every crate in the world can call my_vec.frobnicate(...); without importing anything, so if 2 crates do this there's a conflict when we link them together.

Maybe the best way to solve this would be to require self to be a "thin pointer to Self" in some way (we can't use Deref alone because it doesn't allow for raw pointers - but Deref + deref of raw pointers, or eventually an UnsafeDeref trait that reifies that - would be fine).

I think that if we have the deref-back requirement, there's no problem with allowing inherent methods - we just need to change inherent method search a bit to also look at defids of derefs. So that's probably a better idea than restricting to trait methods only.

Note that the CoerceSized restriction for object safety is orthogonal if we want allocators:

struct Foo;
impl Tr for Foo {
    fn frobnicate<A: Allocator+?Sized>(self: RcWithAllocator<Self, A>) { /* ... */ }
}

Where an RcWithAllocator<Self, A> can be converted to a doubly-fat RcWithAllocator<Tr, Allocator>.

@mikeyhew

This comment has been minimized.

Copy link
Contributor

mikeyhew commented Nov 5, 2017

@arielb1

Because then every crate in the world can call my_vec.frobnicate(...); without importing anything, so if 2 crates do this there's a conflict when we link them together.

Are saying is that there would be a "conflicting symbols for architechture x86_64..." linker error?

Maybe the best way to solve this would be to require self to be a "thin pointer to Self" in some way (we can't use Deref alone because it doesn't allow for raw pointers - but Deref + deref of raw pointers, or eventually an UnsafeDeref trait that reifies that - would be fine).

I'm confused, are you still talking about frobnicate here, or have you moved on to the vtable stuff?

@arielb1

This comment has been minimized.

Copy link
Contributor Author

arielb1 commented Nov 5, 2017

I'm confused, are you still talking about frobnicate here, or have you moved on to the vtable stuff?

The deref-back requirement is supposed to be for everything, not only object-safety. It prevents the problem when one person does

struct MyType;
impl MyType {
    fn foo<T>(self: Vec<(MyType, T)>) { /* ... */ }
}   

While another person does

struct OurType;
impl OurType {
    fn foo<T>(self: Vec<(T, OurType)>) {/* ... */ }
}   

And now you have a conflict on Vec<(MyType, OurType)>. If you include the deref-back requirement, there is no problem with allowing inherent impls.

@raphaelcohn

This comment has been minimized.

Copy link

raphaelcohn commented Jan 23, 2019

Just played around with this a bit and found that whilst a receiver of self: *mut Self is allowed, self: NonNull<Self> isn't (in a trait) and produces a compile error.

error[E0307]: invalid method receiver type: std::ptr::NonNull<Self>
  --> src/main.rs:38:29
   |
38 |     unsafe fn initialize(self: NonNull<Self>, arguments: Self::InitializeArguments);
   |                                ^^^^^^^^^^^^^
   |
   = note: type of `self` must be `Self` or a type that dereferences to it
   = help: consider changing to `self`, `&self`, `&mut self`, or `self: Box<Self>`

To my mind, where I know the receiver is a non-null pointer, but not necessarily initialized properly, this is a useful way of writing code as documentation; particularly for a trait, where the implementor can be certain he hasn't been given null.

It might also be worth considering Option<NonNull<Self>>, too, even if only to warn that it would be better expressed as *mut Self.

@shepmaster

This comment has been minimized.

Copy link
Member

shepmaster commented Jan 23, 2019

self: NonNull<Self> isn't

Because NonNull doesn't implement Deref. See also #48277.

considering Option<NonNull<Self>>

Option also doesn't implement Deref. AFAICT it cannot, at least not without panicking when it's None, which seems like a very bad ergonomic idea.

@SimonSapin

This comment has been minimized.

Copy link
Contributor

SimonSapin commented Jan 23, 2019

*mut Self doesn’t implement Deref either.

@shepmaster

This comment has been minimized.

Copy link
Member

shepmaster commented Jan 23, 2019

*mut Self doesn’t implement Deref either.

Sure, but that was a specific addition (See also #46664):

Maybe the best way to solve this would be to require self to be a "thin pointer to Self" in some way (we can't use Deref alone because it doesn't allow for raw pointers - but Deref + deref of raw pointers, or eventually an UnsafeDeref trait that reifies that - would be fine).

Raw pointers are definitely special-cased in the compiler, while NonNull is a regular struct and not a language item.

@mikeyhew

This comment has been minimized.

Copy link
Contributor

mikeyhew commented Jan 23, 2019

Raw pointers are definitely special-cased in the compiler

Indeed. NonNull<Self> is something that really should be supported at some point (at least if raw pointer receivers are), but it would require a separate trait from Deref that returns *const Self::Target, and we would need an RFC for that

@raphaelcohn

This comment has been minimized.

Copy link

raphaelcohn commented Jan 24, 2019

@mikeyhew Another RFC? That's disappointing to learn. I haven't got the desire to go down that route. At least the thought's there.

@shepmaster I the code I am now commonly writing when dealing with low-level pointer stuff, Option<NonNull<T>> comes up quite a bit. It'd just be nice to have the compiler either deal with it by assuming *mut Self.

@mikeyhew

This comment has been minimized.

Copy link
Contributor

mikeyhew commented Jan 25, 2019

This is the trait that I was thinking of. Both NonNull<T> and Option<NonNull<T>> could implement it.

trait DerefRaw {
    type Target;

    fn deref_raw(&self) -> *const Self::Target;
}
@SimonSapin

This comment has been minimized.

Copy link
Contributor

SimonSapin commented Jan 25, 2019

I’m really not sure about doing this for Option<NonNull<T>>. Isn’t the whole point of it that you must check for None/null before accessing the pointer?

@raphaelcohn

This comment has been minimized.

Copy link

raphaelcohn commented Jan 25, 2019

Option<NonNull<T>> is, in many cases, much easier to work with than *mut T and contains far more expressive power. It also more 'naturally' inevitably gets created by code working with NonNull<T>; suddenly moving to *mut T is weird, and, less-Rusty; one can't simply do Some(non_null) as a result of a function, say. It also allows 'balance' in match clauses, which are shorter to read and more ergonomic than if with is_null(); it does not require the casual reader of code to suddenly understand as much of raw pointers - knowledge of Rust's Option type is enough; and lastly, it makes up for the absence of is_not_null(), for which !x.something.is_null() reads appalling and requires more visual processing (and is easier to miss).

@SimonSapin

This comment has been minimized.

Copy link
Contributor

SimonSapin commented Jan 25, 2019

Sorry, I should have clarified. Using Option<NonNull<T>> in general sounds good. What I’m not sure about is implementing DerefRaw as proposed above for Option<NonNull<T>>. Sure, it has an obvious implementation, but converting to a nullable *const T seems to defeat the point of using Option<NonNull<T>> in the first place.

@withoutboats

This comment has been minimized.

Copy link
Contributor

withoutboats commented Jan 25, 2019

@kennytm

This comment was marked as outdated.

Copy link
Member

kennytm commented Jan 25, 2019

Unless we have decided a representation of a dynamic-sized enum I'm opposed to Option<NonNull<dyn Stuff>>.

@mikeyhew

This comment has been minimized.

Copy link
Contributor

mikeyhew commented Jan 25, 2019

@withoutboats Ohh, good point, I forgot about that. That means there's potentially a real difference between *mut T and Option<NonNull<T>>.

@kennytm I think you're confusing Option<NonNull<dyn Trait>> with Option<dyn Trait>

@kennytm

This comment has been minimized.

Copy link
Member

kennytm commented Jan 26, 2019

@mikeyhew Oh right, sorry.

@arielb1

This comment has been minimized.

Copy link
Contributor Author

arielb1 commented Jan 26, 2019

@mikeyhew

Are you talking about Option<NonNull<*mut dyn Trait>> (which is a type that makes sense) or about NonNull<dyn Trait> (which I don't think makes much sense)?

@arielb1

This comment has been minimized.

Copy link
Contributor Author

arielb1 commented Jan 26, 2019

BTW, from an API design standpoint, one thing that would work would be to have a special ArbitrarySelfTypePtr trait (bikeshed on name):

trait ArbitrarySelfTypePtr {
    type Next: ?Sized;
}

impl<T: Deref + ?Sized> ArbitrarySelfTypePtr for T {
    type Next = T::Target;
}

impl<T: ?Sized> ArbitrarySelfTypePtr for *mut T {
    type Next = T;
}

impl<T: ?Sized> ArbitrarySelfTypePtr for *const T {
    type Next = T;
}

impl<T: ?Sized> ArbitrarySelfTypePtr for NonNull<T> {
    type Next = T;
}

// Maybe, if we really want that?
impl<T: ?Sized> ArbitrarySelfTypePtr for Option<T> {
    type Next = T;
}
@arielb1

This comment has been minimized.

Copy link
Contributor Author

arielb1 commented Jan 26, 2019

The "ArbitrarySelfTypePtr" exists solely so we can have a "sane" deref chain for inherent methods etc.

@raphaelcohn

This comment has been minimized.

Copy link

raphaelcohn commented Jan 27, 2019

@arielb1 I like your suggested ArbitrarySelfTypePtr trait because it logically documents what is and isn't possible for a self type. It would make it easy to refer to in compiler errors, eg an arbitrary self type must implement the trait ArbitrarySelfTypePtr.

BTW, for the avoidance of doubt, I'm not absolutely wedded to the Option<NonNull<T>> requirement, it's just that it logically seemed it should exist.

@kellerkindt

This comment has been minimized.

Copy link
Contributor

kellerkindt commented Apr 1, 2019

Did anyone consider f(self: Weak<Self>)?
In my point of view this seems reasonable since Arcand Rc are already allowed.

@mark-i-m

This comment has been minimized.

Copy link
Contributor

mark-i-m commented Apr 1, 2019

That seems unsafe since the self pointer could then be dropped while yhe method is executing, leaving self to dangle.

@kellerkindt

This comment has been minimized.

Copy link
Contributor

kellerkindt commented Apr 1, 2019

That seems unsafe since the self pointer could then be dropped while yhe method is executing, leaving self to dangle.

Hmm? Am I missing something? (probably)
Isn't it just an ergonomic 'shortcut', being able to write f(self: Weak<Self>) instead of i.e. f(me: Weak<Self>)?
Thus it is still required to call self.upgrade() to access Self which isn't unsafe.

@mikeyhew

This comment has been minimized.

Copy link
Contributor

mikeyhew commented Apr 1, 2019

Weak<Self> is in the same boat as *const/mut Self and NonNull<Self>: it's safe to call a method with it, but it doesn't implement Deref. Right now the unstable arbitrary_self_types feature only supports receiver types that implement Deref, plus *const/mut Self as a special case.

@kellerkindt

This comment has been minimized.

Copy link
Contributor

kellerkindt commented Apr 1, 2019

Weak<Self> is in the same boat as *const/mut Self and NonNull<Self>: it's safe to call a method with it, but it doesn't implement Deref. Right now the unstable arbitrary_self_types feature only supports receiver types that implement Deref, plus *const/mut Self as a special case.

Thank you for the clarification!

Any chance Weak<Self> could be added to the additional special cases? (caution rust-compiler-newbie speaking) Looking at the current state, it seems that one would need to handle another speicialised case in receiver_is_valid or more precisely in Autoderef and it would need to be called like include_raw_pointers. From the naming scheme that feels odd(?) to me, because Weak<Self> can fail to "Deref" to Self (well, as you mentioned, like *const/mut Self ).

Maybe another reason for an ArbitrarySelfType-likish trait?

@mikeyhew

This comment has been minimized.

Copy link
Contributor

mikeyhew commented Apr 2, 2019

Yeah, I think using a separate trait from Deref as @arielb1 suggested would be a good idea. There is a running list of types that would benefit from being method receivers, but can't implement Deref:

  • *const/mut Self
  • Weak<Self> (both Rc and Arc versions)
  • NonNull<Self>
  • RefCell<Self>, and by composition, Rc<RefCell<Self>>
  • similarly, Mutex<Self>
  • Option<Self> and Result<E, Self> (these ones are questionable)

Of the above list, all but Option and Result could produce a valid *const Self and therefore could implement a hypothetical UnsafeDeref trait. However, I'm starting to think that it would be better to have a trait that is specifically used for method lookup, whether or not it is decided thatOption<Self> should be usable as a method receiver or not.

Also, here is a potential name for the trait: MethodReceiver

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.