Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Tracking issue for `arbitrary_self_types` #44874

Open
arielb1 opened this issue Sep 26, 2017 · 88 comments
Open

Tracking issue for `arbitrary_self_types` #44874

arielb1 opened this issue Sep 26, 2017 · 88 comments

Comments

@arielb1
Copy link
Contributor

@arielb1 arielb1 commented Sep 26, 2017

Tracking issue for #![feature(arbitrary_self_types)].

This needs an RFC before stabilization, and also requires the following issues to be handled:

  • figure out the object safety situation
  • figure out the handling of inference variables behind raw pointers
  • decide whether we want safe virtual raw pointer methods

Object Safety

See #27941 (comment)

Handling of inference variables

Calling a method on *const _ could now pick impls of the form

impl RandomType {
    fn foo(*const Self) {}
}

Because method dispatch wants to be "limited", this won't really work, and as with the existing situation on &_ we should be emitting an "the type of this value must be known in this context" error.

This feels like fairly standard inference breakage, but we need to check the impact of this before proceeding.

Safe virtual raw pointer methods

e.g. this is UB, so we might want to force the call <dyn Foo as Foo>::bar to be unsafe somehow - e.g. by not allowing dyn Foo to be object safe unless bar was an unsafe fn

trait Foo {
    fn bar(self: *const Self);
}

fn main() {
    // creates a raw pointer with a garbage vtable
    let foo: *const dyn Foo = unsafe { mem::transmute([0usize, 0x1000usize]) };
    // and call it
    foo.bar(); // this is UB
}

However, even today you could UB in safe code with mem::size_of_val(foo) on the above code, so this might not be actually a problem.

More information

There's no reason the self syntax has to be restricted to &T, &mut T and Box<T>, we should allow for more types there, e.g.

trait MyStuff {
    fn do_async_task(self: Rc<Self>);
}

impl MyStuff for () {
    fn do_async_task(self: Rc<Self>) {
        // ...
    }
}

Rc::new(()).do_async_stuff();

This doesn't have an RFC, but we want to experiment on this without one.

See #27941.

@arielb1 arielb1 changed the title Allow methods with arbitrary self-types Allow trait methods with arbitrary self-types Sep 26, 2017
@porky11
Copy link

@porky11 porky11 commented Sep 28, 2017

Why would you need this?
Why wouldn't you write an impl like this:

impl MyStuff for Rc<()> {
    fn do_async_task(self) {
        // ...
    }
}

I'd rather define the trait different. Maybe like this:

trait MyStuff: Rc {
    fn do_async_task(self);
}

In this case, Rc would be a trait type. If every generic type implemented a specific trait (this could be implemented automatically for generic types) this seems more understandable to me.

@cuviper
Copy link
Member

@cuviper cuviper commented Sep 28, 2017

This could only be allowed for trait methods, right?

For inherent methods, I can't impl Rc<MyType>, but if impl MyType can add methods with self: Rc<Self>, it seems like that would enable weird method shadowing.

@arielb1
Copy link
Contributor Author

@arielb1 arielb1 commented Oct 1, 2017

@cuviper

This is still pending lang team decisions (I hope there will be at least 1 RFC) but I think it will only be allowed for trait method impls.

@arielb1
Copy link
Contributor Author

@arielb1 arielb1 commented Oct 1, 2017

@porky11

You can't implement anything for Rc<YourType> from a crate that does not own the trait.

@arielb1
Copy link
Contributor Author

@arielb1 arielb1 commented Oct 2, 2017

So changes needed:

  • remove the current error message for trait methods only, but still have a feature gate.
  • make sure fn(self: Rc<Self>) doesn't accidentally become object-safe
  • make sure method dispatch woks for Rc<Self> methods
  • add tests
@arielb1 arielb1 added E-mentor and removed E-needs-mentor labels Oct 2, 2017
@SimonSapin
Copy link
Contributor

@SimonSapin SimonSapin commented Oct 2, 2017

I’ll look into this.

@arielb1
Copy link
Contributor Author

@arielb1 arielb1 commented Oct 2, 2017

Note that this is only supported to work with trait methods (and trait impl methods), aka

trait Foo {
    fn foo(self: Rc<Self>);
}
impl Foo for () {
    fn foo(self: Rc<Self>) {}
}

and is NOT supposed to work for inherent impl methods:

struct Foo;
impl Foo {
    fn foo(self: Rc<Self>) {}
}
@SimonSapin
Copy link
Contributor

@SimonSapin SimonSapin commented Oct 6, 2017

I got caught in some more Stylo work that's gonna take a while, so if someone else wants to work on this in the meantime feel free.

@kennytm
Copy link
Member

@kennytm kennytm commented Oct 6, 2017

Is this supposed to allow any type as long as it involves Self? Or must it impl Deref<Target=Self>?

trait MyStuff {
    fn a(self: Option<Self>);
    fn b(self: Result<Self, Self>);
    fn c(self: (Self, Self, Self));
    fn d(self: Box<Box<Self>>);
}

impl MyStuff for i32 {
   ...
}

Some(1).a();  // ok?
Ok(2).b();  // ok?
(3, 4, 5).c(); // ok?
(box box 6).d(); // ok?
kennytm added a commit to kennytm/rust that referenced this issue Oct 10, 2017
…ents, r=nikomatsakis

Update comments referring to old check_method_self_type

I was browsing the code base, trying to figure out how rust-lang#44874 could be implemented, and noticed some comments that were out of date and a bit misleading (`check_method_self_type` has since been renamed to `check_method_receiver`). Thought it would be an easy first contribution to Rust!
kennytm added a commit to kennytm/rust that referenced this issue Oct 10, 2017
…ents, r=nikomatsakis

Update comments referring to old check_method_self_type

I was browsing the code base, trying to figure out how rust-lang#44874 could be implemented, and noticed some comments that were out of date and a bit misleading (`check_method_self_type` has since been renamed to `check_method_receiver`). Thought it would be an easy first contribution to Rust!
@mikeyhew
Copy link
Contributor

@mikeyhew mikeyhew commented Oct 10, 2017

I've started working on this issue. You can see my progress on this branch

@mikeyhew
Copy link
Contributor

@mikeyhew mikeyhew commented Nov 2, 2017

@arielb1 You seem adamant that this should only be allowed for traits and not structs. Aside from method shadowing, are there other concerns?

@arielb1
Copy link
Contributor Author

@arielb1 arielb1 commented Nov 2, 2017

@mikeyhew

inherent impl methods are loaded based on the type. You shouldn't be able to add a method to Rc<YourType> that is usable without any use statement.

@arielb1
Copy link
Contributor Author

@arielb1 arielb1 commented Nov 2, 2017

That's it, if you write something like

trait Foo {
    fn bar(self: Rc<Self>);
}

Then it can only be used if the trait Foo is in-scope. Even if you do something like fn baz(self: u32); that only works for modules that use the trait.

If you write an inherent impl, then it can be called without having the trait in-scope, which means we have to be more careful to not allow these sorts of things.

@mikeyhew
Copy link
Contributor

@mikeyhew mikeyhew commented Nov 3, 2017

@arielb1 Can you give an example of what we want to avoid? I'm afraid I don't really see what the issue is. A method you define to take &self will still be callable on Rc<Self>, the same as if you define it to take self: Rc<Self>. And the latter only affectsRc<MyStruct>, not Rc<T> in general.

@mikeyhew
Copy link
Contributor

@mikeyhew mikeyhew commented Nov 4, 2017

I've been trying to figure out how we can support dynamic dispatch with arbitrary self types. Basically we need a way to take a CustomPointer<Trait>, and do two things: (1) extract the vtable, so we can call the method, and (2) turn it into a CustomPointer<T> without knowing T.

(1) is pretty straightforward: call Deref::deref and extract the vtable from that. For (2), we'll effectively need to do the opposite of how unsized coercions are implemented for ADTs. We don't know T, but we can can coerce to CustomPointer<()>, assuming CustomPointer<()> has the same layout as CustomPointer<T> for all T: Sized. (Is that true?)

The tough question is, how do we get the type CustomPointer<()>? It looks simple in this case, but what if CustomPointer had multiple type parameters and we had a CustomPointer<Trait, Trait>? Which type parameter do we switch with ()? In the case of unsized coercions, it's easy, because the type to coerce to is given to us. Here, though, we're on our own.

@arielb1 @nikomatsakis any thoughts?

@nikomatsakis
Copy link
Contributor

@nikomatsakis nikomatsakis commented Nov 4, 2017

@arielb1

and is NOT supposed to work for inherent impl methods:

Wait, why do you not want it work for inherent impl methods? Because of scoping? I'm confused. =)

@nikomatsakis
Copy link
Contributor

@nikomatsakis nikomatsakis commented Nov 4, 2017

@mikeyhew

I've been trying to figure out how we can support dynamic dispatch with arbitrary self types.

I do want to support that, but I expected it to be out of scope for this first cut. That is, I expected that if a trait uses anything other than self, &self, &mut self, or self: Box<Self> it would be considered no longer object safe.

@mikeyhew
Copy link
Contributor

@mikeyhew mikeyhew commented Nov 4, 2017

@nikomatsakis

I do want to support that, but I expected it to be out of scope for this first cut.

I know, but I couldn't help looking into it, it's all very interesting to me :)

@arielb1
Copy link
Contributor Author

@arielb1 arielb1 commented Nov 5, 2017

Wait, why do you not want it work for inherent impl methods? Because of scoping? I'm confused. =)

We need some sort of "orphan rule" to at least prevent people from doing things like this:

struct Foo;
impl Foo {
    fn frobnicate<T>(self: Vec<T>, x: Self) { /* ... */ }
}

Because then every crate in the world can call my_vec.frobnicate(...); without importing anything, so if 2 crates do this there's a conflict when we link them together.

Maybe the best way to solve this would be to require self to be a "thin pointer to Self" in some way (we can't use Deref alone because it doesn't allow for raw pointers - but Deref + deref of raw pointers, or eventually an UnsafeDeref trait that reifies that - would be fine).

I think that if we have the deref-back requirement, there's no problem with allowing inherent methods - we just need to change inherent method search a bit to also look at defids of derefs. So that's probably a better idea than restricting to trait methods only.

Note that the CoerceSized restriction for object safety is orthogonal if we want allocators:

struct Foo;
impl Tr for Foo {
    fn frobnicate<A: Allocator+?Sized>(self: RcWithAllocator<Self, A>) { /* ... */ }
}

Where an RcWithAllocator<Self, A> can be converted to a doubly-fat RcWithAllocator<Tr, Allocator>.

@mikeyhew
Copy link
Contributor

@mikeyhew mikeyhew commented Nov 5, 2017

@arielb1

Because then every crate in the world can call my_vec.frobnicate(...); without importing anything, so if 2 crates do this there's a conflict when we link them together.

Are saying is that there would be a "conflicting symbols for architechture x86_64..." linker error?

Maybe the best way to solve this would be to require self to be a "thin pointer to Self" in some way (we can't use Deref alone because it doesn't allow for raw pointers - but Deref + deref of raw pointers, or eventually an UnsafeDeref trait that reifies that - would be fine).

I'm confused, are you still talking about frobnicate here, or have you moved on to the vtable stuff?

@arielb1
Copy link
Contributor Author

@arielb1 arielb1 commented Nov 5, 2017

I'm confused, are you still talking about frobnicate here, or have you moved on to the vtable stuff?

The deref-back requirement is supposed to be for everything, not only object-safety. It prevents the problem when one person does

struct MyType;
impl MyType {
    fn foo<T>(self: Vec<(MyType, T)>) { /* ... */ }
}   

While another person does

struct OurType;
impl OurType {
    fn foo<T>(self: Vec<(T, OurType)>) {/* ... */ }
}   

And now you have a conflict on Vec<(MyType, OurType)>. If you include the deref-back requirement, there is no problem with allowing inherent impls.

@mikeyhew
Copy link
Contributor

@mikeyhew mikeyhew commented Jan 25, 2019

@withoutboats Ohh, good point, I forgot about that. That means there's potentially a real difference between *mut T and Option<NonNull<T>>.

@kennytm I think you're confusing Option<NonNull<dyn Trait>> with Option<dyn Trait>

@kennytm
Copy link
Member

@kennytm kennytm commented Jan 26, 2019

@mikeyhew Oh right, sorry.

@arielb1
Copy link
Contributor Author

@arielb1 arielb1 commented Jan 26, 2019

@mikeyhew

Are you talking about Option<NonNull<*mut dyn Trait>> (which is a type that makes sense) or about NonNull<dyn Trait> (which I don't think makes much sense)?

@arielb1
Copy link
Contributor Author

@arielb1 arielb1 commented Jan 26, 2019

BTW, from an API design standpoint, one thing that would work would be to have a special ArbitrarySelfTypePtr trait (bikeshed on name):

trait ArbitrarySelfTypePtr {
    type Next: ?Sized;
}

impl<T: Deref + ?Sized> ArbitrarySelfTypePtr for T {
    type Next = T::Target;
}

impl<T: ?Sized> ArbitrarySelfTypePtr for *mut T {
    type Next = T;
}

impl<T: ?Sized> ArbitrarySelfTypePtr for *const T {
    type Next = T;
}

impl<T: ?Sized> ArbitrarySelfTypePtr for NonNull<T> {
    type Next = T;
}

// Maybe, if we really want that?
impl<T: ?Sized> ArbitrarySelfTypePtr for Option<T> {
    type Next = T;
}
@arielb1
Copy link
Contributor Author

@arielb1 arielb1 commented Jan 26, 2019

The "ArbitrarySelfTypePtr" exists solely so we can have a "sane" deref chain for inherent methods etc.

@raphaelcohn
Copy link

@raphaelcohn raphaelcohn commented Jan 27, 2019

@arielb1 I like your suggested ArbitrarySelfTypePtr trait because it logically documents what is and isn't possible for a self type. It would make it easy to refer to in compiler errors, eg an arbitrary self type must implement the trait ArbitrarySelfTypePtr.

BTW, for the avoidance of doubt, I'm not absolutely wedded to the Option<NonNull<T>> requirement, it's just that it logically seemed it should exist.

@kellerkindt
Copy link
Contributor

@kellerkindt kellerkindt commented Apr 1, 2019

Did anyone consider f(self: Weak<Self>)?
In my point of view this seems reasonable since Arcand Rc are already allowed.

@mark-i-m
Copy link
Member

@mark-i-m mark-i-m commented Apr 1, 2019

That seems unsafe since the self pointer could then be dropped while yhe method is executing, leaving self to dangle.

@kellerkindt
Copy link
Contributor

@kellerkindt kellerkindt commented Apr 1, 2019

That seems unsafe since the self pointer could then be dropped while yhe method is executing, leaving self to dangle.

Hmm? Am I missing something? (probably)
Isn't it just an ergonomic 'shortcut', being able to write f(self: Weak<Self>) instead of i.e. f(me: Weak<Self>)?
Thus it is still required to call self.upgrade() to access Self which isn't unsafe.

@mikeyhew
Copy link
Contributor

@mikeyhew mikeyhew commented Apr 1, 2019

Weak<Self> is in the same boat as *const/mut Self and NonNull<Self>: it's safe to call a method with it, but it doesn't implement Deref. Right now the unstable arbitrary_self_types feature only supports receiver types that implement Deref, plus *const/mut Self as a special case.

@kellerkindt
Copy link
Contributor

@kellerkindt kellerkindt commented Apr 1, 2019

Weak<Self> is in the same boat as *const/mut Self and NonNull<Self>: it's safe to call a method with it, but it doesn't implement Deref. Right now the unstable arbitrary_self_types feature only supports receiver types that implement Deref, plus *const/mut Self as a special case.

Thank you for the clarification!

Any chance Weak<Self> could be added to the additional special cases? (caution rust-compiler-newbie speaking) Looking at the current state, it seems that one would need to handle another speicialised case in receiver_is_valid or more precisely in Autoderef and it would need to be called like include_raw_pointers. From the naming scheme that feels odd(?) to me, because Weak<Self> can fail to "Deref" to Self (well, as you mentioned, like *const/mut Self ).

Maybe another reason for an ArbitrarySelfType-likish trait?

@mikeyhew
Copy link
Contributor

@mikeyhew mikeyhew commented Apr 2, 2019

Yeah, I think using a separate trait from Deref as @arielb1 suggested would be a good idea. There is a running list of types that would benefit from being method receivers, but can't implement Deref:

  • *const/mut Self
  • Weak<Self> (both Rc and Arc versions)
  • NonNull<Self>
  • RefCell<Self>, and by composition, Rc<RefCell<Self>>
  • similarly, Mutex<Self>
  • Option<Self> and Result<E, Self> (these ones are questionable)

Of the above list, all but Option and Result could produce a valid *const Self and therefore could implement a hypothetical UnsafeDeref trait. However, I'm starting to think that it would be better to have a trait that is specifically used for method lookup, whether or not it is decided thatOption<Self> should be usable as a method receiver or not.

Also, here is a potential name for the trait: MethodReceiver

@eddyb
Copy link
Member

@eddyb eddyb commented Apr 9, 2019

We can do a similar trick to CoerceUnsized and whitelist MethodReceiver in the compiler for pointers/references, and require that any structs that implement MethodReceiver, themselves contain exactly one field that reaches the target type after any number of applications of MethodReceiver.

This differs from CoerceUnsized in that it uses an associated type, like Deref, instead of a parameter, so it's less flexible, but it allows repeated application in the compiler.

Doing it without a method for performing a "deref" operation means x.foo() -> <_>::foo(x) sugar can be decoupled from any kind of safety concerns.
(That still leaves x.foo() -> (***x).foo() sugar but maybe those can be decoupled?)

moz-v2v-gh pushed a commit to mozilla/gecko-dev that referenced this issue May 21, 2020
…. r=froydnj

This matches how the `Dispatch(already_AddRefed<nsIRunnable>)`
overloads work in C++: `Dispatch` takes ownership of the runnable, and
leaks it if dispatch fails—because the thread manager is shutting down,
for instance. This avoids a race where a runnable can be released on
either the owning or target thread.

Rust doesn't allow arbitrary `Self` types yet (see
rust-lang/rust#44874), so we need to change `dispatch` and
`dispatch_with_options` to be associated methods.

Differential Revision: https://phabricator.services.mozilla.com/D75858
xeonchen pushed a commit to xeonchen/gecko that referenced this issue May 21, 2020
…. r=froydnj

This matches how the `Dispatch(already_AddRefed<nsIRunnable>)`
overloads work in C++: `Dispatch` takes ownership of the runnable, and
leaks it if dispatch fails—because the thread manager is shutting down,
for instance. This avoids a race where a runnable can be released on
either the owning or target thread.

Rust doesn't allow arbitrary `Self` types yet (see
rust-lang/rust#44874), so we need to change `dispatch` and
`dispatch_with_options` to be associated methods.

Differential Revision: https://phabricator.services.mozilla.com/D75858
@Zenithsiz
Copy link

@Zenithsiz Zenithsiz commented Sep 26, 2020

Wanted to comment a small pain point I've encountered using this feature I haven't seen mentioned yet.

If we have a method receiving a *const T, such as fn len(self: *const Self), we can't call it directly with a mutable pointer, *mut T, although we can just cast it and call it manually with (var as *const T).len().
Might be worth adding a special case for this, just as we can call &self methods with a &mut reference.

Example:

#![feature(arbitrary_self_types)]

pub trait Test {
    fn a(self: *const Self);
}

impl Test for u32 {
    fn a(self: *const Self) {}
}

pub fn main() {
    let mut a = 5;
    
    let a_ptr: *const u32 = &mut a;
    // Works fine
    a_ptr.a();
    
    let a_mut_ptr: *mut u32 = &mut a;
    // Currently works
    (a_mut_ptr as *const u32).a();
    // Currently results in error "method not found in `*mut u32`"
    // Would result to calling `(a_mut_ptr as *const u32).len()`
    a_mut_ptr.a();
}

Playground

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Linked pull requests

Successfully merging a pull request may close this issue.

None yet
You can’t perform that action at this time.