Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Virtually dispatched trait methods #3440

Open
wants to merge 3 commits into
base: master
Choose a base branch
from

Conversation

blagowtf
Copy link

@blagowtf blagowtf commented May 30, 2023

@Diggsey
Copy link
Contributor

Diggsey commented May 30, 2023

This is probably not what you want to here, but the pattern you are trying to express is fundamentally one of inheritance, not composition.

You are treating DoubleOp2<SystemA> as a poor-mans substitute for inheritance, and so it's not surprising that it's hard to express ergonomically in Rust: you want SystemA to somehow be aware that it's nested inside DoubleOp2, but this really goes against the design. It would frankly be surprising that the behaviour of SystemA would change just because it's stored inside a DoubleOp2.

Instead, dependencies should be specified explicitly, ie. it should be clear that the implementation of op2 can depend on op1, and vice versa.

However, this is another red flag when it comes to the design: generally you don't want cycles in your dependencies. It implies that your supposedly separate components are in fact tightly coupled. It could also lead to bugs: in your example, accidental misuse of the &virt self reference could easily lead to infinite recursion between op1 and op2.

It's difficult to say what the correct design should be when the example is so abstract, but one option is to provide a default implementation of op1:

pub trait System {
    fn op1(&self) -> SystemOp1 {
        self.op2() * 5
    }
    fn op2(&self) -> SystemOp2;
    // etc.
}

This can be adapted further if you want multiple different "op1 behaviours" depending on the implementation by making the result of op2 an explicit input to the op1 implementation:

pub trait System {
    fn op1(&self) -> SystemOp1 {
        self.op1(self.op2())
    }
    fn op2(&self) -> SystemOp2;
    fn op1_impl(&self, op2: SystemOp2);
    // etc.
}

There are plenty more solutions like this that don't require sweeping language changes, are much more concise, whilst being more explicit about dependencies.

@VitWW
Copy link

VitWW commented May 30, 2023

Maybe you could find this helpful: Rfc:what the process is

@blagowtf
Copy link
Author

@VitWW thanks for the comment. I thought the numbering was unrelated to the PR ids, my bad. I have now updated the PR. Or is your comment referencing something else? I wasn't sure.

@blagowtf
Copy link
Author

blagowtf commented May 30, 2023

@Diggsey thanks for the comment.

Firstly, let me say that I wasn't claiming there aren't workarounds. Rust is pretty powerful as it is. But I cannot agree with you that 1) what i'm proposing is "sweeping language" change - it's an opt-in feature that requires a relatively simple grammar change that can express a pattern in what in my eyes is much more consise and flexible of an approach.

In particular, your suggestions of modifying System not only increases the trait surface, but also presupposes the dependency as being of one op1 on op2. But that may not be the case (e.g. it could be the other way or both could share a common dependency on a third op3, etc.). The dependency between the trait methods is not a trait-thing, it's a impl-of-trait thing. And my belief is that &virt self (or similar) is explicit enough to inform users of the api that there is, in fact, a general dynamic dependence on all of self.

I take your point about unintended recursion being a potential issue. However, at least in the real-life case I have in mind there's a contract that must be respected that guarantees recursion never happens. Sadly, I cannot share specifics here, but I'll try to think of a good alternative example where what I'm talking about makes concrete sense.

By the way, I am not sure if I necessarily agree that virtual methods should necessarily be equated to inheritance (dyn as it exists has nothing to do with inheritance in my eyes but exists for a good reason). But be that as it may, if Rust has no classes / OOP and encourages composition instead, then why not support the feature I suggest? I am by no means a fan of OOP, but for some domains it really can be useful.

@burdges
Copy link

burdges commented May 30, 2023

I've wanted inheritance once or twice but always found better ways, at least once by using Deref polymorphism. I think Deref polymorphism is much less of an anti-pattern than full inheritance btw.

There exist various places where folks use manual vtables in rust, either safe or unsafe, so maybe glance at https://internals.rust-lang.org/t/associated-const-fns-can-clean-up-abstraction-layers/18562

@ehuss ehuss added the T-lang Relevant to the language team, which will review and decide on the RFC. label May 30, 2023
@blagowtf
Copy link
Author

blagowtf commented May 30, 2023

Here is a slightly more concrete example.

Imagine we have different models of attractiveness of a being.

pub enum Being {
  Pet,
  Person,
}

pub trait AttrModel {
   // we define all the differenet qualities we may use to model the attractiveness 
   // e.g. these are all the things we have concrete data for or models based on concrete data
   fn intelligence(&self, being: &Being) -> f64;
   fn looks(&self, being: &Being) -> f64;
   fn compatibility(&self, being: &Being) -> f64;
   fn attractive(&self, being: &Being) -> f64;
}

Now, say we have different ideas of what attractiveness could be, e.g.

pub struct Population1;
impl AttrModel for Population1 {
   fn looks(&self, being: &Being) -> f64 {
       // subjective, but according to this model with e.g. data fetched from some db
       fetch_looks_pop1(being)
   }
      fn intelligence(&self, being: &Being) -> f64 {
       // we could measure e.g. by some standard test
      test1(being)
   }
   
    fn compat(&self, being: &Being) -> f64 {
      // some measure of compatibility
       std::cmp::max(self.intelligence(), self.looks())
    }
    
   fn attractive(&self, being: &Being) -> f64 {
      // in this model we really only care about intelligence
      self.intelligence()
   }
}

pub struct Population2;
impl AttrModel for Population2 {
   fn looks(&self, being: &Being) -> f64 {
       // subjective, but according to this model perceived looks is correlated to intelligence as well
       fetch_looks_pop2(being) + 0.2 * self.intelligence()
   }
      fn intelligence(&self, being: &Being) -> f64 {
       // we could measure e.g. by some other standard test
      test2(being)
   }
   
    fn compat(&self, being: &Being) -> f64 {
      // some measure of compatibility, e.g. the average
       0.5 * self.intelligence() + 0.5 * self.looks()
    }
    
   fn attractive(&self, being: &Being) -> f64 {
      // in this model we really only care about a max of all three
      std::cmp::max(self.intelligence(), std::cmp::max(self.looks(), self.compat()));
   }
}

Now, we can have many similar and different models. And we may want to have wrappers that implement standard functionality such as how much more attractive does a given being become if their looks is increased by 1.0

pub struct ExtraIntelligence<A: AttrModel>(A);
impl AttrModel {
    fn looks(&self, being: &Being) -> f64 { self.0.looks(being) }
    fn intelligence(&self, being: &Being) -> f64 { self.0.intelligence(being) + 1.0 }
    fn compat(&self, being: &Being) -> f64 { self.0.compat(being) }
    fn attractive(&self, being: &Being) -> f64 { self.0.attractive(being) }
}

but that will obviously fail because the statically dispatched self.0.attractive will simply not respect the newly increased intelligence so our measure will fail. And because in general, we cannot know how a model measures any of looks, intelligence, or compatibility we cannot simply add explicit arguments.

The whole thing becomes even more complicated if we start returning different things based on the type of being (which is where the potential of what I'm talking about is really obvious).

We could consider pets to be compatible with us if we like how they look (so dispatch to looks) whereas for people we may want to consider some average of looks and intelligence (say). Then attractiveness could simply be expressed as a function of compatibility and now depending on the being, we have different attractiveness which we could then use different wrappers to change slightly more or less.

I hope this example is a bit elucidating and I hope it's not offensive to anyone - I'm just trying to work with everyday concepts that everyone can kind of understand.

@Diggsey
Copy link
Contributor

Diggsey commented May 30, 2023

But I cannot agree with you that 1) what i'm proposing is "sweeping language" change - it's an opt-in feature that requires a relatively simple grammar change that can express a pattern in what in my eyes is much more consise and flexible of an approach.

Rust is not just a bag of features though, it's a cohesive whole. Sticking another feature on the side that doesn't fit with existing conventions is by definition a sweeping change, because it reduces the consistency of the language considerably.

In particular, your suggestions of modifying System not only increases the trait surface

Traits are there to be used, if a problem is complex enough then it warrants being broken down more, and that might mean more functions. It doesn't have to be public surface area of your API necessarily.

but also presupposes the dependency as being of one op1 on op2. But that may not be the case (e.g. it could be the other way or both could share a common dependency on a third op3, etc.)

That's kind of the point though. We're only discussing this because you intend to design a modular system (otherwise you could just implement the logic directly) but without explicitly specifying what depends on what, you cannot make a modular system.

If you have one "system adapter" that reverses the dependency order, then you will get infinite recursion if you try to combine that adapter with one that uses the original dependency order, because it's intrinsically broken to have mutually dependent attributes.

There are a couple of obvious ways to avoid this mutual dependency situation:

  1. Be explicit about the direction of the dependency.
  2. Split the values into a set of "base values" and "dependent values". This way any "dependent value" can depend on any "base value" without causing infinite recursion. Things like ExtraIntelligence can work by modifying the base values, so you can observe changes to the "dependent values".

In both cases a concise implementation naturally falls out that doesn't require new language features.

@blagowtf
Copy link
Author

blagowtf commented May 30, 2023 via email

@VitWW
Copy link

VitWW commented May 31, 2023

@blagowtf I meant

  1. file naming (already fixed)
  2. "Rendered link" in first comment of pull request to full text of RFC
  3. Synopsis (extremely short version in "2 words ") of your proposal at first comment of pull request

@Diggsey
Copy link
Contributor

Diggsey commented Jun 1, 2023

In that sense, I think we may just have to agree to disagree if what I’m proposing fits consistently with the rest of the language.

There is certainly a subjective element to it, but there are also some objective ones:

  • The triumvirate of "owned/exclusive reference/shared reference" crops up all over the place in Rust. In types (T, &T, &mut T) in traits (FnOnce, FnMut, Fn) in methods (, borrow_mut(), borrow()) in states (see RefCell, RwLock, etc.) Basically, it's everywhere. The proposed syntax introduces a fourth kind of reference &virt T which breaks this symmetry, and means we possibly need something like &mut virt T as well.

  • The use-case is very niche. Syntax that is rarely used harms consistency. Consistency is helped by having niche features be rarely used functions rather than new syntax, because it help to avoid the division of the language into "commonly used syntax" and "very rarely used syntax" where many people are only familiar with stuff from the former.

  • virt(x) syntax is unprecedented - Rust doesn't tend to have function-call-like keywords AFAIK.

I could list some more but I think that's enough for now...

I thought self captures
exactly that - a dependence on the object and in the particular case of
traits - the object implementing a trait.

I'm not sure exactly what you mean. A "self capture" (by which I'm assuming you mean "self parameter") is just fancy syntax for passing the first argument to a function. It's not even required - you can always call functions using UFCS, ie. Type::method(receiver).

What are the dependencies that
allow you to drain a hashmap? From a consumer pov, it’s just a mutable
reference to self. From a library designer’s point of view it might be a
whole lot of inner fields and data structures with their invariants.

A HashMap depends on it's Key and Value types. Importantly the behaviour of Keys and Values in a HashMap are the same as if you operated on those Keys and Values directly. That is to say - a Key/Value doesn't "know" it's in a HashMap. Similarly, if you put that HashMap inside another type, say Option<HashMap<...>>, the HashMap doesn't know it's inside an Option and can't behave any differently as a result.

This means that when you create a complex type, like Option<Arc<Mutex<T>>, it is still easy to reason about.

Requiring inner implementation details to leak out through the common API
(the trait) is a big red flag for me.

The problem is that in your scenario they are not implementation details though. The order in which the attributes depend on each other matters to the user of the type. They cannot be sure to use it correctly without understanding that order:

Regarding dependencies and ending up in infinite loops - I think that’s
something that is up to the developer to prevent.

ie. How can the developer prevent this if you've hidden that fact as an "implementation detail"?

By having "base attributes" and "derived attributes" you actually solve this problem:

First, only other combinators (like "ExtraIntelligence") or implementations of the trait (like "Population1") need to care that "base attributes" exist at all. To the downstream developer using these types, they don't need to know about this concept at all.

Secondly, it means the combinators (like "ExtraIntelligence") are actually fully interchangeable. You can combine an "ExtraIntelligence" with an "ExtraBlah" without it causing an infinite loop. And this is why it's OK for the downstream developer to be unaware of these methods.

Copy link
Contributor

@workingjubilee workingjubilee left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems like it adds a lot of syntactic and semantic edge-cases that risk altering much of the rest of the language for the sake of a convenience.

# Unresolved questions
[unresolved-questions]: #unresolved-questions

The case of `mut self`.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This doesn't state what the unresolved question actually is.

Comment on lines +148 to +150
Implementing the above for immutable references is easy. However, mut references are inherently unsafe as we will be aliasing self initially.

Solution is to either outright ban the above for `&mut self` or only allow it in unsafe contexts (e.g. treat it as `mut ptr`).
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is unacceptable for &mut self to have differing semantic guarantees in this context, because this makes analysis of Rust code much harder. This must make clear this would only be allowed with *mut T. Not even unsafe code is allowed to break the aliasing rules concerning &mut T.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I understand that. I guess my point was that just like with refs to statics, the mutable case is more complicated and thus may necessitate unsafe {}. But point well taken!

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

static is unique and special and static mut should be regarded as a mistake, or at least, not something one should build any precedents off of. There are many subtleties regarding statics, like how referencing them is semantically equivalent to borrowing them, last I checked, which are non-obvious and don't really pattern elsewhere in the language. This is to be expected: the rest of the language doesn't really "do" global state, nevermind global mutable state.


If the called method is also declared virt, using `self->op2()` will retain the `vself` the same as originally passed to `op1`. Otherwise, it will replace it by `self as &dyn System`.

Outside of traits, `a->b()` wouldn't compile.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Syntax that is sensitive to the trait context but is not part of the trait's declaration syntax, but rather is part of the expression syntax, is extremely unusual. This would mean that macros that work inside a trait's functions do not work inside other functions. This would be very bad for the ecosystem.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But macros that use this syntax would be meant for the trait case, otherwise normal syntax will still work as it does today. So what is really lost?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Currently, in syntax generation through macros, an expression is an expression and you don't have to think about it beyond that. This would create a "subclass" of $trait_expr that could not be used where all $expr could be used. This is a severe problem for macros concealing their implementation details:

Now macros that pass in an expression using this syntax to the inside of a trait and use that to do some work break if they instead pass the expression through to inside free functions or closures to do that work. In order to create slightly more opacity for traits, it blows a giant hole through opacity for macros.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

( maybe that's a bit strong, but "expressions are expressions" is kiiinda important in a general sense. )

# Drawbacks
[drawbacks]: #drawbacks

I fail to see any major drawbacks besides the override of `->`.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • Addition of context-specific syntax is a significant drawback.
  • Incompatibility with &mut self is a significant drawback and a sign this proposal is insufficiently general.
  • There are many other possible uses we might prefer to put -> to that would conflict.

Please think about and try to enumerate more of the drawbacks beyond what I can think of off the top of my head.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for taking the time to reply. I find your critique insightful.

The reason why I don't mind the context sensitivity (or see it as an issue) is because of e.g. await in async context. But I see why someone may not like more of that. But mind you, from outside the trait what I'm suggesting simply looks like any old trait call.

Regarding mut self, I just meant it's not something that can obviously live in safe Rust and that's ok. To me it's just like yet another form of ref to self (and as such carries its own pros/cons just like &self, &mut self, self, Box, etc.)

Specific syntax is not that important. I am sure there are other proposals about ->.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

const context is another example

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

const for the most part doesn't change what raw syntax is permissible. A function in const is still a function and is still called as a function. You are right about async, but it is distinct in that async is a distinct addition to the language, rather than pervading all traits. By being "boxed up" into well-defined spaces, rather than "all traits", they are better-contained.

Also, the "keyword generics" initiative has been launched to try to find a solution to the problem I mention.


Of course, this behaviour may be what is desired. But sometimes, virtualizing at depth is more appropriate. Certainly when adopting OOP patterns.

When I first stumbled upon this, I wondered if I could simply take `self` as `&dyn System` (or `Arc` or `Box` equivalent, etc.). Nope.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't this allowed by arbitrary self types?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would definitely help for quite a few use cases, yes!

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm. To what degree does #![feature(arbitrary_self_types)] obsolete or outmode this RFC? If what you want (essentially, "more control over vtables") can be built on that more ergonomically, then I think it would be most appropriate to redraft this RFC with that feature in mind.


- What other designs have been considered and what is the rationale for not choosing them?

Explicit argument passing considered above.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This does not give the rationale.


I believe this feature will make Rust more user-friendly to people more inclined to think in OOP terms or who (like me) simply found themselves writing code in a domain that is very amenable to an OOP approach.

The syntax changes are relatively minimal and there is no extra cost for code that does not use this feature.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It will become possible for this recursively virtualized pattern to be invoked via reference to a static, even inside a fully concrete function. My intuition suggests this could make optimization significantly harder in some cases.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Care to elaborate here? Rust already knows when you're calling into a trait (forcing you to import it in scope) so any such call should simply use the ref to static as the vself? Or did i miss your point?

Copy link
Contributor

@workingjubilee workingjubilee Jun 1, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My statement is that by just adding this feature, now functions that internally reference global state can have dependencies that are already hard to resolve at compile time become almost impossible to resolve at compile time, so they may not be statically resolved and thus devirtualized as easily as they could before. This could impact even programs that do not use this feature. It could be that this doesn't actually make anything worse for optimizers, but I suspect it does, so it's probably the case that it is not true that it imposes no costs anywhere else. Many languages, by adding some additional undecidability somewhere, defeat optimization much harder than they may have before.

@workingjubilee
Copy link
Contributor

I think that is something that is up to the developer to prevent

The proposal to add inline_const, which makes it easier to add "post monomorphization errors" to code, has been fairly contentious, despite those being errors caught during code build time. You are proposing to block detection of unconditional recursion by the compiler because developers supposedly can handle it. I believe that if that was true, neither of us would know the name of the 395 page book from Erich Gamma, Richard Helm, Ralph Johnson, and John Vlissides, that spends a notable amount of time admonishing people to do what Rust has made mandatory.

@blagowtf
Copy link
Author

blagowtf commented Jun 1, 2023

I think that is something that is up to the developer to prevent

The proposal to add inline_const, which makes it easier to add "post monomorphization errors" to code, has been fairly contentious, despite those being errors caught during code build time. You are proposing to block detection of unconditional recursion by the compiler because developers supposedly can handle it. I believe that if that was true, neither of us would know the name of the 395 page book from Erich Gamma, Richard Helm, Ralph Johnson, and John Vlissides, that spends a notable amount of time admonishing people to do what Rust has made mandatory.

Just quickly about the detection of recursion. I am empathically not suggesting the blocking of it because 1) my proposal would make it at best conditional recursion 2) existing semantics will still apply for all traits as written currently - virt is in the signature after all

@workingjubilee
Copy link
Contributor

A virtual_fn_a that resolves to a body of virtual_fn_b() which itself resolves to virtual_fn_a() yields unconditional recursion. My understanding, by my reading of the previous discussion, is that this RFC does not prevent that from happening.

@nielsle
Copy link

nielsle commented Jun 2, 2023

Here is an attempt to use existing language features to implement the slightly more concrete example.

pub trait Intelligence {
    fn intelligence(&self) -> i64;
}

pub enum Being {
    Pet,
    Person,
}

impl Intelligence for Being {
    fn intelligence(&self) -> i64 {
        match self {
            Being::Pet => 1,
            Being::Person => 10,
        }
    }
}

pub struct IntelligentBeing<A>(A);

impl<A: Intelligence> Intelligence for IntelligentBeing<A> {
    fn intelligence(&self) -> i64 {
        self.0.intelligence() + 1
    }
}

@kpreid
Copy link

kpreid commented Jun 9, 2023

In OOP theory the unconditional recursion hazard — as well as less blatant misbehaviors such as accidentally bypassing or duplicating some checks or side effects — is known specifically as the fragile base class problem.

(I am not expressing an opinion on this RFC; only hoping the discussion will benefit from additional standard terminology.)

@workingjubilee
Copy link
Contributor

Also, aside from the RFC's contents, for anything along these lines we would probably want to use one of the existing reserved keywords.

@e-dant
Copy link

e-dant commented Jul 29, 2023

I think several short side-by-side examples of where virtual inheritance is the only clean or viable solution to common problems would go a long way. (If these examples exist.)

I think some discussion about what kinds of inheritance are allowed and disallowed is important. I expect some edge cases where the order of initialization, and the behavior of drop, is unsafe or undefined.

I think some discussion of the existing friction in languages that have two or more camps -- one more functional, another more OO -- is important to get anywhere meaningful. Javascript, Scala, C++, Erlang and Lisp (w/ CLOS) stand out as important in that space.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
T-lang Relevant to the language team, which will review and decide on the RFC.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

10 participants