Trait objects for multiple traits #2035

Open
sgrif opened this issue Jun 17, 2017 · 67 comments
Labels
T-lang Relevant to the language team, which will review and decide on the RFC.

Comments

@sgrif
Contributor

sgrif commented Jun 17, 2017

Given arbitrary traits Foo and Bar, it'd be great to have Box<Foo + Bar> compile, but there are unresolved questions that need to be laid out explicitly. (Moved from rust-lang/rust#32220)

@parkovski

parkovski commented Sep 18, 2017

Just dealt with a really frustrating workaround for the lack of upcasting, I'd be interested in working on this. A few questions/comments (retrospectively, maybe more than a few):

It looks like the best way to start out with this is super trait coercion, and then move on to arbitrary combined traits. The first looks more or less straightforward, and the second still has some unanswered tradeoff questions, but I do think it's good to keep the second in mind so that casting is implemented in a way that can eventually segue into multiple-trait pointers.

For super traits, it seems that we could for the most part just concatenate vtables together. However, consider this example:

trait A: Debug + Display { /* pretend there are methods here */ }
trait B: Debug + Display { /* same here */ }
trait C: A + B {}

What does the vtable for Box<C> look like? You can only have one implementation of Debug and Display but if you want to allow coercion to Box<A> or Box<B> and then from either of those to Debug or Display, you basically have to list duplicate implementations of those in A's vtable. So how big does this realistically get and will it be a problem? It seems feasible to me that you might have cases where you implement several traits, and some of them require other custom traits, in addition to the occasional PartialEq, etc, which will lead to lots of repetitions. This gets worse when you allow Box<X + Y + Z> to convert to arbitrary combinations of bounds, potentially, although realistically there may not be enough uses of this. I also have a feeling that trait aliases will encourage people not to overdo it, since you'll just keep passing Box<Alias> rather than different combinations of Box<FirstThingINeed + SecondThingINeed>.
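For context, here is a minimal sketch of the kind of manual workaround people write when `dyn C` cannot be coerced to `dyn Debug` or `dyn Display` directly. The helper trait `AsFmt`, its methods, and the `Point` type are hypothetical names introduced only for illustration; the sketch says nothing about how rustc lays out vtables.

```rust
use std::fmt::{self, Debug, Display};

// Hypothetical helper trait: lets a trait object hand out `&dyn Debug` and
// `&dyn Display` views of itself without any built-in upcasting.
trait AsFmt {
    fn as_debug(&self) -> &dyn Debug;
    fn as_display(&self) -> &dyn Display;
}

// Blanket impl for every sized type that already has both impls.
impl<T: Debug + Display> AsFmt for T {
    fn as_debug(&self) -> &dyn Debug {
        self
    }
    fn as_display(&self) -> &dyn Display {
        self
    }
}

// The hierarchy from the example above, with the helper added as a supertrait.
trait A: Debug + Display + AsFmt {}
trait B: Debug + Display + AsFmt {}
trait C: A + B {}

#[derive(Debug)]
struct Point(i32, i32);

impl Display for Point {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "({}, {})", self.0, self.1)
    }
}

impl A for Point {}
impl B for Point {}
impl C for Point {}

// Goes from `&dyn C` to the two supertrait objects through the helper methods.
fn describe(c: &dyn C) -> String {
    let d: &dyn Debug = c.as_debug();
    let s: &dyn Display = c.as_display();
    format!("{:?} / {}", d, s)
}

fn main() {
    let boxed: Box<dyn C> = Box::new(Point(1, 2));
    println!("{}", describe(&*boxed));
}
```

The cost is one extra method per target trait, which is roughly the boilerplate that supertrait coercion would remove.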

As an aside, there is a possible optimization around the case of Box<A + B> - since we know which combinations are used, we can change the order that supertraits are listed for implementations of those combinations to minimize the number of new vtables created. This pretty much rules out dynamic linking, but I'm not sure that's really feasible anyway for this feature, since you'd basically need a runtime function that scans external vtables and writes new ones for any possible combinations you may use.

Which brings up another question - going from Box<A + B> to Box<B> is easy because you just move the offset. But what if you try to go from Box<A + B + C> to Box<A + C>? Even if you're only storing offset tables, if those three traits appear together often, like say Debug + PartialEq + PartialOrd, that can end up being a lot of offset tables.

The only really feasible option I can think of for this is to have the compiler trace which types have a potential path to any of those combinations, and only generate extra tables for those types. Not familiar at all with internals, so I don't know how much work that would be.

For reference, here's how C++ does virtual inheritance. The problem is that virtual inheritance in C++ is not nearly as common as multiple trait implementation in Rust, so the overhead of writing all the extra tables and offsets isn't as much. On the other hand, most of the common std traits are just one or two methods, so I don't know if it would even help to do it the C++ way with offset tables, and C++ doesn't have a way to pass a pointer to two classes, you just have to use dynamic_cast, so I think the Rust solution is going to be fairly different. I do want to spend some more time looking at this though and seeing what I can come up with for Rust.

@Centril Centril added the T-lang Relevant to the language team, which will review and decide on the RFC. label Dec 6, 2017
@ejmahler

ejmahler commented Apr 8, 2018

Which brings up another question - going from Box<A + B> to Box<B> is easy because you just move the offset. But what if you try to go from Box<A + B + C> to Box<A + C>? Even if you're only storing offset tables, if those three traits appear together often, like say Debug + PartialEq + PartialOrd, that can end up being a lot of offset tables.

Is there any reason you can't have one vtable for A + B + C and an entirely different vtable for A + C? The benefits of reusing the same vtable if possible are clear - but in cases where that's not possible, what's stopping the compiler from generating a new one for just A + C?

@ejmahler

ejmahler commented Apr 9, 2018

I just ran my head into this, and it's left me pretty disappointed. I'm developing the rust_dct crate.

Each of the Discrete Cosine transforms type 1 through four has its own trait (DCT1, DCT2, DCT3, DCT4), and same with discrete sine transforms (DST1, DST2, DST3, DST4). Previously the structs that implemented these traits were completely disjoint: There's a struct that converts DCT3 problems into FFT problems, and an entirely separate struct that converts DCT2 problems into FFT problems. They're completely separated.

But recently, I discovered that DCT2, DST2, DCT3, and DST3 problems pretty much always require the same precomputed data and pre-allocated buffers, and so I've started creating single structs that can compute all four. So a single "convert to FFT" struct implements all four traits: DCT2, DST2, DCT3, and DST3. All D{C,S}{2,3} structs are now implemented this way.

Sometimes, I have a problem that needs both a DCT2 and a DCT3. Currently, I have to write this:

fn my_algorithm(input: &[f32], output: &mut [f32], dct2: Arc<DCT2>, dct3: Arc<DCT3>) {
    
}

If I had a problem that needed DCT2, DST2, DCT3, and DST3, I'd have to be even more verbose:

fn my_algorithm2(input: &[f32], output: &mut [f32], dct2: Arc<DCT2>, dct3: Arc<DCT3>, dst2: Arc<DST2>, dst3: Arc<DST3>) {
    
}

I absolutely despise this API though, because it requires an unreasonable amount of repetition by the user, and in the end all four Arcs will be pointing to the same thing. It would be much, much more ergonomic if I could write this:

fn my_algorithm3(input: &[f32], output: &mut [f32], dct: Arc<DCT2 + DCT3 + DST2 + DST3>) {
    
}

And then I can use the same dct object to compute all four transforms. If, internally, my_algorithm3 delegates to the following method:

fn my_algorithm4(input: &[f32], output: &mut [f32], dct: Arc<DCT2 + DCT3>) {
    
}

wouldn't it be ridiculously convenient if I could just pass the dct object along and let the compiler figure it out?

fn my_algorithm3(input: &[f32], output: &mut [f32], dct: Arc<DCT2 + DCT3 + DST2 + DST3>) {
    my_algorithm4(input, output, dct);
}

@comex

comex commented Apr 9, 2018

Is there any reason you can't have one vtable for A + B + C and an entirely different vtable for A + C? The benefits of reusing the same vtable if possible are clear - but in cases where that's not possible, what's stopping the compiler from generating a new one for just A + C?

Mainly the fact that that requires a number of vtables which is exponential in the number of +s.

wouldn't it be ridiculously convenient if I could just pass the dct object along and let the compiler figure it out?

I agree this should 'just work' one way or another. But as a workaround, for the record, rather than taking separate Arc parameters, I'd make a wrapper trait or traits, like trait Foo : DCT2 + DCT3 + DST2 + DST3.
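A minimal sketch of that wrapper-trait workaround, written with modern `dyn` syntax. The traits `Dct2` and `Dct3`, their method signatures, the `Dct23` wrapper, and the `Placeholder` type are simplified stand-ins invented for this example, not the actual rust_dct API.

```rust
use std::sync::Arc;

// Simplified stand-ins for the crate's traits (assumed signatures).
trait Dct2 {
    fn process_dct2(&self, input: &[f32], output: &mut [f32]);
}
trait Dct3 {
    fn process_dct3(&self, input: &[f32], output: &mut [f32]);
}

// Wrapper trait: a single trait that can be named in a trait object.
trait Dct23: Dct2 + Dct3 {}
// Blanket impl so every type with both impls gets the wrapper for free.
impl<T: Dct2 + Dct3> Dct23 for T {}

// Placeholder arithmetic, not a real transform: "DCT2" doubles, "DCT3" halves.
struct Placeholder;

impl Dct2 for Placeholder {
    fn process_dct2(&self, input: &[f32], output: &mut [f32]) {
        for (o, i) in output.iter_mut().zip(input) {
            *o = i * 2.0;
        }
    }
}

impl Dct3 for Placeholder {
    fn process_dct3(&self, input: &[f32], output: &mut [f32]) {
        for (o, i) in output.iter_mut().zip(input) {
            *o = i / 2.0;
        }
    }
}

// One Arc parameter instead of one per trait.
fn my_algorithm(input: &[f32], output: &mut [f32], dct: Arc<dyn Dct23>) {
    let mut scratch = vec![0.0; input.len()];
    dct.process_dct2(input, &mut scratch);
    dct.process_dct3(&scratch, output);
}

fn round_trip(input: &[f32]) -> Vec<f32> {
    let dct: Arc<dyn Dct23> = Arc::new(Placeholder);
    let mut output = vec![0.0; input.len()];
    my_algorithm(input, &mut output, dct);
    output
}

fn main() {
    println!("{:?}", round_trip(&[1.0, 2.5]));
}
```

The limitation discussed in this thread still applies: an `Arc<dyn Dct23>` cannot be handed to a function that expects a differently named wrapper over the same traits.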

Alternately… do you actually need to be using trait objects in the first place? Just from a glance at your use case, my guess is that having my_algorithm* use a generic parameter instead would work fine – i.e. it's unlikely that one program would need to choose at runtime between a large number of different algorithms for the same calculation.

@ejmahler

ejmahler commented Apr 9, 2018

Alternately… do you actually need to be using trait objects in the first place? Just from a glance at your use case, my guess is that having my_algorithm* use a generic parameter instead would work fine – i.e. it's unlikely that one program would need to choose at runtime between a large number of different algorithms for the same calculation.

It's definitely occurred to me that my situation works as expected if this is all done at compile time instead of with trait objects. To explain why trait objects are more or less necessary here, let me provide two more bits of context:

  1. A crucial piece of my library is the "planner" which takes a given problem size, and returns an assembled set of algorithms that compute the relevant problem size. The intention is that a user doesn't need to know all the different ways to compute a DCT Type 2, they just give the library a size, and the library returns a thing that can compute a DCT Type 2 of that size.
  2. Some DCT algorithms are implemented in terms of other DCT algorithms, or are even implemented recursively. See this DCT3 algorithm which computes the DCT3 by dividing it into one DCT3 of half-size, and another DCT3 of quarter-size. If the half and quarter algorithms are generic parameters instead of trait objects, then you'd have to write out both types anywhere you used it, in addition to the parent struct. And what if the half_dct and quarter_dct structs are also split radix? They need their own generic parameters. Suddenly you have an exploding tree of generic types.
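The two points above can be illustrated with a rough sketch. Everything here (`Dct3` with a single `size` method, `Naive`, `SplitRadixGeneric`, `SplitRadixDyn`, and the `plan` function) is a hypothetical simplification, not the crate's real planner. It only shows how the generic version leaks child types into the parent type, while the trait-object version lets a planner pick the tree at runtime.

```rust
use std::sync::Arc;

// Assumed, heavily simplified trait: only reports the problem size.
trait Dct3 {
    fn size(&self) -> usize;
}

struct Naive {
    size: usize,
}

impl Dct3 for Naive {
    fn size(&self) -> usize {
        self.size
    }
}

// Generic version: the child algorithm types become part of the parent type.
struct SplitRadixGeneric<H: Dct3, Q: Dct3> {
    half: H,
    quarter: Q,
}

impl<H: Dct3, Q: Dct3> Dct3 for SplitRadixGeneric<H, Q> {
    fn size(&self) -> usize {
        self.half.size() + 2 * self.quarter.size()
    }
}

// Trait-object version: the children are erased, so the type stays flat.
struct SplitRadixDyn {
    half: Arc<dyn Dct3>,
    quarter: Arc<dyn Dct3>,
}

impl Dct3 for SplitRadixDyn {
    fn size(&self) -> usize {
        self.half.size() + 2 * self.quarter.size()
    }
}

// A toy planner: the shape of the tree depends on a runtime value, which a
// single generic return type could not express.
fn plan(size: usize) -> Arc<dyn Dct3> {
    if size > 4 && size % 4 == 0 {
        Arc::new(SplitRadixDyn {
            half: plan(size / 2),
            quarter: plan(size / 4),
        })
    } else {
        Arc::new(Naive { size })
    }
}

// With generics, even a two-level tree has to be spelled out in full.
fn nested_generic() -> SplitRadixGeneric<SplitRadixGeneric<Naive, Naive>, Naive> {
    SplitRadixGeneric {
        half: SplitRadixGeneric {
            half: Naive { size: 4 },
            quarter: Naive { size: 2 },
        },
        quarter: Naive { size: 4 },
    }
}

fn main() {
    println!("{}", plan(16).size());
    println!("{}", nested_generic().size());
}
```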

@alexreg

alexreg commented Jun 11, 2018

This is a good idea and I think a bunch of people would like to see this implemented. Does anyone want to have a crack at an RFC? (I don't feel too expert about it myself.)

@shepmaster
Member

I was reading accepted RFC 1733 — trait aliases and came to the erroneous conclusion that it would supersede this RFC:

  1. Aliases can be defined from multiple traits: trait DebugDefault = Debug + Default;
  2. Aliases can be used as trait objects: Box<MyTraitAlias>

Only careful reading of the RFC's examples showed me this was not the case:

trait PrintableIterator = Iterator<Item=i32> + Display;
fn bar3(x: Box<PrintableIterator>) { ... } // ERROR: too many traits (*)

There's even an example immediately after that that makes it look like this would work, but only because one of the magic auto traits was renamed:

trait Sink = Sync;
trait IntIterator = Iterator<Item=i32>;
fn bar4(x: Box<IntIterator + Sink + 'static>) { ... } // ok (*)

Anyway, my point is that I think that RFC 1733 is going to exacerbate the occurrences of this issue.

@dhardy
Contributor

dhardy commented Sep 17, 2018

Which brings up another question - going from Box<A + B> to Box<B> is easy because you just move the offset. But what if you try to go from Box<A + B + C> to Box<A + C>?

This is a complication which doesn't have to be solved immediately — the compiler can simply state that up-casting to multi-trait objects is not (currently) supported. As a workaround users can use trait AC: A + C {} and cast from Box<AC + B>, although this doesn't cover all cases. (Alternatively the compiler could implement vtables and casts only when used; this likely requires a unique vtable for each source/target combination.)

What should be supported is:

  • Box<A + B + ...> → Box<A> for any A, B, ...
  • Box<D> → Box<A> where trait D: A { ... }
  • x.foo where x: Box<A + B> and A::foo exists
  • x.foo where x: Box<D>, D: A and A::foo exists

Issue: name conflict where A::foo and B::foo both exist.
Solution: calling x.foo() for x: Box<A + B> should be illegal and UFCS required, same as for static dispatch.
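For comparison, this is how the same name conflict already behaves under static dispatch. The traits `A` and `B`, the method `foo`, and the type `X` are placeholders for illustration.

```rust
trait A {
    fn foo(&self) -> &'static str;
}

trait B {
    fn foo(&self) -> &'static str;
}

struct X;

impl A for X {
    fn foo(&self) -> &'static str {
        "A::foo"
    }
}

impl B for X {
    fn foo(&self) -> &'static str {
        "B::foo"
    }
}

fn call_both<T: A + B>(x: &T) -> (&'static str, &'static str) {
    // `x.foo()` is rejected here as ambiguous (error E0034), so the call
    // has to name the trait explicitly.
    (<T as A>::foo(x), <T as B>::foo(x))
}

fn main() {
    println!("{:?}", call_both(&X));
}
```

The proposal above would apply the same rule to `x: Box<A + B>`.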

Issue: if A: C and B: C, then Box<A + B> → Box<C> has conflicting implementations.
Poor solution: disable direct up-cast on multi-trait-objects; i.e. require x as Box<A> as Box<C> for x: Box<A + B>.
Better solution: only disable direct up-cast on multi-trait-objects where there are conflicts. This is more convenient but means that Box<A + C> → Box<C> where A: C would silently ignore the indirect upcast option even though it would otherwise be an option.
Another solution: something akin to UFCS syntax but for upcasts.

Is this enough of an RFC? It doesn't detail what vtables should look like, but this is probably best left unspecified (I don't see any further complications).

sgrif added a commit to sgrif/crates.io that referenced this issue Sep 24, 2018
This behavior was split into two functions so that one of the 5 places
we're calling it from could intercept a 404 response code. Since the
happy path otherwise is just calling `.json`, this felt really awkward
to me. Instead I've opted to return `NotFound`, and downcast to it in
the error path for that one place.

I expected to just be able to call `Any::is`, but it turns out this
method is an inherent method not a trait method (which makes sense,
otherwise `Any` wouldn't be object safe). However, a side effect of that
is that even though `Any` is a supertrait of `CargoError`, we can't call
`Any::is` since `dyn CargoError` can't be cast to `dyn Any`. This may
change at some point in the future (see
rust-lang/rfcs#2035), but we have to duplicate
the body of `Any::is` for now.

Except we can't even just duplicate the body of `Any::is`, because the
only trait method for `Any` is unstable so we have to duplicate that
method, too..........

Other notable changes:

- We're no longer sending `Proxy-Connection: Keep-Alive`. According to
  Wikipedia, this is "Implemented as a misunderstanding of the HTTP
  specifications. Common because of mistakes in implementations of early
  HTTP versions". Firefox recently removed it, as have recent versions
  of curl.
- We will accept gzipped responses now. This is good.
- We send an actual user agent, instead of "hello!"
@dhardy
Contributor

dhardy commented Oct 8, 2018

trait A: Debug + Display { /* pretend there are methods here */ }
trait B: Debug + Display { /* same here */ }
trait C: A + B {}

Going back to @parkovski's example: can we not use a reduced vtable for C (basically just A plus unique methods from B), then use a lookup table to replace one vtable with another to support &C → &B?

This implies that trait-object-casting functions may need a small dataset (of pointers to vtables) embedded in the executable and that trait-object-cast may be slow, but I don't think those are real problems?

@hanna-kruppe

The hard question is what happens if you want to cast Box<A+B+C+D+...> to a trait object of a subset of {A, B, C, D, ...} rather than a single one. If it's allowed, there are a large number of ways that could be implemented, but most ways scale badly as you add more traits to the multi-trait object (e.g., pointer sizes increase linearly in the number of traits, or the number of vtables that are emitted increases exponentially in the number of traits). The "lookup table holding the vtables you need for upcast" approach falls under the latter if it's extended to "upcast to another multi-trait object" in the way I expect.

For more discussion of those trade-offs see https://internals.rust-lang.org/t/wheres-the-catch-with-box-read-write/6617 -- there is one approach in there (vorner's) that side-steps the aforementioned problems but instead sacrifices efficiency of some virtual calls.

@burdges

burdges commented Apr 19, 2020

I opened a discussion at https://internals.rust-lang.org/t/casting-families-for-fast-dynamic-trait-object-casts-and-multi-trait-objects/12195 that should address the issues with layering compilation units.

@jgarvin

jgarvin commented Aug 4, 2020

I figured I'd mention, since I haven't seen any discussion of it here, that there is a minimal version of this that is probably much more trivial to add: support just for user-defined marker traits, because marker traits exist purely at the type level and shouldn't (AFAIK) require any vtables. I ran into wanting this due to trying to make an object safe wrapper trait for an existing non-object safe trait. Anywhere the original trait used Self or an associated type I tried substituting Box<dyn (Any + MarkerForAssociatedType)>. Of course Box<dyn Any> can hold anything, but I wanted the marker trait in order to preserve a little type safety -- it made it so I couldn't accidentally pass a Box<dyn (Any + MarkerForAssociatedTypeA)> into a method expecting Box<dyn (Any + MarkerForAssociatedTypeB)>, making it a bit more likely my runtime downcasts would succeed. It also made it less likely that somebody would unintentionally pass in a Box<dyn Any> from some other library.
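Until something like that exists, one way to approximate `dyn (Any + Marker)` is a dedicated subtrait with an `as_any` method. This is only a sketch of a common pattern; `MarkerA`, `AnyMarkerA`, `Meters`, `Seconds`, and `read_meters` are made-up names for the example.

```rust
use std::any::Any;

// Hypothetical user-defined marker trait with no methods.
trait MarkerA {}

// Stand-in for `dyn (Any + MarkerA)`: a subtrait that exposes `&dyn Any`.
trait AnyMarkerA: Any + MarkerA {
    fn as_any(&self) -> &dyn Any;
}

impl<T: Any + MarkerA> AnyMarkerA for T {
    fn as_any(&self) -> &dyn Any {
        self
    }
}

struct Meters(f64);
struct Seconds;

impl MarkerA for Meters {}
impl MarkerA for Seconds {}

// Only values whose type opted into `MarkerA` can be passed here; the
// downcast then checks the concrete type at runtime.
fn read_meters(value: &dyn AnyMarkerA) -> Option<f64> {
    value.as_any().downcast_ref::<Meters>().map(|m| m.0)
}

fn main() {
    println!("{:?}", read_meters(&Meters(3.5)));
    println!("{:?}", read_meters(&Seconds));
}
```

A type that does not implement `MarkerA` is rejected at compile time, which is the extra bit of type safety described above.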

@scottmcm
Member

scottmcm commented Aug 4, 2020

@jgarvin Good comment! Makes me think that, rather than Send and Sync being special, perhaps they should just be ordinary #[marker] traits, and all #[marker] traits should get the "can be added to other traits in dyn" behaviour.

EDIT: Come to think of it, the "trait addition ok" is probably an auto trait behaviour, not special to Send/Sync.

@jgarvin

jgarvin commented Aug 5, 2020

@scottmcm I believe it is currently an auto trait behavior because the error message you get when you try Box<dyn Foo + Bar> specifically complains about having more than one non-auto trait. But I think expanding the behavior to include anything #[marker] makes sense.
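A small sketch of the current behaviour being described: auto traits such as `Send` and `Sync` may be added to a trait object, while a second non-auto trait is rejected. The function names are invented for the example.

```rust
use std::fmt::Display;
use std::thread;

// One non-auto trait (`Display`) plus any number of auto traits is accepted.
fn boxed_message() -> Box<dyn Display + Send + Sync> {
    Box::new(String::from("hello"))
}

// Because the object is `Send`, it can be moved into another thread.
fn send_across_thread() -> String {
    let msg = boxed_message();
    thread::spawn(move || msg.to_string()).join().unwrap()
}

// By contrast, a signature such as
//     fn rejected() -> Box<dyn Display + std::fmt::Debug>
// fails with error E0225 ("only auto traits can be used as additional
// traits in a trait object").

fn main() {
    println!("{}", send_across_thread());
}
```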

@Timmmm

Timmmm commented Aug 6, 2020

It feels weird to me that this doesn't compile:

fn write_something(w: &mut (dyn Write + Seek));

But this does:

trait WriteSeek: Write + Seek {}
fn write_something(w: &mut dyn WriteSeek);

Will this fix that?
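For reference, the subtrait form from this comment becomes usable with existing types once a blanket impl is added. This is a minimal sketch; the body of `write_something` and the `demo` helper are invented for illustration.

```rust
use std::io::{self, Cursor, Seek, SeekFrom, Write};

trait WriteSeek: Write + Seek {}
// Blanket impl: anything that is already `Write + Seek` is a `WriteSeek`.
impl<T: Write + Seek> WriteSeek for T {}

fn write_something(w: &mut dyn WriteSeek) -> io::Result<()> {
    w.write_all(b"hello world")?;
    // Jump back to the start and overwrite the first word.
    w.seek(SeekFrom::Start(0))?;
    w.write_all(b"HELLO")?;
    Ok(())
}

fn demo() -> Vec<u8> {
    let mut cursor = Cursor::new(Vec::new());
    write_something(&mut cursor).unwrap();
    cursor.into_inner()
}

fn main() {
    println!("{}", String::from_utf8_lossy(&demo()));
}
```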

@SOF3

SOF3 commented Aug 6, 2020

@scottmcm @jgarvin why #[marker] in particular, but not empty traits (without any virtual methods) in general?

@ssokolow

ssokolow commented Aug 6, 2020

@SOF3 I'd assume for the same reason only closures allow type inference in function signatures. It'd be too easy to conflate interface and implementation.

@bjorn3
Member

bjorn3 commented Aug 6, 2020

A trait could add new methods with default implementations without breaking semver compatibility.

@jgarvin

jgarvin commented Aug 8, 2020

@SOF3 @ssokolow I just assumed #[marker] was some way to identify intentionally empty traits. I'm not actually sure what the advantage of labeling them this way is, or what it has to do with interface vs implementation. Their defining feature seems to be the absence of an interface :)

@piegamesde

I scrolled quite a bit through the backlog, and still don't understand one point: Why can't we automatically desugar Box<dyn Read + Seek> to trait ReadSeek: Read + Seek {}; Box<dyn ReadSeek>? I don't need any of this fancy type casting, all I want is to store a trait object that implements multiple traits.

@afetisov

@piegamesde If that is all you need, you can just declare that trait, provide a blanket impl and use it. No need to introduce any extra language complexity. You can even write a proc macro to automate that pattern, and I would expect that someone already published a crate for that.

@burdges

burdges commented Sep 20, 2022

I wrote up a detailed version of roughly that approach at https://internals.rust-lang.org/t/casting-families-for-fast-dynamic-trait-object-casts-and-multi-trait-objects/12195 and yes it requires little if any rustc support. Yet as @unbrice and I discussed above one should not expect single solutions that fit everything optimally. In this case, you wind up with different ReadSeek traits in different compilation units, which must be unified by the linker, but the linker does not understand traits.

@bjorn3
Member

bjorn3 commented Sep 21, 2022

Vtables are already duplicated between codegen units anyway.

@ProofOfKeags

You can even write a proc macro to automate that pattern, and I would expect that someone already published a crate for that.

Relying on proc macros for something that is a rather basic expectation for how the language should behave is an admission that the language has failed. I think the request for the desugaring makes a lot of sense. Sometimes all you need to know about an argument is that it implements a set of behaviors (traits). Limiting to a single trait is needlessly burdensome on users of the language and I don't think rust should openly dismiss the desires of users for something that seems rather reasonable.

@afetisov

@ProofOfKeags It's not my basic expectation. In my view, the thing proposed above is a very niche desire. Neither is it a goal of the language to pull in every feature under the sun. One-off cases should be implemented in crates, via macros or whatever other way, and the core language must have only powerful composable features which work in every supported use case.

@ProofOfKeags

Given that trait objects are natively supported by the language, and we have syntax for being able to combine trait bounds (via +), it is a natural expectation for a significant portion of rust users that this would work in this way. The fact that it doesn't is a language failure. Whether you care or not is a separate issue.

and the core language must have only powerful composable features

This is a composability failure, I'm not sure why you aren't registering it that way. Languages are intuitive when users can take the concepts they know and apply them elsewhere.

Concept 1: I can make trait objects by doing Box<dyn SomeTrait>

Concept 2: I can intersect trait requirements by doing T: Trait1 + Trait2

Expected Composition: Box<dyn Trait1 + Trait2> should guarantee that what is inside the box implements both traits.

Reality: This doesn't work.

Given that the solution seems possible by just desugaring, it seems like the approach to fixing this could be rather non-invasive. Why isn't this something that should be considered?

@afetisov

@ProofOfKeags It's quite easy to do the thing you want:

trait Composite: Trait1 + Trait2 {}
impl<T> Composite for T where T: Trait1 + Trait2 {}

// use Box<dyn Composite>

So what you're arguing for is not some language feature which opens real new possibilities, but rather syntactic sugar for writing the two lines above. Now, syntactic sugar certainly has its place, but it still has a high bar to clear in terms of language additions vs ergonomic benefits. As it stands, the benefits are pretty low: there are two simple lines to write, and there are not that many traits which you would want to compose in the first place.

In terms of semantics and composability, the cost of the feature would be much higher (at least with the proposal "just declare automatically a new trait"). Where is that composite trait declared? Trait1 and Trait2 may come from different crates, so you can't place Trait1 + Trait2 in either of them. So, you must declare it at use site. But that would mean that each downstream crate has its own instance of Trait1 + Trait2, so you have code bloat, compile time bloat, and Trait1 + Trait2 coming from different crates will be different. That would in turn mean that if you get Box<dyn Trait1 + Trait2> from crate A, you won't be able to pass it into crate B. So now you need a way to create transparent coercions between traits from different crates, but that feature has no precedent in current Rust, and may run afoul of different internal layouts of those composite traits. And again, who should be responsible for defining those coercions?

Ok, let's say we figure it out. Let's say we impose some fixed layout of composite traits, so that the coercion can be a no-op. But now you have declared that Trait1 + Trait2 is a valid trait name, so T: Trait1 + Trait2 shouldn't mean "implement both traits", it should mean "implement the composite trait". If it's a separate trait, I should be able to write impl Trait1 + Trait2 for T, but that doesn't make sense, since it's the same thing as impl Trait1 for T and impl Trait2 for T (and in fact is harder to implement since the traits may have methods with the same name). So we should forbid that composite impl, which means that we have a trait which is not really a trait, which means the proposed feature makes Rust less intuitive and composable.

And what about Trait1 + Trait2 + Trait3? Nowadays this means just 3 separate impls. If sum traits are a thing, then we could also read it as (Trait1 + Trait2) + Trait3 or Trait1 + (Trait2 + Trait3), or even as (Trait1 + Trait3) + Trait2. So now a sum of N traits is no longer O(N) impls and memory, instead we have an exponential number of potential traits and impls. There should likely also be conversions between sums with different brackets, so now we also need some way to normalize sums of types and to define all those conversions (and remember the issues above about the crate which should define all of those). Now either you have an exponential code blowup, or you once again break the symmetry between sum traits and real traits in some arbitrary way.

At this point, what did you get in clarity, composability and language regularity over just forbidding sum traits in trait objects?

All of these issues at the very least mean that "just declare an anonymous composite trait" isn't a workable suggestion. Now, there may be some other approach, like making dyn Trait1 + Trait2 include separate vtables of dyn Trait1 and dyn Trait2, but at this point the design space is much larger, and there is no longer a simple single solution. There are also other competing design issues, like trait object size, or up/downcasting of trait objects, which mean that likely there is no "one size fits all" solution. So either we need a way to support all possible semantics, or someone must decide that some use cases won't be supported. In either case "this seems like it should work so just implement it already" is a useless sentiment.

@bjorn3
Member

bjorn3 commented Nov 15, 2022

But that would mean that each downstream crate has its own instance of Trait1 + Trait2, so you have code bloat, compile time bloat, and Trait1 + Trait2 coming from different crates will be different.

No, it wouldn't. For the same reason that &Foo in one crate is the same type as &Foo in another crate, dyn TraitA + TraitB in one crate is the same as in another. We already support dyn Trait + Send and the same for other auto traits. Removing the restriction that the extra trait(s) is/are (an) auto trait(s) would only make it less special. As for code bloat we are duplicating vtables anyway and no function would need to be duplicated.

@afetisov

@bjorn3 I have literally addressed this in my comment.

As for code bloat we are duplicating vtables anyway

That's nothing to be proud of, it's a problem to fix. Doubling down on it to get a syntax-sugar feature is unacceptable.

@piegamesde

In about 90% of the cases where I'd need multiple trait trait objects I have no need for any up- or downcasting and the basic "anonymous union trait" solution would work for me. I can only speak for my own usage patterns here, but calling these use cases "niche" or "one-off" does not feel appropriate.

The problem with the proposed two-line workaround is how it interfaces with different libraries. Say I have a library that requires trait objects of two traits to work. So it creates the said union type, but now it must also expose it as (for example) mycrate::AsyncReadSeek. This is already weird on its own because that trait is not inherent to the crate but only an implementation necessity; it feels out of place and is in the wrong name space. But things get even weirder if multiple crates do this, because you now have equivalent-but-not-identical types in different name spaces, and I'm not even sure how one would convert between them.

Go has common union types declared in the standard library, but I'd still prefer having these types implicit by using the already common syntax of Trait1 + Trait2 instead of having to hope that somebody already defined Trait1Trait2 (or even having to do it manually).

@bjorn3
Member

bjorn3 commented Nov 15, 2022

As for code bloat we are duplicating vtables anyway

That's nothing to be proud of, it's a problem to fix. Doubling down on it to get a syntax-sugar feature is unacceptable.

How are you going to fix duplication of vtables for types like (Foo, Bar)? Whatever fix you have almost definitely applies to dyn TraitA + TraitB too.

@dhardy
Contributor

dhardy commented Nov 15, 2022

@afetisov already mentioned that:

Now, there may be some other approach, like making dyn Trait1 + Trait2 include separate vtables of dyn Trait1 and dyn Trait2 ...

I.e. fatter trait objects. I think this approach would be worth further investigation (especially because up-cast to super/component trait objects is easy).

@afetisov

But things get even weirder if multiple crates do this, because you now have equivalent-but-not-identical types in different name spaces, and I'm not even sure how one would convert between them.

True, but you get the same problem if you try to do that trick at the compiler level, as I discussed above.

How are you going to fix duplication of vtables for types like (Foo, Bar)?

I don't know what duplicates you are talking about, (Foo, Bar) does not automatically implement any of the traits on Foo or Bar. If you mean "the impls basically just delegate to impls on individual types", then I consider it a non-issue, since it's explicitly controllable and no worse than any other impl for a composite type. You may do some deduplication as a codegen optimization, but there is no reason to make it a user-facing guarantee.

You must have a specific vtable at hand while compiling a crate, so that you can put the pointers to that vtable in trait object. So basically the only way to deduplicate those composite trait objects would be to punt the issue of pointer patching on the linker. This may mean baking something like lto, or at least some uncomfortable coupling to the linker, in language semantics; otherwise you can't guarantee ABI compatibility between the different sum trait objects.

My personal opinion is that multiple vtable pointers is the way to go with the implementation of sum trait objects. This gives a simple, truly zero-cost implementation of casts to subsets of traits, is easy to implement and understand, and the only downside is increased fat pointer size. On its own, I don't consider it an issue. If a pointer gets too fat, you can always use double-indirection to deal with it, or use the usual subtrait trick to get a single trait object. However, a potential hazard is that some code may rely on the current non-guarantee that pointers to dynamically sized types are two pointers long.
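The idea can be roughly emulated in today's Rust with a struct that holds one ordinary trait-object reference per trait. This is a hand-rolled sketch, not a proposed layout: `DebugDisplayRef` and its methods are hypothetical, and because it stores the data pointer twice it is four words wide, where a native data-pointer-plus-two-vtables design would presumably need three.

```rust
use std::fmt::{Debug, Display};
use std::mem::size_of;

// Hand-rolled stand-in for a hypothetical `&(dyn Debug + Display)`.
// Each field is a normal fat pointer (data pointer + vtable pointer).
#[derive(Clone, Copy)]
struct DebugDisplayRef<'a> {
    debug: &'a dyn Debug,
    display: &'a dyn Display,
}

impl<'a> DebugDisplayRef<'a> {
    // Both fields point at the same value, each with its own vtable.
    fn new<T: Debug + Display>(value: &'a T) -> Self {
        Self {
            debug: value,
            display: value,
        }
    }

    // "Casting" to a single trait is just picking a field: no lookup needed.
    fn as_debug(self) -> &'a dyn Debug {
        self.debug
    }

    fn as_display(self) -> &'a dyn Display {
        self.display
    }
}

fn main() {
    let s = String::from("hi");
    let both = DebugDisplayRef::new(&s);
    println!("{:?} {}", both.as_debug(), both.as_display());
    // Width of the emulated pointer, in machine words.
    println!("{}", size_of::<DebugDisplayRef>() / size_of::<usize>());
}
```

The growth in pointer width with every added trait is the trade-off raised in the replies that follow.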

Perhaps it's time to write a proper RFC for that design.

@truppelito

I’m also in favor of fatter pointer types for sum trait objects. It seems like a very natural extension of the current trait design. Summarizing, the way I see it:

Advantages: minimal compile-time cost (code size, code duplication, compilation time, compiler complexity, etc), trivial subset casting.

Disadvantages: non minimal runtime cost (fatter pointers), but there are reasonable workarounds for the niche cases where too many traits are combined.

Are there any other disadvantages?

@bjorn3
Member

bjorn3 commented Nov 15, 2022

Are there any other disadvantages?

It needs a lot of changes across rustc to allow pointer metadata that is not pointer-sized, especially in the codegen backends, which assume every fat pointer is a ScalarPair. In addition, it makes the pointers larger, potentially by a lot. If neither were the case, I think we would have used fatter pointers for upcasting trait Foo: Bar + Baz too.

@afetisov

With upcasting, the benefits of sumo pointers are less clear, because there is a fixed graph of supertraits, and we can optimize the layout of pointers and vtables based on that information. For example, in the single-inheritance case the optimal solution is "supertraits are initial segments of subtraits" (optimizing for speed) or "supertraits contain pointers to subtraits" (optimizing for size). With sum traits, the biggest issue is that the set of summands is potentially "all traits in the artifact", and that itself is not defined until the root crate is compiled.
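To make the "initial segment" layout concrete, here is a hand-written model using `#[repr(C)]` structs of function pointers. This is only a sketch: `SuperVtable`, `SubVtable`, and the functions below are hypothetical, and it does not claim to match the layout rustc actually emits.

```rust
/// Hypothetical vtable for a supertrait with a single method.
#[repr(C)]
struct SuperVtable {
    name: fn() -> &'static str,
}

/// Hypothetical vtable for a subtrait: the supertrait's entries come first,
/// so the supertrait vtable is an initial segment of this one.
#[repr(C)]
struct SubVtable {
    base: SuperVtable,
    legs: fn() -> u32,
}

fn dog_name() -> &'static str {
    "dog"
}

fn dog_legs() -> u32 {
    4
}

/// One statically allocated vtable for an imaginary `Dog` type.
static DOG_VTABLE: SubVtable = SubVtable {
    base: SuperVtable { name: dog_name },
    legs: dog_legs,
};

/// Upcasting keeps the same address: the prefix of the subtrait vtable
/// is already a valid supertrait vtable.
fn upcast(vtable: &'static SubVtable) -> &'static SuperVtable {
    &vtable.base
}
```

With this layout the upcast needs no load and no adjustment, which is why it is the fast option for single inheritance; it stops working once a trait has two supertraits, since only one of them can sit at offset zero.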

It needs a lot of changes to allow non-pointer sized pointer metadata across rustc.

I assumed that the same flexibility would be required for arbitrary DSTs, which are expected to be supported, even if currently super unstable. It is unfortunate if the "two pointers" assumption is still baked in.

@bjorn3
Member

bjorn3 commented Nov 15, 2022

Yeah, custom DSTs would also require it, so rustc is far from ready for custom DSTs if they were to be proposed and accepted. cg_clif is better in this regard than cg_llvm: cg_llvm matches on OperandValue (which has the Ref, Immediate and Pair variants) a lot to determine how to handle values, while cg_clif matches on the actual type and is completely fine with passing a fat pointer by reference (as necessary for fat pointers that do not fit in two pointer-sized values) rather than forcing it into the equivalent of OperandValue::Pair. Even then, cg_clif uses a lot of .load_scalar_pair() calls when dealing with fat pointers to split the thin pointer from the metadata.

@truppelito

Are there any other disadvantages?

It needs a lot of changes to allow non-pointer sized pointer metadata across rustc.

That's fair, and certainly something to consider. But actually, I meant in usability, i.e. would sumo pointers lack some capability that people want (or not do something well enough)?

@thaven

thaven commented Feb 11, 2023

I also hit this issue while using the fastcgi-client package and trying to decide at runtime whether to connect to the server over TCP or over a Unix domain socket. Both TcpStream and UnixStream from tokio implement the AsyncRead and AsyncWrite traits, but that does not give them a common supertype that implements both traits. So I end up with duplicated calls to construct the Client, and duplicated usage of the client (because the clients are of different types), etc. Very frustrating, so I started googling around and ended up here. I read all of the discussion, and now just wonder why this has to be so difficult. I think the only potential problem is a small ABI change/incompatibility regarding trait inheritance.
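For reference, the usual workaround for this situation is the subtrait trick mentioned earlier in the thread. The sketch below uses the synchronous std::io traits and in-memory types so that it is self-contained; `ReadWrite`, `connect`, and `echo` are hypothetical names, and the assumption is that the same shape carries over to AsyncRead + AsyncWrite.

```rust
use std::collections::VecDeque;
use std::io::{self, Cursor, Read, Write};

/// Hypothetical helper trait that names the combination `Read + Write`,
/// so that a single trait object can stand for both.
trait ReadWrite: Read + Write {}

/// Blanket impl: every type that is both `Read` and `Write` is `ReadWrite`.
impl<T: Read + Write + ?Sized> ReadWrite for T {}

/// Chooses the concrete stream type at runtime but returns one common type.
/// The two in-memory types stand in for a TCP and a Unix socket stream.
fn connect(use_queue: bool) -> Box<dyn ReadWrite> {
    if use_queue {
        Box::new(VecDeque::<u8>::new())
    } else {
        Box::new(Cursor::new(Vec::<u8>::new()))
    }
}

/// Code written once against the combined trait object.
fn echo(stream: &mut dyn ReadWrite) -> io::Result<String> {
    stream.write_all(b"ping")?;
    let mut reply = String::new();
    stream.read_to_string(&mut reply)?;
    Ok(reply)
}
```

The caller writes `let mut stream = connect(flag); echo(&mut *stream)` once, regardless of which concrete type was picked. The limitation is the one this issue is about: the `Box<dyn ReadWrite>` cannot be handed to code that expects some other combination of traits without defining yet another helper trait.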

I am not familiar with Rust (compiler) internals, but from the discussion above it appears to me that the implementation of trait inheritance is to blame here. Apparently, the sub-trait vtable includes the super-trait vtable(s). If instead the sub-trait vtable just contained a pointer to the vtable of each of its super-traits, all troubles would vanish. This is because, given trait TraitAB : TraitA + TraitB { }, a &dyn TraitAB could then be converted to a &dyn TraitA or &dyn TraitB simply by getting the relevant pointer from the vtable! Such a conversion would then have to happen implicitly when calling a super-trait method on a sub-trait reference.

The same works for &(dyn TraitA + TraitB) if it is simply treated as a reference to an implicit/anonymous version of TraitAB.
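The conversion step described above can be approximated by hand in stable Rust with explicit upcast methods. This is a sketch, not the compiler's mechanism: the `as_a`/`as_b` methods and the `Both` type are hypothetical, and each method occupies a slot in the TraitAB vtable that hands back the corresponding supertrait object.

```rust
trait TraitA {
    fn a(&self) -> &'static str;
}

trait TraitB {
    fn b(&self) -> &'static str;
}

/// The combined trait carries one explicit conversion per supertrait.
trait TraitAB: TraitA + TraitB {
    fn as_a(&self) -> &dyn TraitA;
    fn as_b(&self) -> &dyn TraitB;
}

/// Blanket impl: the conversions are written once for every sized type
/// that implements both supertraits.
impl<T: TraitA + TraitB> TraitAB for T {
    fn as_a(&self) -> &dyn TraitA {
        self
    }

    fn as_b(&self) -> &dyn TraitB {
        self
    }
}

/// Example type implementing both traits.
struct Both;

impl TraitA for Both {
    fn a(&self) -> &'static str {
        "a"
    }
}

impl TraitB for Both {
    fn b(&self) -> &'static str {
        "b"
    }
}

/// Takes only the `TraitA` view of a value.
fn use_a(value: &dyn TraitA) -> &'static str {
    value.a()
}
```

Given `let ab: &dyn TraitAB = &Both;`, the call `use_a(ab.as_a())` performs the conversion through the vtable at runtime, which is the cost the proposal accepts in exchange for keeping the pointer two words wide.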

@bjorn3
Member

bjorn3 commented Feb 11, 2023

If instead, the sub-trait vtable would just contain a pointer to the vtable of each of its super-traits, all troubles vanish.

This is pretty much what is done with #![feature(trait_upcasting)].

@Pr0methean

Pr0methean commented Jun 14, 2023

I suspect that in most cases, the number of individual traits used for all of a program's dynamic dispatches is small enough that we could afford one big, flat offset-table struct with nullable entries for all of them. Then casting Box<dyn A+B+C> up to Box<dyn A+C> not only wouldn't require an extra offset table, but it'd be a no-op at runtime. The total size of all offset tables would grow in O(nTraits * nImpls) rather than exponentially in nTraits. Plus this would support runtime instance-of-trait checks at no extra cost (just check whether the trait's entry is null).
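A rough way to picture the flat table in today's Rust is a single "query" trait with one optional slot per trait, defaulting to "absent". This is only an emulation under stated assumptions: `Dynamic`, its `as_*` methods, and `Foo` are hypothetical, and the slots here are methods rather than the offsets described above.

```rust
trait A {
    fn a(&self) -> u32;
}

trait B {
    fn b(&self) -> u32;
}

trait C {
    fn c(&self) -> u32;
}

/// Hypothetical flat table: one nullable entry per trait in the program.
/// The default `None` plays the role of a null entry.
trait Dynamic {
    fn as_a(&self) -> Option<&dyn A> {
        None
    }

    fn as_b(&self) -> Option<&dyn B> {
        None
    }

    fn as_c(&self) -> Option<&dyn C> {
        None
    }
}

/// A type implementing `A` and `C` but not `B`.
struct Foo;

impl A for Foo {
    fn a(&self) -> u32 {
        1
    }
}

impl C for Foo {
    fn c(&self) -> u32 {
        3
    }
}

impl Dynamic for Foo {
    fn as_a(&self) -> Option<&dyn A> {
        Some(self)
    }

    fn as_c(&self) -> Option<&dyn C> {
        Some(self)
    }
}

/// Needs "A + C"; the same `&dyn Dynamic` would serve any other subset too.
fn sum_a_c(obj: &dyn Dynamic) -> Option<u32> {
    Some(obj.as_a()?.a() + obj.as_c()?.c())
}
```

Passing the same `&dyn Dynamic` to code that wants a different subset costs nothing, and `obj.as_b().is_some()` acts as the runtime instance-of check. The sketch also shows the catch: `Dynamic` has to list every trait up front.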

@bjorn3
Member

bjorn3 commented Jun 14, 2023

Vtables are codegened before all traits are known. There may even be new traits added at runtime using dlopen.

@Pr0methean

Pr0methean commented Jun 14, 2023

Then the trait-vtable-offset tables (not the individual trait vtables, we don't have to change those) can be stored as a Vec-like type and extended when a new trait is encountered/loaded.

@bjorn3
Member

bjorn3 commented Jun 14, 2023

That would require a global constructor in every dylib to extend those tables and to somehow locate all tables. Not every platform supported by rustc has global constructors in the first place; on those that do, it slows down startup, and this proposal also has issues with unloading dylibs again. It would also likely require the registration to happen in the crate that defines the object-safe trait (of which there are many), as opposed to the one that actually turns it into a trait object (which is less common), to ensure a single fixed offset is used across the entire process. With dlopen(RTLD_LOCAL) it may not even be possible to locate all tables.

@Pr0methean

Maybe in cases like that, some central code is needed to manage trait loading/unloading, like on the JVM.

@bjorn3
Member

bjorn3 commented Jun 14, 2023

Rust doesn't have a runtime.
