
Add generalized arity tuples #2702

Open · wants to merge 1 commit into base: master

Conversation

@Woyten commented May 22, 2019:

This is another proposal for a generalized tuples solution. Compared to similar RFCs, this RFC

  • only needs one key idea implemented in the compiler
  • does not add any new syntax
  • does not add any special traits
  • does not change the way Rust reasons about types

Rendered

@Centril (Member) left a comment:

Here's some food for thought. Once these discussions settle, please fold their outcomes back into the RFC text itself.


Unfortunately, it is not possible to express the generalization strategy in Rust's type system. Instead, a common practice is to generalize code using the macro system. This has two major drawbacks:

- The code is not really general since it can only support a limited number of arities. This is the same restriction as if it had been written down by hand. To make things worse, each library has its own understanding of what is considered a good limit.
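For illustration, here is a minimal sketch of the macro pattern this paragraph refers to (the `Arity` trait and the macro name are hypothetical); each arity gets its own generated impl, up to whatever limit the library author picked:

```rust
// Hypothetical trait, "generalized" by macro up to a fixed arity limit.
trait Arity {
    const ARITY: usize;
}

macro_rules! impl_arity {
    ($count:expr; $($ty:ident),*) => {
        impl<$($ty),*> Arity for ($($ty,)*) {
            const ARITY: usize = $count;
        }
    };
}

impl_arity!(0;);
impl_arity!(1; A);
impl_arity!(2; A, B);
impl_arity!(3; A, B, C);
// ... and so on, until the author decides the arity limit has been reached.
```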

@Centril (Member) · May 23, 2019:

For trait implementations, I think the typical number is around 10-12; do you really need more? -- please expand on this. :)

Functions like zip seem to be a different matter, however.

- Focussed on variadic generics
- Introduces new syntax
- Includes new traits
- Uses special handling for references

@Centril (Member) · May 23, 2019:

It would be good to survey / see a discussion of variadics, type level lists, etc. in other languages, including:

  • Haskell
  • Idris
  • C++
  • Other?

@tomwhoiscontrary's comment has been minimized.

- Introduces new syntax
- Includes new traits
- Uses special handling for references

@Centril (Member) · May 23, 2019:

Since you are taking the frunk approach to this, it would be a good idea to have a discussion of the library and the various traits and transformations in there. In particular, frunk should provide us with a good investigation of how this approach actually pans out in terms of which transformations can and cannot be written.

cc @lloydmeta @ExpHP

where `Tuple` is a new struct located in `std::ops` with the following definition:

```rust
struct Tuple<ELEM, TAIL> {

@Centril (Member) · May 23, 2019:

So this is essentially a HList: https://docs.rs/frunk/0.3.0/frunk/hlist/index.html.

  • The good aspect of this is that you are essentially removing ty::TyKind::Tuple. The rules about unsized types in the last element should just fall out from structs. Overall, this is a substantial reduction in the amount of structural typing in the type system, which is a good thing. Instead, Tuple<H, T> is nominally typed and (T0, ..., Tn) is just sugar. You may want to extend the rationale with a note about the benefits of this simplification.

    • Also please make a note of Tuple becoming a #[lang = "tuple"] item by attaching the attribute here.
  • On the other hand, this also means that the compiler is no longer free to apply layout optimizations where fields are reordered. E.g. today, the compiler is free to lay (A, B, C, D) out as A D C B. After introducing struct Tuple<H, T>, the compiler can no longer do that because it is now possible to take a reference to tup.tail.

    • See rust-lang/unsafe-code-guidelines#105 for a discussion.

    • As a consequence, this may inflate the memory footprint of using tuples and other perf characteristics as well.

    • This problem could however be solved by not exposing pub tail like this and not exposing a method to take a reference to tail either. Moreover, we might want a sort of attribute #[non_referenceable] on fields which would prevent you from taking a reference &tup.tail. This would allow the compiler to reorder as it sees fit. This would be beneficial in other circumstances where refactoring a data type into pieces is currently not zero cost, an unfortunate situation.

    • However, by not making the tail referenceable, you give up on certain transformations that frunk provides.
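To make the reordering point concrete, here is a small, self-contained illustration in today's Rust (the nested struct is only a stand-in for the proposed `Tuple`; the exact sizes assume a target where `u32` has alignment 4 and reflect current rustc behaviour, which is not guaranteed):

```rust
use std::mem::size_of;

// Stand-in for the proposed head/tail representation.
struct Nested<HEAD, TAIL> {
    head: HEAD,
    tail: TAIL,
}

fn main() {
    // Today the compiler may reorder tuple fields freely, so the u32 can be
    // placed first and the two u8s packed after it:
    assert_eq!(size_of::<(u8, u32, u8)>(), 8);

    // With a head/tail nesting, each level is laid out on its own, so the
    // padding inside the inner levels cannot be folded away:
    assert_eq!(size_of::<Nested<u8, Nested<u32, Nested<u8, ()>>>>(), 12);
}
```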

@Centril (Member) · May 23, 2019:

In a discussion with @oli-obk, they noted that #[non_referenceable] pub tail: Tail poses a problem with respect to implementing PartialEq and similar traits when you want to do it recursively and generally for all tuples.

@oli-obk (Contributor) · May 23, 2019:

More details:

impl PartialEq for () {
    fn eq(&self, other: &Self) -> bool {
        true
    }
}
impl<ELEM: PartialEq, TAIL: PartialEq> PartialEq for Tuple<ELEM, TAIL> {
    fn eq(&self, other: &Self) -> bool {
        self.elem == other.elem && self.tail == other.tail
    }
}

The self.tail == other.tail is essentially PartialEq::eq(&self.tail, &other.tail), which would violate #[non_referenceable].

@Woyten (Author) · May 23, 2019:

If I get this right, the compiler still has the flexibility to lay out Tuple<ELEM, TAIL> as either ELEM TAIL or TAIL ELEM. So (A, B, C, D) could become A B C D or B C D A or A C D B but not B A C D.

But, indeed, this is a hard restriction which might increase the memory footprint of every tuple.

This problem could be mitigated if the tuple representation was changed to a tree structure, e.g. Tuple<ELEM, LEFT, RIGHT>. In this way, the compiler could regain some control over the memory layout. In return, the compiler would need to match Tuple<Elem, Tail, ()> with Tuple<Elem, (), Tail>, wouldn't it? My first feeling is that this solution is bad just because it is not simple enough.

@Ixrec (Contributor) · May 23, 2019:

Maybe I misunderstood something in previous discussions, but it seemed like we already knew there's a fundamental choice we have to make here between:

  • the type lists used by variadic generics are identical to tuple types, and can compile away to nothing because tuples "don't exist at runtime" in a sense that rules out sub-tuples having addresses and being efficiently borrowable and so on
  • the type lists used by variadic generics are distinct from tuple types, so there is a certain amount of unfortunate duplication going on, but we get to make guarantees about tuple layout/addressability/etc

And any future variadic generics / generic tuples proposal would simply have to pick one of these and argue convincingly for it, but making up our minds on this was the main thing blocking progress on variadics. Is that no longer the case?

@Centril (Member) · May 23, 2019:

Well there's a third option:

  • Go ahead with this proposal either with #[non_referenceable] or without, and then never add "variadic generics".

@Woyten (Author) · May 23, 2019:

In my understanding, solving the variadic tuples problem is equivalent to solving the variadic generics problem. I would even go so far as to say that you do not need variadic generics if you have variadic tuples.

@Ixrec I am not aware that the variadic generics problem has a final solution yet; does it?

@Ixrec (Contributor) · May 23, 2019:

Yeah, it's definitely not solved yet. The whole point of my last comment was that any proposal for variadic tuples effectively is a proposal to solve the variadic generics problem. In other words, even if the proposed solution is simply that variadic tuples are enough, this needs to be made explicit, and then it has to be argued that "full" variadic generics are unnecessary or not worth it (afaik no one's made that argument before; maybe I could be convinced).

@Woyten (Author) · May 24, 2019:

I believe that variadic generics might not be strictly necessary since you could model every type Generic<A, ...> with variadic generics simply as Generic<T> where T: TraitThatIsValidForAllTuples. The transition to a real generic notation could be done using a syntactic sugar approach.


- The compiler needs to treat any type `(ELEM, TAIL.0, ..., TAIL.n-1)` to be equivalent to `Tuple<ELEM, (TAIL.0, ..., TAIL.n-1)>`. This could work in the same way as `std::io::Result<T>` is considered equivalent to `core::result::Result<T, std::io::Error>`.
- Equivalently, every tuple value `(elem, tail.0, ..., tail.n-1)` must be considered structurally equal to `Tuple { elem: elem, tail: (tail.0, ..., tail.n-1) }`.
- Every tuple index access `tuple.n` must evaluate to `tuple{{.tail}^n}.elem`. In other words, `.tail` must be called `n` times before calling `.elem`.
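As a worked example of these three rules, here is the desugaring written out by hand in today's Rust, with a user-defined struct standing in for the proposed `std::ops::Tuple`:

```rust
// Stand-in for the proposed `std::ops::Tuple`.
struct Tuple<ELEM, TAIL> {
    elem: ELEM,
    tail: TAIL,
}

fn main() {
    // The value `("hi", 42u32, true)` under the proposed desugaring:
    let tuple = Tuple {
        elem: "hi",
        tail: Tuple {
            elem: 42u32,
            tail: Tuple { elem: true, tail: () },
        },
    };

    // `tuple.0` becomes `.elem` directly:
    assert_eq!(tuple.elem, "hi");
    // `tuple.2` becomes `.tail` applied twice, then `.elem`:
    assert_eq!(tuple.tail.tail.elem, true);
}
```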

@Centril (Member) · May 23, 2019:

So essentially, you are moving tuples to the HIR lowering phases of the compiler and out of later phases. That turns tuples into essentially syntactic sugar. This is a nice simplification. On the other hand, this may also inflate compile times by giving the type checker and later phases larger HIR trees to work with.

My overall sense is that it is hard to answer both the run-time and compile-time perf questions without implementing this in a PR and testing it out. Thus, if we are going to accept this, it would be best to run some experiments and get data to inform these questions.


- The compiler needs to treat any type `(ELEM, TAIL.0, ..., TAIL.n-1)` to be equivalent to `Tuple<ELEM, (TAIL.0, ..., TAIL.n-1)>`. This could work in the same way as `std::io::Result<T>` is considered equivalent to `core::result::Result<T, std::io::Error>`.
- Equivalently, every tuple value `(elem, tail.0, ..., tail.n-1)` must be considered structurally equal to `Tuple { elem: elem, tail: (tail.0, ..., tail.n-1) }`.
- Every tuple index access `tuple.n` must evaluate to `tuple{{.tail}^n}.elem`. In other words, `.tail` must be called `n` times before calling `.elem`.

@Centril (Member) · May 23, 2019:

Can you also please discuss pattern matching? Please consider at least these cases:

  • let (a, b) = tup;
  • let (ref mut? a, b) = tup;
  • let (a, ..) = tup;
  • let (a, b, ..) = tup;
  • let (a, b, c @ ..) = tup; -- this is not allowed today, but could potentially be. This ties into questions about &tup.tail
  • let (a, b, ref c @ ..) = tup; -- same here re. &tup.tail; also not allowed today.
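For reference, here is how the first and third of these patterns could plausibly desugar, again written against a stand-in struct (how `..` and `c @ ..` should interact with `tail` is exactly the open question raised above):

```rust
// Stand-in for the proposed `std::ops::Tuple`.
struct Tuple<ELEM, TAIL> {
    elem: ELEM,
    tail: TAIL,
}

fn main() {
    // `let (a, b) = tup;`
    let tup = Tuple { elem: 1u8, tail: Tuple { elem: "two", tail: () } };
    let Tuple { elem: a, tail: Tuple { elem: b, tail: () } } = tup;
    assert_eq!((a, b), (1u8, "two"));

    // `let (a, ..) = tup;` could simply ignore the tail:
    let tup = Tuple { elem: 1u8, tail: Tuple { elem: "two", tail: () } };
    let Tuple { elem: a, tail: _ } = tup;
    assert_eq!(a, 1u8);
}
```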
# Rationale and alternatives
[rationale-and-alternatives]: #rationale-and-alternatives

The selling point of the proposed solution is that it is completely based on existing concepts. The syntax and type system remain unaffected. Hence, the implementation effort should be predictable and the risk of compromising the overall quality of the language should be low. A second benefit is the possibility of defining more advanced type mappings, e.g. `(A, B, C, ..., Z)` → `(B, A, C, ..., Z)`.

@Centril (Member) · May 23, 2019:

The type system is absolutely affected, but as I noted before it is simplified... ;)

Those points are mainly decisions to be made before implementing the RFC:

- How should compiler messages or the documentation be rendered? The printed output for `Tuple<A, Tuple<B, Tuple<C, ()>>>` must probably be mapped back to `(A, B, C)` for readability. But what if this reverse mapping is impossible as is the case for the generalized tuple `impl`s?
- What should the compiler do with nonsensical tuples? A nonsensical tuple is a `Tuple` whose `TAIL` parameter is not a tuple (e.g. `Tuple<String, String>`). It feels like the easiest and most idiomatic answer is that the compiler should not care and let the code run into a type error as soon as the tuple is used. Nevertheless, nonsensical tuples could be discovered and reported by `clippy`.

@Centril (Member) · May 23, 2019:

I agree with not doing anything about "nonsensical" tuples; seems like banning them just brings unjustified complication to the type system and undoes the nice simplification benefits your proposal brings.

@eaglgenes101 · May 23, 2019:

Rust's current type system is expressive enough to specify that a tuple cons-element can only have () or another tuple cons-element as its tail associated type, so a separate mechanism for checking the well-formedness of a tuple type list is not needed. (Custom diagnostics might be of help in that area, though.)

@varkor (Member) · May 27, 2019:

Can't we just have a closed trait IsTuple that is only implemented by valid tuples? That would be a simple modification, and it would avoid any extra complexity by not trying to ensure that Tuple itself is always well-formed.
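A rough sketch of that idea in terms of the proposed struct (the "closed"/"cannot be implemented downstream" part would still need compiler support or a sealing trick):

```rust
// Stand-in for the proposed `std::ops::Tuple`.
struct Tuple<ELEM, TAIL> {
    elem: ELEM,
    tail: TAIL,
}

// Only `()` and well-formed cons cells implement `IsTuple`, so a nonsensical
// type such as `Tuple<String, String>` simply fails the bound.
trait IsTuple {}
impl IsTuple for () {}
impl<ELEM, TAIL: IsTuple> IsTuple for Tuple<ELEM, TAIL> {}

fn takes_tuple<T: IsTuple>(_: T) {}
```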


- How should compiler messages or the documentation be rendered? The printed output for `Tuple<A, Tuple<B, Tuple<C, ()>>>` must probably be mapped back to `(A, B, C)` for readability. But what if this reverse mapping is impossible as is the case for the generalized tuple `impl`s?
- What should the compiler do with nonsensical tuples? A nonsensical tuple is a `Tuple` whose `TAIL` parameter is not a tuple (e.g. `Tuple<String, String>`). It feels like the easiest and most idiomatic answer is that the compiler should not care and let the code run into a type error as soon as the tuple is used. Nevertheless, nonsensical tuples could be discovered and reported by `clippy`.
- How should the `Tuple` struct look, precisely? Should it export globally visible symbols like `tuple.elem` or `tuple.elem()`, or should they be hidden behind a namespace, e.g. `Tuple::elem(tuple)`?

@Centril (Member) · May 23, 2019:

I think the larger question here is whether &tup.tail should be possible or not. Please add that to the list.


- How should compiler messages or the documentation be rendered? The printed output for `Tuple<A, Tuple<B, Tuple<C, ()>>>` must probably be mapped back to `(A, B, C)` for readability. But what if this reverse mapping is impossible as is the case for the generalized tuple `impl`s?
- What should the compiler do with nonsensical tuples? A nonsensical tuple is a `Tuple` whose `TAIL` parameter is not a tuple (e.g. `Tuple<String, String>`). It feels like the easiest and most idiomatic answer is that the compiler should not care and let the code run into a type error as soon as the tuple is used. Nevertheless, nonsensical tuples could be discovered and reported by `clippy`.
- How should the `Tuple` struct look, precisely? Should it export globally visible symbols like `tuple.elem` or `tuple.elem()`, or should they be hidden behind a namespace, e.g. `Tuple::elem(tuple)`?

@Centril (Member) · May 23, 2019:

It seems to me that a big question here is how much of frunk we want to add to the standard library.

@Woyten (Author) commented May 23, 2019:

@Centril Thank you very much for your feedback. I will address your comments and update this RFC once I find the time for it. 😊

@eaglgenes101 commented May 23, 2019:

This was essentially what I was thinking about proposing for variadic tuples, plus or minus some names and implementation details and such. I'll put my support behind it.

That said, my incubated proposal avoids specifying anything about tuple layout (mostly to avoid edge cases where alignment concerns cause tuples to be up to an order of magnitude larger than a similarly defined struct), and instead specifies a heterogeneous iteration mechanism, then augments tuples so that this heterogeneous iteration mechanism can be used to iterate over and destructure the tuple's fields by reference, by mut reference, or by move.

```rust
type () = (); // Not really an alias. Written down for completeness.
type (A,) = Tuple<A, ()>;
type (A, B) = Tuple<A, Tuple<B, ()>>;

@kennytm (Member) · May 24, 2019:

Unfortunately this desugaring is incompatible with the current unsizing rule.

  1. We need to allow ([u8],).
  2. We also need to allow (u8, [u8]).

If (A,) is desugared as Tuple<A, ()>, this means ELEM must be relaxed as ELEM: ?Sized.

If (A, B) is desugared as Tuple<A, (B,)>, this means TAIL must be relaxed as TAIL: ?Sized.

But we cannot have two unsized fields in a structure (struct Tuple<E: ?Sized, T: ?Sized> { head: E, tail: T }).

Therefore, the tuple desugaring must terminate at (A,) and cannot be further desugared to Tuple<A, ()>.

Alternatively, you could reverse the expansion direction, so that only the final field needs to be unsized.

struct Tuple<Init, Last: ?Sized> {
    init: Init,
    last: Last,
}

type () = ();
type (A,) = Tuple<(), A>;
type (A, B) = Tuple<Tuple<(), A>, B>;
type (A, B, C) = Tuple<Tuple<Tuple<(), A>, B>, C>;
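A quick check that the reversed nesting indeed keeps the unsized type in the final field (stand-in struct, not the actual desugaring):

```rust
// Stand-in for the reversed `Tuple` above.
struct Tuple<Init, Last: ?Sized> {
    init: Init,
    last: Last,
}

// `(u8, [u8])` would become `Tuple<Tuple<(), u8>, [u8]>`, which is a valid
// unsized struct because only its final field is unsized:
type U8ThenSlice = Tuple<Tuple<(), u8>, [u8]>;

fn takes_it(_: &U8ThenSlice) {}
```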

@Woyten (Author) · May 24, 2019:

Good point! Easy solution 👍

@camlorn commented May 24, 2019:

Some thoughts:

When I implemented the initial version of size layout optimizations, we discovered that this can let some very small tuples and structs fit into registers when making function calls, i.e. (u8, u16, u8) becomes (u16, u8, u8). No one did a detailed analysis of how much this mattered because it was an emergent behavior; the reaction was "huh, that's cool and it might help." Given Rust's increased adoption in the embedded space, this may have practical implications on someone at this point.

Dropping layout optimizations also means that sequences of tuples take a real hit, if the program likes to build containers containing tuples. In practice (I don't have the reference handy) I think the largest gain we saw from field reordering was 72 bytes. I would expect the gains for reordering tuples to be much less since you usually use a struct for a large number of fields, but it can be a surprisingly significant drop.

I don't think these optimizations can be swept under the rug at this point. It's one thing for Rust to deoptimize memory usage on a PC, but quite another for it to deoptimize memory usage on a small microcontroller, where the line between "runs" and "doesn't run" can be small enough that a few extra bytes matters.

With variadics, you can for example pass 10 4-to-8 byte arguments in registers on some platforms. With tuples you can't, short of complicated optimizations that are hard to reason about as an end user of the language. For instance the compiler could elide tuples constructed at call sites, but probably couldn't if they're stored in variables. You can, however, arguably get forward compatibility with variadics using current tuples by making the compiler understand that a call myVariadic(my_tuple) should desugar to myVariadic(my_tuple.0, my_tuple.1, ...). In practice this behavior would probably need to be behind a macro or special syntax because there's no way to distinguish between calling a variadic function with one tuple argument, or wanting to expand a tuple, but it allows for much of the same effect.

Either way, I don't think that tuples can replace variadics in practice because variadics have a lot more potential around being faster. Perhaps there's a clever way to implement this in the compiler so that the gap closes, but if there's not I don't expect this will put that discussion to rest.

@Woyten (Author) commented May 24, 2019:

@camlorn @Centril What if the compiler would not use Tuple<ELEM, TAIL> or Tuple<INIT, ELEM> as a representation but Tuple<LEFT, RIGHT, ELEM> which would be equivalent to (LEFT.0, ..., LEFT.n-1, ELEM, RIGHT.0, ..., RIGHT.m-1)?

The compiler could reorder an arbitrary tuple (A, D, B, C, G, E, F) to the desired memory layout (A, B, C, D, E, F, G) using the following strategy (basically quick sort 😄):

(A, D, B, C, G, E, F)
  ELEM = G
  LEFT = (A, D, B, C)
    ELEM = D
    LEFT = (A,)
      ELEM = A
      LEFT = ()
      RIGHT = ()
    RIGHT = (B, C)
      ELEM = C
      LEFT = (B,)
        ELEM = B
        LEFT = ()
        RIGHT = ()
      RIGHT = ()
  RIGHT = (E, F)
    ELEM = F
    LEFT = (E,)
      ELEM = E
      LEFT = ()
      RIGHT = ()
    RIGHT = ()

As ELEM is in the last position, unsized types should work as well. However, this approach would require at least one new language feature that enables matching TUPLE<(), SUB, ELEM> with TUPLE<SUB, (), ELEM> and so on.

This means, in particular, that when writing a generalized mapping function from, let's say, (u32, u16, u8) to (u8, u8, u8), we have to deal with the fact that the Tuple trees differ from each other and need to be mapped at the value level. After all, I strongly believe that this structural mapping needs to be implemented anyway at some point in the future, no matter what the solution for the variadic tuples/generics problem will be.
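A stand-in for the tree-shaped representation sketched here; keeping `ELEM` in the last position is what keeps unsized element types working:

```rust
// Hypothetical tree-shaped tuple node: (LEFT.0, ..., ELEM, RIGHT.0, ...).
struct Tuple<LEFT, RIGHT, ELEM: ?Sized> {
    left: LEFT,
    right: RIGHT,
    elem: ELEM,
}

// `(usize, String, char)` could then be represented as the tree below, which
// matches the example discussed a few comments further down:
type Example = Tuple<Tuple<(), (), usize>, Tuple<(), (), char>, String>;
```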

@ExpHP commented May 24, 2019:

What if the compiler would not use Tuple<ELEM, TAIL> or Tuple<INIT, ELEM> as a representation but Tuple<LEFT, RIGHT, ELEM> which would be equivalent to (LEFT.0, ..., LEFT.n-1, ELEM, RIGHT.0, ..., RIGHT.m-1)?

That sounds like it has nasty implications for parametricity.

fn func<A, B, C>(tup: (A, B, C)) {
    // Does this have type A, B, or C?
    let x = get_elem(tup);
}

fn get_elem<L, R, E>(tup: Tuple<L, R, E>) -> E {
    tup.elem
}
@Woyten (Author) commented May 24, 2019:

@ExpHP The compiler would not accept tup.2 since tup being Tuple<L, R, E> is not a well-formed tuple.

On the other hand, if tup was Tuple<Tuple<(), (), usize>, Tuple<(), (), char>, String>, then tup.2 would evaluate to char.

@ExpHP commented May 24, 2019:

I meant .elem, I edited the post.

@comex commented May 25, 2019:

You can accomplish most of the same tasks using a tuple trait. Using a trait (which the compiler could automatically implement for tuple types) would avoid the need to define tuples as having a particular structure, and also opens the door to more flexibility in the future, especially regarding references.

Here is a playground link where I've implemented the three examples in the RFC in today's Rust; I had to use a macro to define two base traits (HeadTail and Prepend) for different sizes of tuples, but Join, Last, and Halve themselves are implemented without any macros. However, Last does not actually compile, because rustc thinks the two impls could conflict even though they don't. That seems like it should be possible to improve.

With GATs, instead of having a HeadTail trait and a Prepend trait, you could pack everything into one trait:

trait Tuple {
    type Head;
    type Tail;
    fn head(self) -> Self::Head;
    fn tail(self) -> Self::Tail;

    type Prepend<N>;
    fn prepend<N>(self, n: N) -> Self::Prepend<N>;

    // more operations...
}

Edit: Actually, that's not great because it would imply defining a head and tail for (). What we really need is some way to express disjunction: a tuple either has a head and a tail, or is (). Ideally it would be possible to write a separate impl of a trait for each case, and then be able to convince the compiler that you've impled the trait for all T: Tuple.

Anyway, if you read my implementation, the obvious downside is the stuttering:

impl<T> Join for T where
    T: Copy,
    T: HeadTail,
    <T as HeadTail>::Head: Join,
    <T as HeadTail>::Tail: Join,
    <<T as HeadTail>::Tail as Join>::Joined: Prepend<<<T as HeadTail>::Head as Join>::Joined> {
    type Joined = <<<T as HeadTail>::Tail as Join>::Joined as Prepend<<<T as HeadTail>::Head as Join>::Joined>>::Prepended;

However, this could potentially be mitigated in a few different ways:

  • If the (already accepted) implied bounds RFC is ever implemented, especially if it's extended to type aliases, you could get most of those bounds introduced implicitly.
  • Even without that, use of type aliases could make the above snippet much shorter.
  • rustc can get smarter about inferring the appropriate trait for type projections. Currently, you can write T::Foo and have rustc guess which trait Foo belongs to based on impls in scope, but if you have a nested projection like T::Foo::Bar, rustc requires explicitly specifying the trait for Bar.
  • edit: With GATs, it would be enough to have a T: Tuple bound (which could be put on the associated type definition) and then be able to use operations like Prepend without needing to add a separate bound.

Edit: And regardless of what happens with tuples, it would be nice to make that sort of computation using associated types more ergonomic.

@KrishnaSannasi commented May 25, 2019:

You can accomplish most of the same tasks using a tuple trait. Using a trait (which the compiler could automatically implement for tuple types) would avoid the need to define tuples as having a particular structure, and also opens the door to more flexibility in the future, especially regarding references.

This seems like a great idea. One issue that people may have with it is that it allows other types to masquerade as tuples, but I think that is fine.

Edit: Actually, that's not great because it would imply defining a head and tail for (). What we really need is some way to express disjunction: a tuple either has a head and a tail, or is (). Ideally it would be possible to write a separate impl of a trait for each case, and then be able to convince the compiler that you've impled the trait for all T: Tuple.

This can be solved with more traits (yay)

trait Tuple {}

trait NonEmpty: Tuple {
    type Head;
    type Tail: Tuple;

    fn head(self) -> Self::Head;
    fn tail(self) -> Self::Tail;
    fn split_off(self) -> (Self::Head, Self::Tail);
}

edit: With GATs, it would be enough to have a T: Tuple bound (which could be put on the associated type definition) and then be able to use operations like Prepend without needing to add a separate bound.

If we don't want to wait for GATs we can have

trait Prepend<T>: Tuple {
    type Joined: Tuple;
    
    fn prepend(self, value: T) -> Self::Joined;
}
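As a sanity check, this `Prepend` shape already works in today's Rust if the impls are written out by hand for a couple of arities (a minimal, hypothetical version):

```rust
trait Tuple {}
impl Tuple for () {}
impl<A> Tuple for (A,) {}
impl<A, B> Tuple for (A, B) {}

trait Prepend<T>: Tuple {
    type Joined: Tuple;
    fn prepend(self, value: T) -> Self::Joined;
}

impl<T> Prepend<T> for () {
    type Joined = (T,);
    fn prepend(self, value: T) -> (T,) { (value,) }
}

impl<T, A> Prepend<T> for (A,) {
    type Joined = (T, A);
    fn prepend(self, value: T) -> (T, A) { (value, self.0) }
}

fn main() {
    assert_eq!(("b",).prepend("a"), ("a", "b"));
}
```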

Edit: And regardless of what happens with tuples, it would be nice to make that sort of computation using associated types more ergonomic.

Yes, I think chalk will bring in most of those improvements. I would especially love to have some sort of delayed bounds.


The only problem is, how do we implement algorithms with just traits?

@Aaron1011 commented May 25, 2019:

One issue that people may have with is it, is that it allows other types to masquerade as tuples, but I think that is fine.

The compiler could simply prevent any manual implementations of the Tuple trait. Such a restriction could always be relaxed in the future, without breaking backwards compatibility.

@KrishnaSannasi commented May 25, 2019:

Thinking about this some more, with @Aaron1011's idea of limiting who can implement the trait, and with the NonEmpty trait we can implement algorithms recursively.

trait Algo {
    fn do_work(self);
}

impl Algo for () {
    fn do_work(self) {}
}

impl<T: NonEmpty> Algo for T {
    fn do_work(self) {
        let (head, tail) = self.split_off();
        do_work_with_head(head);
        tail.do_work() // no need for a `T::Tail: Algo` bound because `NonEmpty` and `()` are both covered
    }
}
@camlorn commented May 26, 2019:

@Woyten and everyone else:

Unless something has significantly changed since I did my work on the compiler (it has been a very long time as these things go), making a tuple a tree is going to be just as hard to optimize as making a tuple list-like (there are proper terms for these, but I haven't done serious functional programming in a long time). The compiler is written around the assumption that a type built out of other types will in some way combine the layouts of those lower levels. There's been some movement away from that, for example the niche optimizations, but something this fundamental would be a pretty big change. I believe tuples are actually the same variant of ty::layout as structs, and that the rest of the pipeline after that gets to treat them as effectively the same--that is to say, either we can optimize both nested structs and this representation of tuples, or neither, and since we can't do this optimization on nested structs because of pretty bad performance implications and borrowing, we can't do it at all (specifically, if a.b is itself a struct, &a.b can't be borrowed if we optimize layout using a global analysis that looks at all fields of all nested types).

Someone more up to date on the compiler's internal workings should perhaps chime in here.

There's a lot of paths forward without these downsides:

  1. Many traits can be implemented by implementing the trait for one item of the tuple, then effectively saying "for all items do...". @comex's proposal can be made to do this, and it could be given a convenient syntax (i.e. a macro in std).
  2. The type system can be extended (@comex again). How is an interesting discussion that I don't have the background for, but most of the proposals around that bring other interesting benefits in addition to "we fixed tuples" and are worth it for that reason in my opinion.
  3. One of the most common uses for this that I've seen is being able to write generic future combinators, which are better served by variadics anyway (because then you don't have to wrap everything in an extra set of parens).

In general I favor approaches that don't deoptimize memory to make this work. Having to do some extra copies makes a program slower; having to use more memory makes a program run out of memory. In hindsight, someone should probably have raised concerns about never being able to undo field reordering once it was stabilized, but we are nonetheless in a state where I would be hesitant to do that; CPU is a more bountiful resource than memory, in the sense that if you don't have enough CPU you run slower, whereas if you don't have enough memory you can't run at all. Also, reordering tuple fields to fix your memory issue is not nearly so easy as reordering struct fields, since the order of tuple fields is their names--that is to say, if we disabled this for structs we could fix affected programs by changing the struct definition, but for anything else that might suddenly break or balloon in memory usage, someone has to find all the places it's used too.

No one will pay for the extra copying, should it be necessary, on day one either. It'll only be in new APIs that use the new feature.

For borrowing the tail of tuples, I think that we should disallow it unless it's the last item. This may be technically infeasible. But if we don't, it's a weird asymmetry in a lot of these proposals, since in effect you can probably only borrow the tail in some places (one unique advantage of this specific RFC is that I think it would work everywhere).

@SimonSapin (Contributor) commented May 26, 2019:

I like the direction of a trait rather than a struct. However, would it require a special case in the impl coherence rules so that impl<T> SomeTrait for T where T: Tuple is not considered to conflict with impl SomeTrait for SomeUpstreamType? I think it currently would, since the dependency crate could later add impl Tuple for SomeUpstreamType.

To avoid the need to borrow the tail of a tuple (and the memory layout constraints that implies) we could:

  • Only provide decomposition into head and tail that move / take ownership of the input tuple rather than borrow it. This requires GATs with lifetime parameters.
  • Have projections from references to a tuple to a tuple of references, similar to Option::as_ref and Option::as_mut.

API sketch:

pub trait Tuple {
    type AsRef<'a>;
    type AsMut<'a>;
    fn as_ref<'a>(&'a self) -> Self::AsRef<'a>;
    fn as_mut<'a>(&mut 'a self) -> Self::AsRef<'a>;

    type First;
    type Rest;
    fn split_first(self) -> Option<(Self::First, Self::Rest)>;
}

// Compiler-generated for all tuple sizes:

impl Tuple for () {
    type AsRef<'a> = ();
    type AsMut<'a> = ();
    fn as_ref<'a>(&'a self) -> Self::AsRef<'a> { () }
    fn as_mut<'a>(&mut 'a self) -> Self::AsRef<'a> { () }

    type First = !;
    type Rest = !;
    fn split_first(self) -> Option<(Self::First, Self::Rest)> { None }
}
...
impl<A> Tuple for (A,) {
    type AsRef<'a> = (&'a A);
    type AsMut<'a> = (&'a mut A);
    fn as_ref<'a>(&'a self) -> Self::AsRef<'a> { (&self.0) }
    fn as_mut<'a>(&mut 'a self) -> Self::AsRef<'a> { (&mut self.0) }

    type First = A;
    type Rest = ();
    fn split_first(self) -> Option<(Self::First, Self::Rest)> { Some((self.0, ())) }
}

impl<A, B> Tuple for (A, B) {
    type AsRef<'a> = (&'a A, &'a B);
    type AsMut<'a> = (&'a mut A, &'a mut B);
    fn as_ref<'a>(&'a self) -> Self::AsRef<'a> { (&self.0, &self.1) }
    fn as_mut<'a>(&mut 'a self) -> Self::AsRef<'a> { (&mut self.0, &mut self.1) }

    type First = A;
    type Rest = (B,);
    fn split_first(self) -> Option<(Self::First, Self::Rest)> { Some((self.0, (self.1,))) }
}

// etc.

Alternative where () doesn’t implement the trait:

pub trait NonEmptyTuple {
    // …
    fn split_first(self) -> (Self::First, Self::Rest)
}
@KrishnaSannasi commented May 26, 2019:

I think that having one trait is better; using an Option is good enough, since this trait will be useless as a trait object, and split_first will be inlined so the Option will be optimized away.

@SimonSapin you put &mut 'a instead of &'a mut

@kennytm (Member) commented May 26, 2019:

Could this RFC (whether struct or trait approach) be applicable to function pointers?

  • fn(A, B, C, D) -> R
  • unsafe fn(A, B, C, D) -> R
  • extern fn(A, B, C, D) -> R
  • unsafe extern fn(A, B, C, D) -> R
  • extern fn(A, B, C, D, ...) -> R
  • unsafe extern fn(A, B, C, D, ...) -> R

What about the Fn/FnMut/FnOnce traits/objects?

@comex commented May 26, 2019:

I like the direction of a trait rather than a struct. However, would it require a special case in a impl coherence rules so that impl<T> SomeTrait for T where T: Tuple is not considered to conflict with impl SomeTrait for SomeUpstreamType? I think it currently would, in case the dependency crate would later add impl Tuple for SomeUpstreamType.

Good point. Something like this old proposal for "sealed traits" could work – a way to mark a trait as not being implementable outside of the crate it's defined in, which would then allow more relaxed coherence rules.
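For reference, the "sealed trait" pattern already expressible in today's Rust looks roughly like this (written here against ordinary tuples; the per-arity impls would presumably be compiler- or macro-generated):

```rust
mod private {
    // Downstream crates cannot name this trait, so they cannot implement it.
    pub trait Sealed {}
    impl Sealed for () {}
    impl<A> Sealed for (A,) {}
    impl<A, B> Sealed for (A, B) {}
    // ... one impl per supported arity.
}

/// Usable in bounds everywhere, implementable only in this crate.
pub trait Tuple: private::Sealed {}
impl<T: private::Sealed> Tuple for T {}
```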

@comex commented May 26, 2019:

What about the Fn/FnMut/FnOnce traits/objects?

We could change the Fn* traits to add Tuple bounds, e.g.

trait Fn<Args> where Args: Tuple

This would not be a breaking change, because you can't write Fn<Args> on stable; you have to use the Fn(Arg, Arg) sugar, which guarantees that the generic parameter is a tuple.

However, to be able to recurse on the Args parameter in a generic context would require some way to express disjunction, as I mentioned before. You need to be able to write separate impls of your trait for non-empty tuples and for (), and then somehow convince the compiler that your trait is necessarily impl'ed by Args given that Args: Tuple.

One potential approach could be based on associated constants. Something like:

enum TupleKind {
    Empty,
    NonEmpty
}
trait Tuple {
    const KIND: TupleKind;
}
impl<T> Foo for T where T: Tuple<KIND=TupleKind::Empty> { ... }
impl<T> Foo for T where T: Tuple<KIND=TupleKind::NonEmpty> { ... }

The compiler would have to add support for referencing associated constants with Trait<Foo=Bar> syntax, not just associated types, and additionally be able to tell that the impls together cover all possible variants of TupleKind.

Alternately, there have been some proposals for "mutually exclusive traits".

A third approach could be to add type inequality constraints, but I don't think those would be as easy to implement as they seem: they run into the same coherence issues as specialization.

@eaglgenes101 commented May 27, 2019:

I like the direction of a trait rather than a struct. However, would it require a special case in a impl coherence rules so that impl<T> SomeTrait for T where T: Tuple is not considered to conflict with impl SomeTrait for SomeUpstreamType? I think it currently would, in case the dependency crate would later add impl Tuple for SomeUpstreamType.

Good point. Something like this old proposal for "sealed traits" could work – a way to mark a trait as not being implementable outside of the crate it's defined in, which would then allow more relaxed coherence rules.

Trait specialization is coming around as part of the 2019 roadmap; perhaps it might be the tool at hand that we use to make this work when it comes time for implementation?

(Chalk should be able to reason that either the sentinel unit type does implement the type list trait, and thus its implementation unambiguously specializes the type list trait's, or that it doesn't, and thus there is no overlap. In either case, the types as declared are okayed by the type checker, and the trait implementation written specifically for the sentinel unit type applies to that type.)

@comex commented May 27, 2019:

Specialization could work for some things, but as it stands today it has limitations. If you write <Foo as Trait>::Bar, the compiler will usually 'normalize' the projection by replacing it with whatever Bar is defined as in whichever Trait impl it finds for Foo. But if that impl only defines a default type, the compiler intentionally leaves it un-normalized. This is meant to preserve forwards compatibility with hypothetical specialized impls being added in the future; unfortunately, it also makes it impossible to do type-level computation with it, which severely limits what you can accomplish. In other words, if you ask "does a specialized impl apply", the compiler can only answer "yes" and "maybe", not "no". Perhaps this could be improved in the future, but I'm not exactly sure how.

...But in any case, the post you quoted wasn't about conflicts between nonempty tuples and (), but about conflicts in downstream crates between impls of the same trait for tuples and for unrelated types, because the compiler doesn't want to rule out that you could impl Tuple for UnrelatedType. Specialization could serve as a workaround there too, but it's annoying to force on downstream crates, and for some use cases you'd need the (proposed but not implemented) "lattice rule".

    pub elem: ELEM,
    pub tail: TAIL,
}
```

@gnzlbg (Contributor) · May 29, 2019:

Is this compatible with the layout specified in the unsafe code guidelines?

an anonymous tuple type (T1..Tn) of arity N is laid out "as if" there were a corresponding tuple struct declared in libcore:

#[repr(Rust)]
struct TupleN<P1..Pn:?Sized>(P1..Pn);

Note that this specifies that the layout of a tuple does not match that of the struct this RFC proposes, but of a tuple-struct.

This allows the compiler to perform some optimizations, like field re-ordering, but this RFC does not mention anything about this particular trade-off. I'd recommend scanning the UCGs repo for trade-offs and extending the RFC with how this change would affect those.

@frehberg commented May 29, 2019:

Nice! I must admit, I implemented a "sequence" crate https://crates.io/crates/seq, but never thought about it as a generalization of "tuples".

As this proposed idea of "tuple" is very close to the head/tail concept known from functional languages, would the proposed tuple concept in Rust also permit extending an N-tuple by prepending another element to it, forming an (N+1)-tuple made of a head and a tail, where the tail would reference the N-tuple?
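A minimal sketch of what the asked-for prepend would look like with the head/tail representation (stand-in struct; the tail is moved by value here, and whether it could instead reference the N-tuple ties into the borrowing questions discussed above):

```rust
// Stand-in for the proposed `std::ops::Tuple`.
struct Tuple<ELEM, TAIL> {
    elem: ELEM,
    tail: TAIL,
}

fn prepend<H, T>(head: H, tail: T) -> Tuple<H, T> {
    Tuple { elem: head, tail }
}

fn main() {
    let pair = Tuple { elem: 2u8, tail: Tuple { elem: "three", tail: () } };
    // An (N+1)-tuple whose tail is the original N-tuple:
    let triple = prepend(1u8, pair);
    assert_eq!(triple.elem, 1u8);
    assert_eq!(triple.tail.elem, 2u8);
}
```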

@matthieu-m commented May 29, 2019:

Back in the day, prior to C++11, Boost.Tuple was developed using a similar ConsList strategy.

It unlocked a number of usecases, however it also had a number of drawbacks:

  • The template instantiation recursion limit would pop up relatively frequently, as all algorithms had to be expressed as recursions with an accumulator.
  • Said template instantiation recursion limit was a hint that compile times were suffering, and they did.
  • Said recursive algorithms with accumulator were not really straightforward to write, debug or read.

The addition of variadic templates to C++11 was a very welcome change. It did not clean up all the cruft, but it did speed up compilation and lift quite a few limitations.


This modest proposal is interesting, from a hacker/minimalist POV, however I cannot help but think back to my C++ experience and wonder:

  • Is this proposal not going to suffer from the same drawbacks that Boost.Tuple did?
  • Is this proposal going to be sufficient, or will it have to be replaced by a better handling of variadic generics?
  • If this proposal will have to be replaced anyway, is the intermediate step necessary or should we just go with the next step immediately?
@eaglgenes101 commented May 29, 2019:

One could layer an inductive destructuring mechanism over such a base, but with that mechanism in place, the inductive concrete tuple definition becomes largely redundant and can be replaced by methods that inductively destructure to references, mutable references, or moved values.

@spunit262 commented May 30, 2019:

I don't have any experience with C++11's variadic templates, but @matthieu-m's comment reminded me of D's variadic templates and static foreach.
So what do people think of this quick mock up?

impl<T @ (..: Debug, _: ?Sized + Debug,)> Debug for T {
    fn fmt(&self, f: &mut Formatter) -> Result<(), Error> {
        let mut builder = f.debug_tuple("");
        static for v in self {
            builder.field(&v);
        }
        builder.finish()
    }
}