New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Draft RFC: variadic generics #376

Open
rust-highfive opened this Issue Oct 8, 2014 · 54 comments

Comments

Projects
None yet
@rust-highfive

Issue by eddyb
Monday Oct 28, 2013 at 17:39 GMT

For earlier discussion, see rust-lang/rust#10124

This issue was labelled with: B-RFC in the Rust repository


The Problem

  • fn types need a reform, and being able to define a trait with a variable number of type parameters would help
  • working with functions which have a variable number of arguments is impossible right now (e.g. a generic bind method, f.bind(a, b)(c) == f(a, b, c)) and defining such functions may only be done (in a limited fashion) with macros
  • tuples also have similar restrictions right now, there is no way to define a function which takes any tuple and returns the first element, for example

The Solution: Part One

C++11 sets a decent precedent, with its variadic templates, which can be used to define type-safe variadic functions, among other things.
I propose a similar syntax, a trailing ..T in generic formal type parameters:

type Tuple<..T> = (..T); // the parens here are the same as the ones around a tuple.
// use => expansion
Tuple<> => ()
Tuple<int> => (int,)
Tuple<A, B, C> => (A, B, C)

The simple example above only uses ..T, but not T itself.
The question which arises is this: what is T? C++11 has a special case for variadic parameter packs, but we can do better.
We have tuples. We can use them to store the actual variadic generic type parameters:

// Completely equivalent definition:
type Tuple<..T> = T;
// Detailed expansion of (..T) from above:
Tuple<> => (..()) => ()
Tuple<int> => (..(int,)) => (int,)
Tuple<A, B, C> => (..(A, B, C)) => (A, B, C)

The Solution: Part Two

Now that we know the answer is "tuples", everything else is about extending them.
The prefix .. operator would expand a tuple type:

// in another tuple type:
type AppendTuple<T, U> = (..T, U);
type PrependTuple<T, U> = (U, ..T);
AppendTuple<PrependTuple<Tuple<B>, A>, C> => (A, B, C)
// in a fn type's parameter list:
type VoidFn<..Args> = fn(..Args);
VoidFn<A, B, C> => fn(A, B, C);
// in a variadic generic parameter list:
Tuple<..(A, B, C), ..(D, E, F)> => (A, B, C, D, E, F)

Then we can do the same thing with values:

// in another tuple:
(..(true, false), ..(1, "foo"), "bar") => (true, false, 1, "foo", "bar")
// in a function call:
f(..(a, b), ..(c, d, e), f) => f(a, b, c, d, e, f)

There's only one piece missing: we're still not able to define a function which takes a variable number of arguments.
For this, I propose ..x: T (where T is a tuple type) in a pattern, which can be used to "capture" multiple arguments when used in fn formal arguments:

// Destructure a tuple.
let (head, ..tail) = foo;
// This function has the type fn(int, A, B, C), and can be called as such:
fn bar(_: int, ..x: (A, B, C)) -> R {...}

A type bound for ..T (i.e. impl<..T: Trait>) could mean that the tuple T has to satisfy the bound (or each type in T, but that's generally less useful).

Examples:

// The highly anticipated Fn trait.
trait Fn<Ret, ..Args> {
    fn call(self, .._: Args) -> Ret;
}
impl Fn<Ret, ..Args> for fn(..Args) -> Ret {
    fn call(self, ..args: Args) -> Ret {
        self(..args)
    }
}
impl Fn<Ret, ..Args> for |..Args| -> Ret {
    fn call(self, ..args: Args) -> Ret {
        self(..args)
    }
}

// Cloning tuples (all cons-list-like algorithms can be implemented this way).
impl<Head, ..Tail: Clone> Clone for (Head, ..Tail) {
    fn clone(&self @ &(ref head, ..ref tail)) -> (Head, ..Tail) {
        (head.clone(), ..tail.clone())
    }
}
impl Clone for () {
    fn clone(&self) -> () {
        ()
    }
}

// Restricting all types in a variadic tuple to one type.
trait ArrayLikeTuple<T> {
    fn len() -> uint;
}
impl<T> ArrayLikeTuple<T> for () {
    fn len() -> uint {0}
}
impl<T, ..Tail: ArrayLikeTuple<T>> ArrayLikeTuple<T> for (T, ..Tail) {
    fn len() -> uint {
        1 + Tail::len()
    }
}

// And using that trait to write variadic container constructors.
impl<T> Vec<T> {
    fn new<..Args: ArrayLikeTuple<T>>(..args: Args) -> Vec<T> {
        let v: Vec<T> = Vec::with_capacity(Args::len());
        // Use an unsafe move_iter-like pattern to move all the args
        // directly into the newly allocated vector. 
        // Alternatively, implement &move [T] and move_iter on that.
    }
}

// Zipping tuples, using a `Tuple` kind and associated items.
trait TupleZip<Other>: Tuple {
    type Result: Tuple;
    fn zip(self, other: Other) -> Result;
}
impl TupleZip<()> for () {
    type Result = ();
    fn zip(self, _: ()) -> () {}
}
impl<A, ATail: TupleZip<BTail>, B, BTail: Tuple> TupleZip<(B, ..BTail)> for (A, ..ATail) {
    type Result = ((A, B), ..ATail::Result);
    fn zip(self @ (a, ..a_tail), (b, ..b_tail): (B, ..BTail)) -> Result {
        ((a, b), ..a_tail.zip(b_tail))
    }
}

Todo

  • better formal descriptions
  • figure out the details of pattern matching
  • more examples

@aturon aturon changed the title from [CLOSED] RFC: variadic generics to Draft RFC: variadic generics Oct 8, 2014

@aturon aturon referenced this issue Oct 8, 2014

Open

Array splicing #374

@eddyb

This comment has been minimized.

Show comment
Hide comment
@eddyb

eddyb Oct 9, 2014

Member

There is one issue with my initial approach: a reference to a "subtuple" of a larger tuple could have different padding than the tuple type made out of those fields.

One solution I may have is a &(ref head, ..ref tail) pattern turning &(A, B, C, D) into &A and (&B, &C, &D), i.e. a subtuple of references instead of a reference of a subtuple.
But I have no idea how inference or generics would work with it.

Member

eddyb commented Oct 9, 2014

There is one issue with my initial approach: a reference to a "subtuple" of a larger tuple could have different padding than the tuple type made out of those fields.

One solution I may have is a &(ref head, ..ref tail) pattern turning &(A, B, C, D) into &A and (&B, &C, &D), i.e. a subtuple of references instead of a reference of a subtuple.
But I have no idea how inference or generics would work with it.

@alexchandel

This comment has been minimized.

Show comment
Hide comment
@alexchandel

alexchandel Jan 23, 2015

This is probably a prerequisite for taking the Cartesian product of iterators without a macro, like in rust-itertools/iproduct! and bluss/rust-itertools#10.

This is probably a prerequisite for taking the Cartesian product of iterators without a macro, like in rust-itertools/iproduct! and bluss/rust-itertools#10.

@Diggsey

This comment has been minimized.

Show comment
Hide comment
@Diggsey

Diggsey Jan 24, 2015

Contributor

This should be expanded to also support using the ".." operator on fixed size arrays (behaves equivalent to a tuple where the elements are all the same type and the arity matches the length of the array).

In combination with #174 this would make it much easier to deal with variadic arguments of the same type. Something which even C++ is still missing.

Contributor

Diggsey commented Jan 24, 2015

This should be expanded to also support using the ".." operator on fixed size arrays (behaves equivalent to a tuple where the elements are all the same type and the arity matches the length of the array).

In combination with #174 this would make it much easier to deal with variadic arguments of the same type. Something which even C++ is still missing.

@alexchandel

This comment has been minimized.

Show comment
Hide comment
@alexchandel

alexchandel Feb 4, 2015

This would need to be compatible with range syntax.

This would need to be compatible with range syntax.

@gnzlbg

This comment has been minimized.

Show comment
Hide comment
@gnzlbg

gnzlbg Feb 27, 2015

Contributor

How would one implement the cartesian product of two type tuples with this?

Contributor

gnzlbg commented Feb 27, 2015

How would one implement the cartesian product of two type tuples with this?

@eddyb

This comment has been minimized.

Show comment
Hide comment
@eddyb

eddyb Feb 27, 2015

Member

@gnzlbg what exactly do you have in mind, Product<(A, B), (X, Y)> = ((A, X), (A, Y), (B, X), (B, Y))?

Member

eddyb commented Feb 27, 2015

@gnzlbg what exactly do you have in mind, Product<(A, B), (X, Y)> = ((A, X), (A, Y), (B, X), (B, Y))?

@gnzlbg

This comment has been minimized.

Show comment
Hide comment
@gnzlbg

gnzlbg Feb 27, 2015

Contributor

@eddyb yes, but with variadics.

Contributor

gnzlbg commented Feb 27, 2015

@eddyb yes, but with variadics.

@goertzenator

This comment has been minimized.

Show comment
Hide comment
@goertzenator

goertzenator Feb 27, 2015

Having wrestled with c++11 templates and Rust macros, I really like this RFC. I hope it comes to fruition sooner rather than later.

Having wrestled with c++11 templates and Rust macros, I really like this RFC. I hope it comes to fruition sooner rather than later.

@oblitum

This comment has been minimized.

Show comment
Hide comment
@oblitum

oblitum Feb 27, 2015

Me too, I really wished this to come before 1.0 and solved tuple impl etc.

oblitum commented Feb 27, 2015

Me too, I really wished this to come before 1.0 and solved tuple impl etc.

@dgrunwald

This comment has been minimized.

Show comment
Hide comment
@dgrunwald

dgrunwald Feb 27, 2015

Contributor

Unfortunately this syntax doesn't work, because the .. operator is already taken by range syntax. We could avoid this problem by using triple dots like C++. Which are also already taken (by inclusive range syntax), but only as infix operator and only in pattern syntax (though there were some requests to allow triple dots in expressions, too).

For the general idea: 👍, I was imagining variadic generics using pretty much the same approach. However, the devil is in the details that aren't yet worked out. C++ resolves expressions involving parameter pack expansion only after monomorphization; but I assume Rust would want to type-check the generic code? Wouldn't that require restrictions on where the unpacking operator is used? Also, consuming variadics in C++ usually involves a recursive function using template specialization. With this proposal, it seems like you'd need to create a trait for every variadic function, so that you can "specialize" using trait impls as in the Clone example?

Also, the ... operator in C++ allows "unpacking" complex expressions, not just directly unpacking the parameter pack. For example, in C++11 I can do:

template <typename... T> void consumer(T... t) { ... }

template <typename... T> void adapter(T... t)
{
   consumer(t.adapt()...);
   // expands to:
   // consumer(t1.adapt(), t2.adapt(), t3.adapt(), ...);
}

What's the easiest way to write the 3-line adapter function in Rust with this RFC? Do I have to write recursive Adapt implementations like the Clone impls in the example?

As for Vec::new: I think in cases where we don't actually need a type parameter pack, but just a variadic function with a single parameter type, it might be better to support something like this:
fn new(..elements: [T]) -> Vec<T> { ... }
This requires being able to pass unsized types as parameters - but that seems doable (and might already be planned?).

Contributor

dgrunwald commented Feb 27, 2015

Unfortunately this syntax doesn't work, because the .. operator is already taken by range syntax. We could avoid this problem by using triple dots like C++. Which are also already taken (by inclusive range syntax), but only as infix operator and only in pattern syntax (though there were some requests to allow triple dots in expressions, too).

For the general idea: 👍, I was imagining variadic generics using pretty much the same approach. However, the devil is in the details that aren't yet worked out. C++ resolves expressions involving parameter pack expansion only after monomorphization; but I assume Rust would want to type-check the generic code? Wouldn't that require restrictions on where the unpacking operator is used? Also, consuming variadics in C++ usually involves a recursive function using template specialization. With this proposal, it seems like you'd need to create a trait for every variadic function, so that you can "specialize" using trait impls as in the Clone example?

Also, the ... operator in C++ allows "unpacking" complex expressions, not just directly unpacking the parameter pack. For example, in C++11 I can do:

template <typename... T> void consumer(T... t) { ... }

template <typename... T> void adapter(T... t)
{
   consumer(t.adapt()...);
   // expands to:
   // consumer(t1.adapt(), t2.adapt(), t3.adapt(), ...);
}

What's the easiest way to write the 3-line adapter function in Rust with this RFC? Do I have to write recursive Adapt implementations like the Clone impls in the example?

As for Vec::new: I think in cases where we don't actually need a type parameter pack, but just a variadic function with a single parameter type, it might be better to support something like this:
fn new(..elements: [T]) -> Vec<T> { ... }
This requires being able to pass unsized types as parameters - but that seems doable (and might already be planned?).

@alexchandel

This comment has been minimized.

Show comment
Hide comment
@alexchandel

alexchandel Mar 3, 2015

Why not the dereference operator?

type AppendTuple<T, U> = (*T, U);
f(*(a, b), *(c, d, e), f) => f(a, b, c, d, e, f);

Why not the dereference operator?

type AppendTuple<T, U> = (*T, U);
f(*(a, b), *(c, d, e), f) => f(a, b, c, d, e, f);
@gnzlbg

This comment has been minimized.

Show comment
Hide comment
@gnzlbg

gnzlbg Mar 3, 2015

Contributor

@dgrunwald :

Also, consuming variadics in C++ usually involves a recursive function using template specialization.

In C++1y there are fold-expressions that allow you to do a lot of things without recursion:

template<typename... Args>
bool all(Args... args) { return (... && args); }

bool b = all(true, true, true, false);

but I assume Rust would want to type-check the generic code? Wouldn't that require restrictions on where the unpacking operator is used?

You can easily type check that all parameters in the parameter pack implement a given trait. If one needs more complicated type-checking a variadic function is likely not the best solution.

Contributor

gnzlbg commented Mar 3, 2015

@dgrunwald :

Also, consuming variadics in C++ usually involves a recursive function using template specialization.

In C++1y there are fold-expressions that allow you to do a lot of things without recursion:

template<typename... Args>
bool all(Args... args) { return (... && args); }

bool b = all(true, true, true, false);

but I assume Rust would want to type-check the generic code? Wouldn't that require restrictions on where the unpacking operator is used?

You can easily type check that all parameters in the parameter pack implement a given trait. If one needs more complicated type-checking a variadic function is likely not the best solution.

@oblitum

This comment has been minimized.

Show comment
Hide comment
@oblitum

oblitum Mar 4, 2015

@gnzlbg that's C++1z actually.

oblitum commented Mar 4, 2015

@gnzlbg that's C++1z actually.

@alexchandel

This comment has been minimized.

Show comment
Hide comment
@alexchandel

alexchandel Mar 5, 2015

The syntax between the value-level and type-level languages should be kept as close as possible. Already the + operator means the intersection of traits at the type level, even though * (or &) more accurately conveys a lattice meet.

@gnzlbg I don't think that form of duck-typing is compatible with Rust's trait-based generics.

Rather than having special fold syntax, it's cleaner to design some system compatible with HKTs for constraining the tuple's types, and then iterate over a tuple for fold any other operation:

fn fold_variadic<*T> (x: T) -> u32
    where *T: BitAnd<u32>
{
     x.iter().fold(!0u32, |b, a| b & a)
}

fold_variadic(0xFFF0, 0x0FFF) == 0x0FF0

In the above example, T is a tuple of variadic length, and * is a currying operator. This makes intuitive sense once you realize that type arguments are effectively pattern-matched in variadic generics. Consider the partial signature fold_variadic::<*T> and the invocation fold_variadic::<u32, u32, u32>, and notice how T is bound:

<*T> = <u32, u32, u32>
*T = u32, u32, u32

The curried/destructured tuple is matched to u32, u32, u32, so the tuple is (u32, u32, u32).

The interaction of a curried tuple with +, =, and :, as below, should also be specified.

where *T: BitAnd<u32>
// or
where *T = u32

A sensible semantic is distribution over the curried form. *T: BitAnd<u32> means every type in the tuple implements BitAnd<u32>, *T = u32 means every type in the (variable-length) tuple is u32.

The syntax between the value-level and type-level languages should be kept as close as possible. Already the + operator means the intersection of traits at the type level, even though * (or &) more accurately conveys a lattice meet.

@gnzlbg I don't think that form of duck-typing is compatible with Rust's trait-based generics.

Rather than having special fold syntax, it's cleaner to design some system compatible with HKTs for constraining the tuple's types, and then iterate over a tuple for fold any other operation:

fn fold_variadic<*T> (x: T) -> u32
    where *T: BitAnd<u32>
{
     x.iter().fold(!0u32, |b, a| b & a)
}

fold_variadic(0xFFF0, 0x0FFF) == 0x0FF0

In the above example, T is a tuple of variadic length, and * is a currying operator. This makes intuitive sense once you realize that type arguments are effectively pattern-matched in variadic generics. Consider the partial signature fold_variadic::<*T> and the invocation fold_variadic::<u32, u32, u32>, and notice how T is bound:

<*T> = <u32, u32, u32>
*T = u32, u32, u32

The curried/destructured tuple is matched to u32, u32, u32, so the tuple is (u32, u32, u32).

The interaction of a curried tuple with +, =, and :, as below, should also be specified.

where *T: BitAnd<u32>
// or
where *T = u32

A sensible semantic is distribution over the curried form. *T: BitAnd<u32> means every type in the tuple implements BitAnd<u32>, *T = u32 means every type in the (variable-length) tuple is u32.

@gnzlbg

This comment has been minimized.

Show comment
Hide comment
@gnzlbg

gnzlbg Mar 5, 2015

Contributor

@oblitum you are correct.
@alexchandel my previous post was just to show that in future C++ you won't need template specialization to solve this problems in c++.

A recurring task in C++ meta-programming is zipping a tuple of values with a tuple of types (tags, or std::integral_constants). I wonder how this could be done in Rust.

Contributor

gnzlbg commented Mar 5, 2015

@oblitum you are correct.
@alexchandel my previous post was just to show that in future C++ you won't need template specialization to solve this problems in c++.

A recurring task in C++ meta-programming is zipping a tuple of values with a tuple of types (tags, or std::integral_constants). I wonder how this could be done in Rust.

@Vikaton

This comment has been minimized.

Show comment
Hide comment
@Vikaton

Vikaton Dec 22, 2015

Any update on this?

Vikaton commented Dec 22, 2015

Any update on this?

@semirix

This comment has been minimized.

Show comment
Hide comment
@semirix

semirix Dec 24, 2015

@Vikaton +1 I would be really interested to know as well. This is the only feature that I miss from C++.

semirix commented Dec 24, 2015

@Vikaton +1 I would be really interested to know as well. This is the only feature that I miss from C++.

@ticki

This comment has been minimized.

Show comment
Hide comment
@ticki

ticki Jan 9, 2016

Contributor

How will this interact with type level numerals (in particular, polymorphism over arrays of size N)?

Contributor

ticki commented Jan 9, 2016

How will this interact with type level numerals (in particular, polymorphism over arrays of size N)?

@canndrew

This comment has been minimized.

Show comment
Hide comment
@canndrew

canndrew Apr 23, 2016

Contributor

There's been lots more discussion on this over on #1582 but I'll try to summarise here.

The main problem with all of these RFCs is the problem of memory layout for tuples. In general, in current Rust, a tuple of the form (A, B, C) does not contain a tuple of the form (B, C) anywhere in its representation. This means if you have an (A, B, C) you can't take a reference to the "sub-tuple" (B, C) because it doesn't exist. For example, if (A, B, C) is the tuple (u16, u16, u32) it may be represented as:

------------------------------------------------
| Byte | 0  | 1  | 2  | 3  | 4  | 5  | 6  | 7  |
|------|---------|---------|-------------------|
| Data | A (u16) | B (u16) |      C (u32)      |
------------------------------------------------

By comparison, a tuple (B, C) == (u16, u32) may be represented as.

------------------------------------------------
| Byte | 0  | 1  | 2  | 3  | 4  | 5  | 6  | 7  |
|------|---------|---------|-------------------|
| Data | B (u16) | padding |      C (u32)      |
------------------------------------------------

Note the different position of B.

There are two proposed solutions to this:

  • Don't allow references to "sub-tuples" of tuples. Instead, when binding part of a tuple by reference, bind it as a tuple of references. This solves the problem but has the potential to be confusing. Under this scheme let (head, ...tail) = (0, 1, 2) will give you head == 0, tail == (1, 2) but let (ref head, ...ref tail) = (0, 1, 2) will give you head == &0, tail == (&1, &2) rather than tail == &(1, 2).
  • Only allow expanding tuples into the tail (or alternatively the prefix) of a larger tuple. This means that (a, b, c, ...d) would be valid but using ... anywhere else would not. Eg. these would not be valid: (a, b, ...c, d), (...a, b, c, d), (a, b, ...c, ...d) etc. This still allows you to define anything you could without this restriction, it just becomes a lot more cumbersome for some usecases. Eg. instead of being able to write (...a, b) you'd have to write this. Additionally, this approach requires that one of two changes are made to tuple layouts. The first option is to pack tuples less space-efficiently so that (A, B, C) always contains a (B, C). For example (u16, u16, u32) would have to be packed as
--------------------------------------------------------------------
| Byte | 0  | 1  | 2  | 3  | 4  | 5  | 6  | 7  | 8  | 9  | 10 | 11 |
|------|---------|---------|---------|---------|--------------------
| Data | A (u16) | padding | B (u16) | padding |      C (u32)      |
--------------------------------------------------------------------

The second option is to accept RFC #1397 and reverse the layout order of tuples so that (u16, u16, u32) would look like

------------------------------------------------
| Byte | 0  | 1  | 2  | 3  | 4  | 5  | 6  | 7  |
|------|-------------------|---------|---------|
| Data |      C (u32)      | B (u16) | A (u16) |
------------------------------------------------

And its tail (u16, u32) would look like

------------------------------------------------
| Byte | 0  | 1  | 2  | 3  | 4  | 5  | 6  | 7  |
|------|-------------------|---------|---------|
| Data |      C (u32)      | B (u16) | padding |
------------------------------------------------

This comes at the cost of disallowing the compiler from generating code which overwrites padding at the end of tuples as the "padding" may actually contain the next element of a larger tuple.

Contributor

canndrew commented Apr 23, 2016

There's been lots more discussion on this over on #1582 but I'll try to summarise here.

The main problem with all of these RFCs is the problem of memory layout for tuples. In general, in current Rust, a tuple of the form (A, B, C) does not contain a tuple of the form (B, C) anywhere in its representation. This means if you have an (A, B, C) you can't take a reference to the "sub-tuple" (B, C) because it doesn't exist. For example, if (A, B, C) is the tuple (u16, u16, u32) it may be represented as:

------------------------------------------------
| Byte | 0  | 1  | 2  | 3  | 4  | 5  | 6  | 7  |
|------|---------|---------|-------------------|
| Data | A (u16) | B (u16) |      C (u32)      |
------------------------------------------------

By comparison, a tuple (B, C) == (u16, u32) may be represented as.

------------------------------------------------
| Byte | 0  | 1  | 2  | 3  | 4  | 5  | 6  | 7  |
|------|---------|---------|-------------------|
| Data | B (u16) | padding |      C (u32)      |
------------------------------------------------

Note the different position of B.

There are two proposed solutions to this:

  • Don't allow references to "sub-tuples" of tuples. Instead, when binding part of a tuple by reference, bind it as a tuple of references. This solves the problem but has the potential to be confusing. Under this scheme let (head, ...tail) = (0, 1, 2) will give you head == 0, tail == (1, 2) but let (ref head, ...ref tail) = (0, 1, 2) will give you head == &0, tail == (&1, &2) rather than tail == &(1, 2).
  • Only allow expanding tuples into the tail (or alternatively the prefix) of a larger tuple. This means that (a, b, c, ...d) would be valid but using ... anywhere else would not. Eg. these would not be valid: (a, b, ...c, d), (...a, b, c, d), (a, b, ...c, ...d) etc. This still allows you to define anything you could without this restriction, it just becomes a lot more cumbersome for some usecases. Eg. instead of being able to write (...a, b) you'd have to write this. Additionally, this approach requires that one of two changes are made to tuple layouts. The first option is to pack tuples less space-efficiently so that (A, B, C) always contains a (B, C). For example (u16, u16, u32) would have to be packed as
--------------------------------------------------------------------
| Byte | 0  | 1  | 2  | 3  | 4  | 5  | 6  | 7  | 8  | 9  | 10 | 11 |
|------|---------|---------|---------|---------|--------------------
| Data | A (u16) | padding | B (u16) | padding |      C (u32)      |
--------------------------------------------------------------------

The second option is to accept RFC #1397 and reverse the layout order of tuples so that (u16, u16, u32) would look like

------------------------------------------------
| Byte | 0  | 1  | 2  | 3  | 4  | 5  | 6  | 7  |
|------|-------------------|---------|---------|
| Data |      C (u32)      | B (u16) | A (u16) |
------------------------------------------------

And its tail (u16, u32) would look like

------------------------------------------------
| Byte | 0  | 1  | 2  | 3  | 4  | 5  | 6  | 7  |
|------|-------------------|---------|---------|
| Data |      C (u32)      | B (u16) | padding |
------------------------------------------------

This comes at the cost of disallowing the compiler from generating code which overwrites padding at the end of tuples as the "padding" may actually contain the next element of a larger tuple.

@gnzlbg

This comment has been minimized.

Show comment
Hide comment
@gnzlbg

gnzlbg Apr 25, 2016

Contributor

IIUC what you want is to be able to slice (A,B,C) into (B,C). I have some questions:

  • Why/when isn't (&B, &C) enough? Is there a real use case for this? We cannot do this for structs, so why should we be able to do this for tuples?
  • Why does the language have to define the layout of tuples? I would rather have it be "implementation defined" so that the implementation can reorder the elements of a tuple to minimize padding. Some C++ implementations of tuple actually do this [*]. We can, like for structs, use #[repr(C)] to get a specific layout.

I think it would be great to be able to slice (A,B,C) into (B,C) and still have (A,B,C) with a perfectly packed memory layout but I don't see how we could achieve both things.

[*]: https://rmf.io/cxx11/optimal-tuple-i/, https://rmf.io/cxx11/optimal-tuple-ii/, https://rmf.io/cxx11/optimal-tuple-iii/ , https://rmf.io/cxx11/optimal-tuple-iv/

Contributor

gnzlbg commented Apr 25, 2016

IIUC what you want is to be able to slice (A,B,C) into (B,C). I have some questions:

  • Why/when isn't (&B, &C) enough? Is there a real use case for this? We cannot do this for structs, so why should we be able to do this for tuples?
  • Why does the language have to define the layout of tuples? I would rather have it be "implementation defined" so that the implementation can reorder the elements of a tuple to minimize padding. Some C++ implementations of tuple actually do this [*]. We can, like for structs, use #[repr(C)] to get a specific layout.

I think it would be great to be able to slice (A,B,C) into (B,C) and still have (A,B,C) with a perfectly packed memory layout but I don't see how we could achieve both things.

[*]: https://rmf.io/cxx11/optimal-tuple-i/, https://rmf.io/cxx11/optimal-tuple-ii/, https://rmf.io/cxx11/optimal-tuple-iii/ , https://rmf.io/cxx11/optimal-tuple-iv/

@gnzlbg

This comment has been minimized.

Show comment
Hide comment
@gnzlbg

gnzlbg Apr 25, 2016

Contributor

Meaning if the trade-off is between sugar for a particular use case (slicing (A,B,C) into (B,C) instead of just (&B,&C)) or a zero-cost abstraction (a perfectly packed tuple) I would chose the zero-cost abstraction any day.

Contributor

gnzlbg commented Apr 25, 2016

Meaning if the trade-off is between sugar for a particular use case (slicing (A,B,C) into (B,C) instead of just (&B,&C)) or a zero-cost abstraction (a perfectly packed tuple) I would chose the zero-cost abstraction any day.

@glaebhoerl

This comment has been minimized.

Show comment
Hide comment
@glaebhoerl

glaebhoerl Apr 25, 2016

Contributor

FWIW the last option mentioned by @canndrew, involving #1397, seems like clearly the cleanest and nicest design to me. The only big question mark hanging over it in my mind is the backwards (in)compatibility impact of #1397 on already-existing unsafe code that uses size_of where only stride would now be correct.

Contributor

glaebhoerl commented Apr 25, 2016

FWIW the last option mentioned by @canndrew, involving #1397, seems like clearly the cleanest and nicest design to me. The only big question mark hanging over it in my mind is the backwards (in)compatibility impact of #1397 on already-existing unsafe code that uses size_of where only stride would now be correct.

@bluss

This comment has been minimized.

Show comment
Hide comment
@bluss

bluss Apr 27, 2016

I want to crosslink you to the fresh RFC clarification issue rust-lang/rfcs/pull/1592 because the commitment to having a specific position in the tuple possible to be unsized is relevant to the tuple representation discussion here.

bluss commented Apr 27, 2016

I want to crosslink you to the fresh RFC clarification issue rust-lang/rfcs/pull/1592 because the commitment to having a specific position in the tuple possible to be unsized is relevant to the tuple representation discussion here.

@petrochenkov

This comment has been minimized.

Show comment
Hide comment
@petrochenkov

petrochenkov Apr 27, 2016

Contributor

I don't think variadic generics should care about layouts much based on some observations from C++ :

  1. A lot of operations with variadics happen at the type level, where layout doesn't matter. E.g. tuple types (A, B, C) can be sliced, combined and transformed regardless of their layouts or representability. In this sense #1592 seems unfortunate - it means you can't represent an arbitrary type list with a tuple type.

  2. The main use case for variadics - function arguments - don't need any layout guarantees. So function arguments can be represented with tuples at the type level, but they don't need to be kept as a tuple in memory. In this sense #1592 seems unfortunate again, because with expected ability to pass unsized types to functions by value a function fn([u8], [u8]) will be well formed, but its argument list will not be representable as a tuple type ([u8], [u8]).

  3. In C++ parameter packs (aka type lists, aka Rust tuple types) are type level constructs, but they can be materialized into values with two main instruments - tuples std::tuple<Types...> and closures with explicit captures [values...]{}. Both std::tuples and anonymous closure structs have unspecified layouts, but for some reason it seems to not bother anyone, because taking sub-tuples by references doesn't seem to be especially useful. As for tuples of references suggested above, there's even a standard facility std::forward_as_tuple materializing a tuple value (&A, &B, &C) from any value list, not necessarily being a tuple itself.

Sorry for random thoughts, this really needs a more thorough investigation with examples from something like https://github.com/boostorg/hana

Contributor

petrochenkov commented Apr 27, 2016

I don't think variadic generics should care about layouts much based on some observations from C++ :

  1. A lot of operations with variadics happen at the type level, where layout doesn't matter. E.g. tuple types (A, B, C) can be sliced, combined and transformed regardless of their layouts or representability. In this sense #1592 seems unfortunate - it means you can't represent an arbitrary type list with a tuple type.

  2. The main use case for variadics - function arguments - don't need any layout guarantees. So function arguments can be represented with tuples at the type level, but they don't need to be kept as a tuple in memory. In this sense #1592 seems unfortunate again, because with expected ability to pass unsized types to functions by value a function fn([u8], [u8]) will be well formed, but its argument list will not be representable as a tuple type ([u8], [u8]).

  3. In C++ parameter packs (aka type lists, aka Rust tuple types) are type level constructs, but they can be materialized into values with two main instruments - tuples std::tuple<Types...> and closures with explicit captures [values...]{}. Both std::tuples and anonymous closure structs have unspecified layouts, but for some reason it seems to not bother anyone, because taking sub-tuples by references doesn't seem to be especially useful. As for tuples of references suggested above, there's even a standard facility std::forward_as_tuple materializing a tuple value (&A, &B, &C) from any value list, not necessarily being a tuple itself.

Sorry for random thoughts, this really needs a more thorough investigation with examples from something like https://github.com/boostorg/hana

@eddyb

This comment has been minimized.

Show comment
Hide comment
@eddyb

eddyb Apr 27, 2016

Member

std::forward_as_tuple is pretty much the original solution I had in mind for the reference problem.
I agree that not being able to use tuples as type lists because of unsized types is a problem.
cc @arielb1 @nikomatsakis

Member

eddyb commented Apr 27, 2016

std::forward_as_tuple is pretty much the original solution I had in mind for the reference problem.
I agree that not being able to use tuples as type lists because of unsized types is a problem.
cc @arielb1 @nikomatsakis

@petrochenkov

This comment has been minimized.

Show comment
Hide comment
@petrochenkov

petrochenkov Apr 28, 2016

Contributor

On the other hand, variadics should work with lifetime lists too and tuples can't represent them unless some transformation, like ('a, 'b, 'c) <=> (&'a (), &'b (), &'c ()), is built into the language.

So maybe it still may make sense to make variadic lists separate tuple-like entities and not tuples, and leave tuples alone with their layout and representability problems. It's probably one of the most important choices to make here.

Contributor

petrochenkov commented Apr 28, 2016

On the other hand, variadics should work with lifetime lists too and tuples can't represent them unless some transformation, like ('a, 'b, 'c) <=> (&'a (), &'b (), &'c ()), is built into the language.

So maybe it still may make sense to make variadic lists separate tuple-like entities and not tuples, and leave tuples alone with their layout and representability problems. It's probably one of the most important choices to make here.

@glaebhoerl

This comment has been minimized.

Show comment
Hide comment
@glaebhoerl

glaebhoerl Apr 28, 2016

Contributor

@petrochenkov I agree with all of those premises (re: the earlier comment), but differ about the conclusion.

C++ variadic generics (as is C++'s wont) are both very powerful and... very ugly. Lifting the high-level design for this kind of advanced feature from C++ simply because it's the only prior art we've familiarized ourselves with would make me kind of sad. Figuring out the right variadics generics design for Rust, so that it's as powerful as we need it to be but also integrates better with our type system, needs research and careful design work.

On the other hand, #1582 seems like a quite simple and fairly "obvious" solution for the narrower use case of just working with tuples of different arity, modulo the questions it raises about tuples' representation, and would solve many of the most common use cases for variadic generics as well. I can't really think of any other way it could sensibly be designed, nor any major reason we would want to. Even if or when we later implement "full" variadic generics, I don't think we'd really want it to be any different.

So while my feeling about variadic generics is "we need to approach this very carefully", my feeling about a design for tuple-destructuring like #1582 is "cool, let's do it, why not!".

(Again, as long as we can settle on a satisfactory solution for the representation issue...)

Contributor

glaebhoerl commented Apr 28, 2016

@petrochenkov I agree with all of those premises (re: the earlier comment), but differ about the conclusion.

C++ variadic generics (as is C++'s wont) are both very powerful and... very ugly. Lifting the high-level design for this kind of advanced feature from C++ simply because it's the only prior art we've familiarized ourselves with would make me kind of sad. Figuring out the right variadics generics design for Rust, so that it's as powerful as we need it to be but also integrates better with our type system, needs research and careful design work.

On the other hand, #1582 seems like a quite simple and fairly "obvious" solution for the narrower use case of just working with tuples of different arity, modulo the questions it raises about tuples' representation, and would solve many of the most common use cases for variadic generics as well. I can't really think of any other way it could sensibly be designed, nor any major reason we would want to. Even if or when we later implement "full" variadic generics, I don't think we'd really want it to be any different.

So while my feeling about variadic generics is "we need to approach this very carefully", my feeling about a design for tuple-destructuring like #1582 is "cool, let's do it, why not!".

(Again, as long as we can settle on a satisfactory solution for the representation issue...)

@eddyb

This comment has been minimized.

Show comment
Hide comment
@eddyb

eddyb Apr 28, 2016

Member

@glaebhoerl As long as we punt on taking a reference to a sub-tuple, the representation issue is a non-issue IMO. Also, if you've seen my prototype you'll note that the compiler support needed is pretty small.

I expect splitting operations to be done with associated types, at least initially, but spreading a tuple type in a type list (or a tuple value in a value list) can be there from the start.
I'd like to add the necessary infrastructure even if only to desugar the "rust-call" madness into something that MIR can actually optimize.

Member

eddyb commented Apr 28, 2016

@glaebhoerl As long as we punt on taking a reference to a sub-tuple, the representation issue is a non-issue IMO. Also, if you've seen my prototype you'll note that the compiler support needed is pretty small.

I expect splitting operations to be done with associated types, at least initially, but spreading a tuple type in a type list (or a tuple value in a value list) can be there from the start.
I'd like to add the necessary infrastructure even if only to desugar the "rust-call" madness into something that MIR can actually optimize.

@petrochenkov

This comment has been minimized.

Show comment
Hide comment
@petrochenkov

petrochenkov Apr 28, 2016

Contributor

@glaebhoerl

C++ variadic generics (as is C++'s wont) are both very powerful and... very ugly. Lifting the high-level design for this kind of advanced feature from C++ simply because it's the only prior art we've familiarized ourselves with would make me kind of sad.

I certainly wouldn't like to copy the design from C++ as is, it has its own problems - parameter packs are not first class enough, they cannot be indexed, sliced or aliased, everything has to be done through recursion which is ugly/inconvenient and causes long compile times (Clang even had to introduce special intrinsics to reduce compile times for std::tuple and friends), all these problems can be solved by making packs similar to Rust's built-in tuples, but C++ doesn't have built-in tuples.

Contributor

petrochenkov commented Apr 28, 2016

@glaebhoerl

C++ variadic generics (as is C++'s wont) are both very powerful and... very ugly. Lifting the high-level design for this kind of advanced feature from C++ simply because it's the only prior art we've familiarized ourselves with would make me kind of sad.

I certainly wouldn't like to copy the design from C++ as is, it has its own problems - parameter packs are not first class enough, they cannot be indexed, sliced or aliased, everything has to be done through recursion which is ugly/inconvenient and causes long compile times (Clang even had to introduce special intrinsics to reduce compile times for std::tuple and friends), all these problems can be solved by making packs similar to Rust's built-in tuples, but C++ doesn't have built-in tuples.

@canndrew

This comment has been minimized.

Show comment
Hide comment
@canndrew

canndrew Apr 28, 2016

Contributor

We're conflating two different concepts here when talking about using tuples to represent lists of types. There's "tuple types", and then there's "a tuple of types". It may be that (str, str) is not a valid tuple type, but it's a valid tuple of types.

To make this clearer, imagine we're in a dependently typed language where types can be used as values and have a type themselves (the type type). Here you'd have two different things that could be ambiguously written as (u32, u32):

  • The tuple type (u32, u32) where (0u32, 1u32) : (u32, u32) : type
  • The tuple of types (u32, u32) where (u32, u32) : (type, type) : type

It's an abuse of tuple types to try and also use them as tuples of types.

Edit: And this makes me think we might need HKT in order to have some kind of kind-level-tuples in order to do variadic generics properly. I dunno what they would look like though.

Contributor

canndrew commented Apr 28, 2016

We're conflating two different concepts here when talking about using tuples to represent lists of types. There's "tuple types", and then there's "a tuple of types". It may be that (str, str) is not a valid tuple type, but it's a valid tuple of types.

To make this clearer, imagine we're in a dependently typed language where types can be used as values and have a type themselves (the type type). Here you'd have two different things that could be ambiguously written as (u32, u32):

  • The tuple type (u32, u32) where (0u32, 1u32) : (u32, u32) : type
  • The tuple of types (u32, u32) where (u32, u32) : (type, type) : type

It's an abuse of tuple types to try and also use them as tuples of types.

Edit: And this makes me think we might need HKT in order to have some kind of kind-level-tuples in order to do variadic generics properly. I dunno what they would look like though.

@canndrew

This comment has been minimized.

Show comment
Hide comment
@canndrew

canndrew Apr 28, 2016

Contributor

As long as we punt on taking a reference to a sub-tuple, the representation issue is a non-issue IMO

I think we should only go the subtuple-of-references route if we decide we're definitely not willing to ever change the representation for the other options. The subtuple-of-references idea works, but its not the most natural way to split a tuple and I can only see it getting really messy in practice.

Actually, even if we do go the subtuple-of-references route I'm not sure we should use the let (ref head, ...ref tail) = tuple syntax to destructure a tuple. It'll be very confusing that adding the ref in front of tail changes the type of tail from (A, B) to (&A, &B) and not to &(A, B). We might want to have an intrinsic function that performs this conversion, something like let (ref head, tail) = std::tuple::split((0, 1, 2)) which gives head = &0 and tail = (&1, &2).

Contributor

canndrew commented Apr 28, 2016

As long as we punt on taking a reference to a sub-tuple, the representation issue is a non-issue IMO

I think we should only go the subtuple-of-references route if we decide we're definitely not willing to ever change the representation for the other options. The subtuple-of-references idea works, but its not the most natural way to split a tuple and I can only see it getting really messy in practice.

Actually, even if we do go the subtuple-of-references route I'm not sure we should use the let (ref head, ...ref tail) = tuple syntax to destructure a tuple. It'll be very confusing that adding the ref in front of tail changes the type of tail from (A, B) to (&A, &B) and not to &(A, B). We might want to have an intrinsic function that performs this conversion, something like let (ref head, tail) = std::tuple::split((0, 1, 2)) which gives head = &0 and tail = (&1, &2).

@Diggsey

This comment has been minimized.

Show comment
Hide comment
@Diggsey

Diggsey Apr 28, 2016

Contributor

@canndrew The name tuple refers to both those things: while the name is ambiguous I don't think anyone's been conflating the two: we already have (0u32, 0u32) : (u32, u32) and we don't need HKT to express that.

The expansion syntax would presumably work for both tuple types and tuple values, although the layout argument is only relevent for tuple values.

Also lifetimes are not an issue since lifetime arguments are always separated from type arguments, there's no need for a construct which supports both. It's possible to start with just our existing tuples, and then potentially later add a notion of "tuple lifetimes" or similar, which might look like ('a, 'b, 'c) and which would be unusable on their own (only as part of an expansion)

Contributor

Diggsey commented Apr 28, 2016

@canndrew The name tuple refers to both those things: while the name is ambiguous I don't think anyone's been conflating the two: we already have (0u32, 0u32) : (u32, u32) and we don't need HKT to express that.

The expansion syntax would presumably work for both tuple types and tuple values, although the layout argument is only relevent for tuple values.

Also lifetimes are not an issue since lifetime arguments are always separated from type arguments, there's no need for a construct which supports both. It's possible to start with just our existing tuples, and then potentially later add a notion of "tuple lifetimes" or similar, which might look like ('a, 'b, 'c) and which would be unusable on their own (only as part of an expansion)

@eddyb

This comment has been minimized.

Show comment
Hide comment
@eddyb

eddyb Apr 28, 2016

Member

@canndrew Actually, in my demo I've used a function which turns &(A, B) into (&A, &B), not unlike the C++ std::forward_as_tuple function @petrochenkov mentioned, and I agree that's better, if only because modelling the type is not too hard in a library, but strange in the compiler itself (the best thing it can do is just use a lang item which does the type transformation).

Member

eddyb commented Apr 28, 2016

@canndrew Actually, in my demo I've used a function which turns &(A, B) into (&A, &B), not unlike the C++ std::forward_as_tuple function @petrochenkov mentioned, and I agree that's better, if only because modelling the type is not too hard in a library, but strange in the compiler itself (the best thing it can do is just use a lang item which does the type transformation).

@canndrew

This comment has been minimized.

Show comment
Hide comment
@canndrew

canndrew Apr 28, 2016

Contributor

@Diggsey

I don't think anyone's been conflating the two: we already have (0u32, 0u32) : (u32, u32) and we don't need HKT to express that.

No, but we'd need it to express something equivalent to (u32, u32) : (type, type) because at the moment we only have (u32, u32) : type. And @petrochenkov was conflating them here but then realised what he was doing in his next comment. It's an important distinction because (str, str) isn't a valid type, should never be read as a type and should never appear in a type position. And ('a, 'b) certainly isn't a valid type. But there might be things that are written like that when we have HKT and can talk about concepts like type -> type, (type, type) or (lifetime, lifetime).

@eddyb
When I mentioned a compiler intrinsic I meant that without syntax to split a tuple by reference it wouldn't be possible to implement the std::tuple::split function I mentioned above.

Contributor

canndrew commented Apr 28, 2016

@Diggsey

I don't think anyone's been conflating the two: we already have (0u32, 0u32) : (u32, u32) and we don't need HKT to express that.

No, but we'd need it to express something equivalent to (u32, u32) : (type, type) because at the moment we only have (u32, u32) : type. And @petrochenkov was conflating them here but then realised what he was doing in his next comment. It's an important distinction because (str, str) isn't a valid type, should never be read as a type and should never appear in a type position. And ('a, 'b) certainly isn't a valid type. But there might be things that are written like that when we have HKT and can talk about concepts like type -> type, (type, type) or (lifetime, lifetime).

@eddyb
When I mentioned a compiler intrinsic I meant that without syntax to split a tuple by reference it wouldn't be possible to implement the std::tuple::split function I mentioned above.

@eddyb

This comment has been minimized.

Show comment
Hide comment
@eddyb

eddyb Apr 28, 2016

Member

@canndrew Ah, I agree then, we can use an intrinsic relying on a trait to produce the right return type in generic contexts, while the actual monomorphized implementation of the intrinsic would only care about concrete tuple types.

Member

eddyb commented Apr 28, 2016

@canndrew Ah, I agree then, we can use an intrinsic relying on a trait to produce the right return type in generic contexts, while the actual monomorphized implementation of the intrinsic would only care about concrete tuple types.

@gnzlbg

This comment has been minimized.

Show comment
Hide comment
@gnzlbg

gnzlbg Apr 28, 2016

Contributor

Currently (u16, u32, u16) requires 12 bytes, while (u32, u16, u16) requires 8 bytes. I would expect them to be the same size since the compiler should be able to reorder the elements of the tuple to minimize padding automatically.

While I now see the advantage of &(B, C) vs (&B, &C) (one pointer vs two pointers) that prevents the optimization above.

Is there a way to get both optimizations?

Contributor

gnzlbg commented Apr 28, 2016

Currently (u16, u32, u16) requires 12 bytes, while (u32, u16, u16) requires 8 bytes. I would expect them to be the same size since the compiler should be able to reorder the elements of the tuple to minimize padding automatically.

While I now see the advantage of &(B, C) vs (&B, &C) (one pointer vs two pointers) that prevents the optimization above.

Is there a way to get both optimizations?

@petrochenkov

This comment has been minimized.

Show comment
Hide comment
@petrochenkov

petrochenkov Apr 28, 2016

Contributor

@canndrew

And @petrochenkov was conflating them here but then realised what he was doing in his next comment.

Well, the "conflation" was certainly intentional in my first comment, I do want to use tuple types for expressing type lists.
To conflate or not to conflate is a design choice, I don't think the alternative from my second comment is unambiguously better.

Contributor

petrochenkov commented Apr 28, 2016

@canndrew

And @petrochenkov was conflating them here but then realised what he was doing in his next comment.

Well, the "conflation" was certainly intentional in my first comment, I do want to use tuple types for expressing type lists.
To conflate or not to conflate is a design choice, I don't think the alternative from my second comment is unambiguously better.

@vi

This comment has been minimized.

Show comment
Hide comment
@vi

vi May 5, 2016

Shouldn't ... be reserved for for i in 1...10 inclusive range syntax?
More dots, less problems: ....

vi commented May 5, 2016

Shouldn't ... be reserved for for i in 1...10 inclusive range syntax?
More dots, less problems: ....

@eddyb

This comment has been minimized.

Show comment
Hide comment
@eddyb

eddyb May 5, 2016

Member

@vi start...end and ...end are taken but start... isn't used.

Member

eddyb commented May 5, 2016

@vi start...end and ...end are taken but start... isn't used.

@ubsan

This comment has been minimized.

Show comment
Hide comment
@ubsan

ubsan May 5, 2016

Contributor

This would also make our syntax line up with C++'s, which is... I don't know, a nice symmetry at least.

Contributor

ubsan commented May 5, 2016

This would also make our syntax line up with C++'s, which is... I don't know, a nice symmetry at least.

@canndrew

This comment has been minimized.

Show comment
Hide comment
@canndrew

canndrew May 7, 2016

Contributor

start...end and ...end are taken but start... isn't used.

Having a... mean something completely different to a.., a...b, a..b, ...b and ..b seems potentially very confusing. I don't think the (a, b, c; d) syntax looks bad. I read the semicolon as a hard comma that says "everything after here is rolled into one thing". It's also similar to the [a; n] syntax.

Contributor

canndrew commented May 7, 2016

start...end and ...end are taken but start... isn't used.

Having a... mean something completely different to a.., a...b, a..b, ...b and ..b seems potentially very confusing. I don't think the (a, b, c; d) syntax looks bad. I read the semicolon as a hard comma that says "everything after here is rolled into one thing". It's also similar to the [a; n] syntax.

@canndrew

This comment has been minimized.

Show comment
Hide comment
@canndrew

canndrew May 7, 2016

Contributor

I decided to reopen #1582 largely because of this comment. I'll close it again if @eddyb or @nikomatsakis disagree with the decision.

Contributor

canndrew commented May 7, 2016

I decided to reopen #1582 largely because of this comment. I'll close it again if @eddyb or @nikomatsakis disagree with the decision.

@eddyb

This comment has been minimized.

Show comment
Hide comment
@eddyb

eddyb May 7, 2016

Member

@canndrew It seems non-trivial to see at a glance what is being spread, and it's non-symmetric.
I agree the range syntax overlap is unfortunate, but (a, b...) and (a..., b) are obvious IMO.
The semicolon is also easier to miss and would look really strange for variadic functions:

fn foo<T: Tuple>(; args: Tuple) {...}
Member

eddyb commented May 7, 2016

@canndrew It seems non-trivial to see at a glance what is being spread, and it's non-symmetric.
I agree the range syntax overlap is unfortunate, but (a, b...) and (a..., b) are obvious IMO.
The semicolon is also easier to miss and would look really strange for variadic functions:

fn foo<T: Tuple>(; args: Tuple) {...}
@leonardo-m

This comment has been minimized.

Show comment
Hide comment
@leonardo-m

leonardo-m Jun 22, 2017

Regarding this instead of looking at C++ design I suggest to look instead at the D language and its type lists (once named type tuples,
https://dlang.org/phobos/std_meta.html ), that support an array-like indexing syntax, a array-like slicing syntax, a array-like concatenation syntax, this is much better than what C++17 offers (one problem of the D is that they auto-flatten).

leonardo-m commented Jun 22, 2017

Regarding this instead of looking at C++ design I suggest to look instead at the D language and its type lists (once named type tuples,
https://dlang.org/phobos/std_meta.html ), that support an array-like indexing syntax, a array-like slicing syntax, a array-like concatenation syntax, this is much better than what C++17 offers (one problem of the D is that they auto-flatten).

@eddyb

This comment has been minimized.

Show comment
Hide comment
@eddyb

eddyb Jun 22, 2017

Member

What D does strongly requires either duck typing or value refinement / dependent types.
E.g. pair[0] having a different type than pair[1] - AFAIK you can't even do this in standard Haskell.

Member

eddyb commented Jun 22, 2017

What D does strongly requires either duck typing or value refinement / dependent types.
E.g. pair[0] having a different type than pair[1] - AFAIK you can't even do this in standard Haskell.

@leonardo-m

This comment has been minimized.

Show comment
Hide comment
@leonardo-m

leonardo-m Jun 22, 2017

Right, D type lists are dynamically typed at compile-time, just like D templates are half-dynamically typed... (half because in D you often use template constraints to perform half type checking) Unlike Rust. Thank you eddyb.

Right, D type lists are dynamically typed at compile-time, just like D templates are half-dynamically typed... (half because in D you often use template constraints to perform half type checking) Unlike Rust. Thank you eddyb.

@gnzlbg

This comment has been minimized.

Show comment
Hide comment
@gnzlbg

gnzlbg Jun 22, 2017

Contributor

@eddyb

What D does strongly requires either duck typing or value refinement / dependent types.
E.g. pair[0] having a different type than pair[1] - AFAIK you can't even do this in standard Haskell.

Type level integers would be enough. This is a minimal form of dependent types[*], but you could then use pair[Usize<0>{}] and pair[Usize<1>{}] to call two different trait methods, so that pair[] can return different types.

One could probably think of multiple combinations of more or less complex rules to make pair[0] work. Not that I like any, but one could be to make all integer literals dependently typed, of type std::Usize<VALUE>, which implements a (potentially magic) conversion to usize. Or also the opposite, give the std::Usize<VALUE> type a "magic" constructor from integer literals so that when calling pair[0] the VALUE parameter is deduced. And probably many more alternatives.

[*] EDIT: maybe we could already do something like this without dependent types at all by making all integer literals be of their corresponding type in the typenum crate, e.g. 3usize of type Usize3, and implement some magic on those.

EDIT 2: The main reason I don't like this "style" of dependent types is that it might generate an explosion in code size and compile times. Each literal integer value has a different type, each function taking one of those is a different function, etc. But I think something like this should be explored as an ergonomic improvement to type-level integers, and in a way that doesn't impact code-size or compile-times, after the minimal RFC goes through because I hate having to manually construct std::integral_constant<int, 3> in C++. Pinging @withoutboats since this is a bit offtopic here.

Contributor

gnzlbg commented Jun 22, 2017

@eddyb

What D does strongly requires either duck typing or value refinement / dependent types.
E.g. pair[0] having a different type than pair[1] - AFAIK you can't even do this in standard Haskell.

Type level integers would be enough. This is a minimal form of dependent types[*], but you could then use pair[Usize<0>{}] and pair[Usize<1>{}] to call two different trait methods, so that pair[] can return different types.

One could probably think of multiple combinations of more or less complex rules to make pair[0] work. Not that I like any, but one could be to make all integer literals dependently typed, of type std::Usize<VALUE>, which implements a (potentially magic) conversion to usize. Or also the opposite, give the std::Usize<VALUE> type a "magic" constructor from integer literals so that when calling pair[0] the VALUE parameter is deduced. And probably many more alternatives.

[*] EDIT: maybe we could already do something like this without dependent types at all by making all integer literals be of their corresponding type in the typenum crate, e.g. 3usize of type Usize3, and implement some magic on those.

EDIT 2: The main reason I don't like this "style" of dependent types is that it might generate an explosion in code size and compile times. Each literal integer value has a different type, each function taking one of those is a different function, etc. But I think something like this should be explored as an ergonomic improvement to type-level integers, and in a way that doesn't impact code-size or compile-times, after the minimal RFC goes through because I hate having to manually construct std::integral_constant<int, 3> in C++. Pinging @withoutboats since this is a bit offtopic here.

@eddyb

This comment has been minimized.

Show comment
Hide comment
@eddyb

eddyb Jun 22, 2017

Member

What you are describing is what I call "value refinement" e.g. Index<0usize> - where that 0usize is a subtype of usize with only one value, 0. This is more powerful than just const generics and can generalize to runtime values just like dependent types, and also to ranges, variant types, etc.

Member

eddyb commented Jun 22, 2017

What you are describing is what I call "value refinement" e.g. Index<0usize> - where that 0usize is a subtype of usize with only one value, 0. This is more powerful than just const generics and can generalize to runtime values just like dependent types, and also to ranges, variant types, etc.

@petrochenkov

This comment has been minimized.

Show comment
Hide comment
@petrochenkov

petrochenkov Jun 22, 2017

Contributor

@leonardo-m

I think the current design of Rust tuple indexing with dot-number was a big design mistake

Nothing prevents adding a new syntax for this as well (especially together with type/value lists), e.g. x.[y], that can't be overloaded, requires y being a constant expression (index or maybe range), x being a variadic list, or tuple, or array, or struct (or even closure!) and returning the y-th element of x at compile time.

Contributor

petrochenkov commented Jun 22, 2017

@leonardo-m

I think the current design of Rust tuple indexing with dot-number was a big design mistake

Nothing prevents adding a new syntax for this as well (especially together with type/value lists), e.g. x.[y], that can't be overloaded, requires y being a constant expression (index or maybe range), x being a variadic list, or tuple, or array, or struct (or even closure!) and returning the y-th element of x at compile time.

@eddyb

This comment has been minimized.

Show comment
Hide comment
@eddyb

eddyb Jun 22, 2017

Member

s[Struct::field] is another related idea but that is more reflection oriented

Member

eddyb commented Jun 22, 2017

s[Struct::field] is another related idea but that is more reflection oriented

@qraynaud

This comment has been minimized.

Show comment
Hide comment
@qraynaud

qraynaud Dec 7, 2017

I was planning to transcribe some C++ code I have in Rust. It is some playground project of mine that is somehow like yacc. It currently can generate correct parsing tables and parsers for any LALR(1) grammar for C++ and Javascript.

Somehow, I wanted to move the grammar analysis codebase to Rust and also the reference parser implementation. My concern is I'd like to be able to define my parsers like I did in C++ / JS: for each pattern in each rule in the grammar, you have to provide a method to construct the value it produces consuming the subrules / tokens values. I had an interface like this:

template </**/>
class parser: base_parser</**/> {
  void register_rules() {
    this->register_code("rule_name", 0, *this, /* pattern index in the rule */, &parser::rule_name_0);
  }

  float rule_name_0(int i, float f) {
    return i + f;
  }
}

The form with 4 args is providing a method, the 3rd being the object on which the method should be invoked on while with 3 args, you could pass on any function instead. What is important to me is that register_code is able to accept any function / method whatever the arguments if the arguments are all comprised in the types included in a variant type the parser is templated with and if they have less than 50 arguments (the limitation is caused by the fact that I used macros to implement this because when I wrote this code, c++11 was not really around). If you want to take a look at what I did, the code is here: https://bitbucket.org/qraynaud/jungle/src/default/src/libs/jungle/caller.hpp.

I have in the internal parser code a class that can invoke any of those methods using a provided array of values of the variant type I talked of before. Since everything relies on proper templating, the type checking is ensured at compilation for the most part. The variant still has to check it has the correct type value in it at execution time but it will fail properly if not.

I find this important because it makes the rules codebase easy to understand. You put an argument for every token / subrule your pattern contains (eg rule: expr "+" expr + "int" would result in a signature like (left: f64, plus: (), right: f64, i: u32)). I could give to the method the array of arguments directly (like an array of enums) but it feels wrong to me.

Since the basic parser code is not generated, I have the feeling that it is impossible to achieve anything even remotely like this in Rust as of today. It would be great if this spec could take into account those kind of usages to make them possible.

qraynaud commented Dec 7, 2017

I was planning to transcribe some C++ code I have in Rust. It is some playground project of mine that is somehow like yacc. It currently can generate correct parsing tables and parsers for any LALR(1) grammar for C++ and Javascript.

Somehow, I wanted to move the grammar analysis codebase to Rust and also the reference parser implementation. My concern is I'd like to be able to define my parsers like I did in C++ / JS: for each pattern in each rule in the grammar, you have to provide a method to construct the value it produces consuming the subrules / tokens values. I had an interface like this:

template </**/>
class parser: base_parser</**/> {
  void register_rules() {
    this->register_code("rule_name", 0, *this, /* pattern index in the rule */, &parser::rule_name_0);
  }

  float rule_name_0(int i, float f) {
    return i + f;
  }
}

The form with 4 args is providing a method, the 3rd being the object on which the method should be invoked on while with 3 args, you could pass on any function instead. What is important to me is that register_code is able to accept any function / method whatever the arguments if the arguments are all comprised in the types included in a variant type the parser is templated with and if they have less than 50 arguments (the limitation is caused by the fact that I used macros to implement this because when I wrote this code, c++11 was not really around). If you want to take a look at what I did, the code is here: https://bitbucket.org/qraynaud/jungle/src/default/src/libs/jungle/caller.hpp.

I have in the internal parser code a class that can invoke any of those methods using a provided array of values of the variant type I talked of before. Since everything relies on proper templating, the type checking is ensured at compilation for the most part. The variant still has to check it has the correct type value in it at execution time but it will fail properly if not.

I find this important because it makes the rules codebase easy to understand. You put an argument for every token / subrule your pattern contains (eg rule: expr "+" expr + "int" would result in a signature like (left: f64, plus: (), right: f64, i: u32)). I could give to the method the array of arguments directly (like an array of enums) but it feels wrong to me.

Since the basic parser code is not generated, I have the feeling that it is impossible to achieve anything even remotely like this in Rust as of today. It would be great if this spec could take into account those kind of usages to make them possible.

@tinyplasticgreyknight

This comment has been minimized.

Show comment
Hide comment
@tinyplasticgreyknight

tinyplasticgreyknight Dec 7, 2017

@qraynaud I'm not sure if I understand what you're looking for exactly, but would it be a useful workaround if your function always accepts only a single argument, which might be a tuple containing the "real" arguments?

Maybe I should read your original repository and get a feel for how this is all used :-)

tinyplasticgreyknight commented Dec 7, 2017

@qraynaud I'm not sure if I understand what you're looking for exactly, but would it be a useful workaround if your function always accepts only a single argument, which might be a tuple containing the "real" arguments?

Maybe I should read your original repository and get a feel for how this is all used :-)

@qraynaud

This comment has been minimized.

Show comment
Hide comment
@qraynaud

qraynaud Dec 7, 2017

@tinyplasticgreyknight : even so, every function would have a different tuple type and that would not work either for me right?

Each method for each parser rule's pattern has a different signature. I think that was not clear enough. What I implemented in C++ is similar to the apply method in JS (obj.method.apply(obj, [arg1, arg2, …])).

I don't really see how I can build the correct tuple, its signature being something like (..T) with each T being one type between a list of known types. With this idea I got my problem moved. Before it was calling a method using a vector of arguments, now it is constructing a tuple with a vector of arguments. Not knowing beforehand the types & the numbers of arguments…

qraynaud commented Dec 7, 2017

@tinyplasticgreyknight : even so, every function would have a different tuple type and that would not work either for me right?

Each method for each parser rule's pattern has a different signature. I think that was not clear enough. What I implemented in C++ is similar to the apply method in JS (obj.method.apply(obj, [arg1, arg2, …])).

I don't really see how I can build the correct tuple, its signature being something like (..T) with each T being one type between a list of known types. With this idea I got my problem moved. Before it was calling a method using a vector of arguments, now it is constructing a tuple with a vector of arguments. Not knowing beforehand the types & the numbers of arguments…

@Centril Centril added the T-lang label Dec 10, 2017

@NateLing

This comment has been minimized.

Show comment
Hide comment
@NateLing

NateLing Feb 11, 2018

Will it implement in 2018?

Will it implement in 2018?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment