RFC: Structural Records #2584

Open
wants to merge 15 commits into
base: master
from
Open

Conversation

@Centril
Member

Centril commented Nov 2, 2018

🖼️ Rendered

📝 Summary

Introduce structural records of the form { foo: 1u8, bar: true } of type { foo: u8, bar: bool } into the language. Another way to understand these sorts of objects is to think of them as "tuples with named fields", "unnamed structs", or "anonymous structs".

💖 Thanks

To @kennytm, @alexreg, @Nemo157, and @tinaun for reviewing the draft version of this RFC.
To @varkor and @pnkfelix for good and helpful discussions.

@Diggsey

Contributor

Diggsey commented Nov 2, 2018

This looks like a super well thought out and comprehensive RFC.

It might be worth clarifying the behaviour of #[derive(...)] with respect to types like:

struct RectangleTidy {
    dimensions: {
        width: u64,
        height: u64,
    },
    color: {
        red: u8,
        green: u8,
        blue: u8,
    },
}

Presumably, there will be no "magic" here, and you will only be able to derive traits which are implemented for the anonymous structs themselves.

Another question is around trait implementations: at the moment, 3rd party crates can provide automatic implementations of their traits for "all" tuples by using macro expansion to implement them for 0..N element tuples. With anonymous structs this will not be possible. Especially with the likely arrival of const generics in the not too distant future, negating the need for macros entirely, anonymous structs will become second class citizens. Is this a problem, and are there any possible solutions to allow implementing traits for all anonymous structs?

other_stuff(color.1);
...
yet_more_stuff(color.2);
}

@petrochenkov

petrochenkov Nov 2, 2018

Contributor
fn do_stuff_with((red, green, blue): (u8, u8, u8)) {
    some_stuff(red);
    other_stuff(green);
    yet_more_stuff(blue);
}

@Centril

Centril Nov 2, 2018

Author Member

Even better, with #2522:

fn do_stuff_with((red: u8, green: u8, blue: u8)) {
    some_stuff(red);
    other_stuff(green);
    yet_more_stuff(blue);
}

However, while if you write it in this way it is clear what each tuple component means, it is not clear at the call site...

let color = (255, 0, 0);
// I can guess that this is (red, green, blue) because that's
// the usual for "color" but it isn't clear in the general case.

```
blue: u8,
},
}
```

@petrochenkov

petrochenkov Nov 2, 2018

Contributor

This syntax looks very similar to another already accepted RFC, but semantics seems different - #2102.

@Centril

Centril Nov 2, 2018

Author Member

True; I have to think about if some unification can be done in some way here.

@Centril

Member Author

Centril commented Nov 2, 2018

@Diggsey

This looks like a super well thought out and comprehensive RFC.

Thanks!

It might be worth clarifying the behaviour of #[derive(...)] with respect to types like:

struct RectangleTidy {
    dimensions: {
        width: u64,
        height: u64,
    },
    color: {
        red: u8,
        green: u8,
        blue: u8,
    },
}

Presumably, there will be no "magic" here, and you will only be able to derive traits which are implemented for the anonymous structs themselves.

This bit is not currently well specified, but it should be so I will fix that (EDIT: fixed)... I see two different ways to do it:

  1. Use the structural records in there as types; this means that #[derive(Default)] would produce:

    impl Default for RectangleTidy {
        fn default() -> Self {
            RectangleTidy {
                dimensions: Default::default(),
                color: Default::default()
            }
        }
    }

    The main benefit of this approach is that it is quite simple to implement. It will also work for all derivable standard library traits because implementations are auto-provided for structural records.

  2. Use the structural records in there as syntax; this means that #[derive(Default)] would produce:

    impl Default for RectangleTidy {
        fn default() -> Self {
            RectangleTidy {
                dimensions: {
                   width: Default::default(),
                   height: Default::default(),
                },
                color: {
                    red: Default::default(),
                    green: Default::default(),
                    blue: Default::default(),
                }
            }
        }
    }

    This would need to be done recursively in the macro but it shouldn't be very hard to implement.

    The main benefit of this is flexibility. More types would be derivable.

    However, due to the automatically provided implementations as noted in 1. there is no benefit to this approach for the standard library so I think the sane behaviour is 1.

    That said, a crate like serde can benefit from using the "magical" approach in 2. and we certainly cannot mandate what a procedural macro must do in the language so it will be up to each derive macro to decide what to do.
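A minimal sketch of what approach 1 amounts to, using today's nominal structs as stand-ins for the structural records. The names `Dimensions` and `Color` are hypothetical; in the RFC the anonymous field types would take their place. The outer derive only calls `Default::default()` on each field, so it works whenever the field type itself implements the trait:

```rust
// Hypothetical named stand-ins for the anonymous record types in the RFC.
#[derive(Debug, Default, PartialEq)]
struct Dimensions {
    width: u64,
    height: u64,
}

#[derive(Debug, Default, PartialEq)]
struct Color {
    red: u8,
    green: u8,
    blue: u8,
}

// The derive treats each field as an opaque type and delegates to its
// `Default` impl, as described for approach 1.
#[derive(Debug, Default, PartialEq)]
struct RectangleTidy {
    dimensions: Dimensions,
    color: Color,
}

fn main() {
    let r = RectangleTidy::default();
    assert_eq!(r.dimensions, Dimensions { width: 0, height: 0 });
    assert_eq!(r.color, Color { red: 0, green: 0, blue: 0 });
}
```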

Another question is around trait implementations: at the moment, 3rd party crates can provide automatic implementations of their traits for "all" tuples by using macro expansion to implement them for 0..N element tuples.

Yup. Usually up to 12, emulating the way the standard library does this.
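For readers unfamiliar with the pattern, here is a rough sketch of the macro-based approach for tuples. `Describe` is a hypothetical third-party trait invented for this illustration, and the arities stop at 3 to keep it short:

```rust
// `Describe` is a hypothetical trait used only for illustration.
trait Describe {
    fn describe(&self) -> String;
}

impl Describe for u8 {
    fn describe(&self) -> String {
        format!("u8({})", self)
    }
}

impl Describe for bool {
    fn describe(&self) -> String {
        format!("bool({})", self)
    }
}

// Generates one impl per tuple arity. The type parameter names double as
// binding names when destructuring `self`.
macro_rules! impl_describe_for_tuple {
    ($($name:ident),*) => {
        impl<$($name: Describe),*> Describe for ($($name,)*) {
            #[allow(non_snake_case)]
            fn describe(&self) -> String {
                let ($($name,)*) = self;
                let parts: Vec<String> = vec![$($name.describe()),*];
                format!("({})", parts.join(", "))
            }
        }
    };
}

impl_describe_for_tuple!();
impl_describe_for_tuple!(A);
impl_describe_for_tuple!(A, B);
impl_describe_for_tuple!(A, B, C);

fn main() {
    assert_eq!((1u8, true).describe(), "(u8(1), bool(true))");
    assert_eq!(().describe(), "()");
}
```

Nothing comparable can enumerate every possible set of field names, which is the gap the question points at.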

Especially with the likely arrival of const generics in the not too distant future,
negating the need for macros entirely, anonymous structs will become second
class citizens.

Nit: to not have to use macros for implementing traits for tuples you need variadic generics, not const generics. :)

Is this a problem, and are there any possible solutions to allow implementing traits for all anonymous structs?

I think it might be possible technically; I've written down some thoughts about it in the RFC. However, the changes needed to make it possible might not be what folks want.

However, not solving the issue might also be good; by not solving the issue you add a certain pressure to gradually move towards nominal typing once there are enough operations and structure that you want on the type.

standard traits that are implemented for [tuples]. These traits are: `Clone`,
`Copy`, `PartialEq`, `Eq`, `PartialOrd`, `Ord`, `Debug`, `Default`, and `Hash`.
Each of these traits will only be implemented if all the field types of a struct
implement the trait.
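As a point of comparison, the existing behaviour for tuples can be checked on stable Rust. This sketch only exercises tuples; whether structural records should mirror it is what the RFC text above proposes:

```rust
use std::collections::HashSet;

fn main() {
    let a: (u8, bool) = (1, true);

    let b = a; // Copy: `a` remains usable afterwards
    let c = a.clone(); // Clone
    assert_eq!(a, b); // PartialEq / Eq (Debug is used for failure messages)
    assert_eq!(a, c);

    // PartialOrd / Ord: components are compared left to right.
    assert!(a < (2, false));
    assert!((1, false) < a);

    // Default: each component takes its own default value.
    assert_eq!(<(u8, bool)>::default(), (0, false));

    // Hash + Eq: tuples can be used as set members or map keys.
    let mut seen = HashSet::new();
    seen.insert(a);
    assert!(seen.contains(&(1, true)));
}
```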

@petrochenkov

petrochenkov Nov 2, 2018

Contributor

That's... quite a bit of magic.

@Centril

Centril Nov 2, 2018

Author Member

I agree; this is noted in the drawbacks.

@Centril

Centril Nov 3, 2018

Author Member

Ideas for possible magic reduction discussed below in #2584 (comment).

+ For `PartialEq`, each field is compared with same field in `other: Self`.

+ For `PartialOrd` and `Ord`, lexicographic ordering is used based on
the name of the fields and not the order given because structural records

@petrochenkov

petrochenkov Nov 2, 2018

Contributor

With macros fields can have names with same textual representation, but different hygienic contexts.

@Centril

Centril Nov 2, 2018

Author Member

Does that change anything wrt. the implementations provided tho? I don't see any problems with hygiene intuitively, but maybe I could, given some elaboration?

@petrochenkov

petrochenkov Nov 2, 2018

Contributor

This is probably relevant to other places where the order of fields needs to be "normalized" as well.

@Centril

Centril Nov 2, 2018

Author Member

Sure. :) But I'm not sure if you are pointing out a problem or just noting...
Any sorting / "normalization" is done post expansion.

@petrochenkov

petrochenkov Nov 2, 2018

Contributor

It's not clear to me how to sort hygienic contexts in stable order, especially in cross-crate scenarios. That probably can be figured out somehow though.

```rust
ty ::= ... | ty_srec ;
ty_srec ::= "{" (ty_field ",")+ (ty_field ","?)? "}" ;
```

@petrochenkov

petrochenkov Nov 2, 2018

Contributor

I don't think types can start with { syntactically.
There are multiple already existing ambiguities where parser assumes that types cannot start with {, and upcoming const generics introduce one more big ambiguity ({} is used to disambiguate in favor of const generic arguments).
#2102 uses struct { field: Type, ... } to solve this (and also to discern between struct { ... } and union { ... }).

@petrochenkov

petrochenkov Nov 2, 2018

Contributor

I'm not sure about expressions/patterns.
Unlimited lookahead may also be required, need to check more carefully.

@Centril

Centril Nov 2, 2018

Author Member

There are multiple already existing ambiguities where parser assumes that types cannot start with {

Such as? (examples please...)

Const generics allow literals and variables as expressions but anything else needs to be in { ... }.
This is not ambiguous with structural records (because the one-field-record requires a comma...) but requires lookahead.

Unlimited lookahead may also be required, need to check more carefully.

Scroll down ;) It's discussed in the sub-section "Backtracking".

@petrochenkov

petrochenkov Nov 2, 2018

Contributor

It's discussed in the sub-section "Backtracking".

It's a huge drawback.

@petrochenkov

petrochenkov Nov 2, 2018

Contributor

Such as? (examples please...)

Nothing that can't be solved by infinite lookahead, but here's "how to determine where where clause ends":

fn f() where PREDICATE1, PREDICATE2, { a: ...

Does { a: ... belong to the function body or to the third predicate?

@Centril

Centril Nov 7, 2018

Author Member

@Laaas Because it is inconsistent with Tuples/Tuple-structs and also less ergonomic.

I don't think it's confusing for readers, the one-field case is clearly disambiguated by an extra comma and that is consistent with how tuples are treated. In other words, if { x, } is confusing, then so is (x,).
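The tuple precedent referred to here can be seen on stable Rust. In this small sketch the function names are made up for illustration; the trailing comma is what separates a one-element tuple from a parenthesized expression:

```rust
// `(5,)` is a one-element tuple.
fn one_tuple() -> (i32,) {
    (5,)
}

// `(5)` is just the integer 5 wrapped in parentheses.
#[allow(unused_parens)]
fn parenthesized() -> i32 {
    (5)
}

fn main() {
    assert_eq!(one_tuple().0, 5);
    assert_eq!(parenthesized(), 5);
}
```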

@GrayJack

GrayJack Jan 10, 2020

If this pattern is going to have drawbacks for structural records, either because it slows the parser or because types can't start with { syntactically, couldn't we use .{ instead?

Inspired by the Zig programming language.

@Centril

Centril Jan 10, 2020

Author Member

Actually, I've been thinking about this recently and I've come to the conclusion that if we lean into the ambiguity, then this is a non-issue.

The need for backtracking here stems from ambiguity with type ascription. This can be fixed by simply assuming that { ident: is always going to be a structural record, either in expression or type position (also solves the const generics issue). That is, if given { foo: Vec<u8> } in an expression context, we will attempt parsing this as a structural record, and then fail at < with a parse error (backtracking can be done inside diagnostics only if we want to). This way, we have retained LL(k) (standing at {, the : is 2 tokens away and so therefore k = 2 in this instance).

If a user wants to disambiguate in favor of ascription, they can write { (ident: Type) }. Disambiguating in this way should not be a problem as { ident: Type } is not a very useful thing to write. Specifically, in the case of PREDICATE2, { a: ..., this should not be a problem because the return type itself is a coercion site and informs type inference.

@comex

comex Jan 11, 2020

Why would it fail at <? If you're trying to parse Vec<u8> as an expression, wouldn't the < be parsed as less-than, and you only fail when you see > (and in general arbitrarily far away)?

It seems like the same issue as when we tried to get rid of turbofish...
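The turbofish comparison can be made concrete on stable Rust. In expression position a `<` after a path is read as a comparison operator, which is why generic arguments there need the `::<...>` form. The function name below is made up for illustration:

```rust
fn empty_bytes() -> Vec<u8> {
    // In expression position the generic arguments need `::<...>`.
    // Writing `Vec<u8>::new()` here is rejected, because `Vec < u8`
    // starts out looking like a comparison.
    Vec::<u8>::new()
}

fn main() {
    let (a, b, c) = (1, 2, 3);
    // `<` and `>` used as ordinary comparison operators in an expression.
    let ordered = a < b && c > b;
    assert!(ordered);
    assert!(empty_bytes().is_empty());
}
```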

@Centril

Centril Jan 11, 2020

Author Member

@comex Oh; yeah, you're right. It should fail at >. Still, we can take a snapshot before trying to parse the expression after ident: and backtrack if it turned out to be a type and provide good MachineApplicable diagnostics here.

It seems like the same issue as when we tried to get rid of turbofish...

Well in the sense that turbofish would also be unnecessary if we are OK with backtracking, but it seems like some people object to backtracking, so I've come up with a solution using LL(k) at the cost of not being able to recognize { binding: Type } as a block with a tail-expression ascribing binding. Because such code is, in my view, pathological, that should be fine. It's much less of an issue than turbofish from a learnability POV imo.

```rust
field_init ::= ident | field ":" expr ;
```

Note that this grammar permits `{ 0: x, 1: y }`.

@petrochenkov

petrochenkov Nov 2, 2018

Contributor

#1595 was closed, so I'd expect this to be rejected.
This saves us from stuff like "checking that there are no gaps in the positional fields" as well.

@Centril

Centril Nov 2, 2018

Author Member

Yeah probably; I've left an unresolved question about it.

@varkor

varkor Nov 3, 2018

Member
struct A(u8);
let _ = A { 0: 0u8 };

is accepted, so this is currently inconsistent as it stands anyway.
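A self-contained version of the same point, which compiles on stable Rust. The helper names `build` and `unwrap` are made up for illustration:

```rust
struct A(u8);

fn build(value: u8) -> A {
    // Tuple-struct fields can be initialized by their positional names.
    A { 0: value }
}

fn unwrap(a: A) -> u8 {
    // The same positional names are accepted in patterns.
    let A { 0: inner } = a;
    inner
}

fn main() {
    assert_eq!(unwrap(build(7)), 7);
}
```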

@Centril

Centril Nov 4, 2018

Author Member

@varkor yeah that was accepted to make macro writing easier (and it totally makes it easier...!)

@petrochenkov

Contributor

petrochenkov commented Nov 2, 2018

I think the language should be frozen for large additions like this in general, for a couple of years at least.
This is not a crucial feature and not a completion of another crucial feature, motivation seems really insufficient for a change of this scale.

@scottmcm

Member

scottmcm commented Nov 3, 2018

The trait auto-implementation magic worries me, mostly from the point of view of things not the compiler wanting to implement traits on things. Today crates can do a macro-based implementation for the important tuples, and I can easily see a path to variadic generics where the impls could easily support any tuple. But I can imagine serde really wanting to be able to support these, so I wouldn't want to accept them without some sort of plan -- especially as we're getting closer to finally having good stories for our two other structural type families.

I'd also be tempted to block this on changing struct literal syntax to using = -- I really don't like the lookahead requirement here.

Overall, I think I feel that this is cool, but not necessary.

@clarfon

Contributor

clarfon commented Nov 3, 2018

I'm very concerned with properties of the struct explicitly relying on lexicographic ordering. I don't think that Ord or PartialOrd should be implemented for structural records-- rather, they should be treated as having unordered fields. If someone wishes to enforce Ord or PartialOrd, I feel that they should be required to either name the struct or use a tuple.

I think that defining lexicographic behaviour for Hash and Debug, however, is completely fine. Hash really only needs to uphold that x == y => hash(x) == hash(y), and any ordering (as long as it's consistent) will satisfy this. Additionally, lexicographic ordering makes the most sense for debugging.

But as a user, I would be surprised to find the below code fail:

let first = { foo: 1, bar: 2 };
let second = { foo: 2, bar: 1};
assert!(first < second);
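
The surprise can be reproduced today with named structs, since derived `PartialOrd` compares fields in declaration order. This is only a sketch: `ByName` and `AsWritten` are hypothetical stand-ins, with `ByName` declaring its fields in lexicographic name order to mimic the ordering the RFC text proposes:

```rust
// Fields declared in lexicographic name order (`bar` before `foo`).
#[derive(Debug, PartialEq, PartialOrd)]
struct ByName {
    bar: i32,
    foo: i32,
}

// Fields declared in the order they are written in the example above.
#[derive(Debug, PartialEq, PartialOrd)]
struct AsWritten {
    foo: i32,
    bar: i32,
}

fn main() {
    let first = ByName { foo: 1, bar: 2 };
    let second = ByName { foo: 2, bar: 1 };
    // `bar` is compared first: 2 > 1, so `first < second` is false.
    assert!(!(first < second));

    let first = AsWritten { foo: 1, bar: 2 };
    let second = AsWritten { foo: 2, bar: 1 };
    // `foo` is compared first: 1 < 2, so `first < second` is true.
    assert!(first < second);
}
```
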
@Centril

Member Author

Centril commented Nov 3, 2018

@clarcharr

I've added an unresolved question to be resolved prior to merging (if we do that...) about whether (Partial)Ord should be provided or not. I'm certainly open to your suggestion.

@scottmcm

I'd also be tempted to block this on changing struct literal syntax to using = -- I really don't like the lookahead requirement here.

I would have loved if we had used = instead... but blocking on that seems like blocking on a thing that no one will agree to changing because it would cause churn in more or less every single Rust program/library ever written... ;)

The trait auto-implementation magic worries me, mostly from the point of view of things not the compiler wanting to implement traits on things.

Sure; I entirely agree with that sentiment; it is by far the biggest drawback.

Today crates can do a macro-based implementation for the important tuples, and I can easily see a path to variadic generics where the impls could easily support any tuple. But I can imagine serde really wanting to be able to support these, so I wouldn't want to accept them without some sort of plan -- especially as we're getting closer to finally having good stories for our two other structural type families.

I noted somewhere in the RFC that with the combination of const generics and variadic generics, you may be able to extend this to structural records as well. For example (please note that this is 100% a sketch), using a temporary syntax due to Yato:

impl
    <*(T: serde::Serialize, const F: meta::Field)>
    // quantify a list of pairs with:
    // - a) type variables T all bound by `serde::Serialize`,
    // - b) const generic variables F standing in for the field of type meta::Field
    //       where `meta` is the crate we reserved and `Field` is a compile time
    //       reflection / polymorphism mechanism for fields.
    serde::Serialize
for
   { *(F: T) }
   // The structural record type; (Yes, you need to extend the type grammar *somehow*..)
{
    // logic...
}

This idea is inspired by Glasgow Haskell's Symbol.

@burdges


burdges commented Feb 26, 2019

We noticed a handy trick for replacing structural records:

We permit types declared pub inside fn bodies to be referenced from outside the fn, which naturally prevents them from using type parameters of the fn declaration, so

impl<V> Storage for FooStore {
    pub fn insert<'o,K>(&'o mut self, i: insert::Input<K>) -> insert::Output<'o> {
        #[derive(...)]
        pub struct Input<K> { k: K, v: V }
        impl From<(K,V)> for Input<K> { ... }
        ...
        pub struct Output<'oo> { old_value: Option<&'oo V> }
        Output { old_value }
    }
}

In essence, insert::Input/Output now serve as our "structural records" but we address all concerns like traits directly.

We might support more polymorphism this way than with structural records, but obviously structural records simplify everything sometimes. We've some complex type parameters in insert::<V>::Input<K> or whatever in this case. We'll run into worse concerns if we ever permit defining types inside impls though because at least here insert acts like a submodule. And consistency with ATCs should make this fairly canonical.
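Stable Rust does not let items declared inside a fn body be named from outside it, so the closest approximation available today is a sibling module named after the function. The following is a rough sketch under that assumption: the concrete `String`/`u32` types and the `HashMap` backing store are illustrative choices, and the borrowed `old_value` from the snippet above is simplified to an owned value:

```rust
use std::collections::HashMap;

// Hypothetical module standing in for the `insert::` path used above.
pub mod insert {
    #[derive(Debug)]
    pub struct Input<K, V> {
        pub k: K,
        pub v: V,
    }

    impl<K, V> From<(K, V)> for Input<K, V> {
        fn from((k, v): (K, V)) -> Self {
            Input { k, v }
        }
    }

    #[derive(Debug, PartialEq)]
    pub struct Output<V> {
        pub old_value: Option<V>,
    }
}

pub struct FooStore {
    map: HashMap<String, u32>,
}

impl FooStore {
    pub fn new() -> Self {
        FooStore { map: HashMap::new() }
    }

    // The argument and return types are named through the `insert` module.
    pub fn insert(&mut self, i: insert::Input<String, u32>) -> insert::Output<u32> {
        insert::Output { old_value: self.map.insert(i.k, i.v) }
    }
}

fn main() {
    let mut store = FooStore::new();
    let first = store.insert(("a".to_string(), 1).into());
    assert_eq!(first, insert::Output { old_value: None });
    let second = store.insert(("a".to_string(), 2).into());
    assert_eq!(second.old_value, Some(1));
}
```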

@Stargateur


Stargateur commented Feb 26, 2019

@Stargateur where is Color defined? If I use libA::Color it's not compatible with libB::Color so if libA gives me a Color struct I have to convert it to libB::Color before calling libB's function

@iopq And what if one library uses { r: u8, g: u8, b: u8 } and the other { red: u8, green: u8, blue: u8 }?

@ssokolow


ssokolow commented Feb 26, 2019

@iopq And what if one library uses { r: u8, g: u8, b: u8 } and the other { red: u8, green: u8, blue: u8 }?

From that "optimize for interoperation without conversion" perspective, the best solution is a (u8, u8, u8) tuple, which Rust already supports, since there are no names to disagree and, in the most common case (RGB component ordering), they'll Just Work™. (At the cost of risking Perl/PHP-like "do what I mean" behaviour which compiles but breaks in confusing ways when said assumptions are violated.)

Of course, it's also always possible that you'll find libA using Color { red: u8, green: u8, blue: u8 }, libB using { r: u8, g: u8, b: u8 } and libC using (u8, u8, u8) if all three are supported.

There's no magic bullet. The best you can get is to have a standard crate which just defines types like Color (similar to the http crate) and encourage everyone to depend on it rather than rolling their own types.
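To illustrate the conversion cost being discussed, here is a minimal sketch in which `lib_a` and `lib_b` are hypothetical crates modeled as modules. In real code the orphan rule means the `From` impl would have to live in one of the two crates, or be replaced by a free function in the caller's crate:

```rust
// Hypothetical crate A: spells out the component names.
mod lib_a {
    #[derive(Debug, Clone, Copy, PartialEq)]
    pub struct Color {
        pub red: u8,
        pub green: u8,
        pub blue: u8,
    }
}

// Hypothetical crate B: abbreviates them.
mod lib_b {
    #[derive(Debug, Clone, Copy, PartialEq)]
    pub struct Color {
        pub r: u8,
        pub g: u8,
        pub b: u8,
    }

    pub fn brightness(c: Color) -> u16 {
        u16::from(c.r) + u16::from(c.g) + u16::from(c.b)
    }
}

// The field-by-field mapping has to be written out whether the types are
// nominal structs or records with differently named fields.
impl From<lib_a::Color> for lib_b::Color {
    fn from(c: lib_a::Color) -> Self {
        lib_b::Color { r: c.red, g: c.green, b: c.blue }
    }
}

fn main() {
    let from_a = lib_a::Color { red: 255, green: 128, blue: 0 };
    assert_eq!(lib_b::brightness(from_a.into()), 383);
}
```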

@jswrenn


jswrenn commented Feb 26, 2019

@Stargateur totally valid concerns, and for those reasons, I don't think we'll see ad hoc record types appear in the public API of many crates. Ad hoc records aren't really a replacement for named records; they're more closely a substitute for Rust's other ad hoc type: tuples.

We generally use tuples for the one-off packings of data that you need when you're doing transformations over that data; e.g., iterator chains, or internal helper functions. Tuples are the right tool to use when giving a name to a type is actually more cognitive load than just bunching the data all together (and there is some cognitive load to naming a type: you need to pick a name, pick a place to define it, manage imports, and look back at the definition). The big problem with tuples is they aren't really self-documenting: you need to always remember what order the components are in. If the components have different types, you'll be saved from mistakes by a compiler error. If they don't, you're hosed. I think of named records as basically just being tuples where the fields aren't named 0, 1, 2, etc.

And that's the role I need them for. In my work I do a lot of transformations over deep, tree-structured data for analysis purposes. The input types are all named structures (thanks serde!) and the output types are all named structures too. What happens in the middle is a complete mess of tuples, though. I do dozens of one-off transformations of data. The cognitive load (and code bloat) of turning all of those tuples into named structures would be totally unmanageable. But: trying to decipher iterator chains that use tuples also sucks! Ad hoc records are going to really improve my quality of life in programming Rust.
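A small sketch of the trade-off described above, with made-up data and function names. The first version carries an intermediate tuple through the chain; the second pays for a local struct definition to get field names, which is the overhead an ad hoc record would avoid:

```rust
// Tuple version: the reader has to remember that `.0` is the name and
// `.1` is the weighted score.
fn best_tuple(rows: &[(&str, u32)]) -> Option<String> {
    rows.iter()
        .map(|&(name, score)| (name.to_uppercase(), score * 2))
        .max_by_key(|t| t.1)
        .map(|t| t.0)
}

// Named version: self-documenting, but needs a one-off type definition.
fn best_named(rows: &[(&str, u32)]) -> Option<String> {
    struct Scored {
        name: String,
        weighted: u32,
    }

    rows.iter()
        .map(|&(name, score)| Scored { name: name.to_uppercase(), weighted: score * 2 })
        .max_by_key(|s| s.weighted)
        .map(|s| s.name)
}

fn main() {
    let rows = [("a", 1), ("b", 3), ("c", 2)];
    assert_eq!(best_tuple(&rows), Some("B".to_string()));
    assert_eq!(best_named(&rows), Some("B".to_string()));
}
```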

@burdges


burdges commented Feb 26, 2019

I think transformations over deeply structured data sounds like lenses, but structural records would often be simpler.

@iopq

Contributor

iopq commented Feb 27, 2019

But Color has a canonical ordering

A 2d rectangle doesn't, but { top-left: (10, 10), width: 190, height: 50 } seems good

Not having to declare it somewhere seems like a win

@ssokolow


ssokolow commented Feb 28, 2019

Actually, color does not have a canonical ordering. For historical reasons, you run into BGR as a legacy detail that has persisted in various Windows file formats and APIs and it also crops up when the abstraction leaks on a little-endian system.

For example, I got tripped up when using PyOpenCV because its use of 32-bit integers as an internal representation results in BGR(A) content on a little-endian processor and (A)RGB on a big-endian processor. (I wound up very confused about why my images were coming out wrong until I finally managed to determine that it was my assumptions about the mapping between color components and indices... the exact kind of bug named struct fields prevent.)

Likewise, Google's design document for Skia has a section named "The epic battle of RGBA and BGRA" which talks about having to do conversion between RGB(A) file formats and BGRA memory representations. (Another situation where bugs are likely to creep in when using numeric indices.)

As for 2D rectangles, I've always seen them in one of two representations:

  • (x, y, width, height) (the most common and what you see in the constructor for QRect and the struct definitions for GdkRectangle and cairo_rectangle_int_t)
  • (x1, y1, x2, y2) (as used in things like the SetRect function in the Win32 API.)

In both cases, taking after the (x, y) ordering people learn when graphing points in high school.

@scottmcm

Member

scottmcm commented Feb 28, 2019

We generally use tuples for the one-off packings of data that you need when you're doing transformations over that data; e.g., iterator chains, or internal helper functions.

Note that this is the same thing that C# hit, which I mentioned in #2584 (comment) -- it doesn't necessarily need structural types, just anonymous ones often work great.

@ssokolow ssokolow mentioned this pull request Mar 7, 2019
@jhpratt


jhpratt commented Apr 29, 2019

I don't believe anyone else has mentioned this, but going back to July 2012, Rust 0.3 had support for what is described to be "structural records", and the syntax is similar. (docs). I'm not sure if that syntax just morphed into the current struct, or if it was actually implemented differently. If it's the latter, that could potentially assist with implementation.

@Ixrec

Contributor

Ixrec commented Apr 29, 2019

The old structural records were intentionally removed. See https://internals.rust-lang.org/t/pre-rfc-unnamed-struct-types/3872/31?u=ixrec IIUC, the current proposal and its context are different enough (if only because we already have nominal structs) that the reasoning in that post does not directly apply here.

@LukeSkyw


LukeSkyw commented Jul 24, 2019

I am not an expert in Rust and don't pretend to understand everything being said in this thread, but I am very interested in this proposal and would like to see it adopted. So I am trying to sum up the outstanding issues that have popped up:

  • Syntax similar to #2102
  • Traits auto-implementation magic
  • Grammar issues (unlimited lookahead / Backtracking), abandon LL(k) parser
  • This change would be a significant change but language should be frozen for a couple of years
    -> It would just be an extension to struct for what is being done with tuples
  • Lexicographic ordering of unsized fields
    -> It would be treated as tuple

Is this RFC stalled because the Rust language team has no interest in pursuing it further? If so, shouldn't it be closed?

@burdges


burdges commented Jul 24, 2019

There are much higher priorities than this syntactic sugar, and all syntax incurs some cognitive cost, so assessing the costs vs. benefits remains hard, even if you can solve the various outstanding issues.

I proposed in #2584 (comment) that we merely allow pub declarations from inside fn bodies, which suffices for most use cases without complicating anything, except perhaps doc comments, and plays well with visibility rules like pub(crate) too. If doing pub struct inside fn bodies became popular, then one could survey usage for evidence of structural records' value.

There are even more limited options like attaching items doc comments to other items, but maybe too limited for serious adoption.

@iopq

Contributor

iopq commented Jul 24, 2019

It's not syntactic sugar, it actually creates a new kind of types. It lets you easily group things when passing stuff between functions of different crates. Right now you either need a struct in std, or to import a common crate.

@jswrenn


jswrenn commented Jul 24, 2019

@burdges, I'm impressed with how many use-cases are covered by your proposal (in large part because of @scottmcm's closely related comment #2584 (comment)). I believe this is a viable alternative to structural records for the purposes of code generation (which is one of my motivations for wanting structural records #2584 (comment)). I would really love to see this feature explored in an RFC. However:

  • The uses of structural records are not entirely covered by your proposal. As @iopq notes, structural records are an actual extension to the type system that let you express has-field requirements on types.
  • Surface syntax matters a lot for usability...

When I need an ad-hoc structure type in an iterator chain, I often choose to use a tuple, because a full-blown structure definition would decrease the readability of my code. In these situations, I would like to use a structural record, but must use a tuple instead.

This decision always leaves me deeply unsatisfied, because tuples are almost always bad for readability. A tuple is actually just a special-case of a structural record, in which the field names are consecutive integers. (This correspondence is already reflected in Rust.)

Let's imagine an alternate reality where Rust had ad-hoc records, but no parenthetical tuple initialization syntax for it. How often would you choose to use numbers as field names?

For me, it would be about as often as I use variable names like var3.

@LukeSkyw


LukeSkyw commented Jul 24, 2019

For what it's worth, this pattern is actually familiar to JavaScript/TypeScript developers (whose numbers are growing, I believe):

function foo(): { bar: Number, baz: Number } {
  const bar = // ... compute some value
  const baz = // ... compute some value
  return { bar, baz }
}

const { bar, baz } = foo();

When used with de-structuring, it is a very convenient and expressive way of reducing cognitive load when returning data that do not necessarily fit together into a predefined struct.

EDIT: By the way, the object literal shorthand syntax could be nice to have too, but let's not push our luck ;)
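For comparison, the nearest equivalent in today's Rust needs a named type for the return value. `FooResult` and the constant values are hypothetical; note that field init shorthand and destructuring already work for nominal structs:

```rust
// Hypothetical named type; with structural records this declaration
// would not be needed.
struct FooResult {
    bar: f64,
    baz: f64,
}

fn foo() -> FooResult {
    let bar = 1.5;
    let baz = 2.5;
    // Field init shorthand: `bar` stands for `bar: bar`.
    FooResult { bar, baz }
}

fn main() {
    // Destructuring at the call site, as in the TypeScript example.
    let FooResult { bar, baz } = foo();
    assert_eq!(bar + baz, 4.0);
}
```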

@burdges


burdges commented Jul 24, 2019

Interestingly, you could likely define a struct!{} procedural macro for some form of structural record using pub inside fn, although it requires type annotations. You cannot customize traits for structural records.

You also cannot define anonymous types, ala closures, using pub inside fn. I donno if structural record should be anonymous types though, i.e. you could still destructure, so let struct { foo, bar } = struct { foo, bar }; works, but struct {a,b} == struct {a,b} gives a type error.

@Stargateur


Stargateur commented Jul 24, 2019

It lets you easily group things when passing stuff between functions of different crates. Right now you either need a struct in std, or to import a common crate.

I already proved this is wrong. The same will happen with structural records. It only works when you are lucky enough to use the same names; if not, you will also need a breaking change, like today.

@jswrenn


jswrenn commented Jul 30, 2019

I think this RFC's take on structural records is interesting in its own right, but I'm also excited about the future design possibilities it opens up.

Row Polymorphism

At the bottom of #2584 (comment), I wrote that the potential future addition of row polymorphism to structural records had exciting ergonomics implications. E.g.:

// Transform a stream of things with Euclidean coordinates into a stream of
// things with polar coordinates.
//
// `Ρ` denotes "some collection of additional fields and types"
//
// this type signature conveys that the _only_ thing altered between the
// input and output types is the `pos` field.
fn convert<Ρ>(input: impl IntoIterator<Item={pos: EuclidianCoordinate, ..Ρ}>)
                      -> impl Iterator<Item={pos: PolarCoordinate,     ..Ρ}>
{
    input
        .into_iter()
        .map(| {pos, ..ρ} : {pos: EuclidianCoordinate, ..Ρ} |
            {pos: pos.into_polar_coordinate(), ..ρ})
}
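Until something like `..Ρ` exists, the same shape can be approximated in today's Rust with an ordinary generic parameter that carries "everything else". The following is only a sketch under that assumption: `WithPos`, its `rest` field, and the bodies of the two coordinate types are hypothetical and not part of the proposal.

```rust
// Hypothetical coordinate types, standing in for the ones named above.
#[derive(Debug)]
struct EuclidianCoordinate {
    x: f64,
    y: f64,
}

#[derive(Debug)]
struct PolarCoordinate {
    r: f64,
    theta: f64,
}

impl EuclidianCoordinate {
    fn into_polar_coordinate(self) -> PolarCoordinate {
        PolarCoordinate {
            r: self.x.hypot(self.y),
            theta: self.y.atan2(self.x),
        }
    }
}

// `Rest` plays the role of the row variable: every other field travels in it.
struct WithPos<P, Rest> {
    pos: P,
    rest: Rest,
}

// Only the type of `pos` changes between input and output; `Rest` is untouched.
fn convert<Rest>(
    input: impl IntoIterator<Item = WithPos<EuclidianCoordinate, Rest>>,
) -> impl Iterator<Item = WithPos<PolarCoordinate, Rest>> {
    input.into_iter().map(|WithPos { pos, rest }| WithPos {
        pos: pos.into_polar_coordinate(),
        rest,
    })
}

fn main() {
    let input = vec![WithPos {
        pos: EuclidianCoordinate { x: 3.0, y: 4.0 },
        rest: "some label",
    }];
    for item in convert(input) {
        println!("{:?} ({})", item.pos, item.rest);
    }
}
```

The cost of this emulation is that callers must wrap their data in `WithPos` by hand, which is exactly the boilerplate row polymorphism would remove.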

Given this...

Generic Trait Impls over Ad-Hoc Types

...I think there's a natural path towards generic trait impls over ad-hoc types like tuples.
As I pointed out in #2584 (comment), tuples are just a special case of ad-hoc record types where the fields are named with consecutive natural numbers.

To borrow a bit of syntax from advanced_slice_patterns, I can imagine writing something like this to express PartialEq for arbitrary ad-hoc record types (tuples included!):

// `Ρ` denotes "some collection of additional fields and types"
// `_ : T` denotes a field of any name
impl<T, Ρ> PartialEq for {_ : T, Ρ..}
where
    T: PartialEq,
    Ρ: PartialEq,
{
    // `rhs_first@_ : T` binds the value of the matched field (of type `T`) to `rhs_first`
    fn eq(&self, other: &{rhs_first@_ : T, rhs_rest@Ρ..}) -> bool {
        let {lhs_first@_ : T, lhs_rest@Ρ..} = self;

        (lhs_first == rhs_first) && (lhs_rest == rhs_rest)
    }
}

This would cover some use-cases of #2702 and avoid the layout implications.
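The "first field plus the rest" recursion above can already be written in stable Rust if the record is encoded as a heterogeneous list. This is a sketch of that encoding, not of the proposed syntax; `Cons` and `Nil` are hypothetical names, and in practice `#[derive(PartialEq)]` would generate equivalent impls.

```rust
// Hypothetical heterogeneous list: `Cons(first, rest)`, terminated by `Nil`.
struct Nil;
struct Cons<H, T>(H, T);

// Base case: two empty lists are always equal.
impl PartialEq for Nil {
    fn eq(&self, _other: &Nil) -> bool {
        true
    }
}

// Recursive case: compare the first element, then the rest.
impl<H: PartialEq, T: PartialEq> PartialEq for Cons<H, T> {
    fn eq(&self, other: &Self) -> bool {
        let Cons(lhs_first, lhs_rest) = self;
        let Cons(rhs_first, rhs_rest) = other;
        (lhs_first == rhs_first) && (lhs_rest == rhs_rest)
    }
}

fn main() {
    let a = Cons(1u8, Cons(true, Nil));
    let b = Cons(1u8, Cons(true, Nil));
    let c = Cons(1u8, Cons(false, Nil));
    println!("a == b: {}, a == c: {}", a == b, a == c);
}
```

What this encoding loses compared with the sketch above is field names and the native tuple or record layout.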

@zzyyxxww

zzyyxxww commented Oct 12, 2019

I can understand the opposition to this rfc based on implementation/technical difficulties.
But I can't understand deeming this as "not necessary" or just "cool". I consider this feature so important that, if implemented, it would elevate rust way, way above other static langs (if it wasn't high enough already).

Excessive types and type conversion code have been a big problem for a long time for me when using statically typed langs.

For example, let's say you have a User struct that has Name, LastName, Address, City, Country, all of type String.

Usually what happens is that some functions access some of these fields, but not all of them.
So, it makes sense to only pass those fields.
This results in types with all sorts of horrible names that depend on what the dev of the moment feels like, such as UserWithNameAndLastName, UserWithoutCountry, or sometimes omitting the struct name like CountryCity or CountryAndCity (boilerplate).

It basically turns into a wild west of types and names.

You could say "just pass the entire struct", but sometimes you have similar structs like AdminUser. So now you have to write User from/to AdminUser conversion code (more boilerplate).

And you can expect this boilerplate to happen when trying to combine funcs from different crates.

You could say "just pass a tuple or separate arguments". But tuples or separate arguments are just awful for our example because every field has the same type: if you "change" the tuple (as in, you expect different values or a different order) but keep the same length, you won't get any compile error, yet every call is now passing incorrect values.
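A small sketch of that hazard in today's Rust. The `Location` struct and both functions are hypothetical, with field names borrowed from the `User` example above.

```rust
// Hypothetical type carrying two of the `User` fields.
struct Location {
    city: String,
    country: String,
}

// Positional version: both elements are `String`, so order is all that
// distinguishes them.
fn describe_tuple((city, country): (String, String)) -> String {
    format!("{city}, {country}")
}

// Named version: fields are matched by name, not by position.
fn describe_struct(Location { city, country }: Location) -> String {
    format!("{city}, {country}")
}

fn main() {
    let city = String::from("Lyon");
    let country = String::from("France");

    // Arguments swapped by mistake: this still type-checks and is silently wrong.
    let wrong = describe_tuple((country.clone(), city.clone()));

    // Fields written in a different order: still correct, because names decide.
    let right = describe_struct(Location { country, city });

    println!("tuple:  {wrong}");
    println!("struct: {right}");
}
```

A structural record would give the second behaviour without having to declare `Location` first.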

So, the best choice is to stick to structs, but it results in annoying and needless amounts of boilerplate code.

The concept of ad-hoc types is very valuable. That's why there's this relief when going from a statically typed language to, for example, JavaScript: you realize that you don't need to write superfluous types or superfluous type conversion code anymore (at the expense of total uncertainty, ofc). But I hope Rust acquires this feature at some point.

Regarding comments from other users:

@ssokolow says "use common types from the same crate". That is too ideal and doesn't happen in practice. People will declare whatever types they want to and I think that's fine.

@Stargateur says that if two structs have equivalent but different names, then you will get a breaking change. Yes, but the same can happen with tuples if they are expected in different orders.
And the tuple way is much worse for values of the same type: passing values in the wrong order won't cause a compile error, but passing the wrong field names will.

@burdges

burdges commented Oct 12, 2019

I do like structural records, but they interact closely with almost every major type system extension being considered. I've posted 5-10 comments exploring such interactions inside different threads here. I'd suggest that delegation and trait fields be sorted out first in particular, which sounds like another couple years. And variadic tuples should wait a couple years for similar reasons.

All your examples sound like dynamic issues not really resolvable at compile time.

As a rule, anything like UserWithoutCountry should be handled with Option or Result. In particular, if records represent people then it's unethical to require much data, and it'll increasingly become illegal, à la GDPR. Also, if records represent places then staggering variations already exist, so again Options everywhere, etc.

AdminUser might employ deref polymorphism, but all the details depend on whether AdminUser and User contain the same, orthogonal, etc. data. I think structural records cannot simplify this case.

In more constrained cases, you should often provide a statically typed exterior that enforces any important constraints, but use appropriate "dynamism" internally, both for developer sanity and to represent the full protocol flow.

If your code has extremely hairy types then you could always go fully dynamic by using HashMap<&'static str, Box<dyn Any>> along with getters and setters that call downcast_ref or downcast_mut.
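A minimal sketch of what such a fully dynamic record could look like, using only `std`. The `DynRecord` type and its method names are hypothetical; the values are boxed because `dyn Any` is unsized and cannot be stored in the map directly.

```rust
use std::any::Any;
use std::collections::HashMap;

/// Hypothetical fully dynamic record: field names map to type-erased values.
#[derive(Default)]
struct DynRecord {
    fields: HashMap<&'static str, Box<dyn Any>>,
}

impl DynRecord {
    /// Setter: stores any `'static` value under `name`.
    fn set<T: Any>(&mut self, name: &'static str, value: T) {
        self.fields.insert(name, Box::new(value));
    }

    /// Getter: `None` if the field is missing or holds a different type.
    fn get<T: Any>(&self, name: &str) -> Option<&T> {
        self.fields.get(name)?.downcast_ref::<T>()
    }

    /// Mutable getter with the same failure modes as `get`.
    fn get_mut<T: Any>(&mut self, name: &str) -> Option<&mut T> {
        self.fields.get_mut(name)?.downcast_mut::<T>()
    }
}

fn main() {
    let mut user = DynRecord::default();
    user.set("name", String::from("Ada"));
    user.set("age", 36u32);

    if let Some(age) = user.get_mut::<u32>("age") {
        *age += 1;
    }

    println!("name = {:?}", user.get::<String>("name"));
    println!("age = {:?}", user.get::<u32>("age"));
    // Asking for the wrong type is a runtime `None`, not a compile error.
    println!("age as String = {:?}", user.get::<String>("age"));
}
```

Every field access now fails at runtime instead of compile time, which is the trade-off the paragraph above describes.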

I've never done that, but I once used an Rs: BTreeMap<PublicKey,CoR> where CoR was a state machine enum. These Rs lived inside structs that represented a tightly related state machine at the type level. All those state machines felt like an annoying dynamic aspect, but they represent protocol interactions, so they are not really avoidable.

Just fyi, there are also formal verification techniques like refinement types that push more dynamic information into the type level. At least refinement types do provide remarkable improvements in syntax, like the whole if !let .. business should be done with refinement types, so more improvements likely exist here. Yet, ideally one should do refinement types, etc. with some compiler extension, not rustc itself, partially for compiler performance reasons.

@Centril Centril mentioned this pull request Oct 25, 2019
1 of 1 task complete
@phaazon

Contributor

phaazon commented Dec 3, 2019

First, thank you so much for this RFC. I didn’t know it existed until today. It’s always a pleasure to read such well-written RFCs. Congrats!

I consider this feature so important that, if implemented, it would elevate rust way, way above other static langs (if it wasn't high enough already).

I disagree with this. As stated above, I think this feature is not about necessity but about ergonomics. It doesn’t bring anything new (you could be using a tuple, or a one-shot struct). It’s just comfort; it won’t make your program more correct.

I don’t quite know how to feel about it, to be honest. Currently, my mind is just picturing them as “unordered tuples with records”. I personally think I have no use for them — types are everything to me: the stronger the better. I sometimes need ordered unnamed types (i.e. tuples), but unordered tuples… I don’t see any useful use case for them.

About positional arguments in functions, I sometimes wonder whether I would be for them or not. But I agree this RFC shouldn’t be considered regarding positional arguments.

@phlopsi

phlopsi commented Dec 18, 2019

This is awesome. I can't wait to try this out. For someone like me, who values code readability highly, this is the perfect feature.

It comes in handy whenever I have to temporarily store partial structs in some collection and I don't want to create a type that is only used in 1 to 2 functions and e.g. call it PartialMyStruct, which doesn't really tell you anything about which part of MyStruct it contains. The alternative is to use some tuple, e.g. (usize, usize), but that isn't one bit more expressive than the throw-away struct from before. If I can express the type as e.g. Vec<{ source_id: usize, target_id: usize }>, it'd drastically improve code readability!

@matthew-mcallister

matthew-mcallister commented Jan 23, 2020

In addition to containers, such a feature would combat a common breed of boilerplate caused by mutexes:

struct ObjectPool {
    memory_limit: usize,
    objects: Mutex<{
        total_objects: usize,
        free: Vec<Object>,
    }>,
}

Typically you either mutex the whole object or define a poorly named struct like ObjectPoolInner that you can put inside the mutex. This kind of nesting often combines with containers, and it's not hard to get 3 levels deep, e.g. RwLock<{ ..., HashMap<_, { ..., Mutex<{ ... }> }> }>.
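For reference, this is roughly what the workaround looks like in today's Rust. It is a sketch only: `Object` is an empty placeholder, and the `acquire`/`release` methods are hypothetical additions to show how the inner struct is used behind the lock.

```rust
use std::sync::Mutex;

// Hypothetical pooled resource.
struct Object;

// The separately declared "inner" struct that the anonymous record would replace.
struct ObjectPoolInner {
    total_objects: usize,
    free: Vec<Object>,
}

struct ObjectPool {
    #[allow(dead_code)] // kept to mirror the example above; unused in this sketch
    memory_limit: usize,
    objects: Mutex<ObjectPoolInner>,
}

impl ObjectPool {
    fn new(memory_limit: usize) -> Self {
        ObjectPool {
            memory_limit,
            objects: Mutex::new(ObjectPoolInner {
                total_objects: 0,
                free: Vec::new(),
            }),
        }
    }

    // Reuse a free object if there is one, otherwise create a new one.
    fn acquire(&self) -> Object {
        let mut inner = self.objects.lock().unwrap();
        match inner.free.pop() {
            Some(object) => object,
            None => {
                inner.total_objects += 1;
                Object
            }
        }
    }

    // Return an object to the free list.
    fn release(&self, object: Object) {
        self.objects.lock().unwrap().free.push(object);
    }

    fn total_objects(&self) -> usize {
        self.objects.lock().unwrap().total_objects
    }
}

fn main() {
    let pool = ObjectPool::new(1024);
    let a = pool.acquire();
    pool.release(a);
    let _b = pool.acquire(); // reuses the released object
    println!("objects created so far: {}", pool.total_objects());
}
```

Both fields that must change together sit behind one lock; the only thing the structural record removes is the need to invent the name `ObjectPoolInner`.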

@burdges

burdges commented Jan 24, 2020

In #2584 (comment) @jswrenn writes:

The uses of structural records are not entirely covered by your proposal. As iopq notes, structural records are an actual extension to the type system that let you express has-field requirements on types.

When I need an ad-hoc structure type in an iterator chain, I often choose to use a tuple, because a full-blown structure definition would decrease the readability of my code. In these situations, I would like to use a structural record, but must use a tuple instead.

I think structural records do not address the has-field requirements on types anyways. If I write, &'a I: Iterator<Item = struct {a: &'a AA, b: &'a BB}> then I must still build this structural record with map or whatever.

We should address ad-hoc return types with types declared pub inside fn bodies, like I suggested in #2584 (comment)

We could separately address has-field requirements with a crate that provides a trait like

pub trait HasField<T, const NAME: &'static str> {
    fn field_ref(&self) -> &T;
    fn field_mut(&mut self) -> &mut T;
}

or more likely

pub trait Field {
    type Target: ?Sized;
}
pub trait HasField<F> where F: Field {
    fn field_ref(&self) -> &<F as Field>::Target;
    fn field_mut(&mut self) -> &mut <F as Field>::Target;
}

either of which get implemented by a procmacro like

use some_crate::FieldBar;

pub struct Foo {
    #[has_field]
    pub bar: Bar,
}

I think this HasField trait provides a much cleaner solution since you can now write Iterator<Item = (impl Borrow<impl HasField<X>>,impl BorrowMut<impl HasField<Y>+HasField<Z>>)> which abstracts over the various ownership properties of different iterators coming from a HashMap and BTreeMap and over the various existing structs with these fields.
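To make the second variant concrete, here is a sketch of what the procmacro might expand to, written out by hand on stable Rust. The field type `u32`, the struct `Other`, and the function `bump` are assumptions for illustration; no such crate is implied to exist.

```rust
pub trait Field {
    type Target: ?Sized;
}

pub trait HasField<F: Field> {
    fn field_ref(&self) -> &F::Target;
    fn field_mut(&mut self) -> &mut F::Target;
}

// Hypothetical marker type naming the field `bar`; it also fixes the field's type.
pub struct FieldBar;
impl Field for FieldBar {
    type Target = u32;
}

pub struct Foo {
    pub bar: u32,
}

pub struct Other {
    pub bar: u32,
    pub extra: bool,
}

// What `#[has_field]` would presumably generate for each annotated field.
impl HasField<FieldBar> for Foo {
    fn field_ref(&self) -> &u32 {
        &self.bar
    }
    fn field_mut(&mut self) -> &mut u32 {
        &mut self.bar
    }
}

impl HasField<FieldBar> for Other {
    fn field_ref(&self) -> &u32 {
        &self.bar
    }
    fn field_mut(&mut self) -> &mut u32 {
        &mut self.bar
    }
}

// Works for any existing struct that exposes a `bar` field through the trait.
fn bump(value: &mut impl HasField<FieldBar>) -> u32 {
    *value.field_mut() += 1;
    *value.field_ref()
}

fn main() {
    let mut foo = Foo { bar: 1 };
    let mut other = Other { bar: 10, extra: true };
    println!("{} {} {}", bump(&mut foo), bump(&mut other), other.extra);
}
```

Because the marker type carries the field's type, every struct implementing `HasField<FieldBar>` must agree on it, which is one design difference from a record type written inline.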

@rodrimati1992

rodrimati1992 commented Jan 24, 2020

@burdges

In structural I decided to go with a design similar to:

pub trait GetField<const NAME:&'static str> {
    type Ty:?Sized;

    fn get_field_(&self) -> &Self::Ty;
    
    fn get_field_mut_(&mut self) -> &mut Self::Ty;
    
    fn into_field_(self) -> Self::Ty
    where Self:Sized,Self::Ty:Sized;
}

Edit: I hadn't noticed that the RFC also proposes a very similar trait as a future possibility.

But with a trait hierarchy for shared/mutable/by-value access, a Sized associated type, and a type parameter (which emulates a const generics parameter).

With this design it's possible to bound the type of a field with impl Trait.

pub fn hello(iter:impl IntoIterator<Item=impl GetField<"name",Ty=impl Debug>>){
    for item in iter {
        println!("{:?}",GetField::<"name">::get_field_(&item));
    }
}

(this code ICEs)
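For comparison, a simplified sketch of the type-parameter emulation that compiles on stable Rust. It is not the actual API of the structural crate: the marker type `Name`, the two example structs, and the `Vec<String>` return value are all assumptions made for illustration.

```rust
use std::fmt::Debug;

// Hypothetical marker type standing in for the const parameter `"name"`.
struct Name;

trait GetField<F> {
    type Ty: ?Sized;

    fn get_field_(&self) -> &Self::Ty;
}

struct Person {
    name: String,
}

struct Pet {
    name: &'static str,
}

impl GetField<Name> for Person {
    type Ty = String;
    fn get_field_(&self) -> &String {
        &self.name
    }
}

impl GetField<Name> for Pet {
    type Ty = &'static str;
    fn get_field_(&self) -> &&'static str {
        &self.name
    }
}

// The field's type is bounded with `impl Trait`, as in the snippet above.
fn hello(iter: impl IntoIterator<Item = impl GetField<Name, Ty = impl Debug>>) -> Vec<String> {
    iter.into_iter()
        .map(|item| format!("{:?}", item.get_field_()))
        .collect()
}

fn main() {
    let people = vec![Person { name: String::from("Ann") }];
    let pets = vec![Pet { name: "Rex" }];
    println!("{:?}", hello(people));
    println!("{:?}", hello(pets));
}
```

The marker type has to be declared and imported by every user of the field, which is the inconvenience a `&'static str` const parameter would remove.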

@dhardy

Contributor

dhardy commented Jan 24, 2020

Anonymous structs with derived or manual implementations are (already) possible, albeit with less-concise syntax. For example, one could have the following via proc-macro:

let x = make_struct! {
    #[derive(Clone, Copy, Debug)]
    struct {
        x: u32 = 52,
        y = calc_y(),
        _ = anon_field_value(),
    }
    impl MyTrait { // implicitly for the above struct
        ...
    }
};

Not very concise, but it still beats naming all types and fields and using separate type definition and value instantiation. (And it can be done with a proc-macro today via generics, although it would work better with Permit _ in type aliases. My make_widget! macro is basically a glorified version of this.)

Similarly, if no implementations are wanted, a simpler macro could be used:

let y = make_record! { a = 30.0f64, b = true };

This does already meet some of the motivation for this RFC, although of course a language feature has certain advantages.
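As a rough illustration of the simpler case, the following declarative macro gets close to `make_record!` without a proc-macro. It is a sketch and not the implementation referred to above: it declares a local generic struct with one type parameter per field (reusing the field's name), so the field types are inferred from the values.

```rust
// Hypothetical `macro_rules!` approximation of `make_record!`.
macro_rules! make_record {
    ($($field:ident = $value:expr),* $(,)?) => {{
        // One generic parameter per field, named after the field itself,
        // lets the compiler infer every field type from its initializer.
        #[allow(non_camel_case_types)]
        #[derive(Debug)]
        struct Record<$($field),*> {
            $($field: $field),*
        }
        Record { $($field: $value),* }
    }};
}

fn main() {
    let y = make_record! { a = 30.0f64, b = true };
    // Fields are accessed by name even though the type cannot be named here.
    println!("{:?}: a = {}, b = {}", y, y.a, y.b);
}
```

Unlike a structural record, two invocations with identical fields produce two distinct, unnameable types, so such values cannot appear in function signatures.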

@burdges

burdges commented Jan 24, 2020

Oh nice @rodrimati1992, I think your trait makes more sense than either of mine. I think the refraction crate has some similar ideas coming from lenses, as does the generic-field-projection crate.

I think even if const generics worked, you'd currently run into issues with the lack of lazy normalization, like in #2289, but once that gets fixed more should work.
