RFC: Structural Records #2584
Conversation
Centril
and others
added some commits
Oct 28, 2018
Centril
added
the
T-lang
label
Nov 2, 2018
This was referenced Nov 2, 2018
This looks like a super well thought out and comprehensive RFC. It might be worth clarifying the behaviour of
struct RectangleTidy {
    dimensions: {
        width: u64,
        height: u64,
    },
    color: {
        red: u8,
        green: u8,
        blue: u8,
    },
}
Presumably, there will be no "magic" here, and you will only be able to derive traits which are implemented for the anonymous structs themselves.
Another question is around trait implementations: at the moment, 3rd party crates can provide automatic implementations of their traits for "all" tuples by using macro expansion to implement them for 0..N element tuples. With anonymous structs this will not be possible. Especially with the likely arrival of const generics in the not too distant future, negating the need for macros entirely, anonymous structs will become second class citizens. Is this a problem, and are there any possible solutions to allow implementing traits for all anonymous structs?
petrochenkov
reviewed
Nov 2, 2018
other_stuff(color.1);
...
yet_more_stuff(color.2);
}
petrochenkov
Nov 2, 2018
Contributor
fn do_stuff_with((red, green, blue): (u8, u8, u8)) {
some_stuff(red);
other_stuff(green);
yet_more_stuff(blue);
}
Centril
Nov 2, 2018
Author
Contributor
Even better, with #2522:
fn do_stuff_with((red: u8, green: u8, blue: u8)) {
some_stuff(red);
other_stuff(green);
yet_more_stuff(blue);
}
However, while writing it this way makes clear what each tuple component means, it is not clear at the call site...
let color = (255, 0, 0);
// I can guess that this is (red, green, blue) because that's
// the usual for "color" but it isn't clear in the general case.
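The call-site concern can be illustrated in today's Rust. This is only a sketch: structural records are a proposal, so a nominal struct (the hypothetical `Rgb`) stands in for `{ red: u8, green: u8, blue: u8 }`, and the function names are made up for illustration.

```rust
// Hypothetical stand-in for the structural record `{ red: u8, green: u8, blue: u8 }`.
#[derive(Debug, Clone, Copy, PartialEq)]
struct Rgb {
    red: u8,
    green: u8,
    blue: u8,
}

// Tuple version: the component names exist only inside the function.
fn brightest_tuple((red, green, blue): (u8, u8, u8)) -> u8 {
    red.max(green).max(blue)
}

// Named-field version: the same destructuring, but callers must spell out the names.
fn brightest_named(Rgb { red, green, blue }: Rgb) -> u8 {
    red.max(green).max(blue)
}

fn main() {
    // At the call site the tuple says nothing about which component is which...
    let a = brightest_tuple((255, 0, 0));
    // ...while the named form documents itself.
    let b = brightest_named(Rgb { red: 255, green: 0, blue: 0 });
    assert_eq!(a, b);
    println!("{} {}", a, b);
}
```

Both functions compute the same value; the difference is only in how much the caller has to remember.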
petrochenkov
reviewed
Nov 2, 2018
blue: u8,
},
}
petrochenkov
Nov 2, 2018
Contributor
This syntax looks very similar to another already accepted RFC, but semantics seems different - #2102.
Centril
Nov 2, 2018
Author
Contributor
True; I have to think about if some unification can be done in some way here.
Thanks!
This bit is not currently well specified, but it should be, so I will fix that (EDIT: fixed)... I see two different ways to do it:
Yup. Usually up to 12, emulating the way the standard library does this.
Nit: to not have to use macros for implementing traits for tuples you need variadic generics, not const generics. :)
I think it might be possible technically; I've written down some thoughts about it in the RFC. However, the changes needed to make it possible might not be what folks want. Not solving the issue might also be good: by leaving it unsolved you add a certain pressure to gradually move towards nominal typing once there are enough operations and structure that you want on the type.
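For readers unfamiliar with the macro technique mentioned above, here is a minimal sketch of implementing a trait for tuples of several arities. The trait `FieldCount` and the macro `impl_field_count` are hypothetical names invented for this example, and only arities 0 to 3 are covered, whereas crates typically go up to 12.

```rust
// Hypothetical trait, used only to illustrate the pattern.
trait FieldCount {
    fn field_count(&self) -> usize;
}

// One macro invocation expands to one impl per listed tuple shape.
macro_rules! impl_field_count {
    ($( ($($name:ident),*) => $n:expr; )*) => {
        $(
            impl<$($name),*> FieldCount for ($($name,)*) {
                fn field_count(&self) -> usize {
                    $n
                }
            }
        )*
    };
}

impl_field_count! {
    () => 0;
    (A) => 1;
    (A, B) => 2;
    (A, B, C) => 3;
}

fn main() {
    assert_eq!(().field_count(), 0);
    assert_eq!((1u8, "x").field_count(), 2);
    assert_eq!((1u8, 2u16, 3u32).field_count(), 3);
}
```

Nothing comparable can enumerate structural records, because the set of possible field names is unbounded; that is the gap the comments above are discussing.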
petrochenkov
reviewed
Nov 2, 2018
standard traits that are implemented for [tuples]. These traits are: `Clone`,
`Copy`, `PartialEq`, `Eq`, `PartialOrd`, `Ord`, `Debug`, `Default`, and `Hash`.
Each of these traits will only be implemented if all the field types of a struct
implement the trait.
Centril
Nov 3, 2018
Author
Contributor
Ideas for possible magic reduction discussed below in #2584 (comment).
petrochenkov
reviewed
Nov 2, 2018
+ For `PartialEq`, each field is compared with same field in `other: Self`.

+ For `PartialOrd` and `Ord`, lexicographic ordering is used based on
  the name of the fields and not the order given because structural records
petrochenkov
Nov 2, 2018
Contributor
With macros fields can have names with same textual representation, but different hygienic contexts.
Centril
Nov 2, 2018
Author
Contributor
Does that change anything wrt. the implementations provided tho? I don't see any problems with hygiene intuitively, but maybe I could, given some elaboration?
petrochenkov
Nov 2, 2018
Contributor
This is probably relevant to other places where the order of fields needs to be "normalized" as well.
Centril
Nov 2, 2018
Author
Contributor
Sure. :) But I'm not sure if you are pointing out a problem or just noting...
Any sorting / "normalization" is done post expansion.
petrochenkov
Nov 2, 2018
Contributor
It's not clear to me how to sort hygienic contexts in stable order, especially in cross-crate scenarios. That probably can be figured out somehow though.
petrochenkov
reviewed
Nov 2, 2018
ty ::= ... | ty_srec ;
ty_srec ::= "{" (ty_field ",")+ (ty_field ","?)? "}" ;
petrochenkov
Nov 2, 2018
Contributor
I don't think types can start with { syntactically.
There are multiple already existing ambiguities where parser assumes that types cannot start with {, and upcoming const generics introduce one more big ambiguity ({} is used to disambiguate in favor of const generic arguments).
#2102 uses struct { field: Type, ... } to solve this (and also to discern between struct { ... } and union { ... }).
petrochenkov
Nov 2, 2018
Contributor
I'm not sure about expressions/patterns.
Unlimited lookahead may also be required, need to check more carefully.
Centril
Nov 2, 2018
Author
Contributor
There are multiple already existing ambiguities where parser assumes that types cannot start with {
Such as? (examples please...)
Const generics allow literals and variables as expressions but anything else needs to be in { ... }.
This is not ambiguous with structural records (because the one-field-record requires a comma...) but requires lookahead.
Unlimited lookahead may also be required, need to check more carefully.
Scroll down ;) It's discussed in the sub-section "Backtracking".
petrochenkov
Nov 2, 2018
Contributor
It's discussed in the sub-section "Backtracking".
It's a huge drawback.
petrochenkov
Nov 2, 2018
Contributor
Such as? (examples please...)
Nothing that can't be solved by infinite lookahead, but here's "how to determine where where clause ends":
fn f() where PREDICATE1, PREDICATE2, { a: ...
Does { a: ... belong to the function body or to the third predicate?
Centril
Nov 4, 2018
Author
Contributor
@H2CO3 oh sure; the parser must handle it, but the peril of backtracking is that it would considerably slow down parsing or produce bad error messages. What I'm saying is that the code paths in the parser that would handle this are pathological, so the slowdowns are unlikely.
There's also no ambiguity here at all, but lookahead / backtracking / GLL may be required.
If we ever use the GLL crate in the compiler then we don't even need backtracking... it provides a correct algorithm to parse context-free grammars and can produce good error messages for these sorts of things because it produces a parse forest that you can catch certain patterns on.
petrochenkov
Nov 4, 2018
Contributor
The fact that advanced formalisms exist doesn't necessarily mean that we should use them.
In many respects, simpler is better.
Officially abandoning LL(n) is a separate decision that shouldn't be done in comments to a mostly unrelated RFC.
(That said, for this RFC disambiguation can also be done using cover grammar TYPE_OR_EXPR.
We parse { IDENT: TYPE_OR_EXPR and look at the next token, if it's , then we have a struct record type.)
(Nit: GLL is a backtracking of sorts, just with fancy memoization making it O(n^3) rather than exponential).
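The cover-grammar idea can be sketched on a toy token stream. This is not rustc's parser: the `Tok` and `Guess` types and the `classify` function are hypothetical, and the "TYPE_OR_EXPR" is approximated by skipping tokens up to the next `,` or `}` at brace depth zero (parentheses, generics and so on are ignored).

```rust
// Toy tokens; identifiers carry no payload because only their position matters here.
#[derive(Debug, Clone, PartialEq)]
enum Tok {
    LBrace,
    RBrace,
    Ident,
    Colon,
    Comma,
}

#[derive(Debug, PartialEq)]
enum Guess {
    RecordType,
    NotRecordType,
}

// After `{ IDENT :`, skip the toy TYPE_OR_EXPR and inspect the token that follows it.
// A `,` at depth zero means "structural record type"; a closing `}` means something else.
fn classify(toks: &[Tok]) -> Guess {
    match toks {
        [Tok::LBrace, Tok::Ident, Tok::Colon, rest @ ..] => {
            let mut depth = 0usize;
            for t in rest {
                match t {
                    Tok::LBrace => depth += 1,
                    Tok::RBrace if depth == 0 => return Guess::NotRecordType,
                    Tok::RBrace => depth -= 1,
                    Tok::Comma if depth == 0 => return Guess::RecordType,
                    _ => {}
                }
            }
            Guess::NotRecordType
        }
        _ => Guess::NotRecordType,
    }
}

fn main() {
    use Tok::*;
    // `{ a: T, }`
    let record = [LBrace, Ident, Colon, Ident, Comma, RBrace];
    // `{ a: T }`
    let other = [LBrace, Ident, Colon, Ident, RBrace];
    assert_eq!(classify(&record), Guess::RecordType);
    assert_eq!(classify(&other), Guess::NotRecordType);
}
```

The point of the sketch is only that the decision is made after the first field, without rewinding the token stream.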
Centril
Nov 4, 2018
Author
Contributor
The fact that advanced formalisms exist doesn't necessarily mean that we should use them.
In many respects, simpler is better.
Well there are advantages to both, and we should decide what trade-off is best for Rust.
Officially abandoning LL(n) is a separate decision that shouldn't be done in comments to a mostly unrelated RFC.
Abandoning LL(k) must be done for a reason, and that reason would be some feature or change we would like to make that forces us to do it. I don't think (correct me if that is inaccurate) we ever stated explicitly that LL(k) is a design goal, so it doesn't need explicit and official abandonment either.
So I think either we abandon LL(k) on this RFC or in the #2544 RFC, or we accept neither and keep LL(k).
(That said, for this RFC disambiguation can also be done using cover grammar TYPE_OR_EXPR.
We parse { IDENT: TYPE_OR_EXPR and look at the next token, if it's , then we have a struct record type.)
Yeah, that's the idea I had also.
(Nit: GLL is a backtracking of sorts, just with fancy memoization making it O(n^3) rather than exponential).
Fair enough ;)
Laaas
Nov 7, 2018
Even if you disregard the fact that this syntax would necessitate backtracking, it is confusing for readers IMO. Why not require a struct keyword or something?
Centril
Nov 7, 2018
Author
Contributor
@Laaas Because it is inconsistent with Tuples/Tuple-structs and also less ergonomic.
I don't think it's confusing for readers, the one-field case is clearly disambiguated by an extra comma and that is consistent with how tuples are treated. In other words, if { x, } is confusing, then so is (x,).
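The tuple precedent referred to here already exists in stable Rust and can be checked directly. A small sketch (the function names are made up for illustration):

```rust
// `(5)` is just a parenthesized expression of type `i32`.
#[allow(unused_parens)]
fn parenthesized() -> i32 {
    (5)
}

// `(5,)` is a one-element tuple of type `(i32,)`; the trailing comma disambiguates.
fn one_tuple() -> (i32,) {
    (5,)
}

fn main() {
    let plain: i32 = parenthesized();
    let single: (i32,) = one_tuple();
    // The tuple's only component is reached through `.0`.
    assert_eq!(plain, single.0);
    println!("{} {:?}", plain, single);
}
```

The RFC's one-field record `{ x, }` would rely on the same trailing-comma rule.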
petrochenkov
reviewed
Nov 2, 2018
field_init ::= ident | field ":" expr ;

Note that this grammar permits `{ 0: x, 1: y }`.
petrochenkov
Nov 2, 2018
Contributor
#1595 was closed, so I'd expect this to be rejected.
This saves us from stuff like "checking that there are no gaps in the positional fields" as well.
varkor
Nov 3, 2018
Member
struct A(u8);
let _ = A { 0: 0u8 };
is accepted, so this is currently inconsistent as it stands anyway.
Centril
Nov 4, 2018
Author
Contributor
@varkor yeah that was accepted to make macro writing easier (and it totally makes it easier...!)
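The behaviour being discussed can be tried in stable Rust: a tuple struct accepts brace syntax with numeric field names, both when constructing and when matching. A minimal sketch (the helper names `make` and `unwrap` are hypothetical):

```rust
struct A(u8);

// Construct a tuple struct using brace syntax with a positional field name.
fn make(value: u8) -> A {
    A { 0: value }
}

// The same syntax works in patterns.
fn unwrap(a: A) -> u8 {
    let A { 0: inner } = a;
    inner
}

fn main() {
    let a = make(7);
    assert_eq!(a.0, 7);
    assert_eq!(unwrap(a), 7);
}
```

This is the existing precedent for `{ 0: x, 1: y }` that the review comments weigh against the closed #1595.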
petrochenkov
reviewed
Nov 2, 2018
text/0000-structural-records.md Outdated
I think the language should be frozen for large additions like this in general, for a couple of years at least.
The trait auto-implementation magic worries me, mostly from the point of view of things not the compiler wanting to implement traits on things. Today crates can do a macro-based implementation for the important tuples, and I can easily see a path to variadic generics where the impls could easily support any tuple. But I can imagine I'd also be tempted to block this on changing struct literal syntax to using Overall, I think I feel that this is cool, but not necessary. |
Centril
added some commits
Nov 3, 2018
I'm very concerned with properties of the struct explicitly relying on lexicographic ordering. I don't think that I think that defining lexicographic behaviour for But as a user, I would be surprised to find the below code fail:
let first = { foo: 1, bar: 2 };
let second = { foo: 2, bar: 1 };
assert!(first < second);
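The surprise can be reproduced today with nominal structs, since `#[derive(PartialOrd)]` compares fields in declaration order. In this sketch the hypothetical `Declared` keeps the order the user wrote, while `ByName` declares the fields sorted by name, which is an assumption about how the RFC's name-based ordering would behave.

```rust
// Fields in the order the user wrote them: `foo`, then `bar`.
#[derive(Debug, PartialEq, PartialOrd)]
struct Declared {
    foo: i32,
    bar: i32,
}

// Fields sorted by name, mimicking the ordering the RFC text describes: `bar`, then `foo`.
#[derive(Debug, PartialEq, PartialOrd)]
struct ByName {
    bar: i32,
    foo: i32,
}

// Compares `{ foo: 1, bar: 2 }` with `{ foo: 2, bar: 1 }` in written order.
fn declared_first_is_less() -> bool {
    let first = Declared { foo: 1, bar: 2 };
    let second = Declared { foo: 2, bar: 1 };
    first < second
}

// Same values, but `bar` is compared before `foo`.
fn by_name_first_is_less() -> bool {
    let first = ByName { foo: 1, bar: 2 };
    let second = ByName { foo: 2, bar: 1 };
    first < second
}

fn main() {
    // In written order, `foo` decides: 1 < 2.
    assert!(declared_first_is_less());
    // In name order, `bar` decides: 2 > 1, so the comparison flips.
    assert!(!by_name_first_is_less());
}
```

Under name-based ordering the `assert!(first < second)` from the comment above would fail, which is the concern being raised.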
@clarcharr I've added an unresolved question to be resolved prior to merging (if we do that...) about whether
I would have loved if we had used
Sure; I entirely agree with that sentiment; it is by far the biggest drawback.
I noted somewhere in the RFC that with the combination of const generics and variadic generics, you may be able to extend this to structural records as well. For example (please note that this is 100% a sketch), using a temporary syntax due to Yato:
impl
    <*(T: serde::Serialize, const F: meta::Field)>
    // quantify a list of pairs with:
    // - a) type variables T all bound by `serde::Serialize`,
    // - b) a const generic variable F standing in for the field of type meta::Field
    //      where `meta` is the crate we reserved and `Field` is a compile time
    //      reflection / polymorphism mechanism for fields.
    serde::Serialize
for
    { *(F: T) }
    // The structural record type; (Yes, you need to extend the type grammar *somehow*..)
{
    // logic...
}
This idea is inspired by Glasgow Haskell's
bbatha
commented
Nov 9, 2018
This also plays nicely with defaults for optional and default parameters. For instance |
ssokolow
commented
Nov 9, 2018
@scottmcm I never intended to say that they are similar on a technical level. My intent was to more effectively communicate the value and purpose of having structural records in addition to structs by drawing an analogy to when and why people use (and don't use) closures with only explicit arguments when we already have functions. (ie. As an alternative or complement to throwing a list of example use cases at people.) |
Centril
added
the
A-data-types
label
Nov 10, 2018
Centril
added
A-syntax
A-typesystem
A-structural-typing
A-product-types
A-patterns
A-expressions
labels
Nov 22, 2018
Centril
assigned
cramertj
Jan 3, 2019
graydon
commented
Jan 12, 2019
Strongly opposed. The fact that there's an un-filled box in a particular table enumerating aspects of types does not at all warrant filling every such box in. Consistency should not be considered a design criterion that exceeds thrift. Worse: we had this feature before in the past and explicitly removed it to cut things down to a more-manageable size. It should not be added back.
This is not the only motivation of this proposal; indeed, the RFC explicitly states that it is a minor motivation. Perhaps consider the other motivations in the proposal?
Yes I'm aware; it is addressed here; perhaps address the text there? EDIT: I won't be able to devote too much time to this RFC since other accepted proposals and work on those take precedence. |
graydon
commented
Jan 12, 2019
It was not addressed. Those 3 reasons Patrick listed are the reasons we added nominal structs. The reason we removed structural records was because they were too redundant with nominal structs. They are still too redundant with nominal structs.
Okay, this was not at all clear from what Patrick said.
Well, let's agree to disagree on that. I think structural records add a large degree of convenience. |
burdges
commented
Jan 12, 2019
I do think structural records give many fun tricks for things like named function arguments. In fact, I believe structural records should be clarified as "the one true way" to do things like named function arguments. Yet, we should delay by years any serious experimentation with such tricks, or structural records, because they all incur significant strangeness cost, and far more important concerns exist.
Stargateur
commented
Feb 26, 2019
You really think it's a "Major: Improving ergonomics, readability, and maintainability" ?!?!? |
ssokolow
commented
Feb 26, 2019
If anything, it illustrates how easily a newcomer might mistake such a syntax for an equivalent to Python's dict literals and JavaScript's object literals. (ie. a I'd certainly expect a language to develop map literals before it develops something in the vein of anonymous structs. |
@Stargateur where is |
burdges
commented
Feb 26, 2019
We noticed a handy trick for replacing structural records: We permit types declared
In essence, We might support more polymorphism this way than with structural records, but obviously structural records simplify everything sometimes. We've some complex type parameters in |
Stargateur
commented
Feb 26, 2019
@iopq And what if one library use |
ssokolow
commented
Feb 26, 2019
From that "optimize for interoperation without conversion" perspective, the best solution is a Of course, it's also always possible that you'll find There's no magic bullet. The best you can get is to have a standard crate which just defines types like |
jswrenn
commented
Feb 26, 2019
@Stargateur totally valid concerns, and for those reasons, I don't think we'll see ad hoc record types appear in the public API of many crates. Ad hoc records aren't really a replacement for named records; they're more closely a substitute for Rust's other ad hoc type: tuples.
We generally use tuples for the one-off packings of data that you need when you're doing transformations over that data; e.g., iterator chains, or internal helper functions. Tuples are the right tool to use when giving a name to a type is actually more cognitive load than just bunching the data all together (and there is some cognitive load to naming a type: you need to pick a name, pick a place to define it, manage imports, and look back at the definition).
The big problem with tuples is they aren't really self-documenting: you need to always remember what order the components are in. If the components have different types, you'll be saved from mistakes by a compiler error. If they don't, you're hosed. I think of named records as basically just being tuples where the fields aren't named And that's the role I need them for.
In my work I do a lot of transformations over deep, tree-structured data for analysis purposes. The input types are all named structures (thanks serde!) and the output types are all named structures too. What happens in the middle is a complete mess of tuples, though. I do dozens of one-off transformations of data. The cognitive load (and code bloat) of turning all of those tuples into named structures would be totally unmanageable. But: trying to decipher iterator chains that use tuples also sucks! Ad hoc records are going to really improve my quality of life in programming Rust.
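The trade-off described here can be seen in a small iterator chain. This is a sketch in today's Rust: the hypothetical nominal struct `Entry` stands in for an ad hoc record such as `{ word: &str, len: usize }`, and both function names are invented for the example.

```rust
// Hypothetical stand-in for the ad hoc record `{ word: &str, len: usize }`.
struct Entry<'a> {
    word: &'a str,
    len: usize,
}

// Tuple version: the reader must remember that `.0` is the length and `.1` the word.
fn longest_with_tuples<'a>(words: &[&'a str]) -> Option<&'a str> {
    words
        .iter()
        .map(|w| (w.len(), *w))
        .max_by_key(|t| t.0)
        .map(|t| t.1)
}

// Named-field version: the intermediate data documents itself,
// at the cost of declaring `Entry` somewhere.
fn longest_with_fields<'a>(words: &[&'a str]) -> Option<&'a str> {
    words
        .iter()
        .map(|w| Entry { word: *w, len: w.len() })
        .max_by_key(|e| e.len)
        .map(|e| e.word)
}

fn main() {
    let words = ["a", "abc", "ab"];
    assert_eq!(longest_with_tuples(&words), Some("abc"));
    assert_eq!(longest_with_fields(&words), Some("abc"));
}
```

With structural records as proposed, the second version would presumably not need the separate `Entry` declaration.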
burdges
commented
Feb 26, 2019
I think transformations over deeply structured data sounds like lenses, but structural records would often be simpler. |
But Color has a canonical ordering A 2d rectangle doesn't, but Not having to declare it somewhere seems like a win |
ssokolow
commented
Feb 28, 2019
Actually, color does not have a canonical ordering. For historical reasons, you run into BGR as a legacy detail that has persisted in various Windows file formats and APIs and it also crops up when the abstraction leaks on a little-endian system. For example, I got tripped up when using PyOpenCV because its use of 32-bit integers as an internal representation results in BGR(A) content on a little-endian processor and (A)RGB on a big-endian processor. (I wound up very confused about why my images were coming out wrong until I finally managed to determine that it was my assumptions about the mapping between color components and indices... the exact kind of bug named struct fields prevent.) Likewise, Google's design document for Skia has a section named "The epic battle of RGBA and BGRA" which talks about having to do conversion between RGB(A) file formats and BGRA memory representations. (Another situation where bugs are likely to creep in when using numeric indices.) As for 2D rectangles, I've always seen them in one of two representations:
In both cases, taking after the |
Note that this is the same thing that C# hit, which I mentioned in #2584 (comment) -- it doesn't necessarily need structural types, just anonymous ones often work great.
Centril commented Nov 2, 2018
Introduce structural records of the form { foo: 1u8, bar: true } of type { foo: u8, bar: bool } into the language. Another way to understand these sorts of objects is to think of them as "tuples with named fields", "unnamed structs", or "anonymous structs".
To @kennytm, @alexreg, @Nemo157, and @tinaun for reviewing the draft version of this RFC.
To @varkor and @pnkfelix for good and helpful discussions.