RFC: Structural Records #2584
Conversation
Centril and others added some commits on Oct 28, 2018
Centril added the T-lang label on Nov 2, 2018
This was referenced Nov 2, 2018
This looks like a super well thought out and comprehensive RFC. It might be worth clarifying the behaviour of:

```rust
struct RectangleTidy {
    dimensions: {
        width: u64,
        height: u64,
    },
    color: {
        red: u8,
        green: u8,
        blue: u8,
    },
}
```

Presumably, there will be no "magic" here, and you will only be able to derive traits which are implemented for the anonymous structs themselves.

Another question is around trait implementations: at the moment, 3rd party crates can provide automatic implementations of their traits for "all" tuples by using macro expansion to implement them for 0..N element tuples. With anonymous structs this will not be possible. Especially with the likely arrival of const generics in the not too distant future, negating the need for macros entirely, anonymous structs will become second-class citizens. Is this a problem, and are there any possible solutions to allow implementing traits for all anonymous structs?
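The macro-expansion pattern referred to above can be sketched in today's Rust. The trait `TotalSize` and the macro `impl_for_tuples!` are invented names purely for illustration; the standard library uses the same technique (up to arity 12) for its tuple impls:

```rust
// A toy trait standing in for any 3rd-party trait.
trait TotalSize {
    fn total_size(&self) -> usize;
}

impl TotalSize for u8 { fn total_size(&self) -> usize { 1 } }
impl TotalSize for u32 { fn total_size(&self) -> usize { 4 } }

// Implement the trait for tuples of each listed arity.
macro_rules! impl_for_tuples {
    ($( ($($name:ident),+) )+) => {
        $(
            impl<$($name: TotalSize),+> TotalSize for ($($name,)+) {
                fn total_size(&self) -> usize {
                    #[allow(non_snake_case)]
                    let ($(ref $name,)+) = *self;
                    0 $(+ $name.total_size())+
                }
            }
        )+
    }
}

// The standard library stops at arity 12; we stop at 3 for brevity.
impl_for_tuples! {
    (A)
    (A, B)
    (A, B, C)
}

fn main() {
    assert_eq!((1u8, 2u32).total_size(), 5);
    assert_eq!((1u8, 2u32, 3u8).total_size(), 6);
}
```

As the comment notes, no analogous expansion exists for anonymous structs, since there is no finite enumeration of field-name sets.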
petrochenkov
reviewed
Nov 2, 2018
```rust
other_stuff(color.1);
...
yet_more_stuff(color.2);
}
```
petrochenkov
Nov 2, 2018
Contributor
```rust
fn do_stuff_with((red, green, blue): (u8, u8, u8)) {
    some_stuff(red);
    other_stuff(green);
    yet_more_stuff(blue);
}
```
Centril
Nov 2, 2018
Contributor
Even better, with #2522:
```rust
fn do_stuff_with((red: u8, green: u8, blue: u8)) {
    some_stuff(red);
    other_stuff(green);
    yet_more_stuff(blue);
}
```

However, while writing it this way makes it clear what each tuple component means, it is not clear at the call site...

```rust
let color = (255, 0, 0);
// I can guess that this is (red, green, blue) because that's
// the usual for "color" but it isn't clear in the general case.
```
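The call-site contrast can be sketched concretely. The `Color` struct and the helper functions below are invented for illustration:

```rust
// With a plain tuple, the meaning of each component is invisible at the
// call site; with a nominal struct, the labels must appear there.
struct Color { red: u8, green: u8, blue: u8 }

fn brightness_tuple((r, g, b): (u8, u8, u8)) -> u32 {
    r as u32 + g as u32 + b as u32
}

fn brightness_named(c: Color) -> u32 {
    c.red as u32 + c.green as u32 + c.blue as u32
}

fn main() {
    // Which component is which? You have to know the convention.
    let a = brightness_tuple((255, 0, 0));
    // The field names are visible right where the value is built.
    let b = brightness_named(Color { red: 255, green: 0, blue: 0 });
    assert_eq!(a, b);
}
```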
petrochenkov
reviewed
Nov 2, 2018
```rust
        blue: u8,
    },
}
```
petrochenkov
Nov 2, 2018
Contributor
This syntax looks very similar to another already accepted RFC, but the semantics seem different: #2102.
Centril
Nov 2, 2018
Contributor
True; I have to think about whether some unification can be done here.
Thanks!

This bit is not currently well specified, but it should be, so I will fix that (EDIT: fixed)... I see two different ways to do it:

Yup. Usually up to 12, emulating the way the standard library does this.

Nit: to not have to use macros for implementing traits for tuples you need variadic generics, not const generics. :)

I think it might be possible technically; I've written down some thoughts about it in the RFC. However, the changes needed to make it possible might not be what folks want. Then again, not solving the issue might also be good; by not solving it you add a certain pressure to gradually move towards nominal typing once there are enough operations and structure that you want on the type.
petrochenkov
reviewed
Nov 2, 2018
> standard traits that are implemented for [tuples]. These traits are: `Clone`,
> `Copy`, `PartialEq`, `Eq`, `PartialOrd`, `Ord`, `Debug`, `Default`, and `Hash`.
> Each of these traits will only be implemented if all the field types of a struct
> implement the trait.
Centril
Nov 3, 2018
Contributor
Ideas for possible magic reduction discussed below in #2584 (comment).
petrochenkov
reviewed
Nov 2, 2018
> + For `PartialEq`, each field is compared with the same field in `other: Self`.
> + For `PartialOrd` and `Ord`, lexicographic ordering is used based on
>   the name of the fields and not the order given, because structural records
petrochenkov
Nov 2, 2018
Contributor
With macros, fields can have names with the same textual representation but different hygienic contexts.
Centril
Nov 2, 2018
Contributor
Does that change anything wrt. the implementations provided, though? I don't see any problems with hygiene intuitively, but maybe I could, given some elaboration?
petrochenkov
Nov 2, 2018
Contributor
This is probably relevant to other places where the order of fields needs to be "normalized" as well.
Centril
Nov 2, 2018
Contributor
Sure. :) But I'm not sure if you are pointing out a problem or just noting...
Any sorting / "normalization" is done post expansion.
petrochenkov
Nov 2, 2018
Contributor
It's not clear to me how to sort hygienic contexts in stable order, especially in cross-crate scenarios. That probably can be figured out somehow though.
petrochenkov
reviewed
Nov 2, 2018
```
ty ::= ... | ty_srec ;
ty_srec ::= "{" (ty_field ",")+ (ty_field ","?)? "}" ;
```
petrochenkov
Nov 2, 2018
Contributor
I don't think types can start with `{` syntactically.
There are multiple already existing ambiguities where the parser assumes that types cannot start with `{`, and upcoming const generics introduce one more big ambiguity (`{}` is used to disambiguate in favor of const generic arguments).
#2102 uses `struct { field: Type, ... }` to solve this (and also to discern between `struct { ... }` and `union { ... }`).
petrochenkov
Nov 2, 2018
Contributor
I'm not sure about expressions/patterns.
Unlimited lookahead may also be required, need to check more carefully.
Centril
Nov 2, 2018
Contributor
> There are multiple already existing ambiguities where the parser assumes that types cannot start with `{`

Such as? (examples please...)

Const generics allow literals and variables as expressions, but anything else needs to be in `{ ... }`.
This is not ambiguous with structural records (because the one-field record requires a comma...) but requires lookahead.
> Unlimited lookahead may also be required, need to check more carefully.

Scroll down ;) It's discussed in the sub-section "Backtracking".
petrochenkov
Nov 2, 2018
Contributor
> It's discussed in the sub-section "Backtracking".

It's a huge drawback.
petrochenkov
Nov 2, 2018
Contributor
> Such as? (examples please...)

Nothing that can't be solved by infinite lookahead, but here's "how to determine where a `where` clause ends":

```rust
fn f() where PREDICATE1, PREDICATE2, { a: ...
```

Does `{ a: ...` belong to the function body or to the third predicate?
Centril
Nov 4, 2018
Contributor
@H2CO3 oh sure; the parser must handle it, but the perils of backtracking are that it could considerably slow down parsing or produce bad error messages. What I'm saying is that the code paths in the parser that would handle this are pathological, so the slowdowns are unlikely.
There's also no ambiguity here at all, but lookahead / backtracking / GLL may be required.
If we ever use the GLL crate in the compiler then we don't even need backtracking... it provides a correct algorithm for parsing context-free grammars and can produce good error messages for these sorts of things, because it produces a parse forest that you can catch certain patterns on.
petrochenkov
Nov 4, 2018
Contributor
The fact that advanced formalisms exist doesn't necessarily mean that we should use them.
In many respects, the simpler the better.
Officially abandoning LL(n) is a separate decision that shouldn't be made in comments on a mostly unrelated RFC.
(That said, for this RFC disambiguation can also be done using a cover grammar `TYPE_OR_EXPR`:
we parse `{ IDENT: TYPE_OR_EXPR` and look at the next token; if it's `,` then we have a struct record type.)
(Nit: GLL is a backtracking of sorts, just with fancy memoization making it O(n^3) rather than exponential.)
Centril
Nov 4, 2018
Contributor
> The fact that advanced formalisms exist doesn't necessarily mean that we should use them.
> In many respects, the simpler the better.

Well, there are advantages to both, and we should decide what trade-off is best for Rust.

> Officially abandoning LL(n) is a separate decision that shouldn't be made in comments on a mostly unrelated RFC.

Abandoning LL(k) must be done for a reason, and that reason would be some feature or change we would like to make that forces us to do it. I don't think (correct me if that is inaccurate) we ever stated explicitly that LL(k) is a design goal, so if so it doesn't need explicit and official abandonment either.
So I think either we abandon LL(k) in this RFC or in the #2544 RFC, or we accept neither and keep LL(k).

> (That said, for this RFC disambiguation can also be done using a cover grammar `TYPE_OR_EXPR`:
> we parse `{ IDENT: TYPE_OR_EXPR` and look at the next token; if it's `,` then we have a struct record type.)

Yeah, that's the idea I had also.

> (Nit: GLL is a backtracking of sorts, just with fancy memoization making it O(n^3) rather than exponential.)

Fair enough ;)
Laaas
Nov 7, 2018
Even if you disregard the fact that this syntax would necessitate backtracking, it is confusing for readers IMO. Why not require a `struct` keyword or something?
Centril
Nov 7, 2018
Contributor
@Laaas Because it is inconsistent with Tuples/Tuple-structs and also less ergonomic.
I don't think it's confusing for readers, the one-field case is clearly disambiguated by an extra comma and that is consistent with how tuples are treated. In other words, if { x, } is confusing, then so is (x,).
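The tuple analogy is checkable in today's Rust: a one-element tuple is distinguished from a parenthesized expression only by its trailing comma, which is exactly the disambiguation rule proposed for `{ x, }`. A minimal sketch:

```rust
#[allow(unused_parens)]
fn main() {
    // One-element tuple: the trailing comma is what makes it a tuple.
    let tuple: (i32,) = (5,);
    // Without the comma, this is just a parenthesized expression.
    let not_a_tuple: i32 = (5);
    assert_eq!(tuple.0, not_a_tuple);
}
```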
petrochenkov
reviewed
Nov 2, 2018
```
field_init ::= ident | field ":" expr ;
```

> Note that this grammar permits `{ 0: x, 1: y }`.
petrochenkov
Nov 2, 2018
Contributor
#1595 was closed, so I'd expect this to be rejected.
This saves us from stuff like "checking that there are no gaps in the positional fields" as well.
varkor
Nov 3, 2018
Member
```rust
struct A(u8);
let _ = A { 0: 0u8 };
```

is accepted, so this is currently inconsistent as it stands anyway.
Centril
Nov 4, 2018
Contributor
@varkor yeah that was accepted to make macro writing easier (and it totally makes it easier...!)
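The observation above is easy to verify today: tuple struct fields can be initialized and matched by their numeric "names". A minimal sketch:

```rust
// Numeric field syntax for tuple structs, usable in both
// expressions and patterns.
struct A(u8);

fn main() {
    let a = A { 0: 42u8 }; // initialize field 0 by "name"
    assert_eq!(a.0, 42);
    let A { 0: x } = a;    // and match it in a pattern
    assert_eq!(x, 42);
}
```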
petrochenkov
reviewed
Nov 2, 2018
text/0000-structural-records.md Outdated
I think the language should be frozen for large additions like this in general, for a couple of years at least.
The trait auto-implementation magic worries me, mostly from the point of view of things that are not the compiler wanting to implement traits on things. Today crates can do a macro-based implementation for the important tuples, and I can easily see a path to variadic generics where the impls could easily support any tuple.

But I can imagine I'd also be tempted to block this on changing struct literal syntax to using

Overall, I think I feel that this is cool, but not necessary.
Centril
added some commits
Nov 3, 2018
I'm very concerned with properties of the struct explicitly relying on lexicographic ordering. I don't think that defining lexicographic behaviour based on field names is intuitive. As a user, I would be surprised to find the below code fail:

```rust
let first = { foo: 1, bar: 2 };
let second = { foo: 2, bar: 1 };
assert!(first < second);
```
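The surprise described above can be contrasted with today's derived `Ord` on nominal structs, which compares fields in declaration order; swapping the declaration order flips the result. A minimal sketch with invented struct names:

```rust
// Derived comparisons use field *declaration* order today.
#[derive(PartialEq, Eq, PartialOrd, Ord)]
struct FooFirst { foo: u8, bar: u8 }

#[derive(PartialEq, Eq, PartialOrd, Ord)]
struct BarFirst { bar: u8, foo: u8 }

fn main() {
    // Comparing (foo, bar) lexicographically: (1, 2) < (2, 1).
    assert!(FooFirst { foo: 1, bar: 2 } < FooFirst { foo: 2, bar: 1 });
    // Comparing (bar, foo) lexicographically: (2, 1) > (1, 2).
    assert!(BarFirst { foo: 1, bar: 2 } > BarFirst { foo: 2, bar: 1 });
}
```

Under the RFC's name-sorted rule, a structural record would behave like `BarFirst` here regardless of how the fields were written, which is exactly what the commenter finds surprising.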
I've added an unresolved question to be resolved prior to merging (if we do that...) about whether

I would have loved if we had used

Sure; I entirely agree with that sentiment; it is by far the biggest drawback.

I noted somewhere in the RFC that with the combination of const generics and variadic generics, you may be able to extend this to structural records as well. For example (please note that this is 100% a sketch), using a temporary syntax due to Yato:

```rust
impl
<*(T: serde::Serialize, const F: meta::Field)>
// quantify a list of pairs with:
// - a) type variables T all bound by `serde::Serialize`,
// - b) a const generic variables F standing in for the field of type meta::Field
//   where `meta` is the crate we reserved and `Field` is a compile time
//   reflection / polymorphism mechanism for fields.
serde::Serialize
for
{ *(F: T) }
// The structural record type; (Yes, you need to extend the type grammar *somehow*..)
{
    // logic...
}
```

This idea is inspired by Glasgow Haskell's
I love records, but I'm not sure they work well in a language without subtyping that allows combining.

Probably the best example of what I think would work great is TinyBang & friends' "onions".
One great aspect of this is that it's way easier to create substructures (the
I wouldn't expect

Just because something is hard to do right doesn't mean it's reasonable to do it wrong.
fbenkstein
reviewed
Nov 7, 2018
> For example, if `A` and `B` are `Send`, then so is `{ x: A, y: B }`.
>
> A structural record is `Sized` if all field types are.
> If the lexicographically last field of a structural record is `!Sized`,
fbenkstein
Nov 7, 2018
I feel like forcing a particular field name to be last is unnecessarily restrictive. Couldn't the unsized field just be reordered to be last? What's the imagined use case even for allowing unsized fields, or is it just for completeness? Wouldn't it be more parsimonious to disallow them initially?
Laaas
Nov 7, 2018
I think disallowing unsized fields would be a shame, since they are allowed in tuples. Although even in tuples they are awkward to use: you must place them last, and if they are not, you get an odd error about the field not being sized when you try to use it.
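The "must be last" rule being discussed can be sketched with a nominal generic struct today: only the last field may be unsized, and a value is typically built sized and then unsize-coerced. The `Packet` name is invented for illustration:

```rust
// Only the *last* field of a struct may be unsized.
struct Packet<T: ?Sized> {
    tag: u8,
    data: T,
}

fn main() {
    let sized: Box<Packet<[u8; 3]>> = Box::new(Packet { tag: 7, data: [1, 2, 3] });
    // Unsize coercion: Box<Packet<[u8; 3]>> -> Box<Packet<[u8]>>.
    let dynamic: Box<Packet<[u8]>> = sized;
    assert_eq!(dynamic.tag, 7);
    assert_eq!(dynamic.data.len(), 3);
}
```

Moving `data` anywhere but last would be rejected by the compiler, which is the awkwardness the comment above describes.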
eddyb
Nov 7, 2018
Member
We kind of want to allow any tuple field to be unsized, FWIW.
The current limitation is almost entirely an implementation limitation.
Centril
Nov 7, 2018
Contributor
> I feel like forcing a particular name to be last is unnecessarily restrictive. Couldn't the unsized field just be reordered to be last?

I don't think so, no. The type system needs to see these in lexicographic ordering in all respects so that type identity can be determined by sorting fields lexicographically. Otherwise I think the well-formedness requirements would become difficult to reason about. This is also how `Hash`, `Debug`, and coherence-related behavior are meant to work.

> What's the imagined use case even for allowing unsized fields or is it just for completeness? Wouldn't it be more parsimonious to disallow them initially?

Purely for consistency with tuples. I want to achieve parity with them.

> We kind of want to allow any tuple field to be unsized, FWIW.
> The current limitation is almost entirely an implementation limitation.

If and when we do that then we can also lift the same restrictions for structural records, but I think it would be premature to do it only for structural records.
Nemo157
Nov 7, 2018
Contributor
> The type system needs to see these in lexicographic ordering in all respects so that type identity can be determined by sorting fields lexicographically.

This doesn't seem to require lexicographic ordering specifically. As long as there is a canonical order that can be determined from the type definition, ignoring the syntactic order of field declarations, that would work. At its simplest change from what's specified, it could be something like lexicographic ordering, except that a singular unsized field is moved to the end.
eddyb
Nov 7, 2018
Member
@Nemo157 is correct here; the problem already exists with trait objects, e.g. `dyn X + Y + Z`, which are equal no matter the order they were written in.
I'd say disallow unsized records until we can support all fields being unsized.
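The trait-object analogy can be checked in today's Rust: reordering the auto traits in a `dyn` type does not change the type. A minimal sketch:

```rust
use std::fmt::Debug;

// The parameter type spells the auto traits in one order...
fn take(x: Box<dyn Debug + Send + Sync>) -> String {
    format!("{:?}", x)
}

fn main() {
    // ...and the argument spells them in another; it is the same type,
    // so no conversion is needed.
    let b: Box<dyn Debug + Sync + Send> = Box::new(5u8);
    assert_eq!(take(b), "5");
}
```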
Centril
Nov 7, 2018
Contributor
> At its simplest change from what's specified it could be something like lexicographic ordering, except a singular unsized field is moved to the end.

I suppose; just to make sure... how would this handle polymorphic cases where we have a `T: ?Sized`...? It might be `Sized` at instantiation but we don't know that yet...
fbenkstein
reviewed
Nov 7, 2018
> + For `PartialEq`, each field is compared with the same field in `other: Self`.
> + For `PartialOrd` and `Ord`, lexicographic ordering is used based on
fbenkstein
Nov 7, 2018
Is it necessary to commit to lexicographic ordering? Maybe it would be better to say there's some ordering based on the (name, type) tuples, telling users they shouldn't rely on any particular ordering? That way `rustc` could decide, e.g., to compare integer fields before `str` fields as an optimization.
eddyb
Nov 7, 2018
Member
I don't think records should implement PartialOrd and Ord, because of this.
Centril
Nov 7, 2018
Contributor
> That way `rustc` could decide, e.g. to compare integer fields before `str` fields as an optimization.

That would be semantically surprising imo, since it affects observable behavior and thus isn't really an optimization; if we implement `PartialOrd` and `Ord` then it should be done lexicographically.
tikue
commented
Nov 8, 2018
Would Rust ever consider structural enums? If so, there'd be consistency in reusing the

```rust
let a: enum { A(i32), B(String) } = A(1);
let a: struct { a: i32, b: String } = { a: 1, b: "".into() };
```
sighoya
reviewed
Nov 8, 2018
```rust
type Bigger = { foo: u8, bar: bool, baz: char };
let b: Bigger = { foo: 1, bar: true, baz: 'a' };
let s: Smaller = b; // OK! `b` has all the fields `Smaller` needs.
```
sighoya
Nov 8, 2018
This causes similar problems as subtyping or implicit conversion in general. A more explicit solution is recommended, for instance with the `as` operator:

```rust
let s = b as Smaller;
```

The same goes for nominal<->structural conversions.
Centril
Nov 8, 2018
Contributor
Not exactly; implicit conversions have problems (mainly giving type inference too many options to pick from, as well as not being as clear about semantics), but they are nothing like subtyping, which can affect and constrain memory layout / representation.
The main difference between subtyping and implicit coercion is that the latter only affects top-level constructions, while the former (when covariance occurs) makes it such that:

```rust
Vec<{ alpha: T, }> <: Vec<{ alpha: T, beta: U }>
```

For a systems language this would be quite problematic.
That said, because implicit conversions do have problems, I have left them out of the main proposal at the moment.
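One way to see why element-wise covariance would constrain layout: with nominal structs today, dropping a field across a `Vec` requires an explicit O(n) rebuild, because the element layouts genuinely differ. The struct names below are invented for illustration:

```rust
#[allow(dead_code)]
struct Bigger { foo: u8, bar: bool, baz: char }
struct Smaller { foo: u8, bar: bool }

// No no-op cast exists; each element must be rebuilt.
fn shrink(v: Vec<Bigger>) -> Vec<Smaller> {
    v.into_iter()
        .map(|b| Smaller { foo: b.foo, bar: b.bar })
        .collect()
}

fn main() {
    let v = vec![Bigger { foo: 1, bar: true, baz: 'a' }];
    let s = shrink(v);
    assert_eq!(s.len(), 1);
    assert_eq!(s[0].foo, 1);
}
```

Covariant subtyping of the element type would require this conversion to be free, which is exactly what differing layouts rule out.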
jswrenn
commented
Nov 8, 2018
This RFC is really well put together @Centril! I am very much in favor of this proposal!

I prototyped Pyret's initial language-level support for "data science" and came away from the experience with an enormous appreciation for what can be achieved with a combination of just anonymous record types, FRU, and the three basic higher-order functions. In the two years since then, I've done virtually all of the data analysis work for my research in Rust. Serde makes it a breeze to parse enormous JSON datafiles into data structures with meaningful types, and Rust's rich iterator support (along with Itertools) lets me confidently and performantly implement transformations of that data. However, the inability to ergonomically create ad-hoc types at intermediate stages of those transformations significantly increases the complexity of my codebase, reduces readability, and hinders Rust's potential as a language for safe, performant and ergonomic data processing.

I am very invested in this proposal, and I basically support everything about it. However, I'd really like to see row polymorphism discussed as a Future Possibility. I don't think it's just a nicety; I think it's actually pretty fundamental for using anonymous record types, because without it you can't write abstractions. I like that there's nothing about this RFC that precludes adding it later.

A few previous comments have mentioned row polymorphism, but nobody's given a concrete example of what it lets you express, and I'm worried passing readers of this issue might assume it's weird type theory nonsense. Row polymorphism lets you express that a record has certain fields:

```rust
// Transform a stream of things with euclidian coordinates into a stream of
// things with polar coordinates.
//
// `Ρ` denotes "some collection of additional fields and types"
//
// this type signature conveys that the _only_ thing altered between the
// input and output types is the `pos` field.
fn convert<Ρ>(input: impl IntoIterator<Item={pos: EuclidianCoordinate, ..Ρ}>)
    -> impl Iterator<Item={pos: PolarCoordinate, ..Ρ}>
{
    input
        .into_iter()
        .map(|{pos, ..ρ}: {pos: EuclidianCoordinate, ..Ρ}|
            {pos: pos.into_polar_coordinate(), ..ρ})
}
```
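For contrast, the closest approximation of the row-polymorphic example above in today's Rust reifies the "rest of the record" as an explicit type parameter on a nominal wrapper. All names below are invented for illustration:

```rust
struct Euclidean { x: f64, y: f64 }
#[allow(dead_code)]
struct Polar { r: f64, theta: f64 }

// `Rest` plays the role of the row variable, but it must be threaded
// through a dedicated nominal wrapper by hand.
struct WithPos<Pos, Rest> {
    pos: Pos,
    rest: Rest,
}

fn convert<Rest>(
    input: impl IntoIterator<Item = WithPos<Euclidean, Rest>>,
) -> impl Iterator<Item = WithPos<Polar, Rest>> {
    input.into_iter().map(|item| WithPos {
        pos: Polar {
            r: (item.pos.x * item.pos.x + item.pos.y * item.pos.y).sqrt(),
            theta: item.pos.y.atan2(item.pos.x),
        },
        rest: item.rest,
    })
}

fn main() {
    let input = vec![WithPos { pos: Euclidean { x: 3.0, y: 4.0 }, rest: "label" }];
    let out: Vec<_> = convert(input).collect();
    assert!((out[0].pos.r - 5.0).abs() < 1e-9);
    assert_eq!(out[0].rest, "label");
}
```

The wrapper works, but unlike true row polymorphism it forces callers to pre-package their data into `WithPos` rather than passing any record that happens to have a `pos` field.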
ssokolow
commented
Nov 8, 2018
Might it be a good idea to more explicitly position this as being to structs as closures are to functions?

I was concerned about this at first, because of the potential for abuse, but realizing that closures present analogous risks and benefits really helped me to get on the same page.
@ssokolow I see the resemblance, but what exact parallels can you draw? It's actually easy to name the type of a record, as opposed to a closure.
ssokolow
commented
Nov 9, 2018
I was actually referring more to how they represent a similar trade-off:

Functions serve as a boundary to type inference, to aid in maintaining a stable API and ABI, while closures provide a means of declaring things inline and allow a controlled exception to the type-inference requirement, making constructs like

Traditional structs have a clear definition with its own identity, which serves to aid in using the type system to communicate a stable interface to others working on the codebase and enforce it, while structural records provide a means of declaring things inline, and allow a controlled exception to the notion of identity beyond mere structure, making the various constructs listed in the RFC more useful when full-blown structs would just be a lot of extra boilerplate at best.
If anything, the example by @jswrenn is the most convincing. Looks like something that's not convenient to do right now. I can also read and understand exactly what that code is doing; it seems like a natural extension of other Rust features.
@ssokolow I think there are a few differences between closures and what's proposed here, though. For one, closures are anonymous, not structural -- your

The last few comments here make me think more about the fields-in-traits proposal -- with closures you have
bbatha
commented
Nov 9, 2018
This also plays nicely with defaults for optional and default parameters. For instance
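The optional/default-parameter pattern alluded to can be sketched today with a nominal options struct, `Default`, and functional record update. The names below are invented for illustration:

```rust
// An "options bag" standing in for named/optional parameters.
#[derive(Default)]
struct RunOpts {
    verbose: bool,
    retries: u32,
}

fn run(opts: RunOpts) -> String {
    format!("verbose={} retries={}", opts.verbose, opts.retries)
}

fn main() {
    // Override only the fields you care about; the rest come from Default.
    let out = run(RunOpts { verbose: true, ..Default::default() });
    assert_eq!(out, "verbose=true retries=0");
}
```

Structural records would let the `RunOpts` declaration itself disappear for one-off call sites.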
ssokolow
commented
Nov 9, 2018
@scottmcm I never intended to say that they are similar on a technical level. My intent was to more effectively communicate the value and purpose of having structural records in addition to structs, by drawing an analogy to when and why people use (and don't use) closures with only explicit arguments when we already have functions. (I.e., as an alternative or complement to throwing a list of example use cases at people.)
Centril added the A-data-types label on Nov 10, 2018
Centril added the A-syntax, A-typesystem, A-structural-typing, A-product-types, A-patterns, and A-expressions labels on Nov 22, 2018
Centril assigned cramertj on Jan 3, 2019
graydon
commented
Jan 12, 2019
Strongly opposed. The fact that there's an un-filled box in a particular table enumerating aspects of types does not at all warrant filling every such box in. Consistency should not be considered a design criterion that exceeds thrift.

Worse: we had this feature before in the past and explicitly removed it to cut things down to a more manageable size. It should not be added back.
This is not the only motivation of this proposal; indeed, the RFC explicitly states that it is a minor motivation. Perhaps consider the other motivations in the proposal?

Yes, I'm aware; it is addressed here; perhaps address the text there? EDIT: I won't be able to devote too much time to this RFC, since other accepted proposals and work on those take precedence.
graydon
commented
Jan 12, 2019
It was not addressed. Those 3 reasons Patrick listed are the reasons we added nominal structs. The reason we removed structural records was that they were too redundant with nominal structs. They are still too redundant with nominal structs.
Okay, this was not at all clear from what Patrick said.

Well, let's agree to disagree on that. I think structural records add a large degree of convenience.
burdges
commented
Jan 12, 2019
I do think structural records give many fun tricks for things like named function arguments. In fact, I believe structural records should be clarified as "the one true way" to do things like named function arguments.

Yet, we should delay by years any serious experimentation with such tricks, or structural records, because they all incur significant strangeness cost, and far more important concerns exist.
Centril commented Nov 2, 2018

Introduce structural records of the form `{ foo: 1u8, bar: true }` of type `{ foo: u8, bar: bool }` into the language. Another way to understand these sorts of objects is to think of them as "tuples with named fields", "unnamed structs", or "anonymous structs".

To @kennytm, @alexreg, @Nemo157, and @tinaun for reviewing the draft version of this RFC.
To @varkor and @pnkfelix for good and helpful discussions.