Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Field init shorthand #1682

Merged
merged 16 commits into from
Oct 22, 2016
Merged

Field init shorthand #1682

merged 16 commits into from
Oct 22, 2016

Conversation

ranweiler
Copy link
Contributor

@ranweiler ranweiler commented Jul 19, 2016

This RFC proposes a shorthand syntax for constructing struct-like values with named fields.

Rendered

We propose treating expressions like S { x } to be equivalent to S { x: x }, where S is a struct, struct-like enum variant, or untagged union.

Note: this RFC was coauthored by @joshtriplett and myself.

@jonas-schievink
Copy link
Contributor

Rendered

let field2 = {
// More initialization code
};
SomeStruct { complexField, anotherField }

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't this line be SomeStruct { field1, field2 }?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It sure should, thanks!

@petrochenkov
Copy link
Contributor

petrochenkov commented Jul 19, 2016

Closes #306
Thanks for writing this, since the implementation is trivial, a formal RFC + lang team decision is the only things necessary to add this to the language.

@Havvy
Copy link
Contributor

Havvy commented Jul 19, 2016

As a point of experience, I've used a macro for this in Elixir, and the ergonomics just aren't as good as having it baked into the language, like with ECMAScript 2015.

@jirutka
Copy link

jirutka commented Jul 19, 2016

This is clearly inspired from ECMAScript and from my experience with JS, it’s really very convenience syntactic sugar. 👍

However, I have one minor problem with the syntax, it’s IMO not very clear what’s going on at the first sight. It looks exactly like construction of tuple-like structs with positional fields, only except parenthesis.

I think that using a colon before a variable name may communicate the intention a bit better:
S { x: x }S { :x }.

What do you think?

(This syntax with colon is used in Moonscript).

@sophiajt
Copy link
Contributor

@jirutka - Keeping it aligned with ES6 makes it easier to understand for a growing number of devs. Generally a good rule of thumb to reduce confusion that new syntax might cause if there's already a precedent that people will be familiar with...

@alilleybrinker
Copy link

@jirutka besides what @jonathandturner mentioned, there's also the issue of potential confusion for people coming from Ruby, where a : prefix is used for Symbols. Although this is admittedly a minor problem.

@jirutka
Copy link

jirutka commented Jul 19, 2016

Well, this argument is kinda double-edged. For example, many languages uses null, so should we add it too to be more familiar? I would rather think about it from perspectives of Rust, how it fits to the current syntax and if it may cause some confusion, now what other languages uses.

That said, I consider it a minor issue, I just want to open discussion about the syntax, find more relevant arguments, not just “ECMAScript has it.”

I personally find this syntax a little confusing, from my experience with JS/ECMAScript, and maybe we can find a better one? How others see it?

@joshtriplett
Copy link
Member

joshtriplett commented Jul 19, 2016

@jirutka I think the syntax with the ':' is worth listing in the alternatives, at the very least.

I personally don't favor that syntax, but not because of the additional character; it's because when compared to the field: value syntax, it makes the field name look like it takes the place of value while leaving out field, which seems odd. Using a trailing colon seems more evocative, but field1:, field2: looks strange.

I don't think it's just the braces that'll make it look visually distinct; it's also the conventional spacing. Consider these two typical initializers:

Struct1(a, b, c)
Struct2 { a, b, c }

If they differed only by punctuation, I'd agree that that seems slightly subtle. But a tuple-struct-style initializer typically has no space around the parentheses, while a named field initializer (with or without shorthand) typically has space around the braces.

(Coming from C-like languages in general, braces directly next to other tokens without whitespace just looks wrong. That tends to, instead, evoke bash-style expansion: foo{1,2,3} expanding to foo1 foo2 foo3.)

@strega-nil
Copy link

I was told to write this down, so I am.

I don't like this proposal because I think it's unnecessary and confusing. As a C programmer, I would expect that:

struct Foo {
  x: i32,
  y: i32,
}

let x = 0; let y = 1;
let foo = Foo { x: y, y: x };

to be equivalent to

struct Foo {
  int32_t x;
  int32_t y;
}

int32_t x = 0; int32_t y = 1;
Foo foo = { .x = y, .y = x };

and

let x = 0; let y = 1;
let foo = Foo { y, x };

to be equivalent to

int32_t x = 0; int32_t y = 1;
Foo foo = { y, x };

which would be the reverse of what it is now.

@sophiajt
Copy link
Contributor

@ubsan - Is the gist here that positional values no longer holds?

@strega-nil
Copy link

@jonathandturner Right. If I'm writing Foo { x, y } then I'd expect it to be positional, not based on names which are easily subject to change.

@eddyb
Copy link
Member

eddyb commented Jul 19, 2016

Rust never had any way of observing defined positions of named fields, other than drop order, so this doesn't feel like a particularly strong argument.
On the contrary, C's syntax for this is more confusing and mistake-prone by default than it needs to be, as you can easily get the ordering wrong with the definition of the structure being somewhere else entirely.

@joshtriplett
Copy link
Member

joshtriplett commented Jul 19, 2016

@ubsan I can understand that. That's part of what we were trying to capture with the note in the RFC that this could look a bit like a positional initializer.

@jonathandturner Structs with field names have never accepted positional initializers, and this doesn't change that. It's that, coming from other languages, this looks like a positional initializer, but interpreting it as such would produce incorrect results.

For that matter, it would mean code that Rust would normally reject as invalid will instead compile; if someone learning Rust tries to write a positional initializer for a struct with field names, and happens to use the same names as the fields, they'll get weird results rather than a helpful compiler error. I can actually imagine someone doing this. "Create a struct Point. Create x and y variables. Initialize the Point with them. (Using positional syntax by mistake.) Print it. OK, that works. (But not for the reason they think it did.) Now try reversing the two variables. Wait, what the..."

This is actually a good argument for adding some kind of sigil to indicate this shorthand, as @jirutka suggested. That would make it clear that something more is going on. I'm finding that argument rather convincing now.

What might that sigil look like? What are other alternatives to a leading or trailing colon on the field name?

@strega-nil
Copy link

@eddyb Foo(x, y) should feel equivalent to Foo { x, y }, and it'd be weird (to me) if they didn't. It'd also confuse many C/C++ programmers coming to Rust for the first time, imho.

@solson
Copy link
Member

solson commented Jul 19, 2016

C-f "pattern" (of the comments) found nothing, so I'd like to note that we already have half of this feature: Foo { x, y } is a pattern short for Foo { x: x, y: y }. And these are already non-positional: Foo { y, x } is an equivalent pattern.

This RFC brings the existing pattern shorthand into the expression syntax.

@jirutka
Copy link

jirutka commented Jul 19, 2016

@solson I’m confused, what’s “C-f pattern”?

@ranweiler
Copy link
Contributor Author

(@jirutka, I think they mean that they searched the text of this discussion)

@joshtriplett
Copy link
Member

@solson That was exactly what we were going for.

Interesting point about the non-positional patterns. On the other hand, most other languages with this kind of syntax don't have pattern-matching, so the knowledge doesn't carry over as much. But it would be odd to introduce an extra sigil for the initializer shorthand that patterns don't use. Hmmm.

@alilleybrinker
Copy link

Yeah, given @solson's comment, it seems the current syntax is the most Rust-consistent syntax, even if it's not the most over-language-consistent syntax.

@solson
Copy link
Member

solson commented Jul 19, 2016

@jirutka As @ranweiler said, I just meant I did a Ctrl+F search.

@joshtriplett Yeah, I didn't say much that's not in the RFC, but I thought it was good to stress this point. Good point also about consistency between the two syntaxes.

@SimonSapin
Copy link
Contributor

This mirrors the existing syntactic sugar of structs patterns. +1

@nrc
Copy link
Member

nrc commented Oct 4, 2016

I'm checking off on this because I don't object strongly enough to try and block the feature (I don't think anything actually bad will happen because of it, just that I think the cons outweigh the pros). Let it be known that I still do not like it.

@nikomatsakis
Copy link
Contributor

@eddyb

Except that's not the case for pattern shorthands.

I'm not quite sure what you are saying here. Are you saying that pattern shorthands today interact with hygiene in some complex way? Or just that the lowering doesn't happen when building HIR in particular? (I agree with @nrc that this would be a logical time to do it, regardless of what we do now.)

@eddyb
Copy link
Member

eddyb commented Oct 4, 2016

@nikomatsakis The latter, as you can see in the comment I linked. Of course that with HIR we may just
opt to move the lowering there, but struct pattern shorthands don't have a hygiene problem so doing exactly what they do for struct literal shorthands (i.e. expand in the parser) would work fine with hygiene.

@jimmycuadra
Copy link

Not sure there's any point in commenting on this now since all the subteam members have voted to merge, but I agree totally with @nrc's viewpoint. This is a totally unnecessary thing to have language-level support for.

@ranweiler
Copy link
Contributor Author

@aturon, @nikomatsakis, I have a meta question. We're in the FCP period, but given the discussion concluding here, it looks like there's agreement to rename the feature to field-init-shorthand. Will pushing a commit to do so (at least in the RFC document, if not the branch name) disrupt the RFC procedure or supporting tooling?

@aturon
Copy link
Member

aturon commented Oct 4, 2016

@nrc

Thanks much for pushing to articulate your concerns. I think I can understand your general queasiness, but for me, thinking of as pure shorthand at the parsing level (as @nikomatsakis suggests) basically feels right, and largely settles this question. It's also noteworthy that Standard ML -- a very conservative and mathematically modeled and designed language -- has this feature.

I'll just add as a final note that the places where I've wanted this feature tend to involve relatively large structs that are purely for internal use, where I don't want to write a constructor.

Often the field name is something like token, of type Token, where I pretty consistently am using variables called token to talk about the current instance. IOW, I find that I already tend to match names up to aid readability, even though this shorthand doesn't yet exist. While it's possible that adding the shorthand could incentivize poor naming choices, I feel like in general people are (or should be) more careful about the name of struct fields than the name of particular variables.

Refactoring to use a constructor is indeed an interesting potential pitfall, though I think it's worst when you were punning all the names. It doesn't feel like enough of an issue to block the feature, though.

@nikomatsakis
Copy link
Contributor

@ranweiler you can change the name of the feature, I don't think that disturbs the FCP period. =)

@aturon
Copy link
Member

aturon commented Oct 4, 2016

@jimmycuadra

Not sure there's any point in commenting on this now since all the subteam members have voted to merge

To be clear, we've recently moved to a system, with @rfcbot, where everyone on the subteam signs off on going into FCP with a particular disposition. Previously, it was too easy for some team members to only evaluate carefully during FCP, which was not ideal.

So, at this point the RFC is ready to go into FCP, but that still leaves another week for final community discussion, which can raise new issues.

@ranweiler ranweiler changed the title Named field puns Field init shorthand Oct 4, 2016
@aturon aturon added the final-comment-period Will be merged/postponed/closed in ~10 calendar days unless new substational objections are raised. label Oct 4, 2016
@aturon
Copy link
Member

aturon commented Oct 4, 2016

🔔 This RFC is entering its final comment period, with disposition to merge. 🔔

The most recent full summary is available here, with main arguments against here.

@rfcbot
Copy link
Collaborator

rfcbot commented Oct 11, 2016

It has been one week since all blocks to the FCP were resolved.

@target-san
Copy link

Just my 5 cents.
I see that this particular feature will be possibly considered a "code smell" and then lint will be added to mark it. Thus, is there any sense in adding something which will be considered a code smell?

@solson
Copy link
Member

solson commented Oct 19, 2016

@target-san I consider the foo: foo pattern a code smell, I see it quite a lot in Rust code, and this feature solves it cleanly. The majority seem to think that this feature will make code better, not worse.

@target-san
Copy link

@solson I just really don't like possible gotchas with things like Vector { z, y, x } - which happen in C++ as an example.

@solson
Copy link
Member

solson commented Oct 19, 2016

@target-san { z, y, x } is quite different between C++ and Rust.

I can see why coming from C++ you may be wary of this, because in C++ the curly brace expressions are always positional, so things like int foo[] { 1, 2, 3 }; are placed by order of appearance.

But in Rust, curly-braces-with-commas always means non-positional, name-based associations. For example, you can have a struct struct Foo { x: i32, y: i32, z: i32 } and initialize in a different order with Foo { z: 10, y: 5, x: 0 } or pattern match with Foo { y, x, z } even today. What we pay attention to is the names, and the order has no bearing.

When you look at it this way, there really is no gotcha. It's consistent with the rest of Rust.

@ranweiler
Copy link
Contributor Author

I think @target-san's example is interesting, as it is a case where using a struct expression (vs. a ctor) excludes certain errors (if you are already using a certain pattern).

Suppose we have something like:

struct Vector {
    x: f64,
    y: f64,
    z: f64,
}

impl Vector {
    fn new(x: f64, y: f64, z: f64) {
        Vector { x: x, y: y, z: z }
    }
}

Then lets say I'm using Vector in this context:

// The function calls here are meant to indicate "doing a bunch of complicated work"
let x = f();
let y = g();
let z = h();

let v = _  // we want build a `Vector` here, where the field names should match the local var names

What are the ways I can construct v from x, y, and z?

If I use the ctor, I can make a typesafe error like:

// Permuting arguments
let v = Vector::new(x, z, y);

or:

// Repetition of an argument
let v = Vector::new(x, y, y);

Similarly, if I use a struct literal without shorthand notation, I can make (essentially) the same errors:

let v = Vector { x: x, y: z, z: y };

or maybe a more likely typo:

let v = Vector { x: x, y: y, z: y };

But, if I was already doing some work earlier to come up with x, y, and z vars that ought to correspond to the fields of the same name, I can't make the permutation error with shorthand notation, since the following are equivalent:

Vector { x, y, z };
Vector { y, z, x };
Vector { z, x, y };

(and so on for all permutations of the field labels).

For the repetition error, with this proposal, the expression

Vector { x, y, y }

would expand to:

Vector { x: x, y: y, y: y }

which would fail to compile with the existing error:

error[E0062]: field `y` specified more than once
  --> src/lib.rs:12:34
   |
12 |     let v = Vector { x: x, y: y, y: y };
   |                            ----  ^ used more than once
   |                            |
   |                            first use of `y`

error: aborting due to previous error

Is this a colossal win? Not really, and in many (most?) cases, we want to hide struct fields and mediate all access to them with a ctor/methods, which means the syntax doesn't help end-users of Vector (just us, when writing its impl).

Also, this is only relevant in the setting where we are already (as humans) building up values named x, y, and z that are implicitly meant to match the field names of Vector. As @nrc very sharply pointed out, switching to a ctor (where order matters, not name) isn't really any worse, since we are switching from "one implicit scheme to another implicit scheme". For me, the local var name implicit scheme is common enough (even when I have a ctor!) to favor adding sugar to support it.

If this discussion surfaced gotchas that were worse than what we gain, I'd oppose this sugar (even though I co-proposed it!). But, I think it is conservative enough to provide a small net benefit here and there, including extra static safety in certain contexts.

@dns2utf8
Copy link

dns2utf8 commented Oct 19, 2016

What happens when I use a contextual keyword?

let union = ...;
struct with { some: String, locals: String } // knowing this name would riese a warning
Foo {
    union, with { some, locals }
}

Will this error or guess what is right?

@durka
Copy link
Contributor

durka commented Oct 19, 2016

@dns2utf8 I don't see any conflicts there. You can't declare a union inside
a struct initializer.

On Wed, Oct 19, 2016 at 6:41 PM, dns2utf8 notifications@github.com wrote:

What happens when I use a contextual keyword?

let union = ...;
struct with { some: String, locals: String } // knowing this name would riese a warning
Foo {
union, with { some, locals }
}

Will this error or guess what is right


You are receiving this because you commented.
Reply to this email directly, view it on GitHub
#1682 (comment), or mute
the thread
https://github.com/notifications/unsubscribe-auth/AAC3n8mHo8tsoFN9LWxukaGRtcWtosPeks5q1pyigaJpZM4JP7RN
.

@aturon aturon merged commit b425e05 into rust-lang:master Oct 22, 2016
@aturon
Copy link
Member

aturon commented Oct 22, 2016

RFC merged. Further discussion should happen on the tracking issue.

@rust-lang rust-lang locked and limited conversation to collaborators Oct 22, 2016
@Centril Centril added A-syntax Syntax related proposals & ideas A-expressions Term language related proposals & ideas labels Nov 23, 2018
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
A-expressions Term language related proposals & ideas A-syntax Syntax related proposals & ideas final-comment-period Will be merged/postponed/closed in ~10 calendar days unless new substational objections are raised. T-lang Relevant to the language team, which will review and decide on the RFC.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet