RFC: Accept semicolons as item-like #2479

Open. Wants to merge 12 commits into master.

Contributor

Centril commented Jun 16, 2018

🖼️ Rendered

📝 Summary

  • The semicolon (;) is now legal, but not recommended, in areas where items are allowed, permitting a user to write struct Foo {};, among other things.
  • To retain a recommended and uniform style, the tool rustfmt will remove any extraneous ;.
  • Furthermore, the compiler will fire a warn-by-default lint when extraneous ; are encountered, whether they be inside or outside an fn body.
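For readers less familiar with the rules the summary refers to, here is a minimal sketch of the situation at the time of this discussion. The type and function names (`Unit`, `Pair`, `Braced`, `Local`, `sum_fields`) are made up for illustration and do not come from the RFC text.

```rust
// Unit and tuple structs end with a semicolon today.
struct Unit;
struct Pair(u32, u32);

// A braced struct at module level must not be followed by `;` today.
// Writing `struct Braced { x: u32 };` here is the case the RFC would accept.
struct Braced {
    x: u32,
}

// Inside a function body, stray semicolons are already accepted as
// empty statements, so the lint is silenced here for the demonstration.
#[allow(redundant_semicolons)]
fn sum_fields() -> u32 {
    struct Local { y: u32 }; // item followed by an empty statement
    ;; // two more empty statements
    let _unit = Unit;
    let pair = Pair(1, 2);
    let braced = Braced { x: 3 };
    let local = Local { y: 4 };
    pair.0 + pair.1 + braced.x + local.y
}

fn main() {
    println!("{}", sum_fields());
}
```

The point of the sketch is only the asymmetry: the same `struct ... {};` is fine inside `sum_fields` but rejected at module level.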

💖 Thanks

To @ExpHP who co-authored this RFC with me.

Centril added the T-lang label Jun 16, 2018

clarcharr Jun 17, 2018

Contributor

Another alternative is to allow the parser to accept ; as an item while still treating its presence as a hard error. The compiler already does this for a number of syntax errors so that it can report all errors at once, rather than stopping at the first one.

Centril Jun 17, 2018

Contributor

@clarcharr I think that the proposal in the RFC is better, but I've made a note of that alternative in the RFC. Thank you for mentioning it.

petrochenkov Jun 17, 2018

Contributor

IIRC, the situation with ; in the parser was kind of a mess, especially when macros are involved; we still have some compatibility warnings in this area.

There's one thing that's pretty common (for C reasons) and must work:

if condition {
    ; // Do nothing
}

This already works, but I'm not sure exactly how the parser, and especially macros, treat it, because it's not treated as a statement.
Perhaps we need to treat it as a statement consistently (e.g. accept ; as a stmt in macros), but I'm not sure right now what complications may arise with macros.

Regarding semicolons floating freely among items in modules, I think the usual error recovery (report an error, skip the semicolon, continue parsing) would be enough.
The motivation about leftover semicolons looks especially wrong to me; I want such mistakes to be caught immediately by the compiler and not left in the code.

Centril Jun 17, 2018

Contributor

@petrochenkov

The motivation about leftover semicolons looks especially wrong to me, I want such mistakes to be caught immediately by the compiler and not left in the code.

Why? To me, a left-over semicolon does not usually signify a logic bug that needs to be caught immediately, but rather a trivial style mistake that can easily be stripped with rustfmt.

A third alternative is to allow it in the language but fire a warn-by-default lint; that satisfies your "needs to be caught immediately" condition.

Also, I think this RFC is in line with the general policy on language grammar as outlined by @withoutboats.

mark-i-m Jun 17, 2018

Contributor

I think the RFC is well written, but I tend to favor "Make struct F {}; a non-fatal hard error". I simply don't come across this problem that often, and I feel like it will lead to widespread bad style, despite the proposed rustfmt change.

My feelings are not very strong though...

Centril Jun 17, 2018

Contributor

@mark-i-m

I think the RFC is well written

Thanks :)

I feel like it will lead to widespread bad style, despite the proposed rustfmt change.

Would a warn-by-default lint, as a more aggressive mechanism, satisfy your concern?

EDIT: Since #1925 was merged, I haven't seen any widespread use of that syntax, which is widely considered bad style, and rustfmt formats (or will format) it away. rustfmt is currently proposed as the mechanism because bad style is usually not linted, just formatted away.

mark-i-m Jun 17, 2018

Contributor

Would a warn-by-default lint, as a more aggressive mechanism, satisfy your concern?

I think so.

EDIT: Since #1925 was merged, I haven't seen any widespread use of that syntax, which is widely considered bad style, and rustfmt formats (or will format) it away. rustfmt is currently proposed as the mechanism because bad style is usually not linted, just formatted away.

In this case, I think the risk is much higher because the semicolon in class/struct/enum Foo {...}; is required C/C++ syntax.

Ixrec Jun 17, 2018

Contributor

@mark-i-m I thought "a non-fatal hard error" was a contradiction; what did you mean by that? A deny-by-default lint? A hard error that the compiler "understands" well enough that it can still produce warnings/errors/etc for all the code after the superfluous semicolon?


Personally, the RFC as written is almost what I'd want, but I'd also add a rustc warn-by-default lint against superfluous semicolons. Today, Rust technically has pretty consistent semicolon rules, which only appear inconsistent in practice because there's nothing loudly discouraging superfluous semicolons. I can see the argument that rustfmt is sufficient, but to me this feels very similar to the existing rustc lints against unused variables, unused mut, dead code, etc.

Centril Jun 17, 2018

Contributor

@mark-i-m

I think so.

Great; I'll add it to the list of alternatives for now in a bit. I wouldn't mind changing to it as the main proposal. Let's see what @ExpHP says.

In this case, I think the risk is much higher because the semicolon in class/struct/enum Foo {...}; is required C/C++ syntax.

Fair enough, it does seem like a reasonable inference. I have two points regarding this:

  1. Would it be so bad? I mean, struct Foo { .. }; is a bit ugly to my subjective taste, but I can't say it is less legible than struct Foo { .. } without the semicolon at the end. I certainly wouldn't lose any sleep over such syntax being committed into VCS, or be irked at all, even though I wouldn't write it that way personally.
  2. Being required C/C++ syntax almost sounds like an argument for just dealing with this with rustfmt, thereby reducing speedbumps for C/C++ programmers (a group of people we are inclined to bribe with nice things?).

@Ixrec

but to me this feels very similar to the existing rustc lints against unused variables, unused mut, dead code, etc.

That's a nice rationale :)

Ixrec Jun 17, 2018

Contributor

Being required C/C++ syntax almost sounds like an argument for just dealing with this with rustfmt, thereby reducing speedbumps for C/C++ programmers (a group of people we are inclined to bribe with nice things?).

Personally, I constantly misremember the semicolon rules in both C++ and JavaScript, despite working on one or the other every single day. So to me, "other languages are different" is a good argument for making it not a hard error (because it does reduce speedbumps), but not an argument for making it rustfmt-only (because I make these mistakes out of ignorance or indifference, not because I have an aesthetic preference for any one language's semicolon rules). Though I have no idea how true that is of everyone else.

ExpHP Jun 17, 2018

Right now I'm bothered by what @petrochenkov said: (emphasis mine)

There's one thing that's pretty common (for C reasons) and must work:

if condition {
    ; // Do nothing
}

, and it already works, but I'm not sure how exactly it's treated by parser and especially macros, because it's not treated as a statement.

Yikes! And indeed, this is visibly reflected in the macro_rules matchers. I assumed that stmt matches ;, but it does not:

   Compiling playground v0.0.1 (file:///playground)
error: expected a statement
  --> src/main.rs:14:7
   |
14 | i3! { ; }
   |       ^

Perhaps we should not formulate this as "making ; an item"...?
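To make the matcher behaviour concrete, here is a small sketch. The macro names `count_tts` and `accepts_stmt` are hypothetical and not taken from the playground code quoted above. It shows that a lone ; is an ordinary token tree, and that stmt does match a let statement; the ; case for stmt is left commented out because, per the output above, it is rejected.

```rust
// `tt` matches any single token tree, and a lone `;` is one token tree.
macro_rules! count_tts {
    () => { 0usize };
    ($head:tt $($tail:tt)*) => { 1usize + count_tts!($($tail)*) };
}

// `stmt` matches a statement written without its trailing semicolon.
macro_rules! accepts_stmt {
    ($s:stmt) => { true };
}

fn lone_semicolon_token_count() -> usize {
    count_tts!(;)
}

fn let_is_stmt() -> bool {
    accepts_stmt!(let _x = 1)
}

fn main() {
    // accepts_stmt!(;) // reported above as: error: expected a statement
    println!("{} {}", lone_semicolon_token_count(), let_is_stmt());
}
```

So a macro that wants to tolerate stray semicolons today has to match them as tokens rather than rely on the stmt or item fragments.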

ExpHP Jun 17, 2018

@Centril @mark-i-m I feel that a warn-by-default lint does adequately address the motivations, and would not mind making it the main proposal.

mark-i-m Jun 17, 2018

Contributor

I thought "a non-fatal hard error" was a contradiction; what did you mean by that? A deny-by-default lint?

As I understand it, there are two ways that part of the compiler can report an error. A fatal error immediately ends compilation, whereas a non-fatal error allows the compilation to proceed a bit further so we can gather more errors.

I think the warn-by-default lint is a better solution, though. It seems more in line with things like the warning for superfluous parens on if conditions.

Would it be so bad?

It is a matter of personal taste for me too. I think the warn-by-default lint removes the papercuts for C/C++ programmers while encouraging good style.
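The fatal versus non-fatal distinction described above can be sketched with a toy parser. This is not how rustc is implemented; `Token` and `parse_with_recovery` are invented names, and the token model is deliberately simplistic. The idea is only that a stray ; is recorded as an error and skipped, so later items are still seen.

```rust
// A deliberately tiny token model: either some item or a stray semicolon.
enum Token {
    Item(&'static str),
    Semi,
}

/// Walks the tokens, collecting items and recording an error for each
/// stray `;` instead of stopping at the first one (non-fatal recovery).
fn parse_with_recovery(tokens: &[Token]) -> (Vec<&'static str>, Vec<String>) {
    let mut items = Vec::new();
    let mut errors = Vec::new();
    for (idx, tok) in tokens.iter().enumerate() {
        match tok {
            Token::Item(name) => items.push(*name),
            // Report, skip the semicolon, and keep parsing.
            Token::Semi => {
                errors.push(format!("expected item, found `;` at token {}", idx))
            }
        }
    }
    (items, errors)
}

fn main() {
    let tokens = [Token::Item("struct Foo"), Token::Semi, Token::Item("fn main")];
    let (items, errors) = parse_with_recovery(&tokens);
    println!("items: {:?}, errors: {:?}", items, errors);
}
```

A fatal error would correspond to returning as soon as the first `Token::Semi` is seen; a lint would correspond to recording a warning and still treating the parse as successful.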

withoutboats Jun 17, 2018

Contributor

Does this make this a valid Rust program?

;;;;;;;;;;;;;
fn main() { }

That's a bit more extreme than our grammar flexibility usually goes.

Why not allow an optional ; after all items (except those that require ;)? Why go all the way to making ; its own item?

There's precedent for allowing optional ; after items both in how we allow ; optionally after control flow blocks & in how (I believe) we decided to allow , optionally after block match arms.
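A short sketch of the two precedents mentioned here, as I understand them; the function name `classify` is made up for the example. Both the ; after the if block and the , after the block-bodied match arm are optional and can be dropped without changing the meaning.

```rust
// The lint on stray semicolons is silenced so the example stays quiet.
#[allow(redundant_semicolons)]
fn classify(n: i32) -> &'static str {
    // Optional `;` after a control-flow block.
    if n < 0 {
        return "negative";
    };
    match n {
        // Optional `,` after a block-bodied match arm.
        0 => {
            let label = "zero";
            label
        },
        _ => {
            let label = "positive";
            label
        }
    }
}

fn main() {
    println!("{} {} {}", classify(-1), classify(0), classify(7));
}
```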

scottmcm Jun 17, 2018

Member

I'm very good at accidentally putting extra semicolons after braced structs, so I'm sympathetic here, but it's hard for me to judge what, if anything, should be done. I like the discussion so far.

Personally, I've been pretty happy since the error changed from "unexpected token" to

error: expected item, found `;`
 --> src/lib.rs:3:22
  |
3 | struct Foo { x: u32 };
  |                      ^ help: consider removing this semicolon

mark-i-m Jun 17, 2018

Contributor

Why not allow an optional ; after all items (except those that require ;)? Why go all the way to making ; its own item?

I was actually thinking about this too. I think @withoutboats has a good point. In addition to their point, though, I think making ; a whole item would make life harder for macro authors, who would now need to check for and handle another type of item.

By making ; an optional suffix onto any of the existing item types, I think we can avoid both problems.

mark-i-m Jun 17, 2018

Contributor

;;;;;;;;;;;;;
fn main() { }

Oh man! This foils my plans to propose ;;;;;;;;;;;;; as shorthand for fn main 😈

mark-i-m Jun 17, 2018

Contributor

One other thing: This seems like a less likely error, but I think the following mistake becomes possible all of a sudden:

#[my_attr]; // oops
fn foo() { ... }

est31 Jun 18, 2018

Contributor

In if clauses, we do lint on extraneous () in the condition:

if (true) { //~ WARNING unnecessary parentheses around `if` condition
}

The situation between the two is very similar. Cpp requires ; after struct declaration as well as () around the if condition.

In fact, as a programmer coming from Cpp, I got so annoyed by that warning that I allowed it. So even though I'm mildly against this RFC now, I guess I'd have liked it back when I was fresh from Cpp.

Given the precedent, I think that a warn by default lint as @Centril suggested would be a good idea.
As for superfluous commata, I think you should still be allowed to type them, for consistency. E.g. here they seem useful:

match v {
    Enum::Foo => panic!(),
    Enum::Baz => {
        let k = boo();
        k.foo();
    }, // ~ USEFUL this comma is good 😎
    Enum::Bar(b) => bar(b),
}

comex Jun 18, 2018

;;;;;;;;;;;;;
fn main() { }

That's a bit more extreme than our grammar flexibility usually goes.

Not that extreme, comparatively speaking. This is perfectly valid in C and C++:

;;;;;;;;;;;;;
int main() { }

(Since I was curious: Extraneous semicolons are also valid in Java, JS (of course), Haskell, and Ruby, but not in C#, Swift, or Python. In Go they are allowed only within functions.)

Centril Jun 18, 2018

Contributor

@withoutboats

Does this make this a valid Rust program?

;;;;;;;;;;;;;
fn main() { }

That's a bit more extreme than our grammar flexibility usually goes.

Yep, it does. But it also seems extremely 😆 unlikely for someone to actually write that ;)
Let's not forget that something like the snippet above is already permitted when placed inside a function (because ; is then interpreted as ()):

fn main() {
    ;;;;;;;;;;;;; // <-- interestingly, there is no lint here... but rustfmt will strip it
    fn foo() { }
}

So... still that extreme?

Why not allow an optional ; after all items (except those that require ;)? Why go all the way to making ; its own item?

It seems like the simpler alternative to make ; its own item. But if we at the conclusion of this RFC decide to allow ; after all items (except ...) I will be equally happy.

@mark-i-m

One other thing: This seems like a less likely error, but I think the following mistake becomes possible all of a sudden:

#[my_attr]; // oops
fn foo() { ... }

Yes, good point; I think a warn-by-default lint would catch it well, though :)

@est31

Given the precedent, I think that a warn by default lint as @Centril suggested would be a good idea.

Yes, I think so. Let's switch over to a warn by default lint as the main proposal (in addition to rustfmt formatting which solves the lint for the user instantly).

So... what should the name of the lint be? I see two categories of options here:

  1. A new lint name, like: redundant_semicolon -- the main benefit here is that you can allow or deny it separately without allowing something else.
  2. Add it to some existing lint -- not sure which one... ideas?

petrochenkov Jun 18, 2018

Contributor

I still don't see "positive"/"constructive" motivation for this change, as opposed to "let's accept some more syntax because we can".
Trailing separators are "positively" useful: they support uniform code generation, so the last element is not treated specially. Optional semicolons only make life harder for macros, as cases like rust-lang/rust#34660 show.
Trailing separators minimize diffs when new elements are added, these semicolons have no such effect.

In addition to these semicolons we have a ton of cases in the parser when we can recover from a missing or an extraneous token and continue compilation. Should we turn them into warnings as well?
Maybe instead of doing recovery in the compiler we should adopt the bitwise negation operator ~ into the proper language as a synonym for ! because someone could copy-paste code from C?
I don't think so; recovery is enough. And we shouldn't move these semicolons from recovery into the proper language either; they are not special.

Ixrec Jun 18, 2018

Contributor

I am primarily interested in getting a warn-by-default lint on the redundant semicolons that we already allow today. Allowing them in more places seemed like a minor win for consistency and arguably removing a speedbump when refactoring, but I'm fine if macro compatibility concerns override that.

Centril Jun 18, 2018

Contributor

I still don't see "positive"/"constructive" motivation for this change, as opposed to "let's accept some more syntax because we can".

The motivation is clearly not to accept more syntax just because we can. The positive motivation is to avoid disturbing the writing flow of some users due to frequently made trivial mistakes. You might not agree with it, but it is a positive motivation.

optional semicolons only make life harder for macros as cases like rust-lang/rust#34660 show.

What specific cases are you worried about wrt. making ; an item?

In addition to these semicolons we have a ton of cases in the parser when we can recover from a missing or an extraneous token and continue compilation. Should we turn them into warnings as well?

Are they frequently made trivial mistakes with precedent from other places?

Maybe instead of doing recovery in the compiler we should adopt the bitwise negation operator ~ into the proper language as a synonym for ! because someone could copypaste code from C?

I'll let whoever wants to do that make the case for it, but I won't. Redundant ; are already allowed inside fn bodies and tuple / unit structs already allow them. The precedent from C/C++ wrt. structs is also much more common than ~. I think there's much more added precedent in this case.

I don't think so, recovery is enough.

Not to me. By emitting a warning instead of a hard error, the code can be compiled and tests can be run, so it helps retain flow a lot more in my opinion.

petrochenkov Jun 18, 2018

Contributor

@Centril

> The positive motivation is to avoid disturbing the writing flow of some users due to frequently made trivial mistakes.

If this is a problem that needs a solution, then we can have a compiler mode similar to gcc's -fpermissive that treats *all* "sufficiently recoverable" errors as warnings and continues compilation until the final lib/exe is produced, rather than targeting a single special case of mistakes.

> What specific cases are you worried about wrt. making ; an item?

I don't remember right away, need to investigate a bit.
The primary source of complications is that both macro invocation (m!(...);) and macro output may have semicolons, but ; is not technically a part of macro invocation syntax, and it doesn't belong to the macro output either, so some rules need to be established with regards to what inner or outer semicolons are attached to what and when they are ignored or not ignored.

est31 Jun 18, 2018

Contributor

Regarding the redundant semicolons lint, I think we should avoid linting in this case:

fn foo() -> Foo {
    return Foo;
}

The ; can technically be removed, but many (most?) people still write it.
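
To make the case concrete, here is a minimal sketch of the spellings involved. The `Foo` type and the function names are made up for illustration; the point is only that all three functions compile and return the same value, so a lint on the first form would be purely stylistic.

```rust
#[derive(Debug, PartialEq)]
struct Foo;

// Explicit `return` followed by the customary semicolon.
fn with_semi() -> Foo {
    return Foo;
}

// The same `return` without the semicolon; this is also accepted,
// because `return Foo` is then the tail expression of the body.
fn without_semi() -> Foo {
    return Foo
}

// Plain tail-expression form. Adding a semicolon here would change
// the meaning, since the body would then evaluate to `()`.
fn tail_expr() -> Foo {
    Foo
}
```

Calling `with_semi()`, `without_semi()` and `tail_expr()` yields equal values in each case.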

Manishearth Jun 25, 2018

Member

> The motivation does seem to be entirely motivated by the way people are using the tools, rather than with what the language should look like.

I did feel something was different about this RFC but couldn't put my finger on it; thanks for articulating!

I think that recovering from these errors in the parser is a great idea (I'd love for the parser to do much more error recovery, in fact).

From a compiler side, we already do this in a bunch of cases, yeah? For example, in this code the type error still shows up even though we haven't even parsed the file correctly.

I'm not sure how this works with abort_if_errors() (ideally, this should be ignored by that call by default?), but it would be nice to have a generic framework to do stuff like this. (Essentially, this is equivalent to making such errors into hardwired Deny-only lints)

@Centril Centril self-assigned this Jul 5, 2018

alercah Jul 30, 2018

Contributor

I don't think it impacts the RFC directly, but I do not agree with @joshtriplett that semicolon handling is consistent in statement context (I don't see reason to disagree about item context). Imagine if you see a function whose body is one of the following. Can you determine immediately how they are parsed/what they mean, assuming that foo is a function of type fn()?

  1. {foo}()
  2. (foo)()
  3. ({foo})()
  4. {foo};()
  5. (foo);()
  6. ({foo});()

Got your answers down on paper?

1 is an error. {foo}() is parsed as two statements, with {foo} being the expression in the first one, and () being the expression in the second. {foo} has type fn(), though, so it's an error because an expression statement of non-() type cannot be followed by another expression unless a semicolon intervenes.

2 and 3 call foo() and return the result (which is not important because it is (), but it would be just as valid for another return type). For 2, even though we could parse it as two statements again, we do not, and instead it's a single function call expression. The same applies to 3 but now we have to wonder why {foo} isn't an error here. The answer is because it's a subexpression, the semicolon rule doesn't apply, and then the function call is made because (...)(...) is parsed as a function call.

4, 5, and 6 all evaluate foo, discard it without calling it, and return (). In 5 and 6, the semicolon's role is to clarify the parse by ensuring that there are two statements rather than just one. In 4, its role is different; it does not affect the parse but it does affect the semantics by allowing a non-() type.

I think it's hard to argue that this behaviour is particularly consistent.
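
The six bodies discussed above can be tried out with a small sketch. It assumes a hypothetical `foo` of type `fn()` that bumps a counter so that calls are observable; `CALLS` and the `caseN` names are illustrative only. Case 1 is left commented out because, as described above, it does not type-check.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

// Counts how many times `foo` has actually been called.
static CALLS: AtomicUsize = AtomicUsize::new(0);

fn foo() {
    CALLS.fetch_add(1, Ordering::SeqCst);
}

// Case 1: `{foo}` is taken as a block statement of type `fn()`,
// followed by a separate `()` expression, so this is rejected.
// fn case1() { {foo}() }

// Cases 2 and 3: a single call expression; `foo` is invoked.
#[allow(unused)]
fn case2() { (foo)() }
#[allow(unused)]
fn case3() { ({foo})() }

// Cases 4 to 6: `foo` is evaluated and discarded without being called,
// then the body evaluates to `()`.
#[allow(unused)]
fn case4() { {foo};() }
#[allow(unused)]
fn case5() { (foo);() }
#[allow(unused)]
fn case6() { ({foo});() }
```

Running `case2` and `case3` increments the counter once each, while `case4`, `case5` and `case6` leave it unchanged.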

Centril Jul 31, 2018

Contributor

@ExpHP hey there; could you please take a look at @Manishearth's PR at Centril#6 and OK it? :)

Manishearth and others added some commits Jun 21, 2018

Merge pull request #6 from Manishearth/semi-item
Make ; not an item but permitted in item contexts

ExpHP Aug 15, 2018

To copy over a concern I wrote elsewhere:

rustfmt doesn't currently remove extraneous ; in statement context, which conflicts with our goal to have it remove ; in item context. (especially once it hits 1.0, at which point the behavior for ; in statement context will be stable until at least a 2.0 release)

@Centril Centril changed the title from RFC: Accept semicolons as items to RFC: Accept semicolons as item-like Aug 15, 2018

Centril Aug 15, 2018

Contributor

Update:

  • @Manishearth's PR has now been merged into this RFC, which now does not make ; an actual item, but rather item-like.

@ExpHP

> rustfmt doesn't currently remove extraneous ; in statement context, which conflicts with our goal to have it remove ; in item context.

Feels natural to me that rustfmt would remove extraneous semis in statement contexts as well.

Havvy Sep 2, 2018

Contributor

For some stronger motivation, procedural macros can either be invoked as macro!{} or macro!(); or macro![];, but not macro!{};. I'm willing to bet a lot of people are going to try the latter. Though, of course, we could also just allow it without following through this RFC. But this RFC gives a more consistent view, so I'm more in favor of that.
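
A rough illustration of the invocation forms in question. This sketch uses a declarative `macro_rules!` macro as a stand-in for a procedural one, on the assumption that the item-position invocation syntax is the same for both; `unit_struct` and the struct names are hypothetical.

```rust
// Hypothetical helper that expands to a unit struct declaration.
macro_rules! unit_struct {
    ($name:ident) => {
        #[derive(Debug, PartialEq)]
        struct $name;
    };
}

unit_struct! { A } // braces: written without a trailing semicolon
unit_struct!(B);   // parentheses: trailing semicolon required
unit_struct![C];   // brackets: trailing semicolon required

// The form the comment above describes as rejected:
// unit_struct! { D };
```

All three accepted forms declare a usable type; only the brace form followed by `;` is the stumbling block being discussed.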

@Centril Centril added the I-nominated label Sep 20, 2018

joshtriplett Sep 20, 2018

Member

There are some more general thoughts that we should write down about future changes that relax the grammar, and what principles we should consider in doing so. For instance, we should avoid accepting anything that seems likely to lead to potential harm, while downgrading to a warning when there's a clear meaning that just isn't the preferred Rust style. @Centril had several more thoughts on possible guidelines for future changes.

Meanwhile, however, I think this change is ready:
@rfcbot fcp merge

rfcbot Sep 20, 2018

Team member @joshtriplett has proposed to merge this. The next step is review by the rest of the tagged teams:

Concerns:

Once a majority of reviewers approve (and none object), this will enter its final comment period. If you spot a major issue that hasn't been raised at any point in this process, please speak up!

See this document for info about what commands tagged team members can give me.

Centril Sep 20, 2018

Contributor

Here are some of the additional thoughts I had around guidelines and criteria which @joshtriplett referenced.

When evaluating similar changes in the future, we should consider:

  • how frequently Rust programmers run into the error; i.e. if it is a naturally occurring mistake such as the one in this RFC or not.
  • how this impacts our ability to design new language features which may want to use the syntax we are considering to allow for some other semantics.
  • implementation complexity to allow (and warn) rather than error.
  • the theoretical grammar complexity (e.g. backtracking? context sensitivity? still LL(k)?)

nrc Sep 24, 2018

Member

@rfcbot concern parser-recovery

I would still prefer to keep this an error but allow the parser to recover from it. I don't see that making the grammar itself more flexible buys us anything here.

petrochenkov Sep 24, 2018

Contributor

@nrc

> I would still prefer to keep this an error but allow the parser to recover from it. I don't see that making the grammar itself more flexible buys us anything here.

@Centril argues that errors (as opposed to warnings) "disturb the flow" when writing code.
I personally never could understand this argument, perhaps because I never have "flow"?
Even if I make some mistake like this accidentally (but not the extra semicolon specifically), the time required to fix it is completely negligible compared to everything else + I fix warnings immediately like errors, except for warnings from the unused group perhaps.

Centril Sep 24, 2018

Contributor

@nrc

> I don't see that making the grammar itself more flexible buys us anything here.

If for no other reason (e.g. forgetting about flow, refactoring, and habits), I believe it buys us increased consistency and uniformity because by allowing:

fn foo() {
    struct Bar {}; // <-- notice the semicolon
}

one could reasonably expect that struct Bar {}; is allowed in modules also.

I do believe that uniformity is a key property we should be seeking in general in language design as a mechanism to reduce complexity costs and I've written about it like so in the Or patterns RFC:

> A principal way in which programming languages accumulate complexity is by adding more and more rules that a programmer needs to keep in their head to write or understand what a program does. A consequence of this is that often times, caveats and corner cases make for a language that is harder to learn, understand, and write in. To avoid such caveats, it is thus imperative that we should try to keep the language more uniform rather than less. This is an important means through which it becomes possible to give users more expressiveness but at the same time limit the cost each feature takes from our complexity budget.

> With this RFC, we try to reduce the complexity of the language by extending a feature which already exists, and which many users already know about, to another place. In a sense, giving the user more capabilities results in a negative increase in complexity.

I believe this line of reasoning applies equally in this case.

Manishearth Sep 24, 2018

Member

@petrochenkov @nrc FWIW we have a whole class of "error that the compiler knows how to fix" errors. They're kinda annoying because sometimes your reaction is "then just fix it for me and move on please!". Making them recoverable helps. However, ideally, instead of case by case trying to remove them (they can't always be removed), we may eventually try to get rustfix to handle these, and either have an interactive rustfix mode for compilation errors, or build it into the compiler so that if asked the compiler will fix it, assume it was fixed, and move on (i.e. recover from it, but also actually edit the source).

nrc Sep 24, 2018

Member

> I do believe that uniformity is a key property we should be seeking in general in language design as a mechanism to reduce complexity costs

I agree in principle, however, I think in reality this becomes a somewhat vacuous statement. The problem is that while uniformity is a good goal, it can only sensibly be a goal, not an axiom. It is often the case that a feature is related to two other features which are each fine by themselves, but the new feature cannot be uniform with both of the previous ones. The hard part of language design (in this area) is thus deciding how to be uniform, and with what. It is possible to be uniform with everything, but you end up with a language which is elegant and not very user friendly (at the limit, you get lambda calculus or something).

To get more practical, I think 'we should do this because it makes the language more uniform' is not a great argument. I want to see the trade-offs, what becomes less uniform, what becomes worse in some way, and given that, is the increase in uniformity worth the cost?

In this case we do win some uniformity between items and statements. On the downside, we allow random semicolons which there is no reason to allow and we lose some predictability in the syntax of items (currently, an item is either a block or it is terminated by a single semicolon in which case it has no block). We also open the question of why the ; in, e.g., type Foo = Bar; is necessary, if the semicolon in struct Foo {}; is optional (and complicates slightly the distinction between struct Foo; and struct Foo {}).
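
For readers less familiar with the struct Foo; versus struct Foo {} distinction mentioned here, a small sketch may help. The type and function names are invented for illustration.

```rust
#[derive(Debug, PartialEq)]
struct Unit; // unit struct: the `;` terminates the item

#[derive(Debug, PartialEq)]
struct Braced {} // braced struct with no fields: no `;` needed

fn make_both() -> (Unit, Unit, Braced) {
    // A unit struct can be constructed as a bare path or with braces.
    let a = Unit;
    let b = Unit {};
    // A braced struct always needs the braces;
    // `let c = Braced;` would be rejected.
    let c = Braced {};
    (a, b, c)
}
```

The two declarations look nearly interchangeable, yet only `Unit` also introduces a value of the same name, which is the asymmetry an optional `;` after `struct Foo {}` would sit next to.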

Finally, (and in contrast to most other languages) the semicolon in Rust is significant, it is not just a delimiter, it changes the type of an expression to void, therefore in some sense it makes sense to allow it for statements, but not for items, since it does not make sense to change the type of an item to void.
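
As a quick illustration of the semicolon being significant in expression position (Rust's unit type `()` plays the role that "void" refers to here), consider this minimal sketch with made-up function names:

```rust
// No semicolon after `2 + 3`: the block evaluates to that value.
fn keeps_value() -> i32 {
    let x = { 2 + 3 };
    x
}

// With the semicolon, `2 + 3` becomes a statement whose value is
// dropped, so the block evaluates to `()` instead.
#[allow(unused_must_use)]
fn discards_value() {
    let x: () = { 2 + 3; };
    x
}
```

Nothing comparable happens for items: a `;` after an item declaration has no value to discard.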

Anyway, I don't think any of those arguments are particularly strong, but I also don't think the argument for change is that strong either. I'm having a hard time weighing up the pros and cons.

Manishearth Sep 24, 2018

Member

> To get more practical, I think 'we should do this because it makes the language more uniform' is not a great argument

To expand on this a bit, I feel like uniformity in and of itself is not a useful motivator, however there are often better arguments that have their root in the lack of uniformity, for example if people were finding this hard to work with because they expect things to work differently that would make sense.

There's a pretty uniform model for reasoning about this feature: things in items and blocks that don't end with braces must end with semis. It's true that in statements Rust lets you do both (but not in items), but that's pretty rare in practice so it's questionable whether this inconsistency is problematic.

Centril Sep 24, 2018

Contributor

@nrc

> I agree in principle, however, I think in reality this becomes a somewhat vacuous statement. The problem is that while uniformity is a good goal, it can only sensibly be a goal, not an axiom. It is often the case that a feature is related to two other features which are each fine by themselves, but the new feature cannot be uniform with both of the previous ones.

It might be vacuous in the sense that it is perhaps an obvious statement to make. I agree that uniformity is not an axiom (if it was then it would clearly show...). It is also not an overriding principle trumping everything else. Other aspects such as ergonomics, readability, and implementation complexity must be taken into account. However, I do believe that in practical application and in reality, uniformity is important nonetheless as a guiding principle to always keep in mind.

> It is often the case that a feature is related to two other features which are each fine by themselves, but the new feature cannot be uniform with both of the previous ones.

Yet I think Rust and languages like Idris are adept at making features consistent with each other to a remarkable extent. For example, we do have expressions, patterns, and types. These are related by a) expression construction being the same syntactically (there are some caveats here unfortunately), b) expressions and patterns having types which are the same.

> The hard part of language design (in this area) is thus deciding how to be uniform, and with what. It is possible to be uniform with everything, but you end up with a language which is elegant and not very user friendly (at the limit, you get lambda calculus or something).

While not everything needs to be uniform (consistent) with everything else (because they are mostly unrelated things with few interactions), I think we should at least strive for internal consistency in different "subsystems" in a language.

For example, when considering "patterns" in isolation, I think it is important to ensure that they are uniformly applied and that we for example don't have or-patterns in arbitrary places but that they can be used in all the places where you can perform pattern matching. I feel the same about type ascription in both patterns and expressions: we should consistently allow type ascription because it makes the language easier to understand and because it is also user friendly.

> To get more practical, I think 'we should do this because it makes the language more uniform' is not a great argument. I want to see the trade-offs, what becomes less uniform, what becomes worse in some way, and given that, is the increase in uniformity worth the cost?

While we are still considering things in the abstract...
...Sometimes nothing becomes less uniform when something else becomes more uniform;
...Sometimes uniformity has no cost (for user understanding -- tho it might have implementation costs), in fact it has negative cost. I believe this is the case with features such as or-patterns.

In this case we do win some uniformity between items and statements. On the downside, we allow random semicolons which there is no reason to allow and we lose some predictability in the syntax of items (currently, an item is either a block or it is terminated by a single semicolon in which case it has no block).

By no means was uniformity the only reason to make such a change. I think there are reasons beyond uniformity to do this, such as making the syntax more user-friendly, particularly since struct Foo { ... }; is legal in C and is thus a habit among many systems programmers. Given that people also need to switch between languages, such an easily made mistake becomes a stumbling block in learning Rust, and so eliminating the hurdle can be helpful here.

We also open the question of why the ; in, e.g., type Foo = Bar; is necessary, if the semicolon in struct Foo {}; is optional (and complicates slightly the distinction between struct Foo; and struct Foo {}).

That there is a distinction between struct Foo; and struct Foo {} in the first place is an unfortunate necessity to retain backwards compatibility. In the best of worlds, I would have preferred for struct Foo;, struct Foo {}, and struct Foo {}; to all be equivalent in every respect.
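For readers who have not run into the distinction, a small sketch of how the two forms differ today. The type and function names are hypothetical.

```rust
// Hypothetical types for illustration.

#[derive(Debug, PartialEq)]
struct Unit; // unit struct: declares the type and a value named `Unit`

#[derive(Debug, PartialEq)]
struct Braced {} // braced struct with no fields: declares only the type

fn make_unit() -> Unit {
    Unit // the bare name is a value expression
}

fn also_unit() -> Unit {
    Unit {} // brace syntax is accepted for unit structs too
}

fn make_braced() -> Braced {
    Braced {} // braces are required; a bare `Braced` would not compile
}
```

So `struct Foo;` can be constructed both ways, while `struct Foo {}` only accepts the braced form, which is why the two cannot simply be treated as the same item.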

As for uniformity in this case, would it be more consistent to permit type Foo = Bar without a semicolon and is this a frequently made mistake? Are we also sure that it wouldn't cause us impediments for designing new language features moving forward?

Finally, (and in contrast to most other languages) the semicolon in Rust is significant, it is not just a delimiter, it changes the type of an expression to void, therefore in some sense it makes sense to allow it for statements, but not for items, since it does not make sense to change the type of an item to void.

While this is indeed the meaning of ; inside fn bodies, I don't think that this is a fact that many users know about or have reflected over. When you write fn ... { struct Foo {}; } the ; is associated with Foo even tho the parser does not see it that way.
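A rough sketch of both readings of `;` inside a body: the semicolon that discards a value (leaving the unit type `()`, which the quoted comment calls void), and the semicolon after a local item. The function names are hypothetical, and a current compiler may emit style warnings for this code.

```rust
/// A trailing `;` inside a block discards the value, so the block's
/// type becomes the unit type `()`.
fn block_types() -> (i32, ()) {
    let with_value = { 1 + 1 }; // the block evaluates to an i32
    let discarded = {
        let _ = 1 + 1; // statement; nothing is left as the block's value
    };
    (with_value, discarded)
}

/// An item followed by `;` inside a function body is accepted today:
/// the `;` is parsed as a separate empty statement, not as part of
/// the `struct Foo {}` item.
fn local_item() -> usize {
    struct Foo {};
    std::mem::size_of::<Foo>()
}
```

The second function is the `fn ... { struct Foo {}; }` case from the text: it compiles, even though the `;` does nothing for `Foo`.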

I'm having a hard time weighing up the pros and cons.

❤️

@Manishearth

To expand on this a bit, I feel like uniformity in and of itself is not a useful motivator, however there are often better arguments that have their root in the lack of uniformity, for example if people were finding this hard to work with because they expect things to work differently that would make sense.

I would agree in the sense that uniformity is not an end in itself. However, uniformity does matter for both how expressive a language is (i.e. how features can be combined) and in particular (in this case) how predictable and thus learnable/teachable a language is.

As for people expecting things to work differently, I do think that struct Foo {}; being legal inside fn bodies does set up expectations (as I've said before). As I've noted above, C & C++ use syntax which is quite close to Rust for defining data types. The bits that differ here are the separators we use inside { .. }, the order in which we define types / field names, and the requirement to include/exclude the trailing semicolon. While the first two differences would be far too costly to remedy, I think the trailing semicolon is an inexpensive change to make that also is not particularly inconsistent with Rust internally.

There's a pretty uniform model for reasoning about this feature: things in items and blocks that don't end with braces must end with semis. It's true that in statements Rust lets you do both (but not in items), but that's pretty rare in practice so it's questionable whether this inconsistency is problematic.

It does happen when refactoring, and I think people hit struct Foo { .. }; being an error frequently enough for it to be problematic.

@Manishearth

Manishearth Sep 24, 2018

Member

However, uniformity does matter for both how expressive a language is (i.e. how features can be combined) and in particular (in this case) how predictable and thus learnable/teachable a language is.

Right, my point is that it's better to point directly at those as a motivator, rather than the shorthand of "uniformity", which loses a lot of how it applies to the specific case.

It seems in this case the motivator is:

  • consistency with C++
  • struct Foo {...}; working in fn may set up bad expectations

I don't really see how the latter can be problematic, since structs in function bodies are themselves a rare thing. The former does make sense.

@Centril

Centril Sep 24, 2018

Contributor

@Manishearth

It seems in this case the motivator is:

I would also add to the list that when you refactor a unit / tuple struct to a braced struct, you might leave the extra ; in place -- that happens to me fairly frequently.
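As an illustration of that refactoring slip (the `Meters` type and `double` function are hypothetical), the leftover semicolon is shown only in comments so that the snippet compiles with or without the proposed change:

```rust
// Before the refactor, the tuple struct needs its terminating `;`:
//
//     struct Meters(f64);
//
// After converting to a braced struct it is easy to leave the `;` behind
// at module level:
//
//     struct Meters { value: f64 };
//
// That leftover `;` is what this RFC proposes to accept with a warning.
// The form below has no trailing `;` and is accepted either way.
struct Meters {
    value: f64,
}

/// Small helper so the refactored struct is actually used.
fn double(m: &Meters) -> f64 {
    m.value * 2.0
}
```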

@scottmcm

scottmcm Sep 27, 2018

Member

I agree with @nrc's concern here, even though I make this mistake somewhat frequently.

This feels like one of those things that's just how Rust does it, and you follow it. There are a whole bunch of those things, and I don't think allow-but-warn makes those better. For example, we could say "well, we allow ending struct fields with semicolons too, but warn about it", another thing I regularly do when switching languages. But I also don't think we should be doing that.

I might be more worried if this was something detected very late in the pipeline, but it's an error that shows up so early that it doesn't waste my time. And it's surface syntax, so even editor highlighting could plausibly pick it up to give squiggles, not to mention all the things -- even Sublime Text, these days -- that can now offer buttons to quick-fix structured suggestions from running a background check.
