Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Parsing ambiguity in switch expressions: => might be part of guard expression #2672

Closed
stereotype441 opened this issue Nov 30, 2022 · 7 comments · Fixed by #2675
Closed

Parsing ambiguity in switch expressions: => might be part of guard expression #2672

stereotype441 opened this issue Nov 30, 2022 · 7 comments · Fixed by #2675
Labels
patterns Issues related to pattern matching.

Comments

@stereotype441
Copy link
Member

The following code is ambiguous:

f(x, y, z) => switch (x) { _ when y + () => () => z };

It could either mean:

f(x, y, z) => switch (x) { _ when (y + ()) => (() => z) };

Or:

f(x, y, z) => switch (x) { _ when (y + () => ()) => z };

(Either of which would successfully type check because y has type dynamic.)

More generally, in the parser we have the rule that when an expression is expected, if (...) => is found, the parser assumes it's looking at a function expression and greedily consumes the =>. Which means that we currently can't parse this, even though technically it's unambiguous:

f(x, y, z) => switch (x) { _ when y + () => z };

Because the parser interprets it as:

f(x, y, z) => switch (x) { _ when (y + () => z) };

(Which is missing the => expression part of the switchExpressionCase.)

There's one exception: if (...) => appears in a constructor initializer, and it's not inside any grouping constructs, then the parser assumes it's looking at a parenthesized expression or a record, so it doesn't consume the =>. This means that the => winds up getting interpreted as the beginning of the constructor body (which is kind of weird because non-factory constructor bodies aren't allowed to use => syntax, but who am I to judge?)

I propose that we address this ambiguity in a similar fashion: if (...) => appears in a guard expression inside a switch expression, and it's not inside any grouping constructs, then let's specify that the construct should be interpreted as a parenthesized expression or a record, thus allowing the => to be interpreted as terminating the guardedPattern. Which means that

f(x, y, z) => switch (x) { _ when y + () => () => z };

will be interpreted as:

f(x, y, z) => switch (x) { _ when (y + ()) => (() => z) };

And

f(x, y, z) => switch (x) { _ when y + () => z };

will be interpreted as

f(x, y, z) => switch (x) { _ when (y + ()) => z };

However, this will be rejected:

f(x, y, z) => switch (x) { _ when y + () => z => z };

Because the parser will interpret it as:

f(x, y, z) => switch (x) { _ when (y + ()) => (z => z) };

(Which is illegal because z => z isn't a valid expression.)

@stereotype441 stereotype441 added the patterns Issues related to pattern matching. label Nov 30, 2022
@stereotype441
Copy link
Member Author

@munificent
Copy link
Member

There's one exception: if (...) => appears in a constructor initializer, and it's not inside any grouping constructs, then the parser assumes it's looking at a parenthesized expression or a record, so it doesn't consume the =>. This means that the => winds up getting interpreted as the beginning of the constructor body (which is kind of weird because non-factory constructor bodies aren't allowed to use => syntax, but who am I to judge?)

I remember when that first dark corner of the language was first discovered while => functions were being added to the language. Interestingly, I can't find anything in the specification that actually explains it. The implementations do as you say, but I can't find anything in the spec that tells them to. 🤷

@eernstg, is it hiding in there somewhere?

I agree with your proposed resolution. I'll try to come up with a nice way to word it.

munificent added a commit that referenced this issue Dec 1, 2022
- Specify exhaustiveness checking of switch elements.
- Resolve ambiguity with `=>` in switch expression guards.
- Compile error if map pattern has identical keys.

Fix #2672.
Fix #2657.
@eernstg
Copy link
Member

eernstg commented Dec 1, 2022

@munificent wrote, about the treatment of => functions in the parser:

is it hiding in there somewhere?

Not in the language specification. I noticed this glaring ambiguity in the grammar when I created the spec parser, and Dart.g does actually handle => function literals in a different way than the language specification grammar: There is a non-terminal <functionExpression> which is derived directly from <expression> (here) and which will derive all function literals with an => body. Function literals with a {} body are derived from <functionPrimary>, which is derived from <primary> (here). This means that (...) {...} can occur basically anywhere, but (...) => ... must occur very nearly at the top level of an expression.

This discrepancy between the language specification grammar and the spec grammar does not cause any known parsing failures. In particular, a (...) => ... function literal can be an actual argument in a function call, and it can be the initializing expression of a variable declaration, no problem, and those cases are a vast majority of the cases where this type of function literal occurs.

At the same time, we don't have to worry about whether () => 1.toString() means (() => 1).toString() or () => (1.toString()), because () => 1 isn't a <primary> any more (which it is in the language specification grammar).

@jensjoha did do some work on an update to the parser following this approach, but it was never sufficiently near the top of the agenda to make it into the specification, or to get landed in the implementation. I'd love to get it done, of course.

If we assume this modified grammar then we could change the when clause of a <guardedPattern> to have a <conditionalExpression> rather than an <expression>. This would rule out the function literals using =>, throw expressions, assignments (including pattern assignments), and cascades. I don't think it's a big problem that those things can't be the expression of a when clause (they can still be used with parentheses, e.g., case 1 when (throw 2):).

@lrhn
Copy link
Member

lrhn commented Dec 1, 2022

Another issue in initializer lists is not =>, but block bodies. They're not impossible to parse, but they are hard to read and require potentially long look-ahead.

class C {
  Object x;
  C() : x = (y) { 
    ............... 
  };
}

This looks like a x is initialized to y, the followed by a constructor body.
It's really a closures with y as parameter, followed by a ; body.

The only difference between the two is the trailing ;.

(Even worse as x = (y) => (z) { body }. We can still distinguish based on what the } is followed by, because a constructor initializer list still needs to end in a body or ;, and that can only be followed by a declaration or a }, and I think we can still distinguish those from a continuation of an expression.)

That's reason enough, to me, to not allow function literals as the expression of an initializer list element. It's too unreadable in a position where any expression can be immediately followed by a block.
It might be unambiguous (possibly mostly by accident), but it's not readable.

I'd be happy requiring functions literals occurring un-delimited in when clauses to be parenthesized too.

@stereotype441
Copy link
Member Author

In https://dart-review.googlesource.com/c/sdk/+/273020 I'm using the parser's mayParseFunctionExpressions boolean to prohibit function expressions inside the when clauses of switch expressions (except when enclosed in grouping constructs). That's sufficient to address the ambiguity.

@lrhn's comment above:

I'd be happy requiring functions literals occurring un-delimited in when clauses to be parenthesized too.

doesn't mention switch expressions specifically. Should I extend the restriction to all when clauses?

@lrhn
Copy link
Member

lrhn commented Dec 1, 2022

It's true that it's only really expression-switch when clauses where it matters.
We can easily explain to users that it's only because of the following => that this restriction exists (if anyone ever notices, it's not like a boolean expression is a common place to have free-floating function literals).

I'd be fine with just restricting it there. I'd probably be fine with restricting it in every when clause too.

copybara-service bot pushed a commit to dart-lang/sdk that referenced this issue Dec 1, 2022
…sing.

We need to prohibit function expressions during guard clauses in
switch expressions, otherwise the `=>` might be misinterpreted.

Fixes #50591.

Bug: #50591, dart-lang/language#2672
Change-Id: Ie5703ac7049557798b19778281429e8a591cf09f
Reviewed-on: https://dart-review.googlesource.com/c/sdk/+/273020
Reviewed-by: Konstantin Shcheglov <scheglov@google.com>
Commit-Queue: Paul Berry <paulberry@google.com>
@munificent
Copy link
Member

munificent commented Dec 1, 2022

I'd be fine with just restricting it there. I'd probably be fine with restricting it in every when clause too.

I went with the latter. It felt simpler to me to just employ the rule consistently across all guards. I'm not sure if I can fully motivate that choice, but one meta-rule I use sometimes is "How long would the error message to explain this have to be?" and "guards can't end in function expressions" is marginally shorter than "guards in switch expressions can't end in function expressions".

munificent added a commit that referenced this issue Dec 5, 2022
* [patterns] More small fixes.

- Specify exhaustiveness checking of switch elements.
- Resolve ambiguity with `=>` in switch expression guards.
- Compile error if map pattern has identical keys.

Fix #2672.
Fix #2657.

* Clarify when restriction on "=>" in guards comes into play.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
patterns Issues related to pattern matching.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants