Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve parser recovery around nullable types in patterns #72805

Open
wants to merge 34 commits into
base: main
Choose a base branch
from
Open
Show file tree
Hide file tree
Changes from 28 commits
Commits
Show all changes
34 commits
Select commit Hold shift + click to select a range
c058b16
Improve parser recovery around nullable types in patterns
DoctorKrolic Mar 29, 2024
af8391d
Add test
DoctorKrolic Mar 29, 2024
233adf0
Replace with more specific checks
DoctorKrolic Mar 29, 2024
21844c3
Already covered by `IsLiteral` check
DoctorKrolic Mar 29, 2024
7894262
Remove incorrect cases
DoctorKrolic Mar 29, 2024
db65d50
Fix existing semantic tests
DoctorKrolic Mar 29, 2024
dc7410f
Fix test
DoctorKrolic Mar 30, 2024
5d413b7
Simplify
DoctorKrolic Mar 30, 2024
212c0ae
Add binding tests
DoctorKrolic Mar 30, 2024
5c8eb47
Add array binding tests
DoctorKrolic Mar 30, 2024
bedc825
Fix test
DoctorKrolic Mar 30, 2024
568985b
Add `;` to the list of tokens, which end pattern
DoctorKrolic Mar 30, 2024
59a4e77
Fix `await` parsing
DoctorKrolic Mar 30, 2024
1d92ba9
Add more reference type tests
DoctorKrolic Apr 1, 2024
45e87df
Merge branch 'dotnet:main' into nullable-type-in-pattern-recovery
DoctorKrolic Apr 2, 2024
382b1c0
Add tests where `await` is a pattern variable
DoctorKrolic Apr 2, 2024
e4869d9
Add tests from main
DoctorKrolic Apr 2, 2024
00eaf58
Merge branch 'main' into nullable-type-in-pattern-recovery
DoctorKrolic Apr 2, 2024
a8d1fe5
Add more tests with missing comma in list pattern followed by a literal
DoctorKrolic Apr 13, 2024
fd940ed
Explain my life choices
DoctorKrolic Apr 13, 2024
4b7079f
Add more tests around predefined types
DoctorKrolic Apr 13, 2024
4c404b8
Use single flag
DoctorKrolic May 5, 2024
bd87f6f
Better comment
DoctorKrolic May 5, 2024
b5e5fd9
Merge remote-tracking branch 'upstream/main' into nullable-type-in-pa…
CyrusNajmabadi May 5, 2024
7e7203d
break into pieces
CyrusNajmabadi May 5, 2024
bd74493
Await handling
CyrusNajmabadi May 5, 2024
2c90fdb
rename
CyrusNajmabadi May 5, 2024
80410e9
lint
CyrusNajmabadi May 5, 2024
006b7b9
Add test with conditional expression in switch expression
DoctorKrolic May 7, 2024
5dcc50a
Merge branch 'nullable-type-in-pattern-recovery' of https://github.co…
DoctorKrolic May 7, 2024
3a6f3bd
Remove wrong assumption
DoctorKrolic May 7, 2024
3c74dfd
Typo
DoctorKrolic May 7, 2024
19ee361
Fix semantic tests
DoctorKrolic May 7, 2024
13bf31c
Fix test
DoctorKrolic May 7, 2024
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Jump to
Jump to file
Failed to load files.
Diff view
Diff view
102 changes: 87 additions & 15 deletions src/Compilers/CSharp/Portable/Parser/LanguageParser.cs
Original file line number Diff line number Diff line change
Expand Up @@ -7015,20 +7015,20 @@ private enum ParseTypeMode
FirstElementOfPossibleTupleLiteral,
}

private TypeSyntax ParseType(ParseTypeMode mode = ParseTypeMode.Normal)
private TypeSyntax ParseType(ParseTypeMode mode = ParseTypeMode.Normal, bool inSwitchArmPattern = false)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

whenIsKeyword got renamed to inSwitchArmPattern to try to be clearer. We pass 'true' in here when parsign the pattern for a switch-statement or switch-expression pattern. And it controls how we view 'where' and a few other error recovery rules (including new ones in this pr).

{
if (this.CurrentToken.Kind == SyntaxKind.RefKeyword)
{
return _syntaxFactory.RefType(
this.EatToken(),
this.CurrentToken.Kind == SyntaxKind.ReadOnlyKeyword ? this.EatToken() : null,
ParseTypeCore(ParseTypeMode.AfterRef));
ParseTypeCore(ParseTypeMode.AfterRef, inSwitchArmPattern));
}

return ParseTypeCore(mode);
return ParseTypeCore(mode, inSwitchArmPattern);
}

private TypeSyntax ParseTypeCore(ParseTypeMode mode)
private TypeSyntax ParseTypeCore(ParseTypeMode mode, bool inSwitchArmPattern)
{
NameOptions nameOptions;
switch (mode)
Expand Down Expand Up @@ -7069,7 +7069,7 @@ private TypeSyntax ParseTypeCore(ParseTypeMode mode)
{
case SyntaxKind.QuestionToken when canBeNullableType():
{
var question = EatNullableQualifierIfApplicable(mode);
var question = EatNullableQualifierIfApplicable(mode, inSwitchArmPattern);
if (question != null)
{
type = _syntaxFactory.NullableType(type, question);
Expand Down Expand Up @@ -7142,7 +7142,9 @@ bool canBeNullableType()
return type;
}

private SyntaxToken EatNullableQualifierIfApplicable(ParseTypeMode mode)
/// <param name="inSwitchArmPattern">If this is in the patternsection of a switch-expression arm or a
/// switch-statement case-clause.</param>
private SyntaxToken EatNullableQualifierIfApplicable(ParseTypeMode mode, bool inSwitchArmPattern)
{
Debug.Assert(this.CurrentToken.Kind == SyntaxKind.QuestionToken);
using var resetPoint = this.GetDisposableResetPoint(resetOnDispose: false);
Expand All @@ -7164,18 +7166,88 @@ bool canFollowNullableType()
case ParseTypeMode.AfterIs:
case ParseTypeMode.DefinitePattern:
case ParseTypeMode.AsExpression:
// These contexts might be a type that is at the end of an expression. In these contexts we only
// permit the nullable qualifier if it is followed by a token that could not start an
// expression, because for backward compatibility we want to consider a `?` token as part of the
// `?:` operator if possible.
// We are currently after `?` token after a nullable type pattern and need to decide how to
// parse what we see next. In the case of an identifier (e.g. `x ? a` there are two ways we can
// see things
//
// Similarly, if we have `T?[]` we do want to treat that as an array of nullables (following
// existing parsing), not a conditional that returns a list.
// 1. As a start of conditional expression, e.g. `var a = obj is string ? a : b`
// 2. As a designation of a nullable-typed pattern, e.g. `if (obj is string? str)`
//
// Since nullable types (no matter reference of value types) are not valid in patterns by
DoctorKrolic marked this conversation as resolved.
Show resolved Hide resolved
// default we are biased towards the first option and consider case 2 only for error recovery
// purposes (if we parse here as nullable type pattern an error will be reported during
// binding). This condition checks for simple cases, where we better use option 2 and parse a
// nullable-typed pattern
if (IsTrueIdentifier(this.CurrentToken))
{
// If we're in a cast statement/expression arm, then we have:
//
// `switch { case X ? y` or
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This assumption is not correct. The following is valid code:

        switch(i3) {
            case b ? i1 : i2: break;
        }

This comment needs to be corrected, and we should re-examine this approach to ensure that we're not breaking anything.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The "why on Earth this is legal" question should have probably been asked N years ago when switch statements were designed) I was sure that conditional expressions in case labels aren't semantically legal and thus there is no difference whether this is parsed as a conditional or a nullable pattern + label statement. There is already a test in the codebase for that case and I initially wrote a comment explaining how these 2 parse strategies both lead to a compiler error thus are identical for our purposes, had to revert that change. And I'm somewhat surprised that only a very few new test cases now are back to bad state after removing this assumption, so I don't think we should do additional work to restore them

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The "why on Earth this is legal" question should have probably been asked N years ago when switch statements were designed)

So, this is legal because cases allow for patterns. And one sort of pattern is the 'constant pattern' (that's how x is 0 works for example). So these places allow for any sort of constant expression (including conditional expressions). Note: for conditional expressions to be constant, their condition itself must be constant. But that can often normally happen, esp. in generated code, or code referencing constants that are conditionally compiled in.

//
// e switch { X ? y `
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Similarly, this assumption is closer to correct, but isn't always:

        _ = i3 switch {
                (b ? i1 : i2) => 3
        };

You do need the parentheses to force parse in this manner, but I'm concerned that we'll still end up with bad behavior here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Given that this requires parentheses, it wasn't affected even with the wrong assumption. Added a test for this case

//
// In both these cases, this isn't a conditional expression and should be treated as a
// nullable type pattern (with a a good binding error to be reported later).
if (inSwitchArmPattern)
return true;

// In a non-async method, `await` is a simple identifier. However, if we see `x ? await`
// it's almost certainly the start of an `await expression` in a conditional expression
// (e.g. `x is Y ? await ...`), not a nullable type pattern (since users would not use
// 'await' as the name of a variable). So just treat this as a conditional expression.
if (this.CurrentToken.ContextualKind == SyntaxKind.AwaitKeyword)
DoctorKrolic marked this conversation as resolved.
Show resolved Hide resolved
return false;

var nextTokenKind = PeekToken(1).Kind;

// These token either 100% end a pattern or start a new one:

// A literal token starts a new pattern. Can occur in list pattern with missing separation
// `,`. For example, in `x is [int[]? arr 5]` we'd prefer to parse this as a missing `,`
// after `arr`
if (SyntaxFacts.IsLiteral(nextTokenKind))
return true;

// A predefined type is basically the same case: `x is [string[]? slice char ch]`. We'd
// prefer to parse this as a missing `,` after `slice`.
if (SyntaxFacts.IsPredefinedType(nextTokenKind))
return true;

// `)`, `]` and `}` obviously end a pattern. For example:
// `if (x is int? i)`, `indexable[x is string? s]`, `x is { Prop: Type? typeVar }`
if (nextTokenKind is SyntaxKind.CloseParenToken or SyntaxKind.CloseBracketToken or SyntaxKind.CloseBraceToken)
return true;

// `{` starts a new pattern. For example: `x is A? { ...`. Note, that `[` and `(` are not
// in the list because they can start an invocation/indexer
if (nextTokenKind == SyntaxKind.OpenBraceToken)
return true;

// `,` ends a pattern in list/property pattern. For example `x is { Prop1: Type1? type, Prop2: Type2 }` or
// `x is [Type1? type, ...]`
if (nextTokenKind == SyntaxKind.CommaToken)
return true;

// `;` end a pattern if it finishes an expression statement: var y = x is bool? b;
if (nextTokenKind == SyntaxKind.SemicolonToken)
return true;

// EndOfFileToken is obviously the end of parsing. We are better parsing a pattern rather
// than an unfinished conditional expression
if (nextTokenKind == SyntaxKind.EndOfFileToken)
return true;

return false;
}

// If nothing from above worked permit the nullable qualifier if it is followed by a token that
// could not start an expression If we have `T?[]` we do want to treat that as an array of
// nullables (following existing parsing), not a conditional that returns a list.
return !CanStartExpression() || this.CurrentToken.Kind is SyntaxKind.OpenBracketToken;
case ParseTypeMode.NewExpression:
// A nullable qualifier is permitted as part of the type in a `new` expression.
// e.g. `new int?()` is allowed. It creates a null value of type `Nullable<int>`.
// Similarly `new int? {}` is allowed.
// A nullable qualifier is permitted as part of the type in a `new` expression. e.g. `new
// int?()` is allowed. It creates a null value of type `Nullable<int>`. Similarly `new int? {}`
// is allowed.
return
this.CurrentToken.Kind is SyntaxKind.OpenParenToken or // ctor parameters
SyntaxKind.OpenBracketToken or // array type
Expand Down
57 changes: 32 additions & 25 deletions src/Compilers/CSharp/Portable/Parser/LanguageParser_Patterns.cs
Original file line number Diff line number Diff line change
Expand Up @@ -50,21 +50,21 @@ private bool ConvertExpressionToType(ExpressionSyntax expression, [NotNullWhen(t
};
}

private PatternSyntax ParsePattern(Precedence precedence, bool afterIs = false, bool whenIsKeyword = false)
private PatternSyntax ParsePattern(Precedence precedence, bool afterIs = false, bool inSwitchArmPattern = false)
{
return ParseDisjunctivePattern(precedence, afterIs, whenIsKeyword);
return ParseDisjunctivePattern(precedence, afterIs, inSwitchArmPattern);
}

private PatternSyntax ParseDisjunctivePattern(Precedence precedence, bool afterIs, bool whenIsKeyword)
private PatternSyntax ParseDisjunctivePattern(Precedence precedence, bool afterIs, bool inSwitchArmPattern)
{
PatternSyntax result = ParseConjunctivePattern(precedence, afterIs, whenIsKeyword);
PatternSyntax result = ParseConjunctivePattern(precedence, afterIs, inSwitchArmPattern);
while (this.CurrentToken.ContextualKind == SyntaxKind.OrKeyword)
{
result = _syntaxFactory.BinaryPattern(
SyntaxKind.OrPattern,
result,
ConvertToKeyword(this.EatToken()),
ParseConjunctivePattern(precedence, afterIs, whenIsKeyword));
ParseConjunctivePattern(precedence, afterIs, inSwitchArmPattern));
}

return result;
Expand Down Expand Up @@ -101,16 +101,16 @@ private bool LooksLikeTypeOfPattern()
return false;
}

private PatternSyntax ParseConjunctivePattern(Precedence precedence, bool afterIs, bool whenIsKeyword)
private PatternSyntax ParseConjunctivePattern(Precedence precedence, bool afterIs, bool inSwitchArmPattern)
{
PatternSyntax result = ParseNegatedPattern(precedence, afterIs, whenIsKeyword);
PatternSyntax result = ParseNegatedPattern(precedence, afterIs, inSwitchArmPattern);
while (this.CurrentToken.ContextualKind == SyntaxKind.AndKeyword)
{
result = _syntaxFactory.BinaryPattern(
SyntaxKind.AndPattern,
result,
ConvertToKeyword(this.EatToken()),
ParseNegatedPattern(precedence, afterIs, whenIsKeyword));
ParseNegatedPattern(precedence, afterIs, inSwitchArmPattern));
}

return result;
Expand Down Expand Up @@ -155,21 +155,21 @@ private bool ScanDesignation(bool permitTuple)
}
}

private PatternSyntax ParseNegatedPattern(Precedence precedence, bool afterIs, bool whenIsKeyword)
private PatternSyntax ParseNegatedPattern(Precedence precedence, bool afterIs, bool inSwitchArmPattern)
{
if (this.CurrentToken.ContextualKind == SyntaxKind.NotKeyword)
{
return _syntaxFactory.UnaryPattern(
ConvertToKeyword(this.EatToken()),
ParseNegatedPattern(precedence, afterIs, whenIsKeyword));
ParseNegatedPattern(precedence, afterIs, inSwitchArmPattern));
}
else
{
return ParsePrimaryPattern(precedence, afterIs, whenIsKeyword);
return ParsePrimaryPattern(precedence, afterIs, inSwitchArmPattern);
}
}

private PatternSyntax ParsePrimaryPattern(Precedence precedence, bool afterIs, bool whenIsKeyword)
private PatternSyntax ParsePrimaryPattern(Precedence precedence, bool afterIs, bool inSwitchArmPattern)
{
// handle common error recovery situations during typing
var tk = this.CurrentToken.Kind;
Expand All @@ -192,10 +192,10 @@ private PatternSyntax ParsePrimaryPattern(Precedence precedence, bool afterIs, b
switch (CurrentToken.Kind)
{
case SyntaxKind.OpenBracketToken:
return this.ParseListPattern(whenIsKeyword);
return this.ParseListPattern(inSwitchArmPattern);
case SyntaxKind.DotDotToken:
return _syntaxFactory.SlicePattern(EatToken(),
IsPossibleSubpatternElement() ? ParsePattern(precedence, afterIs: false, whenIsKeyword) : null);
IsPossibleSubpatternElement() ? ParsePattern(precedence, afterIs: false, inSwitchArmPattern) : null);
case SyntaxKind.LessThanToken:
case SyntaxKind.LessThanEqualsToken:
case SyntaxKind.GreaterThanToken:
Expand All @@ -214,7 +214,14 @@ private PatternSyntax ParsePrimaryPattern(Precedence precedence, bool afterIs, b
TypeSyntax? type = null;
if (LooksLikeTypeOfPattern())
{
type = this.ParseType(afterIs ? ParseTypeMode.AfterIs : ParseTypeMode.DefinitePattern);
// Given that we are in a pattern parsing context, the only way `inSwitchArmPattern` flag can be true is if we are in a top level switch pattern, e.g.:
// num switch
// {
// 1 $$ => ... // `when` can be a keyword at `$$` and we are in a top-level switch pattern context
// (2 $$) => ... // `when` cannot be a keyword at `$$` since we are inside a parenthesized pattern. But that also means we are not in a top-level switch pattern context
// }
type = this.ParseType(
afterIs ? ParseTypeMode.AfterIs : ParseTypeMode.DefinitePattern, inSwitchArmPattern);
if (type.IsMissing || !CanTokenFollowTypeInPattern(precedence))
{
// either it is not shaped like a type, or it is a constant expression.
Expand All @@ -223,7 +230,7 @@ private PatternSyntax ParsePrimaryPattern(Precedence precedence, bool afterIs, b
}
}

var pattern = ParsePatternContinued(type, precedence, whenIsKeyword);
var pattern = ParsePatternContinued(type, precedence, inSwitchArmPattern);
if (pattern != null)
return pattern;

Expand Down Expand Up @@ -262,14 +269,14 @@ bool CanTokenFollowTypeInPattern(Precedence precedence)
}
}

private PatternSyntax? ParsePatternContinued(TypeSyntax? type, Precedence precedence, bool whenIsKeyword)
private PatternSyntax? ParsePatternContinued(TypeSyntax? type, Precedence precedence, bool inSwitchArmPattern)
{
if (type?.Kind == SyntaxKind.IdentifierName)
{
var typeIdentifier = (IdentifierNameSyntax)type;
var typeIdentifierToken = typeIdentifier.Identifier;
if (typeIdentifierToken.ContextualKind == SyntaxKind.VarKeyword &&
(this.CurrentToken.Kind == SyntaxKind.OpenParenToken || this.IsValidPatternDesignation(whenIsKeyword)))
(this.CurrentToken.Kind == SyntaxKind.OpenParenToken || this.IsValidPatternDesignation(inSwitchArmPattern)))
{
// we have a "var" pattern; "var" is not permitted to be a stand-in for a type (or a constant) in a pattern.
var varToken = ConvertToKeyword(typeIdentifierToken);
Expand All @@ -295,7 +302,7 @@ bool CanTokenFollowTypeInPattern(Precedence precedence)
var closeParenToken = this.EatToken(SyntaxKind.CloseParenToken);

parsePropertyPatternClause(out PropertyPatternClauseSyntax? propertyPatternClause0);
var designation0 = TryParseSimpleDesignation(whenIsKeyword);
var designation0 = TryParseSimpleDesignation(inSwitchArmPattern);

if (type == null &&
propertyPatternClause0 == null &&
Expand Down Expand Up @@ -333,12 +340,12 @@ bool CanTokenFollowTypeInPattern(Precedence precedence)
{
return _syntaxFactory.RecursivePattern(
type, positionalPatternClause: null, propertyPatternClause,
TryParseSimpleDesignation(whenIsKeyword));
TryParseSimpleDesignation(inSwitchArmPattern));
}

if (type != null)
{
var designation = TryParseSimpleDesignation(whenIsKeyword);
var designation = TryParseSimpleDesignation(inSwitchArmPattern);
if (designation != null)
return _syntaxFactory.DeclarationPattern(type, designation);

Expand Down Expand Up @@ -431,7 +438,7 @@ private CSharpSyntaxNode ParseExpressionOrPatternForSwitchStatement()
{
var savedState = _termState;
_termState |= TerminatorState.IsExpressionOrPatternInCaseLabelOfSwitchStatement;
var pattern = ParsePattern(Precedence.Conditional, whenIsKeyword: true);
var pattern = ParsePattern(Precedence.Conditional, inSwitchArmPattern: true);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

one of hte two locations where 'true' is passed in. Here we're parsing case clauses for switch-statements.

_termState = savedState;
return ConvertPatternToExpressionIfPossible(pattern);
}
Expand Down Expand Up @@ -589,7 +596,7 @@ SeparatedSyntaxList<SwitchExpressionArmSyntax> parseSwitchExpressionArms()

var savedState = _termState;
_termState |= TerminatorState.IsPatternInSwitchExpressionArm;
var pattern = ParsePattern(Precedence.Coalescing, whenIsKeyword: true);
var pattern = ParsePattern(Precedence.Coalescing, inSwitchArmPattern: true);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

one of hte two locations where 'true' is passed in. Here we're parsing the arms for switch-expressions.

_termState = savedState;

// We use a precedence that excludes lambdas, assignments, and a conditional which could have a
Expand Down Expand Up @@ -628,7 +635,7 @@ SeparatedSyntaxList<SwitchExpressionArmSyntax> parseSwitchExpressionArms()
}
}

private ListPatternSyntax ParseListPattern(bool whenIsKeyword)
private ListPatternSyntax ParseListPattern(bool inSwitchArmPattern)
{
var openBracket = this.EatToken(SyntaxKind.OpenBracketToken);
var list = this.ParseCommaSeparatedSyntaxList(
Expand All @@ -645,7 +652,7 @@ private ListPatternSyntax ParseListPattern(bool whenIsKeyword)
openBracket,
list,
this.EatToken(SyntaxKind.CloseBracketToken),
TryParseSimpleDesignation(whenIsKeyword));
TryParseSimpleDesignation(inSwitchArmPattern));
}
}
}