Proposal: expression-based switch ("match") for pattern matching #5154

Closed
gafter opened this issue Sep 11, 2015 · 286 comments

Comments

@gafter
Member

gafter commented Sep 11, 2015

The spec has been moved

The spec for the proposed match expression has been moved to https://github.com/dotnet/roslyn/blob/future/docs/features/patterns.md

Below is a snapshot that may be out of date.


Match Expression

A match-expression is added to support switch-like semantics for an expression context.

The C# language syntax is augmented with the following syntactic productions:

relational-expression
    : match-expression
    ;

We add the match-expression as a new kind of relational-expression.

match-expression
    : relational-expression 'switch' match-block
    ;

match-block
    : '(' match-sections ','? ')'
    ;

match-sections
    : match-section
    | match-sections ',' match-section
    ;

At least one match-section is required.

match-section
    : 'case' pattern case-guard? ':' expression
    ;

case-guard
    : 'when' expression
    ;

The match-expression is not allowed as an expression-statement.

The type of the match-expression is the least common type of the expressions appearing to the right of the : tokens of the match-sections.

It is an error if the compiler can prove (using a set of techniques that has not yet been specified) that some match-section's pattern cannot affect the result because some previous pattern will always match.

At runtime, the result of the match-expression is the value of the expression of the first match-section for which the expression on the left-hand-side of the match-expression matches the match-section's pattern, and for which the case-guard of the match-section, if present, evaluates to true.
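
As a rough illustration of the first-match rule above, here is a sketch written with the switch expression that later shipped in C# 8, which uses braces and => rather than the parenthesized case ... : form in this snapshot. The Shape, Circle, and Rectangle types and the Describe method are hypothetical names invented for the example; they are not part of the proposal.

```csharp
using System;

// Hypothetical shape types, used only for this illustration.
abstract class Shape { }
sealed class Circle : Shape { public double Radius; }
sealed class Rectangle : Shape { public double Width, Height; }

static class MatchDemo
{
    // Sections are tried top to bottom. The first one whose pattern matches
    // and whose guard (if present) evaluates to true supplies the result.
    public static string Describe(Shape shape) => shape switch
    {
        Circle c when c.Radius == 0 => "point",               // guard must hold
        Circle c => "circle",                                 // any other circle
        Rectangle r when r.Width == r.Height => "square",
        Rectangle r => "rectangle",
        _ => throw new ArgumentException("unknown shape"),    // catch-all section
    };

    static void Main()
    {
        Console.WriteLine(Describe(new Circle { Radius = 0 }));
        Console.WriteLine(Describe(new Rectangle { Width = 2, Height = 3 }));
    }
}
```

Swapping the two Circle sections would make the guarded one unreachable, which is the kind of situation the "previous pattern will always match" error is meant to catch.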

@paulomorgado

It is not proposed that match-expression be added to the set of syntax forms allowed as an expression-statement.

Does this mean that there won't be an expression match statement?

@orthoxerox
Contributor

Feels linqy to me, with an infix keyword and no explicit terminators:

var areas =
    from primitive in primitives
    let area = primitive match {
        case Line l => 0
        case Rectangle r => r.Width * r.Height
        case Circle c => Math.PI * c.Radius * c.Radius
        default => throw new ApplicationException()
    }
    select new { Primitive = primitive, Area = area };

Will the absence of clause terminators cause any problems? I can't think of any cases, since case is a reserved word.

@MrJul
Contributor

MrJul commented Sep 11, 2015

It is not proposed that match-expression be added to the set of syntax forms allowed as an expression-statement.

If I understand that sentence correctly, it means that it's the only place in the language (except Expression<T> of course, but it has its own set of rules) where I can't transform x => expr into x => { return expr; }? That might be problematic, forcing the user to create a method as soon as he needs to introduce a temporary local variable, but it might simplify parsing. I understand that technically the matches aren't functions, but since they share the same syntax, one might expect the same transformations to apply.

The type of the match-expression is the least common type of the expressions appearing to the right of the => tokens of the match sections and the default-match-section.

Does this proposal include a better way to find the common type?
Would the following expression be of type int?, object or a compile-time error?

x match {
  case string s => Int32.Parse(s)
  case int i => i
  default => null
}

And finally, how are variables from patterns scoped: can we reuse the same variable name for two different patterns in the same match expression?

@MrJul
Contributor

MrJul commented Sep 11, 2015

Since a case guard is an expression, it can't introduce a variable. If the same expression is needed both in the guard and in the result, it has to be executed twice.

Consider the following example:

x match {
  case int i where SlowOperation(i) > 5 => SlowOperation(i)
  // other cases
}

Ideally SlowOperation(i) should only be computed once and stored into a local, but that's not possible. Is this acceptable? Personally, I think so; one can always fall back to a good old if statement if needed. Plus there's precedent with catch (...) when.

@paulomorgado

@MrJul, I suppose that would be something like this:

x match {
    case is int i where SlowOperation(i) > 5 => SlowOperation(i)
    // other cases
}

@gafter
Member Author

gafter commented Sep 11, 2015

@paulomorgado There is no is in the syntax for the match expression.

Does this mean that there won't be an expression match statement?

There is no proposed match statement.

@gafter
Member Author

gafter commented Sep 11, 2015

@MrJul

If I understand that sentence correctly, it means that it's the only place in the language (except Expression<T> of course, but it has its own set of rules) where I can't transform x => expr into x => { return expr; }? That might be problematic, forcing the user to create a method as soon as he needs to introduce a temporary local variable, but it might simplify parsing.

You already cannot make that transformation in expression-bodied methods. Programmers already introduce their temporary variables before a statement that needs them, and things get awkward in contexts, such as a ctor-initializer, where there is no statement list into which such a local declaration can be placed. This construct is not intended to address that issue.

Does this proposal include a better way to find the common type?
Would the following expression be of type int?, object or a compile-time error?

No, the proposal does not modify the best-common-type specification. Your example would be a compile-time error that is easy to fix.

And finally, how are variables from patterns scoped: can we reuse the same variable name for two different patterns in the same match expression?

Yes. Please see the first sentence in the "Semantics" section.
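
To make the two answers above concrete, here is a hedged sketch using the switch expression that eventually shipped in C# 8 rather than the syntax proposed in this thread. The class and method names (CommonTypeDemo, ToNullableInt, LengthOf) are hypothetical. It shows one easy fix for the common-type error, a cast on one arm, and the same variable name reused in two sections.

```csharp
using System;

static class CommonTypeDemo
{
    public static int? ToNullableInt(object x)
    {
        // With var there is no target type, so the arms need a best common type.
        // Without the cast the arms are int, int and null, and no best type exists;
        // casting one arm to int? gives the whole expression the type int?.
        var result = x switch
        {
            string s => (int?)int.Parse(s),
            int i => i,
            _ => null,
        };
        return result;
    }

    // Each section has its own scope, so the name "value" can be reused.
    public static int LengthOf(object x) => x switch
    {
        string value => value.Length,
        int[] value => value.Length,
        _ => 0,
    };

    static void Main()
    {
        Console.WriteLine(ToNullableInt("42"));
        Console.WriteLine(ToNullableInt(1.5).HasValue);
        Console.WriteLine(LengthOf(new[] { 1, 2, 3 }));
    }
}
```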

@paulomorgado

@gafter,

There is no is in the syntax for the match expression.

It felt very strange when I wrote it!

Looks like pattern matching is not allowed on match expressions.

So, other than being an expression, what distinguishes match from switch with a few ifs and gotos?

@orthoxerox
Contributor

@paulomorgado the is is implicit, o match { case Point(var x, var y) => ... } is the same as if (o is Point(var x, var y)) { ..., except it's an expression.

@SolalPirelli

Nitpicking: Wouldn't when be a better choice than where for case guards? The former is already used in exception filters which are run-time checks, while the latter is used in generics as a compile-time check.

@gafter
Member Author

gafter commented Sep 11, 2015

I think of this as similar to the linq where clause, but when would work too.

@HaloFour

Nice. So is there no separator between cases? var y = x match { case int i => i case string s => int.Parse(s) default => 0}; ? That feels a little confusing.

It feels a little weird that match is infix, but if that clarifies its use as an expression, that works. I probably also fall more in the when camp, but it's not a big difference. I might also prefer to see a wildcard pattern * rather than default, but again that's just being nit-picky. Good stuff.

@SolalPirelli

Continuing on @HaloFour's thoughts on default vs. *: is it necessary to have a special-purpose default case at all, if there is a wildcard pattern?

Suppose that there is no default case, and * => ... is simply another way to use the wildcard. Then the compiler would always be required to prove that the match is complete, and the presence of a * case is a trivial proof. Cases written after a * case would be considered dead code, just like e.g. code after an unconditional return, and trigger the same kinds of warnings.

@gafter
Member Author

gafter commented Sep 11, 2015

Perhaps there should be a comma between cases. I've changed the syntax to require that.

Using case * instead of default would work. I've changed the spec to remove the default.

@gafter
Member Author

gafter commented Sep 11, 2015

@orthoxerox's example is now

var areas =
    from primitive in primitives
    let area = primitive match {
        case Line l => 0,
        case Rectangle r => r.Width * r.Height,
        case Circle c => Math.PI * c.Radius * c.Radius,
        case * => throw new ApplicationException()
    }
    select new { Primitive = primitive, Area = area };

@HaloFour

This would be nice with declaration expressions, too.

object x = "123";
int y = x match {
    case int i => i,
    case string s where int.TryParse(s, out int i) => i,
    case * => throw new ApplicationException()
};

@SolalPirelli

@HaloFour: IIRC (not sure I'm remembering this correctly - there've been many discussions on pattern matching), this is a use case for a custom is operator, so you could do "42" match { case MyCustomIntOperator i => i, ... } and it'd return an int, no need for a guard.

@gafter
Member Author

gafter commented Sep 11, 2015

@SolalPirelli Yes, the current spec supports that, but not using the syntax you provided:

class ParsedInt
{
    public static bool operator is(string s, out int i) => int.TryParse(s, out i);
}

...

"42" match { case ParsedInt(int i) => i, ...

@AdamSpeight2008
Contributor

Why not use the existing syntax of named parameters for introducing variables within the match-section pattern?

object x = "123";
int y = match( x )
        {
          |: (byte)   => 0;
          |: (i: int) => i;
          |: (s: string) when int.TryParse(s, out int i) => i;
          |:  * => throw new ApplicationException();
        };

@aluanhaddad

@gafter
I really like what I see in this proposal. It incorporates almost everything I wanted for such a construct. With the introduction of throw expressions I think the need for multiple statement match sections is now mostly irrelevant.
Making the match keyword infix is a really nice way to highlight that this is an expression and differentiate it from control structures such as switch.
Bravo.

As others have said I do think that the when keyword would be more appropriate for guards for two reasons:

  1. It matches the syntax for exception filters and because catch clauses "match" on the type of the exception this provides a nice intuition and symmetry.
  2. In LINQ expressions it might be confusing for where to be used in a nested expression with a different meaning.

@HaloFour
Match as an infix operator may seem a bit strange but this is exactly how Scala does it and it works very well in that language. It also really makes it clear that this is an expression and cannot be used as a control structure.

@AdamSpeight2008
I think you're mixing up values and types. The syntax for named parameters is name: value.
You are suggesting name: type. This would also be completely different from every other part of the language where a type is specified, so I do not think it's a good idea. Right now the cases look like function parameter declarations, which is just right.

@AdamSpeight2008
Contributor

An alternative for when is where

@leppie
Contributor

leppie commented Sep 11, 2015

if would work as an alternative too. (edit: what is the syntax used for exception filters? whatever that is would be consistent)

@AdamSpeight2008
Contributor

@aluanhaddad I'm not mixing up values and types, since matching against a type is likely to be a common match pattern.
|: (i: int) => i; means: if this part is of type int, then store its value in i.
The match is successful, so return i. Otherwise try the next pattern.

@gafter
Member Author

gafter commented Sep 11, 2015

@AdamSpeight2008

Why not use the existing syntax of named parameters?

This issue does not propose any syntax for patterns. That is specified in #206. But the short answer is that these are more akin to parameter declarations than they are to named arguments.

@gafter
Member Author

gafter commented Nov 12, 2015

@alrz the comma is coming back.

@alrz
Member

alrz commented Nov 12, 2015

@gafter That is not great. Just because of the case expression? What was wrong with t match (var a, var b) : a + b as a shorthand for t match (case (var a, var b): a + b)? Is using switch in the match expression that important?

@gafter
Member Author

gafter commented Nov 12, 2015

@alrz sigh

@alrz
Member

alrz commented Nov 12, 2015

@gafter Actually, when I think about it, that was really helpful.

@gafter
Member Author

gafter commented Nov 12, 2015

LOL 😆

@gafter
Member Author

gafter commented Nov 12, 2015

Let's see how it works out without adding new keywords. That option remains open if it appears preferable after trying this version.

@aluanhaddad

Using case for a single pattern match expression feels very consistent and intuitive.

@jonschoning

I wish you guys would focus on semantics instead of syntax. I wouldn't care if zombodash was the chosen keyword; what matters is the semantics. Every language that supports pattern matching (and not just destructuring; even ES6/JavaScript has destructuring) does it slightly differently, which leads to different benefits and pains. Can you explain what those tradeoffs are, where we sit in that spectrum, and justify the decision? What kinds of tests will pass or fail under what assumptions, what is an error, what is a warning, etc.?

@alrz
Member

alrz commented Nov 12, 2015

@jonschoning I'd say zombodash is bloody confusing. You are ignoring the fact that, before the compiler, it's you who write the code and hundreds who read it after that. Actually, I believe that syntax implies the semantics, hence it is just as important as other aspects. If they were using zombodash instead of var you would check the documentation for sure, but still, you wouldn't mind, right? Why bother. Good for JavaScript that it has destructuring, by the way. "Can you explain .. what is an error, warning, etc": this is not documentation, as you should know by now. Just read the spec. "I wish you guys would focus on semantics instead of syntax": if you hadn't said that, I'm sure they would have been pretty much doomed to failure.

@gafter
Member Author

gafter commented Nov 12, 2015

The syntax is the easiest aspect to adjust based on experience.

@alrz
Member

alrz commented Nov 13, 2015

@gafter Isn't it proposed that case-expression would be part of statement-expression, so that obj case A a : a.F(); would be possible as a cast, instead of ((A)obj).F(); or (obj as A).F();? Would it then throw NullReferenceException, InvalidCastException, or some other exception?

@gafter
Member Author

gafter commented Nov 13, 2015

@alrz that form is proposed to require that the match can be proven to always succeed.

gafter removed this from the C# 7 and VB 15 milestone Nov 20, 2015
gafter added the "4 - In Review" label Nov 20, 2015

@agocke
Member

agocke commented Nov 24, 2015

@jonschoning 👍 I'm inclined to declare this a "semantics-only" thread from now on.

Syntax is easy. I can test 100 syntax changes a day. Semantics requires binding work and forethought, both of which are hard.

@agocke
Member

agocke commented Nov 24, 2015

Reading the latest spec, am I correct that

throw throw throw null;

Would now be a legal expression in some contexts?

@gafter
Member Author

gafter commented Nov 24, 2015

@agocke No. The spec says

A throw expression is allowed in only the following contexts:

  • As the second or third operand of a ternary conditional operator ?:
  • As the second operand of a null coalescing operator ??
  • After the colon of a match section
  • As the body of an expression-bodied lambda or method.

"the operand of throw" is not one of these contexts.

@agocke
Member

agocke commented Nov 24, 2015

@gafter I see, thanks.

@gafter
Member Author

gafter commented Jan 28, 2016

This has been folded into the pattern matching spec.

@Pzixel

Pzixel commented Jul 25, 2017
