Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Treatment of positional (e.g. tuple) patterns vs parenthesized expressions #82

Closed
gafter opened this issue Feb 13, 2017 · 29 comments
Closed
Assignees
Labels

Comments

@gafter
Copy link
Member

gafter commented Feb 13, 2017

I'm afraid we may have to take a breaking change in the handling of constant patterns when we begin to support positional patterns such as tuple patterns.

Recall that we are treating a type such as

class Point
{
    public void Deconstruct(out int X, out int Y) { ... }
}

as being capable of being deconstructed

Point p = ...;
(int x, int y) = p;

And similarly we will want to support such types in pattern-matching

var p = new Point(...);
switch (p)
{
    case (3, 4): ...
}

However, for the purposes of pattern-matching we also want to support types that decompose into zero or one value. An ambiguity arises for decomposition to one value:

class Price
{
    public void Deconstruct(out int Cents) { ... }
}

Now when we write

var p = new Price(...);
switch (p)
{
    case (3): ...
}

Here, it appears to the compiler that we are trying to match an int of value 3. Since a value of type Price is never an int of value 3, this fails.

Of course, we want it to work. They question is how? Should case 3: also work the same way as case (3):?

@gafter gafter added the Spec label Feb 13, 2017
@gafter gafter self-assigned this Feb 13, 2017
@alrz
Copy link
Contributor

alrz commented Feb 13, 2017

As mentioned in roslyn repo before: what if we require a trailing comma to disambiguate a oneple?

@gafter
Copy link
Member Author

gafter commented Feb 14, 2017

@alrz The point of this issue is to ask the question: "what if we do not require a trailing comma" (or any other extra syntactic noise just to satisfy the compiler)

@MadsTorgersen
Copy link
Contributor

The meaning of (3) as a constant expression in a switch case is already determined long ago, and I am loath to break that. However, I think we do have a couple of other options:

  1. Don't support tuple patterns for womples. Instead, you'll have to use the type explicitly: case Price(3):
  2. Some other syntax, e.g., the trailing comma suggested by @alrz (though I don't like that one much)
  3. Your Price example above would fail today, as long as (3) doesn't implicitly convert to Price. In such failure cases we could fall back to look for a unary deconstructor to see if one matches.

I tentatively like 3 a lot.

I think this is feature discussion. I'll reclassify as such, and @gafter if you disagree, it goes to show that we need to firm up classification! 😄

@mattwar
Copy link

mattwar commented Feb 14, 2017

I'm loathe to introduce anything that gives credence to oneples.

@gafter
Copy link
Member Author

gafter commented Feb 14, 2017

@MadsTorgersen Using the type explicitly is already required for the runtime-type tests that don't use ITuple (e.g. when switching on object). The only thing that wouldn't work for are the single-element value tests that are intended to be supported via ITuple, for example ValueTuple. I would think we could just say that isn't supported.

This is a spec bug because the current spec for pattern-matching is ambiguous in this case.

@gafter gafter added Spec and removed Discussion labels Feb 14, 2017
@gafter
Copy link
Member Author

gafter commented Feb 14, 2017

Here is the grammar exhibiting the ambiguity

constant_pattern
    : shift_expression
    ;

positional_pattern
    :  type? '(' subpattern_list? ')'
    ;

subpattern_list
    : subpattern
    | subpattern ',' subpattern_list
    ;

The pattern ( 3 ) can be parsed as a positional_pattern containing a constant_pattern or as a constant_pattern containing a parenthesized expression.

One thing we could consider doing is disallowing a simple parenthesized expression to be used as a constant pattern. It would always be interpreted as a positional pattern.

@alrz
Copy link
Contributor

alrz commented Feb 14, 2017

If we ever want to support parenthesized patterns (possibly along with some pattern operators like dotnet/roslyn#16766 or dotnet/roslyn#6235) to specify precedence, we probably should be avoiding any special meaning for the (<pattern>) form.

@alrz
Copy link
Contributor

alrz commented Feb 14, 2017

  1. Your Price example above would fail today, as long as (3) doesn't implicitly convert to Price. In such failure cases we could fall back to look for a unary deconstructor to see if one matches.

I tentatively like 3 a lot.

That means a pair of parentheses dramatically changes the meaning of code, conditionally? IMO wrapping parentheses should be always insignificant, for expressions, patterns or even types, just like (3) that is the same as 3, (<pattern>) would be equivalent to <pattern>, and (int) would be equivalent to int. So effectively, (int name) would be an error, though syntactically possible. Not sure if this helps with the ambiguity issue being discussed in this thread, if it does not make it worse.

@orthoxerox
Copy link

Should we allow to deconstruct values straight into tuples during pattern matching at all? I'd rather write the type name every time.

Well, I can see the aesthetic value of omitting it for simple wrapper types, like

Slope slope = GetSlope();
switch (slope) {
    case Slope(double.PositiveInfinity):
    case Slope(double.NegativeInfinity):
        //blah
        break;
    case Slope(var d):
        //do something with d
        break;
}
//vs
Slope slope = GetSlope();
switch (slope) {
    case (double.PositiveInfinity):
    case (double.NegativeInfinity):
        //blah
        break;
    case (var d):
        //do something with d
        break;
}

but I'd rather have an explicit token you can F12 and go to the definition. I nearly proposed extending option 3 to standalone unparenthesized expressions, but that would make var d patterns ambiguous.

@alrz
Copy link
Contributor

alrz commented Feb 14, 2017

@orthoxerox I recall that it's explicitly captured in a design note before that the LDT likes to be able to omit the type if it's statically known, that would be also applied to property-patterns e.g. point is {X:1, Y:2}.

@orthoxerox
Copy link

Then I guess you could force every type to implement an implicit conversion that matches the single-out-parameter deconstructor.

@alrz
Copy link
Contributor

alrz commented Feb 14, 2017

@orthoxerox I haven't checked the spec but I think implicit conversions do not contribute to that rule.

@HaloFour
Copy link
Contributor

I am of the opinion that the compiler should treat 3 and (3) as the same value, especially since it already does so. In fact, the above code works just fine already if Price has an implicit operator defined from Price to an integral type.

class Price
{
    public static implicit operator int(Price price) => price.Cents;
}
// later
var p = new Price(...);
switch (p) {
    case (3):  // perfectly legal C# 7.0
        break;
}

Given such my opinion is that when the compiler knows the type in question that it attempts to resolve an accessible implicit operator to a compatible target type. If one doesn't exist, then it falls back by trying to resolve an accessible Deconstruct method with a single compatible argument.

However, that does beg the question as to whether the compiler should follow the same logic when it doesn't know the type. Should it also attempt to resolve a compatible implicit operator and fallback to a Deconstruct method, or just go straight for the latter? For the sake of consistency I'd probably argue that they should be the same.

object p = new Price(...);
switch (p) {
    case Price(3): // potentially legal C#next given the definition of Price above
}

// equivalent to:

switch (p) {
    case Price temp when temp == 3:
}

@jnm2
Copy link
Contributor

jnm2 commented Feb 14, 2017

@alrz

That means a pair of parentheses dramatically changes the meaning of code, conditionally? IMO wrapping parentheses should be always insignificant, for expressions, patterns or even types, just like (3) that is the same as 3, () would be equivalent to , and (int) would be equivalent to int. So effectively, (int name) would be an error, though syntactically possible. Not sure if this helps with the ambiguity issue being discussed in this thread, if it does not make it worse.

I disagree. That statement is too broad. The presence or absence of a pair of parentheses can and should drastically change meanings. Foo(1, 2, 3) and Foo((1, 2, 3)) and Foo((1, 2), 3) are three distinct overloads.

@alrz
Copy link
Contributor

alrz commented Feb 14, 2017

@jnm2 I was talking about the case mentioned in the OP: when there is a single element inside parens.

@paulomorgado
Copy link

So, the compiler will generate code that will call Deconstruct behind the scenes. Why not allow it to be explicit?

This:

switch (p)
{
    case (3, 4): ...
}

could be, somehow, equivalent to this:

switch (p)
{
    case p.Deconstruct(out var t1, out var t2) when t1 == 3 && t2 == 4: ...
}

Which I would prefer not to use, but that would need to use for oneples and nonples

switch (p)
{
    case p.Deconstruct(out var t1) when t1 == 3: ...
}

@alrz
Copy link
Contributor

alrz commented May 18, 2017

@paulomorgado That looks like #277

@lachbaer
Copy link
Contributor

I just tried hard to create a womple by invoking the corresponding Decostruct and in C# 7 it doesn't work.

class MyClass {
    public void Deconstruct(out int a) => a = 4;
}

var myClass = new MyClass();
(int a) = inti; // Error CS1525 Invalid expression term 'int'	ConsoleApp7

ValueTuple<int> vti;
vti = inti; // Error CS0029 Cannot implicitly convert type 'ConsoleApp7.IsNotTest<int>' to '(int)'	

With 'twoples' the first one works however.

That's why I would go with disallowing womples in pattern scenarios.

@gafter
Copy link
Member Author

gafter commented May 19, 2017

@lachbaer So you're suggesting we should disallow data types that have a single underlying datum? I don't think that makes sense. Units (e.g. Kilograms, Dollars, etc) are often expressed that way, and would be hardly usable with pattern-matching with your suggestion.

@lachbaer
Copy link
Contributor

@gafter No, surely I'm not suggesting that 😆 I just think that when the resolution of this one item tuple in a pattern is done by calling an appropriate Deconstruct method I wonder why that is not possible by the same syntax in other expressions, and then I would disallow that syntax form.

Does it make sense to have an operator is with the same type as its enclosing type and that logically seperates from the Deconstruct method? I thought a while about it and have not come up with an idea. So it might just be safe to disallow same type operator is and have is Type(out var a) be a syntactic synonym for the Deconstruct method. That would lead to @MadsTorgersen solution number 1 for this topics problem.

@lachbaer
Copy link
Contributor

And besides that would allow for same type is Type(...) on record types, without declaring a seperate operator is. Or is that already possible?

@gafter
Copy link
Member Author

gafter commented May 20, 2017

Does it make sense to have an operator is with the same type as its enclosing type

While it might or not make sense, it would not be particularly useful. I'm more interested in the case where the underlying type is different. Using the record-declaration syntax, something like

class Dollars(decimal Value);

@erikhermansson79
Copy link

Would the ambiguity disappear if a parenthesized expression was a oneple?

@gafter
Copy link
Member Author

gafter commented Oct 19, 2017

@erikhermansson79 A oneple would not be a constant expression, and therefore would not be a valid constant pattern. In any case, there is no syntax for writing one.

@gafter
Copy link
Member Author

gafter commented Oct 19, 2017

Our tentative resolution of this is based on @MadsTorgersen's proposed option (3). In summary:

There is an ambiguity between a constant pattern expressed using a parenthesized expression, and a “positional” (Deconstruct) recursive pattern with one element:

          case (2):

Here are the two possible meanings:

          switch (2)
          {
                         case (2): // a parenthesized expression

and

          class C { public void Deconstruct(out int x) … }
          switch (new C())
          {
                         case (2): // Call Deconstruct once.

While it is possible for the programmer to disambiguate (e.g. case +(2): or case (2) {}), we still need a policy for the compiler to disambiguate. The policy is that this will be parsed as a parenthesized expression. However, if the type of the parenthesized expression is not suitable given the switch expression, semantic analysis will notice the set of parentheses and fall back to trying Deconstruct. That will allow both of the above examples to “work” as written.

It does this recursively, so this will work too:

          class C { public void Deconstruct(out D d) … }
          class D { public void Deconstruct(out int x) … }
          switch (new C())
          {
                         case ((2)): // Call Deconstruct twice.

A semantic ambiguity arises between these two interpretations when the switched type is, for example, object (we deconstruct using ITuple). Another way the ambiguity could arise would be if someone were to write an (extension) Deconstruct method for one of the built-in types. In either case we will just treat it as a parenthesized expression, and require explicit disambiguation if another interpretation is intended.

@gafter
Copy link
Member Author

gafter commented Oct 19, 2017

And while we're at it, the following will need to go in the spec too:

There is a syntactic ambiguity between a cast expression and a positional pattern. The switch case

          case (A)B:

could be either a cast (i.e. existing code, if A is the name of an enum type, Int32 or a using-alias)

          case (Int32)MyConstantValue:

or a single-element deconstruction pattern with the constant subpattern A being named B.

          case (MyConstantValue) x:

When such an ambiguity occurs, it will be parsed as a cast expression (for compatibility with existing code). To disambiguate, you can add an empty property pattern part:

          case (MyConstantValue) {} x;

@erikhermansson79
Copy link

@gafter: I see. I was thinking more in the line of replacing parenthesized expressions in the language with oneple literals. Since they don't yet exist in the language, maybe they could be added to have the same semantics as parenthesized expressions. Then "case 3" would be a constant expression and "case (3)" a positional pattern with a oneple.

@gafter
Copy link
Member Author

gafter commented Oct 19, 2017

@erikhermansson79 Tuples are generic struct types named System.ValueTuple. We have a single-element version of it, we just don't have a surface syntax for expressions of its type.

@gafter
Copy link
Member Author

gafter commented Oct 30, 2017

Discussion moved to #1054

@gafter gafter closed this as completed Oct 30, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

10 participants