2015-03-25 Design Discussion - Patterns and Records #1572

Closed
gafter opened this issue Mar 25, 2015 · 15 comments

Comments

@gafter
Member

gafter commented Mar 25, 2015

These are notes that were part of a presentation on 2015-03-25 of a snapshot of our design discussions for C# 7 and VB 15. The features under discussion are described in detail in #206 and elsewhere.

Records

Changes since Semih Okur's work:

Working proposal resembles Scala case classes with active patterns.


class Point(int X, int Y);
  • defines a class (struct) and common members
    • constructor
    • readonly properties
    • GetHashCode, Equals, ToString
    • operator== and operator!= (?)
    • Pattern-matching decomposition operator
  • an association between ctor parameters and properties
  • Any of these can be replaced by hand-written code
  • Ctor-parameter and property can have distinct names
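For illustration, here is a rough hand-written approximation of the members listed above for `class Point(int X, int Y);`. The notes do not specify the exact generated code, so the hash function, the `ToString` format, and the choice to make the class `sealed` are assumptions made only for this sketch.

```csharp
using System;

// Hand-written sketch of what `class Point(int X, int Y);` might stand for.
public sealed class Point : IEquatable<Point>
{
    // Constructor: each parameter is associated with the property of the same name.
    public Point(int X, int Y)
    {
        this.X = X;
        this.Y = Y;
    }

    // Read-only properties.
    public int X { get; }
    public int Y { get; }

    // Value-based equality over the properties.
    public bool Equals(Point other)
    {
        return !ReferenceEquals(other, null) && X == other.X && Y == other.Y;
    }

    public override bool Equals(object obj)
    {
        return Equals(obj as Point);
    }

    // Assumed hash combination; any function consistent with Equals would do.
    public override int GetHashCode()
    {
        unchecked { return (X * 397) ^ Y; }
    }

    // Assumed display format.
    public override string ToString()
    {
        return "Point(" + X + ", " + Y + ")";
    }

    public static bool operator ==(Point a, Point b)
    {
        return ReferenceEquals(a, null) ? ReferenceEquals(b, null) : a.Equals(b);
    }

    public static bool operator !=(Point a, Point b)
    {
        return !(a == b);
    }
}
```

The pattern-matching decomposition operator is left out because its shape was still an open issue in these notes.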

Use Cases

  • Simplifies common scenarios
  • Simple immutable user-defined data types
    • Roslyn Syntax Trees, bound nodes
  • Over-the-wire data
  • Multiple return values

With expressions

Illustrates the value of having a parameter-property association.

Given

struct Point(int X, int Y);
Point p = ...

the expression

p with { Y = 4 }

is translated to

new Point(p.X, 4)
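The same translation can be approximated by hand with optional parameters. This is only a sketch: the method name `With` and the use of nullable optional parameters are assumptions for illustration, not part of the proposal.

```csharp
public struct Point
{
    public Point(int x, int y)
    {
        X = x;
        Y = y;
    }

    public int X { get; }
    public int Y { get; }

    // Hypothetical helper: an omitted argument keeps the current value,
    // so p.With(Y: 4) builds the same value as new Point(p.X, 4).
    public Point With(int? X = null, int? Y = null)
    {
        return new Point(X ?? this.X, Y ?? this.Y);
    }
}
```

Like the translation above, the helper relies on knowing which constructor parameter corresponds to which property.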

Open issues

  • Closed hierarchy (and tag fields)
  • Readonly
  • Parameter names
  • Can you opt out of part of the machinery?
  • Serialization (in all of its forms)
  • Construction and decomposition syntax

Pattern Matching

Sources of Inspiration

  • Scala
  • F#
  • Swift
  • Rust
  • Erlang
  • Nemerle

A pattern-matching operation

  • Matches a value with a pattern
  • Either succeeds or fails
  • Extracts selected values into variables
    object x = ...;
    if (x is 3) ...
    if (x is string s) ...
    if (x is Point { X is 3, Y is int y }) ...
    if (x is Point(3, int y)) ...

Other aspects

  • Patterns defined recursively
  • "select" from among a set of pattern forms
  • Typically an expression form
  • Active patterns support interop and user-defined types
switch (o)
{
    case 3:
        ...
        break;
    case string s:
        M(s);
        break;
    case Point(3, int y):
        M(y);
        break;
    case Point(int x, 4):
        M(x);
        break;
}

We think we want an expression form too (no proposed syntax yet, but for inspiration):

   M(match(e) {
        3 => x,
        string s => s.foo,
        Point(3, int y) => y,
        * => null })

Benefits

  • Condenses long sequences of complex logic with a test that resembles the shape of the thing tested
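As a point of comparison, the test `x is Point(3, int y)` stands in for roughly the following hand-written logic. This is a sketch that assumes `Point` exposes `X` and `Y` properties; the helper name `TryMatchPointWithX3` is made up for the example.

```csharp
public sealed class Point
{
    public Point(int x, int y)
    {
        X = x;
        Y = y;
    }

    public int X { get; }
    public int Y { get; }
}

public static class Matching
{
    // Hand-written counterpart of: if (x is Point(3, int y)) ...
    public static bool TryMatchPointWithX3(object x, out int y)
    {
        var p = x as Point;          // type test
        if (p != null && p.X == 3)   // constant sub-pattern on X
        {
            y = p.Y;                 // bind the second component
            return true;
        }
        y = 0;
        return false;
    }
}
```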

Use Cases

  • Simplifies common scenarios
  • Language manipulation code
    • Roslyn
    • Analyzers
  • Protocol across a wire

Open questions

  • How much of the pattern-matching experience do we want (if any)?
  • Matching may be compiler-checked for completeness
  • May be implemented using tags instead of type tests
  • Which types have syntactic support?
    • Primitives and string
    • Records
    • Nullable
    • objects with properties
    • anonymous types?
    • arrays?
    • List? Dictionary?
    • Tuple<...>?
    • IEnumerable?
@gafter gafter closed this as completed Mar 25, 2015
@gafter gafter added this to the C# 7 and VB 15 milestone Mar 25, 2015
@gafter gafter self-assigned this Mar 25, 2015
@gafter gafter reopened this Mar 25, 2015
@gafter
Member Author

gafter commented Mar 25, 2015

@MadsTorgersen You may want to link to this from today's design notes.

@bbarry

bbarry commented Mar 25, 2015

class Point(int X, int Y);
  • defines a class (struct) and common members

So it is a class in this instance and a struct if you use struct instead like you do further down?

@gafter
Member Author

gafter commented Mar 25, 2015

@bbarry Yes. You can use this syntax with either the class or the struct keyword.

@MgSam

MgSam commented Mar 26, 2015

Great to see with is now part of the proposal.

I still like the record keyword over all the behavior changes being implicit. People will likely use primary constructors on mutable classes as well; it seems weird that they would get different default implementations for ToString, GetHashCode, and Equals just because they chose to use a primary constructor.

@AdamSpeight2008
Contributor

Also have a look at Nemerle for a C#-like syntax for pattern matching.

@zastrowm
Contributor

Has there been any consideration of what it means to overload the meaning of the switch statement w.r.t. pattern matching?

Right now, a switch statement (correct me if I'm wrong) can be viewed as an unordered block, basically a lookup table to compare against. But implementing pattern matching with the switch keyword (instead of something like match) turns it into an order-dependent, top-down construct where you can no longer look for a single value but instead need to scan the entire block to figure out the control flow. Could this confuse users or make the switch construct more difficult to read?

@gafter
Member Author

gafter commented Mar 27, 2015

@zastrowm Yes, we've thought about it pretty deeply. Since it is an error for there to be overlaps between cases in existing switch statements, the semantics are upward compatible.

There is only one catch. You can place a default case anywhere, not just at the end. We'll define default to be a fallback (last in order) and not order-dependent.
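To illustrate the existing behavior being preserved: in today's C#, a default label may be written anywhere in the switch block and still only runs when no case matches. A minimal sketch (the class and method names are made up for the example):

```csharp
public static class SwitchDemo
{
    public static string Describe(int n)
    {
        switch (n)
        {
            default:            // written first, but still only a fallback
                return "other";
            case 1:
                return "one";
            case 2:
                return "two";
        }
    }
}
```

Here `Describe(1)` returns "one" even though the default label comes first.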

@VladD2

VladD2 commented Mar 28, 2015

Which types have syntactic support?
Primitives and string

+1

Records

+1

objects with properties

+1

See how it looks in Nemerle:

using System.Console;

class A { public Field1 : int = 40;     public Prop1 : int { get { Field1 + 2 } }  }
class B { public Field : string = "OK"; public Prop1 : A { get; set; }}

module Program
{
  Test(obj : object) : void
  {
    match (obj)
    {
      | A(Prop1 = 42)                 => WriteLine("A with Prop1 == 42");
      | A()                           => WriteLine("A");
      | B(Prop1 = null, Field = "OK") => WriteLine("B with Prop1 == null && Field == 'OK'");
      | B(Prop1 = A(Field1 = 40))     => WriteLine("B with Prop1 == A(Field1 == 40)");
      | B()                           => WriteLine("B");
      | _                             => WriteLine("other");
    }
  }

  Main() : void
  {
    Test(A());
    Test(B());
    def b = B();
    b.Prop1 = A();
    Test(b);
    _ = ReadLine();
  }
}

Nullable

+1

anonymous types?

+1

arrays?
List? Dictionary?
IEnumerable?

It would be better to support IEnumerable and IList.

Tuple<...>?

+1, and add syntactic support for tuples.

@VladD2

VladD2 commented Mar 28, 2015

Using pattern matching (PM) in switch is a bad idea. PM depends on order.
It would be better to add a new syntax.
I think the following syntax would fit C#:

    match (obj)
    {
      is A(Prop1 is 42)                  => WriteLine("A with Prop1 == 42");
      is A()                             => WriteLine("A");
      is B(Prop1 is null, Field is "OK") => WriteLine("B with Prop1 == null && Field == 'OK'");
      is B(Prop1 is A(Field1 is 40))     => WriteLine("B with Prop1 == A(Field1 == 40)");
      is B()                             => WriteLine("B");
      is *                               => WriteLine("other");
    }

@GeirGrusom

A syntax suggestion:

var result = match(x, y) 
{
  case(x == 42) => "42";
  case(x == 101) { return "101"; }
  case(x is Point p; p.X > 100) => "Point!";
  case(x is Point p; y is long) => $"{p} and {y}";
  default => string.Empty;
}

@mpawelski

Nice to see that you're thinking about a with expression.
It might be worth allowing this feature to work on existing classes with mutable properties. For example,
for this class

class Test
{
    public string Foo { get; set; }
    public string Bar { get; set; }
}

and for a variable created this way

var test = new Test
{
    Foo = "foo",
    Bar = "bar"
};  

this code

var changedCopyOfTest = new Test
{
    Foo = test.Foo,
    Bar = "bar2"
};

could be written as
var changedCopyOfTest = test with { Bar = "bar2" };

I'm not sure if it's worth it. I just had to write similar code yesterday (make copy of class with mutable properties but with small changes) and thought about your proposed with expression.

@GregRos

GregRos commented Apr 3, 2015

I think extending the existing switch and try-catch syntax is best, in order to avoid alienating people, especially people who don't know much about pattern matching or what it is. The try-catch syntax is actually a great thing to base things on, because that is the one example of pattern matching C# does support (it even has guards now!). Calling it switch is probably a bad idea though, because it might confuse people. match is better.

There is actually an example of using try-catch blocks for pattern matching (definitely NOT a good idea, but it serves to show that catch blocks really are pattern matching).
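As a small illustration of catch clauses behaving like type patterns with guards, here is a sketch using C# 6 exception filters. The class and method names are invented for the example, and throwing an exception just to classify it is not something to do in real code.

```csharp
using System;

public static class CatchDemo
{
    // Throws the given exception and reports which catch clause handled it.
    public static string Classify(Exception toThrow)
    {
        try
        {
            throw toThrow;
        }
        catch (ArgumentNullException e) when (e.ParamName == "x")
        {
            // Type test plus a guard on the bound variable.
            return "null argument x";
        }
        catch (ArgumentException)
        {
            // Type test only; also reached when the guard above fails.
            return "some argument problem";
        }
        catch (Exception)
        {
            // Catch-all, playing the role of a default case.
            return "other";
        }
    }
}
```

The clauses are tried top to bottom, which is the same order dependence that the pattern-matching proposals in this thread rely on.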

Here are some ideas about different kinds of patterns and pattern matching syntax.

Literal Pattern

A literal pattern matches a literal value known at compile time, for which the language can guarantee absolute equality. Here are some examples:

match (x) {
    case 1:
    case "Hello":
    case 'c':
}

Literal patterns may take advantage of some default conversions. You can match an int with a long value.

A literal pattern is basically equivalent to a case clause, though there is no reason to restrict the type of the expression being matched (as above, you could match an object against an int, string, or char literal).

Literal patterns can always be identified by their starting character. They cannot begin with parentheses.

Variable Binding

This pattern binds a value to a name. The variable type can be var (in which case the compiler tries to infer the type, using the rules of local type inference) or a type name, in which case it is equivalent to a runtime type test. The structure is similar to a catch clause, except that it works for any type, even one that isn't an exception.

Implicit conversions don't participate in variable binding patterns. The runtime type of the variable must conform to the declared type.

Here are some examples:

object n = 5;
match(n) {
    case (string str):
        //the value of 'n' is of type string and is assigned to 'str'
    case (long x):
        //...
    case (var y):
        //This is equivalent to 'object y' because 'y' is inferred to be type 'object'

}

Note that the parentheses are required. They also distinguish this pattern from active patterns and other things. It could also be viewed as a singleton tuple.

Active Pattern

Active patterns could be objects implementing one of the interfaces:

interface IPattern0<TIn> {
    bool Match(TIn input);
}

interface IPattern1<TIn, TOut1> {
    bool Match(TIn input, out TOut1 out1);
}

interface IPattern2<TIn, TOut1, TOut2> {
    bool Match(TIn input, out TOut1 out1, out TOut2 out2);
}

In practice, though, requiring active patterns to be fully parameterized in this way is far too much typing, so I think it would be better to have a set of marker interfaces IPatternN that define no methods and have no type parameters, with the compiler requiring appropriate Match methods.

They're recommended to be struct (there is no need to box the pattern into the interface), and need to have an accessible constructor. It's important that these objects be usable in languages not supporting pattern matching, or that support different pattern matching (e.g. F#).

To match against an active pattern, you write TypeName[ConstructorArgs](output parameters). A new instance is constructed in each case clause. For example:

case Regex["(Hello|Hi)"](string matchOutput)

First constructs an instance of Regex using the constructor Regex(string s), and then calls the appropriate Match method.

Here is a full match block:

match (str) {
    case Regex["(Hello|Hi)"](string match):
        //...
    case Regex[@"An(other)?\s* examples?"](string match):
}

Where Regex is a pattern defined like this:

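struct Regex : IPattern1 {
    // The underlying .NET regex, fully qualified because this pattern type is also named Regex
    System.Text.RegularExpressions.Regex _regex;
    Regex(string regex) {
        _regex = new System.Text.RegularExpressions.Regex(regex);
    }
    bool Match(string str, out string output) {
        var result = _regex.Match(str);
        if (!result.Success) {
            // No match: the out parameter still needs a value
            output = "";
            return false;
        }
        output = result.Value;
        return true;
    }
}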

Active patterns don't begin with parentheses to distinguish them from simple binding patterns.
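For reference, a version of this pattern object that compiles in current C#, without the proposed `Regex[...](...)` syntax or the marker interface. The type name `RegexPattern` is an assumption chosen to avoid clashing with `System.Text.RegularExpressions.Regex`; under the suggestion above, a case clause would construct the object and call `Match` much as the caller does by hand here.

```csharp
using System.Text.RegularExpressions;

public struct RegexPattern
{
    private readonly Regex _regex;

    public RegexPattern(string pattern)
    {
        _regex = new Regex(pattern);
    }

    // Returns true and sets output to the matched text when the input matches.
    public bool Match(string input, out string output)
    {
        var result = _regex.Match(input);
        if (!result.Success)
        {
            output = "";
            return false;
        }
        output = result.Value;
        return true;
    }
}

public static class RegexPatternDemo
{
    // Hand-written counterpart of: case Regex["(Hello|Hi)"](string greeting)
    public static string Greeting(string text)
    {
        string greeting;
        if (new RegexPattern("(Hello|Hi)").Match(text, out greeting))
        {
            return greeting;
        }
        return null;
    }
}
```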

Tuple Pattern

Tuple patterns provide a great way to match on multiple things at once. A tuple pattern matches either a value of a Tuple<...> type or a comma-separated list of matched values. E.g.

match (n, m) {
    case (1, 2):
        //...
    case (3, 4):
        //...
}

match (tuple) {
    case (string x, int y):
    ...
}

Pattern Guards

Guards can appear in the same way as guards for catch clauses. Guards can make use of any value in the scope of that case statement, including any values already matched by the pattern. Guards are equivalent to if statements that further refine the match.

match (x) {
    case (int n) if n < 5:
        //...
    case (string s) if s.Length == 10:
        //...
}

Combining Patterns

Patterns could be combined in several ways. One way is to allow patterns to be stacked, like case clauses in switch blocks can be stacked. In this case, all patterns must bind the same names to the same types of values. This is an OR combination, or a disjunction. For example:

match (x) {
    case (string s, int n):
    case (int n, string s):
        //stacking
}

Another way is to use a conjunction or AND combination. This can be done using the && operator. As implied, it is lazy, and doesn't evaluate the second pattern if the first doesn't match. Different patterns can bind different values to different names (though they can't bind to the same name multiple times), and all of these names are available.

match (x) {
    case Regex["regex"](string match) 
    &&   Regex["also match this one"](string match2):
        //...
}

@bitcrazed

While I appreciate the C# design team's work to avoid creating additional new keywords and constructs wherever possible, I do feel that overloading the switch(...) case ... statement with a new (to the language) semantic, distinct from the well-understood semantic of the existing switch statement, is problematic.

I personally much prefer the F#/Nemerle discriminated-union-like syntax suggested by @VladD2 above. I feel it's not only more syntactically terse, it's more visually distinctive, making it easy to spot such constructs in a body of code.

I also very much like the F# behavior in which, when new types are derived from the type being matched, all DU match statements fail to compile, requiring developers to be explicit about how their code should behave when the newly added type is encountered.

@gafter gafter changed the title Discussion - Patterns and Records 2015-03-25 Design Discussion - Patterns and Records Sep 13, 2015
@gafter gafter modified the milestone: C# 7 and VB 15 Nov 21, 2015
@gafter
Member Author

gafter commented Apr 25, 2016

Design notes have been archived at https://github.com/dotnet/roslyn/blob/future/docs/designNotes/2015-03-25%20C%23%20Design%20Meeting.md but discussion can continue here.

@gafter gafter closed this as completed Apr 25, 2016
@gafter
Member Author

gafter commented Oct 20, 2016

Discussion for pattern-matching has moved to #10153

@dotnet dotnet locked and limited conversation to collaborators Oct 20, 2016