Proposal: Add completeness checking to pattern matching draft specification #188

agocke · 2015-01-31T05:45:04Z

Background

As noted in issue #180, many modern programming programs are data-focused, especially distributed applications which tend to store, manipulate, and move sets of data between different storage and computation points. One solution proposed to deal with this issue is a combination of records and pattern matching. Records provide a simple way to declare and structure the data types and pattern matching provides a way to destructure and manipulate the data.

Problem

Records provide a great way to represent the data and pattern matching provides a great way to manipulate the data, but there is currently no mechanism in the #180 proposal to ensure that the data and the logic remain in sync. The nature of records and pattern matching is that the data declaration code is often far from the data consumption code. In a distributed system it's even more likely that a single data structure will be consumed and manipulated in various parts of the code base. If the data structure is ever modified, there is currently no mechanism in the draft to alert the programmer that all instances of manipulation logic must be updated.

Solution

Add completeness checking to certain switch statements on certain record types. The core of this proposal is to provide a warning when a switch statement does not handle every possible match on a type hierarchy. This proposal features two possible designs for this idea, presented in order of increasingly intrusive modification to the language.

Design 1

This design actually features no new syntax or semantics beyond that of proposal #180. The suggestion is to create a C# type heirarchy which can be guaranteed 'complete' with existing language features. In this case, complete means that it is not possible for a new subclass of the root member of the type hierarchy, so the compiler can be sure that any and all subclasses of the chosen switching type are visible in the current compilation.

We can construct this type hierarchy in existing C# with the following rules:

All subclasses of the root type must be sealed, preventing any subclassing of any existing leaf types in the hierarchy.
All constructors of the root type must be private, prevent any subclassing of the root type.
- As a consequence, all subclasses must be inner classes and thus must be in source in the current compilation

Here's an example of the structure of this type hierarchy:

abstract class C
{
  private C() {}
  public sealed class C_1 : C {}
  public sealed class C_2 : C {}
      ...
  public sealed class C_n : C {}
}

This guarantees that switching on an instance of type C which explicitly matches C_1...C_n has matched against every possible instance of C. The only thing which changes about the language specification is a requirement that the compiler produce a warning when not all cases are matched.

Design 2

There are a few disadvantages to Design 1:

The mandated structure is complicated and brittle. Forgetting to mark any of the subclasses as sealed or adding any public constructors won't produce a compiler error or warning, but the compiler will now silently skip the completeness check.
The structure is verbose -- most of the sealed or private markers are mostly part of the 'incantation' of completeness and are not directly related to the task at hand.
The relevant record instances are all nested classes, so all references require an extra layer of naming indirection.

Design 2 attempts to fix these problems by replacing much of the boiler plate with a new combination of modifiers on a type -- abstract + sealed. Under Design 2, marking the root type of a hierarchy as abstract sealed will cause the structure from Design 1 to be generated by the compiler in lowering. The following example demonstrates what the structure from Design 1 looks like with an abstract sealed type:

abstract sealed class C {}
public class C_1 : C {}
public class C_2 : C {}
   ...
public class C_n : C {}

In this case, most of the problems with Design 1 are solved, but new semantics are required to be added to the language.

The text was updated successfully, but these errors were encountered:

alrz · 2015-09-30T10:08:33Z

This can be another option as well (based on Design 2):

public abstract case class C {}
public case class C_1 : C {}
public case class C_2 : C {}
   ...
public case class C_n : C {}

So case classes can only inherit from other case classes. Also, this helps to distinguish between case classes and regular ones, if they were defined in different files.

gafter · 2015-09-30T13:32:26Z

@alrz I don't get it. Would it be an error to extend C_1 in another assembly? Or is there some other rule that would allow the compiler to know that it sees all of the cases?

alrz · 2015-09-30T13:40:59Z

@gafter Quoting yourself, "It's a closed hierarchy of types" (like discriminated unions in F# but more flexible since you can inherit from other cases) so why should it be extendable in another assembly?

gafter · 2015-09-30T14:30:13Z

@alrz so case would mean "not extendable in another assembly"? Sort of a strange choice of keyword for that meaning.

alrz · 2015-09-30T15:28:03Z

@gafter Same keyword is used in Scala for exactly this purpose — implying that each case class corresponds to a case statement, I guess.

gafter · 2015-09-30T16:33:22Z

@alrz no, case in Scala does not restrict inheritance.

alrz · 2015-09-30T19:13:16Z

@gafter Yes, case has nothing to do with inheritance restriction in Scala, but my suggestion is not exactly what Scala is offering. Actually it's something in between of its syntax:

sealed abstract class C
case object C_1 extends C
case object C_2 extends C

and the proposed one:

sealed abstract class C {}
class C_1 : C {}
class C_2 : C {}

I'm trying to say that I think this is a more expressive syntax compared to above examples, for this specific use case:

abstract case class C {} // or sealed abstract case class?
case class C_1 : C {}
case class C_2 : C {}

Marking all classes with a unified keyword, indicating that these classes belong to a closed hierarchy.

PS: Although, this is just another option to consider. Except for this little concern, sealed abstract looks good to me.

orthoxerox · 2015-10-06T07:03:19Z

Will this support multiple levels of inheritance?

abstract sealed class C {}
abstract sealed class D : C {}
public class C_1 : C {}
public class C_2 : C {}
public class D_1 : D {}
public class D_2 : D {}

So now the compiler will look for either C, (C_1 + C_2 + D) or (C_1 + C_2 + D_1 + D_2) when checking for completeness.

gafter · 2015-10-06T16:44:58Z

@orthoxerox Yes, we do not plan to make one level of inheritance any more special that a second level of inheritance. The compiler will have to build a decision tree and ensure that every path reachable from the root is handled.

orthoxerox · 2015-10-06T19:55:25Z

@gafter great, I missed that in F#.

gafter · 2015-11-03T22:58:31Z

See the excellent comparison of difference languages by @jonschoning in #5154 (comment)

dsaf · 2015-11-29T14:10:29Z

Would C#7/8 promote having more of multiple types per file or the IDE will be capable of grouping them automatically?

DavidArno · 2016-02-10T13:28:24Z

As I understand this proposal, it would allow the creation of discriminated unions, such as

public abstract sealed class Option<T>
{
    class None();
    class Some<T>(T value);
}

And, because the compiler knows this is a complete hierarchy, I could pattern match as follows:

public string SomeFunction(Option<int> optionalValue) =>
    optionalValue switch
    {
        case None(): "It's a None!",
        case Some(var value): $"It's a {value}"
    };

In other words, the default (or * if we stick with that notation) case wouldn't be required.

Have I understood this correctly?

gafter · 2016-02-10T17:35:51Z

@DavidArno Yes, that is precisely correct.

DavidArno · 2016-02-10T18:41:55Z

@gafter,

Is this functionality implemented in any of the feature branches yet?

KalitaAlexey · 2016-04-03T06:45:30Z

I suggest ADT with the following syntax:

void Process(int | List<int> v)
{
    switch (v)
    {
        case int i:
            ProcessInt(I);
            break;
        case List<int> l:
            ProcessList(l);
    }
}

And to make ADT like type the following syntax:

// For structs
struct Type = int | bool;
// For classes or both structs and classes
class Type = int | List<int>;

Roslyn may enforce a user to write class when one of specified types is not a struct.

svick · 2016-04-03T09:49:35Z

@KalitaAlexey I don't think you can all that ADT, it's more like union type. In ADTs, the cases have names and can be recursive (I can't tell if your proposal would allow that or not).

KalitaAlexey · 2016-04-03T10:04:56Z

I like how it is done in Rust. I think we could inherit their enum ADT.
In Haskell

data Expression = Number Int
                | Add Expression Expression
                | Minus Expression
                | Mult Expression Expression
                | Divide Expression Expression

I'd like to have In C#

enum Expression {
    Number(int Number),
    Add(Expression LeftOperand, Expression RightOperand),
    Minus(Expression Expression),
    Mult(Expression LeftOperand, Expression RightOperand),
    Divide(Expression LeftOperand, Expression RightOperand),
}

And pattern matching like

void ProcessExpression(Expression expression)
{
    switch (expression)
    {
    case Expression::Number n:
        ProcessNumber(n);
        break;
    case Expression::Add a:
        ProcessAdd(a);
        break;
    case Expression::Minus m:
        ProcessMinus(m);
        break;
    case Expression::Mult m:
        ProcessMult(m);
        break;
    }
    case Expression::Divide d:
        ProcessDivide(d);
        break;
}

HaloFour · 2016-04-03T12:50:41Z

@KalitaAlexey #6739

KalitaAlexey · 2016-04-03T15:01:24Z

@HaloFour Thanks. What's the difference then?

wekempf · 2016-04-12T14:19:11Z

The discussion around this concept is scattered everywhere, so forgive me if I just repeat something said elsewhere. Option 1 has serious problems but is on the right track. Option 2 I don't care for because it scatters the declarations and muddles the concept. What we're modeling here is a discriminated union. @KalitaAlexey gets close to the syntax I'd prefer, but just using "enum" is at least confusing, if it doesn't actually cause parsing problem. I'd suggest (and saw others do so as well in other threads) "enum class".

public enum class Expression {
    Number(int Number),
    Add(Expression LeftOperand, Expression RightOperand),
    Minus(Expression Expression),
    Mult(Expression LeftOperand, Expression RightOperand),
    Divide(Expression LeftOperand, Expression RightOperand),
}

There's still lots of open questions after deciding on rough syntax like this, however. For instance, in this thread it was suggested a DU could have a type that's also a DU. I'm not sure that makes sense and would suggest not allowing that, knowing you can always add this feature in the future if that turns out to be the wrong decision but you can't remove a feature, ever. I just don't know when such a feature would actually be useful, and without a compelling use case it seems best to err on the conservative side.

I've seen other posts where the syntax is very similar to the above but "abstract sealed" is used instead of "enum class". Frankly, I think "abstract sealed" is highly confusing and gives no indication that one is building a DU, while "enum class" is intuitive.

gafter · 2017-04-21T22:45:14Z

Issue moved to dotnet/csharplang #486 via ZenHub

Pilchie added Language-C# Area-Language Design Feature Request labels Feb 2, 2015

SolalPirelli mentioned this issue Feb 5, 2015

Proposal: Pattern matching and record types #206

Closed

gafter self-assigned this Jul 24, 2015

gafter changed the title ~~Proposal: Add completeness checking to pattern matching draft specification~~ Proposal: Add completeness checking (algebraic data types) to pattern matching draft specification Sep 10, 2015

alrz mentioned this issue Oct 6, 2015

Proposal: Active Patterns #5718

Closed

orthoxerox mentioned this issue Oct 13, 2015

Limited double dispatch for ADTs #5909

Closed

gafter added this to the C# 7 and VB 15 milestone Oct 21, 2015

gafter mentioned this issue Oct 27, 2015

Pattern Matching #4882

Merged

49 tasks

gafter added the 2 - Ready label Nov 20, 2015

gafter assigned agocke and gafter and unassigned gafter and agocke Nov 20, 2015

This was referenced Nov 20, 2015

Outline of C# 7 demo at MVP Summit 2015-11-02 #6505

Closed

Proposal: Record Enum Types #6739

Closed

gafter removed this from the C# 7 and VB 15 milestone Nov 21, 2015

gafter added the New Language Feature - Pattern Matching Pattern Matching label Feb 15, 2016

orthoxerox mentioned this issue Feb 16, 2016

Proposal: Ability to seal trees of classes #8726

Closed

agocke mentioned this issue Feb 16, 2016

Proposal: "Closed" type hierarchies #8729

Closed

gafter added this to the 2.0 (RTM) milestone Feb 18, 2016

gafter added 1 - Planning New Language Feature - Records Records and removed 2 - Ready labels Feb 24, 2016

gafter mentioned this issue Mar 1, 2016

[Umbrella] Pattern-matching work checklist #9375

Closed

HaloFour mentioned this issue Apr 4, 2016

Discussion thread for pattern-matching #10153

Closed

gafter modified the milestones: 2.0 (RTM), 2.0 (RC), 3.0 Jun 26, 2016

gafter mentioned this issue Sep 30, 2016

Proposal: Add Discriminated Unions using a new Type #14208

Closed

gafter mentioned this issue Feb 15, 2017

Champion "Discriminated Unions" dotnet/csharplang#113

Open

5 tasks

agocke changed the title ~~Proposal: Add completeness checking (algebraic data types) to pattern matching draft specification~~ Proposal: Add completeness checking to pattern matching draft specification Feb 21, 2017

gafter mentioned this issue Apr 21, 2017

Proposal: "Closed" type hierarchies dotnet/csharplang#485

Open

gafter closed this as completed Apr 21, 2017

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Proposal: Add completeness checking to pattern matching draft specification #188

Proposal: Add completeness checking to pattern matching draft specification #188

agocke commented Jan 31, 2015

alrz commented Sep 30, 2015

gafter commented Sep 30, 2015

alrz commented Sep 30, 2015

gafter commented Sep 30, 2015

alrz commented Sep 30, 2015

gafter commented Sep 30, 2015

alrz commented Sep 30, 2015

orthoxerox commented Oct 6, 2015

gafter commented Oct 6, 2015

orthoxerox commented Oct 6, 2015

gafter commented Nov 3, 2015

dsaf commented Nov 29, 2015

DavidArno commented Feb 10, 2016

gafter commented Feb 10, 2016

DavidArno commented Feb 10, 2016

KalitaAlexey commented Apr 3, 2016

svick commented Apr 3, 2016

KalitaAlexey commented Apr 3, 2016

HaloFour commented Apr 3, 2016

KalitaAlexey commented Apr 3, 2016

wekempf commented Apr 12, 2016

gafter commented Apr 21, 2017

Proposal: Add completeness checking to pattern matching draft specification #188

Proposal: Add completeness checking to pattern matching draft specification #188

Comments

agocke commented Jan 31, 2015

Background

Problem

Solution

Design 1

Design 2

alrz commented Sep 30, 2015

gafter commented Sep 30, 2015

alrz commented Sep 30, 2015

gafter commented Sep 30, 2015

alrz commented Sep 30, 2015

gafter commented Sep 30, 2015

alrz commented Sep 30, 2015

orthoxerox commented Oct 6, 2015

gafter commented Oct 6, 2015

orthoxerox commented Oct 6, 2015

gafter commented Nov 3, 2015

dsaf commented Nov 29, 2015

DavidArno commented Feb 10, 2016

gafter commented Feb 10, 2016

DavidArno commented Feb 10, 2016

KalitaAlexey commented Apr 3, 2016

svick commented Apr 3, 2016

KalitaAlexey commented Apr 3, 2016

HaloFour commented Apr 3, 2016

KalitaAlexey commented Apr 3, 2016

wekempf commented Apr 12, 2016

gafter commented Apr 21, 2017