Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Single pattern checks #79

Closed
wants to merge 10 commits into from

Conversation

vonagam
Copy link
Member

@vonagam vonagam commented Aug 3, 2020

Proposal to rename .match method for single pattern checks to .case for it to work on any type, not only on enums.

value.case(pattern) // returns a boolean

Render.

@RealyUniqueName
Copy link
Member

RealyUniqueName commented Aug 3, 2020

Proposed syntax variants are too obscure.
value is case pattern is kind of ok. But then where instead of if is inconsistent with the existing case syntax.
Assertion is not only obscure, but also confusing to read and understand what's the behavior behind it.

Anyway I'd like to have single pattern matching with var capturing, but I don't see how to fit it in if or any other existing syntax. So it probably should be something new.

Existing switch is good enough most of the time unless your code is enum-centric.

@skial skial mentioned this pull request Aug 3, 2020
1 task
@vonagam
Copy link
Member Author

vonagam commented Aug 3, 2020

Existing switch is good enough most of the time unless your code is enum-centric.

This is in part because currently single pattern check is usable only on enums and switch is verbose, so structural and tuple pattern matching is not as accessible for non-enum types.

But then where instead of if is inconsistent.

It can be renamed from where to if.

So it probably should be something new.

If case is no-go, then match will do:

// test
option match Some(int) where int % divisor == 0;

// condition
if (option match Some(int) where int % divisor == 0) {}

// cast
option cast match Some(int) where int % divisor == 0;

As i understand, new syntax is better not to start with a keyword itself because if it is then the word becomes reserved one.

Assertion is not only obscure, but also confusing to read and understand what's the behavior behind it.

Cast does what cast does - tries to convert from less specific (Option) to more specific (Some) (this is downcast) and if it fails it throws.

@Gama11
Copy link
Member

Gama11 commented Aug 3, 2020

Cast does what cast does - tries to convert from less specific (Option) to more specific (Some) (this is downcast) and if it fails it throws.

Except that that looks more like an unsafe cast, which doesn't throw. I think there's already enough confusion around casting in Haxe, let's not add more...

@back2dos
Copy link
Member

back2dos commented Aug 3, 2020

In general, I agree it would be nice to have a shorthand, but in some cases, this is not really working out so well:

if case (option, Some(int), int % divisor == 0) {
  then(int);
} else {
  or();
}

I'd stick to what's available:

switch option {
  case Some(int) if (int % divisor == 0): 
    then(int);
  default: 
    or();
}

It doesn't cluster keywords, but instead uses them to set the various subexpressions apart (instead of relying on ,).


I'm not big fan of that cast case syntax, because:

  1. it promotes the throwing of exceptions, which as far as I understand don't play too well with the analyzer
  2. for assertions/contracts/checks, I think macros are probably a better choice, as they're more easily customizable (some users may prefer to not include the checks in production builds, others will want to leave them, but integrate error reporting with whatever logging solution they have in place ...)

Personally, I think it would be better with a solution that expands the existing match to cover the use cases (except for the cast case). It would require two parts:

  1. Allow match to capture variables within conjunctions (chained &&). The scope would be any subsequent operand and the positive if-branch. The previous snippet would look like this:

    if (option.match(Some(int)) && int % divisor == 0) {
      then(int);
    } else {
      or();
    }

    In a disjunction, every operand will have to limit the scope of a match-captured var to itself, e.g. in someCheck() || (option.match(Some(int)) && int % divisor == 0) the scope of int will be limited to the second operand.

  2. Allow match with all types. For this, there are two solutions:

    1. Actually allow match for every type. This is quite invasive, creating room for conflict with any type that has a match method (e.g. EReg) or static extension (doesn't seem too unlikely).
    2. Allow match on tuples (i.e. directly on EArrayDecl). The arity of the tuple would determine the number of required arguments (to avoid a bracketfest). The 1-tuple thus allows matching over any type: [myStructure].match({ rating: "awesome", name: n}), or [myArray].match([2, _]). But this would also work: [1, false, "foo"].match(1, false, "bar").

There are a few details to hammer out, especially around ||. I think it'd be nice to keep track of captured vars that have their scope limited, so that the user can get a nicer error message if they attempt to access them (perhaps outside the scope they can be represented as tvars with some special meta that causes errors when accessing them).

This approach wouldn't require any new syntax and would extend an existing feature in a relatively intuitive way to solve the problems motivating this proposal.

@vonagam
Copy link
Member Author

vonagam commented Aug 3, 2020

Except that that looks more like an unsafe cast

Unsafe has one argument (expression), safe - two (expression and the type). This has two.

I chose cast because it is what it does, in terms of type and in terms of initial english meaning - you have a form (pattern) and try to fit material (variable value) into it.

it promotes the throwing of exceptions

Right now more than 10% of pattern matching defaults in haxe codebase simply throw an error. This is not about promotion, this is about optimising already popular case. And it allows not only to throw, but to do anything ($elseBody is an expression and a block is expression) including doing nothing.

I am okay with making doing nothing a default (also a popular thing, little less than throwing an error, but still more than 10%). That might be better actually, since no need to think about message/name for default exception. And if it does not throw by default then there is no need to communicate this and space of suitable words increase.

not include the checks in production builds

Ok, "assertion" is a wrong choice of a word. It allows you to introduce multiple captured variables into current scope and to exit early. This is not about development checks.

but in some cases, this is not really working out so well

Examples provided were intended only for syntax demonstration, there are not complex ones.

Allow match on tuples

That's an interesting idea.

@vonagam
Copy link
Member Author

vonagam commented Aug 3, 2020

Renamed shortcuts to one with match instead of case, so i hope it is ok to have guard with where and not if now...

if match is different now to be consistent with others, no commas now.

Renamed cast to extract (alternatives to consider - unwrap, expose or capture) to better communicate that captured variables will be extracted/exposed to following statements (and to avoid cast confusion).

I said that extract can do nothing if there is no match by default, but i think it would be mistake and lead to unexpected errors in void functions. So extract still throws by default.

(Regarding alternative of tuple.match i think as right now it is less defined/usable alternative as i see some problems with it - [array].match([pattern]), ||/&& and produced output, or-patterns in comma separated element pattern and so on.)

Maybe in extraction variable has to be declared with final or var to highlight declaration. (Right now it can be either final/var or simply an indent.)

@haxiomic
Copy link
Member

haxiomic commented Aug 3, 2020

@Gama11 just a thought on scopes and guards, maybe we could use the same syntax as a regular switch case:

option.match(Some(int) if(int % divisor == 0))

As this avoids any complexity with scope (as a user it’s confusing when identifiers can escape their parenthesis scope)

@kLabz
Copy link

kLabz commented Aug 3, 2020

As for data extraction:

option.match(Some(int) if(int % divisor == 0), {
  // Block with `int` available if `option` matched
});

@haxiomic
Copy link
Member

haxiomic commented Aug 3, 2020

@kLabz that would work but could get unwieldy because in the example your branch code is now running in your condition:

if (!option.match(Some(int) if(int % divisor == 0), then(int))) {
    or();
}

I think the benefit of match is when you want to concisely check an enum value and get a boolean but if you’re going to branch and use extracted values a switch is probably the best way and not less concise

@kLabz
Copy link

kLabz commented Aug 3, 2020

That was not intended to be used as a if condition. Maybe this could be an overload of match() which returns Void to avoid that. But yeah, switch is only a little longer than that..

@vonagam
Copy link
Member Author

vonagam commented Aug 3, 2020

Propositions with option.match does not answer the question of how to use single pattern check with any type, not just an enum.

it’s confusing when identifiers can escape their parenthesis scope

Parenthesis do not have their own scope. They use parent one.
This is a valid haxe code to see it (not that i would write like this though, as there is no benefit):

(final int = 3);
trace(int); // traces 3

How this can be made more clear?:

option capture match Some(final int);

Maybe such construction will work for you (but it will be a little harder to implement):

final int = option capture match Some(int);

The whole point of extract match/capture match is to reduce nesting for one of most common scenarios of using pattern match.

@vonagam
Copy link
Member Author

vonagam commented Aug 3, 2020

Added variables whitelist to extraction match.
Without declarations none of captured variables will be exposed.
I hope that this sufficiently addresses concerns about scope confusion.

With this change else word in extract match has to be renamed so not to be confused with some kind of ternary operator.

@haxiomic
Copy link
Member

haxiomic commented Aug 4, 2020

I think this could be frustrating to read when the proposal is so clear in your mind but here's the verbatim thoughts that go through my head as a user seeing it for the first time

option capture match Some(final int);
  • What's the returned value of this expression, with "match" I anticipate Bool like enum.match() or EReg.match()
  • What's the meaning of the keyword capture? Is it an operator – can I use it like capture ()? What are its semantics, where can it be used?
  • At this point I can't guess confidently so I have to read the docs and I learn capture changes how match works to return captured variables. Cool, so match probably returns bool without it
  • Now I want to know how to use it – what happens if I have multiple capture variables? If I write option capture match Pair(final a, final b), what is the returned value of the expression? Is this an error?
  • If the enum doesn't match, is the result null?
  • edit: now I read the docs more carefully and realise that option capture match Some(final int); is declaring int so I can use it in following expressions, I can now answer the Pair(final a, final b) question, great. But composability is now an issue – maybe is this actually a Void expression?

However, I think your simpler expression without capture is pretty clear:

option match Some(int) if (int % divisor == 0)
  • Returns a Bool, works just like switch option { case ... }

I understand the motivation for wanting a short match syntax and agree we could do something here, but I don't think I see the motivation for short extraction syntax when an expression like

final int = switch option { case Some(int): int; default: null; }

Is clear and not that much longer, I can see how the Pair(final a, final b) case will work and I know that no match will return default:

If we do decide that a short-hand extraction syntax is important, maybe we could do it by allowing a slimmed switch statement:

final int = switch option Some(int): int else null;

@vonagam
Copy link
Member Author

vonagam commented Aug 4, 2020

But composability is now an issue – maybe is this actually a Void expression?

is match - boolean expression.
if match - type determined by branches.
extract match - void, same as variable declaration.

I can now answer the Pair(final a, final b) question

Yes and simple switch expression does not answer how to extract both a and b.

the motivation for short extraction syntax when an expression like

Besides lacking multi variable support, you need to compare this two expressions for clarity:

final int = switch option { case Some(int): int; default: throw false; }
// vs
final int = option extract match Some(int);

Where instead of Some you will have more complex pattern. So let's pretend that this is a complex one and needs to be on a separate line:

final int = switch options { case {
  x: Some(int)
}: int; default: throw false; }
// vs
final int = options extract match {
  x: Some(int)
};

maybe we could do it by allowing a slimmed switch statement

Like the ideas there.

@vonagam
Copy link
Member Author

vonagam commented Aug 4, 2020

Removed vars whitelist to avoid confusion about type of extract match expression, also removed that thing about $ifBody being everything that follows. Now it is only about variables declarations and $elseBody if provided must exit the flow with return/break/continue/throw.

An example for why extract match is useful. Let's say you have macro that operates on a propery field and you want to get its get and set strings. If haxe.macro.Expr.FieldType had been not an enum but a class, you could write this line to access those properties:

var fProp = cast(fType, FProp);
// access get and set on fProp

Currently to do the same with enum you write this:

switch (fType) {
  case FProp(get, set, _, _):
    // access get and set through variables
  case _: 
    throw false;
}

Access to get and set is gained but with two indentations, additional lines and required custom handling of wrong type.

With extraction:

fType extract match FProp(get, set, _, _);
// access get and set through variables

Alternatively extracting match can be made to return an anonymous structure with captured variables:

var fProp = fType extract match FProp(get, set, _, _);
// access get and set on fProp

@vonagam
Copy link
Member Author

vonagam commented Aug 4, 2020

Renamed extract match to capture match and now it returns a structure with captured variables. Just like in last snippet of previous comment. I hope that this variant does not have the same ambiguity about scope or type.

Alternative syntax using case without where:

// test
case is (option, Some(int), int % divisor == 0);

// condition
case if (option, Some(int), int % divisor == 0) {};

// capture
case vars (option, Some(int), int % divisor == 0);

With case and if:

// test
option case is Some(int) if (int % divisor == 0);

// condition
option case Some(int) if (int % divisor == 0): {};

// capture
option case vars Some(int) if (int % divisor == 0);

@vonagam
Copy link
Member Author

vonagam commented Aug 5, 2020

Changed design to last commented alternative.
It uses case again, but this time consistently with itself and switch one.
Conditional usage being exactly like in switch, just without switch statement itself.

(And to repeat - case vars simply returns a structure with captured variables, no funny interactions with scope or following expressions.)

How do you find it?

@kLabz
Copy link

kLabz commented Aug 5, 2020

Honestly this doesn't look like haxe at all to me, more like a succession of [random] keywords.
Also, I don't see the point of separating option case is Some and option case Some, the latter seems enough to handle the former inside a if clause.

@vonagam
Copy link
Member Author

vonagam commented Aug 5, 2020

I don't see the point of separating option case is Some and option case Some

Instead of:

option case is Some(_);

You would prefer to always write this?:

option case Some(_): true else false;

Checking whenever pattern matches or not is very common scenario that's why it grants its own shortcut and this is what .match currently does. case is is an enum/pattern counterpart to classes is.

more like a succession of [random] keywords

switch option { case Some(int) if (int % divisor == 0): {} case _: }
// shortcut
       option   case Some(int) if (int % divisor == 0): {}

So just because switch is omitted it is too random?

@kLabz
Copy link

kLabz commented Aug 5, 2020

You would prefer to always write this?:
option case Some(_): true else false;

No, I meant if (option case Some(_)), but that's really just if (option.match(Some(int) if (int % divisor == 0)), an improved version of current match() that'd work with guards and tuples (or something like that) for non-enum pattern matching.

@vonagam
Copy link
Member Author

vonagam commented Aug 5, 2020

Well, wouldn't it cause the scope and type confusion of case expression?
And how would it work with || and &&?

@Simn
Copy link
Member

Simn commented Aug 5, 2020

Well... I appreciate that you put so much thought into improving Haxe as a language and I don't mean to shut you down, but at the same time I feel a bit bad if I think you're just wasting your time here. While I acknowledge that there are certain limitations with match and that this proposal would overcome those, the proposed syntax is not a good fit for Haxe. I'll have to agree that it "doesn't look like haxe".

I also sometimes want some notion of "match and extract", but I usually don't want it enough to justify the addition of an entirely new syntax. Maybe if we find something that looks just right, but so far that hasn't been the case.

Overall, I'm sorry to say that this proposal has no chance of being accepted. I'm fine with keeping it open to discuss the matter further, but please be aware that this isn't going to make it into the language.

@vonagam
Copy link
Member Author

vonagam commented Aug 5, 2020

Thanks, i'll close this then.

At the moment, i don't see how it can be "more haxe" than literally being syntax of a single case switch without word "switch" (with requirement of using if for guard):

switch option { case Some(int) if (int % divisor == 0): {} case _: }
// shortcut
       option   case Some(int) if (int % divisor == 0): {}

@vonagam vonagam closed this Aug 5, 2020
@benmerckx
Copy link
Contributor

As the discussion here is interesting, I'd like to add that destructuring (HaxeFoundation/haxe#4300 with var/final and enum matching) might meet some use cases. For example in Rust a pattern can be used in a let statement which can further be used in an if let construct:
https://doc.rust-lang.org/book/ch18-01-all-the-places-for-patterns.html#let-statements
https://doc.rust-lang.org/stable/rust-by-example/flow_control/if_let.html

Which in haxe could look something like:

// compile error: https://doc.rust-lang.org/book/ch18-02-refutability.html
final Some(int) = option;

// but:
if (final Some(int) = option)
  trace(int);

// dealing with conditions?
if ((final Some(int) = option) && int % divisor == 0)
  trace(int);

@vonagam
Copy link
Member Author

vonagam commented Aug 5, 2020

Yes, destructuring would have been useful in that capacity, shame that it was shut down for not being simple enough.
And it was proposed to solve the exact same problem of capture extractions too.

@vonagam vonagam reopened this Aug 5, 2020
@vonagam
Copy link
Member Author

vonagam commented Aug 5, 2020

As destruction can handle extraction of captured variables case vars/capture match/cast case can be removed. This opens more room for maneuvers in this proposal.

Removed mentioned shortcut.
Updated syntax to this:

// test
case(option, Some(int), int % divisor == 0);

// condition
if case(option, Some(int), int % divisor == 0) {}

if and if case are different things, to get captured variables inside condition body you need to use if case without parenthesis around case.

Alternatively it can be case if instead of if case, and also switch instead of case (as usual switch expects only one argument there will be no conflict with current switch).

@nadako
Copy link
Member

nadako commented Aug 5, 2020

I would love to have this functionality, but I don't immediately like any of the discussed syntaxes... I'm wondering if we could re-use the extractor pattern syntax for this, e.g.

var isEven = case option => Some(int) if (int % 2 ==0);

UPD: found some real-world code to try this approach for the public to judge:
image
image

@vonagam
Copy link
Member Author

vonagam commented Aug 5, 2020

To iterate on extractor variant and differentiate between a test vs a condition (which should not be able to be used in composition) and help with parsing while nested in switch - is prefix can be added to boolean check:

// test
is case option => Some(int)

// conditional
if (case option => Some(int)) {}

Alternative is to use same approach with infix case for better readability (and same reasons of distinction, plus it is shorter):

// test
is option case Some(int)

// conditional
if (option case Some(int)) {}

@grepsuzette
Copy link

grepsuzette commented Aug 6, 2020

@nadako

I would love to have this functionality, but I don't immediately like any of the discussed syntaxes... I'm wondering if we could re-use the extractor pattern syntax for this, e.g.

var isEven = case option => Some(int) if (int % 2 ==0);

Maybe we can try to alter it somewhat, using the same idea as in:

if (option.match(Some(int)) && int % divisor == 0) {
  then(int);
} else {
  or();
}

The latter seemed a very natural extension of the already existing EnumValue.match(pattern:Dynamic) : Bool, but which for some reason I rarely used.

The latter also had the advantage that there is no confusion because of the subsequent if, e.g. compare the following:

  1. if (option.match(Some(int)) && int % divisor == 0) then(int)
  2. if (case option => Some(int) if (int % 2 ==0)) then(int);

I would say the postfixed if was adapted for switch/case but not here.

Then if we consider 8/10 times it would be used without a "guard", like this:

  1. if (option.match(Some(int))) then(int)
  2. if (case option => Some(int)) then(int);

If we consider that case, I start to actually prefer your proposal...

Now how about we alter the if so it becomes a &&:

if (case option => Some(int) && (int % 2 ==0)) then(int);

Seems it then becomes very natural.
And about the scope, I think it's ok - since after a few days of Haxe, not having a var in for (e in a) .. already feels the most natural thing in the world...

@vonagam
Copy link
Member Author

vonagam commented Aug 6, 2020

I would say the postfixed if was adapted for switch/case but not here.

That's right and this is one of problems that design should overcome.

But using && instead of if as a guard is problematic too as it creates sense that extracting case inisde if is a composable expression (so || comes into play) and this complicates generation of resulting switch, potentially making it not single-case and with code duplication of conditional body....

Another limitations is that if requires parens around condition to be consistent with other ifs in haxe language. And there is a need to communicate difference between boolean composable check and extracting conditional non-composable check.

One option not mentioned before is to go for in instead of if for conditional and it gives:

is case option => Some(int)

in case option => Some(int): {}

Allowing to keep using if for a guard as it does not lead to if(if). Same with option case:

is option case Some(int)

in option case Some(int): {}

is is optional here (but in is case it helps parser with situation of being nested inside a switch).

Also for can be an alternative to if with parens, but clear indication that you can't compose statement inside:

is case option => Some(int)

for (case option => Some(int)) {}
option case Some(int)

for (option case Some(int)) {}

@vonagam
Copy link
Member Author

vonagam commented Aug 10, 2020

After some internal polling i see that there is no consensus about which syntax is an acceptable one.
Went back to the very first is case, but without if case or cast case.
Short, readable, no new keywords.

@vonagam
Copy link
Member Author

vonagam commented Aug 14, 2020

Moved destructing into #82.
#80 will take care of a guard.
Now this is a simple proposal to rename .match to .case and allow it on any value.

@hughsando
Copy link
Member

Might be a bit late to the party, but I would like a nice simple version of this too. I like just if and ==. We already have:

   if (e == EnumName) { ... }

Just keep this consistent syntax and extend it:

   if (e == EnumNameAndValue(v) ) {  trace(v); }

   // also ? operator

  var x = e == EnumName ? 0 : 1;
  var x = e==EnumNameAndValue(v) ? v : 0;

  // and 
   if (e == EnumBool(true) ) trace("yes it is");

Ie, simply a shortcut for:

 switch(e) { case COND : STATEMENT; default: ELSE }
  = 
 if (e==COND) STATEMENT else ELSE
or
 e==COND ? STATEMENT : ELSE;

which could be applied at the AST level without changing the backends,
I would use Enums more if we had this syntax

@vonagam
Copy link
Member Author

vonagam commented Dec 7, 2020

I would prefer to have a pattern check (and destructing) shortcut that is not limited to enum values.
That's why in destructuring assignments proposal i have used =< operator and here .case method.

(Also extraction in == will create a problem with expectation of composability...)

@Simn Simn added this to the 2021-12 milestone Nov 8, 2021
@Simn
Copy link
Member

Simn commented Nov 8, 2021

lean-reject: questionable benefit

@vonagam vonagam closed this Nov 8, 2021
@Simn Simn removed the lean-reject label Nov 15, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.