Pattern guards #35

Open
wants to merge 2 commits into
base: master


ncik-roberts

This is a proposal for a new syntactic form similar to Haskell's pattern guards. @antalsz and @goldfirere helped prepare this (though mistakes are mine).

A rendered version of the proposal.

Here's an example of the proposed syntax, with carets for emphasis:

match expr with
| Literal x when Map.find x mapped match Some y -> y
         (* ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ *)
| Literal x | Add (0, x) | Add (x, 0) -> print_int x; process x
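
For comparison, here is a rough sketch of how the same logic tends to be written in today's OCaml. The `expr` type, the `IntMap` module, the `process` parameter and the final `Add (a, b)` clause are assumptions made up for illustration; they are not part of the RFC. The sketch uses `find_opt` so that the lookup yields an option.

```ocaml
module IntMap = Map.Make (Int)

(* Hypothetical expression type; only the constructors used in the example. *)
type expr =
  | Literal of int
  | Add of int * int

let eval (mapped : int IntMap.t) (process : int -> int) (e : expr) : int =
  match e with
  | Literal x ->
    (match IntMap.find_opt x mapped with
     | Some y -> y
     | None -> process x)  (* duplicated from the clause below *)
  | Add (0, x) | Add (x, 0) -> process x
  | Add (a, b) -> process (a + b)  (* assumed catch-all, not in the original *)
```

Without a pattern guard, the `process x` right-hand side has to be repeated in the inner match, because a failed inner lookup cannot fall through to the next outer clause.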

@fpottier

Hi!

At a semantic level, this may raise some questions, e.g.: if several successive clauses involve the same expression e, is this expression executed once or several times? As in, say, p when e match p1 -> e1 | p when e match p2 -> e2.

At a syntactic level, with all due respect, this syntax seems quite ugly and confusing to me. Currently the keyword when must be followed with an expression e of type bool. Here it can also be followed by e match p. So, after reading when, we no longer quite know what to expect -- either a Boolean guard, or an arbitrary expression that is going to be examined. That said, if I understand correctly, the old when e can now be viewed as sugar for the new when e match true, so perhaps this is acceptable.

Another syntactic criticism is that this is an abuse (a new use) of the match keyword, potentially creating visual confusion. Are there other proposed syntaxes? How do Haskell, Agda, Idris and friends deal with this?

@ncik-roberts
Author

Hi, thanks for the comments. I'll respond to your comments on syntax shortly, but I first wanted to weigh in on the semantic question you raise, namely whether e should be executed once or twice in p when e match p1 -> e1 | p when e match p2 -> e2.

My feeling is that it is least surprising if e is executed twice: e appears twice in the program text. Some reasoning:

  • Say that executing e performs effects (and indeed this is the only way that the answer to this question could affect program semantics). In writing e twice, the programmer probably intends for e's effects to be performed twice.
  • I can't think of a precedent in OCaml where the runtime semantics guarantees a property based on two expressions being identical.
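
As a point of reference, ordinary Boolean `when` guards in current OCaml already behave this way: a guard expression written in two clauses is evaluated each time a clause is tried. A minimal sketch, where `lookup`, `calls` and `classify` are hypothetical names introduced only to make the effect observable:

```ocaml
(* Counts how many times the effectful guard expression runs. *)
let calls = ref 0

(* Hypothetical effectful lookup, standing in for the expression e. *)
let lookup (x : int) : bool option =
  incr calls;
  if x > 10 then Some (x mod 2 = 0) else None

let classify (x : int) : string =
  match x with
  | x when lookup x = Some true -> "even"   (* first evaluation of lookup x *)
  | x when lookup x = Some false -> "odd"   (* evaluated again if reached *)
  | _ -> "small"
```

Classifying an odd number above 10 runs `lookup` twice, once per guard that is tried.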

It's true that the programmer might want e to be evaluated only once for efficiency. Multiway pattern guards make this behavior available to the programmer:

p when e match (
  | p1 -> e1
  | p2 -> e2
)

Indeed, this is one of the main impetuses for that extension of the RFC.
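
To illustrate the evaluate-once behavior in today's OCaml terms, the scrutinee can be computed a single time and matched against all cases. This is only a sketch: `lookup`, `calls` and `classify_once` are hypothetical, and the `None` case stands in for falling through to later outer clauses, which current OCaml cannot express directly.

```ocaml
(* Counts how many times the effectful expression runs. *)
let calls = ref 0

(* Hypothetical effectful lookup, standing in for the expression e. *)
let lookup (x : int) : bool option =
  incr calls;
  if x > 10 then Some (x mod 2 = 0) else None

let classify_once (x : int) : string =
  (* lookup x is written once, so its effects happen once. *)
  match lookup x with
  | Some true -> "even"
  | Some false -> "odd"
  | None -> "small"  (* stands in for the remaining outer clauses *)
```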

For the sake of comparison: the compilation of a pattern-matching-related feature in Haskell, view patterns, does involve ensuring that identical expressions that are scrutinized in multiple cases are evaluated only once. But Haskell is pure, so this guarantee is just an optimization in Haskell, whereas it would be semantics-changing in OCaml. (Admittedly, view patterns are a distinct feature from pattern guards, which Haskell also has, but I couldn't easily find how Haskell handles pattern guards in the example you describe.)

@alainfrisch
Contributor

alainfrisch commented Jun 15, 2023

I understand this is not directly related to the proposal, but if we are thinking about extending the expressive power of when guards, did you consider making them part of the pattern algebra so that they can appear under or patterns?

  match e with
  |  ( ({ foo; _ } when foo > 0) | ( { bar = true; _ } when 42 match foo)) -> foo
  | _ -> ...

(returns the foo field if it is positive, or otherwise 42 if bar is true)

If we go this way, it would also be useful to provide a way to hide a capture variable from a sub-pattern (only used in a local when-clause).

Another example:

  match e with
  | ((x, _) when x > 0) | ((_, x) when x > 0) -> x
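
In current OCaml a guard attaches to a whole clause rather than to one alternative of an or-pattern, so the second example is typically split into two clauses with a repeated right-hand side. A sketch, where the function name and the `option` result (needed to make the match total) are assumptions:

```ocaml
(* Returns the first strictly positive component of the pair, if any. *)
let first_positive (p : int * int) : int option =
  match p with
  | (x, _) when x > 0 -> Some x
  | (_, x) when x > 0 -> Some x  (* same body, repeated per guard *)
  | _ -> None
```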

@gasche
Member

gasche commented Jun 15, 2023

Alternative proposal

I have also been interested in Lionel Parreaux's UCS work, and made a weird proposal that is as expressive as pattern guards in the 2019 blog post Musings on extended pattern-matching syntax, which you could consider including as related (design-only) work. The syntax proposed was as follows:

let rec filter_map f li = match li with
| [] -> []
| x :: xs and f x with (
  | Some y -> y :: filter_map f xs
  | None -> filter_map f xs
)
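
Because the sub-clauses in this particular example are exhaustive, the same function can already be written in current OCaml with a nested match. A sketch for comparison:

```ocaml
(* filter_map written with a nested match instead of "and ... with". *)
let rec filter_map (f : 'a -> 'b option) (li : 'a list) : 'b list =
  match li with
  | [] -> []
  | x :: xs ->
    (match f x with
     | Some y -> y :: filter_map f xs
     | None -> filter_map f xs)
```

For example, `filter_map (fun x -> if x > 0 then Some (x * 2) else None) [1; -1; 2]` keeps and doubles only the positive elements.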

Back in 2019, this syntax was proposed as the composition of two independent features, "with-patterns" and "generalized handlers". I have looked again at this at the end of March this year, and I decided to instead design it as a single, better-behaved construct. I drafted the following grammar:

e, expr ::= .... | "match" e "with" clauses
clauses ::=
  | p "->" e
  | p "-> ."
  | (clauses "|" clauses)
  | p "and" e "with" clauses

A slightly more abstract presentation:

e, expr ::= .... | "match" e "with" clauses
clauses ::= clause | clauses clause
clause ::= pattern action
action ::=
  | "->" e
  | "-> ."
  | "and" e "with" clauses
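
One way to read the more abstract grammar is as a recursive data type. The following is only an illustrative sketch, not the compiler's actual parsetree: all type and constructor names are invented here, and patterns and expressions are kept abstract as strings.

```ocaml
(* Placeholders: patterns and expressions are kept abstract as strings. *)
type pattern = string
type expr = string

type clause = { pat : pattern; action : action }

and action =
  | Arrow of expr                   (* "->" e *)
  | Refutation                      (* "-> ." *)
  | And_with of expr * clause list  (* "and" e "with" clauses *)

(* Count the ordinary right-hand sides reachable from a clause list. *)
let rec count_bodies (cs : clause list) : int =
  List.fold_left
    (fun acc c ->
      match c.action with
      | Arrow _ -> acc + 1
      | Refutation -> acc
      | And_with (_, sub) -> acc + count_bodies sub)
    0 cs

(* The filter_map example from above, encoded with these types. *)
let filter_map_clauses : clause list =
  [ { pat = "[]"; action = Arrow "[]" };
    { pat = "x :: xs";
      action =
        And_with
          ( "f x",
            [ { pat = "Some y"; action = Arrow "y :: filter_map f xs" };
              { pat = "None"; action = Arrow "filter_map f xs" } ] ) } ]
```

The nesting of `And_with` mirrors the grammar: an action either ends the clause or opens a new set of sub-clauses over a fresh scrutinee.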

Comments on your RFC

(P.S.: These comments came out as a little direct. I should clarify that this is a nicely written RFC and I think it is helpful to get a discussion started. On the other hand, it is hard to make progress on syntactic discussions, and I think that it is helpful to get frank feedback early.)

I think that syntax designs that encourage people to write the same expression several times are bad. This example from the RFC is bad:

match foo with
| Foo f when List.find l f match Some true -> E1
| Foo f when List.find l f match Some false -> E2
| Foo f -> (* [List.find l f] is None *) E3

We should design the syntax from the start to discourage this. My proposal for this was basically to reuse Agda's and ... with construct, instead of Haskell's guards. They have similar expressivity, but and .. with is designed from the start to come with sub-clauses, instead of being part of a single pattern.
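
For this specific example the lookup can already be written once in current OCaml, since the three outcomes together cover every result of the lookup. A sketch under assumptions: `List.assoc_opt` stands in for the RFC's lookup, the results are plain strings, and the `Bar` constructor is hypothetical.

```ocaml
(* Hypothetical type; Bar is added only to give the outer match another case. *)
type t = Foo of int | Bar

let describe (l : (int * bool) list) (v : t) : string =
  match v with
  | Foo f ->
    (* The lookup expression appears once and is matched against all cases. *)
    (match List.assoc_opt f l with
     | Some true -> "E1"
     | Some false -> "E2"
     | None -> "E3")
  | Bar -> "other"
```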

My other comment: I think that the many extensions you propose are not going to help get support for the RFC; they each have some merit, but they each also give reasons to dislike the proposal and shoot it down. In my experience, in the realm of syntax, "optional extensions/generalizations" do not impress people with your forward thinking, they scare them away. I think that you should focus on one simple syntactic extension proposal, and try to get buy-in for it.
(My sensibility would suggest to remove all extensions and the main proposal, and keep only "multiway pattern guards". I think this gets the best ratio between expressivity|convenience and amount of new syntax. But of course you may have a different preference/opinion.)

@gasche
Member

gasche commented Jun 15, 2023

Note: previous RFC #12 is also relevant prior work.

@gasche
Member

gasche commented Jun 15, 2023

Another relevant previous work is the work on pattern guards for Successor ML, see https://github.com/JohnReppy/compiling-pattern-guards . The syntax that they propose is also "bad" in the sense that it encourages guarding on the same expression several times; but in their workshop abstract they detail compilation strategies for pattern guards, and those are easy to extend to other features in this space.

@ncik-roberts
Author

ncik-roberts commented Jun 15, 2023

@alainfrisch : I hadn't considered that, and it's an interesting idea. I believe that implementing this would be a more significant undertaking, as it touches the innards of the pattern match compiler more. In any case, I don't believe the suggestion is at odds with any aspect of the current RFC, so I would move to leave it out of this discussion.

@gasche : Thanks for the pointers to those discussions. The match e with p and e' with ( p1 -> e1 | p2 -> e2 ) syntax you describe in this post did come up in a discussion, so it's no coincidence that this RFC proposes a similar construct to that, but I hadn't seen the post.

It's helpful to know what pieces of this RFC you feel are more or less important. It happens that my preference is the same as yours, I believe. I find what I call "multiway pattern guards" to be a much more important extension than the others mentioned in the RFC, given that they encourage writing an expression once and matching it against different cases. (Indeed, the RFC agrees with you that the bad example you quote is bad — it's the motivation for multiway pattern guards.)

I plan on reworking the RFC to highlight "multiway pattern guards" as the main syntactic form and to drop mention of the constructs I feel are less essential — namely, while and if constructs. While I feel that my personal thought process benefited from considering how the idea could be extended to other constructs, I can see how mentions of these things could distract from the RFC as a whole.

EDIT: In my rework, I plan also to do a better comparison against Haskell/Idris/Agda, as @fpottier suggested.

@gasche
Member

gasche commented Jun 15, 2023

One question that is delicate and deserves careful discussion is whether | p and e with <clauses> should enforce the <clauses> part to be exhaustive (with an exhaustivity warning otherwise), or on the contrary accept partial clauses with the "obvious" fallback semantics of looking further down in the toplevel clause set.

Clearly, if we force sub-clauses to be exhaustive, we are not actually getting much benefit in expressivity, as this could generally be written as nested matches. But if we don't check exhaustivity, it is easy for people to shoot themselves in the foot by writing a non-exhaustive sub-clause without noticing.

I have considered the following options:

  1. Doing nothing (not expecting subclauses to be exhaustive, and hoping that users will not assume otherwise).
  2. Warning if people use exhaustive subclauses ("use a nested match instead"), so that everyone is constantly reminded that subclauses are not exhaustive and have a fallthrough semantics.
  3. Adding some explicit | _ -> continue syntax for explicit fallback, and warning if it is not used. Downsides: (1) syntactically heavy, (2) requires picking a new keyword, (3) people are going to dislike it strongly.

It is not obvious which approach is best, maybe (2).
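
To make the fallthrough semantics concrete, here is one possible encoding in current OCaml, where the partial sub-clauses return an option and `None` plays the role of an explicit "continue". This is a sketch of the intended behavior only, not a proposed compilation scheme; the `shape` type and all names are hypothetical.

```ocaml
type shape = Circle of int | Square of int

(* The guarded clause with its partial sub-clauses.
   None means "no sub-clause matched, keep looking further down". *)
let try_guarded (names : (int * string) list) (s : shape) : string option =
  match s with
  | Circle r ->
    (match List.assoc_opt r names with
     | Some "unit" -> Some "the unit circle"
     | _ -> None)  (* explicit fallback *)
  | Square _ -> None

let describe (names : (int * string) list) (s : shape) : string =
  match try_guarded names s with
  | Some result -> result
  | None ->
    (* The remaining top-level clauses. *)
    (match s with
     | Circle _ -> "some circle"
     | Square _ -> "some square")
```

With exhaustive sub-clauses the `None` branch would never be taken, which is why a plain nested match suffices in that case.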

@Kakadu

Kakadu commented Jun 15, 2023

For me the syntax Foo f when List.find l f match Some true is overweight, and I immediately want to refactor it into a separate "function"... Which leads me to the concept of active patterns from RFC #12 mentioned by Gabriel.

So, I have a meta-question: @ncik-roberts, what is your opinion about active patterns? In which cases is your approach more concise? Do active patterns (in your opinion) have issues other than difficulties in efficient compilation?

@yallop
Member

yallop commented Jun 15, 2023

One question that is delicate and deserves careful discussion is whether | p and e with <clauses> should enforce the <clauses> part to be exhaustive (with an exhaustivity warning otherwise), or on the contrary accept partial clauses with the "obvious" fallback semantics of looking further down in the toplevel clause set.

I think it'd be better to prefer <clauses> that are inexhaustive, and issue a warning for exhaustive clauses.

| p and e with <exhaustive-clauses> is equivalent to | p -> (match e with <exhaustive-clauses>) and would be clearer written that way.

The distinctive advantage of ... and e with is the ability to fall back to the next case in the top-level clause sequence. For cases with exhaustive clauses there's no advantage to using the new feature over the existing simpler construct, and it'd be reasonable to treat doing so as a user mistake by default.

@ncik-roberts
Author

I've pushed a new commit that refocuses the RFC to:

  • make the multi-case nature of pattern guards the main point of the proposal.
  • remove mention of related syntactic constructs that distracted from the main point.

You'll also find some examples of how other languages present similar constructs to the programmer. Of all languages considered, Idris and Agda have constructs that are most similar to the one in the RFC.

@ncik-roberts
Author

@Kakadu I've read through that RFC and believe it's independent from this one. (Well, they both deal with pattern matching, but I believe the axes are independent.) That's more prominent now with the refocus I just pushed.

  • This RFC is about writing nested non-exhaustive matches that fall through to the outer match.
  • That RFC is about creating a new kind of pattern whose matching involves running some user-defined code.

My impression is that, even if the language had active patterns, we still might want the feature in this RFC to allow for a partial match on an active pattern to fall through to the outer match.

@Kakadu

Kakadu commented Jun 22, 2023

@Kakadu I've read through that RFC and believe it's independent from this one. (Well, they both deal with pattern matching, but I believe the axes are independent.) That's more prominent now with the refocus I just pushed.

  • This RFC is about writing nested non-exhaustive matches that fall through to the outer match.
  • That RFC is about creating a new kind of pattern whose matching involves running some user-defined code.

My impression is that, even if the language had active patterns, we still might want the feature in this RFC to allow for a partial match on an active pattern to fall through to the outer match.

Your explanation is decent. Please continue pushing this PR ;)

6 participants