Or patterns #43
|
I really like this idea and often wanted to have something like this. I assume an or-pattern where different parts reference a different set of variables or the variables have different types will just result in an error message? |
That's what I thought, but I think if we do something like: then we can allow different types for the same variables in different patterns. Example:
data T a = C1 Int | C2 Bool
f :: T a -> String
f (C1 x | C2 x) = show x
If we type check
f a = case a of
  C1 x -> show x
  C2 x -> show x
To keep things simple (and within the parts of the compiler internals that I understand), I said "same set of variables of same types". We can either generalize this later or right now if type checker experts can chime in here. One thing to note here is that if we restrict patterns to bind the same set of variables of the same types, I think desugaring gets a lot simpler: we just generate a local function for the RHS, and call it with the same set of variables in each case (similar to how |
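A minimal sketch of the "local function for the RHS" desugaring mentioned above, assuming both alternatives bind `x` at the same type. The type `S` and the helper `rhs` are hypothetical names chosen for illustration, and the or-pattern shown in the comment uses the `|` syntax floated in this thread; it is not compiled here.

```haskell
-- Hypothetical type: both constructors carry an Int, so x has one type.
data S = C1 Int | C2 Int

-- Source form under discussion (not valid Haskell here):
--   f (C1 x | C2 x) = show x
--
-- Possible desugaring: the shared right-hand side becomes a local
-- function, and every alternative calls it with the bound variables.
f :: S -> String
f s = case s of
  C1 x -> rhs x
  C2 x -> rhs x
  where
    rhs :: Int -> String
    rhs x = show x
```

Because `rhs` takes exactly the variables bound by the alternatives, the right-hand side is written and compiled once, which is roughly what makes the "same variables, same types" restriction convenient.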
|
One alternative to this feature is simply a warning for the existence of a wildcard pattern match. |
What would be the fix for that warning? Without or patterns you just replace one problem with another (namely, wildcard patterns with repetitive and/or duplicated code). |
That's true, but (IMO) it's replacing a big problem (wildcard patterns) with a small(er) one (duplicating the RHS on all the patterns you want to treat uniformly). |
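To make the trade-off concrete, here is a small sketch with a hypothetical `Token` type (not taken from the thread). It contrasts a wildcard match with the explicit enumeration that a wildcard warning would push people towards.

```haskell
-- Hypothetical type, for illustration only.
data Token = Plus | Minus | Star | Number Int

-- Wildcard version: if a constructor is added to Token later, this still
-- compiles without warnings and the new constructor falls into the last
-- equation, whether or not that is intended.
isOperatorWild :: Token -> Bool
isOperatorWild (Number _) = False
isOperatorWild _          = True

-- Explicit version: with -Wincomplete-patterns, GHC reports a missing
-- case once a constructor is added, at the cost of repeating the RHS.
isOperator :: Token -> Bool
isOperator Plus       = True
isOperator Minus      = True
isOperator Star       = True
isOperator (Number _) = False
```

An or-pattern would let the three operator equations collapse into one while keeping the exhaustiveness check.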
|
I would be very happy to have these. OCaml has or-patterns already (https://caml.inria.fr/pub/docs/manual-ocaml/patterns.html#sec108), so we could borrow ideas from its implementation. A real-world use case where or-patterns would be helpful that I mentioned just the other day to someone working on improving iOS support in GHC:
picCCOpts :: DynFlags -> [String]
picCCOpts dflags
    = case platformOS (targetPlatform dflags) of
      OSDarwin
          -- Apple prefers to do things the other way round.
          -- PIC is on by default.
          -- -mdynamic-no-pic:
          --     Turn off PIC code generation.
          -- -fno-common:
          --     Don't generate "common" symbols - these are unwanted
          --     in dynamic libraries.
       | gopt Opt_PIC dflags -> ["-fno-common", "-U__PIC__", "-D__PIC__"]
       | otherwise -> ["-mdynamic-no-pic"]
      OSMinGW32 -- no -fPIC for Windows
       | -- ...
The As for syntax, I'm +1 on using As for semantics when binding variables of different types, matching on existential constructors, etc., I'd be content with an incremental approach of supporting the simple cases first; that's where most of the value is anyways. |
|
We should have done this a long time ago. I use it all the time in Rust too, and miss it all the time in Haskell. One interesting thing to note is that exhaustive lazy or patterns make sense. |
|
Can I use it like this? |
Yes although in this situation you would need parentheses around An or-pattern is again a pattern so it can appear wherever a pattern can, including for example nested within another pattern ( |
|
@vagarenko, like @rwbarton said, an or pattern can appear anywhere that a pattern can appear. OCaml already supports or patterns in full generality (i.e. they can appear anywhere that patterns can appear). This is from Real World OCaml:
let is_ocaml_source s =
  match String.rsplit2 s ~on:'.' with
  | Some (_,("ml"|"mli")) -> true
  | _ -> false
In Rust this is not the case; the reference says "Multiple match patterns may be joined with the | operator." I started thinking about the implementation. First, I think we may have to make significant changes in the parser. Currently patterns are a subset of expressions, so we have productions like this:
With this change patterns won't be a subset of expressions, so we may want to first parse for a pattern, and then try to transform it into an expression. Does anyone have any other ideas on this? |
|
Pattern syntax was never really a subset of expression syntax (especially before TypeApplications): I'm not sure whether the pattern parser reuses the expression parser out of technical necessity (e.g., we don't know up front whether we are parsing a pattern or an expression) or out of convenience. If it's the latter it might be time to create a separate pattern parser. IIRC the reuse of the expression parser already causes some oddities around the precedence of |
|
It's out of necessity. If a line begins Naked top-level splices are a misfeature, in my opinion. |
|
Is this proposal to force the pattern-match to match every possible ADT value explicitly? There are times where just having a wildcard Actual use-case -- we start with the following ADT definition and call-sites: Now, assume that within Another one: assume that within Therefore, my question. Can we add a pragma to ADTs to disallow wildcards? eg. Once this is done, having the |
|
Thanks for the examples @saurabhnanda. Your examples are exactly the same as the second example I gave in the proposal. About the pragma: it sounds like a good idea, but it's orthogonal to this proposal and can be done separately. Even without or patterns it may be useful, so I suggest creating a new proposal for that. So no, this is not a proposal to force pattern matching on every possible ADT constructor. |
|
@osa1 I'm not confident of being able to write all these sections: https://github.com/ghc-proposals/ghc-proposals/blob/master/proposal-submission.rst#content-of-the-proposal-document |
|
I don't really see the value of or in case of non-enums with I.e. at the end it will be about discipline in the team, so I don't see OTOH, feel free to experiment with writing |
I wouldn't call a programmer doing that "lazy". It's more like a hostile attacker actively trying to break the rules. Now, it's unfortunate that this issue exists, but it doesn't take away from the usefulness of |
|
I'm not against or-patterns, even mildly in favour. But
Let's NOT do this. It would add a huge amount of complexity to a basically-simple feature. Each pattern in an or-group should bind the same term variables, existential type variables, and constraints. Don't forget there is work to do to fix up the pattern-match overlap checker. |
|
Beyond some level of complexity, a legitimate complex rule that is a catch-all (but may change in the future) and a workaround for bypassing the wildcard restriction feature are indistinguishable. And when the data type changes, the type of the deconstructor changes as well, so all users will be notified. |
It was a necessity even before top-level naked splices, e.g. in the statements of |
Well, it needs an active champion. Anyone who wants to play that role can of course reopen it. |
|
I'd very much love to see this supported in GHC. It'd make the life of the "working industry programmer" significantly better. If it helps simplify design/implementation, you can outlaw binding any parameters in the "or" case. I.e., you can pattern-match multiple constructors, so long as no variables are bound in any of those matches. This is a simplification, and I doubt it would take away much from usability in practice, yet deliver a nice solution to practical problems. (Of course, if it doesn't add extra complication, do allow them; but it's an option perhaps for the first version of an implementation?) |
|
Yes; perhaps we can deliver on nested (field) matches in a future proposal. That is, I propose the following change:
-p1 and p2 bind same set of variables.
+p1 and p2 bind no variables, constraints or dictionaries.
In particular, this change entails not having to worry about GADTs and whatnot for now. (No type refinement of the pattern variable is possible either.) I believe that if there is a semantics for or-patterns that can cover GADTs and bound variables, then this semantics can be expressed as a backwards-compatible extension to this simpler semantics. I think we are rather late to the party:
Let's ship the MVP that catches 90% and think about the hard 10% case afterwards. @LeventErkok (or someone who reads this!) would you be up to recycle this proposal into a new one that delivers the MVP? |
|
When you say
Do you mean that the patterns must not bind any of these, or that any bindings are ignored? For variables probably the former (i.e. requiring the programmer to write Here is an example. Allowed or not? |
The latter, at least in Haskell. The former in the desugaring to Core (where it's all just variables, I guess). I should have been more clear. So your
data U a where
  MkInt :: U Int
  MkInt2 :: U Int
bar :: U a -> a
bar (MkInt | MkInt2) = 42 :: Int
That is, an or pattern will never provide new Givens such as Perhaps it should just be
-p1 and p2 bind same set of variables.
+p1 and p2 bind no variables.
because the proposal already states, in 1.4.1
Maybe that section should be reworded in terms of "Givens"? Not sure that helps. Anyway, my changed proposal would allow GADTs in or patterns (including existentials), but the desugared Core would not bind any of the aforementioned GADTy stuff in the view pattern. NB: A future proposal could relax this requirement and (semantically) allow more (syntactically valid) programs in the process that would be semantically incorrect in the MVP. I believe that is what the current proposal aimed at and is also what caused it to come to a halt, even after it only tried to focus on simple field binders without GADTy stuff. The details are non-trivial, especially if we want to keep the desugaring to view patterns. |
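As a rough illustration of what "no new Givens" means in practice, the sketch below uses ordinary per-constructor equations, which compile today with `GADTs`. The function `tag` is a hypothetical addition. As I read the comment above, an or-pattern under the restricted rule would not bring `a ~ Int` into scope, so a definition like `bar` would presumably need separate equations, while one like `tag` would not care.

```haskell
{-# LANGUAGE GADTs #-}

data U a where
  MkInt  :: U Int
  MkInt2 :: U Int

-- Matching each constructor separately brings the Given a ~ Int into
-- scope, so an Int literal is accepted at the result type a.
bar :: U a -> a
bar MkInt  = 42
bar MkInt2 = 42

-- Hypothetical function whose right-hand side needs no type refinement;
-- it type checks even if the match provides no Givens at all.
tag :: U a -> String
tag MkInt  = "int"
tag MkInt2 = "int"
```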
|
Thanks, guys, for clarifying! |
|
I think this proposal is in the same design space as the As of today, I think you could write a Pattern Synonym/ViewPattern to give a uniform view over diverse constructors -- again with more flexibility of typing. Admittedly that comes at cost of a couple of extra declarations, and some rather clunky code. |
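A minimal sketch of the pattern synonym plus view pattern workaround described above. The type `Shape`, the helper `radius`, and the synonym `Round` are hypothetical names for illustration only.

```haskell
{-# LANGUAGE PatternSynonyms, ViewPatterns #-}

-- Hypothetical type with two constructors we want to treat uniformly.
data Shape = Circle Double | Disc Double | Square Double

-- Extra declaration 1: a view function that unifies the constructors.
radius :: Shape -> Maybe Double
radius (Circle r) = Just r
radius (Disc r)   = Just r
radius _          = Nothing

-- Extra declaration 2: a unidirectional pattern synonym over that view.
pattern Round :: Double -> Shape
pattern Round r <- (radius -> Just r)

-- Use site: one equation covers both Circle and Disc and binds r.
area :: Shape -> Double
area (Round r)  = pi * r * r
area (Square s) = s * s
```

Note that the wildcard has only moved into `radius`, and GHC's exhaustiveness checker cannot see through the view function without a `COMPLETE` pragma, which is part of the clunkiness mentioned above.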
|
@AntC2 the interaction of Or-patterns with
f p1 q1 r1 = ...
f p2 q2 r2 = ...
I think in both cases (i.e. here and with |
|
Note that #522 is an evolution of this proposal that eschews variable bindings altogether (in a forward compatible way), as motivated by #43 (comment). Feel free to offer your opinion there, since this proposal is dead as far as I can tell. |
|
AFAICT the current state of the world is that #522 was accepted, has an implementation planned to ship with GHC 9.12, and there is not currently another proposal in the works for extending the syntax to include bindings as in this original proposal; is that correct? Or is the next step being discussed somewhere else that I didn't find? |
|
Exactly. Or patterns have been merged (https://gitlab.haskell.org/ghc/ghc/-/merge_requests/9229) and will be released with GHC 9.12. There is no active proposal for introducing binders, but feel free to generalise the typing rule in #522. |
Early in this very long PR thread, most of the conversation seemed to be in agreement that ‘all inner patterns bind the same variables with the same type’ was a reasonable starting point. So I suppose I would first generalize the current typing rule to Beyond that, I suspect I don't understand the rules of the game. Is this |
|
The typing rules for pattern types are given on the very last page of https://mpickering.github.io/pattern-synonyms-extended.pdf. Yes, ‘all inner patterns bind the same variables with the same type’ sounds reasonable, and it should also be possible to specify the declarative typing rule along the lines of what you did. The one you presented almost works. Such a specification is a good first step towards an algorithmic implementation, however note that Ψi lists existential type variables unleashed by GADT constructors for example, and Γ might contain match variables in the type of which those existentials occur. Since your rule cleverly restricts all alternatives to the same Γ, this forces subsets of Ψi to equate. But since these binders in Γ reference a subset of Ψi in their type, your rule may no longer return the empty set as existentials in scope after the match, otherwise you lose well-scopedness. In the implementation, you would most likely need to unify any introduced existentials and binders in Γ and keep them in scope, while dropping the rest of ψi and Σi. Thus, it might be simpler to restrict At the end of the day, the restricted form of the proposal was just to color the bike shed and ship something. Perhaps this form is doable as well. |
|
Ah, lovely! I didn't realize the pattern synonyms paper had this tucked away in the appendix. Thank you for the pointer. I had assumed that
data T where
  C :: forall a. Eq a => a -> T
badEq :: (T, T) -> Bool
badEq (C x, C y) = x == y
Desugaring
badEq = \a -> case a of (C x, C y) -> x == y
and applying the typing rules from the paper (large SVG image, sorry, I haven't figured out a good way to render these derivations): we can see that there is a valid derivation for this program per the rules, but clearly With that change, there would be no need for ‘ While I'm picking nits, having now spent some time staring at the rules, I think the intent is that the contexts on the right of the judgment do not contain the contexts on the left of the judgment as subsets, but instead represent only extensions (the base rules for variable patterns and wildcards use empty sets on the right of the judgment, for example). So the rule currently implemented should probably be written as and my conservative extension as But now that I have enough context to be dangerous I have a notion of a more liberal rule I could propose as a further step. Still proving to myself that it's sound under some appropriate reinterpretation of the rules in the paper, though. |
I think your example highlights an important point: In the product rule (
data T where
  C1 :: forall a. a -> (a -> Int) -> T
  C2 :: forall a. a -> (a -> Int) -> T
foo :: T -> Int
foo (C1 x f; C2 x f) = f x
But of course, we could only ever accept such a program if those
Indeed, that's a good point, and easy to fix as well as you point out.
I think so as well. But it's better to be clear and give that condition in one form or another. I think "(all
That's great news :) I'll be happy to assist with the implementation, although perhaps I'm not the best resource. |
|
Okay, here's my current thinking about the theory before shifting gears to implementation. First, I propose some modifications to the typing rules as they appear in the appendix, half to resolve soundness issues, half to cover a few more programs that GHC currently admits. I hope nobody is sick of my underinformed pedantry yet.
Constructor patterns
Soundness issue: renaming
|
|
Implementation question for @sgraf812 or anyone else: How should the renamer rename variables bound by or-patterns? Should
I figure unused name warnings are going to be relevant to this decision but I haven't gotten there yet, and I don't know in what direction they'd influence things. |
|
Before investing too much in an implementation, would it be possible to have a specification, in the form of a GHC proposal? I vaguely remember issues to do with existentials... |
|
If you want one strongly, then yes, but I'd consider myself still exploring whether I'm a good person to bring a proposal and how liberal that proposal would be. Is it not typical to hack out a prototype implementation prior to bringing a formal proposal? It would not be substantially different from this proposal except possibly accepting more of the examples, depending on which typing rule we use out of the two possibilities here. One of those should accommodate the existential issues, as far as I understand them, at the level of theory. Whether it's at all practical to implement is the question I'm currently pursuing. |
|
Indeed, it's absolutely fine to explore implementations. I'm just wanting to avoid a hard working contributor investing lots of effort in an implementation, only to find that the Steering Committee doesn't like the proposal, or finds serious flaws, or suggests alternatives. But nothing stops you exploring, provided you are content that subsequent debate might change the design you are implementing. |
|
I'm confused. From this #43 was spun-off #522, which was accepted and with some later mods. So is the continuing discussion here building on that spin-off or somehow orthogonal to it? wrt the implementation for whatever you're exploring here, wouldn't it be easier to build a PatternSynonym that's equivalent, then take its semantics? |
|
I necro'd this thread with this comment asking if a continuation was being discussed anywhere else; it seemed not. #522 was a reduction in scope of #43, so I figured this was the most likely spot to discuss returning to the scope of #43 or something like it. I don't think involving patsyns would make anything simpler; you'd need a view pattern to implement a patsyn for this and in that case you might as well use the view pattern directly, as #43 proposed. |
|
In reply to #43 (comment):
Agreed. I wonder what is the best place to commit up-to-date typing rules to. I think Artin Ghasivand is currently working with Simon on setting down typing rules for GHC Haskell. He might be interested. At any rate it is indeed hard to review if old and new rule are not written side-by-side, but I agree 100% with your written assessment on Constructor patterns.
Note that a premise in an inference rule is satisfied if it is true for some instantiation of its free variables. This is already a bit shady for index However, since (I read past the desugaring into a data constructor
I'm a bit confused by how your Ultimate rule would allow to type such a thing. Specifically, I fail to see how I would suggest to propose the "Professional" rule first to gather feedback and experience while implementing it. |
I recommend taking how pattern signatures are implemented as a reference, because they solve almost the same problem. Consider this example:
f (_ :: (a, a)) = ()
Here, the first
f :: forall a. ...
f (_ :: (a, a)) = ()
Pattern signatures store all the type variables that they bring into scope in an extension field, so the answer would be |
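For readers less familiar with pattern signatures, here is a small sketch of the two situations contrasted above, assuming `ScopedTypeVariables`. The function names are hypothetical. In the first function the pattern signature binds `a` once, even though `a` appears twice; in the second, `a` is already in scope from the explicit `forall`, so both occurrences are uses.

```haskell
{-# LANGUAGE ScopedTypeVariables #-}

-- No a is in scope from the type signature (it has no explicit forall),
-- so the pattern signature binds a; the annotation on the RHS uses it.
swapSame :: (c, c) -> (c, c)
swapSame ((x, y) :: (a, a)) = (y, x) :: (a, a)

-- Here the explicit forall already brings a into scope, so the pattern
-- signature only refers to that a and binds nothing new.
swapSame' :: forall a. (a, a) -> (a, a)
swapSame' ((x, y) :: (a, a)) = (y, x) :: (a, a)
```

The analogy with or-patterns is that a variable may occur in several alternatives yet should count as a single binding, much like `a` in the first function.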
Ah, very true. And ViewPatterns are an anti-feature. I find when reasoning about patterns that Simon's suggestion/notation helps me see the wood for the trees. To take the Or-like example 2 from here, I write: So to answer your q, there are three variables, none of which should get unused warnings. As to how the code appears, I fear your "principle shminciple" isn't going to fly. Consider I've used an What's with that guard on RHS? Consider example 1 from that proposal: In the builder direction, use the guards on LHS, ignore those on RHS; in the matcher vv. Then we have each side of the |
I inferred inductively, and perhaps foolishly, from the source paper that indexes were treated not as free variables in this sense but as something to be quantified over for each judgment, with overbars used for intra-judgment sequences. I was trying to match that convention; if the result is confusing, I'd probably choose to prefix
The so satisfying the premises It's just like a polymorphic function: the RHS is generalized over types and constraints, and each alternative of the or-pattern, like a call to a polymorphic function, is responsible for providing type arguments and satisfying any constraints with those type arguments substituted in. But unlike with polymorphic functions, the site of the or-pattern is adjacent to the term being generalized, without an intermediate function term that needs a type from the user's perspective. Many of the objections to generalization don't seem to apply when the thing-to-be-generalized is defined with a new syntax that signals that generalization is intended to happen.
Oh, I'm definitely starting with Professional! But if that ends up being within my grasp I think an implementation of Ultimate would be informative enough to be worth trying, and subsequently discussing. I'd expect a lot of the uncertainty around a more powerful rule to stem from questions like how much complexity would it add to the implementation and would it make error messages worse, and if those questions can be answered then we'd all be in a better position to compare the different rules on their merits and not on the amount of uncertainty around them.
Very useful insight; thank you! |
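The polymorphic-function analogy above can be sketched by hand for the earlier `foo` example. This is only an illustration of the idea, not the proposed typing rule or GHC's actual desugaring, and the local name `rhs` is hypothetical. The shared right-hand side is written as a function generalized over the existential type, and each alternative instantiates it at its own type.

```haskell
{-# LANGUAGE GADTs #-}

data T where
  C1 :: a -> (a -> Int) -> T
  C2 :: a -> (a -> Int) -> T

-- Source form under discussion (not compiled here):
--   foo (C1 x f; C2 x f) = f x
foo :: T -> Int
foo t = case t of
  C1 x f -> rhs x f   -- rhs instantiated at C1's existential type
  C2 x f -> rhs x f   -- rhs instantiated at C2's existential type
  where
    -- The RHS, generalized over the type of the bound variables.
    rhs :: a -> (a -> Int) -> Int
    rhs x f = f x
```

In this hand-written form the user has to give `rhs` a type signature; the point made above is that with or-pattern syntax the generalization would be implicit at the pattern site.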
|
@rhendric Awesome work! I've also been wanting for some bindings in Haskell's or patterns. It feels like a peculiar limitation, but I appreciate it since it got us orpats so quickly. What's your progress on it, if you don't mind? |
|
@subterfuge, I did manage to fumble together an implementation that works on the basics, but I stalled out at making a formal proposal based on it. I'd be happy to work with someone else on moving this forward, or I might get back around to doing it myself when the spirit moves me. |
|
@rhendric i'm not too knowledgeable about it, but i can probably learn! just let me know if you've got anything that needs help. i'm a little interested in playing with the impl if you don't mind! |