Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

GADT exhaustiveness check incompleteness #6437

Closed
vicuna opened this issue May 21, 2014 · 71 comments

Comments

Projects
None yet
2 participants
@vicuna
Copy link

commented May 21, 2014

Original bug ID: 6437
Reporter: mqtthiqs
Assigned to: @garrigue
Status: closed (set by @xavierleroy on 2017-02-16T14:16:34Z)
Resolution: fixed
Priority: normal
Severity: minor
Version: 4.01.0
Target version: 4.03.0+dev / +beta1
Fixed in version: 4.03.0+dev / +beta1
Category: typing
Related to: #6403 #6801
Child of: #5998
Parent of: #6220
Monitored by: @gasche @yallop @alainfrisch

Bug description

This code:

type ('a, 'b) ctx =
| Nil : (unit, unit) ctx
| Cons : ('a, 'b) ctx -> ('a * unit, 'b * unit) ctx

type 'a var =
| O : ('a * unit) var
| S : 'a var -> ('a * unit) var

let rec f : type g1 g2. (g1, g2) ctx * g1 var -> g2 var = function
| Cons g, O -> O
| Cons g, S n -> S (f (g, n))
(* | Nil, _ -> (assert false) *)

reports a warning for non-exhaustive pattern-matching, whereas (I think) it should not, since the last commented case is impossible.
Or am I missing something?

File attachments

@vicuna

This comment has been minimized.

Copy link
Author

commented May 23, 2014

Comment author: mqtthiqs

A more concise version of the same bug (I guess):

type 'a t = T1 : ('a * 'b) t | T2 : 'a t -> ('a * 'b) t
type 'a u = U1 : unit u | U2 : ('a * 'b) u

let bug : type a. a t * a u -> unit = function
| t, U2 -> ()
(* | T1, U2 -> () )
(
| T2 _, U2 -> () )
(
| _, U1 -> () ) ( impossible *)

reports a warning for non-exhaustive pattern-matching; but whatever the value of t is, the second argument cannot be U1 (this branch 4 is impossible). Note that eta-expanding the first argument (ie. uncommenting branches 2 and 3) the warning disappears.

Hope it helps.

@vicuna

This comment has been minimized.

Copy link
Author

commented May 24, 2014

Comment author: @garrigue

This is a well known limitation of the way the exhaustiveness check works for GADTs requires a constructor to "seed" counterexamples. Doing it on the basis of the type alone could cause infinite loops.
We should eventually do something about that, but this is not high-priority.

By the way, this is not a bug, since we know that the exhaustiveness check cannot be complete, and do not provide any specification for it other than it should be sound (i.e. detect all non-exhaustive cases).

@vicuna

This comment has been minimized.

Copy link
Author

commented May 30, 2014

Comment author: mqtthiqs

Hello Jacques. Is the first piece of code (in the bug report, not in the comment) of the same kind? I don't see how it could be helped by "seeding" counterexamples.

@vicuna

This comment has been minimized.

Copy link
Author

commented May 31, 2014

Comment author: @garrigue

Indeed the example in the bug report is a simpler case.
The warning can be avoided by reversing the order in the tuple:

let rec f : type g1 g2. g1 var * (g1,g2) ctx -> g2 var = function
| O, Cons g -> O
| S n, Cons g -> S(f (n,g));;

But the cause of the warning is the same: using the original order,
one tries to build a pattern for the Nil context, but then the var part
is not seeded, and the exhaustiveness checked does not try to split
the var into O and S. The O and S of the Cons case do no propagate
to the Nil (and they should not, since GADT matching being left to
right, they might actually not be valid there).

Progress report: I've tried to solve these problem (including 6220)
by using typing information to split a wildcard pattern when checking
exhaustiveness. However, there is a big problem: the current
of exhaustiveness checking does not keep types... Keeping them is
going to be a lot of work, and we should be extremely careful about
not propagating wrong typing information in presence of GADTs.
Compared to these risks, a few false positives does not seem to be
a big problem, particularly since you can really add the reported
pattern to remove the warning.

@vicuna

This comment has been minimized.

Copy link
Author

commented Apr 27, 2015

Comment author: @garrigue

The patch attached to 6403 does not solve this case: not clear where to hook the expansion here.

@vicuna

This comment has been minimized.

Copy link
Author

commented Apr 30, 2015

Comment author: @garrigue

After looking more carefully, this case is really interesting.
To prove that (Nil, _) doesn't accept any value, we need to do a case analysis on the right part, using its type. However, we cannot use the type coming from the other patterns, as it depends on the left part.

The solution seems to be to do the splitting when typing the pattern (to prove that it cannot be typed). The tricky part is that we need to do backtracking in the typing function Typecore.type_pat.

@vicuna

This comment has been minimized.

Copy link
Author

commented Apr 30, 2015

Comment author: @garrigue

The patch type_pat_cps.diff implements a new approach, which can fully exploit typing information.
The idea is to rewrite Typecore.type_pat in continuation passing style, so that it becomes possible to backtrack, looking for a counter-example.
This is rather heavy-weight, and potentially brittle, but I think this is the only way to have accurate warnings. Looking at the testsuite, it indeed removes a number of false positives.

@vicuna

This comment has been minimized.

Copy link
Author

commented Apr 30, 2015

Comment author: @garrigue

Brittle indeed: this doesn't bootstrap yet, due to an interaction with objects.

@vicuna

This comment has been minimized.

Copy link
Author

commented May 7, 2015

Comment author: @garrigue

Updated patch. Now it bootstraps.
I see no slowdown when compiling ocamlc, so I'm thinking of merging this patch to trunk.
This fixes both this example, the one in #6403, and one in typing-gadts/omega07.ml.
Tell me if you see problems.

@vicuna

This comment has been minimized.

Copy link
Author

commented May 8, 2015

Comment author: @garrigue

Cleaner patch. Could simplify parmatch more, as there seems to be no need to do 2 passes.

@vicuna

This comment has been minimized.

Copy link
Author

commented May 10, 2015

Comment author: @garrigue

type_pat_cps3.diff additionally simplifies parmatch, merging the two passes.
Also changed the logic, so that the warning will preferably show ground patterns (they are checked first).
Note that all these patches change the output of exhaustiveness warnings, as or-patterns do not appear in the output (they are always exploded into their components, and only one of them is shown).

@vicuna

This comment has been minimized.

Copy link
Author

commented May 19, 2015

Comment author: @garrigue

Created a branch in the subversion server.
The code can be checked out with
svn checkout http://caml.inria.fr/svn/branches/gadt-warnings

@vicuna

This comment has been minimized.

Copy link
Author

commented Oct 13, 2015

Comment author: @garrigue

Still thinking about what to do without too much cost.
Handling of unused case warnings was added to the branch during ICFP (beginning of September).
Here are the costs

                   | camlinternalFormat | stdlib                | make all

Exh. (current) | 3.1s | 11.4s | 100s
Exh. & Unus. | 3.4s | 11.8s | 102s
Exh. + explode | 3.9s | 12.2s | 101s
Exh. & Un. + ex.| 4.5s | 13.0s | 103s

The cost is not big, but still about 3 percents.
And half of it comes from the unused case analysis.
A partial solution would be to avoid exploding in the unused case analysis,
even if it loses symmetry.

An alternative solution would be to be more explicit about which pattern-matching to analyze.
A potential syntax is

  | pat -> _

to mean that pattern pat is not reached.
That would mean writing

 match e with
 | p1 -> e1
   ...
 | pn -> en
 | _ -> _

to mean that we want explicitly to check that the last case is not needed.
Advanced unused case analysis would only be done in that case, preserving the symmetry.
An extra advantage is that this can also be used to give explicitly deep patterns that are not possible. For instance, in the case "harder" of the ML workshop talk (*),

let harder :
(zero succ, zero succ, zero succ) plus option
-> bool
= function None -> false
| Some (PlusS _) -> _

would force exploding the subterms of PlusS.

My only concern is that it is more verbose, and feels like an overkill.

(*) http://www.math.nagoya-u.ac.jp/~garrigue/papers/index.html

@vicuna

This comment has been minimized.

Copy link
Author

commented Oct 13, 2015

Comment author: @garrigue

Here are a few more times, for make library on another machine.
The number indicates the "exploding depth", i.e. the number of
nested GADTs one may explode for analysis.
Exh(2)+Unus(2): 77.0s
Exh(1)+Unus(1): 6.55s
Exh(1)+Unus(0): 6.35s
Exh(0)+Unus(0): 5.95s
Exh(1): 6.19s
Exh(0): 5.75s

Note that most of the cost is GADT related. And there seems to be no clear bottleneck (except that depths higher than 1 do not seem to be a good idea...)

@vicuna

This comment has been minimized.

Copy link
Author

commented Oct 14, 2015

Comment author: @gasche

As discussed during ICFP, I would be very happy to have an explicit syntax to claim that some patterns are unreachable. I think that would solve many problems with this proposal (by letting users express their intent in a way that can direct the choice of depth), and I suspect that code author could also find it a useful help to reason about the correctness of their GADT assumptions.

I'm not very happy with the syntax you suggest, as "-> _" to me reads exactly like "the right-hand-side is to be inferred or elaborated later". I think a nice syntax for an unreachable pattern should not have a "->" symbol in the first place; if we have to keep it for conflicts-avoidance reason, I would rather piggy-back on something like "(assert false)[@unreachable]", or plain "[%unreachable]".

@vicuna

This comment has been minimized.

Copy link
Author

commented Oct 14, 2015

Comment author: @garrigue

Ok, my syntax proposal was just trying not to confuse the tools trying to parse ocaml code, by making it look like a real case. And while _ is used in Coq for the refine tactic, it is not used in ocaml for anything like that, and even if it were used for implicit arguments this would not be just after an arrow.

This said, I don't care as long as this is not going to take more characters: I don't think we want the user to have to write a lot just to say that a pattern-matching is finished.
So, an alternative syntax would be

match e with
| pat1 -> e1
...
| patn -> en
| _

I find it less explicit, but it indeed doesn't have an arrow...

But please, this is enough work to implement that, do not compose it with a flame war on syntax.

@vicuna

This comment has been minimized.

Copy link
Author

commented Oct 14, 2015

Comment author: @yallop

It's useful to be able to write unmatchable patterns at any point in a match rather than just at the end. From that perspective the rhs-underscore proposal

| pat -> _

is better than the arrow-free syntax

| pat

which interacts badly with or-patterns anywhere except at the end. (I hope this doesn't count as a syntax flame.)

@vicuna

This comment has been minimized.

Copy link
Author

commented Oct 14, 2015

Comment author: @trefis

Note that all these patches change the output of exhaustiveness warnings, as or-patterns do not appear in the output (they are always exploded into their components, and only one of them is shown).

I had missed the part between parentheses at the time.
May I ask why that change was introduced?

@vicuna

This comment has been minimized.

Copy link
Author

commented Oct 14, 2015

Comment author: @garrigue

IIRC recent versions of the branch fix the change for warnings (only explode or-patterns when they contain gadts).

@vicuna

This comment has been minimized.

Copy link
Author

commented Oct 14, 2015

Comment author: @lpw25

One thing that is not entirely clear to me about these proposals is whether they mean this case is absurd (i.e. it cannot happen due to the types of indices) or this case is redundant/unreachable (i.e. it cannot happen due to some combination of the types of indices and the cases above it in the match statement).

They both seem potentially useful, and I'm not sure which one is being proposed.

@vicuna

This comment has been minimized.

Copy link
Author

commented Oct 14, 2015

Comment author: @garrigue

This is the second one (previous cases are taken into account). This is the whole point of combining this with the exhaustiveness/redundancy check. Thanks to that, in most cases one would only have to add a wild pattern clause, the remaining cases to check being automatically inferred.

@vicuna

This comment has been minimized.

Copy link
Author

commented Oct 14, 2015

Comment author: @gasche

what about

| done

or

| false
?

This would give for example:

let rec f : type g1 g2. (g1, g2) ctx * g1 var -> g2 var = function
| Cons g, O -> O
| Cons g, S n -> S (f (g, n))
| Nil, _ done

let rec f : type g1 g2. (g1, g2) ctx * g1 var -> g2 var = function
| Cons g, O -> O
| Cons g, S n -> S (f (g, n))
| Nil, _ false

@vicuna

This comment has been minimized.

Copy link
Author

commented Oct 15, 2015

Comment author: @garrigue

I've added refuted cases to gadt-warnings (revision 16500).
I've kept the syntax "pat -> _" because this is for me the only one clear enough at this point.
(The arrow makes clear this is a full-fledge case, and the _ looks like undefined to me, hence unreachable...; I also tried "pat ->" but this creates conflicts in the grammar.)

A small explanation of how it works:

  • by default, both exhaustiveness and redundancy checks try typing generated patterns, but without exploding wild-cards
  • adding "_ -> _" to a pattern-matching will force exploding once for both exhaustiveness and redundancy (NB, this also means exploding tuples, records, and non-gadt datatypes with only one case)
  • if the pattern matching has only one branch (such as a function parameter or a let-binding),
    the we automatically explode once for redundancy (i.e. a bit like an implicit "_ -> _" automatically added). The rationale is that you cannot add it without modifying the structure of the program, and since there is only one case the cost is small anyway. Also, redundancy is not relevant in that case.
  • "pat -> " are included in both exhaustiveness and redundancy checks. For exhaustiveness, this means that pat is supposed to be handled. For redundancy, one checks that there is no value that can reach this case and be accepted by pat (i.e. based on both previous patterns and pat). The check again can explode wildcards just once, but since you can give an explicit pattern, it becomes possible to build an arbitrary finite proof. Failing to be unreachable is an error (not a warning). Note that technically speaking, " -> _" is also handled this way (i.e. the pattern-matching is made exhaustive, and reachability is checked during unused cases analysis).

Here are some concrete examples from the testsuite:

type zero = Zero
type _ succ = Succ
;;
type (,,_) plus =
| Plus0 : (zero, 'a, 'a) plus
| PlusS : ('a, 'b, 'c) plus ->
('a succ, 'b, 'c succ) plus
;;
let easy : (zero, zero succ, zero) plus option -> bool =
function None -> false (* ok since we explode once for single cases )
;;
let harder : (zero succ, zero succ, zero succ) plus option -> bool =
function None -> false (
warns )
;;
let harder : (zero succ, zero succ, zero succ) plus option -> bool =
function None -> false | Some (PlusS _) -> _ (
does not warn *)
;;

@vicuna

This comment has been minimized.

Copy link
Author

commented Oct 15, 2015

Comment author: @alainfrisch

If the syntax "pat -> " is retained, I'd prefer to consider "" as a new kind of expression that can appear anywhere, syntactically (instead of making the rhs field of "case" optional). This minimizes the impact on the Parsetree (and thus the breakage of existing code), and would allow to use this "_" in other contexts (e.g. for attribute/extension payloads). It could even make sense to have such "unreachable" expression in other contexts, in the future (with the intent of letting the compiler check unreachability). For instance:

let _ = f ... in
_

that would work only if the compiler can prove that f cannot return a value (based on its type).

(That said, and without entering the syntax flame, I find this choice of "" unfortunate, as it prevents to use it to future extensions where "" would be a more natural syntactic candidate.)

@vicuna

This comment has been minimized.

Copy link
Author

commented Oct 15, 2015

Comment author: @gasche

it prevents to use it to future extensions where "_" would be a more natural syntactic candidate

This is precisely my point.

We already have an expression marker for "unreachable code", it is (assert false). If you want to have a "->" (which I think in this particular case is wrong, but then I agree finding a satisfying syntax is difficult), I would rather have

| p [@empty] -> assert false

Note using an attribute to express intent here is backward-compatible and upward-compatible:

  • Backward-compatible: it will not create warnings/errors for people that use (| p -> assert false) for code that is unreachable for other reasons, for example a user-specified partial domain for the function.
  • Upward-compatible: it can evolve in the more general idea that we use static reasoning to verify that "assert false" indeed is unreachable, with (several) attributes used to justify why a specific occurence is unreachable. Here it is unreachable because the pattern cannot match a value.

I think that a good longer syntax is better than a bad shorter syntax in this instance. We are talking about complex code (the user is doing non-trivial reasoning here), so the code author is not in a hurry. Your (Jacques') idea of implicitly assuming "_ [@empty] -> assert false" after each one-pattern construction is excellent and will solve the one case that really needs concision.

@vicuna

This comment has been minimized.

Copy link
Author

commented Oct 15, 2015

Comment author: @alainfrisch

I agree with most of Gabriel's points, except that I'm not a big fan of the idea of using attributes to affect the way type-checker works, except w.r.t. warnings (technically, we could argue that the current discussion is about the exhaustiveness check, which only triggers a warning, but this one should really be a strong error).

@vicuna

This comment has been minimized.

Copy link
Author

commented Oct 15, 2015

Comment author: @garrigue

Well, the real point is that this is not about empty patterns, but unreachable right-hand side.
assert false does not denote a statically unreachable point, but something that shouldn't be reached at runtime.
Maybe having a category of static assertions (checked by the compiler) would solve the problem, but then we need a new keyword...

@vicuna

This comment has been minimized.

Copy link
Author

commented Oct 15, 2015

Comment author: @garrigue

Also, the reason it should be compact:

The idea is that the programmer should just add the line

| _ -> _

to a pattern matching involving gadts, and expect it to work.
If it returns a message about some value being reachable, just add it again (before this line), and it will be refined.
Incrementally, you can ensure that your pattern-matching is complete, without doing any advanced reasoning about refutation.

Another reason you want to put it is that it enables exploding in the redundancy check, which is more symmetric.

So, the worse thing would be that to be seen as an advanced feature you only use to optimize your code. It allows you to write better code, not just faster code.

@vicuna

This comment has been minimized.

Copy link
Author

commented Oct 15, 2015

Comment author: @lpw25

Another reason you want to put it is that it enables exploding in the redundancy check, which is more symmetric.

I'm slightly unclear on this part. If you don't enable exploding in the redundancy check will you still get a warning, or is the default no warning.

It is also not clear why adding this case should improve the redundancy check. Asserting that a case is unreachable says nothing about the redundancy of cases above it, and it seems pretty arbitrary to treat it as if it did.

@vicuna

This comment has been minimized.

Copy link
Author

commented Oct 15, 2015

Comment author: @lpw25

I haven't checked the grammar properly but how about:

| _ -> *

I think it '' slightly better than '' because '' feels like it is saying "there is something here but I'm not specifying it" whereas '' feels like it is saying "anything could go here" (taking its traditional wildcard meaning from globs). It is also unlikely to be given special meaning elsewhere since it is a binary operator in both expressions and types -- so any future use would already be quite restricted.

@vicuna

This comment has been minimized.

Copy link
Author

commented Nov 20, 2015

Comment author: @trefis

The "pref = []" check here https://github.com/ocaml/ocaml/blob/trunk/typing/parmatch.ml#L1933 looks dubious.

I don't know why it was added so I don't dare just remove it, but it implies that all of the following examples are accepted by the compiler:

let f () = match None with _ -> .;;
let g () = match None with _ -> () | exception _ -> .;;
let h () = match None with _ -> .  | exception _ -> .;;
@vicuna

This comment has been minimized.

Copy link
Author

commented Nov 21, 2015

Comment author: @garrigue

@trefis:
Well-spotted. With the introduction of unreachable cases, the "pref = []" condition must be corrected.
Fixed in trunk, commit f4f858.

@vicuna

This comment has been minimized.

Copy link
Author

commented Nov 23, 2015

Comment author: @trefis

I just noticed that I also get:

# let f x = match x with _ -> () | None -> .;;
Characters 33-37:
  let f x = match x with _ -> () | None -> .;;
                               ^^^^
Warning 11: this match case is unused.
val f : 'a option -> unit = <fun>

both before and after f4f858.

I assume this is because of the "&& q.pat_desc = Tpat_any" here https://github.com/ocaml/ocaml/blob/trunk/typing/parmatch.ml#L1929 , which I think shouldn't be there.

@vicuna

This comment has been minimized.

Copy link
Author

commented Nov 23, 2015

Comment author: @trefis

Indeed if I remove that part of the condition the example works as expected and the testsuite still passes.

@vicuna

This comment has been minimized.

Copy link
Author

commented Nov 23, 2015

Comment author: @garrigue

No, this is the intended behavior.
The idea is that this clause is covered by previous cases, which is different from being unreachable because of typing. If you write it, you expect it to be needed for the match to be complete.
Removing the condition would never warn for unreachable clauses which are already covered.
There is a choice here to do about the semantics.

@vicuna

This comment has been minimized.

Copy link
Author

commented Nov 23, 2015

Comment author: @gasche

Indeed, I would expect to get some sort of warning if I write an unnecessary absurd clause (do we agree to call this clauses "absurd clauses", or what is the right vocabulary?). It may be useful, to avoid Thomas' confusion, to use a different warning message when the unnecessary clause is an absurd one.

I would suggest (with the de rigueur apology for my verbosity):
"Warning 11: this absurd clause is unnecessary, its pattern is covered by previous cases."

You may drop the part after the comma and still get a decent warning message.

@vicuna

This comment has been minimized.

Copy link
Author

commented Nov 23, 2015

Comment author: @trefis

Removing the condition would never warn for unreachable clauses which are already covered.

Yes that's what I expected, but this is probably a result of me not reading this thread carefully enough.
Would you mind explaining then (and sorry if you already explained, I must have missed it) why you do warn in such cases except if the pattern is a wildcard, i.e. why [let f x = match x with _ -> () | _ -> .] doesn't trigger the warning?

Indeed, I would expect to get some sort of warning if I write an unnecessary absurd clause (do we agree to call this clauses "absurd clauses", or what is the right vocabulary?).

Fair enough. The warning specialization for "absurd clauses" might be nice then, but I don't have a strong opinion.

@vicuna

This comment has been minimized.

Copy link
Author

commented Nov 23, 2015

Comment author: @gasche

My understanding is that using a final "| _ -> ." is to be understood as a statement of intention of exhaustivity: I declare the pattern-matching above to be complete, with all cases covered. And in particular that means: compiler, please work harder unfolding cases to check that I'm not mistaken.

@vicuna

This comment has been minimized.

Copy link
Author

commented Nov 23, 2015

Comment author: @trefis

Thank you, sorry for the noise.

@vicuna

This comment has been minimized.

Copy link
Author

commented Nov 23, 2015

Comment author: @gasche

Hopefully performing code reviews that clarify features and fix type-checking bugs are not yet considered "noise" around here!

@vicuna

This comment has been minimized.

Copy link
Author

commented Nov 23, 2015

Comment author: @lpw25

My understanding is that using a final "| _ -> ." is to be understood as a statement of intention of exhaustivity: I declare the pattern-matching above to be complete, with all cases covered.

But in this case the warning is only being raised for cases that are trivially covered (i.e. without recourse to type information). I don't see, if the None deserves a warning, why the _ doesn't.

@vicuna

This comment has been minimized.

Copy link
Author

commented Nov 24, 2015

Comment author: @garrigue

Indeed, I would expect to get some sort of warning if I write an unnecessary absurd clause (do we agree to call this clauses "absurd clauses", or what is the right vocabulary?). It may be useful, to avoid Thomas' confusion, to use a different warning message when the unnecessary clause is an absurd one.

I rather call them "unreachable clauses", because what one would call "absurd" is the intersection of the pattern with the complement of the previous patterns

Changing the warning message is easy.
The question is whether this should be actually a different warning (or whether we really need a warning for that).
Indeed, one could also use this feature to ensure that a clause is already covered by the previous case (independently of typing). For instance, doing this before adding a catch all clause could make sense, to clarify the semantics of the program.

The reason "| _ -> ." is handled differently is that intuitively it means: "try harder to prove exhaustiveness". So you may end up adding it in places where it is not really needed, as it cannot harm anyway.
I we have a different warning for "already covered unreachable clauses", maybe this doesn't make sense to make an exception for it.

@vicuna

This comment has been minimized.

Copy link
Author

commented Nov 24, 2015

Comment author: @lpw25

I think we need to clarify what ".... -> ." means. I can see five possible meanings:

a) This case is unreachable. In other words, all the possible matches of this case are covered by earlier cases in the match.

b) This case is absurd. In other words, all possible matches of this case are disallowed by the types.

c) The non-absurd matches of this case are unreachable. In other words all the non-absurd matches of this case (i.e. the matches not ruled out by typing) are covered by earlier cases in the match.

d) The reachable matches of this case are absurd. In other words all the reachable matches of this case (i.e. the matches not covered by earlier cases) are disallowed by the types.

e) This matches of this case are either absurd or unreachable.

a) is clearly not what we want as it does not allow you to specify absurd cases to force the exhaustivity checker deeper.

b) allows you to specify absurd cases and force the checking deeper, but it forces you to always write the absurd cases by hand even if there is only one reachable case remaining.

c), d) and e) are very similar and give an error on the same cases. However, they differ on when a warning should be given (a warning should indicate that an implication is trivially true because no matches meet its precondition). For c) a warning should be given if there are no non-absurd matches for the case, whereas for d) a warning should be given if there are no reachable matches for the case. Finally, e) would never give a warning.

In practice this means that c) would warn for:

| None -> foo
| Some _ -> .

for type empty option (where empty is some type with no valid constructors). Whilst d) would warn for:

| () -> foo
| _ -> .

What we currently have is part way between d) and e). It behaves as d) except for _ patterns where it behaves as e).

To me it seems that d) would be the most useful behaviour. We can already indicate that all remaining cases are unreachable by simply writing no more cases, so .... -> . should be reserved for indicating the presence of an absurd case.

This means that we should remove the special exception for _.

@vicuna

This comment has been minimized.

Copy link
Author

commented Nov 24, 2015

Comment author: @garrigue

After this discussion I would argue for (e):

  • this is more economical, I.e. no need to warn for something which is not dangerous
  • this actually avoids introducing extra distinctions: we just require that the . should be unreachable
  • there are actually applications for this

If you can think of an example where warning for an unnecessary unreachable clause would detect a bug, please tell me.

@vicuna

This comment has been minimized.

Copy link
Author

commented Nov 25, 2015

Comment author: @lpw25

I have a slight preference for d), but e) is also fine. This means that the original example reported by @trefis should be fixed to not give a warning.

@vicuna

This comment has been minimized.

Copy link
Author

commented Nov 25, 2015

Comment author: @trefis

As you know my initial guess for the meaning of "... -> ." was (e), as such I have a slight preference for it.

@vicuna

This comment has been minimized.

Copy link
Author

commented Nov 25, 2015

Comment author: @gasche

After thinking about this more, I think that semantically it is best if we avoid distinctions between "unreachable" and "absurd". This distinction is, in a sense, algorithmically motivated: we know what the exhaustiveness checker can reason about easily (the type-less constructor trees), and what needs refined type-directed reasoning. But the underlying explanation for pattern-matching is a capture of all the (well-typed) values that may flow into a clause, with no distinguo between those two settings.

This would suggest that (e) is the best choice -- in terms of mental model and explanation for beginners.

I'm not satisfied by the name "unreachable clause", because to me any clause whose right-hand-side is dead code (not necessarily ".") is an unreachable clause. (The warning "this match case is unused" could just as well say "this clause in unreachable".) We could pun on the "dead code" idea and call them "killed clauses", but I'm not sure this is of good taste. I still like the name "absurd clause" better; in the Agda world they are called "absurd patterns", but they have no "->" in their syntax, you would write

| Right ()

where "()" is the marker for absurdity, not unlike our ".". We write

| Right _ -> .

so we are writing a clause rather than a pattern.

Or what about "void clause" or "empty clause", maybe?

@vicuna

This comment has been minimized.

Copy link
Author

commented Nov 25, 2015

Comment author: @lpw25

I still like the name "absurd clause" better

Unless we were to go with (b), "absurd" is a bad name for them because absurd means that the case does not make sense at all (i.e. the types disallow it).

"empty" or "void" seem best if we go with (e) because they do not differentiate between absurd and unreachable.

@vicuna

This comment has been minimized.

Copy link
Author

commented Nov 25, 2015

Comment author: @gasche

A nice thing with "empty" or "void" clause is that they can also be understood as saying that we leave the right-hand-side empty (or void), and this understanding is consistent with the syntax used.

@vicuna

This comment has been minimized.

Copy link
Author

commented Nov 26, 2015

Comment author: @garrigue

Pushed change to (e) at commit 61c2dd2 in trunk.

For the name, I still think that unreachable is clearer.
Namely we have

  • unused clauses: for clauses with a real implementation that are never reached
  • unreachable clauses: for clauses "pat -> ."
    Of course, for an unreachable clause to be reachable is an error.
    And it is always ok to turn an unused clause into an unreachable one.

I'm not opposed to other names, but then they should show that we're trying to prove/check something. Maybe something like "refutation clause", but it sounds pedantic.

@vicuna

This comment has been minimized.

Copy link
Author

commented Nov 26, 2015

Comment author: @gasche

I like "refutation clause". Maybe "refuted clause" would sound simpler and be equally precise? (I thought of turning it into "refused clause" to use simpler language but that suggests something else, namely indicating an intent to be partial and not handle some cases.)

I'm not sure being pedantic is an issue, but we need a clear name to communicate to our users. (I understand that the documentation for this feature is not yet written.) "higher-order polymorphism" or "relaxed value restriction" or "principal inference" are all pedantic names, yet they are precise and useful way to name concepts that we can and do explain to our users. We don't need a name that can be guessed when seeing the feature, but rather a name that can be remembered once it has been encountered in the Changelog, a blog post, or the documentation -- and for this precision is key.

Thanks a lot for your work on the issue!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.