Treat empty variants as GADTs for exhaustiveness check #9309

smuenzel-js · 2020-02-17T05:19:44Z

Currently, the empty variant is not treated as a GADT when we're typing a pattern. This means that if we have an or pattern, and one case contains the empty variant, the other case may not be examined (example enclosed in the patch).
This patch attempts to treat the empty variant as a GADT, to fix this.

(Note: this is the first time I'm doing anything in /typing, so I expect that I may have gotten something wrong here).

trefis

I think a preliminary commit which introduces the test before your patch would have been useful, I initially scratched my head thinking "well that test looks fine, what is he complaining about?".

Apart from that the patch looks correct to me (though it's a bit unfortunate that the variable we look at is called must_backtrack_on_gadt), but I'll let a second pair of eyes (@gasche, @garrigue ?) confirm this before merging.

…e empty variant

smuenzel-js · 2020-02-17T08:05:51Z

Thanks! I updated the commits to first introduce the test, and then fix it.

gasche · 2020-02-17T08:28:12Z

I think that this matter is delicate and warrants a bit more thought. (The empty type is not a GADT and so it is not clear why it must be treated as one; why are there no issues with a non-empty non-GADT type used in the same way?) On the other hand, I am trying to take a break from compiler contributions right now, so I will not be the one giving more thought in the short future.

Octachron · 2020-02-17T08:47:30Z

The issue is quite apparent in the documentation of Refine_or:

Instead of always splitting each or-pattern, It first attempts to
find branches that do not introduce new constraints (because they
do not contain GADT constructors). Those branches are such that,
if they fail, all other branches will fail.

The above text fails to consider what happens if we pick an empty branch (and it should be updated).

The answer with the current code is that as soon as there is one empty branch, we raise Error(_,_,Empty_pattern) and since we are in the Refine_or mode, the only handler is the one in partial_pred.

Morally, we should probably raise an Empty_pattern exception to signal to type_pat_aux that the branch needs to be erased and don't need to be typed again once the parent or has been splitted. But approximating this behavior with the GADT-related one looks fine.

Octachron · 2020-02-17T08:53:28Z

testsuite/tests/typing-misc/empty_variant.ml

+
+type nothing = |
+type ('a, 'b, 'c) t = | A of 'a | B of 'b | C of 'c
+module Runner : sig


Concerning the test, a simpler

type nothing = | type 'b t = A | B of 'b | C let g (x:nothing t) = match x with A -> ()

triggers this issue. Did you have a specific reason in mind for the current test?

This is a simplified version of the error that we observed in our code base -- that's how I derived it. No preference on whether we should use the simpler one or not.

I think keep at least adding the simpler error case is useful, since it makes it clearer what is the intend of the test.

requires further changes

trefis · 2020-02-17T17:11:23Z

I agree with @Octachron's assessment.
Introducing a dedicated exception and not redoing some of the typing work (that we know will necessarily fail) once we start splitting the or-pattern sounds nice on paper but is probably quite tedious in practice.

I'm fine with the behavior proposed in this PR; but I'd like to see the doc / variable names updated, as this case is indeed unrelated to GADTs.
I attempted doing so myself, but am poor writer so perhaps @smuenzel-js or @Octachron will manage to produce something nice (my attempt is here).

Octachron · 2020-02-17T17:36:30Z

For comparison's sake, I have a version with a split exception: Octachron@8b8f9e0 .

The main advantage might be that it removes the Empty_pattern pseudo-error.

smuenzel-js · 2020-02-17T23:55:39Z

I added the changes from @trefis.
Regarding the name of must_backtrack_on_gadt_and_empty_type, maybe we can use something that suggests the underlying reason why we're doing it? Maybe must_backtrack_on_possibly_ill_typed_branch?

trefis · 2020-02-18T08:47:28Z

Regarding the name of must_backtrack_on_gadt_and_empty_type, maybe we can use something that suggests the underlying reason why we're doing it? Maybe must_backtrack_on_possibly_ill_typed_branch?

Yeah, except « possibly ill typed branch » doesn't quite match what's explained in the docs IIUC. The branch itself might type fine, but introduce constraints that make later parts of the pattern ill-typed.
That being said: yes must_backtrack_on_gadt_and_empty_type is an awful name and I'd welcome another suggestion.

Octachron · 2020-02-18T13:26:08Z

One possibility for the name might be to focus on the task at hand: refining_or_in_progress? Otherwise, we could group together the GADT and empty branch cases as unsuitable_branches and thus use must_backtrack_on_unsuitable_branches?

trefis · 2020-02-18T14:20:00Z

Happy with either.

trefis · 2020-02-19T09:50:07Z

Hi @smuenzel-js, in the end because of the social pressure of people who are taking a break from compiler contributions, I just pushed (in #9318) a slightly updated version of @Octachron's proposal.

gasche · 2020-02-19T10:04:14Z

I have looked at both the present PR and @Octachron's approach, and I believe that @Octachron's approach is better. The reason why we give up on empty types is not the same as the reason we give up on GADT constraints, and the way we handle these situations is different: on GADT constraints we go to the topmost non-split or-pattern and retry, empty types do not require backtracking and may be raised to the caller. Having two exceptions to distinguish these two different situations is the right approach.

garrigue · 2020-02-19T17:41:19Z

I discussed this extensively with @Octachron today.
It seems to me that we should include this PR in the 4.10 release, as it is trivially correct (i.e., reverting to backtracking is always correct, and sufficient to fix the bug), and fixes a soundness bug.
The alternative approach of having special handling for empty branches is potentially interesting, but in the current version it offers no performance advantage, as it is only reached on almost linear branches, containing no GADT to split, and as a result it only avoids repeating trivial computations, whose cost is negligible anyway.

Actually, the following example (not involving GADTs) may be faster with the alternative approach, but there should be a better way to improve performance than this very specific hack:

type empty = |
type t = A | B
let f = function (A|B),(A|B),(A|B),(A|B),(A|B),(A|B),(A|B),(A|B),(A|B),(A|B),(A|B),(A|B),(A|B),(A|B),(A|B),(_ : empty) -> .

gasche · 2020-02-19T21:23:43Z

@Octachron's approach is not a hack, it handles separately two things that are different; I believe that it is the right design, and in particular his approach is a natural suggestion of any attempt to update the documentation in a way that is correct (unlike the documentation changes in the present PR that tend to be incorrect).

Either approaches could be reviewed carefully and merged into the 4.10 branch, but do we really need to hold the release for a bug was reported a couple weeks after we started the release-candidate process, and is present in many existing releases? It is stressful to hold the presses at the last minute.

Note that the bug is also present in 4.07, 4.08 and 4.09: delaying a release that was planned this week is not solving the soundness issue for users in the wild, it is just adding more work to the release process. (On the other hand, we could plan for a bugfix release for 4.10, just as we can have bugfix releases of 4.09 and possibly 4.08, including whichever approach we think is the best one.)

I'm not the one making the decision, but if I was I would release without hurrying to fix this new report. It can go in 4.10.1.

Octachron · 2020-02-20T10:46:51Z

Another point is that there is a workaround for the issue that works on every version: use a GADT-based empty type. Since the bug was only discovered after the beta, and it is not widespread, this seems like a good reason to wait for a version later than 4.10 .

garrigue · 2020-02-21T07:31:46Z

If it were to me I would release with the 1 line fix, without changing the documentation.
We know that the 1 line fix is correct, so there is no need to hold the release.

The discussion about the difference in meaning between GADT constructors and empty types is moot to me, as one could make exactly the same distinction on GADTs. I.e., when expanding a GADT, we could immediately check whether all its cases are impossible, and it is empty as a result, avoiding going back to the englobing or-pattern.
So this is just an optimization (rather than a hack).

Octachron · 2020-02-21T10:04:22Z

We agree that the difference is not between GADTs and empty types and that the right distinction is between empty branches and branches adding local constraints. The fact that the first case only happens with empty sum type is indeed contingent. Rather than an optimisation, I see it as decoupling two different concerns in order to make it easier to understand the code locally.

lpw25 · 2020-02-21T10:40:12Z

The fact that the first case only happens with empty sum type.

Is that true? I would have thought matching on say (int, float) eq was an empty branch that adds no local constraints. (I don't think this particularly matters for the discussion, I'm just curious)

gasche · 2020-02-21T12:13:42Z

I went ahead, did a last review of the better implementation in #9318, and merged it. The bug reported in this PR is now fixed in trunk. Thanks @smuenzel-js for the report, the patch proposal, and the fairly enlightening repro case.

gasche · 2020-02-21T12:16:05Z

(We can of course continue discussing whether one or the other fix should be backported in 4.10 at the last minute.)

Octachron · 2020-02-21T12:43:57Z

@lpw25 , I meant that in the current implementation PT_empty is only returned when trying to explode an empty sum type. Like @garrigue said, we could refine our analysis of GADTs to also classify them into locally empty, adding constraints, or suitable for or-refining.

Octachron · 2020-02-21T13:08:36Z

Concerning backporting, this seems like a prime candidate for 4.10.1; but not 4.10 .

Treat empty variants as GADTs for exhaustiveness check

trefis previously approved these changes Feb 17, 2020

View reviewed changes

smuenzel-js added 2 commits February 17, 2020 16:01

Add a test showing that the exhaustiveness check doesn't work with th…

9e86263

…e empty variant

Treat empty variants as GADTs for exhaustiveness check

5d41ade

smuenzel-js force-pushed the exhaust-nothing branch from 872fd8a to 5d41ade Compare February 17, 2020 08:05

Octachron reviewed Feb 17, 2020

View reviewed changes

Update comments related to empty type

dcb5896

Add simpler test for empty type pattern match

c60af1a

use unsuitable branch terminology

b469d2a

trefis mentioned this pull request Feb 19, 2020

Fix exhaustivity checking in presence of empty types #9318

Merged

gasche closed this Feb 21, 2020

stedolan added a commit to janestreet/ocaml that referenced this pull request Mar 17, 2020

ocaml#9309 (rewrite 8d03cbb)

68fd81d

Treat empty variants as GADTs for exhaustiveness check

gasche mentioned this pull request Apr 20, 2020

Reproducible env summaries #9345

Merged

mshinwell pushed a commit to mshinwell/ocaml that referenced this pull request Jul 16, 2020

ocaml#9309 (rewrite 8d03cbb)

99f7b7c

Treat empty variants as GADTs for exhaustiveness check

mshinwell pushed a commit to mshinwell/ocaml that referenced this pull request Jul 20, 2020

ocaml#9309 (rewrite 8d03cbb)

3a63c47

Treat empty variants as GADTs for exhaustiveness check

mshinwell pushed a commit to mshinwell/ocaml that referenced this pull request Jul 20, 2020

ocaml#9309 (rewrite 8d03cbb)

db2e90f

Treat empty variants as GADTs for exhaustiveness check

mshinwell pushed a commit to mshinwell/ocaml that referenced this pull request Jul 21, 2020

ocaml#9309 (rewrite 8d03cbb)

dfc5cb6

Treat empty variants as GADTs for exhaustiveness check

mshinwell pushed a commit to mshinwell/ocaml that referenced this pull request Jul 21, 2020

ocaml#9309 (rewrite 8d03cbb)

eed86fd

Treat empty variants as GADTs for exhaustiveness check

mshinwell pushed a commit to mshinwell/ocaml that referenced this pull request Jul 30, 2020

ocaml#9309 (rewrite 8d03cbb)

d0dc659

Treat empty variants as GADTs for exhaustiveness check

mshinwell pushed a commit to mshinwell/ocaml that referenced this pull request Jul 30, 2020

ocaml#9309 (rewrite 8d03cbb)

27d4fbd

Treat empty variants as GADTs for exhaustiveness check

mshinwell pushed a commit to mshinwell/ocaml that referenced this pull request Aug 3, 2020

ocaml#9309 (rewrite 8d03cbb)

07d9459

Treat empty variants as GADTs for exhaustiveness check

mshinwell pushed a commit to mshinwell/ocaml that referenced this pull request Aug 4, 2020

ocaml#9309 (rewrite 8d03cbb)

8006aae

Treat empty variants as GADTs for exhaustiveness check

mshinwell pushed a commit to mshinwell/ocaml that referenced this pull request Aug 5, 2020

ocaml#9309 (rewrite 8d03cbb)

f5f9411

Treat empty variants as GADTs for exhaustiveness check

mshinwell pushed a commit to mshinwell/ocaml that referenced this pull request Aug 7, 2020

ocaml#9309 (rewrite 8d03cbb)

450afb3

Treat empty variants as GADTs for exhaustiveness check

mshinwell pushed a commit to mshinwell/ocaml that referenced this pull request Aug 10, 2020

ocaml#9309 (rewrite 8d03cbb)

bcf9652

Treat empty variants as GADTs for exhaustiveness check

mshinwell pushed a commit to mshinwell/ocaml that referenced this pull request Aug 10, 2020

ocaml#9309 (rewrite 8d03cbb)

a0532e7

Treat empty variants as GADTs for exhaustiveness check

mshinwell pushed a commit to mshinwell/ocaml that referenced this pull request Aug 17, 2020

ocaml#9309 (rewrite 8d03cbb)

1ff0f3c

Treat empty variants as GADTs for exhaustiveness check

mshinwell pushed a commit to mshinwell/ocaml that referenced this pull request Aug 18, 2020

ocaml#9309 (rewrite 8d03cbb)

91876f1

Treat empty variants as GADTs for exhaustiveness check

mshinwell pushed a commit to mshinwell/ocaml that referenced this pull request Aug 19, 2020

ocaml#9309 (rewrite 8d03cbb)

f478721

Treat empty variants as GADTs for exhaustiveness check

mshinwell pushed a commit to mshinwell/ocaml that referenced this pull request Aug 20, 2020

ocaml#9309 (rewrite 8d03cbb)

8c4c57f

Treat empty variants as GADTs for exhaustiveness check

mshinwell pushed a commit to mshinwell/ocaml that referenced this pull request Aug 28, 2020

ocaml#9309 (rewrite 8d03cbb)

2fe03d8

Treat empty variants as GADTs for exhaustiveness check

mshinwell pushed a commit to mshinwell/ocaml that referenced this pull request Sep 2, 2020

ocaml#9309 (rewrite 8d03cbb)

a584a72

Treat empty variants as GADTs for exhaustiveness check

mshinwell pushed a commit to mshinwell/ocaml that referenced this pull request Sep 2, 2020

ocaml#9309 (rewrite 8d03cbb)

e10b5fc

Treat empty variants as GADTs for exhaustiveness check

mshinwell pushed a commit to mshinwell/ocaml that referenced this pull request Sep 2, 2020

ocaml#9309 (rewrite 8d03cbb)

0261a9c

Treat empty variants as GADTs for exhaustiveness check

mshinwell pushed a commit to mshinwell/ocaml that referenced this pull request Sep 7, 2020

ocaml#9309 (rewrite 8d03cbb)

5073087

Treat empty variants as GADTs for exhaustiveness check

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Treat empty variants as GADTs for exhaustiveness check #9309

Treat empty variants as GADTs for exhaustiveness check #9309

smuenzel-js commented Feb 17, 2020

trefis left a comment

smuenzel-js commented Feb 17, 2020

gasche commented Feb 17, 2020 •

edited

Octachron commented Feb 17, 2020 •

edited

Octachron Feb 17, 2020

smuenzel-js Feb 17, 2020

Octachron Feb 17, 2020

trefis commented Feb 17, 2020

Octachron commented Feb 17, 2020

smuenzel-js commented Feb 17, 2020

trefis commented Feb 18, 2020

Octachron commented Feb 18, 2020

trefis commented Feb 18, 2020

trefis commented Feb 19, 2020

gasche commented Feb 19, 2020

garrigue commented Feb 19, 2020

gasche commented Feb 19, 2020 •

edited

Octachron commented Feb 20, 2020

garrigue commented Feb 21, 2020

Octachron commented Feb 21, 2020

lpw25 commented Feb 21, 2020

gasche commented Feb 21, 2020

gasche commented Feb 21, 2020

Octachron commented Feb 21, 2020

Octachron commented Feb 21, 2020

Treat empty variants as GADTs for exhaustiveness check #9309

Treat empty variants as GADTs for exhaustiveness check #9309

Conversation

smuenzel-js commented Feb 17, 2020

trefis left a comment

Choose a reason for hiding this comment

smuenzel-js commented Feb 17, 2020

gasche commented Feb 17, 2020 • edited

Octachron commented Feb 17, 2020 • edited

Octachron Feb 17, 2020

Choose a reason for hiding this comment

smuenzel-js Feb 17, 2020

Choose a reason for hiding this comment

Octachron Feb 17, 2020

Choose a reason for hiding this comment

trefis commented Feb 17, 2020

Octachron commented Feb 17, 2020

smuenzel-js commented Feb 17, 2020

trefis commented Feb 18, 2020

Octachron commented Feb 18, 2020

trefis commented Feb 18, 2020

trefis commented Feb 19, 2020

gasche commented Feb 19, 2020

garrigue commented Feb 19, 2020

gasche commented Feb 19, 2020 • edited

Octachron commented Feb 20, 2020

garrigue commented Feb 21, 2020

Octachron commented Feb 21, 2020

lpw25 commented Feb 21, 2020

gasche commented Feb 21, 2020

gasche commented Feb 21, 2020

Octachron commented Feb 21, 2020

Octachron commented Feb 21, 2020

gasche commented Feb 17, 2020 •

edited

Octachron commented Feb 17, 2020 •

edited

gasche commented Feb 19, 2020 •

edited