Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Treat empty variants as GADTs for exhaustiveness check #9309

Closed
wants to merge 5 commits into from

Conversation

smuenzel-js
Copy link
Contributor

Currently, the empty variant is not treated as a GADT when we're typing a pattern. This means that if we have an or pattern, and one case contains the empty variant, the other case may not be examined (example enclosed in the patch).
This patch attempts to treat the empty variant as a GADT, to fix this.

(Note: this is the first time I'm doing anything in /typing, so I expect that I may have gotten something wrong here).

trefis
trefis previously approved these changes Feb 17, 2020
Copy link
Contributor

@trefis trefis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think a preliminary commit which introduces the test before your patch would have been useful, I initially scratched my head thinking "well that test looks fine, what is he complaining about?".

Apart from that the patch looks correct to me (though it's a bit unfortunate that the variable we look at is called must_backtrack_on_gadt), but I'll let a second pair of eyes (@gasche, @garrigue ?) confirm this before merging.

@smuenzel-js
Copy link
Contributor Author

Thanks! I updated the commits to first introduce the test, and then fix it.

@gasche
Copy link
Member

gasche commented Feb 17, 2020

I think that this matter is delicate and warrants a bit more thought. (The empty type is not a GADT and so it is not clear why it must be treated as one; why are there no issues with a non-empty non-GADT type used in the same way?) On the other hand, I am trying to take a break from compiler contributions right now, so I will not be the one giving more thought in the short future.

@Octachron
Copy link
Member

Octachron commented Feb 17, 2020

The issue is quite apparent in the documentation of Refine_or:

Instead of always splitting each or-pattern, It first attempts to
find branches that do not introduce new constraints (because they
do not contain GADT constructors). Those branches are such that,
if they fail, all other branches will fail.

The above text fails to consider what happens if we pick an empty branch (and it should be updated).

The answer with the current code is that as soon as there is one empty branch, we raise Error(_,_,Empty_pattern) and since we are in the Refine_or mode, the only handler is the one in partial_pred.

Morally, we should probably raise an Empty_pattern exception to signal to type_pat_aux that the branch needs to be erased and don't need to be typed again once the parent or has been splitted. But approximating this behavior with the GADT-related one looks fine.


type nothing = |
type ('a, 'b, 'c) t = | A of 'a | B of 'b | C of 'c
module Runner : sig
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Concerning the test, a simpler

type nothing = |
type 'b t = A | B of 'b | C
let g (x:nothing t) = match x with A -> ()

triggers this issue. Did you have a specific reason in mind for the current test?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a simplified version of the error that we observed in our code base -- that's how I derived it. No preference on whether we should use the simpler one or not.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think keep at least adding the simpler error case is useful, since it makes it clearer what is the intend of the test.

@trefis trefis dismissed their stale review February 17, 2020 16:40

requires further changes

@trefis
Copy link
Contributor

trefis commented Feb 17, 2020

I agree with @Octachron's assessment.
Introducing a dedicated exception and not redoing some of the typing work (that we know will necessarily fail) once we start splitting the or-pattern sounds nice on paper but is probably quite tedious in practice.

I'm fine with the behavior proposed in this PR; but I'd like to see the doc / variable names updated, as this case is indeed unrelated to GADTs.
I attempted doing so myself, but am poor writer so perhaps @smuenzel-js or @Octachron will manage to produce something nice (my attempt is here).

@Octachron
Copy link
Member

For comparison's sake, I have a version with a split exception: Octachron@8b8f9e0 .

The main advantage might be that it removes the Empty_pattern pseudo-error.

@smuenzel-js
Copy link
Contributor Author

I added the changes from @trefis.
Regarding the name of must_backtrack_on_gadt_and_empty_type, maybe we can use something that suggests the underlying reason why we're doing it? Maybe must_backtrack_on_possibly_ill_typed_branch?

@trefis
Copy link
Contributor

trefis commented Feb 18, 2020

Regarding the name of must_backtrack_on_gadt_and_empty_type, maybe we can use something that suggests the underlying reason why we're doing it? Maybe must_backtrack_on_possibly_ill_typed_branch?

Yeah, except « possibly ill typed branch » doesn't quite match what's explained in the docs IIUC. The branch itself might type fine, but introduce constraints that make later parts of the pattern ill-typed.
That being said: yes must_backtrack_on_gadt_and_empty_type is an awful name and I'd welcome another suggestion.

@Octachron
Copy link
Member

One possibility for the name might be to focus on the task at hand: refining_or_in_progress? Otherwise, we could group together the GADT and empty branch cases as unsuitable_branches and thus use must_backtrack_on_unsuitable_branches?

@trefis
Copy link
Contributor

trefis commented Feb 18, 2020

Happy with either.

@trefis
Copy link
Contributor

trefis commented Feb 19, 2020

Hi @smuenzel-js, in the end because of the social pressure of people who are taking a break from compiler contributions, I just pushed (in #9318) a slightly updated version of @Octachron's proposal.

@gasche
Copy link
Member

gasche commented Feb 19, 2020

I have looked at both the present PR and @Octachron's approach, and I believe that @Octachron's approach is better. The reason why we give up on empty types is not the same as the reason we give up on GADT constraints, and the way we handle these situations is different: on GADT constraints we go to the topmost non-split or-pattern and retry, empty types do not require backtracking and may be raised to the caller. Having two exceptions to distinguish these two different situations is the right approach.

@garrigue
Copy link
Contributor

I discussed this extensively with @Octachron today.
It seems to me that we should include this PR in the 4.10 release, as it is trivially correct (i.e., reverting to backtracking is always correct, and sufficient to fix the bug), and fixes a soundness bug.
The alternative approach of having special handling for empty branches is potentially interesting, but in the current version it offers no performance advantage, as it is only reached on almost linear branches, containing no GADT to split, and as a result it only avoids repeating trivial computations, whose cost is negligible anyway.

Actually, the following example (not involving GADTs) may be faster with the alternative approach, but there should be a better way to improve performance than this very specific hack:

type empty = |
type t = A | B
let f = function (A|B),(A|B),(A|B),(A|B),(A|B),(A|B),(A|B),(A|B),(A|B),(A|B),(A|B),(A|B),(A|B),(A|B),(A|B),(_ : empty) -> .

@gasche
Copy link
Member

gasche commented Feb 19, 2020

@Octachron's approach is not a hack, it handles separately two things that are different; I believe that it is the right design, and in particular his approach is a natural suggestion of any attempt to update the documentation in a way that is correct (unlike the documentation changes in the present PR that tend to be incorrect).

Either approaches could be reviewed carefully and merged into the 4.10 branch, but do we really need to hold the release for a bug was reported a couple weeks after we started the release-candidate process, and is present in many existing releases? It is stressful to hold the presses at the last minute.

Note that the bug is also present in 4.07, 4.08 and 4.09: delaying a release that was planned this week is not solving the soundness issue for users in the wild, it is just adding more work to the release process. (On the other hand, we could plan for a bugfix release for 4.10, just as we can have bugfix releases of 4.09 and possibly 4.08, including whichever approach we think is the best one.)

I'm not the one making the decision, but if I was I would release without hurrying to fix this new report. It can go in 4.10.1.

@Octachron
Copy link
Member

Another point is that there is a workaround for the issue that works on every version: use a GADT-based empty type. Since the bug was only discovered after the beta, and it is not widespread, this seems like a good reason to wait for a version later than 4.10 .

@garrigue
Copy link
Contributor

If it were to me I would release with the 1 line fix, without changing the documentation.
We know that the 1 line fix is correct, so there is no need to hold the release.

The discussion about the difference in meaning between GADT constructors and empty types is moot to me, as one could make exactly the same distinction on GADTs. I.e., when expanding a GADT, we could immediately check whether all its cases are impossible, and it is empty as a result, avoiding going back to the englobing or-pattern.
So this is just an optimization (rather than a hack).

@Octachron
Copy link
Member

We agree that the difference is not between GADTs and empty types and that the right distinction is between empty branches and branches adding local constraints. The fact that the first case only happens with empty sum type is indeed contingent. Rather than an optimisation, I see it as decoupling two different concerns in order to make it easier to understand the code locally.

@lpw25
Copy link
Contributor

lpw25 commented Feb 21, 2020

The fact that the first case only happens with empty sum type.

Is that true? I would have thought matching on say (int, float) eq was an empty branch that adds no local constraints. (I don't think this particularly matters for the discussion, I'm just curious)

@gasche
Copy link
Member

gasche commented Feb 21, 2020

I went ahead, did a last review of the better implementation in #9318, and merged it. The bug reported in this PR is now fixed in trunk. Thanks @smuenzel-js for the report, the patch proposal, and the fairly enlightening repro case.

@gasche gasche closed this Feb 21, 2020
@gasche
Copy link
Member

gasche commented Feb 21, 2020

(We can of course continue discussing whether one or the other fix should be backported in 4.10 at the last minute.)

@Octachron
Copy link
Member

@lpw25 , I meant that in the current implementation PT_empty is only returned when trying to explode an empty sum type. Like @garrigue said, we could refine our analysis of GADTs to also classify them into locally empty, adding constraints, or suitable for or-refining.

@Octachron
Copy link
Member

Concerning backporting, this seems like a prime candidate for 4.10.1; but not 4.10 .

stedolan added a commit to janestreet/ocaml that referenced this pull request Mar 17, 2020
Treat empty variants as GADTs for exhaustiveness check
mshinwell pushed a commit to mshinwell/ocaml that referenced this pull request Jul 16, 2020
Treat empty variants as GADTs for exhaustiveness check
mshinwell pushed a commit to mshinwell/ocaml that referenced this pull request Jul 20, 2020
Treat empty variants as GADTs for exhaustiveness check
mshinwell pushed a commit to mshinwell/ocaml that referenced this pull request Jul 20, 2020
Treat empty variants as GADTs for exhaustiveness check
mshinwell pushed a commit to mshinwell/ocaml that referenced this pull request Jul 21, 2020
Treat empty variants as GADTs for exhaustiveness check
mshinwell pushed a commit to mshinwell/ocaml that referenced this pull request Jul 21, 2020
Treat empty variants as GADTs for exhaustiveness check
mshinwell pushed a commit to mshinwell/ocaml that referenced this pull request Jul 30, 2020
Treat empty variants as GADTs for exhaustiveness check
mshinwell pushed a commit to mshinwell/ocaml that referenced this pull request Jul 30, 2020
Treat empty variants as GADTs for exhaustiveness check
mshinwell pushed a commit to mshinwell/ocaml that referenced this pull request Aug 3, 2020
Treat empty variants as GADTs for exhaustiveness check
mshinwell pushed a commit to mshinwell/ocaml that referenced this pull request Aug 4, 2020
Treat empty variants as GADTs for exhaustiveness check
mshinwell pushed a commit to mshinwell/ocaml that referenced this pull request Aug 5, 2020
Treat empty variants as GADTs for exhaustiveness check
mshinwell pushed a commit to mshinwell/ocaml that referenced this pull request Aug 7, 2020
Treat empty variants as GADTs for exhaustiveness check
mshinwell pushed a commit to mshinwell/ocaml that referenced this pull request Aug 10, 2020
Treat empty variants as GADTs for exhaustiveness check
mshinwell pushed a commit to mshinwell/ocaml that referenced this pull request Aug 10, 2020
Treat empty variants as GADTs for exhaustiveness check
mshinwell pushed a commit to mshinwell/ocaml that referenced this pull request Aug 17, 2020
Treat empty variants as GADTs for exhaustiveness check
mshinwell pushed a commit to mshinwell/ocaml that referenced this pull request Aug 18, 2020
Treat empty variants as GADTs for exhaustiveness check
mshinwell pushed a commit to mshinwell/ocaml that referenced this pull request Aug 19, 2020
Treat empty variants as GADTs for exhaustiveness check
mshinwell pushed a commit to mshinwell/ocaml that referenced this pull request Aug 20, 2020
Treat empty variants as GADTs for exhaustiveness check
mshinwell pushed a commit to mshinwell/ocaml that referenced this pull request Aug 28, 2020
Treat empty variants as GADTs for exhaustiveness check
mshinwell pushed a commit to mshinwell/ocaml that referenced this pull request Sep 2, 2020
Treat empty variants as GADTs for exhaustiveness check
mshinwell pushed a commit to mshinwell/ocaml that referenced this pull request Sep 2, 2020
Treat empty variants as GADTs for exhaustiveness check
mshinwell pushed a commit to mshinwell/ocaml that referenced this pull request Sep 2, 2020
Treat empty variants as GADTs for exhaustiveness check
mshinwell pushed a commit to mshinwell/ocaml that referenced this pull request Sep 7, 2020
Treat empty variants as GADTs for exhaustiveness check
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants