Don't treat definitions from the current module specially #8900

lpw25 · 2019-08-28T10:15:12Z

The type-checker treats abstract type definitions in the current specially:

# type (_, _) eq = Refl : ('a, 'a) eq;;
    
# module M = struct
      type a
      type b
      let absurd : 'a. (a, b) eq -> 'a = function _ -> .
    end;;
module M : sig type a type b val absurd : (a, b) eq -> 'a end

# module M = struct
      module N = struct
        type a
        type b
      end
      let absurd : 'a. (N.a, N.b) eq -> 'a = function _ -> .
    end;;
Line 6, characters 54-55:
6 |       let absurd : 'a. (N.a, N.b) eq -> 'a = function _ -> .
                                                          ^
Error: This match case could not be refuted.
       Here is an example of a value that would reach it: Refl

In my experience this special treatment confuses people a lot and is not really useful for anything.

This PR removes the special treatment. This is not a backwards compatible change so it could probably use some testing on opam.

glondu · 2019-08-28T11:34:07Z

In your example, the special treatment looks legit: two abstract types defined in another module (N.a and N.b) could actually be equal (so that a value of type (N.a, N.b) eq can be constructed) whereas when they are defined in the same module, they are different by construction.

lpw25 · 2019-08-28T12:11:39Z

they are different by construction

This is debatable.

When I declare an abstract type I might mean a fresh empty type. I might also mean a non-empty type where all the values are constructed using C primitives. For example, Obj.t is declared as an abstract type in Obj but it is the opposite of empty.

Similarly, when I declare two abstract types I might mean two separate types, but I also might mean two equivalent types whose equivalence I do not intend to expose. Without a concrete definition OCaml cannot be certain that two types are genuinely incompatible.

However, this is all besides the point. The issue is more practical than that. Treating types from the same structure differently simply creates confusion for no increase in expressivity. Having explained what is happening is such cases many, many times I think it would be easier to just remove this special casing.

Drup · 2019-08-28T12:20:04Z

On paper I agree, but I'm fairly certain this is going to break some code, so we really need to evaluate this on opam.

trefis · 2019-08-28T12:28:07Z

I'm fairly certain this is going to break some code, so we really need to evaluate this on opam.

Yes… and no?
The fix is trivial, backward compatible, and even if many packages break, it's unlikely that any package will be broken in more than one or two places.

Which doesn't mean that we shouldn't run this on opam. But I think once that's done we just want to merge, and notify maintainers that they should update their package.

garrigue · 2019-09-02T13:22:17Z

As much as I wish I hadn't created this difference in the first place, this needs to be debated and advertised.
This basically means that almost all code snippets using GADTs in ocaml available around will break, which may not be the best way to help beginners.

Personally I would advocate to wait until we provide a way to define abstract types with an identity, which would also avoid the workaround we have for abstract types from the initial environment.
Actually, the current rule is not that complicated: the abstract type treated specially are those which have a non-prefixed, be it because they are from the initial environment or from the current module.

gasche · 2019-09-02T13:49:20Z

This basically means that almost all code snippets using GADTs in ocaml available around will break, which may not be the best way to help beginners.

I think that this point should make us take a long, hard pause.

lpw25 · 2019-09-02T14:19:54Z

I really hope this doesn't break many code snippets, because such snippets are very confusing for beginners. They are wrong and teach people to use GADTs incorrectly. I suspect that making such snippets be an explicit error will probably improve the situation for beginners.

yminsky · 2019-09-02T14:24:24Z

Maybe the way to make progress on this PR is to gather information from opam on the nature of the breakages that it creates? That should be some clue as to the nature of the incompatibilities introduced by the change.

gasche · 2019-09-02T15:05:17Z

An example that works today and would be broken by the change is:

type true_type
type false_type

type _ singleton_bool =
  | True : true_type singleton_bool
  | False : false_type singleton_bool

Where true_type and false_type are "axiomatized types" (abstract types declared in the local module, that will never be given a definition), so they are considered as distinct (today) and eg. true_type singleton_bool can only be True. (To fix the example under the change, one has to give a phony definition to those types, for example = Phony in both cases.)

However, the example I tried to write first is below, and I realized that it is in fact rejected today.

type zero
type 'n succ

# type ('a, _) vec =
  | Nil : ('a, zero) vec
  | Cons : 'a * ('a, 'n) vec -> ('a, 'n succ) vec
;;
Error: In this definition, a type variable cannot be deduced
       from the type parameters.

If I understand correclty, this fails because axiomatized types (abstract types declared in the current module) are not considered injective in their parameters: succ is not injective in its parameter, so that 'n is not "deducible" from 'n succ in the return type of Cons.
(Note: the error message could be more helpful, for example by pointing out the constructor for which the problem occurs, and maybe even the parameter position that cannot be determined.)

This suggests that only zero-ary uses of axiomatized types could be used for GADTs in the wild, and that the axiomatized-types approach would (already today) soon show its limitations to the GADT user.

Octachron · 2019-09-12T15:58:18Z

I have run some tests on the opam repository comparing a version of the compiler with this PR integrated and without:

The handful of new failures seems completely unrelated.

lpw25 · 2019-10-04T09:14:27Z

Thanks @Octachron. I think those results on opam are very encouraging. Are people still nervous about making the change, or can we go ahead with it? If people are nervous, are there further experiments that would help make them less nervous?

garrigue · 2019-10-04T09:48:06Z

As I said above, I still advocate to wait until we have a way to give abstract types an identity, so that we could also remove the special case for predefined abstract types.
I would not oppose a warning, but this may be more work (hard to track whether this was really needed).

gasche · 2019-10-04T09:50:08Z

I find the results generally reassuring.

We had this intuition (Jacques mentioned it and I shared it) that many beginner-oriented tutorials would be affected by the change. Could we maybe look at a couple such places (starting with the nice introduction to GADTs by @Octachron in the manual) and check that there is no issue there as well?

@Octachron here are some comments from my experience looking at your opamcheck-PR results:

The basic information we get is that there was a working version-choice (a choice of versions for all dependencies of the package) before, and the tool did not find a working version-choice after. But opamcheck tried hard to find a version-choice, exploring different choices that are different from the working before-choice (which may help finding a success); when they all fail, it reports one of the failure it uses a very different version-choice from before. As a result, often the failure is unrelated to the PR at hand and it makes things hard to understand. (example) Could we always see the failure corresponding to the same version-choice as the working pre-PR version-choice?
There seems to be something odd going on with the bytes/string difference, or at least I'm missing something. See for example the macaroons failure: both the working before-choice and the failing after-choice are using 4.08.1, yet the first builds and the second fails with a seemingly classic bytes/string type incompatibility. What is going on? Could this be related to the present PR?

I wish we could investigate point (2), which may be related to the current PR.

Octachron · 2019-10-04T10:02:58Z

Concerning tutorials, I would argue that any tutorial relying on the special behavior for local abstract types is dangerous.

Such tutorial teaches how to write toy examples that become mysteriously broken once you try to transform them to real code by putting the code in a separated module. This is utterly confusing for most people.

Octachron · 2019-10-08T14:00:34Z

After a bit of investigation, the macaroon failure is not directly related, the package builds with just 4.08.0+pr8900 .

Concerning tutorials, the manual tutorial and ocamlverse tutorials would not need to be changed after this PR. Scouring the first 10 pages of google results for blog post on GADTs in OCaml, I only found two blog posts using abstract types as type level GADT. (I sent a fix for one of them). However, there was a handful of lecture slides subject to this issue. (For instance, https://www.irif.fr/~treinen/teaching/pfav/cours/cours10.handout.pdf or http://yann.regis-gianas.org/mpri/01-gadts-checking.pdf )

gasche · 2019-10-08T14:07:04Z

cc @yurug @treinen: this change to OCaml will break your teaching documents for GADTs, because the pattern of declaring fully-abstract types locally will stop working. Instead of

type empty
type non_empty
type _ gadt = Foo : empty gadt | Bar : non_empty gadt

you should use known-distinct types for the phantom types, for example

type empty = Empty
type non_empty = Non_empty
type _ gadt = Foo : empty gadt | Bar : non_empty gadt

yurug · 2019-10-08T15:27:19Z

Hi @gasche! My mpri slides are already using this idiom, which I find more sensible.

lpw25 · 2019-10-14T10:02:10Z

Do we want to consider this for 4.10? or would people like more time to make a decision?

Drup · 2019-10-15T09:24:39Z

As I said above, I still advocate to wait until we have a way to give abstract types an identity, so that we could also remove the special case for predefined abstract types.

@garrigue Do you have a concrete plan for that ? If you don't, I propose we just merge the current version, and improve the situation for predefined abstract types later.

garrigue · 2019-10-15T10:33:52Z

I can write a proposal this week. Or earlier if you prefer.
Basically the idea would be to use an attribute to indicate an id for a pure abstract type (for which no implementation can be given inside ocaml). Then two types with different ids are different.

garrigue · 2019-10-15T19:38:02Z

Just added a proof of concept proposal of unique types: #9042

Octachron · 2022-02-04T11:26:57Z

I am still very much in favor of this PR, and still semi-regularly explaining that the special behavior for abstract types defined in the current module is a beginner trap.

gasche · 2022-02-04T12:51:52Z

It's an old discussion, people are generally in agreement that the current behavior is not good, and the opam results are good. Sounds good, right?

Besides the question of teaching materials (which I think is important, but basically impossible to quantify), the thing that is giving me pause is that personally I consider that the idiom type foo = Foo is hack. It's using a feature that was not meant to do this, in a way that is clever and looks nice, but the intent is hard to understand for people not already familiar with the technique, and it has unintended term-level consequences (we can write terms that are not supposed to make sense).
So we would be deprecating something which made sense (which known downsides) to replace it with a hack?

@garrigue is patiently pushing for a different approach using "generative abstract types" (see #9042 and ocaml/RFCs#4). I haven't invested enough effort in the proposal to know what to think of it, but in general I admire the idea of not satisfying ourselves with a hack, and trying to find a feature that captures our intent here. (And I think that the external/generative/nominal proposals are also justified by other needs, for example @alainfrisch's interest in runtime type representations, it's not like the admittedly-small problem we are discussing here is the only motivation.)

You want more support (at least from @garrigue and myself) for changing the behavior of type <foo> in modules? Please go reach a consensus on Jacques' proposals :-)

yallop · 2023-01-09T15:26:51Z

Besides the question of teaching materials (which I think is important, but basically impossible to quantify), the thing that is giving me pause is that personally I consider that the idiom type foo = Foo is hack.

I'm not especially keen on it, either, but I consider type foo = | better, because it doesn't have this drawback:

it has unintended term-level consequences (we can write terms that are not supposed to make sense).

garrigue

Considering that I've been long in advancing the nominal types, I agree that it is better to get rid of this kludge.
I expected more changes in the testsuite.
This seems to show that I'm already always writing my code that doesn't depend on this.

Before merging, there should be a log message explaining the change and how to fix code in case it breaks it.

testsuite/tests/typing-gadts/test.ml

typing/ctype.ml

gasche · 2023-01-12T10:08:42Z

@yallop

I'm not especially keen on it, either, but I consider type foo = | better, because it doesn't have this drawback:

I am worried by the interaction between this choice and the fact that people are asking for structural subtyping specifically for empty types (see eg #9459). We can add structural subtyping without type equalities, but it feels to me like we are treading on very fragile ground with this choice.

On the other hand I actually like @lpw25's proposal of using type foo = private Foo instead. This is quite neat.

yallop · 2023-01-12T10:13:46Z

Could you say exactly what you're worried about? I don't see what problem there could be.

yallop

It'd be good to add @stedolan's segfaulting example from #11492 to the test suite in this PR.

We could also include a variation on the example that uses ascription rather than functors

module M = struct type t = int let int = Int end
include (M : sig type t val int : t is_int end)

since that's also currently unsound.

gasche · 2023-01-12T10:21:54Z

I am worried about the fact that someday type t = | and type u = | could be treated as equal types, breaking the distinctness assumption of people using them as generative phantom GADT markers.

yallop · 2023-01-12T10:26:50Z

I'm not worried about that, because it doesn't seem any more likely than type s = Zero and type t = Zero being treated as equal types.

lpw25 · 2023-01-12T11:36:46Z

I think maybe I'm missing something: whilst type t = | and type u = | are aren't known to be equal, they also aren't known to be not equal, so they don't work well as GADT indices.

gasche · 2023-01-12T11:41:35Z

At least this currently works (OCaml 5.0)

type t = |
type u = |
type _ gadt = T : t gadt | U : u gadt

let f : t gadt -> unit = function T -> ()
(* no exhaustivity warning *)

yallop · 2023-01-12T11:56:35Z

Oh, @lpw25 is right, of course. While they're currently known to be unequal, they won't be after this PR.

gasche · 2023-01-12T12:05:58Z

Hm, this also means that my old proposal to allow renaming constructors when re-exporting a datatype is definitely dead for good after this PR. (Not that I was very insistent on it or anything .)

It will be even harder if this PR is merged to reason on the fact that types are known to be incompatible. I guess that it will be a motivation to look again at #9042 or other proposals in that space.

lpw25 · 2023-01-12T21:28:58Z

Rebased, comments addressed and Changes entry added

gasche · 2023-01-13T22:44:11Z

Note: this PR should also update the manual:

ocaml/manual/src/tutorials/gadtexamples.etex

Lines 300 to 304 in 4d932a8

    
           Of course, not all abstract types can be refined, as this would 
        
           contradict the exhaustiveness check. Namely, builtin types (those 
        
           defined by the compiler itself, such as "int" or "array"), and 
        
           abstract types defined by the local module, are non-instantiable, and 
        
           as such cause a type error rather than introduce an equation.

garrigue mentioned this pull request Oct 15, 2019

Unique types #9042

Draft

lpw25 mentioned this pull request Aug 16, 2022

Soundness issue in Ctype.non_aliasable #11492

Open

garrigue reviewed Jan 11, 2023

View reviewed changes

testsuite/tests/typing-gadts/test.ml Outdated Show resolved Hide resolved

typing/ctype.ml Outdated Show resolved Hide resolved

yallop requested changes Jan 12, 2023

View reviewed changes

lpw25 force-pushed the ignore-current-module branch from 524a04d to 9f6c3dc Compare January 12, 2023 21:26

lpw25 added 3 commits January 12, 2023 21:27

Add test for PR#11492

14fe6af

Don't treat definitions from the current module specially

2af9e08

Add Changes entry

d657b1f

lpw25 force-pushed the ignore-current-module branch from 9f6c3dc to d657b1f Compare January 12, 2023 21:28

gasche mentioned this pull request Aug 6, 2023

GADT exhaustiveness check fails with structurally equal records/variants #12472

Closed

anmonteiro mentioned this pull request Nov 2, 2023

POC of giving a type-error for non keyed elements when are under React.array reasonml/reason-react#812

Closed

Don't treat definitions from the current module specially #8900

Are you sure you want to change the base?

Don't treat definitions from the current module specially #8900

Conversation

lpw25 commented Aug 28, 2019

glondu commented Aug 28, 2019

lpw25 commented Aug 28, 2019

Drup commented Aug 28, 2019

trefis commented Aug 28, 2019

garrigue commented Sep 2, 2019

gasche commented Sep 2, 2019

lpw25 commented Sep 2, 2019

yminsky commented Sep 2, 2019

gasche commented Sep 2, 2019

Octachron commented Sep 12, 2019

lpw25 commented Oct 4, 2019

garrigue commented Oct 4, 2019

gasche commented Oct 4, 2019

Octachron commented Oct 4, 2019

Octachron commented Oct 8, 2019

gasche commented Oct 8, 2019 • edited Loading

yurug commented Oct 8, 2019

lpw25 commented Oct 14, 2019

Drup commented Oct 15, 2019 • edited Loading

garrigue commented Oct 15, 2019

garrigue commented Oct 15, 2019

Octachron commented Feb 4, 2022

gasche commented Feb 4, 2022 • edited Loading

yallop commented Jan 9, 2023

garrigue left a comment

Choose a reason for hiding this comment

gasche commented Jan 12, 2023

yallop commented Jan 12, 2023 • edited Loading

yallop left a comment • edited Loading

Choose a reason for hiding this comment

gasche commented Jan 12, 2023

yallop commented Jan 12, 2023

lpw25 commented Jan 12, 2023

gasche commented Jan 12, 2023 • edited Loading

yallop commented Jan 12, 2023

gasche commented Jan 12, 2023

lpw25 commented Jan 12, 2023

gasche commented Jan 13, 2023

gasche commented Oct 8, 2019 •

edited

Loading

Drup commented Oct 15, 2019 •

edited

Loading

gasche commented Feb 4, 2022 •

edited

Loading

yallop commented Jan 12, 2023 •

edited

Loading

yallop left a comment •

edited

Loading

gasche commented Jan 12, 2023 •

edited

Loading