Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Don't treat definitions from the current module specially #8900

Open
wants to merge 1 commit into
base: trunk
from

Conversation

@lpw25
Copy link
Contributor

lpw25 commented Aug 28, 2019

The type-checker treats abstract type definitions in the current specially:

# type (_, _) eq = Refl : ('a, 'a) eq;;
    
# module M = struct
      type a
      type b
      let absurd : 'a. (a, b) eq -> 'a = function _ -> .
    end;;
module M : sig type a type b val absurd : (a, b) eq -> 'a end

# module M = struct
      module N = struct
        type a
        type b
      end
      let absurd : 'a. (N.a, N.b) eq -> 'a = function _ -> .
    end;;
Line 6, characters 54-55:
6 |       let absurd : 'a. (N.a, N.b) eq -> 'a = function _ -> .
                                                          ^
Error: This match case could not be refuted.
       Here is an example of a value that would reach it: Refl

In my experience this special treatment confuses people a lot and is not really useful for anything.

This PR removes the special treatment. This is not a backwards compatible change so it could probably use some testing on opam.

@glondu

This comment has been minimized.

Copy link
Contributor

glondu commented Aug 28, 2019

In your example, the special treatment looks legit: two abstract types defined in another module (N.a and N.b) could actually be equal (so that a value of type (N.a, N.b) eq can be constructed) whereas when they are defined in the same module, they are different by construction.

@lpw25

This comment has been minimized.

Copy link
Contributor Author

lpw25 commented Aug 28, 2019

they are different by construction

This is debatable.

When I declare an abstract type I might mean a fresh empty type. I might also mean a non-empty type where all the values are constructed using C primitives. For example, Obj.t is declared as an abstract type in Obj but it is the opposite of empty.

Similarly, when I declare two abstract types I might mean two separate types, but I also might mean two equivalent types whose equivalence I do not intend to expose. Without a concrete definition OCaml cannot be certain that two types are genuinely incompatible.

However, this is all besides the point. The issue is more practical than that. Treating types from the same structure differently simply creates confusion for no increase in expressivity. Having explained what is happening is such cases many, many times I think it would be easier to just remove this special casing.

@Drup

This comment has been minimized.

Copy link
Contributor

Drup commented Aug 28, 2019

On paper I agree, but I'm fairly certain this is going to break some code, so we really need to evaluate this on opam.

@trefis

This comment has been minimized.

Copy link
Contributor

trefis commented Aug 28, 2019

I'm fairly certain this is going to break some code, so we really need to evaluate this on opam.

Yes… and no?
The fix is trivial, backward compatible, and even if many packages break, it's unlikely that any package will be broken in more than one or two places.

Which doesn't mean that we shouldn't run this on opam. But I think once that's done we just want to merge, and notify maintainers that they should update their package.

@garrigue

This comment has been minimized.

Copy link
Contributor

garrigue commented Sep 2, 2019

As much as I wish I hadn't created this difference in the first place, this needs to be debated and advertised.
This basically means that almost all code snippets using GADTs in ocaml available around will break, which may not be the best way to help beginners.

Personally I would advocate to wait until we provide a way to define abstract types with an identity, which would also avoid the workaround we have for abstract types from the initial environment.
Actually, the current rule is not that complicated: the abstract type treated specially are those which have a non-prefixed, be it because they are from the initial environment or from the current module.

@gasche

This comment has been minimized.

Copy link
Member

gasche commented Sep 2, 2019

This basically means that almost all code snippets using GADTs in ocaml available around will break, which may not be the best way to help beginners.

I think that this point should make us take a long, hard pause.

@lpw25

This comment has been minimized.

Copy link
Contributor Author

lpw25 commented Sep 2, 2019

I really hope this doesn't break many code snippets, because such snippets are very confusing for beginners. They are wrong and teach people to use GADTs incorrectly. I suspect that making such snippets be an explicit error will probably improve the situation for beginners.

@yminsky

This comment has been minimized.

Copy link

yminsky commented Sep 2, 2019

Maybe the way to make progress on this PR is to gather information from opam on the nature of the breakages that it creates? That should be some clue as to the nature of the incompatibilities introduced by the change.

@gasche

This comment has been minimized.

Copy link
Member

gasche commented Sep 2, 2019

An example that works today and would be broken by the change is:

type true_type
type false_type

type _ singleton_bool =
  | True : true_type singleton_bool
  | False : false_type singleton_bool

Where true_type and false_type are "axiomatized types" (abstract types declared in the local module, that will never be given a definition), so they are considered as distinct (today) and eg. true_type singleton_bool can only be True. (To fix the example under the change, one has to give a phony definition to those types, for example = Phony in both cases.)

However, the example I tried to write first is below, and I realized that it is in fact rejected today.

type zero
type 'n succ

# type ('a, _) vec =
  | Nil : ('a, zero) vec
  | Cons : 'a * ('a, 'n) vec -> ('a, 'n succ) vec
;;
Error: In this definition, a type variable cannot be deduced
       from the type parameters.

If I understand correclty, this fails because axiomatized types (abstract types declared in the current module) are not considered injective in their parameters: succ is not injective in its parameter, so that 'n is not "deducible" from 'n succ in the return type of Cons.
(Note: the error message could be more helpful, for example by pointing out the constructor for which the problem occurs, and maybe even the parameter position that cannot be determined.)

This suggests that only zero-ary uses of axiomatized types could be used for GADTs in the wild, and that the axiomatized-types approach would (already today) soon show its limitations to the GADT user.

@Octachron

This comment has been minimized.

Copy link
Contributor

Octachron commented Sep 12, 2019

I have run some tests on the opam repository comparing a version of the compiler with this PR integrated and without:

The handful of new failures seems completely unrelated.

@lpw25

This comment has been minimized.

Copy link
Contributor Author

lpw25 commented Oct 4, 2019

Thanks @Octachron. I think those results on opam are very encouraging. Are people still nervous about making the change, or can we go ahead with it? If people are nervous, are there further experiments that would help make them less nervous?

@garrigue

This comment has been minimized.

Copy link
Contributor

garrigue commented Oct 4, 2019

As I said above, I still advocate to wait until we have a way to give abstract types an identity, so that we could also remove the special case for predefined abstract types.
I would not oppose a warning, but this may be more work (hard to track whether this was really needed).

@gasche

This comment has been minimized.

Copy link
Member

gasche commented Oct 4, 2019

I find the results generally reassuring.

We had this intuition (Jacques mentioned it and I shared it) that many beginner-oriented tutorials would be affected by the change. Could we maybe look at a couple such places (starting with the nice introduction to GADTs by @Octachron in the manual) and check that there is no issue there as well?

@Octachron here are some comments from my experience looking at your opamcheck-PR results:

  1. The basic information we get is that there was a working version-choice (a choice of versions for all dependencies of the package) before, and the tool did not find a working version-choice after. But opamcheck tried hard to find a version-choice, exploring different choices that are different from the working before-choice (which may help finding a success); when they all fail, it reports one of the failure it uses a very different version-choice from before. As a result, often the failure is unrelated to the PR at hand and it makes things hard to understand. (example) Could we always see the failure corresponding to the same version-choice as the working pre-PR version-choice?

  2. There seems to be something odd going on with the bytes/string difference, or at least I'm missing something. See for example the macaroons failure: both the working before-choice and the failing after-choice are using 4.08.1, yet the first builds and the second fails with a seemingly classic bytes/string type incompatibility. What is going on? Could this be related to the present PR?

I wish we could investigate point (2), which may be related to the current PR.

@Octachron

This comment has been minimized.

Copy link
Contributor

Octachron commented Oct 4, 2019

Concerning tutorials, I would argue that any tutorial relying on the special behavior for local abstract types is dangerous.

Such tutorial teaches how to write toy examples that become mysteriously broken once you try to transform them to real code by putting the code in a separated module. This is utterly confusing for most people.

@Octachron

This comment has been minimized.

Copy link
Contributor

Octachron commented Oct 8, 2019

After a bit of investigation, the macaroon failure is not directly related, the package builds with just 4.08.0+pr8900 .

Concerning tutorials, the manual tutorial and ocamlverse tutorials would not need to be changed after this PR. Scouring the first 10 pages of google results for blog post on GADTs in OCaml, I only found two blog posts using abstract types as type level GADT. (I sent a fix for one of them). However, there was a handful of lecture slides subject to this issue. (For instance, https://www.irif.fr/~treinen/teaching/pfav/cours/cours10.handout.pdf or http://yann.regis-gianas.org/mpri/01-gadts-checking.pdf )

@gasche

This comment has been minimized.

Copy link
Member

gasche commented Oct 8, 2019

cc @yurug @treinen: this change to OCaml will break your teaching documents for GADTs, because the pattern of declaring fully-abstract types locally will stop working. Instead of

type empty
type non_empty
type _ gadt = Foo : empty gadt | Bar : non_empty gadt

you should use known-distinct types for the phantom types, for example

type empty = Empty
type non_empty = Non_empty
type _ gadt = Foo : empty gadt | Bar : non_empty gadt
@yurug

This comment has been minimized.

Copy link

yurug commented Oct 8, 2019

Hi @gasche! My mpri slides are already using this idiom, which I find more sensible.

@lpw25

This comment has been minimized.

Copy link
Contributor Author

lpw25 commented Oct 14, 2019

Do we want to consider this for 4.10? or would people like more time to make a decision?

@Drup

This comment has been minimized.

Copy link
Contributor

Drup commented Oct 15, 2019

As I said above, I still advocate to wait until we have a way to give abstract types an identity, so that we could also remove the special case for predefined abstract types.

@garrigue Do you have a concrete plan for that ? If you don't, I propose we just merge the current version, and improve the situation for predefined abstract types later.

@garrigue

This comment has been minimized.

Copy link
Contributor

garrigue commented Oct 15, 2019

I can write a proposal this week. Or earlier if you prefer.
Basically the idea would be to use an attribute to indicate an id for a pure abstract type (for which no implementation can be given inside ocaml). Then two types with different ids are different.

@garrigue garrigue mentioned this pull request Oct 15, 2019
@garrigue

This comment has been minimized.

Copy link
Contributor

garrigue commented Oct 15, 2019

Just added a proof of concept proposal of unique types: #9042

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
9 participants
You can’t perform that action at this time.