Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Change representation of class signatures #8516

Open
wants to merge 24 commits into
base: trunk
from

Conversation

@lpw25
Copy link
Contributor

commented Mar 18, 2019

This PR refactors code that handles class signatures and structures. It fixes a number of bugs including #7894 and improves error messages.

The main change is to replace the current representation of signatures:

type class_signature =
  { csig_self: type_expr;
    csig_vars:
      (Asttypes.mutable_flag * Asttypes.virtual_flag * type_expr) Vars.t;
    csig_concr: Concr.t;
    csig_inher: (Path.t * type_expr list) list }

where csig_self is restricted to be exactly a Tobject type and methods are accessed by fetching them out of this Tobject node, with:

  { csig_self: type_expr;
    mutable csig_self_row: type_expr;
    mutable csig_vars: (mutable_flag * virtual_flag * type_expr) Vars.t;
    mutable csig_meths: (private_flag * virtual_flag * type_expr) Meths.t }

where methods are represented directly in a map. The fields are also made mutable to allow class_signature to be treated more like type_expr and mutated as it is inferred. The restrictions on csig_self, which have been the cause of a number of bugs in the past, are removed.

Some of the changes are not so much bug fixes as slight changes to the design. These are mostly related to instance variables, whose behaviour was changed from shadowing to overriding in OCaml 3.10, where some remnants of the old shadowing behaviour seem to have persisted.

The first commit introduces a lot of new tests around classes. Each commit after that which is expected to change observable behaviour should have at least one test whose behaviour changes.

@garrigue

This comment has been minimized.

Copy link
Contributor

commented Mar 18, 2019

Could you first explain which bugs you are fixing (other than #7894), and where they come from.
You seem to be introducing some extensive changes, with a high risk of introducing new bugs, so I would first like to understand why this is needed (rather than just rolling back the removal of the dummy method).
The idea of making the information in class signatures more easily usable is not bad, but the output of the test suite suggests that this had at least some impact of the sharing of types, which may matter.

@lpw25

This comment has been minimized.

Copy link
Contributor Author

commented Mar 18, 2019

Could you first explain which bugs you are fixing (other than #7894), and where they come from.

I don't think most of them have bug reports. The easiest way to see them is to read through the commits one at a time. The changes to the tests in each commit should make clear what the bug is. I've even written some explanations in the commit messages for each of the changes.

You seem to be introducing some extensive changes, with a high risk of introducing new bugs, so I would first like to understand why this is needed

The main intention is really to make this code simpler and easier to maintain. The existing code is very intricate and easy to break. That means that this patch indeed has a high risk of introducing bugs, but that is an incentive to merge the patch rather than not to, since it should remove this risk for any later patch that touches this code.

inherit_class_type ~strict:false loc env sign parent.cltyp_type;
mkctf (Tctf_inherit parent)

Builtin_attributes.warning_scope ctf.pctf_attributes

This comment has been minimized.

Copy link
@alainfrisch

alainfrisch Mar 18, 2019

Contributor

It's a bit unfortunate (albeit very minor) that we need to duplicate the "logic" to handle pctf_attributes for all cases (except Pctf_attribute). One could have another helper function that encapsule the use of Builtin_attributes.warning_scope ctf.pctf_attributes and mkctf:

let mkctf_attr f = Builtin_attributes.warning_scope ctf.pctf_attributes (mkctf (f ())) in

This comment has been minimized.

Copy link
@lpw25

lpw25 Mar 18, 2019

Author Contributor

That does seem slightly better.

This comment has been minimized.

Copy link
@lpw25

lpw25 Mar 18, 2019

Author Contributor

Fixed

Builtin_attributes.warning_scope ctf.pctf_attributes
(fun () ->
let (cty, cty') = type_constraint env sty sty' ctf.pctf_loc in
mkctf (Tctf_constraint (cty, cty')))

| Pctf_attribute x ->

This comment has been minimized.

Copy link
@alainfrisch

alainfrisch Mar 18, 2019

Contributor

It took me a few seconds to realize that it's ok to ignore pctf_attributes on the Pctf_attribute node, even though they could in theory affect warnings triggered by the processing of the Pctf_attribute payload. The reason is that the concrete syntax does not allow attributes on Pctf_attribute. A PPX could insert some, but I think it's ok to ignore them.

@alainfrisch

This comment has been minimized.

Copy link
Contributor

commented Mar 18, 2019

It could be worth adding a test to show the effect of d3763f1 (likely some code that becomes ill-typed).

@lpw25

This comment has been minimized.

Copy link
Contributor Author

commented Mar 18, 2019

Oh yes, I missed that one. I'll add some tests.

@alainfrisch

This comment has been minimized.

Copy link
Contributor

commented Mar 18, 2019

Just to clarify: it seems that csig_inher is unused currently (not a consequence of other changes), or did I miss something?

@alainfrisch

This comment has been minimized.

Copy link
Contributor

commented Mar 18, 2019

I skimmed through each commit, and they all look very reasonable to me, although I don't have the expertise to review them in depth. Anyway, thanks @lpw25 for splitting the large contribution into individual commits, with good explanations and tests to show the user-side of the changes.

@alainfrisch

This comment has been minimized.

Copy link
Contributor

commented Mar 18, 2019

Some explanation about the dummy method could be useful. Is it a hack to benefit from level (now scope) analysis, and could it be replaced by something more explicit?

@lpw25

This comment has been minimized.

Copy link
Contributor Author

commented Mar 18, 2019

Just to clarify: it seems that csig_inher is unused currently (not a consequence of other changes), or did I miss something?

Yeah, it is already unused

@@ -215,7 +215,7 @@ and expression_desc =
| Texp_for of
Ident.t * Parsetree.pattern * expression * expression * direction_flag *
expression
| Texp_send of expression * meth * expression option
| Texp_send of expression * meth

This comment has been minimized.

Copy link
@alainfrisch

alainfrisch Mar 18, 2019

Contributor

I've wondered several times what was this expression option (and always had to go back to the code to figure out). Very nice to see it going away.

This comment has been minimized.

Copy link
@gasche

gasche Mar 18, 2019

Member

Seconded, I also tried (not very successfully) to figure this out as part of the "recursive value declarations" work.

@lpw25

This comment has been minimized.

Copy link
Contributor Author

commented Mar 18, 2019

Some explanation about the dummy method could be useful. Is it a hack to benefit from level (now scope) analysis, and could it be replaced by something more explicit?

It has two purposes: it stops the self types from becoming closed and it stops the self types from escaping the scope of the class declaration. It could be replaced by something more explicit, although the fundamental idea: that the restriction on closing be represented by some form of field of the object type is probably the right way to do things.

@alainfrisch

This comment has been minimized.

Copy link
Contributor

commented Mar 18, 2019

Not directly related to your commits, but I've always found it weird that cl_num is represented as a string (which is always a stringified int, right?). What about introducing at least a type alias for the cl_num fields in Types (Val_self/Val_anc), perhaps switching to something else than string? And/or switching those constructors to use inline records, to document the meaning of their arguments.

@lpw25

This comment has been minimized.

Copy link
Contributor Author

commented Mar 18, 2019

I'd like to tidy that up at some point, but I'd rather not do it in this PR -- which already rewrites a lot of the classes code.

@alainfrisch

This comment has been minimized.

Copy link
Contributor

commented Mar 18, 2019

It has two purposes

Thanks! It could be useful to write this down in the code, if you find a natural place to put it.

I'd rather not do it in this PR

Fair enough.

@lpw25 lpw25 force-pushed the lpw25:simplify-type-class-structure branch from 5deac37 to 86966a3 Mar 18, 2019

@lpw25

This comment has been minimized.

Copy link
Contributor Author

commented Mar 18, 2019

I've added tests for the change to instance variables in class types.

@garrigue

This comment has been minimized.

Copy link
Contributor

commented Mar 19, 2019

I have looked a bit at the commits to find what you are trying to fix, but this didn't help me much.
You add many tests, but there is no clear statement of what you expect to be the correct outcome of the test. The class system being underspecified, we might not agree on what it should be :)
Same problem with warnings. For instance, you choose to warn as soon as unification introduces a non-declared virtual method. The previous behavior of only warning when they were introduced by a method call was intentional, to avoid typos. The goal is less clear in you example.
I think we need a list of all changes, and why they were introduced, so that we can discuss them properly.
As always, I would have preferred discussing the specification issues before writing the code, rather than now with this huge PR.

From a different perspective, my impression was that the class language was stable, after many years of fixing bugs, and the best thing to do was to leave it alone as long as we don't want to introduce new features. You are clearly taking a wholly different approach, but does it mean that you want to introduce new features?

@lpw25

This comment has been minimized.

Copy link
Contributor Author

commented Mar 19, 2019

I think we need a list of all changes, and why they were introduced, so that we can discuss them properly.

I think that the list of specification changes -- i.e. things that are not just obvious bug fixes -- is as follows:

Extend warning 17 to catch indirect cases

Currently warning 17 occurs when attempting to directly send a method to self that has not been declared:

# class virtual c = object (self) initializer self#foo end;;
Characters 44-52:
  class virtual c = object (self) initializer self#foo end;;
                                              ^^^^^^^^
Warning 17: the virtual method foo is not declared.
class virtual c : object method private virtual foo : unit end

However, it does not detect the same thing happening indirectly:

# let send_foo o = o#foo;;
val send_foo : < foo : 'a; .. > -> 'a = <fun>
# class virtual c = object (self) initializer send_foo self end;;
class virtual c : object method virtual foo : unit end

This PR addresses that short-coming:

# let send_foo o = o#foo;;
val send_foo : < foo : 'a; .. > -> 'a = <fun>
#   class virtual c = object (self) initializer send_foo self end;;
Line 2, characters 18-61:
2 | class virtual c = object (self) initializer send_foo self end;;
                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Warning 17: the virtual method foo is not declared.
class virtual c : object method virtual foo : unit end

Check for warning 15 in more cases

Warning 15 detects when a private method is implicitly made public:

# class c = object(self) method private foo = () initializer send_foo self end;;
Characters 10-76:
  class c = object(self) method private foo = () initializer send_foo self end;;
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Warning 15: the following private methods were made public implicitly:
 foo.
class c : object method foo : unit end

However, it currently only applies for class structures, failing to detect cases like:

# class ['a] c = object (_ : 'a) method private foo = () end;;
class ['a] c :
  object ('a) constraint 'a = < .. > method private foo : unit end
# class d = [ < foo : unit; .. > ] c;;
class d : object method foo : unit end

and it is applied for class types even though the same issue occurs:

# class type c = object (< foo : unit; ..>) method private foo : unit end;;
class type c = object method foo : unit end

This PR addresses those short-comings:

# class d = [ < foo : unit; .. > ] c;;
Line 1, characters 10-34:
1 | class d = [ < foo : unit; .. > ] c;;
              ^^^^^^^^^^^^^^^^^^^^^^^^
Warning 15: the following private methods were made public implicitly:
 foo.
class d : object method foo : unit end
# class type c = object (< foo : unit; ..>) method private foo : unit end;;

Line 1, characters 15-71:
1 | class type c = object (< foo : unit; ..>) method private foo : unit end;;
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Warning 15: the following private methods were made public implicitly:
 foo.
class type c = object method foo : unit end

Relax object duplication restrictions

Currently object duplications (i.e. {< ... >}) can only refer toinstance variables that are specified before the method in which they occur:

# class c = object method foo = {< x = 5 >} val x = 0 end;;
Characters 30-41:
  class c = object method foo = {< x = 5 >} val x = 0 end;;
                                ^^^^^^^^^^^
Error: Unbound instance variable x

This restriction made sense before OCaml 3.10 when instance variables could shadow one another. However, instancevariables now override one another instead and so the restriction is not needed:

# class c = object method foo = {< x = 5 >} val x = 0 end;;
class c : object ('a) val x : int method foo : 'a end

Override instance variables in class types

Currently, the behaviour of repeated instance variable definitions in class types is very odd: it partially shadows and partially overrides the previous definition. The best example of this behaviour is probably:

# class type c = object val x : int val virtual x : float end;;
class type c = object val x : float end

This is probably a hangover from the change of behaviour in 3.10. Fixing this mixed behaviour is probably a bug-fix, however I've chosen to fix it by making repeated definitions override previous definitions -- which is less flexible than shadowing but more consistent with the behvaviour is classes -- and that's probably a change in specification.

Remove dummy methods from classes earlier

I've moved the removal of dummy methods from classes to earlier in the process, which allows more definitions such as:

class c = object method m = 5 end
  and d = object method n = object inherit c end end;;

that mix mutual class recursion and object literals.

@lpw25

This comment has been minimized.

Copy link
Contributor Author

commented Mar 19, 2019

As always, I would have preferred discussing the specification issues before writing the code, rather than now with this huge PR.

I did not set out to make changes to the specification, I just wanted to make the code clearer and fix one bug. But, as always with the compiler, once I understood the code it was clear there were a number of bugs and undesirable behaviours. All the changes to the specification above are obvious once the code is made clearer -- it becomes clear that we are checking something in unnatural place or adding ad-hoc difference between equivalent operations -- and so by making them you improve the specification and further improve the code.

@garrigue

This comment has been minimized.

Copy link
Contributor

commented Mar 20, 2019

Thank you for this list. This clarifies a lot.

Concerning warning 17, this may be open to debate, but I believe we should stick to the original behavior: there is nothing in the language that says that virtual methods should only be declared with the virtual keyword; this warning is rather about protecting against typos in calls to self-methods in the body of a class. So maybe the name of the warning is wrong, but I don't think its extent should change.

Concerning warning 15, again this is open to debate. I would be more inclined to side with you on that one, but we must see the implications.

Concerning the relaxation of object duplication, I'm not sure. The restriction is not completely unreasonable (usually one declares instance variables first, so this shouldn't happen anyway), but I agree this would be fine to remove it. Removing it of course has a cost, which disappears if you have two passes.

Concerning the overriding of instance variable declarations, I couldn't understand what is the new behavior: what is the difference between shadowing and overriding here? I agree this is a left-over of the previous behavior, but not dangerous. Note also that through a class type coercion one can hide an instance variable, and create later a new one with the same name, so changing the type of an instance variable is not meaningless, and allowing it in class types may indeed be a comfort, so changing this behavior could be problematic.

Last, for your examples allowed by removing the dummy method earlier, they seem indeed nice.
We could live with the limitation, but removing it is an improvement.

First, we need to decide on all these issues. Then I suppose that the improvements provided could make these changes worthwhile, despite the risks. The point being that improving maintainability is great, but if there is no maintenance needed to start with it may be both gratuitous and dangerous.

I'm still quite afraid of the size of your changes. I can try to read the code, but I'm not sure I could make a proper review. Fortunately you added lots of tests, which is a major plus. However the output changes on a lot of existing tests, which makes the checks difficult. And we also need to verify that this doesn't break any existing code in the wild.

I'm also concerned that your patch clearly changes the sharing inside types, as the results of tests show. This may be related to the redundancy you introduce in the representation of class signatures. I'm not sure how to investigate that, and I do not see a clear impact other than the printed types, so maybe it's fine. (I'm assuming that all tests run both in standard an principal mode, which should be the case with expect tests.)

lpw25 added some commits Jan 29, 2019

Relax object duplication restrictions
Currently object duplications (i.e. `{< ... >}`) can only refer to
instance variables that are specified before the method in which
they occur. This restriction made sense before OCaml 3.10 when
instance variables could shadow one another. However, instance
variables now override one another instead and so the restriction
is not needed.
Build met_env on second pass
Move the extension of the method environment to the second pass of
`class_field` in "typeclass.ml".
Change representation of class types
Previously, class types represented their methods via the csig_self
field. This was a type_expr that was restricted to be syntactically
a Tobject node. With this patch the methods are represented directly
with a methods table. csig_self is no longer restricted to be
a Tobject node and is no longer required to contain the private methods.
We also add a csig_self_row field to hold the row variable of the class
type -- which means we can avoid going through csig_self to find it.
Add dummy methods in all cases
Fix some regressions introduced 4.07 by ensuring that class types and
class expressions for classes under construction always include a dummy
method.

lpw25 added some commits Feb 27, 2019

Check for warning 15 when completing a class type
Previously warning 15 (private method implicitly made public) only
detected cases in class structures. However, the case being detected
can happen outside of class structures and so the check should be
applied more widely. Since the new [complete_class_signature] function
handles updating the method tables to reflect implicit changes, this
seems the right place to put the check.
Extend warning 17 to catch indirect cases
Warning 17 warns when a virtual method is not declared. Previously
it only caught cases where the virtual method was implicitly created
by a direct method call on [self]. Now it will also catch cases
where the virtual method is implicitly created by some other means.
Give more precise errors for virtual methods
Previously, the check that a non-virtual class did not
have virtual methods was delayed until the last stage of
type-checking the class. Now we do those checks more eagerly
allowing for clearer errors, especially in common cases.
Represent ancestor variables more directly
Previously, some of the translation of method calls on ancestor
variables (bindings from `inherit` statements) was done in typecore.ml
by creating fake typedtree expressions. This has been moved into
translcore.ml which is simpler. We also pass the types of inherited
methods directly as part of the ancestor variable's kind, which
eleminates the need for the dummy `selfpat-n` variables.
Track dummy methods via their scope
Prevent self types from escaping using the scope of the dummy
method field rather than its level. This is a more accurate
representation of the restriction and allows the level of the
self type to vary, which fixes a bug in principal mode.
Remove private_self and public_self
Now that private methods are stored in the method table there is
no need to distinguish `private_self`, `public_self` and `self_type`
in `Typeclass.class_structure`. This required extending
`Ctype.filter_method` to handle the case of filtering a private method
from a closed object type -- which brings the function into line with
the equivalent call to `Ctype.unify`.
Treat class_signature more like type_expr
Treat class_signature more like type_expr by making all its
components mutable and using three unification-like operations
for manipulating them: add_method, add_instance_variable and
inherit_class_signature. These operations behave similarly to
filter_self_method, which they replace.

We move much of the logic for handling class_signature into Btype
and Ctype and use it for the typing of class signatures
and class structures.

Instead of using method and variable tables with both identifiers and
type components as accumulators during class structure typing, we use
tables with just identifiers along with a single class_signature. This
makes the logic clearer and makes it easier to share things with the
typing of class signatures.
Fix warning attributes in class signatures
[@@@warning ...] was not previously preserved in class signatures
Set object name of self type
Sets the object name of the self type to the #-abbreviation, which
improves error messages and allows us to remove the `unify_parents`
functions from Typeclass.
Keep class signature row up-to-date
Keep the csig_self_row field up-to-date as we add new methods. This
seems more correct and should be slightly more efficient in the
common case.
Rename Concr
The Types.Concr module is just a string set, and it is used in
a number of places. It was originally used for the set of
concrete methods in a class signature, but it's not used for
that anymore and almost none of the places it is still used
are related to that. This commit replaces is with MethSet and
VarSet modules for sets of method or instance variable labels,
and with Misc.Stdlib.String.Set for other uses.
Change warning 13 message
Remove the mention of OCaml 3.10 in the message of warning 13 (instance
variable override). It's been over a decade since that release -- I
think we can stop telling people that it changed the behaviour of
instance variables.
Move method spine generalisation logic into Ctype
Move the method spine generalisation code into ctype.ml. Since that
is the only code that needs to use generalize_spine we also hide
generalize_spine from the interface of Ctype and rewrite the other
use to use generalize_structure. (I would prefer to get rid of
generalize_spine and use generalize_structure for methods as well,
since generalize_spine seems a bit suspect in terms of principality,
but that change would be very invovlved).
Override instance variables in class types
Previously, the behaviour of instance variables in class types was
pretty strange. Given two declarations of the same variable, the
second declaration would partially override and partially shadow
the first declaration. I presume this is a hangover from the
change in instance variable behvaiour in OCaml 3.10. This commit
brings the behaviour into line with the behaviour in classes -- the
second definition overrides the first. It is more restrictive than
the previous behaviour, so it is not backwards compatible, but it
is much more consistent.
Add helpers for warning_scope in typeclass.ml
Add a couple of small helper functions to reduce code duplication
around calls to warning_scope.

@lpw25 lpw25 force-pushed the lpw25:simplify-type-class-structure branch from 86966a3 to 056de11 Mar 20, 2019

@lpw25

This comment has been minimized.

Copy link
Contributor Author

commented Mar 20, 2019

Concerning warning 17...

I don't know what the original motivation for the warning was, but I disagree with you about what the purpose of the warning should be. The ability to implicitly declare virtual methods is, in my opinion, a confusing misfeature of the language. This warning tells you when this misfeature has occurred allowing you to avoid it, or to at least understand when it is the source of a confusing error. Besides, typos are no less likely to occur indirectly than directly, so even with your proposed meaning for this warning the change is an improvement.

Removing it of course has a cost, which disappears if you have two passes.

It's probably worth pointing out that we already had two passes really, its just that they were written using laziness rather than as explicit passes. This PR makes all this explicit.

Concerning the overriding of instance variable declarations, I couldn't understand what is the new behavior: what is the difference between shadowing and overriding here?

So the current behaviour is that the type and mutability are taken from the second definition, whilst the "virtual-ness" is the least-upper bound from both definitions. The new behaviour is that the types of both definitions are unified, the mutability of both definitions must agree and the "virtual-ness" is the least-upper bound from both definitions.

The new behaviour makes sense -- it matches the behaviour in classes. Another reasonable behaviour would be to take everything from the second definition and ignore the first. As far as I can tell, the current behaviour makes no sense and is probably a bug.

The point being that improving maintainability is great, but if there is no maintenance needed to start with it may be both gratuitous and dangerous.

There is always maintenance needed. When you leave bad code around in an evolving code base it always ends requiring changes, and since the code is hard to understand people try to make the smallest changes possible (i.e. hacks), so the code always just gets worse over time.

The changes in 4.07 are a great example. In order to get disambiguation of GADTs without breaking people's code we needed to make a slight change to how object literals were handled. Because the code was hard to understand we ended up putting in a number of bugs. If the code had been easy to understand that wouldn't have happened, and if we leave typeclass.ml in the state it is in then it will certainly happen again. I can't imagine managing to add modular implicits, algebraic effects, etc. without needing to make at least some changes to this file.

It also prevents making even small improvements to how objects and classes are handled. For example, see #7159, where you say:

I see your point.
As you point out, it shouldn't be that difficult, since one should just propagate the expected type to the internal self-type.
However, code in typeclass is really hairy, so don't hold your breath.

with this PR that change would be relatively simple (I considered including it in this PR but decided there was already enough in here).

I'm also concerned that your patch clearly changes the sharing inside types, as the results of tests show.

I don't think it changes much about the sharing inside the types. It's more that it adds another copy of the method types, and that copy is used for printing which obviously affects the output. The change in
this commit also affects the printing.

I'm still quite afraid of the size of your changes. I can try to read the code, but I'm not sure I could make a proper review.

I would suggest just going one commit at a time. I think the only difficult commits are going to be:

the rest are fairly small and self-contained. We can always split parts off into other PRs and merge them as we go. We can also undo changes to the specification if needed and then put them in a separate PR.

(Note that I just did a rebase and force push to fix the order that the commits are displayed by github -- they should now appear in the correct order.)

rc {cl_desc = Tcl_constraint (cl, Some clty, vals, meths, concrs);
cl_loc = scl.pcl_loc;
cl_type = snd (Ctype.instance_class [] clty.cltyp_type);
cl_type = ty;

This comment has been minimized.

Copy link
@garrigue

garrigue Mar 25, 2019

Contributor

How is this line related with the addition of the dummy method?

^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Error: This class should be virtual. The following methods are undefined : a
Error: This class has undeclared virtual methods.
The following virtual methods were not declared : a
|}];;

This comment has been minimized.

Copy link
@garrigue

garrigue Mar 26, 2019

Contributor

This error message is strange: there is a warning for undeclared virtual methods, but this is not an error.

Error: This class should be virtual.
The following methods are undefined : bar
Error: This class has undeclared virtual methods.
The following virtual methods were not declared : bar
|}];;

This comment has been minimized.

Copy link
@garrigue

garrigue Mar 26, 2019

Contributor

This error message is strange: there is a warning for undeclared virtual methods, but this is not an error.

@@ -835,7 +836,7 @@ let rec update_level env level expand ty =
set_level ty level;
iter_type_expr (update_level env level expand) ty
| Tfield(lab, _, ty1, _)
when lab = dummy_method && (repr ty1).level > level ->
when lab = dummy_method && level < (repr ty1).scope ->
raise Trace.(Unify [escape Self])

This comment has been minimized.

Copy link
@garrigue

garrigue Mar 26, 2019

Contributor

Good idea. The prohibition to lower the level of the dummy method (even locally) was a real pain.

@@ -1543,35 +1542,35 @@ let path_of_module mexp =

(* Check that all core type schemes in a structure are closed *)

This comment has been minimized.

Copy link
@garrigue

garrigue Mar 26, 2019

Contributor

Should probably change this comment too.

@@ -441,13 +441,11 @@ let message = function
| Unused_match -> "this match case is unused."
| Unused_pat -> "this sub-pattern is unused."
| Instance_variable_override [lab] ->
"the instance variable " ^ lab ^ " is overridden.\n" ^
"The behaviour changed in ocaml 3.10 (previous behaviour was hiding.)"
"the instance variable " ^ lab ^ " is overridden."

This comment has been minimized.

Copy link
@garrigue

garrigue Mar 26, 2019

Contributor

The fact 3.10 was a long time ago doesn't mean there is no old code left around.
(Yet there was no warning for the previous transition, from OCaml 1 to OCaml 2.00, when the behavior changed from overriding to shadowing.)

csig_meths = Types.Meths.empty ;
csig_inher = []; }
csig_vars = Types.Vars.empty;
csig_meths = Types.Meths.empty; }
in

This comment has been minimized.

Copy link
@garrigue

garrigue Mar 26, 2019

Contributor

csig_inher was intended to provide information about the inheritance hierarchy of a class type.
This is certainly useful information, and actually I was convinced that ocamldoc used it.
I believe Maxence asked me to add it.
Before removing it, it might be worth thinking again whether somebody has a use for it.

@garrigue

This comment has been minimized.

Copy link
Contributor

commented Mar 26, 2019

Concerning warning 17...

I don't know what the original motivation for the warning was, but I disagree with you about what the purpose of the warning should be. The ability to implicitly declare virtual methods is, in my opinion, a confusing misfeature of the language. This warning tells you when this misfeature has occurred allowing you to avoid it, or to at least understand when it is the source of a confusing error. Besides, typos are no less likely to occur indirectly than directly, so even with your proposed meaning for this warning the change is an improvement.

Basically, your change means that all virtual methods must be explicitly declared.
In particular, you would have a warning with the following code:

type t = < a : int >
class virtual c = object (_ : <t; ..>) end

This code uses "interface inheritance through self", and I believe it is completely legit.

Of course one can make typos anywhere. But usually one is more careful when writing types than method calls, so I don't see the two problems are equivalent.

At the very least I would suggest moving this change to a separate PR, where we can discuss it independently. Same thing for warning 15, since this is basically the same problem (manipulating the interface through direct constraints to self).

Removing it of course has a cost, which disappears if you have two passes.

It's probably worth pointing out that we already had two passes really, its just that they were written using laziness rather than as explicit passes. This PR makes all this explicit.

Of course. Actually there were two passes in ocaml 1.00, and Jérôme changed that to one pass in ocaml 2.00, which resulted in using lazy when introducing polymorphic methods.
I should probably have reverted to two passes at that point, but somehow the changes where very circumscribed.

Concerning the overriding of instance variable declarations, I couldn't understand what is the new behavior: what is the difference between shadowing and overriding here?

So the current behaviour is that the type and mutability are taken from the second definition, whilst the "virtual-ness" is the least-upper bound from both definitions. The new behaviour is that the types of both definitions are unified, the mutability of both definitions must agree and the "virtual-ness" is the least-upper bound from both definitions.

The new behaviour makes sense -- it matches the behaviour in classes. Another reasonable behaviour would be to take everything from the second definition and ignore the first. As far as I can tell, the current behavior makes no sense and is probably a bug.

OK. I'm fine with the new behavior, but I would still suggest moving it to a separate PR, as there is a potential for breakage here.

The point being that improving maintainability is great, but if there is no maintenance needed to start with it may be both gratuitous and dangerous.

There is always maintenance needed. When you leave bad code around in an evolving code base it always ends requiring changes, and since the code is hard to understand people try to make the smallest changes possible (i.e. hacks), so the code always just gets worse over time.

I kind of agree. But I'm not sure your extensive changes remove all the mystery in typeclass.
Thank you for the courage anyway, I'm just too much of a coward.

I'm also concerned that your patch clearly changes the sharing inside types, as the results of tests show.

I don't think it changes much about the sharing inside the types. It's more that it adds another copy of the method types, and that copy is used for printing which obviously affects the output. The change in
(this commit)[https://github.com//pull/8516/commits/d8cab4df0c2746f9b6c3f85072948c7c76830342] also affects the printing.

Did you measure the change in the size of cmi's, for lablgtk for instance?
This was a real concern at some point.
Also, there is some logic in Subst to avoid duplicating too much (see "Do not clean up if saving"), but it may have suffered from bit-rot, and might be better removed if it doesn't help anymore.

I'm still quite afraid of the size of your changes. I can try to read the code, but I'm not sure I could make a proper review.

I would suggest just going one commit at a time. I think the only difficult commits are going to be:

I did go through each commit, adding a few comments.
My overall impression is that what you are doing is reasonable and helpful, but I think it is just too big for me to do a proper review (sorry, I'm a bad code reviewer).
If somebody is ready to go through all the code with good eyes, this might help (@trefis ?).
Discussing it directly might also work.

At this point, my only request would be to move the above changes of behavior to separate PRs, so that we can first fix the real bug this PR was supposed to fix.

@lpw25

This comment has been minimized.

Copy link
Contributor Author

commented Mar 26, 2019

Thanks for the preliminary review. I'm happy to move the more controversial parts into their own PRs. I'm on holiday for a few weeks, so it will be a little while before I get around to it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
4 participants
You can’t perform that action at this time.