Skip to content

Remove erroneous optimization from eqtype#13306

Merged
goldfirere merged 7 commits intoocaml:trunkfrom
goldfirere:eqtype-no-phys-eq
Jul 23, 2024
Merged

Remove erroneous optimization from eqtype#13306
goldfirere merged 7 commits intoocaml:trunkfrom
goldfirere:eqtype-no-phys-eq

Conversation

@goldfirere
Copy link
Contributor

From the top of Ctype.eqtype, I've removed a call to eq_type and added this comment:

  (* It's tempting to check for physical equality via [eq_type] here, but that
     would be incorrect: imagine comparing ['a * 'a] with ['b * 'a]. The
     first ['a] and ['b] would be identified in [eqtype_subst], and then
     the second ['a] and ['a] would be [eq_type]. So we do not call [eq_type]
     here. *)

See if you agree with the comment.

There is no known actual bug that this fixes, and thus no test case. But I'm doing some other work around here in the Jane Street branch, and the check is causing problems in that context.

@Octachron
Copy link
Member

I agree that mixing the equalities of eqtype_subst with the physical eq_type is only valid if the type variable nodes of both types are separated. Now, I am wondering if those type variables nodes can be shared in the vanilla compiler.

@goldfirere
Copy link
Contributor Author

It turns out that the physical equality check is necessary when rename is false, as showed up when I tried to remove the further calls to eq_type that you found @Octachron. Hopefully this version is better.

@Octachron
Copy link
Member

We could also handle the variable case directly:

    | (Tvar _, Tvar _)  ->
        if rename then eqtype_subst type_pairs subst t1 t2
        else if eq_type t1 t2 then ()
        else raise_unexplained_for Equality

But I agree that using the representative physical equality is correct without renaming and faster.

@Octachron
Copy link
Member

I am also wondering if we want to add a short-circuit test in Ctype.equal:

let equal env rename tyl1 tyl2 =
  if not rename && List.for_all2 eq_type tyl1 tyl2 then () else
  let subst = ref [] in
  try eqtype_list rename (TypePairs.create 11) subst env tyl1 tyl2
  with Equality_trace trace ->
    raise (Equality (expand_to_equality_error env trace !subst))

under the hypothesis that the optimisation is safe and could trigger not that infrequently in this location.

If [eqtype] no longer assumes separation, then [Printtyp.Names]
shouldn't either.
typing/ctype.ml Outdated
let equal env rename tyl1 tyl2 =
if List.length tyl1 <> List.length tyl2 then
raise_unexplained_for Equality;
if not rename && List.for_all2 eq_type tyl1 tyl2 then () else
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since there are no substitution in scope at this point, the not rename guard should not be needed, isn't it? If tyl1 and tyl2 are physically equal, they will be equal with the trivial substitution.

@garrigue
Copy link
Contributor

After the explanation, I think I understand the problem, and why it doesn't happen in practice.
There are two ways to use eqtype: either to check equivalent types during module subtyping, allowing renaming, but using disjoint types, or to check type equality in a context where you don't want to modify types like unify would, and not allowing renaming. But you are right that you are not losing any useful optimization, and it is always better to have correct code without assuming implicit invariants.

Co-authored-by: Gabriel Scherer <gabriel.scherer@gmail.com>
let substitute ty =
match List.assq ty !name_subst with
| ty' -> substitute ty'
| ty' -> ty'
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a fix, but the commit message seems inexact to me:

  • with separation, the inner substitute always falls because variables on the right side never appear on the left side
  • without separation the previous call is mixing the two sides and deciding that 'a * 'b * int = 'b * 'c * float, the left hand side should be displayed as 'c * ' c * int is not equal to 'c * 'c * float rather than 'b * 'c * int is not equal to 'b * 'c * float.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Without separation, it's even worse: the previous call could loop infinitely. (Indeed I observed this in my branch that does not enforce separation.) Thus my claim that dropping this recursion removes the separation assumption. And, in any case, it seems unnecessary.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I agree that the function call is either unnecessary (with the separation hypothesis) or wrong. I am just not sure I would describe the current call as a "use of the separation assumption".

Copy link
Member

@Octachron Octachron left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The current state looks ready to merge, and I think it is good to simplify preconditions in this part of the typechecker.

@gasche
Copy link
Member

gasche commented Jul 23, 2024

During our discussion today I had the impression that it is not obvious to everyone that subst is taken to be the relation we are checking on variables, instead of "additional equalities" that are true in addition to reflexivity. It might be useful to have a comment that makes this explicit -- this is implicitly asserted in the comment about eqtype below, but in a way that assumes more knowledge of the rest of the file.

@goldfirere goldfirere merged commit 0ebe56b into ocaml:trunk Jul 23, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants