Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix wrong calls to Env.normalize_path on non-module paths #2131

Merged
merged 6 commits into from Nov 22, 2018

Conversation

Projects
None yet
3 participants
@alainfrisch
Copy link
Contributor

commented Nov 2, 2018

Env.normalize_path is supposed to be applied on module paths; in particular, it may call find_module, which of course does not make sense on other kinds of paths, and is not completely cheap. This PR renames this function to normalize_module_path to make its intended use more explicit, adds normalize_type_path (which takes care of the encoding in paths introduced by inline records), and fixes several wrong calls to normalize_path to use either normalize_type_path or normalize_path_prefix (all other kinds of paths). There is also a new fast shortcut for persistent modules (i.e. compilation units) which cannot be aliases. (@lpw25 already confirmed to me that some of these calls were indeed wrong).

This results in a noticeable speedup of the typechecker (about 10% when compiling typing/typecore.ml with ocamlc).

I suspect that normalize_path still accounts for a non-negligible fraction of type-checking time and would be worth looking at more closely.

(Another source of improvement would be to avoid always calling twice expand_head_unif in Ctype.unify2 on each type. This is by tracking initially this one that I realized that normalize_path and find_module were called on "wrong" paths.)

@garrigue

This comment has been minimized.

Copy link
Contributor

commented Nov 5, 2018

It seems that this change requires a bootstrap. This is a bit surprising at first, but this may just be a problem of loss of sharing before the change (supposing you end up copying less now).

This looks like a good idea. Indeed, calling normalize_module_path may attempt more expansions, and this has a cost. I'm surprised the cost was so high, but then the original change (introduction of module aliases) included performance improvements, so that this may have gone unnoticed (I believed I did some benchmarks, but do not remember exactly on what).

As for the repeated calls to expand_head_unif, there is an explicit comment by Jerome Vouillon, so they seem really needed. The question may rather be whether there should be a caching mechanism to avoid trying to expand repeatedly. I.e., a new kind of abbreviation saying that there is no expansion (expansions are already cached, but their absence is not). But then, is the extra complexity worth it.

@alainfrisch

This comment has been minimized.

Copy link
Contributor Author

commented Nov 5, 2018

There was at least one real bug; when passing the following to the compiler:

module X = struct
  module B = List
  exception B of {x:int}
end
let _ = X.B {x=2}

It used to fail with an assertion failure (LocalExt case in find_type_full). I will add a test for that.

;;
[%%expect{|
+module X : sig module B = List exception B of { x : int; } end
+- : exn = X.B {x = 2}

This comment has been minimized.

Copy link
@alainfrisch

alainfrisch Nov 5, 2018

Author Contributor

On trunk, the output is:

module X : sig module B = List exception B of { x : int; } end
Uncaught exception: File "typing/env.ml", line 942, characters 26-32: Assertion failed
@alainfrisch

This comment has been minimized.

Copy link
Contributor Author

commented Nov 5, 2018

As for the repeated calls to expand_head_unif, there is an explicit comment by Jerome Vouillon, so they seem really needed.

Honestly, unify2 is a bit mysterious to me, but do you agree that if the first pair:

  ignore (expand_head_unif !env t1);
  ignore (expand_head_unif !env t2);

does nothing, then the second pair is also necessarily a no-op:

  let t1' = expand_head_unif !env t1 in
  let t2' = expand_head_unif !env t2 in

?
Even without a full caching, one could detect this situation.

@lpw25

This comment has been minimized.

Copy link
Contributor

commented Nov 5, 2018

Note that checking whether they've changed using univ_eq before calling the second pair is what the code used to do back when it was a fixed point rather than two iterations. So it should be safe.

@alainfrisch

This comment has been minimized.

Copy link
Contributor Author

commented Nov 5, 2018

So, anybody willing to review this? @lpw25 or @garrigue?

@alainfrisch alainfrisch added the bug label Nov 8, 2018

@alainfrisch alainfrisch force-pushed the alainfrisch:fix_normalize_path branch from 05f5121 to 9753b8d Nov 8, 2018

@alainfrisch

This comment has been minimized.

Copy link
Contributor Author

commented Nov 15, 2018

This fixes both an actual bug, and a performance bug as well, so I think it's worth looking at for 4.08.

@alainfrisch alainfrisch added this to the 4.08 milestone Nov 15, 2018

@garrigue
Copy link
Contributor

left a comment

This looks fine to me.
I added just a request for comment, and a possible change of function name, to make the code more readable.

else normalize_module_path0 lax env (Papply(p1', p2'))
| Pident _ as path ->
normalize_module_path0 lax env path

This comment has been minimized.

Copy link
@garrigue

garrigue Nov 16, 2018

Contributor

What about calling this function expand_module_path rather normalize_module_path0 ?

| _ -> false

let normalize_type_path oloc env path =
(* Inlined version of Path.constructor_typath *)

This comment has been minimized.

Copy link
@garrigue

garrigue Nov 16, 2018

Contributor

Actually, you're inlining is_constructor_typath.
It would be good to add a comment about the structure of those paths: a regular type path is followed by a capitalized constructor name, hence the choice based on the capitalization.

@alainfrisch alainfrisch force-pushed the alainfrisch:fix_normalize_path branch from 9753b8d to 5b18159 Nov 16, 2018

@alainfrisch

This comment has been minimized.

Copy link
Contributor Author

commented Nov 16, 2018

Thanks for the review! Comments taken into account and branch rebased. Waiting for a green light for CI before merging.

(* Inlined version of Path.is_constructor_typath:
constructor type paths (i.e. path pointing to an inline
record argument of a constructpr) are built as a regular
type path followed by a capitalized constructor name. *)

This comment has been minimized.

Copy link
@garrigue

garrigue Nov 16, 2018

Contributor

What about Ext paths?
According to Subst.type_path, it seems that if p is a module path, it should be normalized as such.
Or is there some invariant that ensures that this cannot happen?

@garrigue

This comment has been minimized.

Copy link
Contributor

commented Nov 16, 2018

OK, if I understand correctly, it seems that in the following example, your code would fail to normalize N.E into M.E.
I ignore whether this can create problems. Basically, I don't see how one could create the type path N.E in the first place, since the constructor N.E just refers to M.E.

module M = struct exception E of {x:int; y:int} end
module N = M
@alainfrisch

This comment has been minimized.

Copy link
Contributor Author

commented Nov 16, 2018

To be on the safe side, I added explicit support for paths of inline records under extension constructor.

@garrigue

This comment has been minimized.

Copy link
Contributor

commented Nov 17, 2018

Ok. I let you merge it.

@alainfrisch alainfrisch force-pushed the alainfrisch:fix_normalize_path branch from 2eaa454 to 225817c Nov 22, 2018

@alainfrisch alainfrisch merged commit 4c130ca into ocaml:trunk Nov 22, 2018

2 checks passed

continuous-integration/appveyor/pr AppVeyor build succeeded
Details
continuous-integration/travis-ci/pr The Travis CI build passed
Details
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.