[tooling] Remember linked declarations#13308
Conversation
d308268 to
e517502
Compare
f6e740f to
d96a2b1
Compare
| slowdown described above. | ||
| *) | ||
| type positivity = Negative | Positive | Strictly_positive | ||
| type directionality = { in_eq : bool; positivity : positivity } |
There was a problem hiding this comment.
I wonder what is the current exchange rate between Merlin PRs and Pattern-Matching Bug PRs.
I'll have a look.
gasche
left a comment
There was a problem hiding this comment.
I had a quick look. My comments:
-
I don't know what
Directed | Undirectedin Cmt_format mean, some documentation comment (with an example!) would help. -
Looking at includemod.ml and includemod.mli, the first time we encounter this idea of "positivity" is in the
marktype, which is scarcely documented, so it's a bit cryptic. I learned more from the implementation comment ontype positivityin the .ml file, not necessarily because it is very didactic (it focuses on explaining a subtle trick) but because it has meat and something that serves as an example. I wouldn't mind an explanation of this notion of positivity/direction higher in one of the file, with a small example. -
I don't understand why
markandpositivityare two separate mechanisms to trace directions. Maybe they evolve differently for good (unexplained) reasons? Maybe they are partly redundant with each other? Who knows. -
Strictly_positiveis not documented anywhere that I can see, but my best guess is that it is in fact a combination ofPositiveand a separateat_toplevel : boolfield beingtrue. I suspect that the separate-field presentation would be cleaner. -
(see inline comment about
check_modtype_equivwhich I found supsicious)
I confirm we still use it as part of our internal version of dead_code_analyzer (not synced with the public version), but don't let this block progress! I didn't follow developments around |
|
@gasche , the positivity/negativity information is the usual concept of positive/negative position for subtyping used for parameter variance. In particular, strict positivity is the same in both case sand denotes the absence of double negations. For modules and Merlin, this information is useful because it corresponds to the only case where the subtyping relationship is matching implementation item with their inferred types to interface item (compared to the evenly-nested funtor argument cases module F= functor(_:sig
module G: functor(_:sig
val non_strict_pos: int end
end ) -> sig end
end) -> struct let strict_pos = 0 end) We have added a new commit which better separates the semantics of the marking condition and the positivity tracking. With this refactoring is seemed clearer to simplify the |
gasche
left a comment
There was a problem hiding this comment.
I read the new commit that moves "mark" and "positivity" to be in a common submodule, and I like the new result better. Plus you did write some extra documentation, which is nice, thanks! I think that the current state is okay, as far as understandability by others is concerned.
| | Mark_negative | Mark_neither -> false | ||
| The [posivity] field is used in the [cmt_declaration_dependencies] to | ||
| distinguish between directed and undirected edges, and to avoid recording | ||
| matched declarations twice. |
There was a problem hiding this comment.
I think you mean [pos] rather than [positivity]. It is unclear to me how the [pos] type "distinguishes between directed and undirected edges". (Or what this means in the first place. I suppose the "edges" are the dependencies, but what does it mean for a dependency to be undirected? Maybe this is better explained somewhere else, but then you could refer to that place.)
Edit: in fact maybe you don't need to document cmt_declaration_dependencies here anymore: now that the pos information is necessarily to correctly "mark" the submodules, it is used for strictly more than that.
| are inside an auxiliary check function.) | ||
|
|
||
| The [in_eq] field is [true] when we are checking both directions inside of | ||
| module types which allows optimizing module type equality checks. The module |
There was a problem hiding this comment.
I read this sentence as "inside (module types which allows optimizing module type equality checks)", as if certain module types allow this and others don't. I would rephrase this, for example as follows:
The [in_eq] field is [true] when we are checking one direction of module type equality, rather than subtyping.
| match dir.pos with | ||
| | Negative -> (Cmt_format.Undirected, elt2, elt1) | ||
| | Positive -> (Cmt_format.Undirected, elt1, elt2) | ||
| | Strictly_positive -> (Cmt_format.Directed, elt1, elt2) |
There was a problem hiding this comment.
This function does not strike me as something that clearly should be in this module. (It is not obvious why the notion of direction of module-checking would depend on the notion of dependency edges in the cmt format.) Given that it is only used once that I can see, maybe you could move the logic to the caller, which is still inside includemod.ml but not inside Directionality anymore. Later one maybe we would move the logic further away and change the dependency structures between these modules -- or maybe not.
|
With @gasche agreeing with the refactoring inside |
The list of paired uids is stored into the cmt files.
in cmt_declaration_dependencies Co-authored-by: Florian Angeletti <florian.angeletti@inria.fr>
Without the complementary PR that allows to distinguish between interface and implementation uids the test results are wrong.
fdabb7a to
4b1e092
Compare
[tooling] Remember linked declarations (cherry picked from commit b951207)
[tooling] Remember linked declarations
[tooling] Remember linked declarations
[tooling] Remember linked declarations
[tooling] Remember linked declarations
[tooling] Remember linked declarations
[tooling] Remember linked declarations
[tooling] Remember linked declarations
Resolve conflicts from ocaml/ocaml#13308 [tooling] Remember linked declarations
Resolve conflicts from ocaml/ocaml#13308 [tooling] Remember linked declarations
…ures. The bug was most certainly introduced in ocaml#13308 Illustrates issue ocaml#13955
…ures. The bug was most certainly introduced in ocaml#13308 Illustrates issue ocaml#13955
…ures. The bug was most certainly introduced in ocaml#13308 Illustrates issue ocaml#13955
…ures. The bug was most certainly introduced in ocaml#13308 Illustrates issue ocaml#13955 Co-authored-by: Florian Angeletti <florian.angeletti@inria.fr>
* Merge pull request ocaml#13286 from voodoos/distinct-uids-for-interfaces [tooling] Distinct uids for interfaces * Merge pull request ocaml#13308 from voodoos/link-declarations [tooling] Remember linked declarations * Store declaration dependencies in CMS files * Backport directionality fix from upstream ocaml#13956 * Undo format change
Since the change from `cmt_value_dependencies` to `cmt_declaration_dependencies`, the field holds more than value deps (see ocaml/ocaml#13308). In particular, it now also holds the link between the class def in `.ml` and decl in `.mli`. Consequently, during `prepare` in `DeadCode`, the def and decl locations are added to the `DeadObj.equals` table. They used to be added later on, during the `.cmt` analysis, in `DeadObj.tstr` : when `last_class` is the def and the `short` or `path`'s loc is the decl. Adding values to the equals table achieved through `add_equal` which, as a side effect, updates `last_class` to the second parameter if it was the first one. Because the equality is not added at the same time using the new `cmt_declaration_dependencies`, the side effect on `last_class` is also not happening at the same time as before, leading to `last_class` and the `expr`'s `loc` to be different in `collect_references` when they should be equal. Using the `repr_loc` in `DeadObj.tstr` for `last_class` enforces it to be the loc of the decl again.
Since the change from `cmt_value_dependencies` to `cmt_declaration_dependencies`, the field holds more than value deps (see ocaml/ocaml#13308). In particular, it now also holds the link between the class def in `.ml` and decl in `.mli`. Consequently, during `prepare` in `DeadCode`, the def and decl locations are added to the `DeadObj.equals` table. They used to be added later on, during the `.cmt` analysis, in `DeadObj.tstr` : when `last_class` is the def and the `short` or `path`'s loc is the decl. Adding values to the equals table achieved through `add_equal` which, as a side effect, updates `last_class` to the second parameter if it was the first one. Because the equality is not added at the same time using the new `cmt_declaration_dependencies`, the side effect on `last_class` is also not happening at the same time as before, leading to `last_class` and the `expr`'s `loc` to be different in `collect_references` when they should be equal. Using the `repr_loc` in `DeadObj.tstr` for `last_class` enforces it to be the loc of the decl again.
Since the change from `cmt_value_dependencies` to `cmt_declaration_dependencies`, the field holds more than value deps (see ocaml/ocaml#13308). In particular, it now also holds the link between the class def in `.ml` and decl in `.mli`. Consequently, during `prepare` in `DeadCode`, the def and decl locations are added to the `DeadObj.equals` table. They used to be added later on, during the `.cmt` analysis, in `DeadObj.tstr` : when `last_class` is the def and the `short` or `path`'s loc is the decl. Adding values to the equals table achieved through `add_equal` which, as a side effect, updates `last_class` to the second parameter if it was the first one. Because the equality is not added at the same time using the new `cmt_declaration_dependencies`, the side effect on `last_class` is also not happening at the same time as before, leading to `last_class` and the `expr`'s `loc` to be different in `collect_references` when they should be equal. Using the `repr_loc` in `DeadObj.tstr` for `last_class` enforces it to be the loc of the decl again.
This PR generalizes a mechanism that is used to store "value dependencies" in
cmtfiles introduced in 10abdce. It was meant to "track in external tools value declarations between implementations and interfaces".There are two differences in these changes:
cmt_uid_to_decltable (this might require loading a.cmtifile when the uid is from an interface).This information will enable external tools to navigate across the declaration graphs of any value, module, type, etc.
Additionally this information can be used to build equivalence classes of declarations. Enabling external tools to perform powerful refactoring by retrieving every usage of every definition whose uids are in the same equivalence class.
This PR compares uids to prevent noisy self-dependencies to be recorded (similarly to what
record_value_dependencyis doing). Since right now uids are not unique across a compilation unit this means that some links between an implementation uid and an interface uid with the same id are missing. Commit 7f5111b shows the expected test results with #13286 merged.@alainfrisch I think you were the one to introduce
value_dependencieswhich I remove in ec2cc6c. Do you now if that feature is actively used today ? I made a search on Sherlocode that return very few results. One of them is dead_code_analyser which is only available for OCaml < 4.05. The other looks active and compatible with 5.2: reanalyze. It might be safer to keep both for a while to give time for users to transition ?cc @lpw25