Skip to content

Reports with locations in module types are inconsistent #50

@fantazio

Description

@fantazio

Context

As exposed by #49, if multiple modules include the same module type in their signature, then values from that module type may be reported multiple times. Additionally, the reported values do not expose the module types' names.
The tests introduced by #49 are sufficiently minimal to effectively reproduce the bug.

Furthermore some values exported by module types are not reported (found in the old tests). I could trace a reason for the absence of reports back to fa0f60f.
E.g.

  1. In examples/using_make/dir/mod.mli, the module type G exposes 2 values (is_even and is_odd). They are used in examples/using_make/dir/mod.ml but not externally. A similar case can be found with T.five in examples/using_make/exported.mli.
  2. In examples/using_make/dir/sub_mod.mli the module type T exposes nothing which is not used at all (not even in Sub_mod). T is used in different places (as module type of module I and of the I parameter of functor F). The values nothing defined in modules of type T are also exported, never used and remain unreported.

Looking back at #49, I chose not to expect Bsa_api.PARAM.used_by_functor_app (used once internally at examples/using_dune/lib/values/builder_sig_api/bsa.ml:6) in the reports of the test scenarios threshold-1 and threshold-3-0.5. I made this choice because the added tests were motivated by changes in reports observed on Frama-C when using the dead_code_analyzer in OCaml 5.3, and I have not observed that values only used by functor application were reported. This is arbitrary, with the intent to be consistent with the current behavior.

Consequently, there is an ambiguity in the semantics of unused exported values reports when module types are involved.

Suggested Solutions

The easy way

Do not report any value exported via module type. This is supposed to be the current semantic (as far as I understand it). This means that the expected reports in #49 must be updated to not expect reported values from Bsa_api and Imt_modtype. Currently, we observe reports and duplicates in #49 because, although the location is the same, the values are "re-exported" by the modules including the module types in their signature. The "re-export" is a bug related to the selection of the sourcepath for .cmi (see 6294e18).

It has the benefit of removing the duplicates observed in #49. Reducing noise is important (especially on large projects). It also defines a very simple report semantic.
It also means that the analyzer will have known "false negatives" (not real FN in the sense that this is the expected behavior) and may produce few results on projects that heavily use module types.

The clean way

Report values exported via module types and consider all the linked definitions as one. A value declared in a module type would be reported as unused if either the module type is unused or if all the "re-exports" of the value are unused.
The expected results of examples/using_make should be updated, and those for Bsa_api.PARAM.used_by_functor_app in #49 also. Note that for the latter, Bsa exports the module Parameter : Bsa_api.PARAM, so the use of Parameter.used_by_functor_app is a use of Bsa_api.PARAM.used_by_functor_app, and Parameter.used_by_functor_app cannot be reported because it would not be tracked individually.

It has the benefit of having less FN for projects heavily using module types than the easy way while removing the duplicates observed in #49. It also defines a very simple report semantic.
The main challenge, I think, will be with functor applications. There may be some corner cases hiding in the shadows.

The exhaustive way

Report values exported via module types and those "re-exported" by individual modules. A value declared in a module type would be reported as unused if either the module type is unused or if all the "re-exports" of the value are unused. A re-export is reported as unused if it is unused and other "re-exports" are used. For individual modules, the reports would reference the locations where the module types are used (which is where the "re-exports" happen).
This is an improvement on the clean way where e.g. Bsa_use_less.sometimes_used would be reported as unused (it is declared by Bsa_api.S and only the "re-exports" in Bsa and Bsa_builder.Make are used) and its location would be examples/using_dune/lib/values/builder_sig_api/bsa_use_less.mli:1

With this solution, a user would know all the exported values that are unused and be able to selectively reduce signatures where desired.
However, defining new signatures instead of using available module types might not be a preferred way to clean the codebase, and all the "re-exports" reported will be considered as noise. This also looks like a lot more work than the clean way.

Metadata

Metadata

Assignees

Type

Projects

No projects

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions