Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cover aggregation bug #2867

Open
MarkoMin opened this issue Feb 26, 2024 · 6 comments
Open

cover aggregation bug #2867

MarkoMin opened this issue Feb 26, 2024 · 6 comments

Comments

@MarkoMin
Copy link
Contributor

I'm using both proper and ct on a project and aggregated cover data is weird. Example: I have 95% coverage on a module with proper, but aggregated coverage is smaller than that. When I look into aggregated coverage data, all code that is used is marked green, but there are lines that should be ignored (-specs, fun defs, even comments!) that are marked in red.

I don't know if this is actually tied to rebar3, but I've found nothing mentioning "aggregate" in cover module, so I suspect that this is rebar3 feature.

Debug and diagnostic output gave me nothing, I believe it's algorithm error.

Willing to collaborate on the solution.

@ferd
Copy link
Collaborator

ferd commented Feb 26, 2024

Cover aggregation is done as follows:

Since the aggregation is done implicitly by loading multiple coverage files at once (as supported by cover) and then running the analysis on them, it's likely that any bug in processing is within the OTP library in Erlang, or through potentially issues in the module or file path tracking of individual coverage tracking step.

The actual tracking of which calls are handled or not is handled by the cover module which compiles modules and tracks data in a database, which we can then flush to files. It's possible there are bugs in how we invoke the module, but if the issues are "which lines are instrumented or not", this is likely to lie within the OTP libraries and not rebar3.

@MarkoMin
Copy link
Contributor Author

Turns out it was my fault that I didn't update all cover data after updating a file. More info in OTP bug.

Would it make sense to throw an error when cover data for a module aren't compatible to be aggregated?

@ferd
Copy link
Collaborator

ferd commented Mar 11, 2024

What would make them incompatible? Like changing the source files between runs?

Looking at the other ticket, we might want to aggregate cover data with a "last build" time or some hash and be able to maybe warn when we're like "this coverdata may be stale", but since we don't really track them as build artifacts, it'd need to be pretty specific.

@MarkoMin
Copy link
Contributor Author

What would make them incompatible? Like changing the source files between runs?

Yup

If we can get out of this without introducing new state that is saved between rebar3 runs, that would be awesome.

Maybe we could utilize the fact that module compilation is cached.
I was thinking something like "If coverdata for module X is generated before X is lastly compiled, warn the user. If there are multiple data for module X and at least one is generated before X is lastly compiled, warn the user".

In the former case, worst you can get is outdated (but correct) cover analysis. In the later result it would also make sense to throw an error, because you can get a result with comment lines or empty lines being marked as used/unused - which is incorrect.

@ferd
Copy link
Collaborator

ferd commented Mar 11, 2024

Yeah looking at the touch times would work at least, but will make the checks a tiny bit slower. Probably worth it. OTOH you'll have some issues where some files (eg. .hrl files or parse transforms) may change the behavior of .erl files elsewhere without necessarily updating the .beam files' stamps, so the method won't be perfect. But fixing that sort of requires reusing the whole graph the compiler uses to properly track things, and that would be complex.

The heuristic of "is this file newer?" is probably helpful enough.

@MarkoMin
Copy link
Contributor Author

MarkoMin commented Mar 11, 2024

Yeah looking at the touch times would work at least, but will make the checks a tiny bit slower. Probably worth it. OTOH you'll have some issues where some files (eg. .hrl files or parse transforms) may change the behavior of .erl files elsewhere without necessarily updating the .beam files' stamps, so the method won't be perfect. But fixing that sort of requires reusing the whole graph the compiler uses to properly track things, and that would be complex.

It could change the the behavior of .erl files, but one can't end up with coverdata that targets comments, specs or empty lines (at least I can't imagine so). Behavior can also be changed by modifying tests, so I think we can't (simply) guarantee that coverdata represents current state of the codebase, but we may at least solve the issue of non-code lines being used in coverdata reports.

Resulting guarantee would be: "each code line is either used or unused, each non-code line is ignored"

EDIT: prefix guarantee with "if there is no warning: "

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants