Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[RFC] Improve CLI report #222

Open
theofidry opened this issue Mar 14, 2018 · 9 comments
Open

[RFC] Improve CLI report #222

theofidry opened this issue Mar 14, 2018 · 9 comments
Labels

Comments

@theofidry
Copy link
Member

theofidry commented Mar 14, 2018

After playing a bit for a demo I found a few things that were either bugging me or that I think are a bit confusing.

This discussion was on Slack initially but I think putting it here is more appropriate. So right now here's the current CLI report:

Current report
You are running Infection with xdebug enabled.
    ____      ____          __  _
   /  _/___  / __/__  _____/ /_(_)___  ____
   / // __ \/ /_/ _ \/ ___/ __/ / __ \/ __ \
 _/ // / / / __/  __/ /__/ /_/ / /_/ / / / /
/___/_/ /_/_/  \___/\___/\__/_/\____/_/ /_/

Running initial test suite...

PHPUnit version: 7.0.2

    8 [============================] < 1 sec

Generate mutants...

Processing source code files: 2/2
Creating mutated files and processes: 6/6
.: killed, M: escaped, S: uncovered, E: fatal error, T: timed out

M.....                                               (6 / 6)

6 mutations were generated:
       5 mutants were killed
       0 mutants were not covered by tests
       1 covered mutants were not detected
       0 errors were encountered
       0 time outs were encountered

Metrics:
         Mutation Score Indicator (MSI): 83%
         Mutation Code Coverage: 100%
         Covered Code MSI: 83%

Please note that some mutants will inevitably be harmless (i.e. false positives).

So here's the list of my nitpicks from a potential a first-time user:

  • What is used to run the initial test suite? How long does it take? There is nothing to help to debug here
  • I left the command run for some time and looked away. When I came back I had the result but no idea how long it took
  • Where is the log file generated?
  • I only know mutant killed and escaped: what is not covered by tests, not detected, errors and time outs?
  • I know Mutation Score, why do I have 3 of them?

Also although I think it would be cool at some point to be able to say "execute all the tests" or specific tests for non-covered mutants, I think it's too early for this extension point and for now way too expansive. As such I would just suggest to drop the notion of "not covered" and completely ignore non-covered code for now.

But because complaining is easy, here's a suggestion for improvements:

Suggested report
# You are running Infection with xdebug enabled.: removed as displayed in the initial run block
    ____      ____          __  _
   /  _/___  / __/__  _____/ /_(_)___  ____
   / // __ \/ /_/ _ \/ ___/ __/ / __ \/ __ \
 _/ // / / / __/  __/ /__/ /_/ / /_/ / / / /
/___/_/ /_/_/  \___/\___/\__/_/\____/_/ /_/

Running initial test suite...
    Executing "/path/to/phpunit -cconfig_file ...other options"

# Show full info; just make sure the progress mode is used 
PHPUnit 7.0.2 by Sebastian Bergmann and contributors.

Runtime:       PHP 7.2.2 with Xdebug 2.6.0
Configuration: /Users/tfidry/Project/mutation-testing-demo/phpunit.xml.dist

..                                                                  2 / 2 (100%)

Time: 36 ms, Memory: 4.00MB
OK (2 tests, 2 assertions)

Generate mutants...

Processing source code files: 2/2
Creating mutated files and processes: 6/6
# Additional blank line here
.: killed, M: escaped, S: uncovered, E: fatal error, T: timed out

M.....                                               (6 / 6)

// Memory usage: 12.19MB (peak: 12.31MB), total time: 1.15s  # Log this bit of info

Generating the report "/path/to/infection-log.txt".

6 mutations were generated:
       5 mutants were killed: #renamed + group the killed mutants
             - 5 tests failures
             - 0 PHP errors
             - 0 time outs
       1 mutants escaped #renamed

Mutation Score Indicator (MSI): 83% # Simplify the metric provided

Please note that some mutants will inevitably be harmless (i.e. false positives).

WDYT?

@BackEndTea
Copy link
Member

As mentioned in slack, i'm a fan of making --only-covered the default, and add a --non-covered option.

And if we are generating an extensive CLI report, i'd like to use some kind of verbosity flag

@theofidry
Copy link
Member Author

What would you use the --non-covered option for though? For now we don't do anything about non-covered mutants so we actually waste time parsing files we are not going to run the tests for

@BackEndTea
Copy link
Member

Completeness is one thing.

It also helps to see what mutations would be created on not yet tested code.

@maks-rafalko
Copy link
Member

maks-rafalko commented Mar 15, 2018

First of all, thanks as always for your ideas guys 👍

I would like to move each point to a separate issue because it's too hard to follow the discussion, too many things are being discussed. Here are my 2 cents:

What is used to run the initial test suite? How long does it take? There is nothing to help to debug here

Could you please clarify what does user want to debug here? I agree that currently it's not clear how long does it take to run initial tests suite with this progress bar

53 [------------------------->--] 4 secs.

But I'm not sure we need to display the whole PHPUnit's output as it can be too long (imagine 2000 tests, or imagine --tap format for PHPSpec that displays each test case on a separate line)

I would propose here something like progress bar with percentage:

55/293 [▓▓▓▓▓░░░░░░░░░░░░░░░░░░░░░░░] 18%

Again, we can use Verbosity level to display percentage by default, or full output with more verbose level.


I left the command run for some time and looked away. When I came back I had the result but no idea how long it took

This is what I've been dreaming about last days ;) Just one minor suggestion, instead of

Memory usage: 12.19MB (peak: 12.31MB), total time: 1.15s

display

Total time: 1.15s, memory usage: 12.19MB (peak: 12.31MB).

as I think time - is what bothers people a lot with MT, not the memory usage.


Where is the log file generated?

Nice one. One addition from me here is to display all registered loggers, not just text one. For example:

Generating reports:
    - text: "/path/to/infection-log.txt".
    - summary: "/path/to/summary.txt".
    ...

Last two points (about killed/escaped/timeouts and different kinds of MSI) are much more complicated.

For now we don't do anything about non-covered mutants so we actually waste time parsing files we are not going to run the tests for

Yes, we are wasting time parsing them, but I can't say this is useless metric. Let me past a quote from our docs:

[...Code Coverage is 65% ...] MSI is 47%. This means that 47% of all generated mutations were detected (i.e. kills, timeouts or fatal errors). Given the code coverage of 65%, there is a 18% difference so Code Coverage was a terrible quality measurement in this example.

I like this example, because it shows that the test coverage metric (65%) of a particular project does not cover all branches/edge cases. Mutation testing shows here that there is a room for improvements.

On the other side, when you have 5% of Test Coverage, you can have 100% of Covered Code MSI. Personally, I like this mutation score much more than just MSI, because it doesn't force you to write tests for everything, but only force you to write maybe few, but good tests.

What I don't quite understand is Mutation Code Coverage.

@theofidry
Copy link
Member Author

What is used to run the initial test suite? How long does it take? There is nothing to help to debug here

Could you please clarify what does user want to debug here? I agree that currently it's not clear how long does it take to run initial tests suite with this progress bar

I think we can skip the why for now as it should be answered in the documentation rather than here I think.

What I'm missing from the initial test run here is:

  • What is being executed exactly? Is my config alright? So I think showing the exact comment would be important here
  • Regarding the progress of the tests, I like the progress bar rather than the . progress, as you say it's less verbose. If we can manage that or alternatively allow the full output to be able to see the progress but remove it once the tests are good. Anyway I think we can discuss this specific point separately.
  • How long it took: mostly to know which time was taken out of the total time. Turns out that in some cases this is the slowest part for me :)

I like this example, because it shows that the test coverage metric (65%) of a particular project does not cover all branches/edge cases. Mutation testing shows here that there is a room for improvements.

I actually totally agree with this. However I think there is some confusion here. So imagine you have the following files with their respective coverage:

src/
  Foo.php - 100% 
  Bar.php - 100%
  Baz.php - 0%

So the total code coverage is 66%. Right now (or at least from my understanding) Infection takes those 3 files as an input. For the sake of the example let's say 2 mutants are generated for the first two and 1 for the last, the result would be:

5 mutants generated
  2 killed
  2 escaped (as 1 mutant in Foo and another in Bar did not cause the tests to fail)
  1 not covered (there is not test for Baz so they are not covered)

MSI = (2 / 5) * 100 = 40%
MCC = ((5 - 1) / 5) * 100 = 80%
CCMSI = (2 / (5 - 1)) * 100 = 50%

My point here is that mutating a file that is not covered is useless (at least right now). The reasoning being that the code coverage tells you what is not tested. If it is not tested, it cannot be killed, so not point measuring. So in the case above you would instead have:

5 mutants generated
  2 killed
  2 escaped (as 1 mutant in Foo and another in Bar did not cause the tests to fail)

CCMSI = (2 / 4) * 100 = 50%

And I would actually argue that I didn't encounter this notion of "not covered" mutant. So to me the MSI is nbr of mutant killed / nbr of mutant generated which is actually the CCMSI above (except the above is in percentage).

In other words, I think we agree on the purpose: MSI is more accurate than code coverage. In the example above it shows that even though the code coverage is of 66%, actually only 50% of it is actually tested (and maybe that's what the CCMSI should be instead: CC * MSI / 100 so here 33%).

If you disagree I think we should go ahead and discuss about this in another issue :)


So unless anyone has a different opinion, I think there's a couple of things we can already implement. The remaining points to discuss are:

  • The initial run
  • The metrics summary

which I think would be best discussed in dedicated issue.

@sanmai
Copy link
Member

sanmai commented Mar 19, 2018

Mutating a file that is not covered isn't as useless as it seems. Think of it as a measure of potential defects. Every "not covered" mutant is a potential problem lurking out where, a problem for which one should write a test.

(So far all not covered mutant I found were harmless, but that's different question.)

@theofidry
Copy link
Member Author

theofidry commented Mar 19, 2018

@sanmai I don't really see the point still tbh. You don't need mutation testing for that: path code coverage is enough and besides being cheaper (it's way faster to get), it is mandatory for Infection right now

@sanmai
Copy link
Member

sanmai commented Mar 19, 2018

Well, with what I agree with you is that actually mutating a non-covered file is really pointless. It's uncovered so by definition no tests will show a thing. Only counting possible mutations should be enough. Am I still not any closer to your thinking?

@theofidry
Copy link
Member Author

theofidry commented Mar 19, 2018

Only counting possible mutations should be enough. Am I still not any closer to your thinking?

Yes, except I'm not sure I really see the benefit of that :)

It does have a cool factor as you can see that there would be X more mutants generated if you covered those files, but let's keep in mind that this is quite expansive: we are parsing each of those files with PHP-Parser. So I don't find that the performance tradeoff we are doing here is worth the little (IMO) value this information provides

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

4 participants