
Add coverage analysis to offer improved performance #388

Closed
rouke-broersma opened this issue Mar 5, 2019 · 28 comments
Labels
🚀 Feature request New feature or request

Comments

@rouke-broersma
Member

This issue follows from #183

1. Generate mutant coverage (i.e. which mutants are actually executed by any test). This could provide useful metrics on its own.
2. Offer to skip test runs for non-covered mutants (which are then assumed to be survivors).
==> Improves performance for projects with low test coverage.
3. Identify tests that do not cover any mutant, and offer to skip them on each run.
==> Could improve performance for projects with many tests.
4. Optimise which tests are executed according to mutant coverage (i.e. run only the tests that cover the 'current' mutant).
==> Should dramatically improve performance.

Step 1 requires being able to access coverage results from Stryker.
Step 2 looks simple.
Step 3 requires control over which tests are run.
Step 4 requires analysis of the coverage data to build a strategy.
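The selection logic behind these steps can be sketched in a few lines. This is a minimal illustration in Python, not Stryker.NET's actual C# implementation; `plan_test_runs` and its inputs are hypothetical names standing in for the mutant-to-tests coverage data:

```python
# Illustrative sketch only (Python, not Stryker.NET's actual C# code).
# 'coverage' maps each mutant id to the set of tests that execute it.

def plan_test_runs(coverage, all_tests):
    """Apply steps 2-4: skip uncovered mutants, drop useless tests,
    and run only the covering tests for each remaining mutant."""
    # Step 2: mutants covered by no test are assumed survivors and skipped.
    uncovered = {m for m, tests in coverage.items() if not tests}
    # Step 3: tests that cover no mutant never need to run.
    useful_tests = set().union(*coverage.values()) if coverage else set()
    skippable_tests = all_tests - useful_tests
    # Step 4: for each covered mutant, run only its covering tests.
    runs = {m: tests for m, tests in coverage.items() if tests}
    return uncovered, skippable_tests, runs

coverage = {1: {"t1", "t2"}, 2: set(), 3: {"t2"}}
uncovered, skippable, runs = plan_test_runs(coverage, {"t1", "t2", "t3"})
# uncovered == {2}, skippable == {"t3"}, runs == {1: {"t1", "t2"}, 3: {"t2"}}
```

The hard part, as the thread below shows, is not this selection logic but obtaining the coverage map reliably from the test process in the first place.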

@rouke-broersma rouke-broersma added 🚀 Feature request New feature or request to do labels Mar 5, 2019
@rouke-broersma
Member Author

From @dupdob:

Some update: I have named pipes running smoothly and stably, after a few fights.
I have reached maturity level 2: (my) Stryker now assumes non-covered mutants are survivors and hence does not test them.

Code available here:
https://github.com/dupdob/stryker-net/tree/mutant_coverage

@dupdob
Member

dupdob commented Mar 5, 2019

I have a branch ready, just waiting to finish with my current PR.
(The branch currently addresses use cases 1 and 2.)

@rouke-broersma
Member Author

@dupdob Yes I've seen it. The named pipe seems like a good way to implement this. Step 3 can be achieved with the vstest integration.

To achieve step 4 I was thinking we could create a named pipe from the mutated assembly to a vstest data collector (out of proc as well), and then a named pipe from the data collector to Stryker. The data collector can execute for every single test being run, and in the middle we can receive from the mutated assembly which mutants have been hit for that test run. We can then send that information, including which test was run, to Stryker through the other named pipe, and use vstest to run only those unit tests that are relevant to the mutant.
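The message flow of this double-pipe idea can be pictured roughly as follows. This is a hypothetical Python illustration (the real code would be C# inside a vstest data collector), with in-process queues standing in for the two named pipes:

```python
# Hypothetical illustration of the double-pipe message flow (Python;
# not the real vstest integration). Queues stand in for the two named pipes.
from queue import Queue

class DataCollectorRelay:
    def __init__(self):
        self.from_assembly = Queue()  # pipe 1: mutant ids hit while a test runs
        self.to_stryker = Queue()     # pipe 2: (test name, hit mutants) records

    def run_test(self, test_name, execute):
        execute()  # the mutated assembly reports hit mutant ids on pipe 1
        hits = set()
        while not self.from_assembly.empty():
            hits.add(self.from_assembly.get())
        # forward the per-test coverage record to Stryker on pipe 2
        self.to_stryker.put((test_name, sorted(hits)))

relay = DataCollectorRelay()
relay.run_test("MyTest", lambda: [relay.from_assembly.put(m) for m in (3, 1, 3)])
print(relay.to_stryker.get())  # ('MyTest', [1, 3])
```

The key design point is the middle position of the collector: because it sees test boundaries, it can attribute the stream of mutant hits to individual tests, which a single pipe straight to Stryker could not do.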

@dupdob
Member

dupdob commented Mar 15, 2019

@Mobrockers
Thanks for the proposal. I have made an MVP using pipe communication between the assembly and Stryker, the objective being to reach a proof of concept before considering improvement strategies.
Here is what I have achieved and learned so far. Spoiler alert: it is a bit disappointing (at least for the netcore environment; it may be better with classic .NET).

  1. Overall, the VsTest architecture is complex, brittle and prone to failure. I often end up with vstest.console.exe processes eating a core in some endless loop, logging nothing.
  2. I have yet to achieve a complete run with NFluent due to the above issues. At some point, Stryker runs out of available vstest runners.
  3. Performance is abysmal. To get coverage info per test, I launch each test in isolation. It is really slow.
  4. ~1% of runs failed to provide coverage data. This is probably due to the fact that I use the appdomain tear-down event to send the coverage data through the named pipe. As of now, this is not the most urgent issue; I assume those tests cover ALL mutants and have to run each time.

The biggest disappointment for me was the fact that vstest provided little more than a facade over dotnet test...

I have various leads and ideas on how to fix those issues, but the fact is it turned out more difficult than anticipated...
Coming back to the list:

  1. Stability: no idea, but I assume my code, and maybe the vstest Stryker runner itself, still requires some clean-up and bug fixing. Once the code has been hardened and refactored, I hope those issues will disappear.
  2. Performance: there is an interesting option (AreTestCaseLevelEventsRequired) that may allow me to capture coverage in a single run (though I have yet to read the doc, if any, about it). There is also the option KeepAlive, which looks promising, but this one is documented as not implemented/for future use.
  3. Missing coverage: I hope I will be able to reduce the failure rate, plus I can still devise alternate communication strategies, including real-time async, which would at least provide partial coverage in case of failure at tear-down.

@dupdob
Member

dupdob commented Mar 15, 2019

OK, I went through the code; AreTestCaseLevelEventsRequired does not serve the purpose I hoped for (sequential notification).

@rouke-broersma
Member Author

The biggest disappointment for me was the fact that vstest provided little more than a facade over dotnet test...

How do you mean? Dotnet test internally uses vstest, not the other way around

@rouke-broersma
Member Author

  1. Performance is abysmal. To get coverage info per test, I launch each test in isolation. It is really slow.
  2. ~1% of runs failed to provide coverage data. This is probably due to the fact that I use the appdomain tear-down event to send the coverage data through the named pipe. As of now, this is not the most urgent issue; I assume those tests cover ALL mutants and have to run each time.
  3. Performance: there is an interesting option (AreTestCaseLevelEventsRequired) that may allow me to capture coverage in a single run (though I have yet to read the doc, if any, about it). There is also the option KeepAlive, which looks promising, but this one is documented as not implemented/for future use.

I hope using a data collector can help a lot with this specific problem, because of:

https://github.com/Microsoft/vstest-docs/blob/master/docs/extensions/datacollector.md#datacollectionevents

Example:
https://github.com/Microsoft/vstest/blob/master/test/TestAssets/SimpleDataCollector/Class1.cs

I understand that it's only an MVP of course, so I get that you're not immediately going for a double named pipe with all its added complexity.

@dupdob
Member

dupdob commented Mar 15, 2019

Well, I see vstest.console.exe launching dotnet.exe. I guess dotnet integrates some VsTest glue and logic, but from an executable perspective, that's how it works
[screenshot of the process tree]

Command line is:

"C:\Program Files\dotnet\dotnet.exe" exec --runtimeconfig "C:\Users\dupdo\source\repos\NFluent\tests\NFluent.NetStandard.20.Tests\bin\Debug\netcoreapp2.1\NFluent.NetStandard.20.Tests.runtimeconfig.json" --depsfile "C:\Users\dupdo\source\repos\NFluent\tests\NFluent.NetStandard.20.Tests\bin\Debug\netcoreapp2.1\NFluent.NetStandard.20.Tests.deps.json" "C:\Users\dupdo\.nuget\packages\microsoft.testplatform.testhost/15.9.0\lib/netstandard1.5/testhost.dll" --port 55100 --endpoint 127.0.0.1:055100 --role client --parentprocessid 12420 --diag "C:\Users\dupdo\source\repos\NFluent\tests\NFluent.NetStandard.20.Tests\StrykerOutput\2019-03-15.14-48-13\vstest\vstest-log.host.19-03-15_15-29-25_91599_16.txt" --tracelevel 3 --telemetryoptedin false

Ok, I get it now: it launches a dotnet host to execute the test.
Thanks for challenging my comment.

@rouke-broersma
Member Author

You are right that vstest is internally a big can of worms :) It also spawns new processes for dotnet.exe, testhost.exe, and sometimes datacollector.exe and other collectors, depending on the configuration 💃

@dupdob
Member

dupdob commented Mar 16, 2019

Quick summary regarding various steps:

  1. Is pretty solid so far. It still fails sometimes, and I fixed a bunch of edge cases/timing issues in the pipe mechanism. There may still be some, but I also assume the AppDomain unloading event is unreliable.
  2. Is working well, albeit providing only a marginal gain from a performance perspective. But it also provides clear insight into non-covered code.
  3. This is where I am pretty stuck. Launching tests one by one is so slow: ~1 hour for NFluent's 1660 tests. I had problems with coverage sometimes failing, so I implemented a retry mechanism; I should get results now.
  4. Logic has been implemented: I have mutant-to-tests mapping information, and I am able to run selected tests only. I need to stabilise step 3 before working on step 4.

@Mobrockers: I missed your comment regarding DataCollector; I will dig into this. Indeed, it looks like what I need to address the issues I face with step 2!

@dupdob
Member

dupdob commented Mar 16, 2019

I just had an idea I find interesting: I am fascinated by the mutant->tests mapping information, and I had a hunch there was value in it. I may have just found it.
We could use coverage information in reports to provide hints, e.g.:

  • if a mutant is not covered, the suggestion should be around adding a test or refactoring the code;
  • otherwise, the suggestion could be to consider reviewing some of the tests that cover the survivor;
  • and we could even cross-reference information on survivors to identify quick wins, i.e. cases where improving a single test could kill 5-6 mutants.
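The quick-win hint in the last bullet falls out of the survivor coverage map directly. A minimal Python sketch (hypothetical names, not actual Stryker.NET code): group surviving mutants by the tests that already cover them, and flag tests whose improvement could kill several survivors at once:

```python
# Sketch of the 'quick win' hint (illustrative only): group surviving
# mutants by covering test, then report tests covering several survivors.
from collections import defaultdict

def quick_wins(survivor_coverage, threshold=2):
    """Return tests whose improvement might kill several survivors at once."""
    by_test = defaultdict(set)
    for mutant, tests in survivor_coverage.items():
        # uncovered survivors get a different hint: add a test or refactor
        for test in tests:
            by_test[test].add(mutant)
    return {t: ms for t, ms in by_test.items() if len(ms) >= threshold}

survivors = {10: {"t1"}, 11: {"t1"}, 12: {"t1", "t2"}, 13: set()}
print(quick_wins(survivors))  # {'t1': {10, 11, 12}}
```

Here improving test `t1` could potentially kill three surviving mutants, so it would be flagged in the report as the highest-leverage test to review.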

I am not sure I am being clear enough or that this makes sense to you, but I add it as an idea to be considered at a later time.

@rouke-broersma
Member Author

rouke-broersma commented Mar 16, 2019 via email

@dupdob
Member

dupdob commented Mar 25, 2019

Monday update: I had the opportunity to explore, and I made a cool discovery: I can use the VsTest data collection architecture to forward coverage results to Stryker, which should help avoid the 'double pipe' strategy proposed by @Mobrockers.
Still far from a finished feature, though.

@dupdob
Member

dupdob commented Mar 26, 2019

Quick update: the vstest runner now uses environment variables to transfer coverage data between the test assembly and the host. This means we no longer have to run tests in isolation to capture coverage.
Getting closer to a working version!
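The environment-variable handoff can be pictured like this. A rough Python sketch: the variable name and the comma-separated format are invented here for illustration, and the real implementation is C# inside the vstest runner:

```python
# Rough sketch of the environment-variable handoff. The variable name
# 'STRYKER_COVERAGE' and the comma-separated format are hypothetical.
import os

def record_hit(mutant_id):
    """Called from the instrumented assembly whenever a mutant is executed."""
    seen = os.environ.get("STRYKER_COVERAGE", "")
    ids = set(filter(None, seen.split(",")))
    ids.add(str(mutant_id))
    os.environ["STRYKER_COVERAGE"] = ",".join(sorted(ids))

record_hit(3)
record_hit(1)
record_hit(3)  # duplicates collapse into one entry
print(os.environ["STRYKER_COVERAGE"])  # 1,3
```

Note this only works when the test assembly and the reader share a process (as with an in-process collector), and environment blocks are small; those limits line up with the performance and space concerns raised later in this thread, which led back to pipes.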

@richardwerkman
Member

Cool! Sounds awesome. Keep up the good work!

@rouke-broersma
Member Author

Sounds good!

@dupdob
Member

dupdob commented Mar 26, 2019

I made a run with NFluent: the time is 55 minutes, versus 135 minutes without the optimisation.
I just pushed the associated code. I would welcome feedback from runs on other projects.

@dupdob
Member

dupdob commented Apr 1, 2019

Monday morning status update: still working on this. As environment variables do not seem like a viable approach to me (performance and available-space concerns), I have migrated back to pipes, but it is a constant struggle to make them work properly. Tests deadlock without identifiable cause, debugging is difficult, and I can't find a way to log from the InProcDataCollector.

@rouke-broersma
Member Author

Can you not log to file?

@dupdob
Member

dupdob commented Apr 1, 2019 via email

@rouke-broersma
Member Author

Sorry for the confusion, I meant in relation to this part

Tests deadlock without identifiable cause, debugging is difficult and I can't find a way to log from the InProcDataCollector

You could debug-log to a file from the data collector, no?

@dupdob
Member

dupdob commented Apr 1, 2019

Good news: I finally spotted the deadlock issue, a bug in my pipe-reading class when receiving an empty message. I can keep on improving my code.
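For illustration, a length-prefixed framing scheme shows how an empty message can trip up a naive pipe reader: with a zero-length payload, the reader must return an empty message immediately rather than wait for data that will never arrive. A Python sketch over an in-memory stream (the actual code is C# over a named pipe; only the framing idea is shown):

```python
# Illustration of length-prefixed framing where an empty message is legal
# (Python over an in-memory stream, not the actual C# pipe code).
import io
import struct

def write_message(stream, payload):
    stream.write(struct.pack(">I", len(payload)))  # 4-byte big-endian length
    stream.write(payload)                          # zero bytes when empty

def read_message(stream):
    (length,) = struct.unpack(">I", stream.read(4))
    # A zero length must yield b"" immediately; blocking here is the deadlock.
    return stream.read(length) if length else b""

buf = io.BytesIO()
write_message(buf, b"")          # the empty-message case that caused the hang
write_message(buf, b"mutant:3")
buf.seek(0)
print(read_message(buf))  # b''
print(read_message(buf))  # b'mutant:3'
```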

@rouke-broersma
Member Author

rouke-broersma commented Apr 1, 2019 via email

@dupdob
Member

dupdob commented Apr 2, 2019

The deadlock is gone, but I still have what look like race conditions when capturing coverage.
I think it may relate to this issue with NUnit.
If I can confirm it is related to NUnit, I will probably add an option to work around it until it is fixed on the NUnitAdapter side.

@rouke-broersma
Member Author

rouke-broersma commented Apr 2, 2019 via email

@dupdob
Member

dupdob commented Apr 3, 2019

I used it. (Kinda) interestingly, it triggered new deadlock issues. I guess I still have bugs. No worries, I love brainteasers and I am happy to find bugs BEFORE opening a PR.

Note that sequential execution slows test execution.

@dupdob
Member

dupdob commented Jun 6, 2019

see #506

@rouke-broersma
Member Author

Implemented by #506
