Support tests written on interpreted languages #778

ligurio · 2020-11-18T08:43:26Z

In most C/C++ projects only a small part of the tests is written in C/C++. The rest are implemented in interpreted languages such as Python or Perl, because they are cheaper to implement and maintain. As far as I understand, Mull supports only tests written in C/C++. This means it cannot kill all the mutants it produces, even when tests exist that could kill them, because Mull doesn't run those tests.

It would be good to add the ability to execute all the tests that exist in a project. Otherwise, mutation testing with Mull is incomplete.

jnohlgard · 2020-11-18T09:20:31Z

I think this is a good suggestion. To implement this I assume that we would have to be able to create a mutated standalone binary that can be run in the same way as the original binary. Not sure how to take this from the current binary in memory, but one potential solution would involve calling a normal linker to be able to create a freestanding executable. Given the wide range of build systems and how they are invoked I think the easiest way might even be to provide hooks where Mull calls an external script/program for linking and another external program for executing the tests. That would allow for full flexibility regarding how test binaries should be built and how they are invoked.

Some open questions that I see around this approach:

Does it make sense to implement something like this in mull-cxx, or is this out of scope for this tool?
Are there any other mutation testing products (for C++ code) which already have this feature?
Running multiple mutations in parallel may pose an obstacle when built this way. Who should be responsible for ensuring that each parallel run has its own working area, Mull or the external scripts?
Should each mutation have its own directory, or is it sufficient to have separate working directories per worker thread so that parallel runs are not stepping on each other's files?
Who cleans up object files and other temporaries after a mutation run?

AlexDenisov · 2020-12-07T22:30:52Z

I'd love to see Mull supporting this use case. There are, however, obstacles.

Here are some technical details.

As @gebart mentioned, this approach requires Mull to generate a standalone executable. We can get all the object files for a program by compiling the embedded bitcode to machine code (pretty much the same way we do now with the JIT engine). However, we'd also need to get the linker flags from the user so that we can emit a correctly linked executable. An alternative approach: store all the object files in some folder and ask the user to link them manually (e.g. clang *.o -o the-tests).
Mull relies on code coverage info to generate fewer mutations. Currently, Mull gathers coverage by running the tests before generating any mutants. With a standalone executable, we can no longer run the tests via JIT, so we would have to add support for external coverage information (e.g. gcov or lcov). In this case, we would also lose the call-tree feature, which is useful when combined with a specific test framework.
Currently, Mull uses indirect stubs and function pointers to control which mutants are turned on or off. For a standalone executable, we'd need another approach. The one that comes to mind is to generate all the mutants and hide each one behind a conditional, e.g.

if(env("specific-mutant-id")) {
  // mutated code
} else {
  // original behavior
}

This approach actually has some benefits (e.g. mutant generation will be a bit faster and the size of the program will be smaller). But it also has at least one drawback: if a hot path in the program contains many mutations, we may lose many CPU cycles on all these if(env(....)) checks. I don't know how bad this drawback is, but it is certainly something to be aware of.
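A minimal C++ sketch of the conditional guard described above. It assumes a single hypothetical environment variable, MUTANT_ID, that names the one active mutant; the variable name, the ID format and the helper names are made up for illustration and are not Mull's implementation. To soften the hot-path cost, the sketch reads the environment once and caches the value, so each guard is a string comparison rather than a getenv call.

```cpp
#include <cassert>
#include <cstdlib>
#include <string>

// Hypothetical variable name; the thread does not fix one.
constexpr const char *kMutantEnvVar = "MUTANT_ID";

// Reads the environment on the first call; later calls reuse the cached value.
const std::string &activeMutant() {
  static const std::string cached = [] {
    const char *value = std::getenv(kMutantEnvVar);
    return std::string(value ? value : "");
  }();
  return cached;
}

// True only for the mutant whose ID matches the cached environment value.
bool mutantEnabled(const char *id) {
  assert(id != nullptr);
  return activeMutant() == id;
}

// Example function carrying one guarded mutation ('+' replaced by '-').
int add(int a, int b) {
  if (mutantEnabled("add_to_sub:add.cpp:3")) { // hypothetical mutant ID
    return a - b; // mutated code
  }
  return a + b; // original behavior
}
```

With this shape, running the linked test binary with MUTANT_ID unset exercises the original code, and setting it to one ID switches on exactly that mutant for the whole process.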

Inter-process communication. Currently, Mull's driver is a single program that does all the bookkeeping in one address space. In the case of a standalone executable, we'd need to find a way to collect all the information in one place to generate reports and whatnot. While this is not a blocker and not the hardest problem, it still adds certain complexity and overhead on top of everything else.
General performance implications: running mutation testing against integration tests (as opposed to unit tests) may turn out to be impractical or even useless.

Some comments on @gebart's questions

Does it make sense to implement something like this in mull-cxx, or is this out of scope for this tool?

It is possible to implement this as part of mull-cxx, but it is probably out of scope for the reasons mentioned above.

Are there any other mutation testing products (for C++ code) which already have this feature?

There is at least mutate_cpp that can be used to achieve this goal.

Running multiple mutations in parallel may pose an obstacle when built this way. Who should be responsible for ensuring that each parallel run has its own working area, Mull or the external scripts?
Should each mutation have its own directory, or is it sufficient to have separate working directories per worker thread so that parallel runs are not stepping on each other's files?

IMO, these two share the same issue: there might be shared resources that prevent parallelization.

Who cleans up object files and other temporaries after a mutation run?

This is an easy decision, IMO. We can just store the temporaries in a temporary folder and then either let the OS take care of them or ask users to do it.
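As a rough illustration of the working-area questions above, here is a C++17 sketch that gives each mutation run its own directory under the system temporary folder. Unlike the two options just mentioned, it removes the directory itself when the run ends, which is only one alternative to leaving cleanup to the OS or the user. The class name, the "mull-sketch-" prefix and the use of the mutant ID as a directory name are assumptions; the sketch also assumes IDs are unique and safe to use as file names.

```cpp
#include <cassert>
#include <filesystem>
#include <string>
#include <system_error>

namespace fs = std::filesystem;

// Creates a private directory for one mutation run and removes it on destruction.
class ScopedWorkDir {
public:
  explicit ScopedWorkDir(const std::string &mutantId)
      : path_(fs::temp_directory_path() / ("mull-sketch-" + mutantId)) {
    assert(!mutantId.empty());
    fs::create_directories(path_);
  }

  ~ScopedWorkDir() {
    std::error_code ignored; // never throw from a destructor
    fs::remove_all(path_, ignored);
  }

  ScopedWorkDir(const ScopedWorkDir &) = delete;
  ScopedWorkDir &operator=(const ScopedWorkDir &) = delete;

  const fs::path &path() const { return path_; }

private:
  fs::path path_;
};
```

A per-mutant directory sidesteps file clashes between parallel runs, but it does not help with the other shared resources raised above (ports, databases and the like).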

ligurio · 2020-12-10T17:10:08Z

I feel your pain with supporting tests written in interpreted languages!

But let's look at this from a user's point of view. Imagine you have a C/C++ project where someone wants to use mutation testing. In most projects, unit tests cover only a small part of the project, much smaller than "big" tests such as integration or system tests, and these tests are often written in interpreted languages (like Python). So Mull has limited applicability in real projects. It produces results that are not representative. Supporting only a single type of test severely restricts the use of Mull in practice.

I see several ways for those who, like me, want to integrate Mull into real projects:

use Mull for unit tests (because it is a convenient, documented, fast and supported tool) and build a 'self-made' tool for other types of tests. This is possible, and I have already done it in another project (using some Python code and a tool like Frama-C-Mutation that generates mutations in C/C++ source code).
switch to another tool that supports integration tests (like mutate_cpp). But I personally don't like that tool.

Can you implement limited support for integration tests?
Support may look like this:

a dumb mutator that produces mutants for the whole project, or for files in a specified path, one by one
run the project's test suite with a specified command line for each mutant and save the exit code of each run
produce a report
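The last two steps above can be sketched in C++ roughly as follows. This is an illustration under stated assumptions, not a proposal for Mull's actual design: it assumes a POSIX system, assumes each mutant is selected through a hypothetical MUTANT_ID environment variable (the proposal leaves open how a mutant is activated or built), and uses made-up names (MutantRun, runMutants, formatReport). Timeouts for hanging mutants are not handled.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <sstream>
#include <string>
#include <sys/wait.h>
#include <vector>

struct MutantRun {
  std::string id;
  int exitCode; // -1 if the command did not exit normally
  bool killed() const { return exitCode != 0; }
};

// Runs the user-supplied test command once per mutant and records its exit code.
std::vector<MutantRun> runMutants(const std::vector<std::string> &ids,
                                  const std::string &testCommand) {
  assert(!testCommand.empty());
  std::vector<MutantRun> runs;
  for (const std::string &id : ids) {
    setenv("MUTANT_ID", id.c_str(), /*overwrite=*/1); // hypothetical selector
    int status = std::system(testCommand.c_str());
    int exitCode =
        (status != -1 && WIFEXITED(status)) ? WEXITSTATUS(status) : -1;
    runs.push_back({id, exitCode});
  }
  unsetenv("MUTANT_ID");
  return runs;
}

// Produces a plain-text report: one line per mutant plus a summary line.
std::string formatReport(const std::vector<MutantRun> &runs) {
  std::ostringstream out;
  std::size_t killed = 0;
  for (const MutantRun &run : runs) {
    killed += run.killed() ? 1 : 0;
    out << run.id << ": " << (run.killed() ? "killed" : "survived")
        << " (exit " << run.exitCode << ")\n";
  }
  out << killed << "/" << runs.size() << " mutants killed\n";
  return out.str();
}
```

Because the test command is just a shell command line, it can start a Python or Perl test suite as easily as a C/C++ test binary; a non-zero exit code is treated as the mutant being killed.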

AlexDenisov · 2020-12-18T08:49:37Z

Just to clarify a bit: I'd be happy to see this feature. My point was that it is not a trivial task and would require some fundamental changes in how Mull works.
First of all, we'd need to get rid of JIT, which is certainly a good idea since it causes a lot of trouble and imposes many limitations.

It's worth mentioning that removing JIT will also lead to performance degradation. In PR #793, I dropped the part of JIT that controls mutations, and the test run against OpenSSL is now 6 times slower (~100s before vs. ~600s now).

I'd need to find a way to speed it up a bit.

ligurio · 2020-12-18T09:59:43Z

Personally, I don't care about the performance (total execution time) of mutation testing at all. For me it's OK to run Mull overnight or for a couple of days and then analyze the results. Mutation testing is not an everyday kind of testing like e2e tests, unit tests or even performance tests. I'm not sure it is worth adding MT as a pre-commit check, so the difference between 100s and 600s is not a big deal. For those who want to integrate MT into continuous integration, the ticket about running mutation testing only on changed source code is much more important (see #790).

With #790 fixed, integrating MT into a development process would look like this:

Run Mull on the whole code base or only on the critical parts of the project.
Analyze the results.
Fix the tests that missed mutants and suppress uninteresting mutants.
Run Mull on source code changes only; it doesn't matter whether this is done manually or as a CI job on pre-commit or on every push.

AlexDenisov · 2021-01-07T23:43:52Z

Just a follow-up: with the recent changes in place (#798) Mull can handle tests written in interpreted languages.
However, there is no timeline yet on when Mull will be able to handle such a use case out of the box.

Please stay tuned and let me know if you need it badly.

AlexDenisov · 2021-06-11T21:14:22Z

Done via #854 #855
The tutorial is coming up soon as well.

AlexDenisov · 2021-06-13T09:42:35Z

The tutorial is here: #882

ligurio mentioned this issue Nov 18, 2020

On the way to v1.0 #775

Open

8 tasks

ligurio changed the title ~~Add ability to catch mutants using tests written on other languages~~ Support tests written on interpreted languages Dec 10, 2020

AlexDenisov mentioned this issue Dec 23, 2020

PSA: Moving away from JIT #798

Closed

AlexDenisov mentioned this issue Jan 9, 2021

Duplicates and "False Positives" #782

Closed

AlexDenisov mentioned this issue Mar 1, 2021

[Refactoring] Remove Mutant dependency from reporters #835

Closed

This was referenced Apr 21, 2021

Revamp CLIOptions #853

Merged

Introduce mull-runner #854

Merged

Teach mull-runner about separate test programs #855

Merged

ligurio mentioned this issue Apr 23, 2021

Mull always expects executable binary #856

Closed

AlexDenisov closed this as completed Jun 11, 2021
