Ensure mutated code is always valid #301

maks-rafalko · 2018-04-09T20:26:26Z

This PR adds a check in unit tests that mutated code is always valid.

Extracted from #262 and this comment in particular

For reference:

Created Mutant with invalid syntax must be considered as a bug in Infection.

Concerns about this PR:

It slows down the Mutator tests (from 5s to 15s)
I'm not sure it's the best way to lint PHP, but it's the simplest one I was able to find

With this new assertion we are sure that at least the tested cases produce valid code for each Mutator.

Thoughts?

maks-rafalko · 2018-04-09T20:36:44Z

tests/Mutator/Boolean/Yield_Test.php

@@ -27,13 +27,19 @@ public function provideMutationCases(): \Generator
            <<<'PHP'
 <?php

-(yield $a => $b);
+function test()


Before this change, linting the mutated code for this mutator failed with: PHP Fatal error: The "yield" expression can only be used inside a function in /tmp/XXX

sanmai · 2018-04-10T08:55:31Z

Because of the slowdown, IMO we should not do this in general and by default. On request and during CI, sure.

Another way to lint is to use a fork - eval($code) - waitpid cycle, and then look at the exit status with pcntl_wexitstatus to see whether there was an error or not. Since this is our code, we can guarantee a lack of side effects, and with no side effects we can expect no infinite loops. This will be as expensive as a fork call. I believe exec is more costly. Please disregard my previous comment.

The fork scheme could be even faster if we don't wait for each single process to finish, but rather collect all exit statuses later at some point.
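
A minimal sketch of that "fork everything first, collect statuses later" idea, assuming the pcntl extension is available on the CLI. The name `eval_fork_batch` is hypothetical and is not code from this PR:

```php
<?php
// Hypothetical helper, not part of Infection: check several snippets at once.
// Assumes ext-pcntl (CLI only). Snippets are eval()'d, so they must be trusted.
function eval_fork_batch(array $snippets): array
{
    $pids = [];
    $results = [];

    // Phase 1: start one child per snippet without waiting for any of them.
    foreach ($snippets as $key => $code) {
        $pid = pcntl_fork();

        if ($pid === -1) {
            // fork() failed: report the snippet as not verified
            $results[$key] = false;
            continue;
        }

        if ($pid === 0) {
            // Child: a syntax error surfaces as ParseError (PHP 7+),
            // other compile errors terminate the child with a non-zero status.
            set_time_limit(1);
            error_reporting(0);
            try {
                eval($code);
            } catch (\Throwable $e) {
                exit(1);
            }
            exit(0);
        }

        $pids[$key] = $pid;
    }

    // Phase 2: collect every exit status after all children have been started.
    foreach ($pids as $key => $pid) {
        pcntl_waitpid($pid, $status);
        $results[$key] = pcntl_wifexited($status) && pcntl_wexitstatus($status) === 0;
    }

    return $results;
}
```

The children run concurrently, so the total wall time should be closer to the slowest single check than to the sum of all checks.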

theofidry · 2018-04-10T11:52:35Z

I'm not a big fan :/

While I do agree it would be better to keep generating valid code, there are two things I'm not fond of in that PR:

It is slower. The point of avoiding generating redundant or invalid mutants would be to be faster, because evaluating them would be more expensive than checking whether they should be created in the first place. That doesn't look to be the case here, as highlighted by @sanmai and your perf results.
It relies on the file-system. This is less of a concern right now, but in the future it would be nice if we could do some monkey-patching via the autoloading rather than dumping the file and being dependent on the FS, which is harder to parallelise properly and slower. That said, we're not there yet, so it's not the main point here.

Maybe an alternative would be to try to be smarter in the way we generate the mutants? e.g. not removing a return statement if a return value is expected and given in the signature.

sanmai · 2018-04-10T12:32:20Z

Here's a PoC of a forking linter:

function eval_fork($code)
{
    $pid = pcntl_fork();

    if ($pid == 0) {
        // Child
        set_time_limit(1);
        error_reporting(0);
        eval($code);
        exit();
    }

    if ($pid == -1) {
        // fork() failed
        return false;
    }

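    // We are the parent: wait for this specific child
    pcntl_waitpid($pid, $status);

    // Valid only if the child exited normally with exit status 0
    return pcntl_wifexited($status) && pcntl_wexitstatus($status) === 0;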
}

var_dump(eval_fork('$a = 1;')); // true

var_dump(eval_fork('$a = 1 /* ; */')); // false

$code = '<?php if (false) {class A extends B {}}';
var_dump(eval_fork('?>' . $code)); // true

One call above takes approximately 11 ms.

sanmai · 2018-04-10T13:50:34Z

tests/Mutator/AbstractMutatorTestCase.php

+
+    private function assertSyntaxIsValid(string $realMutatedCode)
+    {
+        exec(sprintf('echo %s | php -l', escapeshellarg($realMutatedCode)), $output, $returnCode);


proc_open seems like a good fit here; there will be more LOC but fewer syscalls

I'd try with eval_fork first, though.

theofidry · 2018-04-10T14:06:53Z

@sanmai should we start to investigate solutions like amphp? I used them for Box to process stuff in parallel and so far it was pretty good. Much more performant than whatever was done with the Symfony Process, which has a design flaw for parallel processing.

maks-rafalko · 2018-04-10T20:21:35Z

@theofidry, unfortunately, I didn't get your comment at all.

It is slower. The point of avoiding generating redundant or invalid mutants would be to be faster, because evaluating them would be more expensive than checking whether they should be created in the first place.

I would like to highlight that this PR checks PHP code validity only in the PHPUnit test suite. Nothing has been changed in Infection itself. We don't lint Mutant code in normal usage.

This PR just double-checks that each Mutator generates valid syntax for every test case in its @dataProvider.

It relies on the file-system.

What do you mean here? exec(sprintf('echo %s | php -l', escapeshellarg($realMutatedCode)), $output, $returnCode); uses STDIN as far as I can tell. No code is persisted to the FS in these tests.
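
As a standalone illustration of that point, here is a minimal sketch of the same call wrapped in a function. `lint_via_stdin` is a hypothetical name, and using `PHP_BINARY` instead of a bare `php` is an assumption made so the sketch runs with whichever interpreter executes it:

```php
<?php
// Hypothetical helper mirroring the exec() call quoted above.
// Nothing is written to disk: the code travels through a shell pipe.
function lint_via_stdin(string $code): bool
{
    // escapeshellarg() quotes the code so the shell hands it to echo as one argument;
    // `php -l` without a file argument lints whatever arrives on standard input.
    $command = sprintf(
        'echo %s | %s -l',
        escapeshellarg($code),
        escapeshellarg(PHP_BINARY)
    );

    exec($command, $output, $returnCode);

    // php -l exits with 0 only when no syntax errors were detected
    return $returnCode === 0;
}

var_dump(lint_via_stdin('<?php return 1;'));
var_dump(lint_via_stdin('<?php var_dump(true); var_dump(true)'));
```

One caveat with this approach: some shells' built-in `echo` interprets backslash escapes, so code containing sequences such as `\n` may reach the linter altered.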

Maybe an alternative would be to try to be smarter in the way we generate the mutants? e.g. not removing a return statement if a return value is expected and given in the signature.

That's right. And adding a test that confirms the mutated code is valid is a great addition IMO.

@sanmai thanks for the suggestions, I will check them when I have a chance. From what I can see now, exec() takes 100ms on my machine, so it seems like your solution is much faster.

The only concern here is that the code is evaluated (with eval()). I'm pretty sure there can be some difficulties with it, e.g. with autoloading. But it requires testing for sure.

sanmai · 2018-04-11T00:07:24Z

I'm pretty sure there can be some difficulties with it, e.g. with autoloading.

We can no-op the code with if (false) { ... code ... } and/or with function() { ... code ... };, or like so.

var_dump(eval_fork('class A extends B {}')); // false
var_dump(eval_fork('if (false) {class A extends B {}}')); // true
var_dump(eval_fork('if (false) {yield true;}')); // false
var_dump(eval_fork('if (false) {function () {yield true;};}')); // true

Doing include $tmpfile; instead of eval() won't help much. You would have to apply these workarounds one way or another.
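
The no-op wrapping shown above can be sketched as a small helper. This is only an illustration: `wrap_as_noop` is a hypothetical name, and it assumes the snippet is a run of plain statements without an opening tag. Statements such as `namespace`, `use` and `declare` cannot be nested this way and would need separate handling.

```php
<?php
// Hypothetical helper: wrap statements so that eval() compiles them but never runs them.
function wrap_as_noop(string $statements): string
{
    // The closure body is still compiled, so syntax errors are reported and
    // `yield` is legal; `if (false)` ensures the closure is never even created.
    // The newline before the closing brace protects against a trailing `//` comment.
    return "if (false) { function () {\n" . $statements . "\n}; }";
}

// Compiles fine: yield is now inside a function, and nothing is executed.
eval(wrap_as_noop('yield true;'));

// A real syntax error is still detected (as ParseError on PHP 7+).
try {
    eval(wrap_as_noop('$a = ;'));
} catch (\ParseError $e) {
    echo 'syntax error detected', PHP_EOL;
}
```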

sanmai · 2018-04-11T02:23:12Z

function proc_open_linter($code) {
    // We have to override stdout and stderr here,
    // else they'll get connected to *our* stdout/stderr,
    // flooding them with errors
    $process = proc_open('php -l', [
        ['pipe', 'r'], // child's stdin: we write the code here
        ['pipe', 'w'], // child's stdout
        ['pipe', 'w'], // child's stderr
    ], $pipes);

    if (!is_resource($process)) {
        return false;
    }

    // Pass our code
    fwrite($pipes[0], $code);
    fclose($pipes[0]);

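    // PHP won't write anything of interest into stderr,
    // no point in looking there at:
    // "Errors parsing Standard input code"
    // Still, drain stdout and close the pipes before proc_close()
    // so the child cannot block on an open pipe
    stream_get_contents($pipes[1]);
    fclose($pipes[1]);
    fclose($pipes[2]);

    return proc_close($process) === 0;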
}

var_dump(proc_open_linter('<?php return 1;')); // bool(true)
var_dump(proc_open_linter('<?php yield true;')); // bool(false)
var_dump(proc_open_linter('<?php var_dump(true); var_dump(true)')); // bool(false)

I get about 40-50 ms per invocation
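
For anyone who wants to reproduce such per-invocation numbers, a minimal timing sketch follows. `average_ms` is a hypothetical helper, not something used in this PR, and the figures it reports will vary by machine:

```php
<?php
// Hypothetical helper: average wall-clock time of one linter call, in milliseconds.
function average_ms(callable $linter, string $code, int $runs = 20): float
{
    $start = microtime(true);

    for ($i = 0; $i < $runs; $i++) {
        $linter($code);
    }

    // microtime(true) returns seconds as a float, hence the factor of 1000
    return (microtime(true) - $start) * 1000 / $runs;
}
```

For example, `average_ms('proc_open_linter', '<?php return 1;')` would time the function above, and the same call with a different callable allows a like-for-like comparison.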

theofidry · 2018-04-11T02:32:57Z

@borNfreee never mind my comment; I should have checked the diff :)

maks-rafalko · 2018-04-11T18:43:45Z

@sanmai with proc_open_linter I get the same 100ms as with exec(), so the performance is the same.

I would stick to exec() then, because:

it's a one-liner
it takes the same time
it works even on Windows
it displays the error (Fatal error: The "yield" expression can only be used inside a function in) automatically, while proc_open_linter does not

Regarding pcntl_fork(): it requires an additional extension to be installed. Neither my local PHP setup nor our Docker images have it (and I'm pretty sure the same is true for many developers). So this is an additional dependency that we can avoid.

maks-rafalko · 2018-04-12T05:24:56Z

Thank you @sanmai for your help

Ensure mutated code is always valid

cac00f5

maks-rafalko added the RFC label Apr 9, 2018

Use piping instead of creating a file for php -l linting commandline

3466d90

maks-rafalko force-pushed the check-syntax branch from 0fe2318 to 3466d90 Compare April 9, 2018 20:28

sanmai mentioned this pull request Apr 10, 2018

Syntax errors are not accounted for, not counted #262

Closed

sanmai reviewed Apr 10, 2018

maks-rafalko added this to the 0.9.0 milestone Apr 11, 2018

maks-rafalko removed the RFC label Apr 11, 2018

maks-rafalko merged commit b0a34c2 into master Apr 12, 2018

theofidry deleted the check-syntax branch April 12, 2018 20:55

sanmai mentioned this pull request Nov 15, 2018

Enhancement: Implement ArgumentRemovalArrayReplace mutator #565

Closed

3 tasks

maks-rafalko mentioned this pull request Aug 12, 2021

Detect syntax errors during mutation analysis and differentiate them from all errors #1555

Merged

1 task
