Mutations should be skipped based on individual time, not total time #1965

Open
danepowell opened this issue May 17, 2024 · 9 comments · May be fixed by #1966

danepowell commented May 17, 2024

Is your feature request related to a problem? Please describe.

Mutations for a given line of code seem to be skipped based on the total amount of time taken for all tests that cover that line. In my case, I have a few lines of code that are covered by nearly every test case; so while each test takes 100 ms, mutations are skipped entirely for those lines because 100 ms x 300 tests = 30 s, which is greater than the timeout.
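The arithmetic can be illustrated with a short sketch (the numbers are the ones from this report; this is not Infection's actual code):

```php
<?php
// Sketch of the reported numbers, not Infection's code:
// 300 covering tests at ~100 ms each, against Infection's default 10 s timeout.
$testTimes = array_fill(0, 300, 0.1);
$timeout = 10.0;

$total = array_sum($testTimes); // ≈ 30 s — the current check compares this, so the mutation is skipped
$slowest = max($testTimes);     // 0.1 s — a per-test check would compare this, so the mutation would run
```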

Of course I can just increase the timeout as a workaround, but this is still not ideal behavior.

Describe the solution you'd like

Mutations should only be skipped based on the time required (maybe the maximum time required?) to run a single test case per line.

Additional context

As discussed in Discord: https://discord.com/channels/767914934016802818/771020567343136831/1241042229473316934

CC @sanmai who authored this code


sanmai commented May 18, 2024

I have to look into the code, but as far as I recall the skipping logic follows the timeout logic. Let me get back to you after reconfirming.


sanmai commented May 18, 2024

PR that added this feature: #1171

Another relevant PR: #1424


sanmai commented May 18, 2024

It looks like we are tracking full test time instead of timings for individual tests, at line 106 here:

private function createCompleteTestLocation(TestLocation $test): TestLocation
{
    $class = explode(':', $test->getMethod(), self::MAX_EXPLODE_PARTS)[0];

    $testFileData = $this->testFileDataProvider->getTestFileInfo($class);

    return new TestLocation(
        $test->getMethod(),
        $testFileData->path,
        $testFileData->time,
    );
}

So this isn't a problem with the feature itself, but rather a problem with the underlying data aggregation.

@sanmai sanmai linked a pull request May 18, 2024 that will close this issue

sanmai commented May 18, 2024

At this moment I don't feel particularly good about investing time into fixing this issue: the timeout feature exists to limit the time Infection spends running tests, so there should be no harm in adjusting (bumping) the timeout when needed to run more tests. But I can see how it can be frustrating.

Furthermore, I still stand by my recommendation from this comment to tag integration tests with @coversNothing.

theofidry (Member) commented

Personally, I have to agree with @danepowell: I find the current behaviour quite unintuitive.

If I configure a timeout for a test (regardless of the framework/language), I would personally expect it to apply to the execution of a single test (unless the config hints at something else, of course).

But personal bias aside, we're not in just any language or framework: we're in PHP, and arguably the de facto test framework is PHPUnit. And even there, this is how the timeout works:

`CounterTest.php`
<?php

declare(strict_types=1);

namespace App\Tests;

use PHPUnit\Framework\Attributes\DataProvider;
use PHPUnit\Framework\Attributes\Small;
use PHPUnit\Framework\TestCase;
use function range;
use function sleep;

#[Small]
final class CounterTest extends TestCase
{
    #[DataProvider('inputProvider')]
    public function test_smaller_than_small(mixed $input): void
    {
        $this->addToAssertionCount(1);
    }

    #[DataProvider('inputProvider')]
    public function test_bigger_than_small(mixed $input): void
    {
        sleep(2);

        $this->addToAssertionCount(1);
    }

    public static function inputProvider(): iterable
    {
        foreach (range(1, 10) as $i) {
            yield [$i];
        }
    }
}

Provided you enforce the time limit with enforceTimeLimit="true" in your PHPUnit configuration file or --enforce-time-limit on the command line, only the individual tests will time out. Indeed, although the PHPUnit timeout is somewhat inflexible*, it is the execution time of the individual test that is compared to the timeout.

*: There is no Timeout attribute, so besides the hardcoded Small, Medium and Large sizes, the only recourse you have is the default time limit. And the timeouts are configured at the test case level, not the test level.
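For reference, enabling that enforcement looks roughly like this (attribute names from the PHPUnit configuration schema; the limit values are illustrative, and enforcing time limits also requires the pcntl extension):

```xml
<!-- phpunit.xml (fragment) — illustrative values -->
<phpunit enforceTimeLimit="true"
         defaultTimeLimit="1"
         timeoutForSmallTests="1"
         timeoutForMediumTests="10"
         timeoutForLargeTests="60">
    <!-- ... -->
</phpunit>
```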


I think the current timeout is not devoid of sense or use, but it is not what I would expect a timeout to be. Maybe something like subjectOverallTimeout would be more appropriate; definitely not the best name, but I think it should convey that it is about the time of all the tests that will be used to cover the subject.

maks-rafalko (Member) commented

As I said on Discord here, I also think this feature shouldn't sum the times of all the tests covering the mutated line to decide whether a mutant should be skipped, but should instead check test times individually.

In our docs explanation - https://infection.github.io/2020/08/18/whats-new-in-0.17.0/#Skip-S-mutations-that-are-over-specified-time-limit - we give an example with one integration test which takes more than %timeout%, not the total time of all tests; that's why I think it's confusing at the moment.


By the way, there are two places where the timeout configuration affects Infection's behavior:

  1. We use the timeout for Mutant processes: if a mutation produces an infinite loop, the process is killed after timeout seconds (10s by default). This happens when the Mutant is already created and is being executed in a separate process:

$process = new Process(
    command: $this->testFrameworkAdapter->getMutantCommandLine(
        $mutant->getTests(),
        $mutant->getFilePath(),
        $mutant->getMutation()->getHash(),
        $mutant->getMutation()->getOriginalFilePath(),
        $testFrameworkExtraOptions,
    ),
    timeout: $this->timeout,
);

  2. We use the timeout to compare against the sum of all test times in MutationTestingRunner, to decide whether we even need to run a particular Mutant. If not, the Mutant process is not executed (so we don't waste time here):

->filter(function (Mutant $mutant): bool {
    // TODO refactor this comparison into a dedicated comparer to make it possible to swap strategies
    if ($mutant->getMutation()->getNominalTestExecutionTime() < $this->timeout) {
        return true;
    }

In this issue, we are discussing point 2.

Thoughts on how I think it should work instead, given the timeout is 10s by default (please share your thoughts, I may be wrong!):

  • if each test takes <10s individually, we mutate the code and run the Mutant process, because each test, executed from the fastest to the slowest, can potentially kill this Mutant.
  • if M of N tests take >10s, we should still run the Mutant process, because the tests that take <10s individually can potentially kill this Mutant (optional optimization: do not run those M tests, as they will time out anyway; see point 1 above).
  • if N of N tests (all of them) take >10s, we should skip running this Mutant process, because all the Mutant processes (point 1) will time out anyway.
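The rule described in these three bullets could be sketched as follows (a hypothetical shouldRunMutant helper, not Infection's actual API):

```php
<?php
// Hypothetical sketch of the proposed rule, not Infection's code:
// run the Mutant process unless every covering test individually exceeds the timeout.
function shouldRunMutant(array $individualTestTimes, float $timeout): bool
{
    foreach ($individualTestTimes as $time) {
        if ($time < $timeout) {
            // At least one test can finish in time and potentially kill the Mutant.
            return true;
        }
    }

    // All N tests would time out anyway, so skip the Mutant process.
    return false;
}
```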

danepowell (Author) commented

By the way, there are 2 places where timeout configuration affects Infection's behavior

This is the functional impact of this issue: if my tests take 100 ms, my timeout should be close to that, to avoid wasting considerable time on actual mutation-induced timeouts. But I can't lower the timeout if it's just going to lead to all mutations being skipped.

I'll look into the coversNothing annotation, but I don't think that's going to be a satisfying workaround. I know my "unit" tests are actually integration tests, but that's unfortunately all that's supported by the Symfony Console CommandTester.

theofidry (Member) commented

unfortunately all that's supported by the Symfony Console CommandTester.

What defines unit vs integration there is whether you use a directly instantiated command/application, or you get one from the kernel or something with externally configured services.

Otherwise, testing the command directly is not effective: the command runner is the application, and it enriches the command definition.


sanmai commented May 19, 2024

In JUnit reports there are timings for each and every test. So we could calculate the exact time required to run a set of tests and use that. There should be no need for heuristics and guesswork.

<?xml version="1.0" encoding="UTF-8"?>
<testsuites>
  <testsuite name="phpunit.xml.dist" tests="2276" assertions="2489" errors="0" failures="0" skipped="0" time="5.511347">
    <testsuite name="Main" tests="2276" assertions="2489" errors="0" failures="0" skipped="0" time="5.511347">
      <testsuite name="Tests\Pipeline\ChunkTest" file="tests/ChunkTest.php" tests="25" assertions="25" errors="0" failures="0" skipped="0" time="0.083669">
        <testsuite name="Tests\Pipeline\ChunkTest::testChunk" tests="24" assertions="24" errors="0" failures="0" skipped="0" time="0.081661">
          <testcase name="testChunk with data set #23" file="tests/ChunkTest.php" line="77" class="Tests\Pipeline\ChunkTest" classname="Tests.Pipeline.ChunkTest" assertions="1" time="0.027784"/>
          <testcase name="testChunk with data set #13" file="tests/ChunkTest.php" line="77" class="Tests\Pipeline\ChunkTest" classname="Tests.Pipeline.ChunkTest" assertions="1" time="0.002334"/>
          <testcase name="testChunk with data set #9" file="tests/ChunkTest.php" line="77" class="Tests\Pipeline\ChunkTest" classname="Tests.Pipeline.ChunkTest" assertions="1" time="0.002521"/>
          <testcase name="testChunk with data set #19" file="tests/ChunkTest.php" line="77" class="Tests\Pipeline\ChunkTest" classname="Tests.Pipeline.ChunkTest" assertions="1" time="0.002292"/>

(note the per-test time attribute on each testcase element)

(I guess this wasn't the case before, and at one point we had to use suite-wide timings, as only those were available.)
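Extracting those per-test timings is straightforward with SimpleXML (a sketch against the report shape shown above, not Infection's actual parser):

```php
<?php
// Sketch, not Infection's parser: collect per-test timings
// from a JUnit report of the shape shown above.
$junitXml = <<<'XML'
<?xml version="1.0" encoding="UTF-8"?>
<testsuites>
  <testsuite name="Main" tests="2" time="0.030118">
    <testcase name="testChunk with data set #23" time="0.027784"/>
    <testcase name="testChunk with data set #13" time="0.002334"/>
  </testsuite>
</testsuites>
XML;

$times = [];

foreach (simplexml_load_string($junitXml)->xpath('//testcase') as $testcase) {
    $times[(string) $testcase['name']] = (float) $testcase['time'];
}
// $times now maps each test case name to its individual run time in seconds.
```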
