
Add INFECTION and TEST_TOKEN environment variables for each Mutant process #1504

Merged
maks-rafalko merged 2 commits into master from feature/test-token on Apr 24, 2021

Conversation

@maks-rafalko
Member

@maks-rafalko maks-rafalko commented Apr 18, 2021

Add INFECTION and TEST_TOKEN environment variables for each Mutant process to make it possible to run functional tests against different databases.

Background:

We have a lot of functional tests that run real SQL queries against MySQL, and we decided to run them in parallel. To achieve this, we integrated the Paratest library.

Parallel processes cannot write to the same database, because the tests start to conflict with each other. To fix this, Paratest provides a TEST_TOKEN=<int> environment variable for each process, which can be used to set up a different database connection per process.

So if you have 3 parallel processes, they will use db_1, db_2 and db_3 respectively.
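
A minimal sketch of that mapping. The helper name `databaseNameForToken()` and the `db_test` fallback are assumptions made for illustration; neither comes from Paratest or Infection. Only the `db_<token>` naming follows the example above.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical helper: derives a per-process database name from TEST_TOKEN.
 * Falls back to a shared name (an assumption) when the variable is not set,
 * for example in a plain single-process PHPUnit run.
 */
function databaseNameForToken(string|false $token, string $fallback = 'db_test'): string
{
    if ($token === false || $token === '') {
        return $fallback;
    }

    return 'db_' . $token;
}

// getenv() returns false when the variable is not defined.
$databaseName = databaseNameForToken(getenv('TEST_TOKEN'));
```

With TEST_TOKEN=2 in the environment, this would resolve to db_2.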

Infection's issue with functional tests

The same goes for Infection. As soon as we use more than 1 thread, our ParallelProcessRunner runs processes in parallel. Without TEST_TOKEN being passed, all those processes use the same database, so the tests fail and the majority of the Mutants are reported as killed.

This PR adds TEST_TOKEN=<int> env variable for each "mutant process".

Note: we can't use round-robin for thread indexes, because different processes take different amounts of time. Consider the following example:

Imagine we run Infection with 3 threads (infection -j3) and we have 4 mutations, so we need to run 4 processes, 3 of which can run in parallel:

Process 1, TEST_TOKEN = 1
|--------------------------------------------------|
    Process 2, TEST_TOKEN = 2
    |------------|
         Process 3, TEST_TOKEN = 3
         |--|
             Process 4, TEST_TOKEN = 1
             |-----------------------------------------------------|

In this example, as soon as Process 3 finishes its work, we can run Process 4. This means that we will have exactly 3 processes running in parallel at the same time:

  • Process 1
  • Process 2
  • Process 4

But since Process 1 is still running, the round-robin algorithm gives Process 4 a TEST_TOKEN value of 1. We end up in a situation where Process 1 and Process 4 work with the same database at the same time.
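
The collision can be reproduced with a few lines. This is only an illustrative sketch: `roundRobinToken()` is a hypothetical function, not code from Infection, and it assumes 1-based process numbers as in the diagram.

```php
<?php

declare(strict_types=1);

/**
 * Round-robin assignment: the token depends only on the order in which
 * processes are started, not on which tokens are still in use.
 * $processNumber is 1-based, matching "Process 1..4" in the example.
 */
function roundRobinToken(int $processNumber, int $threadsCount): int
{
    return (($processNumber - 1) % $threadsCount) + 1;
}

$threadsCount = 3;

// Processes 1 and 2 are still running when Process 4 starts.
$runningTokens = [
    roundRobinToken(1, $threadsCount),
    roundRobinToken(2, $threadsCount),
];

$tokenForProcess4 = roundRobinToken(4, $threadsCount);

// true: Process 4 receives a token that a running process still holds.
$collides = in_array($tokenForProcess4, $runningTokens, true);
```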

To avoid this issue, in this PR TEST_TOKEN gets the first available (free) index that is not used by any other process at that point in time. In the example above, as soon as Process 3 finishes its job, Process 4 gets TEST_TOKEN=3, because values 1 and 2 are still busy.

Process 1, TEST_TOKEN = 1
|--------------------------------------------------|
    Process 2, TEST_TOKEN = 2
    |------------|
         Process 3, TEST_TOKEN = 3
         |--|
-           Process 4, TEST_TOKEN = 1
+           Process 4, TEST_TOKEN = 3
             |-----------------------------------------------------|
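
The "first free index" idea can be sketched as a small pool. `TokenPool` is a hypothetical class written for illustration and is not Infection's actual implementation; picking the lowest free value is also an assumption, since the description only says "first available".

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical pool of TEST_TOKEN values (not Infection's actual class).
 */
final class TokenPool
{
    /** @var array<int, true> tokens currently free, keyed by token value */
    private array $free;

    public function __construct(int $threadsCount)
    {
        $this->free = array_fill_keys(range(1, $threadsCount), true);
    }

    /** Takes the lowest free token (assumption: "first" means lowest). */
    public function acquire(): int
    {
        if ($this->free === []) {
            throw new RuntimeException('No free token: more processes than threads.');
        }

        $token = min(array_keys($this->free));
        unset($this->free[$token]);

        return $token;
    }

    /** Returns a token to the pool once its process has finished. */
    public function release(int $token): void
    {
        $this->free[$token] = true;
    }
}

$pool = new TokenPool(3);

$process1 = $pool->acquire(); // 1
$process2 = $pool->acquire(); // 2
$process3 = $pool->acquire(); // 3

$pool->release($process3);    // Process 3 finishes first

$process4 = $pool->acquire(); // 3, because 1 and 2 are still in use
```

Because a token is only handed out after it has been released, two running processes never share a value, regardless of how long each one takes.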

INFECTION env variable

In our application, we have a custom bootstrap file for PHPUnit, where we remove the var/cache folder:

# tests/bootstrap.php

$filesystem = new Filesystem();

$filesystem->remove([__DIR__ . '/../var/cache/test']);

Since Infection runs PHPUnit in separate processes, this bootstrap file gets called for each Mutant, which again breaks the whole run: while Process 1 is still working, Process 3 removes its cache.

So we need to detect when the PHPUnit process is executed from Infection, and skip the removal:

$filesystem = new Filesystem();

if (!getenv('INFECTION')) {
    $filesystem->remove([__DIR__ . '/../var/cache/test']);
}

This PR:

…ocess to make it possible to run tests for different databases

Paratest uses the same approach to avoid conflicts when 2 parallel processes write to the same database.

yield 'nominal' => [4];

yield 'infinite' => [PHP_INT_MAX];
Member


Is this something we can no longer test?

Member Author


Yes. Since we now use range(1, $threadsCount), it leads to:

1) Infection\Tests\Process\Runner\ParallelProcessRunnerTest::test_it_handles_all_kids_of_processes_with_infinite_threads with data set "infinite" (9223372036854775807)
ValueError: The supplied range exceeds the maximum array size: start=1 end=9223372036854775807

But I've added a similar test in the next commit, where the thread count is bigger than the number of processes.

@maks-rafalko maks-rafalko added this to the 0.22.0 milestone Apr 20, 2021
@maks-rafalko maks-rafalko changed the title Add INFECTION and TEST_TOKEN environment variables for each Mutant process Add INFECTION and TEST_TOKEN environment variables for each Mutant process Apr 20, 2021
@maks-rafalko maks-rafalko merged commit 638d558 into master Apr 24, 2021
@maks-rafalko maks-rafalko deleted the feature/test-token branch April 24, 2021 17:25
@sanmai
Member

sanmai commented Jul 9, 2025

How did we decide that we need this token to be taken from a pool? Because this whole ordeal could be simplified, greatly so, if the token could be sequential, and the user would simply do:

$serverID = $TEST_TOKEN % 3;

If they have only three servers. A demo.

With sequential tokens we would number the mutations as we create them, telling them to add a new env variable, and that would be it, nothing more.

@sanmai
Member

sanmai commented Jul 9, 2025

It just occurred to me that we don't even need that. If a user can tell that tests are running under mutation testing, they can implement a very simple algo to find a free server:

if (getenv('INFECTION')) {
    include 'lockdb.php';
}

And in lockdb.php:

for ($i = 0; $i < 3; $i++) {
    $fp = fopen(__DIR__ . "/db{$i}.lock", 'c');
    if (flock($fp, LOCK_EX | LOCK_NB)) {
        putenv("TEST_DB_SERVER=db{$i}");
        echo "Using DB server db{$i}\n";
        return;
    }
    fclose($fp);
}
exit(0); // triggers if you only run Infection with `-j` greater than 3, to mark all mutations "escaping"

Or it can be:

putenv("TEST_TOKEN={$i}");

So I'm not sure this complication was necessary in the first place.

@maks-rafalko
Member Author

I've answered here with our use case for TEST_TOKEN: #2314 (comment)

and for me, as a user of Infection, writing the whole script suggested above instead of using the TEST_TOKEN it provides is a huge downgrade of DX:

for ($i = 0; $i < 3; $i++) {
    $fp = fopen(__DIR__ . "/db{$i}.lock", 'c');
    if (flock($fp, LOCK_EX | LOCK_NB)) {
        putenv("TEST_DB_SERVER=db{$i}");
        echo "Using DB server db{$i}\n";
        return;
    }
    fclose($fp);
}
exit(0); // triggers if you only run Infection with `-j` greater than 3, to mark all mutations "escaping"

And also, as I explained in the referenced issue, Paratest uses TEST_TOKEN, so we can use exactly the same approach and Doctrine config for both Paratest and Infection, which is very developer-friendly:

doctrine:
    dbal:
        dbname: 'db_%env(default:test_token:TEST_TOKEN)%'

2 participants