Parallel test improvements #3851

nikic · 2019-02-18T16:35:56Z

This a) reenables IO capture tests by inheriting std streams from the main process and b) implements the --CONFLICTS-- mechanism mentioned in #2822.

Some more CONFLICTS are very likely missing, but I hope that we can start using this in CI and fix issues as they come up.

cc @hikari-no-yume

nikic · 2019-02-19T15:39:31Z

@weltling @cmb69 There are some Windows-specific failures here that I can't figure out. The failing tests are the windows_mb_path tests, with diffs like this:

========DIFF========
001+ failed to create dir 'C:\projects\php-src\ext\standard\tests\file\windows_mb_path\dir_cp1252\tschüß3'
006- string(9) "tschüß4"
007- bool(true)
007+ string(7) "tsch��4"
008+ bool(false)
========DONE========
FAIL Test mkdir/rmdir cp1252 to UTF-8 path [C:\projects\php-src\ext\standard\tests\file\windows_mb_path\test_cp1252_to_utf8_1.phpt]

In the current setup, what happens is that all the windows_mb_path tests themselves do not run in parallel (i.e. no two windows_mb_path tests will run at the same time). However other tests run in parallel to the windows_mb_path tests. Which tests run can be seen in the AppVeyor log from the [1] and [2] prefixes.

Do you have any ideas why these failures happen? Can it be that the codepage settings bleed over across processes with a common parent or something?
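To make the scheduling constraint described above concrete, here is a small Python sketch (not the run-tests.php code). It assumes each test carries a set of hypothetical "conflict keys" and that two tests sharing a key must not run at the same time. It simulates the run in rounds of at most `workers` tests, which is simpler than a real asynchronous scheduler.

```python
from collections import deque

def schedule(tests, workers):
    """Group tests into rounds of at most `workers` tests each.

    `tests` is a list of (name, conflict_keys) pairs, where conflict_keys
    is a set of strings. Tests that share a key never land in the same round.
    Requires workers >= 1.
    """
    pending = deque(tests)
    rounds = []
    while pending:
        batch, active_keys, deferred = [], set(), []
        while pending and len(batch) < workers:
            name, keys = pending.popleft()
            if keys & active_keys:
                # Conflicts with something already in this round: try later.
                deferred.append((name, keys))
            else:
                batch.append(name)
                active_keys |= keys
        # Deferred tests go back to the front, in their original order.
        pending.extendleft(reversed(deferred))
        rounds.append(batch)
    return rounds
```

With two workers, two tests keyed `mb_path` end up in different rounds, while unkeyed tests fill the free slot next to each of them.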

hikari-no-yume · 2019-02-19T19:08:35Z

run-tests.php


 escape:
-	while ($testDirsToGo || ($testDirsInProgress > 0)) {
+	while ($test_files || $testsInProgress > 0) {


Why test_files and not testFiles?

$test_files is an existing variable. Would have to touch a lot of code (which all seems to use underscore style) to rename it.

hikari-no-yume · 2019-02-19T19:16:36Z

The --CONFLICTS-- mechanism's implementation looks pleasantly simple! ^^

hikari-no-yume · 2019-02-19T19:17:05Z

If/when this is merged, please squash it.

cmb69 · 2019-02-19T22:28:51Z

I have not been able to reproduce the Windows test failures on my machine. Will have a closer look tomorrow.

staabm · 2019-02-20T07:03:42Z

ext/sockets/tests/socket_import_stream-3.phpt

@@ -6,9 +6,9 @@ if (!extension_loaded('sockets')) {
 	die('SKIP sockets extension not available.');
 }
 $s = socket_create(AF_INET, SOCK_DGRAM, SOL_UDP);
-$br = socket_bind($s, '0.0.0.0', 58381);
+$br = socket_bind($s, '0.0.0.0', 58379);


Would it make sense to declare CONFLICTS for this test based on the port number it uses?
That way you don't need to check the whole codebase to see which tests use which port.

It's better to adjust the test (if it's very simple, like here), as it allows it to run in parallel. Adding a port-based CONFLICT would prevent spurious failures, but reduce parallelization opportunities.

In the future we might want to switch code to use dynamic port assignment instead (this is what HHVM did in their tests), but that's a lot more effort than changing a fixed port :)
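As an illustration of the dynamic port assignment idea, here is a minimal Python sketch (not PHP test code). Binding to port 0 asks the operating system to pick a free port, and the port actually chosen can then be read back from the socket. The helper name `bind_udp_ephemeral` is made up for this example.

```python
import socket

def bind_udp_ephemeral(host="127.0.0.1"):
    """Bind a UDP socket to an OS-assigned port and return (socket, port)."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind((host, 0))  # port 0 means "let the OS choose a free port"
    port = sock.getsockname()[1]
    return sock, port
```

A test written this way has to pass the chosen port on to whatever connects to it, which is part of why this is more work than changing a fixed port number.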

KalleZ · 2019-02-20T07:24:23Z

@staabm I think checking the codebase gradually is worthwhile, as it allows better parallelization in the long run.

nikic · 2019-02-20T10:33:15Z

I've merged the conflict handling implementation in c0e15a3. Keeping this open until the Windows issue is resolved.

I'll probably experimentally enable use of -j2 on Travis as well, and will keep an eye on spurious build failures.

nikic · 2019-02-20T12:15:57Z

I've enabled -j2 on Travis now.

Also got a SOAP failure that I can't reproduce from that (https://travis-ci.org/php/php-src/jobs/495937131):

========DIFF========
002+ ��������0���>���*z#H�	7�����������j��mAj�Eo�ә���������[����������X��a�ɱ�����"JEQC��`h���.�}E�@�;%��_QD��������DisJ�@m'�f����t�&�kg��>C!s,��sb�	C	=����$��؞����(ZC�F��|c�j9F~��0�G'��B8�r���b'b_��
003+ 
004+ 
005+ -----------
006+ 
007+ <?xml version="1.0" encoding="UTF-8"?>
008+ <SOAP-ENV:Envelope xmlns:SOAP-ENV="http://schemas.xmlsoap.org/soap/envelope/"><SOAP-ENV:Body><SOAP-ENV:Fault><faultcode>SOAP-ENV:Client</faultcode><faultstring>Bad Request</faultstring></SOAP-ENV:Fault></SOAP-ENV:Body></SOAP-ENV:Envelope>
002- <?xml version="1.0" encoding="ISO-8859-1"?>
003- <SOAP-ENV:Envelope
004-   SOAP-ENV:encodingStyle="http://schemas.xmlsoap.org/soap/encoding/"
005-   xmlns:SOAP-ENV="http://schemas.xmlsoap.org/soap/envelope/"
006-   xmlns:xsd="http://www.w3.org/2001/XMLSchema"
007-   xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
008-   xmlns:si="http://soapinterop.org/xsd">
009-   <SOAP-ENV:Body>
010-     <ns1:test xmlns:ns1="http://testuri.org" />
011-   </SOAP-ENV:Body>
012- </SOAP-ENV:Envelope>
013- 
014- 
015- -----------
016- 
017- <?xml version="1.0" encoding="UTF-8"?>
018- <SOAP-ENV:Envelope xmlns:SOAP-ENV="http://schemas.xmlsoap.org/soap/envelope/" xmlns:ns1="http://testuri.org" xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:SOAP-ENC="http://schemas.xmlsoap.org/soap/encoding/" SOAP-ENV:encodingStyle="http://schemas.xmlsoap.org/soap/encoding/"><SOAP-ENV:Body><ns1:testResponse><return xsi:type="xsd:string">Hello World</return></ns1:testResponse></SOAP-ENV:Body></SOAP-ENV:Envelope>
019- ok
========DONE========
FAIL SOAP Server 29-CGI: new/addfunction/handle [ext/soap/tests/server029.phpt]

It looks like we're reading garbage from php://input, which seems pretty bad...

pcrov · 2019-02-20T13:51:14Z

That garbage looks like the gzipped post from server019.phpt.

nikic · 2019-02-20T13:53:19Z

@pcrov Ooooh, thanks, that makes a lot of sense. "uniq"id strikes again: https://github.com/php/php-src/blob/master/run-tests.php#L1967
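The general hazard here can be sketched in Python (this is an illustration, not a reconstruction of any change to run-tests.php). The assumption is that a temporary file holding a request body gets a name derived from the current time, so two workers asking in the same instant receive the same name and overwrite each other's data. Both helper names below are hypothetical.

```python
import os
import tempfile

def time_based_name(prefix, now_us):
    """Mimic a timestamp-derived id: the same microsecond gives the same name."""
    return f"{prefix}{now_us:x}"

def make_unique_post_file(directory, prefix="phpt_post_"):
    """Create a file whose name is guaranteed not to exist yet in `directory`."""
    fd, path = tempfile.mkstemp(prefix=prefix, dir=directory)
    os.close(fd)  # the caller reopens the path when it needs to write
    return path
```

`tempfile.mkstemp` creates the file atomically, so concurrent callers cannot receive the same path, whereas a purely time-based name gives no such guarantee across processes.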

nikic · 2019-02-20T16:05:44Z

The SOAP issue should be fixed with 967fa51.

@cmb69 I am able to occasionally reproduce these locally using run-tests.php ext/standard/tests/file/windows_mb_path/ ext/standard/tests/general_functions/ -j2. The most common failure is ext\standard\tests\file\windows_mb_path\bug75063_cp1251.phpt.

cmb69 · 2019-02-20T16:36:47Z

@nikic Sorry for the delay; I'm still handicapped by the flu. Anyhow, it seems to me that for CLI the codepage is set for the console, which is inherited by all these processes, so we're facing race conditions. I don't see a way to fix this other than marking the respective tests, of which there appear to be many, as conflicting. :(

nikic · 2019-02-20T17:09:59Z

For the purposes of test parallelization I guess the solution will be to add an extra mode where a test conflicts with everything and thus enforce that no other tests may run in parallel with it.

This behavior seems pretty broken to me though. It means that it's impossible to safely switch PHP's internal CP, because it will also always switch the console CP as well (on CLI that is).
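A rough Python sketch of such a "conflicts with everything" mode follows (not the run-tests.php implementation). It assumes tests carry sets of conflict keys and uses a hypothetical reserved key, `ALL`, to mean that the test must run alone.

```python
ALL = "all"  # hypothetical reserved key: the test must run by itself

def can_start(candidate_keys, running):
    """Decide whether a test may start now.

    `candidate_keys` is the candidate's set of conflict keys; `running` is a
    list with the key set of every test that is currently executing.
    """
    if any(ALL in keys for keys in running):
        return False  # an exclusive test is running: nothing else may start
    if ALL in candidate_keys:
        return not running  # an exclusive test needs every worker to be idle
    active = set().union(*running)
    return not (candidate_keys & active)
```

The cost is that an exclusive test leaves all other workers idle while it runs, so this only pays off when few tests need it.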

nikic · 2019-02-21T10:36:15Z

The "all" conflict is now implemented in 152e539 and I've enabled -j2 on AppVeyor in e3d502f.

Reusing this PR to check whether there's any benefit to using -j3...

cmb69 · 2019-02-21T10:54:07Z

I agree that the inherent coupling of the internal and console CP is a limitation which we may want to abolish sometime, but for most practical purposes it appears not to be a problem. After all, it's better than having no way to influence the CP as it was in PHP 7.0. :)

Thanks for taking care of the test framework work-around, anyway!

nikic · 2019-02-21T11:26:59Z

Using -j3 seems to be slower than -j2 (which is reasonable, as we only have 2 cores on AppVeyor). I got quite a few failures though where the test simply has no output at all: https://ci.appveyor.com/project/php/php-src/builds/22539918/job/r10v7qsx8qqx4y1s

nikic · 2019-02-22T09:40:45Z

It looks like the failures with no output always occur in the THREAD_SAFE=1, OPCACHE=1, INTRINSICS=AVX build. I'm wondering if this might be related to opcache. Are there any problems with trying to instantiate multiple opcache instances at the same time on Windows?

cmb69 · 2019-02-22T13:19:46Z

> Are there any problems with trying to instantiate multiple opcache instances at the same time on Windows?

I thought that a single OPcache instance would be used for all test processes. :S

nikic · 2019-02-22T14:14:40Z

@cmb69 At least on Linux it's going to be a separate one for each test (shm opcache is never shared on cli sapi). Is that different on Windows?

cmb69 · 2019-02-22T15:24:07Z

At least two instances of the built-in Webserver would share a single SHM Cache.

nikic · 2019-02-22T16:02:53Z

@cmb69 Interesting, didn't know that opcache behaves so differently on Windows. Do you know if there's any way to avoid this? Looking at shared_alloc_win32.c probably not -- seems like the only way to get a separate cache is to switch users?

cmb69 · 2019-02-22T16:29:12Z

It should be possible to change the temporary directory to enforce a new mmap file; can give that a try later.

PS: no, that won't work ("Unable to open base address file").

nikic · 2019-02-27T11:45:09Z

@cmb69 I've now tried to set a per-worker TEMP, TMP and TMPDIR. It does create separate base address files in that case. However, it still tries to use the same base address (wat?). I then tried explicitly setting opcache.mmap_base for each worker to a separate address. I see the base address change in the file, but overall this doesn't help with the test failures -- in fact there are more of them :(
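The per-worker temporary directory setup can be sketched in Python as follows (an illustration only; the helper name and directory layout are made up). Each worker gets its own environment in which `TEMP`, `TMP` and `TMPDIR` point at a private directory.

```python
import os

def worker_env(worker_id, base_tmp, environ=None):
    """Return a copy of the environment with a private temp dir for one worker."""
    env = dict(os.environ if environ is None else environ)
    tmp = os.path.join(base_tmp, f"worker{worker_id}")
    os.makedirs(tmp, exist_ok=True)
    for var in ("TEMP", "TMP", "TMPDIR"):
        env[var] = tmp
    return env
```

The result would be passed as the `env` argument of `subprocess.Popen` when starting a worker. As the comment above notes, this separated the base address files but did not by itself resolve the opcache failures.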

cmb69 · 2019-02-27T12:31:34Z

> However, it still tries to use the same base address (wat?).

Guess that comment explains it:

php-src/ext/opcache/shared_alloc_win32.c

Lines 283 to 285 in a72c741

    
	/* Starting from windows Vista, heap randomization occurs which might cause our mapping base to
	   be taken (fail to map). So under Vista, we try to map into a hard coded predefined addresses
	   in high memory. */

weltling · 2019-03-01T20:13:55Z

Most likely this Opcache scenario will need to be covered, as we want to support the parallel tests with Opcache as well. The current expectation is that a given user runs on just one instance of the shared memory. This approach might have to be changed to support an arbitrary number of separate Opcache instances for the same user.

It is a complex topic though, and needs careful consideration. It involves both shared memory and mutex handling. I can imagine why it was done the way it is right now: automatic naming has the advantage of fewer user errors and thus fewer false bug reports. Serious configuration mistakes can impact security and system stability. Some conversations about this happened in the past, which might be useful to read: https://bugs.php.net/bug.php?id=72645.

Thanks.

nikic · 2019-03-04T10:34:37Z

Going to close this testing PR. The current state is that parallel testing is enabled on Travis and on the non-opcache job on AppVeyor. I'm still fixing occasional failures, but generally the functionality seems fairly stable :)

nikic force-pushed the parallel-tests branch from 840fdfc to a7b190b Compare February 18, 2019 17:05

staabm reviewed Feb 18, 2019

run-tests.php (outdated)

carusogabriel added the Category: Tests label Feb 19, 2019

nikic force-pushed the parallel-tests branch 2 times, most recently from d01e1a4 to 43b6cb9 Compare February 19, 2019 09:37

guilliamxavier reviewed Feb 19, 2019

run-tests.php (outdated)

nikic force-pushed the parallel-tests branch from 43b6cb9 to d138568 Compare February 19, 2019 11:16

nikic changed the base branch from master to PHP-7.4 February 19, 2019 12:42

nikic force-pushed the parallel-tests branch from 1958bb4 to 3e508a1 Compare February 19, 2019 12:43

hikari-no-yume reviewed Feb 19, 2019

run-tests.php (outdated)

hikari-no-yume reviewed Feb 19, 2019

staabm reviewed Feb 19, 2019

ext/xmlwriter/tests/002.phpt (outdated)

staabm reviewed Feb 19, 2019

ext/xmlwriter/tests/003.phpt (outdated)

staabm reviewed Feb 20, 2019

nikic force-pushed the parallel-tests branch 2 times, most recently from 8cae72a to d87c1a3 Compare February 20, 2019 10:31

nikic force-pushed the parallel-tests branch 2 times, most recently from 6b339eb to 9cff793 Compare February 21, 2019 10:09

nikic force-pushed the parallel-tests branch from 9cff793 to 4b99d5f Compare February 21, 2019 11:25

Try -j3 on AppVeyor

87c179e

nikic force-pushed the parallel-tests branch from 4b99d5f to 38d38eb Compare February 26, 2019 16:45

nikic added 2 commits February 27, 2019 11:03

Clarify that redir_tested always null for parallel tests

5388282

Use separate TEMP directories on Windows

0bd27bd

nikic force-pushed the parallel-tests branch from 38d38eb to 0bd27bd Compare February 27, 2019 10:32

nikic closed this Mar 4, 2019

cmb69 mentioned this pull request May 24, 2019

Enable opcache to run full list of PHPT tests in test package php/pftt2#30

Closed
