Implement filtering tests based on XML input file #4449

Closed
mfn wants to merge 4 commits into master from mfn-tests-xml

@mfn mfn commented Sep 8, 2020

Summary

Export the test list via --list-tests-xml, slice it as needed, and feed the result back via --tests-xml. Works with PHPUnit tests as well as PHPT tests.

A cheap slicing script is provided for now at https://gist.github.com/mfn/256636242cbe8a51252ce28181a6b074

Example:

~/src/phpunit $ ./phpunit --list-tests-xml tests.xml
PHPUnit 9.4-g567e55e11 by Sebastian Bergmann and contributors.

Wrote list of tests that would have been run to tests.xml
~/src/phpunit $ ./phpunit_xml_slicer.php tests.xml 1/100 > partial_tests.xml
~/src/phpunit $ ./phpunit --tests-xml partial_tests.xml
PHPUnit 9.4-g567e55e11 by Sebastian Bergmann and contributors.

Runtime:       PHP 7.4.8
Configuration: /Users/neo/src/phpunit/phpunit.xml

.............................                                     29 / 29 (100%)

Time: 00:00.186, Memory: 22.00 MB

OK (29 tests, 29 assertions)
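
The slicing step in the example above could look roughly like the following. This is a minimal sketch, not the script from the linked gist: the function name `sliceTestsXml` is hypothetical, and it assumes that an argument like `1/100` means "slice 1 of 100", distributing tests round-robin across slices. The element names (`testCaseClass`, `testCaseMethod`, `phptFile`) are the ones that appear in the exported XML quoted later in this thread.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical slicer: keeps every $total-th test, starting at the
 * 1-based slice number $part, and drops everything else.
 */
function sliceTestsXml(string $xml, int $part, int $total): string
{
    if ($total < 1 || $part < 1 || $part > $total) {
        throw new InvalidArgumentException('expected 1 <= part <= total');
    }

    $doc = new DOMDocument();
    $doc->loadXML($xml);
    $xpath = new DOMXPath($doc);

    // Copy the result into an array so that removing nodes cannot
    // interfere with the iteration. The union is in document order.
    $tests = iterator_to_array(
        $xpath->query('/tests/testCaseClass/testCaseMethod | /tests/phptFile'),
        false
    );

    foreach ($tests as $index => $node) {
        if ($index % $total !== $part - 1) {
            $node->parentNode->removeChild($node);
        }
    }

    // Drop classes that are left without any test methods.
    $empty = iterator_to_array(
        $xpath->query('/tests/testCaseClass[not(testCaseMethod)]'),
        false
    );

    foreach ($empty as $class) {
        $class->parentNode->removeChild($class);
    }

    return $doc->saveXML();
}
```

Round-robin assignment is only one possible design choice; contiguous chunks would keep tests of the same class together at the cost of less even slices.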

Overview

  • Add a new --tests-xml <xml file> argument to PHPUnit
  • Implement new filter \PHPUnit\Runner\Filter\XmlTestsIterator which parses the XML and builds an internal data structure to identify what tests to accept()

I consider the PR to be in a functional state. Before finishing it, I'd like feedback on whether I missed anything and guidance on what kind of tests are best.

Considerations

  • The SimpleXML handling and parsing in \PHPUnit\Runner\Filter\XmlTestsIterator::setFilter is pretty crude. I have barely worked with XML in PHP in the last decade. I saw \PHPUnit\Util\Xml\Loader but wasn't sure whether it should be used, as it was only added a few days ago. I also liked the "simplicity" of SimpleXML for getting this up and running.
    => Replaced with ext-xml aka DomDocument and friends
  • I noticed that the filters (not just the new one I added) are instantiated multiple times, thus the XML is parsed many more times. No idea what/if to do here
  • I first wanted to name it --filter-tests-xml, which seemed more apt. But I ran into a strange problem with sebastian/cli-parser that I didn't really understand, so I just gave it a different name to get the ball rolling
Error when using --filter-tests-xml as option name
Testing started at 08:07 ...
/usr/local/bin/php /Users/user/src/phpunit/phpunit --configuration /Users/user/src/phpunit/phpunit.xml --teamcity
PHPUnit 9.4-g4719a8977 by Sebastian Bergmann and contributors.

Runtime:       PHP 7.4.8
Configuration: /Users/user/src/phpunit/phpunit.xml


Failed asserting that string matches format description.
--- Expected
+++ Actual
@@ @@
-PHPUnit %s by Sebastian Bergmann and contributors.
+string(11) "-tests-xml="
+int(31)
+int(87)
+string(6) "filter"
+string(7) "filter="
 
-...                                                                 3 / 3 (100%)
+Fatal error: Uncaught RuntimeException: dafuq in /Users/user/src/phpunit/vendor/sebastian/cli-parser/src/exceptions/AmbiguousOptionException.php:19
+Stack trace:
+#0 /Users/user/src/phpunit/vendor/sebastian/cli-parser/src/Parser.php(165): SebastianBergmann\CliParser\AmbiguousOptionException->__construct('--filter')
+#1 /Users/user/src/phpunit/vendor/sebastian/cli-parser/src/Parser.php(81): SebastianBergmann\CliParser\Parser->parseLongOption('filter', Array, Array, Array)
+#2 /Users/user/src/phpunit/src/TextUI/CliArguments/Builder.php(128): SebastianBergmann\CliParser\Parser->parse(Array, 'd:c:hv', Array)
+#3 /Users/user/src/phpunit/src/TextUI/Command.php(223): PHPUnit\TextUI\CliArguments\Builder->fromParameters(Array, Array)
+#4 /Users/user/src/phpunit/src/TextUI/Command.php(115): PHPUnit\TextUI\Command->handleArguments(Array)
+#5 /Users/user/src/phpunit/src/TextUI/Command.php(100): PHPUnit\TextUI\Command->run(Array, true)
+#6 Standard input code(9): PHPUnit\TextUI\Command::main()
+#7 {main}
 
-Time: %s, Memory: %s
-
-OK (3 tests, 3 assertions)
+Next PHPUnit\TextUI\Exception: dafuq in /User in /Users/user/src/phpunit/src/TextUI/Command.php on line 102

 /Users/user/src/phpunit/tests/end-to-end/filter-class-isolation.phpt:14
 /Users/user/src/phpunit/src/Framework/TestSuite.php:665
 /Users/user/src/phpunit/src/Framework/TestSuite.php:665
 /Users/user/src/phpunit/src/TextUI/TestRunner.php:668
 /Users/user/src/phpunit/src/TextUI/Command.php:147
 /Users/user/src/phpunit/src/TextUI/Command.php:100
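
The debug values dumped in the diff above (`-tests-xml=`, `31`, `87`, `filter`, `filter=`) look consistent with getopt-style matching of abbreviated long options: the registered options are sorted, `filter-tests-xml=` sorts before `filter=`, and `--filter` is a prefix of both. The following is a simplified model written for illustration, under that assumption. It is not the sebastian/cli-parser source, and the function name `resolveLongOption` is hypothetical.

```php
<?php
declare(strict_types=1);

/**
 * Simplified model of prefix-based long-option matching.
 * Returns the registered option that $given resolves to.
 *
 * @param list<string> $longOptions registered options, e.g. "filter="
 */
function resolveLongOption(string $given, array $longOptions): string
{
    // "filter-tests-xml=" sorts before "filter=" because '-' < '='.
    sort($longOptions);
    $count  = count($longOptions);
    $length = strlen($given);

    foreach ($longOptions as $i => $candidate) {
        if (strncmp($candidate, $given, $length) !== 0) {
            continue;
        }

        $rest = substr($candidate, $length);

        // The first candidate is only a prefix match, and the next
        // registered option starts with the same text: ambiguous.
        if ($rest !== '' && $rest[0] !== '=' && $i + 1 < $count &&
            strncmp($longOptions[$i + 1], $given, $length) === 0) {
            throw new RuntimeException("option --{$given} is ambiguous");
        }

        return $candidate;
    }

    throw new RuntimeException("unrecognized option --{$given}");
}
```

In this model, registering both `filter=` and `filter-tests-xml=` makes a plain `--filter` ambiguous, while `filter=` next to `tests-xml=` resolves cleanly, which would match the rename described above.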

TODO

  • Write tests, but what kind? PHPT? I'm not sure where to put files and fixtures, since there are so many already. Guidance appreciated!
    => a phpt test has been added
  • XML parsing? Improve? Change to DomDocument?
    => Indeed, changed to DomDocument

codecov bot commented Sep 8, 2020

Codecov Report

Merging #4449 (1e80f8a) into master (ba39893) will increase coverage by 0.09%.
The diff coverage is 90.47%.

@@             Coverage Diff              @@
##             master    #4449      +/-   ##
============================================
+ Coverage     83.54%   83.63%   +0.09%     
- Complexity     4554     4578      +24     
============================================
  Files           286      287       +1     
  Lines         11759    11820      +61     
============================================
+ Hits           9824     9886      +62     
+ Misses         1935     1934       -1     
Impacted Files                               Coverage Δ                  Complexity Δ
src/TextUI/Help.php                          100.00% <ø> (ø)             23.00 <0.00> (ø)
src/TextUI/CliArguments/Configuration.php    68.58% <87.50%> (+0.19%)    269.00 <4.00> (+3.00)
src/Runner/Filter/XmlTestsIterator.php       88.37% <88.37%> (ø)         17.00 <17.00> (?)
src/TextUI/CliArguments/Builder.php          80.56% <100.00%> (+0.18%)   120.00 <0.00> (+1.00)
src/TextUI/CliArguments/Mapper.php           79.51% <100.00%> (+0.24%)   82.00 <0.00> (+1.00)
src/TextUI/TestRunner.php                    65.71% <100.00%> (+1.30%)   219.00 <0.00> (+2.00)
src/Runner/DefaultTestResultCache.php        93.65% <0.00%> (+1.58%)     27.00% <0.00%> (ø%)

Δ = absolute <relative> (impact), ø = not affected, ? = missing data

@sebastianbergmann sebastianbergmann added feature/test-runner CLI test runner type/enhancement A new idea that should be implemented labels Sep 8, 2020

// Regular PHPUnit tests
foreach ($xml->testCaseClass as $class) {
$className = $class['name']->__toString();
Can we make this typesafe? Maybe through an annotation like this:

/** @var array{name: ?} $class */

Author

Thanks for all the feedback!

I'll hold off on such typings until it's clear which direction the PR takes.

I've never really used SimpleXML before; this is all highly magic. $class is not an array, it's a SimpleXMLElement providing lots-o-magic stuff…

Owner

I appreciate the time you invest in preparing this pull request. Please note, though, that I will not accept this if it introduces ext/simplexml usage.

Author

I've replaced SimpleXML with DomDocument and re-used the existing \PHPUnit\Util\Xml\Loader::loadFile

@spawnia spawnia left a comment

This is great @mfn, looks pretty good overall and opens the door for some very important functionality.

@mfn mfn force-pushed the mfn-tests-xml branch 3 times, most recently from f594394 to b10d64a Compare September 8, 2020 21:12
@mfn mfn marked this pull request as ready for review September 8, 2020 21:23
Collaborator

theseer commented Sep 11, 2020

Can we please stop using non-namespaced XML and provide XSDs for them?

Author

mfn commented Oct 9, 2020

ping, anyone?

Collaborator

theseer commented Oct 9, 2020

Not sure what you're expecting at this moment.

While this PR currently has conflicts that should get resolved, my comment regarding the lack of a namespace and an XSD is still unanswered. I do realize you're only relying on the export, which is already broken in that regard. But before building more things that rely on this, we should fix that, imho.

Contributor

epdenouden commented Oct 11, 2020

@mfn I am working on writing a more detailed review. Long story very short: the idea of filtering tests via an XML-file and the main Iterator is fine, but it needs refactoring.

I am also wary of any iterator work getting in the way of the coming redesign of the core iterator/filtering mechanisms. If this PR gets accepted as-is, I would have to refactor this new functionality, too.

}

/** @var TestCase $test */
$testClass = get_class($test);
Contributor

There is a way to ask anything that implements Test for its own unique identity; have a look at https://github.com/sebastianbergmann/phpunit/blob/master/src/Framework/Reorderable.php and the uses of Reorderable.

If you need more details about the (origins of) a test, let me know the use cases. It would be best if we can extend a central mechanism for identifying and locating tests.

Author

I want to concentrate on this part first before addressing the others, since it might have an impact on them.

When a filter receives a test in accept(), I have to figure out whether it's part of the desired filter, i.e. the XML being fed back to phpunit.

The format of the XML is defined via \PHPUnit\Util\XmlTestListRenderer::render

if (get_class($test) !== $currentTestCase) {

and creates an XML structure like this:

<?xml version="1.0"?>
<tests>
 <testCaseClass name="PHPUnit\Framework\FunctionsTest">
  <testCaseMethod name="testGlobalFunctionsFileContainsAllStaticAssertions" groups="default" dataSet="&quot;assertArrayHasKey&quot;"/>
…
 </testCaseClass>
 <phptFile path="/absolute/path/phpunit/tests/end-to-end/abstract-test-class.phpt"/>
…
</tests>

And this now explains why I'm using get_class() and not something else: as a consumer of the XML, I've to match the producer.

For \PHPUnit\Framework\TestCase, this is further qualified by the dataSet attribute, which is produced using this code:

if (!empty($test->getDataSetAsString(false))) {
$writer->writeAttribute(
'dataSet',
str_replace(
' with data set ',
'',
$test->getDataSetAsString(false)
)
);
}

As can be seen, this manually removes some parts of \PHPUnit\Framework\TestCase::getDataSetAsString and writes it to the XML.

For that reason, when reading the XML I'm reconstructing the original "data as string" with this:

            $name                            = "{$methodName} with data set {$dataSet}";
            $this->filter[$className][$name] = true;

so that in accept() I can just call

        $name = $test->getName();

and getName internally calls getDataSetAsString, so that in the end by calling:

  • $testClass = get_class($test); in accept()
    and doing
  • "{$methodName} with data set {$dataSet}" in setFilter
    I've built the matching mirror logic for consuming what was produced.
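
The mirror logic described above can be sketched as a round trip. This is an illustration only: `dataSetAttribute` and `filterKey` are hypothetical helper names, and the sample input assumes that `getDataSetAsString(false)` yields a string of the form ` with data set "…"`, as the quoted producer code suggests.

```php
<?php
declare(strict_types=1);

/**
 * Producer side, mirroring the quoted renderer code: strip the
 * " with data set " marker before writing the dataSet attribute.
 */
function dataSetAttribute(string $dataSetAsString): string
{
    return str_replace(' with data set ', '', $dataSetAsString);
}

/**
 * Consumer side: rebuild the name a test reports for itself, so the
 * key can be compared directly against the test's name in accept().
 */
function filterKey(string $methodName, string $dataSet): string
{
    if ($dataSet === '') {
        // Tests without a data provider have no dataSet attribute.
        return $methodName;
    }

    return "{$methodName} with data set {$dataSet}";
}
```

Because the producer removes the marker and the consumer re-adds it, any change to the marker text on one side would silently break matching on the other.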

I guess some of this "peekaboo" here is reflected in #4449 (comment)


So currently I don't see how e.g. \PHPUnit\Framework\TestCase::sortId (of \PHPUnit\Framework\Reorderable) helps me here, as the generated value is not usable in the context of what is produced in the XML.

As for the format used in the \PHPUnit\Runner\Filter\XmlTestsIterator::$filter, I tried to document it:

    /**
     * The filter is used as a fast lookup for
     * - the class name
     * - the method name plus its optional data set description.
     *
     * Example: `filter[class name][method name + data set] = true;`
     *
     * The accept() method then can use a fast isset() to check if a test should
     * be included or not.
     *
     * This works equally for phpt tests, except we hardcode the class name.
     *
     * @var array<string,array<string,true>>
     */
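
As a rough illustration of the structure documented above, the following sketch builds the nested lookup array from an exported XML string and checks membership with isset(). It is not the PR's implementation: `buildFilter`, `accepts`, and the `PHPT_KEY` constant are hypothetical names, and keying phpt files by their path under a single hardcoded class name is an assumption based on the docblock.

```php
<?php
declare(strict_types=1);

// Assumed hardcoded "class name" under which phpt files are stored.
const PHPT_KEY = 'PHPUnit\Runner\PhptTestCase';

/**
 * Builds filter[class name][method name + data set] = true.
 *
 * @return array<string,array<string,true>>
 */
function buildFilter(string $xml): array
{
    $doc = new DOMDocument();
    $doc->loadXML($xml);
    $xpath  = new DOMXPath($doc);
    $filter = [];

    foreach ($xpath->query('/tests/testCaseClass') as $class) {
        $className = $class->getAttribute('name');

        // Relative query: only the methods of the current class.
        foreach ($xpath->query('testCaseMethod', $class) as $method) {
            $name    = $method->getAttribute('name');
            $dataSet = $method->getAttribute('dataSet');

            if ($dataSet !== '') {
                $name = "{$name} with data set {$dataSet}";
            }

            $filter[$className][$name] = true;
        }
    }

    foreach ($xpath->query('/tests/phptFile') as $phpt) {
        $filter[PHPT_KEY][$phpt->getAttribute('path')] = true;
    }

    return $filter;
}

/** Constant-time membership check, as the docblock describes. */
function accepts(array $filter, string $testClass, string $testName): bool
{
    return isset($filter[$testClass][$testName]);
}
```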

*/
private function setFilter(string $xmlFile): void
{
$xml = (new XmlLoader())->loadFile($xmlFile);
Contributor

Since the iterator gets instantiated multiple times, this becomes a very expensive implementation. Loading the configuration should be done once, stored somewhere and then used as a lookup in the iterator instances.

Currently this would be somewhere around TestRunner::processSuiteFilters().

Author

Thanks, I took a closer look and finally figured out why processSuiteFilters is only executed once but the filter's constructor is called multiple times: I hadn't understood the reflection-based instantiation the first time around.

I'm not sure where exactly to put the code that a) loads the XML, b) parses it and c) generates the data structure for fast lookup. To satisfy the performance concerns for now, I left the code within XmlTestsIterator as static methods and call these helpers from processSuiteFilters.
What I find "nice" about this is that generating the filter and consuming it in accept() live in the same code space; I like to think this makes it easier to understand.

I'm happy to move it somewhere else, but wasn't sure if putting them directly into \PHPUnit\TextUI\TestRunner would be a good idea either.
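
One way to picture the "load once, look up many times" requirement is a small memoizing holder, sketched below. This is an alternative illustration, not what the PR does (the PR calls static helpers from processSuiteFilters); the class name `FilterCache`, its `$loads` counter, and the `$parse` callback are all hypothetical.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical cache: parses each XML file at most once, no matter
 * how many iterator instances ask for its filter.
 */
final class FilterCache
{
    /** @var array<string,array> parsed filters, keyed by file path */
    private static array $cache = [];

    /** Number of times a file actually had to be parsed. */
    public static int $loads = 0;

    /**
     * @param callable(string):array $parse turns a file path into a filter
     */
    public static function get(string $xmlFile, callable $parse): array
    {
        if (!array_key_exists($xmlFile, self::$cache)) {
            self::$loads++;
            self::$cache[$xmlFile] = $parse($xmlFile);
        }

        return self::$cache[$xmlFile];
    }
}
```

Static state like this is convenient but makes testing harder; passing an already-built lookup array into each iterator keeps the iterator itself stateless.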

private function extractTestCases(DOMXPath $xpath): Generator
{
/** @var DOMElement $class */
foreach ($xpath->evaluate('/tests/testCaseClass') as $class) {
Contributor

Same as loading the XML-file for every new Iterator: please do this once and use a centralized lookup.

Author

@mfn mfn Dec 9, 2020

{
/* @var DOMElement $phptFile */
foreach ($xpath->evaluate('/tests/phptFile') as $phptFile) {
$path = $phptFile->getAttribute('path');
Contributor

Question: why does CodeCov mark these lines as not-reached?

$path = $phptFile->getAttribute('path');

if ($path) {
yield $path;
Contributor

Question: why does CodeCov mark these lines as not-reached?

@@ -1145,6 +1147,13 @@ private function processSuiteFilters(TestSuite $suite, array $arguments): void
);
}

if (!empty($arguments['testsXml'])) {
Contributor

Around here is the place to load+parse your configuration and get a list of tests. Perhaps you can even adapt/reuse the current NameFilterIterator, as it's basically what it is. :)

Author

mfn commented Dec 7, 2020

For some reason I never got the email notification for the review feedback in #4449 (review); I only discovered it now 💥

I'll try to find time ASAP to address things!

Author

@mfn mfn left a comment

@epdenouden thanks for your feedback!

I explained some specifics of the accept() implementation, as I'm not sure what I could do to improve its "interoperability" with the consumed XML.

Technically I did address the performance concern regarding the XML parsing, though I'm not yet sure whether the code location is fine.

I'll address code coverage feedback once we all agree what the final implementation is supposed to look like to move forward.

Thank you!

Author

mfn commented Dec 19, 2020

Btw, is it fine to target master or should I target another branch, perhaps 9.5?

Author

mfn commented Feb 16, 2021

Is there anything else I can do? AFAIK I addressed the feedback, or did I miss anything that is blocking us?

Contributor

Let me bump this ticket. I would like to help (if any help is needed), because this feature will greatly help us at Infection with filtering which tests to run (including data sets).

Collaborator

theseer commented Jul 1, 2021

I didn't look into the Code itself, as @epdenouden already did that and I trust he'll review the current state as well ;)

But I'd like to repeat my statement made earlier: We should clean up the XML produced before we add more functionality and have (more) external projects rely on it. At the very least, let us please add a namespace with a version identifier so any future upgrades and changes are not going to kill us.
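
To make the suggestion concrete, here is a sketch of what consuming a namespaced, versioned test list could look like. Everything specific in it is an assumption for illustration: the namespace URI is a placeholder, PHPUnit defines no such namespace in this thread, and `classNames` is a hypothetical function.

```php
<?php
declare(strict_types=1);

// Placeholder URI with a version segment; purely hypothetical.
const TEST_LIST_NS = 'https://example.org/test-list/1.0';

/**
 * Reads class names from a namespaced test list and rejects documents
 * that use a different (or no) namespace, e.g. a future format version.
 *
 * @return list<string>
 */
function classNames(string $xml): array
{
    $doc = new DOMDocument();
    $doc->loadXML($xml);

    if ($doc->documentElement->namespaceURI !== TEST_LIST_NS) {
        throw new RuntimeException('unsupported test list format');
    }

    $xpath = new DOMXPath($doc);
    // XPath needs an explicit prefix, even for a default namespace.
    $xpath->registerNamespace('t', TEST_LIST_NS);

    $names = [];

    foreach ($xpath->query('/t:tests/t:testCaseClass') as $class) {
        $names[] = $class->getAttribute('name');
    }

    return $names;
}
```

With the version carried in the namespace, a consumer can fail fast on an unknown format instead of silently matching nothing.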

Author

mfn commented Jul 3, 2021

I definitely feel caught in a catch-22 here, and I get no clear indication of how I can move forward.

It feels out of scope for me to tackle the "XML / namespace / schema" problem here; that's not a problem I created and it seems "higher up" to me.

At one point I tried to update the PR and resolve the merge conflicts, but I got demotivated by the lack of feedback and specific, actionable items, so I didn't touch it for some time.

In any case, I'm about to leave for vacation and can't provide updates within the next few weeks. I'll check the PR feedback when I get back, but I'm also happy if someone else steps up and assists here 🙏

Owner

Thank you for your contribution. I appreciate the time you invested in preparing this pull request. However, I have decided not to merge it.

Development

Successfully merging this pull request may close these issues.

Specify a list of tests to run Provide a way to accept a list of tests
6 participants