Invalid utf8 causing partially missing output #2256

Closed
radiat-r opened this issue Aug 1, 2016 · 1 comment

Comments

radiat-r commented Aug 1, 2016

phpunit/src/Util/Printer.php provides a function write, which has this line:
$buffer = htmlspecialchars($buffer);

The use of htmlspecialchars in combination with invalid UTF-8 is problematic. From the PHP manual entry for htmlspecialchars:

If the input string contains an invalid code unit sequence within the given encoding an empty string will be returned, unless either the ENT_IGNORE or ENT_SUBSTITUTE flags are set.
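
The quoted behaviour can be checked in isolation. The following is a minimal sketch, not PHPUnit code: the variable names are made up for illustration, and the flags and encoding are passed explicitly because the defaults of htmlspecialchars differ between PHP versions (since PHP 8.1 the default flags already include ENT_SUBSTITUTE).

```php
<?php

// "\xbf" is a lone continuation byte, so this string is not valid UTF-8.
$buffer = "a\xbfb";

// No ENT_IGNORE / ENT_SUBSTITUTE: the whole result becomes an empty string.
$strict = htmlspecialchars($buffer, ENT_COMPAT | ENT_HTML401, 'UTF-8');

// ENT_SUBSTITUTE: the invalid byte is replaced by U+FFFD (bytes EF BF BD).
$substituted = htmlspecialchars($buffer, ENT_COMPAT | ENT_HTML401 | ENT_SUBSTITUTE, 'UTF-8');

// ENT_IGNORE: the invalid byte is silently dropped.
$ignored = htmlspecialchars($buffer, ENT_COMPAT | ENT_HTML401 | ENT_IGNORE, 'UTF-8');

var_dump($strict === '');
var_dump($substituted === "a\xEF\xBF\xBDb");
var_dump($ignored === 'ab');
```

Note that the valid characters around the bad byte are lost as well in the first case, which is why whole sections of the report disappear.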

This can lead to irritating behaviour when Printer->write is fed such input.

Take for example the following simple test:

<?php

class Invalid_UTF8_Test extends PHPUnit_Framework_TestCase
{
    public function test_invalid_utf8()
    {
        $this->assertSame("\xbf", '');
    }
}

The output of phpunit will look like this:

PHPUnit 5.2.4 by Sebastian Bergmann and contributors.

Runtime:       PHP 5.6.22 with Xdebug 2.4.0
Configuration: XXX/phpunit.xml

F                                                                   1 / 1 (100%)

Time: 167 ms, Memory: 7.50MB

There was 1 failure:

1) Invalid_UTF8_Test::test_invalid_utf8

FAILURES!
Tests: 1, Assertions: 1, Failures: 1.

It reports that a failure occurred, but not where exactly or what went wrong.

It gets even worse when you use a dataProvider:

<?php

class Invalid_UTF8_Test extends PHPUnit_Framework_TestCase
{
    public function provider_invalid_utf8()
    {
        return [
            ["\xbf"],
        ];
    }

    /**
     * @dataProvider provider_invalid_utf8
     */
    public function test_invalid_utf8($input)
    {
        $this->assertSame($input, '');
    }
}

Now the output will look like this:

PHPUnit 5.2.4 by Sebastian Bergmann and contributors.

Runtime:       PHP 5.6.22 with Xdebug 2.4.0
Configuration: XXX/phpunit.xml

F                                                                   1 / 1 (100%)

Time: 148 ms, Memory: 7.75MB

There was 1 failure:

FAILURES!
Tests: 1, Assertions: 1, Failures: 1.

Now you can't even see in which test the failure occurred.

The issue is solved when the above line in Printer->write is changed to:
$buffer = htmlspecialchars($buffer, ENT_SUBSTITUTE);
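
One detail may be worth checking before adopting this exact call: the second argument replaces the default flags rather than adding to them, so passing ENT_SUBSTITUTE alone also drops ENT_COMPAT and double quotes are no longer escaped. The sketch below illustrates the difference; the sample string and variable names are hypothetical, and whether the combined form is preferable for Printer->write is an assumption, not something confirmed in this thread.

```php
<?php

// Sample input with double quotes and one invalid UTF-8 byte.
$buffer = "say \"hi\" \xbf";

// ENT_SUBSTITUTE on its own: invalid byte replaced, double quotes left as-is.
$onlySubstitute = htmlspecialchars($buffer, ENT_SUBSTITUTE, 'UTF-8');

// ENT_COMPAT | ENT_SUBSTITUTE: invalid byte replaced, double quotes escaped.
$combined = htmlspecialchars($buffer, ENT_COMPAT | ENT_SUBSTITUTE, 'UTF-8');

var_dump($onlySubstitute === "say \"hi\" \xEF\xBF\xBD");
var_dump($combined === "say &quot;hi&quot; \xEF\xBF\xBD");
```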

While this may look like a constructed example, I actually hit this problem when writing tests to check how our code handles invalid UTF-8.

sgabler commented Aug 18, 2016

Looks like this could be turned into a one-line pull-request, @radiat-r? Or does changing it to $buffer = htmlspecialchars($buffer, ENT_SUBSTITUTE); give anyone cause for concern?
