Skip to content

makes mb_ereg clear the $regs argument on failure #2046

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 1 commit into from
Closed
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
7 changes: 5 additions & 2 deletions ext/mbstring/php_mbregex.c
Original file line number Diff line number Diff line change
Expand Up @@ -704,6 +704,11 @@ static void _php_mb_regex_ereg_exec(INTERNAL_FUNCTION_PARAMETERS, int icase)
RETURN_FALSE;
}

if (array != NULL) {
zval_dtor(array);
array_init(array);
}

options = MBREX(regex_default_options);
if (icase) {
options |= ONIG_OPTION_IGNORECASE;
Expand Down Expand Up @@ -743,8 +748,6 @@ static void _php_mb_regex_ereg_exec(INTERNAL_FUNCTION_PARAMETERS, int icase)
str = string;
if (array != NULL) {
match_len = regs->end[0] - regs->beg[0];
zval_dtor(array);
array_init(array);
for (i = 0; i < regs->num_regs; i++) {
beg = regs->beg[i];
end = regs->end[i];
Expand Down
30 changes: 19 additions & 11 deletions ext/mbstring/tests/bug43994.phpt
Original file line number Diff line number Diff line change
Expand Up @@ -8,12 +8,12 @@ function_exists('mb_ereg') or die("skip mb_ereg() is not available in this build
--FILE--
<?php
/* Prototype : int mb_ereg(string $pattern, string $string [, array $registers])
* Description: Regular expression match for multibyte string
* Description: Regular expression match for multibyte string
* Source code: ext/mbstring/php_mbregex.c
*/

/*
* mb_ereg 'successfully' matching incorrectly:
* mb_ereg 'successfully' matching incorrectly:
* Bug now seems to be fixed - error message is now generated when an 'empty'
* pattern is supplied to mb_ereg. Similar error message to ereg().
*/
Expand All @@ -38,7 +38,7 @@ foreach($inputs as $input) {
};
?>

--EXPECTF--
--EXPECTF--

-- Iteration 1 --
Without $regs arg:
Expand All @@ -49,7 +49,8 @@ With $regs arg:

Warning: mb_ereg(): empty pattern in %s on line %d
bool(false)
NULL
array(0) {
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IMO, that change (yield empty array instead of null) makes sense, but I'm not sure whether it should be part of this bug fix. Sticking with the current behavior might avoid BC breaks.

Copy link
Contributor Author

@ju1ius ju1ius Jul 30, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@cmb69 The function did not yield null, it just didn't modify the zval you passed as an argument when the match fails. If it was a string it stays a string, if it was not defined it will be null, etc...

This behavior was not consitent with preg_match, was highly counter-intuitive and looked like a bug to me:

$pattern = '(a)\1';
mb_ereg($pattern, 'aa', $matches);
// $matches = ['aa', 'a']
mb_ereg($pattern, 'WTF?', $matches);
// $matches = ['aa', 'a'] <-- obviously, this ain't right...

preg_match($pattern, 'aa', $matches);
// $matches = ['aa', 'a']
preg_match($pattern, 'WTF?', $matches);
// $matches = [] <-- that's more like it

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But of course, I can rebase on another branch if it's too much a BC break.
Just tell me.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The function did not yield null, it just didn't modify the zval you passed as an argument when the match fails. If it was a string it stays a string, if it was not defined it will be null, etc...

Ah, now I understand! Thanks.

I agree that the behavior of preg_match() makes more sense, but mb_ereg() behaves like ereg(). Changing the behavior of the former, but not that of the latter, would be inconsistent. However, I don't think changing ereg() makes sense, as it has been deprecated as of PHP 7.0.0.

Existing code using (mb_)ereg() falls into two categories: code that does properly respect the current behavior (either by setting $regs to an explicit default before calling (mb_)ereg(), or by checking the return value of (mb_)ereg() before accessing $regs), and code that doesn't. The suggested change would only affect the latter, so one might argue that the BC break is acceptable (we might even fix such broken code).

So, I'm not sure what to do here (except wrt. the docs, which have to be fixed).

What do others think?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think changing ereg() makes sense, as it has been deprecated as of PHP 7.0.0.

Actually, it had already been deprecated as of PHP 5.3.0, and removed as of PHP 7.0.0.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nonetheless, the type declaration of the $regs parameter in the function prototype is array, what doesn't match the current behavior. However, the function prototype doesn't match the function's behavior anyway, what can be observed with strict_types=1.

@cmb69 Sorry I don't understand exactly what you mean. Should I close this PR or do you think it is useful ?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should I close this PR or do you think it is useful ?

I still think it is useful, as I prefer the behavior it introduces (same as preg_match()), and as it would fix the issue with the currently wrong function prototype (array $regs). However, firstly I considered it important to fix the docs (which I considered badly written), and secondly I'm still unsure to which PHP version this PR should be applied to (I would suggest at least master, i.e. PHP 7.2, but maybe even targetting PHP 5.6+ is appropriate).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

However, firstly I considered it important to fix the docs

Totally agree on that. I was thinking myself of adding an entire chapter dedicated to the Oniguruma pattern syntax and the similarities/differences with PCRE.
Do you think I should start ?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you think I should start ?

Yes, please! :-)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, please! :-)

@cmb69 Okay I'm on it! 😉


-- Iteration 2 --
Without $regs arg:
Expand All @@ -60,7 +61,8 @@ With $regs arg:

Warning: mb_ereg(): empty pattern in %s on line %d
bool(false)
NULL
array(0) {
}

-- Iteration 3 --
Without $regs arg:
Expand All @@ -71,7 +73,8 @@ With $regs arg:

Warning: mb_ereg(): empty pattern in %s on line %d
bool(false)
NULL
array(0) {
}

-- Iteration 4 --
Without $regs arg:
Expand All @@ -82,7 +85,8 @@ With $regs arg:

Warning: mb_ereg(): empty pattern in %s on line %d
bool(false)
NULL
array(0) {
}

-- Iteration 5 --
Without $regs arg:
Expand All @@ -93,7 +97,8 @@ With $regs arg:

Warning: mb_ereg(): empty pattern in %s on line %d
bool(false)
NULL
array(0) {
}

-- Iteration 6 --
Without $regs arg:
Expand All @@ -104,7 +109,8 @@ With $regs arg:

Warning: mb_ereg(): empty pattern in %s on line %d
bool(false)
NULL
array(0) {
}

-- Iteration 7 --
Without $regs arg:
Expand All @@ -115,7 +121,8 @@ With $regs arg:

Warning: mb_ereg(): empty pattern in %s on line %d
bool(false)
NULL
array(0) {
}

-- Iteration 8 --
Without $regs arg:
Expand All @@ -126,4 +133,5 @@ With $regs arg:

Warning: mb_ereg(): empty pattern in %s on line %d
bool(false)
NULL
array(0) {
}
11 changes: 7 additions & 4 deletions ext/mbstring/tests/mb_ereg1.phpt
Original file line number Diff line number Diff line change
Expand Up @@ -19,15 +19,16 @@ foreach ($a as $args) {
}
?>
===DONE===
--EXPECTF--
--EXPECTF--
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd prefer to avoid whitespace changes in bug fixes (and feature implementations, for that matter), because that makes it easier to read the diff. (Yeah, I know that these trailing whitespaces shouldn't be there in the first place, but a separate commit/pull request might be more appropriate to fix this.)

Copy link
Contributor Author

@ju1ius ju1ius Jul 30, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@cmb69 Ah yeah sorry, this is automatic editor behavior...
I usually stage things interactively, but I guess I was being a little too enthusiastic... 😅

bool(false)
array(3) {
[0]=>
int(1)
[1]=>
int(2)
[2]=>
int(3)
array(0) {
}
}

Warning: mb_ereg(): empty pattern in %s on line %d
Expand All @@ -38,7 +39,8 @@ array(3) {
[1]=>
string(0) ""
[2]=>
string(0) ""
array(0) {
}
}

Notice: Array to string conversion in %s on line %d
Expand All @@ -50,7 +52,8 @@ array(3) {
[1]=>
int(1)
[2]=>
string(0) ""
array(0) {
}
}

Warning: mb_ereg() expects parameter 2 to be string, array given in %s on line %d
Expand Down
9 changes: 5 additions & 4 deletions ext/mbstring/tests/mb_ereg_basic.phpt
Original file line number Diff line number Diff line change
@@ -1,5 +1,5 @@
--TEST--
Test mb_ereg() function : basic functionality
Test mb_ereg() function : basic functionality
--SKIPIF--
<?php
extension_loaded('mbstring') or die('skip');
Expand All @@ -8,7 +8,7 @@ function_exists('mb_ereg') or die("skip mb_ereg() is not available in this build
--FILE--
<?php
/* Prototype : int mb_ereg(string $pattern, string $string [, array $registers])
* Description: Regular expression match for multibyte string
* Description: Regular expression match for multibyte string
* Source code: ext/mbstring/php_mbregex.c
*/

Expand Down Expand Up @@ -113,5 +113,6 @@ array(3) {
string(8) "MTIzNA=="
}
bool(false)
NULL
Done
array(0) {
}
Done
48 changes: 31 additions & 17 deletions ext/mbstring/tests/mb_ereg_variation1.phpt
Original file line number Diff line number Diff line change
Expand Up @@ -8,8 +8,8 @@ function_exists('mb_ereg') or die("skip mb_ereg() is not available in this build
--FILE--
<?php
/* Prototype : int mb_ereg(string $pattern, string $string [, array $registers])
* Description: Regular expression match for multibyte string
* Source code: ext/mbstring/php_mbregex.c
* Description: Regular expression match for multibyte string
* Source code: ext/mbstring/php_mbregex.c
*/

/*
Expand Down Expand Up @@ -65,7 +65,7 @@ $inputs = array(
/*12*/ "string",
'string',
$heredoc,

// object data
/*15*/ new classA(),

Expand Down Expand Up @@ -95,47 +95,58 @@ echo "Done";

-- Iteration 1 --
bool(false)
NULL
array(0) {
}

-- Iteration 2 --
bool(false)
NULL
array(0) {
}

-- Iteration 3 --
bool(false)
NULL
array(0) {
}

-- Iteration 4 --
bool(false)
NULL
array(0) {
}

-- Iteration 5 --
bool(false)
NULL
array(0) {
}

-- Iteration 6 --
bool(false)
NULL
array(0) {
}

-- Iteration 7 --
bool(false)
NULL
array(0) {
}

-- Iteration 8 --
bool(false)
NULL
array(0) {
}

-- Iteration 9 --
bool(false)
NULL
array(0) {
}

-- Iteration 10 --
bool(false)
NULL
array(0) {
}

-- Iteration 11 --
bool(false)
NULL
array(0) {
}

-- Iteration 12 --
int(6)
Expand All @@ -153,13 +164,16 @@ array(1) {

-- Iteration 14 --
bool(false)
NULL
array(0) {
}

-- Iteration 15 --
bool(false)
NULL
array(0) {
}

-- Iteration 16 --
bool(false)
NULL
array(0) {
}
Done
Loading