-
Notifications
You must be signed in to change notification settings - Fork 7.9k
makes mb_ereg
clear the $regs
argument on failure
#2046
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
When `mb_ereg` failed to match, it didn't update the `$regs` argument. Now it will always set it to the empty array. Fixes bug #72711
@@ -49,7 +49,8 @@ With $regs arg: | |||
|
|||
Warning: mb_ereg(): empty pattern in %s on line %d | |||
bool(false) | |||
NULL | |||
array(0) { | |||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IMO, that change (yield empty array instead of null) makes sense, but I'm not sure whether it should be part of this bug fix. Sticking with the current behavior might avoid BC breaks.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@cmb69 The function did not yield null, it just didn't modify the zval
you passed as an argument when the match fails. If it was a string it stays a string, if it was not defined it will be null, etc...
This behavior was not consitent with preg_match
, was highly counter-intuitive and looked like a bug to me:
$pattern = '(a)\1';
mb_ereg($pattern, 'aa', $matches);
// $matches = ['aa', 'a']
mb_ereg($pattern, 'WTF?', $matches);
// $matches = ['aa', 'a'] <-- obviously, this ain't right...
preg_match($pattern, 'aa', $matches);
// $matches = ['aa', 'a']
preg_match($pattern, 'WTF?', $matches);
// $matches = [] <-- that's more like it
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But of course, I can rebase on another branch if it's too much a BC break.
Just tell me.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The function did not yield null, it just didn't modify the zval you passed as an argument when the match fails. If it was a string it stays a string, if it was not defined it will be null, etc...
Ah, now I understand! Thanks.
I agree that the behavior of preg_match() makes more sense, but mb_ereg() behaves like ereg(). Changing the behavior of the former, but not that of the latter, would be inconsistent. However, I don't think changing ereg() makes sense, as it has been deprecated as of PHP 7.0.0.
Existing code using (mb_)ereg() falls into two categories: code that does properly respect the current behavior (either by setting $regs to an explicit default before calling (mb_)ereg(), or by checking the return value of (mb_)ereg() before accessing $regs), and code that doesn't. The suggested change would only affect the latter, so one might argue that the BC break is acceptable (we might even fix such broken code).
So, I'm not sure what to do here (except wrt. the docs, which have to be fixed).
What do others think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think changing ereg() makes sense, as it has been deprecated as of PHP 7.0.0.
Actually, it had already been deprecated as of PHP 5.3.0, and removed as of PHP 7.0.0.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nonetheless, the type declaration of the $regs parameter in the function prototype is array, what doesn't match the current behavior. However, the function prototype doesn't match the function's behavior anyway, what can be observed with strict_types=1.
@cmb69 Sorry I don't understand exactly what you mean. Should I close this PR or do you think it is useful ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should I close this PR or do you think it is useful ?
I still think it is useful, as I prefer the behavior it introduces (same as preg_match()), and as it would fix the issue with the currently wrong function prototype (array $regs
). However, firstly I considered it important to fix the docs (which I considered badly written), and secondly I'm still unsure to which PHP version this PR should be applied to (I would suggest at least master, i.e. PHP 7.2, but maybe even targetting PHP 5.6+ is appropriate).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
However, firstly I considered it important to fix the docs
Totally agree on that. I was thinking myself of adding an entire chapter dedicated to the Oniguruma pattern syntax and the similarities/differences with PCRE.
Do you think I should start ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you think I should start ?
Yes, please! :-)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, please! :-)
@cmb69 Okay I'm on it! 😉
Should mb_ereg be removed/deprecated like ereg was? |
I don't think so. mb_ereg() does support several multibyte encodings, but preg_match() only supports UTF-8, so one would have to temporarily convert to UTF-8 for regexps matching. |
@staabm The naming is indeed infortunate. Combined to the rather spartan documentation, it leads people (as I did) into thinking the
But that's not directly related to this issue 😉 |
Comment on behalf of cmb at php.net: I've merged this PR into PHP 7.1+ now. I'm still not comfortable with changing the behavior for older versions, but 7.1 seems to be fine as it's still in beta phase. |
When
mb_ereg
failed to match, it didn't update the$regs
argument.Now it will always set it to the empty array.
Fixes bug #72711