Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

str_split must return empty array for empty string #8924

Closed
mvorisek opened this issue Jul 5, 2022 · 5 comments
Closed

str_split must return empty array for empty string #8924

mvorisek opened this issue Jul 5, 2022 · 5 comments

Comments

@mvorisek
Copy link
Contributor

mvorisek commented Jul 5, 2022

Description

The following code:

<?php

print_r(str_split('', 1));
print_r(mb_str_split('', 1));
print_r(mb_str_split('', 1, 'UTF-8'));
print_r(mb_str_split('', 1, 'ASCII'));
print_r(array_chunk([], 1));

https://3v4l.org/1WsGi

Resulted in but I expected this output instead:

 Array
 (
-    [0] => 
 )
 Array
 (
 )
 Array
 (
 )
 Array
 (
 )
 Array
 (
 )

I belive this is an inconsistency that should be fixed.

The inconsistency was introduced in 0818fae by a mistake as found out by @emkey08 in the discussion below.

PHP Version

PHP 7.4 - master

Operating System

any

@emkey08
Copy link

emkey08 commented Jul 6, 2022

Wouldn't it actually be the correct solution instead if both str_split('') and mb_str_split('') would return an empty array?

Splitting a string is supposed to return an array of the string's characters. An empty string has zero characters, so the expected and logical result would be an empty array (that is, an array containing zero characters).

When splitting a string, it should hold that count(str_split($str)) is equivalent to strlen($str). However, this currently isn't the case if the input string is empty.

@emkey08
Copy link

emkey08 commented Jul 7, 2022

When str_split was initially added to PHP 5.0, str_split('') actually returned an empty array. See here.

This commit changed it, but this obviously was an unintended (and also undocumented) side effect. The commit claims to be a performance optimization only, while in fact it was a functional change: the commit author simply didn't correctly consider the case when the input string is empty. All later PHP versions built upon this change, and so the current str_split behavior is actually based on a mistake.

It would of course be a breaking change to fix str_split('') to return an empty array, but note that the current implementation also causes things to break. It's quite common to use foreach (str_split($str) as $char) to iterate over a string, but you must not assume strlen($char) === 1 inside that loop, or your code will break. Bummer.

@emkey08
Copy link

emkey08 commented Jul 7, 2022

As a final note, the current documentation also claims

The default length is 1, meaning every chunk will be one byte in size.

which obviously isn't true.

@mvorisek
Copy link
Contributor Author

mvorisek commented Jul 7, 2022

When str_split was initially added to PHP 5.0, str_split('') actually returned an empty array. See here.

This commit changed it, but this obviously was an unintended (and also undocumented) side effect. The commit claims to be a performance optimization only, while in fact it was a functional change: the commit author simply didn't correctly consider the case when the input string is empty. All later PHP versions built upon this change, and so the current str_split behavior is actually based on a mistake.

👍, that "optimization commit" should be fixed. Then it would be consistent also with array_chunk behaviour - https://3v4l.org/1QLKU .

@mvorisek mvorisek changed the title mb_str_split must return one empty element for empty string str_split must return empty array for empty string Jul 7, 2022
mvorisek added a commit to mvorisek/php-src that referenced this issue Jul 7, 2022
mvorisek added a commit to mvorisek/php-src that referenced this issue Jul 7, 2022
@cmb69
Copy link
Contributor

cmb69 commented Jul 7, 2022

I tend to agree that this is a bug (although one that should not be fixed in a revision), and not merely a documentation issue.

mvorisek added a commit to mvorisek/php-src that referenced this issue Jul 7, 2022
Co-authored-by: Christoph M. Becker <cmbecker69@gmx.de>
mvorisek added a commit to mvorisek/php-src that referenced this issue Jul 7, 2022
Co-authored-by: Christoph M. Becker <cmbecker69@gmx.de>
terrafrost added a commit to terrafrost/phpseclib that referenced this issue Aug 27, 2022
terrafrost added a commit to terrafrost/phpseclib that referenced this issue Aug 27, 2022
PowerKiKi added a commit to Ecodev/zendframework1 that referenced this issue Aug 25, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants