Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[RFC] Add mb_ucfirst and mb_lcfirst functions #13161

Merged
merged 15 commits into from
Mar 20, 2024

Conversation

youkidearitai
Copy link
Contributor

Ref: #13075

Adds mb_ucfirst and mb_lcfirst functions.

RFC: https://wiki.php.net/rfc/mb_ucfirst

@youkidearitai youkidearitai changed the title Add mb_ucfirst and mb_lcfirst functions [RFC] Add mb_ucfirst and mb_lcfirst functions Jan 16, 2024
Copy link
Member

@nielsdos nielsdos left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice work, preliminary feedback 🙂

ext/mbstring/mbstring.c Outdated Show resolved Hide resolved
ext/mbstring/mbstring.c Outdated Show resolved Hide resolved
@Ayesh
Copy link
Contributor

Ayesh commented Feb 9, 2024

@youkidearitai - Thanks for changing the case conversion to titlecase, and resetting the RFC vote. Although the process has restarted, I think you made the right call.

@youkidearitai
Copy link
Contributor Author

@Ayesh Thanks for your support.

@youkidearitai
Copy link
Contributor Author

This RFC has been accepted, change to Normal Pull Request.

@Ayesh
Copy link
Contributor

Ayesh commented Mar 7, 2024

Congratulations on passing this RFC!
Before merging, please add the entries to UPGRADING and NEWS files.

@youkidearitai
Copy link
Contributor Author

@Ayesh Thank you! Added.

Copy link
Member

@nielsdos nielsdos left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks pretty good, good work!
I have some minor comments. I also don't see an obvious way of making this more performant.
Maybe Alex knows though.

ext/mbstring/mbstring.c Outdated Show resolved Hide resolved
ext/mbstring/mbstring.c Outdated Show resolved Hide resolved
@alexdowad
Copy link
Contributor

@youkidearitai Great job!

A good way to optimize this code for performance would be to first check if the first character is already in the desired case. If it is, just use zend_string_copy to increment the refcount of the input string and return it.

@youkidearitai
Copy link
Contributor Author

@alexdowad

A good way to optimize this code for performance would be to first check if the first character is already in the desired case. If it is, just use zend_string_copy to increment the refcount of the input string and return it.

Thank you. I tried it. Please confirm.

ext/mbstring/mbstring.c Outdated Show resolved Hide resolved
ext/mbstring/mbstring.c Outdated Show resolved Hide resolved
Copy link
Member

@nielsdos nielsdos left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Lgtm

Ayesh added a commit to Ayesh/polyfill that referenced this pull request Mar 9, 2024
Adds polyfills for `mb_ucfirst` and `mb_lcfirst` functions based on
the polyfill shown in PHP.Watch. It basically takes the first mb
character, calls `mb_convert_case` with `MB_CASE_TITLE` or
`MB_CASE_LOWER`, and returns with the concat of the remaining string.

The tests are taken from the php-src PR.

 - [RFC](https://wiki.php.net/rfc/mb_ucfirst)
 - [php-src PR: php-src#13161](php/php-src#13161)
 - [PHP.Watch - PHP 8.4: New mb_ucfirst and mb_lcfirst functions](https://php.watch/versions/8.4/mb_ucfirst-mb_ucfirst)
Ayesh added a commit to Ayesh/polyfill that referenced this pull request Mar 9, 2024
Adds polyfills for `mb_ucfirst` and `mb_lcfirst` functions based on
the polyfill shown in PHP.Watch. It basically takes the first mb
character, calls `mb_convert_case` with `MB_CASE_TITLE` or
`MB_CASE_LOWER`, and returns with the concat of the remaining string.

The tests are taken from the php-src PR.

 - [RFC](https://wiki.php.net/rfc/mb_ucfirst)
 - [php-src PR: php-src#13161](php/php-src#13161)
 - [PHP.Watch - PHP 8.4: New `mb_ucfirst` and `mb_lcfirst` functions](https://php.watch/versions/8.4/mb_ucfirst-mb_ucfirst)
@youkidearitai
Copy link
Contributor Author

@alexdowad Could you review this pull request at free time?

@alexdowad
Copy link
Contributor

@alexdowad Could you review this pull request at free time?

@youkidearitai Certainly. 😄 I am sorry for the delay.

The code looks great. I would just like to ask if you can kindly add some more test cases. As a start, how would it be to test all ASCII chars, both for mc_ucfirst and mc_lcfirst?

You have tried empty strings, which is good. Maybe also try some slightly longer strings, string with NUL bytes...

Oh, and although you allow passing an explicit $encoding parameter, there are no test cases for that. I suggest adding test cases for at least several different explicit $encoding values.

@youkidearitai
Copy link
Contributor Author

@alexdowad Thank you for review! Added test case.

@Ayesh
Copy link
Contributor

Ayesh commented Mar 17, 2024

(Please feel free to completely ignore this very minor nitpick)

I find the tests a little bit difficult to grok. I think it will be much easier if we have a series of strict string comparisons, so both input and expected strings are in the same line:

var_dump(mb_ucfirst('abc') === 'Abc');
var_dump(mb_ucfirst('đắt quá!') === 'Đắt quá!');
var_dump(mb_ucfirst("\u{01CA}\u{01CA}") === "\u{01CB}\u{01CA}");

They are now compared against bool(true) var_dump outputs.

This will make test diffs not as verbose because they no longer show the actual difference, but I doubt they will be useful in the first place because the diffs will be in various encodings and some of the character differences are not so easily seen on terminal fonts.

I made a polyfill for this function, and the tests are like this.

@youkidearitai
Copy link
Contributor Author

@Ayesh
Yes, You have a point.
But I wrote this .phpt file because I think to easy to see if test fails to .diff files.
Thank you.

Copy link
Contributor

@alexdowad alexdowad left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great job! Thanks very much!

@alexdowad
Copy link
Contributor

@youkidearitai I am sorry that I took some time to review. Your work is really appreciated.

@youkidearitai
Copy link
Contributor Author

@youkidearitai I am sorry that I took some time to review. Your work is really appreciated.

@alexdowad No problem! Thank you very much to approved!

@nielsdos nielsdos merged commit 4d51bfa into php:master Mar 20, 2024
8 of 10 checks passed
@nielsdos
Copy link
Member

Merged, thanks!

@youkidearitai youkidearitai deleted the mb_ucfirst branch March 20, 2024 21:10
@youkidearitai
Copy link
Contributor Author

Thanks very much!

Ayesh added a commit to Ayesh/polyfill that referenced this pull request Mar 31, 2024
Adds polyfills for `mb_ucfirst` and `mb_lcfirst` functions based on
the polyfill shown in PHP.Watch. It basically takes the first mb
character, calls `mb_convert_case` with `MB_CASE_TITLE` or
`MB_CASE_LOWER`, and returns with the concat of the remaining string.

The tests are taken from the php-src PR.

 - [RFC](https://wiki.php.net/rfc/mb_ucfirst)
 - [php-src PR: php-src#13161](php/php-src#13161)
 - [PHP.Watch - PHP 8.4: New `mb_ucfirst` and `mb_lcfirst` functions](https://php.watch/versions/8.4/mb_ucfirst-mb_ucfirst)

Co-authored-by: USAMI Kenta <tadsan@zonu.me>
nicolas-grekas added a commit to symfony/polyfill that referenced this pull request Apr 12, 2024
… polyfills (Ayesh)

This PR was merged into the 1.x branch.

Discussion
----------

[mbstring][PHP 8.4] Add `mb_ucfirst` and `mb_lcfirst` to polyfills

Adds polyfills for `mb_ucfirst` and `mb_lcfirst` functions based on the polyfill shown in PHP.Watch. It basically takes the first mb character, calls `mb_convert_case` with `MB_CASE_TITLE` or `MB_CASE_LOWER`, and returns with the concat of the remaining string.

The tests are taken from the php-src PR.

 - [RFC](https://wiki.php.net/rfc/mb_ucfirst)
 - [php-src PR: php-src#13161](php/php-src#13161)
 - [PHP.Watch - PHP 8.4: New `mb_ucfirst` and `mb_lcfirst` functions](https://php.watch/versions/8.4/mb_ucfirst-mb_ucfirst)

Commits
-------

5bd65b7 [mbstring][PHP 8.4] Add `mb_ucfirst` and `mb_lcfirst` to polyfills
symfony-splitter pushed a commit to symfony/polyfill-mbstring that referenced this pull request Apr 12, 2024
Adds polyfills for `mb_ucfirst` and `mb_lcfirst` functions based on
the polyfill shown in PHP.Watch. It basically takes the first mb
character, calls `mb_convert_case` with `MB_CASE_TITLE` or
`MB_CASE_LOWER`, and returns with the concat of the remaining string.

The tests are taken from the php-src PR.

 - [RFC](https://wiki.php.net/rfc/mb_ucfirst)
 - [php-src PR: php-src#13161](php/php-src#13161)
 - [PHP.Watch - PHP 8.4: New `mb_ucfirst` and `mb_lcfirst` functions](https://php.watch/versions/8.4/mb_ucfirst-mb_ucfirst)

Co-authored-by: USAMI Kenta <tadsan@zonu.me>
symfony-splitter pushed a commit to symfony/polyfill-mbstring that referenced this pull request Apr 12, 2024
… polyfills (Ayesh)

This PR was merged into the 1.x branch.

Discussion
----------

[mbstring][PHP 8.4] Add `mb_ucfirst` and `mb_lcfirst` to polyfills

Adds polyfills for `mb_ucfirst` and `mb_lcfirst` functions based on the polyfill shown in PHP.Watch. It basically takes the first mb character, calls `mb_convert_case` with `MB_CASE_TITLE` or `MB_CASE_LOWER`, and returns with the concat of the remaining string.

The tests are taken from the php-src PR.

 - [RFC](https://wiki.php.net/rfc/mb_ucfirst)
 - [php-src PR: php-src#13161](php/php-src#13161)
 - [PHP.Watch - PHP 8.4: New `mb_ucfirst` and `mb_lcfirst` functions](https://php.watch/versions/8.4/mb_ucfirst-mb_ucfirst)

Commits
-------

5bd65b7 [mbstring][PHP 8.4] Add `mb_ucfirst` and `mb_lcfirst` to polyfills
symfony-splitter pushed a commit to symfony/polyfill-php84 that referenced this pull request Apr 12, 2024
Adds polyfills for `mb_ucfirst` and `mb_lcfirst` functions based on
the polyfill shown in PHP.Watch. It basically takes the first mb
character, calls `mb_convert_case` with `MB_CASE_TITLE` or
`MB_CASE_LOWER`, and returns with the concat of the remaining string.

The tests are taken from the php-src PR.

 - [RFC](https://wiki.php.net/rfc/mb_ucfirst)
 - [php-src PR: php-src#13161](php/php-src#13161)
 - [PHP.Watch - PHP 8.4: New `mb_ucfirst` and `mb_lcfirst` functions](https://php.watch/versions/8.4/mb_ucfirst-mb_ucfirst)

Co-authored-by: USAMI Kenta <tadsan@zonu.me>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants