Fix camelize and upperCamelize for all-caps inputs#51
Conversation
When input contains delimiters (spaces, dashes, underscores), lowercase the string before camelizing. This fixes all-caps inputs like 'FOO BAR' producing 'fOOBAR' instead of the expected 'fooBar'. Inputs without delimiters (e.g. 'ServeHTTP', 'CamelCase') are not affected since their word boundaries are determined by case transitions. Agent-Logs-Url: https://github.com/voku/Stringy/sessions/ef23f588-aedc-4674-b479-c81d993fcc79 Co-authored-by: voku <264695+voku@users.noreply.github.com>
|
/gemini review |
There was a problem hiding this comment.
Code Review
This pull request updates the camelize and upperCamelize methods in src/Stringy.php to lowercase input strings containing delimiters before conversion, ensuring all-caps strings are handled correctly. New test cases have been added to verify this behavior with various delimiters and UTF-8 characters. Feedback suggests expanding the delimiter regex to include digits (\d) to align with word boundary definitions and adding the u modifier for proper UTF-8 support. Additionally, refactoring the duplicated preprocessing logic into a shared private helper method is recommended for better maintainability.
| { | ||
| $str = $this->str; | ||
|
|
||
| if (\preg_match('/[-_\s]/', $str)) { |
There was a problem hiding this comment.
The regex used to detect delimiters is missing digits (\d), which are explicitly defined as word boundaries in the method's docstring ("capitalizes letters following digits..."). Without this, all-caps strings separated by numbers (e.g., 'API2URL') will not be lowercased and will return incorrect results like 'aPI2URL' instead of 'api2Url'.
Additionally, consider adding the u modifier to the regex for consistent UTF-8 support. Without it, \s only matches ASCII whitespace, which might lead to inconsistent behavior if the input contains multibyte whitespace characters (like an ideographic space U+3000) that the underlying str_camelize might otherwise treat as a delimiter.
if (\preg_match('/[-_\s\d]/u', $str)) {| { | ||
| $str = $this->str; | ||
|
|
||
| if (\preg_match('/[-_\s]/', $str)) { |
There was a problem hiding this comment.
Similar to the camelize() method, this regex should include \d to correctly handle all-caps inputs containing numbers.
Also, note that this preprocessing logic is now identical in both camelize() and upperCamelize(). While out of scope for a quick fix, refactoring this into a shared private helper method would improve maintainability.
if (\preg_match('/[-_\s\d]/u', $str)) {- Expand delimiter regex to include digits (\d) and add u modifier for UTF-8 support, per code review feedback - Add !has_lowercase() guard to avoid incorrectly lowercasing mixed-case inputs that happen to contain digits (e.g. 'StringWith1Number') - Refactor duplicated preprocessing logic from camelize() and upperCamelize() into a private prepareStringForCamelization() helper - Add API2URL test cases to verify all-caps digit-separated input Agent-Logs-Url: https://github.com/voku/Stringy/sessions/2644a089-2d2a-45ba-8c08-2f959e048507 Co-authored-by: voku <264695+voku@users.noreply.github.com>
|
@copilot the bug was fixed upstream, validat that and just keep the tests afterwards |
- Revert src/Stringy.php to its original state (remove prepareStringForCamelization helper and the changes to camelize()/upperCamelize()) - Update composer.json to use portable-utf8 6.1.1 (aliased as 6.0.14 to satisfy transitive anti-xss constraint) which has the identical has_lowercase + delimiter-regex fix natively in str_camelize/str_upper_camelize - Keep the test cases added for all-caps ASCII, UTF-8, and digit-delimited inputs Agent-Logs-Url: https://github.com/voku/Stringy/sessions/50c08647-569d-4146-babb-8c10f05272db Co-authored-by: voku <264695+voku@users.noreply.github.com>
Confirmed: portable-utf8 6.1.1 has the identical Reverted |
Agent-Logs-Url: https://github.com/voku/Stringy/sessions/0ecf06e1-1f6a-4cd2-b0ec-899948adc157 Co-authored-by: voku <264695+voku@users.noreply.github.com>
voku/portable-utf8with~6.1.0This change is