-
Notifications
You must be signed in to change notification settings - Fork 62
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix encoding detection on PHP 8.1 #182
Conversation
Use mb_check_encoding to detect encoding as mb_detect_encoding is misbehaving under PHP 8.1. Also use mb_convert_encoding instead of utf8_encode as it’s getting deprecated in PHP 8.2. Fixes sabre-io#181
do we need additional test-coverage for this change? |
Codecov Report
@@ Coverage Diff @@
## master #182 +/- ##
============================================
- Coverage 89.75% 89.60% -0.16%
Complexity 262 262
============================================
Files 15 15
Lines 898 885 -13
============================================
- Hits 806 793 -13
Misses 92 92
📣 Codecov can now indicate which changes are the most critical in Pull Requests. Learn more |
If the current coverage did not catch the bug, yes. Here are some example strings that could be used in the test: https://3v4l.org/RrjlE |
please investigate whether we need such a test and add it, if required. |
Signed-off-by: Côme Chilliet <come.chilliet@nextcloud.com>
Test added. |
|
||
switch ($encoding) { | ||
case 'ISO-8859-1': | ||
$path = utf8_encode($path); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
since this method is deprecated as of php 8.2, this is a step into a good direction
👍
@DeepDiver1975 @phil-davis any objections to merging this ? thanks 😄 |
@DeepDiver1975 There is the same problem in https://github.com/sabre-io/dav/blob/master/lib/DAV/StringUtil.php#L78 and https://github.com/sabre-io/vobject/blob/master/lib/StringUtil.php#L41 Should I open PR on these repos as well? |
yes please |
Just a note for "information". There are some impossibilities for this kind of automated "educated guess" detection of encoding. So if any software is presented with There will be plenty of other examples. Have a play at https://dencode.com/en/string/hex and try putting in hex for https://en.wikipedia.org/wiki/UTF-8 code points, and find ones that match sets of ISO-8859-1 code points that represent sequences of characters that could also be a valid, reasonable combination that might occur in a file name, for example.
Any automated algorithm has to have a heuristic that "guesses" that Greek Omega is more common than the combination Ω |
Use mb_check_encoding to detect encoding as mb_detect_encoding is misbehaving under PHP 8.1.
Also use mb_convert_encoding instead of utf8_encode as it’s getting deprecated in PHP 8.2.
Fixes #181