Commit
Fix mbstring support for CP1252 encoding
It's a bit surprising how much was broken here.

- Identify filter was utterly and completely wrong.
- Instead of handling invalid CP1252 bytes as specified by `mb_substitute_character`, it would convert them to Unicode 0xFFFD (generic replacement character).
- When converting ISO-8859-1 to CP1252, invalid ISO-8859-1 bytes would be passed through silently.
- Unicode codepoints from 0x80-0x9F were converted to CP1252 bytes 0x80-0x9F, which is wrong.
- Unicode codepoint 0xFFFD was converted to CP1252 0x9F, which is very wrong.

Also clean up some unneeded code, and make the conversion table consistent with others by using zero as an 'invalid' marker, rather than 0xFFFD.
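
To illustrate the intended behaviour, here is a minimal Python sketch. It is not the mbstring C code. The function names and the `substitute` parameter (standing in for whatever `mb_substitute_character` is set to) are hypothetical, and it assumes the five bytes with no CP1252 assignment (0x81, 0x8D, 0x8F, 0x90, 0x9D) are the invalid ones. The table uses zero as the 'invalid' marker.

```python
# Illustrative sketch only; names are hypothetical, not taken from mbstring.
# Unicode codepoints for CP1252 bytes 0x80-0x9F; 0 marks an invalid byte.
CP1252_HIGH = [
    0x20AC, 0,      0x201A, 0x0192, 0x201E, 0x2026, 0x2020, 0x2021,  # 0x80-0x87
    0x02C6, 0x2030, 0x0160, 0x2039, 0x0152, 0,      0x017D, 0,       # 0x88-0x8F
    0,      0x2018, 0x2019, 0x201C, 0x201D, 0x2022, 0x2013, 0x2014,  # 0x90-0x97
    0x02DC, 0x2122, 0x0161, 0x203A, 0x0153, 0,      0x017E, 0x0178,  # 0x98-0x9F
]


def cp1252_byte_to_codepoint(byte):
    """Return the Unicode codepoint for a CP1252 byte, or None if invalid."""
    if byte < 0x80 or byte >= 0xA0:
        return byte  # identical to ISO-8859-1 outside 0x80-0x9F
    return CP1252_HIGH[byte - 0x80] or None  # 0 in the table means invalid


def codepoint_to_cp1252_byte(codepoint):
    """Return the CP1252 byte for a codepoint, or None if unrepresentable."""
    if codepoint < 0x80 or 0xA0 <= codepoint <= 0xFF:
        return codepoint
    # Codepoints 0x80-0x9F and 0xFFFD never appear in the table, so they
    # fall through to None instead of being mapped to a CP1252 byte.
    if codepoint != 0 and codepoint in CP1252_HIGH:
        return 0x80 + CP1252_HIGH.index(codepoint)
    return None


def decode(data, substitute="?"):
    """Decode CP1252 bytes, replacing invalid bytes with `substitute`."""
    out = []
    for byte in data:
        codepoint = cp1252_byte_to_codepoint(byte)
        out.append(substitute if codepoint is None else chr(codepoint))
    return "".join(out)
```

Under these assumptions, an invalid byte is replaced by the configured substitute rather than by 0xFFFD, and neither the 0x80-0x9F codepoints nor 0xFFFD are encoded as CP1252 bytes.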
Showing 2 changed files with 17 additions and 34 deletions.