Problem description
TableDiff first converts some of the string's characters into named entities using htmlentities(). It then tries to convert the remaining characters from UTF-8 to ISO-8859-1 using iconv(), which may behave differently depending on the underlying system.
In 0.1.16 this failed with Notice: iconv(): Detected an illegal character in input string whenever characters remained that have no equivalent in ISO-8859-1 and neither the //TRANSLIT nor the //IGNORE modifier was used.
In 0.1.17 there was a change (see #134 and its corresponding PR) that added the iconv modifier //IGNORE. While the conversion now succeeds, it silently drops any characters that have no representation in ISO-8859-1.
In our application, an editorial workflow relies on the diff output being complete, so silently dropping characters isn't a viable option.
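The two conversion steps described above can be sketched in isolation. This is a minimal illustration, not the library's actual code: the htmlentities() flags and the variable names are assumptions, and the result of the iconv() call is deliberately not asserted because it depends on the underlying iconv implementation.

```php
<?php
// Illustrative sketch only; flags and variable names are assumptions,
// not taken from TableDiff.
$text = 'React with 🙂 and "joy".';

// Step 1: named entities. The quotes become &quot;, but the emoji has no
// named entity, so htmlentities() leaves it untouched.
$encoded = htmlentities($text, ENT_QUOTES, 'UTF-8');

// Any character above U+00FF has no representation in ISO-8859-1.
preg_match_all('/[^\x{00}-\x{FF}]/u', $encoded, $matches);
$unrepresentable = $matches[0];

// Step 2: the lossy conversion. With //IGNORE, unrepresentable characters
// are discarded; the exact result varies between iconv implementations.
$converted = function_exists('iconv')
    ? @iconv('UTF-8', 'ISO-8859-1//IGNORE', $encoded)
    : false;

echo $encoded, PHP_EOL;
echo 'Not representable in ISO-8859-1: ', implode(' ', $unrepresentable), PHP_EOL;
```

Checking for such characters before step 2 makes the loss visible instead of silent.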
Steps to reproduce
$old = '<table><tr><td>React with :-) and "joy".</td></tr></table>';
$new = '<table><tr><td>React with 🙂 and "joy".</td></tr></table>';
$htmlDiff = new HtmlDiff($old, $new);
echo $htmlDiff->build();
In 0.1.16 this fails with the iconv() notice (or might fail, depending on the underlying system); in 0.1.17 the Unicode character 🙂 is silently dropped.
Possible solution
PR #137
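Independently of what PR #137 actually changes, one lossless approach can be sketched as follows: replace every character that ISO-8859-1 cannot represent with a numeric character reference before any charset conversion takes place. The helper name encodeNonLatin1() is hypothetical and not part of the library; the input is assumed to be valid UTF-8.

```php
<?php
// Hypothetical helper, not part of the library and not taken from PR #137.
// Replaces each character above U+00FF with a numeric character reference.
function encodeNonLatin1(string $html): string
{
    $result = preg_replace_callback(
        '/[^\x{00}-\x{FF}]/u',
        static function (array $match): string {
            $bytes = array_values(unpack('C*', $match[0]));
            // Strip the length marker bits from the UTF-8 lead byte.
            $lead = $bytes[0];
            if ($lead >= 0xF0) {
                $codePoint = $lead & 0x07;
            } elseif ($lead >= 0xE0) {
                $codePoint = $lead & 0x0F;
            } else {
                $codePoint = $lead & 0x1F;
            }
            // Each continuation byte contributes its low six bits.
            foreach (array_slice($bytes, 1) as $byte) {
                $codePoint = ($codePoint << 6) | ($byte & 0x3F);
            }
            return '&#' . $codePoint . ';';
        },
        $html
    );

    // preg_replace_callback() returns null on invalid UTF-8; keep the input then.
    return $result ?? $html;
}

$safe = encodeNonLatin1(htmlentities('React with 🙂 and "joy".', ENT_QUOTES, 'UTF-8'));
echo $safe, PHP_EOL;
```

After this step, every remaining character fits into ISO-8859-1, so a subsequent conversion has nothing to drop. Where the mbstring extension is available, mb_encode_numeric_entity() can serve the same purpose.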