GB18030-encoded text is being detected as utf_16, big5 and cp037, and of those only big5 can decode it.
Detection as utf_16 is clearly wrong: that codec expects the data to contain the UTF-16 BOM, so the library should be very cautious about reporting that result, but chardet has a patch to do exactly that (chardet/chardet#109).
The GB18030 BOM tends to result in detection as cp037. The BOM regularly causes problems in chardet-like libraries; cf. chardet/chardet#178
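
To illustrate the BOM point, here is a minimal sketch of a pre-check that could run before any statistical detection. It is not this library's or chardet's actual code: `sniff_bom`, `GB18030_BOM` and the `_BOMS` table are hypothetical names made up for this example, and the returned codec names are just one possible choice. The only facts relied on are the standard BOM constants in Python's `codecs` module and that U+FEFF encodes to `84 31 95 33` in GB18030.

```python
import codecs

# U+FEFF encoded as GB18030 (hypothetical constant name; not defined in codecs).
GB18030_BOM = b"\x84\x31\x95\x33"

# Order matters: the UTF-32 LE BOM begins with the UTF-16 LE BOM bytes,
# so the UTF-32 entries must be tested before the UTF-16 ones.
_BOMS = [
    (codecs.BOM_UTF8, "utf_8_sig"),
    (codecs.BOM_UTF32_LE, "utf_32"),
    (codecs.BOM_UTF32_BE, "utf_32"),
    (codecs.BOM_UTF16_LE, "utf_16"),
    (codecs.BOM_UTF16_BE, "utf_16"),
    (GB18030_BOM, "gb18030"),
]


def sniff_bom(data: bytes):
    """Return a codec name if data starts with a known BOM, else None."""
    for bom, name in _BOMS:
        if data.startswith(bom):
            return name
    return None


# Sample input: GB18030 text preceded by the GB18030 BOM.
sample = GB18030_BOM + "编码检测".encode("gb18030")
guess = sniff_bom(sample)
print(guess)
# Python's gb18030 codec keeps the BOM as U+FEFF, so strip it by hand.
print(sample.decode(guess).lstrip("\ufeff"))
```

With a check like this, a `utf_16` result could be discarded whenever `sniff_bom` does not return `"utf_16"`, and a leading GB18030 BOM would short-circuit the guess instead of pushing it towards cp037.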