Commit 3517a70

Fix legacy text conversion filter for CP50220
CP50220 converts some codepoints which represent kana (hiragana/katakana) to a different form. This is the only difference between CP50220 and CP50221 (which doesn't perform such conversion). In some cases, this conversion means collapsing two codepoints to a single output byte sequence.

Since the legacy text conversion filters only worked a byte at a time, the legacy filter had to cache a byte, then wait until it was called again with the next byte to compare the cached byte with the following one.

That was all fine, but it didn't work as intended when there were errors (invalid byte sequences) in the input. Our code (both old and new) for emitting error markers recursively calls the same conversion filter. When the old CP50220 filter was called recursively, the logic for managing cached bytes did not behave as intended. As a result, the error markers could be reordered with other characters in the output.

I used an ugly hack to fix this in 6938e35; when making a recursive call to emit an error marker, temporarily swap out `filter->filter_function` to bypass the byte-caching code, so the error marker immediately goes through to the output.

This worked, but I overlooked the fact that the very same problem can occur if an invalid byte sequence is detected *in the flush function*. Apply the same (ugly) fix.
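The interaction described above can be sketched with a toy model. This is only an illustration under stated assumptions: `toy_filter`, `passthrough`, `caching`, `flush` and `run_demo` are hypothetical names, not the libmbfl API, `'!'` stands in for an input that triggers an error, and `'?'` stands in for the error marker. `passthrough` plays the role of the CP50221 filter and `caching` the role of the CP50220 filter.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Toy model only: hypothetical names, not the libmbfl API. */
typedef struct toy_filter toy_filter;
struct toy_filter {
    int (*filter_function)(int c, toy_filter *f); /* currently installed entry point */
    int cache;                                    /* one held-back codepoint, 0 if none */
    char out[64];                                 /* collected output */
    size_t len;
};

static void emit(toy_filter *f, char c) {
    if (f->len < sizeof(f->out) - 1) {
        f->out[f->len++] = c;
        f->out[f->len] = '\0';
    }
}

/* Error markers re-enter whatever entry point is installed on the filter. */
static void emit_error_marker(toy_filter *f) {
    f->filter_function('?', f);
}

/* Non-caching variant: '!' stands in for an input that cannot be encoded. */
static int passthrough(int c, toy_filter *f) {
    if (c == '!') {
        emit_error_marker(f);
    } else {
        emit(f, (char)c);
    }
    return 0;
}

/* Caching variant: holds one codepoint back until the next call or a flush. */
static int caching(int c, toy_filter *f) {
    if (f->cache) {
        int prev = f->cache;
        f->cache = 0;
        passthrough(prev, f);
    }
    f->cache = c;
    return 0;
}

/* Flush the held-back codepoint; use_swap selects the workaround. */
static void flush(toy_filter *f, int use_swap) {
    if (f->cache) {
        int c = f->cache;
        if (use_swap) f->filter_function = passthrough; /* bypass the cache */
        passthrough(c, f);
        if (use_swap) f->filter_function = caching;     /* restore */
        f->cache = 0;
    }
}

/* Feed "a!" through the caching filter, flush, and copy the output to out. */
void run_demo(int use_swap, char *out, size_t n) {
    toy_filter f = { caching, 0, "", 0 };
    assert(out != NULL && n > 0);
    f.filter_function('a', &f);
    f.filter_function('!', &f);
    flush(&f, use_swap);
    snprintf(out, n, "%s", f.out);
}
```

In this toy, `run_demo(1, buf, sizeof buf)` produces `a?`, while `run_demo(0, buf, sizeof buf)` produces only `a`: without the swap, the marker emitted during the flush re-enters the caching entry point, lands in the cache, and is then discarded. The real filter's failure mode may differ in detail (the commit message describes reordering), but the cause is the same re-entry into the caching code.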
1 parent 4b37033 commit 3517a70

1 file changed, +2 -0 lines changed

ext/mbstring/libmbfl/filters/mbfilter_cp5022x.c

Lines changed: 2 additions & 0 deletions
@@ -555,7 +555,9 @@ static int mbfl_filt_conv_wchar_cp50220_flush(mbfl_convert_filter *filter)
 
 	if (filter->cache) {
 		int s = mb_convert_kana_codepoint(filter->cache, 0, NULL, NULL, mode);
+		filter->filter_function = mbfl_filt_conv_wchar_cp50221;
 		mbfl_filt_conv_wchar_cp50221(s, filter);
+		filter->filter_function = mbfl_filt_conv_wchar_cp50220;
 		filter->cache = 0;
 	}
 
