`String#dump` returns incorrect character escaping for unicode codepoints > `0xffff` #2794

postmodern · 2022-11-21T13:31:43Z

I noticed a slight difference in how `String#dump` escapes Unicode characters greater than `0xffff` on TruffleRuby 22.3.0 vs. CRuby 3.0.4. To match CRuby, TruffleRuby should escape Unicode code points greater than `0xffff` as `\u{xxxx}`; I have no clue why CRuby does this.

TruffleRuby 22.3.0

0xffff.chr(Encoding::UTF_8).dump
# => "\"\\uFFFF\""
0x10000.chr(Encoding::UTF_8).dump
# => "\"\\u10000\""

CRuby 3.0.4

0xffff.chr(Encoding::UTF_8).dump
# => "\"\\uFFFF\""
0x10000.chr(Encoding::UTF_8).dump
# => "\"\\u{10000}\""
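
For reference, the CRuby behaviour shown above can be summarised as a small rule. The sketch below is only an illustration: `escape_codepoint` is a hypothetical helper, not a function from CRuby or TruffleRuby, and it covers only the non-ASCII code points that `String#dump` writes as `\u` escapes in a UTF-8 string.

```ruby
# Hypothetical helper (an assumption for illustration, not either
# implementation's real code): formats one non-ASCII code point the way
# the CRuby 3.0.4 output above does.
def escape_codepoint(codepoint)
  if codepoint <= 0xFFFF
    # Up to U+FFFF: a bare \u followed by exactly four hex digits.
    format("\\u%04X", codepoint)
  else
    # Above U+FFFF: braces delimit a variable number of hex digits.
    format("\\u{%X}", codepoint)
  end
end

escape_codepoint(0xFFFF)   # => "\\uFFFF"
escape_codepoint(0x10000)  # => "\\u{10000}"
```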


eregon · 2022-11-21T14:10:30Z

Thanks for the report.
I guess it's done so the parser knows how many hex digits to read, i.e., always exactly 4 after a bare `\u`, and whatever is between `\u{` and `}` otherwise.
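
A quick way to see the ambiguity, as a sketch using only string literals and `String#undump` (available since Ruby 2.5): without braces, the fifth digit is read as a separate character, so the escape produced by TruffleRuby 22.3.0 does not decode back to U+10000.

```ruby
# A bare \u reads exactly four hex digits; the trailing "0" is its own character.
bare = "\u10000"
bare.codepoints    # => [4096, 48]   (U+1000 followed by "0")

# With braces, all five digits belong to one code point.
braced = "\u{10000}"
braced.codepoints  # => [65536]      (U+10000)

# Single quotes keep the backslash literal, so these are dump-style strings.
# The first is the form reported for TruffleRuby 22.3.0, the second the CRuby form.
buggy_dump   = '"\u10000"'
correct_dump = '"\u{10000}"'

buggy_dump.undump.codepoints    # => [4096, 48]
correct_dump.undump.codepoints  # => [65536]
```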

andrykonchin · 2022-11-29T11:49:32Z

Fixed in 57e53f8

postmodern changed the title ~~String#dump returns incorrect character escaping for unicode codepoints > 0xffff~~ `String#dump` returns incorrect character escaping for unicode codepoints > `0xffff` Nov 21, 2022

eregon added the compatibility label Nov 21, 2022

andrykonchin self-assigned this Nov 21, 2022

andrykonchin closed this as completed Nov 29, 2022

andrykonchin added this to the 23.0.0 Release milestone Nov 29, 2022
