String#dump returns incorrect character escaping for unicode codepoints > 0xffff #2794

Closed
postmodern opened this issue Nov 21, 2022 · 2 comments

@postmodern

I noticed a slight difference in how String#dump escapes Unicode codepoints greater than 0xFFFF on TruffleRuby 22.3.0 vs. CRuby 3.0.4. To match CRuby, TruffleRuby should escape codepoints above 0xFFFF in the braced form \u{XXXXX} rather than as \uXXXXX; I have no clue why CRuby does this.

TruffleRuby 22.3.0

0xffff.chr(Encoding::UTF_8).dump
# => "\"\\uFFFF\""
0x10000.chr(Encoding::UTF_8).dump
# => "\"\\u10000\""

CRuby 3.0.4

0xffff.chr(Encoding::UTF_8).dump
# => "\"\\uFFFF\""
0x10000.chr(Encoding::UTF_8).dump
# => "\"\\u{10000}\""
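The rule visible in the CRuby output above can be sketched as a small helper. This is only an illustration of the observed behavior, not the actual CRuby or TruffleRuby implementation; the name `escape_codepoint` is hypothetical, and the helper assumes it is only called for codepoints that String#dump would escape in the first place.

```ruby
# Hypothetical helper illustrating the escaping rule seen in CRuby's output.
# Not taken from either implementation.
def escape_codepoint(cp)
  if cp > 0xFFFF
    # More than 4 hex digits: braces mark where the escape ends.
    format('\u{%X}', cp)
  else
    # Fits in 4 hex digits: fixed-width form, zero-padded.
    format('\u%04X', cp)
  end
end

puts escape_codepoint(0xFFFF)   # \uFFFF
puts escape_codepoint(0x10000)  # \u{10000}
```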
@postmodern postmodern changed the title String#dump returns incorrect character escaping for unicode codepoints > 0xffff String#dump returns incorrect character escaping for unicode codepoints > 0xffff Nov 21, 2022
@eregon
Member

eregon commented Nov 21, 2022

Thanks for the report.
I guess it's done that way so a parser knows how many hex digits to read: always exactly 4 after a bare \u, and whatever sits between \u{ and } otherwise.
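A quick way to see the ambiguity is to compare how Ruby itself reads the two spellings in a double-quoted literal. The sketch below assumes a UTF-8 source file; `codepoints_hex` is a hypothetical helper used only for display.

```ruby
# Hypothetical display helper: list a string's codepoints as U+XXXX.
def codepoints_hex(str)
  str.codepoints.map { |cp| format('U+%04X', cp) }
end

# Without braces, only the first 4 hex digits belong to the escape,
# so the trailing "0" is an ordinary character.
p codepoints_hex("\u10000")    # ["U+1000", "U+0030"]

# With braces, everything between \u{ and } is the codepoint.
p codepoints_hex("\u{10000}")  # ["U+10000"]

# String#undump is meant to reverse String#dump, so an unambiguous
# escape is what lets this round trip succeed.
s = 0x10000.chr(Encoding::UTF_8)
p s.dump.undump == s
```

On an implementation that emits the unbraced form for such codepoints, the final comparison would be expected to fail, since the dumped text no longer identifies a single codepoint.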

@andrykonchin self-assigned this on Nov 21, 2022
@andrykonchin
Member

Fixed in 57e53f8

@andrykonchin added this to the 23.0.0 Release milestone on Nov 29, 2022