Fix `Symbol#inspect` of UTF_16/UTF_32 #4994
Stop prepending a byte before inspecting the string. For example, when the encoding of `symbolBytes` is UTF_16LE, "a" is `[0x61, 0x00]`. If we prepend `":"` (0x3A) to `symbolBytes` before inspecting it, the bytes become `[0x3A, 0x61, 0x00]` with UTF_16LE encoding, which is not what we want. This commit changes the order of inspecting and prepending to avoid this. Ref: https://github.com/ruby/ruby/blob/v2_5_0/string.c#L10402
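The byte-level problem can be reproduced in plain Ruby. This is only a rough sketch of the idea, not the JRuby code touched by this PR: the variable names (`utf16`, `broken`, `fixed`) are made up for illustration, and it uses a String rather than a Symbol to keep the example small.

```ruby
# "a" in UTF-16LE is the two bytes 0x61 0x00.
utf16 = "a".encode(Encoding::UTF_16LE)
original_bytes = utf16.bytes

# Old order: put the single byte 0x3A in front of the raw bytes while
# keeping the UTF-16LE label. The result has an odd number of bytes,
# so it can no longer be split into 2-byte code units.
broken = (":".b + utf16.b).force_encoding(Encoding::UTF_16LE)
broken_bytes = broken.bytes
broken_valid = broken.valid_encoding?

# New order: inspect first, then put ":" in front of the inspected
# string, which is no longer UTF-16LE.
fixed = ":" + utf16.inspect

puts original_bytes.inspect
puts broken_bytes.inspect
puts broken_valid
puts fixed
```

The point of the sketch is only the ordering: once the string has been inspected, the result is in an ASCII-compatible encoding, so adding `":"` in front is safe.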
@yui-knk You even seem to fix a second bug here, where we were adding `:"` at the front of a symbol but no closing `"`. I might change this code now that you have fixed this, because we potentially make 3 instances of RubyString depending on the symbol being inspected. I think we can reduce this to just one.
Actually, I am not planning on changing this: 1) `:sym.inspect` is exceedingly rare in hot code; 2) handling the guts of bytelist vs RubyString, and determining CR_7BIT, is much simpler if we make a string first. Working around that to defer making the string would involve some new code. I did glance at MRI, and they remove some of this cost by using memcpy/memmove to set the ':' and the contents of the string. We could optimize in this way if we wanted, but due to 1) above I am not inclined to put in that extra effort :)
I agree :)