Encoding of String.name is ASCII-8BIT #5208
This is more an observed difference between MRI Ruby and JRuby 18.104.22.168 than a definite bug. I'm not sure the JRuby behaviour is incorrect, but the difference tripped up some code in our code base, so I thought I'd point it out. For the purposes of this report, the expected behaviour is whatever MRI does, though I understand the difference might be allowed.
Expected behaviour is
String.name.encoding
=> #<Encoding:US-ASCII>
class Hèllo; end
=> nil
irb(main):003:0> Hèllo.name.encoding
=> #<Encoding:UTF-8>
Actual behaviour on JRuby is
String.name.encoding
=> #<Encoding:ASCII-8BIT>
class Hèllo; end
=> nil
irb(main):002:0> Hèllo.name.encoding
=> #<Encoding:UTF-8>
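For code that has to run on both implementations in the meantime, here is a minimal sketch of a check that does not depend on which of the two encodings is reported. It assumes the calling code only cares whether the class name is plain ASCII; `ascii_class_name?` is a hypothetical helper name, not something from MRI or JRuby.

```ruby
# Hypothetical helper (an assumption for illustration): true when the class
# name holds only 7-bit ASCII bytes, whether its encoding is reported as
# US-ASCII or as ASCII-8BIT.
def ascii_class_name?(klass)
  name = klass.name           # nil for anonymous classes
  !name.nil? && name.ascii_only?
end

puts ascii_class_name?(String)     # the name "String" is plain ASCII
puts ascii_class_name?(Class.new)  # anonymous class, so there is no name to check
```

Testing the bytes with `String#ascii_only?` rather than comparing `name.encoding` against `Encoding::US_ASCII` sidesteps the difference shown above.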
Ah, a regression from fixing all our encoding issues. Ironic. When we register class names we have no eager symbol, so the lookup bubbles down to calculateRubyName, which calls runtime.newString(). That sets the encoding to ASCII-8BIT, because the default ByteList constructor assumes this encoding.
Our solution will be to be a bit smarter about this encoding: any class name that is 7-bit clean should be US-ASCII, regardless of the encoding specified. This is a more general rule for symbols, but it will behave the same for class names.
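The rule can be sketched in Ruby as follows. This is only an illustration of the intended behaviour, not JRuby's actual fix (which lives in its Java internals); `normalize_name_encoding` is a hypothetical helper name introduced here.

```ruby
# Hypothetical helper illustrating the rule; not JRuby's actual code.
# A 7-bit clean name is retagged as US-ASCII whatever encoding it arrived
# with; any other name keeps the encoding it already has.
def normalize_name_encoding(name)
  return name unless name.ascii_only?
  name.dup.force_encoding(Encoding::US_ASCII)
end

# A name tagged ASCII-8BIT, as in the JRuby output reported above.
binary_name = "String".dup.force_encoding(Encoding::ASCII_8BIT)
# A name with a non-ASCII character, written with an escape so the
# example does not depend on the source file's encoding.
utf8_name = "H\u00E8llo"

puts normalize_name_encoding(binary_name).encoding  # prints US-ASCII
puts normalize_name_encoding(utf8_name).encoding    # prints UTF-8
```

With this rule the two cases from the report line up with MRI: `String` comes out as US-ASCII and `Hèllo` stays UTF-8.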