Rational/Float/Fixnum/Bignum `.to_s.encoding` is US-ASCII #517

Closed
coffeejunk opened this Issue Jan 29, 2013 · 4 comments

coffeejunk commented Jan 29, 2013

When a number is converted to a String with the .to_s method, the resulting String's encoding is US-ASCII (ASCII-8BIT for Rational) rather than UTF-8, even when the source and default external encodings are UTF-8. MRI and Rubinius behave similarly (see rubinius/rubinius#2136).

jruby-1.7.2 :001 > __ENCODING__
 => #<Encoding:UTF-8> 
jruby-1.7.2 :002 > Encoding.default_internal
 => nil 
jruby-1.7.2 :003 > Encoding.default_external
 => #<Encoding:UTF-8> 
jruby-1.7.2 :004 > "abc".encoding
 => #<Encoding:UTF-8> 
jruby-1.7.2 :005 > 1.to_s.encoding
 => #<Encoding:US-ASCII> 
jruby-1.7.2 :006 > 1.to_r.to_s.encoding
 => #<Encoding:ASCII-8BIT> 
jruby-1.7.2 :007 > 1.0.to_s.encoding
 => #<Encoding:US-ASCII> 
jruby-1.7.2 :008 > Encoding.default_internal = "UTF-8"
 => "UTF-8" 
jruby-1.7.2 :009 > 1.0.to_s.encoding
 => #<Encoding:US-ASCII> 
$ jruby -v
jruby 1.7.2 (1.9.3p327) 2013-01-04 302c706 on Java HotSpot(TM) 64-Bit Server VM 1.6.0_37-b06-434-11M3909 [darwin-x86_64]
$ uname -a
Darwin Mandallia.local 12.2.1 Darwin Kernel Version 12.2.1: Thu Oct 18 16:32:48 PDT 2012; root:xnu-2050.20.9~2/RELEASE_X86_64 x86_64
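
The same check can be run as a script rather than typed into irb. This is only a sketch: the labels and the `samples` hash are made up for illustration, and `2**70` is assumed to be a reasonable stand-in for a Bignum. What it prints depends on the Ruby implementation and version being run.

```ruby
# Collect the encoding of #to_s for each numeric class named in the issue title.
# Labels are illustrative; 2**70 is assumed to be large enough to be a Bignum
# on Rubies that still distinguish Fixnum from Bignum.
samples = {
  "Fixnum"   => 1,
  "Bignum"   => 2**70,
  "Float"    => 1.0,
  "Rational" => 1.to_r
}

# Hash[] with map keeps this runnable on 1.9-era Rubies as well.
encodings = Hash[samples.map { |label, n| [label, n.to_s.encoding] }]

encodings.each do |label, enc|
  puts format("%-10s %s", label, enc.name)
end
```

Running it under each implementation side by side makes any difference between them easy to spot.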

enebo commented Jan 29, 2013

Ah...I was going to tell you to open an issue on redmine for this, but you did already. We will wait and see what MRI decides then. You might also want to provide an example on the redmine bug showing why it is undesirable.


BanzaiMan commented Jan 31, 2013


BanzaiMan commented Feb 2, 2013

http://bugs.ruby-lang.org/issues/7752#note-4

On current policy, strings which always include only US-ASCII characters are US-ASCII.
If there is a practical issue, I may change the policy in the future.

Note that US-ASCII string is faster than UTF-8 on getting length or index access.

Looks like the ticket would be closed as NOTABUG. (At least, there is an explicitly stated policy about encoding here.)

One thing we take away from here, though, is that JRuby is not doing something right. In particular:

1.to_r.to_s.encoding #=> #<Encoding:ASCII-8BIT> in JRuby, #<Encoding:US-ASCII> in MRI
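
For anyone wondering whether the US-ASCII policy causes trouble in practice, here is a rough sketch of how such a string mixes with UTF-8 data. The variable names are illustrative only, and the non-ASCII character is written as a `\u` escape so the snippet does not depend on the source file's encoding.

```ruby
ascii = 1.to_s     # US-ASCII under the policy quoted above
utf8  = "\u00e9"   # a non-ASCII UTF-8 string, written as an escape

# Encoding.compatible? returns the encoding a combination would have, or nil.
compatible = Encoding.compatible?(ascii, utf8)

# Concatenation works because the US-ASCII side contains only 7-bit characters.
joined = ascii + utf8

# An explicit conversion is available when a UTF-8 tagged string is required.
converted = ascii.encode(Encoding::UTF_8)

puts compatible
puts joined.encoding
puts converted.encoding
```

ASCII-8BIT is a different matter, since it marks the string as binary data, which is presumably why the Rational result stood out.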

BanzaiMan commented Feb 2, 2013

Fixed the issue above with e236a6a.

@enebo enebo closed this Feb 19, 2013
