Different Encoding behavior from all other Rubies #2580

Closed
bf4 opened this Issue Feb 8, 2015 · 3 comments

bf4 commented Feb 8, 2015

See rspec/rspec-support#172 (comment) for origin of this issue.

In brief, on every other Ruby, "\x80".force_encoding("US-ASCII").chars.map{|char| char.valid_encoding? ? char : "?" }.join.encode("UTF-8") returns "?", but JRuby returns "\x80".

LANG=en_US.UTF-8 LC_ALL=en_US.UTF-8 \
   rvm ruby-1.9.2-p330,ruby-1.9.3-p551,ruby-2.0.0-p598,ruby-2.1.5,ruby-2.2.0,jruby-1.7.18,rbx-2.2.2 do \
   ruby -e 'p [RUBY_VERSION, RUBY_ENGINE, Encoding.default_external, __ENCODING__] << "\x80".force_encoding("US-ASCII").chars.map{|char| char.valid_encoding? ? char : "?" }.join.encode("UTF-8")'

 for version in 1.9 2.0; do \
   export JRUBY_OPTS="-Xcompat.version=${version}" ; \
   LANG=en_US.UTF-8 LC_ALL=en_US.UTF-8 \
   rvm jruby-1.7.18 do \
   ruby -e 'p [RUBY_VERSION, RUBY_ENGINE, Encoding.default_external, __ENCODING__] << "\x80".force_encoding("US-ASCII").chars.map{|char| char.valid_encoding? ? char : "?" }.join.encode("UTF-8")';
 done
["1.9.2", "ruby", #<Encoding:UTF-8>, #<Encoding:UTF-8>, "?"]
["1.9.3", "ruby", #<Encoding:UTF-8>, #<Encoding:UTF-8>, "?"]
["2.0.0", "ruby", #<Encoding:UTF-8>, #<Encoding:UTF-8>, "?"]
["2.1.5", "ruby", #<Encoding:UTF-8>, #<Encoding:UTF-8>, "?"]
["2.2.0", "ruby", #<Encoding:UTF-8>, #<Encoding:UTF-8>, "?"]
["1.9.3", "jruby", #<Encoding:UTF-8>, #<Encoding:UTF-8>, "\x80"]
["2.1.0", "rbx", #<Encoding:UTF-8>, #<Encoding:UTF-8>, "?"]
["1.9.3", "jruby", #<Encoding:UTF-8>, #<Encoding:UTF-8>, "\x80"]
["2.0.0", "jruby", #<Encoding:UTF-8>, #<Encoding:UTF-8>, "\x80"]
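
For anyone reading along, here is a rough step-by-step breakdown of the one-liner above, so each intermediate value can be inspected on the engine under test. The variable names are mine and only illustrative; the `.dup` is added so the snippet also runs when string literals are frozen.

```ruby
# Step-by-step version of the one-liner from the report.
# Variable names are illustrative and not taken from rspec-support.

# 0x80 is outside the 7-bit ASCII range, so this string is invalid as US-ASCII.
raw = "\x80".dup.force_encoding(Encoding::US_ASCII)

# Split into characters and record which ones are validly encoded.
chars    = raw.chars
validity = chars.map(&:valid_encoding?)

# Replace each invalid character with "?", then transcode to UTF-8.
replaced = chars.map { |char| char.valid_encoding? ? char : "?" }.join
result   = replaced.encode(Encoding::UTF_8)

p [RUBY_VERSION, RUBY_ENGINE]
p raw.valid_encoding?
p validity
p result # "?" on the MRI and Rubinius versions listed above
```

If `result` comes back as "\x80", printing `validity` should show whether the per-character `valid_encoding?` check is where the engines diverge.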

bf4 added a commit to bf4/rspec-support that referenced this issue Feb 8, 2015

Fix invalid byte sequence on EncodedString#split
Map string char with invalid encoding to '?'
Format identical string expectation to read easier

Refs:
- rspec/rspec-core#1760
- via rspec#134

Set to pending for JRuby, opened issue
jruby/jruby#2580
that JRuby is the only Ruby that returns "\x80"
in place of "?"
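
As a side note on the "map invalid chars to '?'" approach in the commit message: a minimal sketch of the same idea, assuming Ruby 2.1+ where `String#scrub` is available, with the per-character mapping as a fallback. `replace_invalid_chars` is a hypothetical helper name, not something in rspec-support, and this is not the code from the referenced commit.

```ruby
# Hypothetical helper (not part of rspec-support): replace every invalidly
# encoded character with a placeholder, then transcode to UTF-8.
def replace_invalid_chars(string, replacement = "?")
  cleaned =
    if string.respond_to?(:scrub) # String#scrub exists on Ruby 2.1+
      string.scrub(replacement)
    else
      # Fallback for older Rubies: the per-character mapping from the report.
      string.chars.map { |char| char.valid_encoding? ? char : replacement }.join
    end
  cleaned.encode(Encoding::UTF_8)
end

ascii_with_high_byte = "abc\x80def".dup.force_encoding(Encoding::US_ASCII)
p replace_invalid_chars(ascii_with_high_byte)
```

Whether `scrub` behaves the same on JRuby 1.7 is not something this thread establishes, so treat it as something to verify rather than a known workaround.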

bf4 commented Feb 25, 2015

@headius @enebo does this look at all interesting? I'm happy to look in the source code if you can point me to the right place, though my Java is only classroom level.

headius commented Mar 12, 2015

I assume you did not test on JRuby 9k, yes? This appears to work right in 9k:

$ ruby -v -e 'p [RUBY_VERSION, RUBY_ENGINE, Encoding.default_external, __ENCODING__] << "\x80".force_encoding("US-ASCII").chars.map{|char| char.valid_encoding? ? char : "?" }.join.encode("UTF-8")'
jruby 9.0.0.0-SNAPSHOT (2.2.1) 2015-03-12 3b067b7 Java HotSpot(TM) 64-Bit Server VM 25.40-b23 on 1.8.0_40-ea-b19 +jit [darwin-x86_64]
["2.2.1", "jruby", #<Encoding:UTF-8>, #<Encoding:UTF-8>, "?"]

headius added this to the 9.0.0.0.pre2 milestone Mar 12, 2015

headius closed this Mar 12, 2015

bf4 commented Mar 12, 2015
