Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

UnmarshalStream: Look for encoding information in all instance variables #1345

merged 1 commit into from Dec 19, 2013


Copy link

@DavidEGrayson DavidEGrayson commented Dec 17, 2013

Unfortunately, I think the solution to issue #939 did not go far enough. When unmarshaling a string, MRI actually checks every instance variable for encoding information and uses the last encoding it found.

We can see this from r_ivar in MRI 2.0.0p353 and 1.9.3p484:

I also checked the latest version of MRI's master branch.

Here is some example code where we unmarshal a string and the encoding is specified as UTF-16BE in the second instance variable:

puts Marshal.load("\x04\x08I\"\x06a\x07:\x07@aT:\x0Dencoding\"\x0DUTF-16BE").encoding

Here is a shell session showing how MRI respects the encoding but JRuby does not, giving the string an encoding of ASCII-8BIT:

$ jruby -v && jruby pr2_repro.rb
jruby 1.7.9 (1.9.3p392) 2013-12-06 87b108a on Java HotSpot(TM) 64-Bit Server VM
1.7.0_45-b18 [Windows 8-amd64]
$ source
$ ruby -v && ruby pr2_repro.rb
ruby 2.0.0p0 (2013-02-24) [x64-mingw32]

The code in this pull request fixes this issue. All I did was remove the "i==0" condition in defaultVariablesUnmarshal and adjusted the indentation accordingly.

The reason I am touching this code is because I plan to refactor the encoding-determination logic out of it into another method so I can use it for symbols for issue #1329, or maybe I will just call the existing method.

… in all the instance variables, and only use the last one. All I did was remove the "i==0" condition in defaultVariablesUnmarshal and adjust the indentation accordingly.
Copy link

@headius headius commented Dec 19, 2013

Looks good, I'll merge.

Whitespace-free diff for anyone following along at home:

$ git diff -w jruby-1_7
diff --git a/core/src/main/java/org/jruby/runtime/marshal/ b/core/src/main/java/org/jruby/runtime/marshal/
index 6adc71d..d26e169 100644
--- a/core/src/main/java/org/jruby/runtime/marshal/
+++ b/core/src/main/java/org/jruby/runtime/marshal/
@@ -375,8 +375,6 @@ public class UnmarshalStream extends InputStream {

             IRubyObject key = unmarshalObject(false);

-            if (i == 0) { // first ivar provides encoding
             if (runtime.is1_9() && object instanceof EncodingCapable) {

                 EncodingCapable strObj = (EncodingCapable)object;
@@ -407,7 +405,6 @@ public class UnmarshalStream extends InputStream {

                 } // else fall through as normal ivar
-            }

             String name = key.asJavaString();
             IRubyObject value = unmarshalObject();
@jrubyci jrubyci merged commit 3020ca2 into jruby:jruby-1_7 Dec 19, 2013
1 check failed
1 check failed
default The Travis CI build failed
@DavidEGrayson DavidEGrayson deleted the DavidEGrayson:jruby_pull_request_2 branch Jan 19, 2014
@enebo enebo modified the milestones: JRuby 1.7.10, JRuby 1.7.11 Feb 24, 2014
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
None yet
None yet
Linked issues

Successfully merging this pull request may close these issues.

None yet

4 participants