Marshal load loses correct encoding for string subclass #939

Closed
jkl1337 opened this issue Aug 2, 2013 · 2 comments

jkl1337 commented Aug 2, 2013

I am getting unexpected behavior when using Marshal.load on an instance of a String subclass that has an instance variable set: the loaded string comes back with ASCII-8BIT encoding instead of UTF-8. In the example below the instance variable is an integer, but it seems to happen with any value other than nil.

Note that Marshal.dump seems to yield identical output in JRuby and MRI (the encoding of the dumped string is ASCII-8BIT in both cases, as expected).

I ran into this problem in a Rails app when caching an object that contains an instance of a String subclass.

The workaround I am using is to customize marshal_dump on the class to return an array with the string data as one element and the ivars as the other.
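
A minimal sketch of that workaround, assuming the same StringSubclass as in the transcripts below. The array layout (plain string first, a hash of ivars second) is only one possible arrangement and not necessarily the exact code used in the app:

```ruby
class StringSubclass < String
  attr_accessor :oops

  # Dump a plain String copy (which carries its own encoding) plus the
  # ivars, instead of letting Marshal attach the ivars to the string itself.
  def marshal_dump
    [String.new(self), { oops: @oops }]
  end

  # Marshal allocates an empty instance and hands back the dumped array.
  def marshal_load(data)
    content, ivars = data
    replace(content)       # copies both the bytes and the encoding
    @oops = ivars[:oops]
  end
end

s = StringSubclass.new('what').tap { |str| str.oops = 1 }
copy = Marshal.load(Marshal.dump(s))
puts copy.encoding
puts copy.oops
```

Because the string data travels as an ordinary String, its encoding marker is never mixed in with the subclass's own instance variables, which appears to be what sidesteps the bug.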

jruby 1.7.5.dev (1.9.3p392) 2013-08-02 c672591 on Java HotSpot(TM) 64-Bit Server VM 1.7.0_25-b15 [linux-amd64]

[1] pry(main)> class StringSubclass < String
[1] pry(main)*   attr_accessor :oops
[1] pry(main)* end  
=> nil
[2] pry(main)> s_ok = StringSubclass.new('what')
=> "what"
[3] pry(main)> s_ok.encoding
=> #<Encoding:UTF-8>
[4] pry(main)> Marshal.dump(s_ok)
=> "\x04\bIC:\x13StringSubclass\"\twhat\x06:\x06ET"
[5] pry(main)> Marshal.load(Marshal.dump(s_ok)).encoding
=> #<Encoding:UTF-8>
[6] pry(main)> 
[7] pry(main)> s_oops = StringSubclass.new('what').tap { |s| s.oops = 1; s }
=> "what"
[8] pry(main)> s_oops.encoding
=> #<Encoding:UTF-8>
[9] pry(main)> Marshal.dump(s_oops)
=> "\x04\bIC:\x13StringSubclass\"\twhat\a:\x06ET:\n@oopsi\x06"
[10] pry(main)> Marshal.load(Marshal.dump(s_oops)).encoding
=> #<Encoding:ASCII-8BIT>

ruby 2.0.0p195 (2013-05-14 revision 40734) [x86_64-linux]

[1] pry(main)> class StringSubclass < String
[1] pry(main)*   attr_accessor :oops
[1] pry(main)* end  
=> nil
[2] pry(main)> s_ok = StringSubclass.new('what')
=> "what"
[3] pry(main)> s_ok.encoding
=> #<Encoding:UTF-8>
[4] pry(main)> Marshal.dump(s_ok)
=> "\x04\bIC:\x13StringSubclass\"\twhat\x06:\x06ET"
[5] pry(main)> Marshal.load(Marshal.dump(s_ok)).encoding
=> #<Encoding:UTF-8>
[6] pry(main)> 
[7] pry(main)> s_oops = StringSubclass.new('what').tap { |s| s.oops = 1; s }
=> "what"
[8] pry(main)> s_oops.encoding
=> #<Encoding:UTF-8>
[9] pry(main)> Marshal.dump(s_oops)
=> "\x04\bIC:\x13StringSubclass\"\twhat\a:\x06ET:\n@oopsi\x06"
[10] pry(main)> Marshal.load(Marshal.dump(s_oops)).encoding
=> #<Encoding:UTF-8>

headius commented Aug 27, 2013

Reproduced. Investigating.

headius commented Aug 27, 2013

Turned out to be a fairly simple problem; the unmarshaling logic was buggy, looking for encoding in the last instance variable coming off the stream, rather than the first. I rewrote the logic to be less confusing and set it up to use the first variable, as in MRI.
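
The ordering described here can be checked from plain Ruby. The sketch below only inspects the dumped bytes and is not the JRuby unmarshaling code; it assumes the dump layout shown in the transcripts above, where the encoding marker (the symbol E followed by T) is written before @oops:

```ruby
class StringSubclass < String
  attr_accessor :oops
end

s_oops = StringSubclass.new('what').tap { |s| s.oops = 1 }
dump = Marshal.dump(s_oops)

# Byte offsets of the encoding marker and of the user-defined ivar name.
encoding_pos = dump.index(":\x06ET")
ivar_pos = dump.index('@oops')

# The encoding marker is the first instance variable in the stream,
# so a reader that only looks at the last one will miss it.
puts encoding_pos < ivar_pos
```

With no other instance variables, the encoding marker is both the first and the last entry, which would explain why s_ok round-tripped correctly even before the fix.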
