Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Regression with Marshal.load(a Hash) #5523

Closed
larskanis opened this Issue Dec 16, 2018 · 4 comments

Comments

Projects
None yet
2 participants
@larskanis
Copy link

larskanis commented Dec 16, 2018

Environment

$ ruby -v
jruby 9.2.5.0 (2.5.0) 2018-12-06 6d5a228 OpenJDK 64-Bit Server VM 10.0.2+13-Ubuntu-1ubuntu0.18.04.2 on 10.0.2+13-Ubuntu-1ubuntu0.18.04.2 +jit [linux-x86_64]

$ uname -a
Linux netzbuch 4.15.0-36-generic #39-Ubuntu SMP Mon Sep 24 16:19:09 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux

Expected Behavior

This is what MRI 2.5.3 and JRuby <= 9.2.4.0 computes:

h={:€a=>nil, :€c=>nil, :€h=>nil}
Marshal.load(Marshal.dump(h)) #  => {:€a=>nil, :€c=>nil, :€h=>nil}

Actual Behavior

But with JRuby-9.2.5.0 wired things are happening:

h={:€a=>nil, :€c=>nil, :€h=>nil}
Marshal.load(Marshal.dump(h)) #  => {:€a=>nil, :€c=>true, nil=>:€h}

This regression was raised by the Eventbox test suite.

@headius

This comment has been minimized.

Copy link
Member

headius commented Dec 17, 2018

Nice find! Maybe we should be running Eventbox's suite in our CI 😁

@headius headius added this to the JRuby 9.2.6.0 milestone Dec 17, 2018

headius added a commit to headius/jruby that referenced this issue Dec 17, 2018

@headius

This comment has been minimized.

Copy link
Member

headius commented Dec 18, 2018

Ok, I see the problem. It looks like a flaw in how we register a symbol and its encoding in the marshaled data. The indices of the links are getting off, which causes it to read too little data for €c, so the next value in the stream is the "T" for UTF-8 encoding, and then the next key appears to be the nil value that €c should have been assigned.

@headius

This comment has been minimized.

Copy link
Member

headius commented Dec 18, 2018

I believe I've fixed the issue in #5528, though there's a bit of a caveat listed there relating to visibility of partially-unmarshaled symbols.

The problem was largely as I described it above; however, the bug was in unmarshaling rather than marshaling. On the dump side, we register the symbol first and then its encoding, but on the unmarshal side we accidentally registered the encoding first. This led to it loading the wrong content for subsequent encodings since they all used the same encoding key of :E. The links for any encodings past the first would point at the first symbol, see it as not an encoding, and then not try to unmarshal the actual encoding. This shifted all values in the marshal stream around, so the keys and values were thrown off.

@larskanis

This comment has been minimized.

Copy link
Author

larskanis commented Dec 18, 2018

Thanks @headius for fixing this so fast!

Maybe we should be running Eventbox's suite in our CI 😁

It doesn't pass on JRuby currently due to issues in Thread.raise/Thread.handle_interrupt which Eventbox makes use of. I didn't dig deeper into it to build a minimal example, since the related MRI tests are still disabled in JRuby. It would be nice if they could be addressed sometime.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.