New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

String from literal sometimes not reused with frozen_string_literal enabled #4970

Closed
ivoanjo opened this Issue Jan 12, 2018 · 11 comments

Comments

Projects
None yet
2 participants
@ivoanjo
Contributor

ivoanjo commented Jan 12, 2018

Environment

  • JRuby: jruby 9.1.15.0 (2.3.3) 2017-12-07 929fde8 Java HotSpot(TM) 64-Bit Server VM 25.151-b12 on 1.8.0_151-b12 [linux-x86_64]
  • Kernel: Linux u186024434db159d25c92 4.13.0-26-generic #29~16.04.2-Ubuntu SMP Tue Jan 9 22:00:44 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux
  • Distro: Ubuntu 16.04.3 LTS

Expected Behavior

With frozen_string_literal enabled, always MRI reuses repeated copies of the same literal.

Testcase:

# frozen_string_literal: true

puts RUBY_DESCRIPTION

DUMMY_VALUE = 'hello bug'
puts "object_id for DUMMY_VALUE: #{DUMMY_VALUE.object_id}"

Integer(ENV['THREADS'] || 100).times do
  Thread.new { }
end

OTHER_VALUE = 'hello bug'
puts "object_id for OTHER_VALUE: #{OTHER_VALUE.object_id}"

Output on MRI:

$ ruby spec/helpers/dummy_thread.rb
ruby 2.5.0p0 (2017-12-25 revision 61468) [x86_64-linux]
object_id for DUMMY_VALUE: 10642320
object_id for OTHER_VALUE: 10642320

Actual Behavior

JRuby's behavior for this is pretty bizarre. With THREADS set to a small number, it works the same as MRI:

$ THREADS=10 ruby --dev spec/helpers/dummy_thread.rb
jruby 9.1.15.0 (2.3.3) 2017-12-07 929fde8 Java HotSpot(TM) 64-Bit Server VM 25.151-b12 on 1.8.0_151-b12 [linux-x86_64]
object_id for DUMMY_VALUE: 2000
object_id for OTHER_VALUE: 2000

But eventually as you raise the number of threads, it stops reusing the literal:

$ THREADS=100 ruby --dev spec/helpers/dummy_thread.rb
jruby 9.1.15.0 (2.3.3) 2017-12-07 929fde8 Java HotSpot(TM) 64-Bit Server VM 25.151-b12 on 1.8.0_151-b12 [linux-x86_64]
object_id for DUMMY_VALUE: 2000
object_id for OTHER_VALUE: 2002

The example above is from a test suite that was randomly failing on JRuby sometimes, as it randomly ordered threads and thus it would fail when the tests that played with threads ran before a test that tested that two literals were exactly the same.

@headius

This comment has been minimized.

Member

headius commented Jan 12, 2018

Why are you testing that two literals are exactly the same?

@headius

This comment has been minimized.

Member

headius commented Jan 12, 2018

There's likely a race here in Ruby.freezeAndDedupString. We've seen some intermittent failures on other tests (which, being in MRI's suite, I understand). But the actual frozen string objects should be reused at each site.

As far as I know, there's nothing specified in any MRI documentation that says two different sites with the same content should return the same frozen string object. We can probably fix this, but I think it's a bad side effect to test for.

@ivoanjo

This comment has been minimized.

Contributor

ivoanjo commented Jan 12, 2018

You're right, for this test we can just use == instead of testing that the objects are exactly the same.

My main worry was the randomness of the behavior, and if it was a symptom of a bigger issue.

@headius

This comment has been minimized.

Member

headius commented Jan 12, 2018

I've located a likely culprit in the code and I'm working on a fix.

headius added a commit that referenced this issue Jan 12, 2018

Avoid race when deduplicating frozen strings. Fixes #4970.
The logic here goes like this:

* Attempt to lookup the dedup'ed string.
* If found, return it.
* Try to insert new dedup'ed string.
* If nobody beat us to it, return it.
* If someone beat us to it and the reference is still good, return
  it.
* If someone beat us to it and the reference is vacated,
  atomically replace it or retrieve any updates that beat us.
* Keep trying until we succeed in updating or get back a populated
  reference.

Previously, there could be a race if two threads try to dedup the
same string at the same time. Only one would get in the cache, but
the n+1 threads might not return the already-cached value.
@headius

This comment has been minimized.

Member

headius commented Jan 12, 2018

I believe I've got this fixed, but please review and test locally. I could not reproduce the issue with your script, but I believe you :-)

@headius headius closed this Jan 12, 2018

@headius headius added this to the JRuby 9.1.16.0 milestone Jan 12, 2018

@ivoanjo

This comment has been minimized.

Contributor

ivoanjo commented Jan 12, 2018

Thanks for the quick turnaround! I'll test and report back asap.

@ivoanjo

This comment has been minimized.

Contributor

ivoanjo commented Jan 13, 2018

Hey @headius, I've been able to re-test and I'm problem is still there. (I can't reopen the ticket myself, it seems).

I've debugged it a bit, it's definitely something around freezeAndDedupString and references getting collected. With an improved test case:

# frozen_string_literal: true

puts RUBY_DESCRIPTION

DUMMY_VALUE = 'hello bug'
puts "object_id for DUMMY_VALUE: #{DUMMY_VALUE.object_id}"

java.lang.System.gc unless ENV['NOGC']

OTHER_VALUE = 'hello bug'
puts "object_id for OTHER_VALUE: #{OTHER_VALUE.object_id}"

I can reproduce every time: it's the GC that triggers it. That's probably why you couldn't reproduce --- you probably have more RAM than I do :)

$ NOGC=1 ruby/jruby/bin/jruby testcase.rb
jruby 9.1.16.0-SNAPSHOT (2.3.3) 2018-01-13 23783ce Java HotSpot(TM) 64-Bit Server VM 25.151-b12 on 1.8.0_151-b12 [linux-x86_64]
object_id for DUMMY_VALUE: 2000
object_id for OTHER_VALUE: 2000

$ ruby/jruby/bin/jruby testcase.rb
jruby 9.1.16.0-SNAPSHOT (2.3.3) 2018-01-13 23783ce Java HotSpot(TM) 64-Bit Server VM 25.151-b12 on 1.8.0_151-b12 [linux-x86_64]
object_id for DUMMY_VALUE: 2000
object_id for OTHER_VALUE: 2002

Looking into it, now that it's consistently reproduceable, I found that these strings are being created as part of the PutConstInstr:

at org.jruby.ir.runtime.IRRuntimeHelpers.newFrozenString(IRRuntimeHelpers.java:1912)
at org.jruby.ir.operands.FrozenString.createCacheObject(FrozenString.java:109)
at org.jruby.ir.operands.ImmutableLiteral.cachedObject(ImmutableLiteral.java:63)
at org.jruby.ir.operands.ImmutableLiteral.retrieve(ImmutableLiteral.java:84)
at org.jruby.ir.instructions.PutConstInstr.interpret(PutConstInstr.java:26)

IRRuntimeHelpers::newFrozenString creates a new string with the received ByteList, and then calls Ruby::freezeAndDedupString. Inside freezeAndDedupString we create a frozen copy of the new string and add it to the map, keyed by the original string:

// Never use value as key
deduped = string.strDup(this);
deduped.setFrozen(true);

final WeakReference<RubyString> weakref = new WeakReference<>(deduped);

// try to insert new
dedupedRef = dedupMap.computeIfAbsent(string, new Function<RubyString, WeakReference<RubyString>>() {

...and then return the deduped frozen copy in the next step.

So the problem is that the WeakHashMap key is the original string, which gets instanced on IRRuntimeHelpers::newFrozenString and is no longer referenced anywhere after Ruby::freezeAndDedupString returns. So, whenever the GC gets around to checking the weak map, it rightfully sees that the original key is gone, and drops it.

I believe the fix is simple: the new deduped string should be the one that is used as a key, and I would've submitted a PR with that change if you hadn't explicitly add the comment // Never use value as key.

So my question is --- why? In this case it seems like exactly what we want: the deduped string is weakly referenced as a key AND weakly referenced as a value, because the behavior we want out of this map (I think...) is that it should behave as a WeakHashSet, if such a thing existed. That is, we expect that while the deduped string is referenced somewhere else, we keep it in the map, otherwise it goes away.

@headius headius reopened this Jan 16, 2018

headius added a commit that referenced this issue Jan 22, 2018

@headius

This comment has been minimized.

Member

headius commented Jan 22, 2018

I believe the fix is simple: the new deduped string should be the one that is used as a key, and I would've submitted a PR with that change if you hadn't explicitly add the comment // Never use value as key.

So my question is --- why?

Actually you're right...the comment is misleading.

What I'm trying to avoid here is ever using the incoming string as the dedup key, since we don't know how else that string is being used. You are correct; the actual deduped string should be both weak key and weak value (WeakHashSet would be nice) and the hard reference keeping it alive is the responsibility of freezeAndDedupString callers.

Also discovered: I will have to break the atomicity of the patch on JRuby 9.1 since it supports Java 7 and the Map.compute* APIs are Java 8+. I don't expect it to affect much code since it would require concurrently loading two+ pieces of code with the same frozen string(s).

headius added a commit that referenced this issue Jan 22, 2018

Additional fixes for deuping strings (#4970).
* Use deduped as both key and value. Hard reference to key is the
  responsibility of callers of freezeAndDedupString.
* Remove use of Java 8+ atomic Map.compute API calls. This means
  9.1 will have a race, but that should only affect concurrent
  loads that attempt to deduplicate the same currently-not-
  deduplicated strings.
@headius

This comment has been minimized.

Member

headius commented Jan 22, 2018

I have pushed a modified fix to jruby-9.1 that uses the deduped value as the key and passes all your examples. We'll want those to get into ruby/spec.

I'm also doing a fully Java 8 lambda-fied version of the atomic logic for master.

@headius headius closed this in 239ef50 Jan 22, 2018

@ivoanjo

This comment has been minimized.

Contributor

ivoanjo commented Jan 22, 2018

Thanks for the awesome fix!

Indeed the computeIfFoo APIs were a really great addition 🎉

headius added a commit that referenced this issue Jan 22, 2018

@headius

This comment has been minimized.

Member

headius commented Mar 1, 2018

This seems to be happening again for the "freeze" optimization. This recently modified spec now fails to assert two identical frozen strings are the same object.

I've filed #5070. No theories yet.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment