New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

rbThrowInternal throws error if IRubyObject ID is greater than Integer.MAX_VALUE #3980

Closed
juergenthomann opened this Issue Jun 27, 2016 · 14 comments

Comments

Projects
None yet
3 participants
@juergenthomann

Environment

  • JRuby 1.7.20.1
  • Debian Jessie 3.16.7-ckt20 x86_64
  • Oracle Java 1.8.0_92

This Problem occurs with Puppetserver, please also look at https://tickets.puppetlabs.com/browse/PUP-6409 for details in Puppet

Expected Behavior

JRuby shouldn't throw this error at all, as the Internal Object ID shouldn't influence the behavior of the program.

Actual Behavior

JRuby behaves as it should without any problems until the Object Id reached a value greater than Integer.MAX_VALUE. There is a line message += " in thread 0x" + Integer.toHexString(RubyInteger.fix2int(currentThread.id())); in org.jruby.RubyKernel which converts the Object ID to Integer.
If RubyBasicObject is the really source of the currentThread.id() it is a Long and can be too large

If I have missed something, let me know or also comment on the mentioned Puppet ticket if you like.

@kares

This comment has been minimized.

Show comment
Hide comment
@kares

kares Jun 27, 2016

Member

good catch I believe Long.toHexString(RubyInteger.fix2long(currentThread.id())) should be sufficient since toHexString internally treats long as unsigned.

Member

kares commented Jun 27, 2016

good catch I believe Long.toHexString(RubyInteger.fix2long(currentThread.id())) should be sufficient since toHexString internally treats long as unsigned.

@kares

This comment has been minimized.

Show comment
Hide comment
@kares

kares Jun 27, 2016

Member

the question is whether it should print the Ruby id or simply the "native" Java thread id-hash (address).

Member

kares commented Jun 27, 2016

the question is whether it should print the Ruby id or simply the "native" Java thread id-hash (address).

@kares

This comment has been minimized.

Show comment
Hide comment
@kares

kares Jun 27, 2016

Member

looked around and its not in recent MRI (2.x) nor in 1.9.3 seems that the problematic part only existed in 1.8.7 ... although it seems useful to have the additional in thread 0x... in err.message I am not sure whether to keep it on 1.9/2.3 (JRuby 1.7/9k) since MRI does not longer have such code. // cc @enebo @headius ?

Thread.new { 
  puts Thread.current.inspect

  begin
    throw :fail
  rescue => e
    puts e.inspect
  end 
}.join

MRI 1.9.3 :

#<Thread:0x00000002509810 run>
#<ArgumentError: uncaught throw :fail>

JRuby 1.7 (same for 9k) :

#<Thread:0xc54ccd5 run>
#<ArgumentError: uncaught throw :fail in thread 0x7d0>
Member

kares commented Jun 27, 2016

looked around and its not in recent MRI (2.x) nor in 1.9.3 seems that the problematic part only existed in 1.8.7 ... although it seems useful to have the additional in thread 0x... in err.message I am not sure whether to keep it on 1.9/2.3 (JRuby 1.7/9k) since MRI does not longer have such code. // cc @enebo @headius ?

Thread.new { 
  puts Thread.current.inspect

  begin
    throw :fail
  rescue => e
    puts e.inspect
  end 
}.join

MRI 1.9.3 :

#<Thread:0x00000002509810 run>
#<ArgumentError: uncaught throw :fail>

JRuby 1.7 (same for 9k) :

#<Thread:0xc54ccd5 run>
#<ArgumentError: uncaught throw :fail in thread 0x7d0>

@kares kares removed the beginner label Jun 27, 2016

@cprice404

This comment has been minimized.

Show comment
Hide comment
@cprice404

cprice404 Jun 27, 2016

Contributor

It seems like if it's going to log anything about the thread ID, the JVM thread id would be more valuable than the Ruby object ID, but removing it altogether seems like it might be the right call since it's not logged that way in MRI. </two_cents>

Contributor

cprice404 commented Jun 27, 2016

It seems like if it's going to log anything about the thread ID, the JVM thread id would be more valuable than the Ruby object ID, but removing it altogether seems like it might be the right call since it's not logged that way in MRI. </two_cents>

@cprice404

This comment has been minimized.

Show comment
Hide comment
@cprice404

cprice404 Jun 27, 2016

Contributor

Tangientially related to this issue: I'm trying to figure out how a user would end up in a situation where this occurs. I don't see any immediately obvious ways that new instances of RubyThread would be getting constructed by our code, and it seems like in order to hit this issue, there'd have to be a huge number of objects that had been created (to get the object id counter up to the offending range), and then a RubyThread would have to be created after that point in time. Are there common scenarios that would result in the construction of RubyThread objects after the container is initialized, and assuming that the Ruby code being loaded doesn't explicitly create any Thread objects?

Contributor

cprice404 commented Jun 27, 2016

Tangientially related to this issue: I'm trying to figure out how a user would end up in a situation where this occurs. I don't see any immediately obvious ways that new instances of RubyThread would be getting constructed by our code, and it seems like in order to hit this issue, there'd have to be a huge number of objects that had been created (to get the object id counter up to the offending range), and then a RubyThread would have to be created after that point in time. Are there common scenarios that would result in the construction of RubyThread objects after the container is initialized, and assuming that the Ruby code being loaded doesn't explicitly create any Thread objects?

@cprice404

This comment has been minimized.

Show comment
Hide comment
@cprice404

cprice404 Jun 28, 2016

Contributor

For posterity: I found a code path in the JRuby code that might explain it. There is a method called org.jruby.Ruby#getCurrentContext, which gets called from lots and lots of places. Its call stack ends up checking for the existence of a soft reference of a context object that would contain the current RubyThread object, and if that reference has been cleared out by GC, then it calls an adopt method to "adopt" the current Java thread, which ultimately results in the construction of a new RubyThread instance.

If you have enough load on your JVM, and a small enough heap size, it may be garbage collecting these SoftReferences frequently enough to trigger this code path and I think that would explain it how we might be ending up with high values for object ids for {{RubyThread}} objects.

Contributor

cprice404 commented Jun 28, 2016

For posterity: I found a code path in the JRuby code that might explain it. There is a method called org.jruby.Ruby#getCurrentContext, which gets called from lots and lots of places. Its call stack ends up checking for the existence of a soft reference of a context object that would contain the current RubyThread object, and if that reference has been cleared out by GC, then it calls an adopt method to "adopt" the current Java thread, which ultimately results in the construction of a new RubyThread instance.

If you have enough load on your JVM, and a small enough heap size, it may be garbage collecting these SoftReferences frequently enough to trigger this code path and I think that would explain it how we might be ending up with high values for object ids for {{RubyThread}} objects.

@kares

This comment has been minimized.

Show comment
Hide comment
@kares

kares Jun 29, 2016

Member

@cprice404 that is an interesting reasoning - definitely worth confirming/looking into it. however the thread adopting path does not initialize an object-id for the Ruby Thread (it's lazy) ... maybe you should show us a piece of code for better understanding whats going on in your case.

Member

kares commented Jun 29, 2016

@cprice404 that is an interesting reasoning - definitely worth confirming/looking into it. however the thread adopting path does not initialize an object-id for the Ruby Thread (it's lazy) ... maybe you should show us a piece of code for better understanding whats going on in your case.

@cprice404

This comment has been minimized.

Show comment
Hide comment
@cprice404

cprice404 Jun 29, 2016

Contributor

Doesn't a new objectId get created every time a new RubyObject is instantiated? If so, wouldn't this line cause that to happen?:

https://github.com/jruby/jruby/blob/1.7.20.1/core/src/main/java/org/jruby/RubyThread.java#L418

I'm trying to come up with a reproducer but it's very challenging, because it will either need to be run with such a tight heap size that GC is frequently clearing out soft references, or it will need to artificially create a ton of extra Java threads that have their own ThreadLocal context. If I can come up with something, I'll post it.

The integer overflow seems like an obvious bug in any case, though, right?

Contributor

cprice404 commented Jun 29, 2016

Doesn't a new objectId get created every time a new RubyObject is instantiated? If so, wouldn't this line cause that to happen?:

https://github.com/jruby/jruby/blob/1.7.20.1/core/src/main/java/org/jruby/RubyThread.java#L418

I'm trying to come up with a reproducer but it's very challenging, because it will either need to be run with such a tight heap size that GC is frequently clearing out soft references, or it will need to artificially create a ton of extra Java threads that have their own ThreadLocal context. If I can come up with something, I'll post it.

The integer overflow seems like an obvious bug in any case, though, right?

@kares

This comment has been minimized.

Show comment
Hide comment
@kares

kares Jun 29, 2016

Member

yes the bug is going to be fixed anyway, but there's probably something else going on (maybe another bug).

seems to me like you would need 2^31 threads hitting the currentThread.id() line or something else to consume up all of the integers in a 32-bit signed integer.

unless object_id is called the mentioned line won't use one (and thus increment the long object id counter) :

OBJECT_ID = ENV['OBJECT_ID'] != 'false'

threads = []
1000.times { threads << Thread.new { OBJECT_ID ? Thread.current.object_id : Thread.current  }  }

sleep 0.5; threads.each(&:join)

puts ''.object_id

you'll get a consistent result depending on whether you set OBJECT_ID=false

Member

kares commented Jun 29, 2016

yes the bug is going to be fixed anyway, but there's probably something else going on (maybe another bug).

seems to me like you would need 2^31 threads hitting the currentThread.id() line or something else to consume up all of the integers in a 32-bit signed integer.

unless object_id is called the mentioned line won't use one (and thus increment the long object id counter) :

OBJECT_ID = ENV['OBJECT_ID'] != 'false'

threads = []
1000.times { threads << Thread.new { OBJECT_ID ? Thread.current.object_id : Thread.current  }  }

sleep 0.5; threads.each(&:join)

puts ''.object_id

you'll get a consistent result depending on whether you set OBJECT_ID=false

@cprice404

This comment has been minimized.

Show comment
Hide comment
@cprice404

cprice404 Jun 29, 2016

Contributor

@kares ok, thanks for the extra info. I'll see if I can come up with a concrete reproducer.

Contributor

cprice404 commented Jun 29, 2016

@kares ok, thanks for the extra info. I'll see if I can come up with a concrete reproducer.

@cprice404

This comment has been minimized.

Show comment
Hide comment
@cprice404

cprice404 Jun 30, 2016

Contributor

Here's a Java program that reproduces:

public class ThreadIdRepro {
    public static void main(String[] args) throws InterruptedException {
        String objIdIncScript =
                "(0..2**32).each {|i| ''.object_id }\n" +
                "puts \"new object id: #{''.object_id}\"\n";
        String throwScript =
                "begin\n" +
                "  throw :foo\n" +
                "rescue => e\n" +
                "  puts \"CAUGHT EXCEPTION: #{e}\"\n" +
                "end\n";

        ScriptingContainer sc = new ScriptingContainer(LocalContextScope.SINGLETHREAD);
        long start = System.currentTimeMillis();
        System.out.println("Beginning object id generation.");
        sc.runScriptlet(objIdIncScript);
        System.out.println("Finished object id generation in " + (System.currentTimeMillis() - start) + " ms.");

        Thread t = new Thread(new Runnable() {
            @Override
            public void run() {
                sc.runScriptlet(throwScript);
            }
        });
        t.start();
        t.join();
    }
}

So, once the object ID gets too high (via any ruby objects being created that cause it to increment), then after that, you can trigger it by creating a single new Java thread that encounters a throw.

Contributor

cprice404 commented Jun 30, 2016

Here's a Java program that reproduces:

public class ThreadIdRepro {
    public static void main(String[] args) throws InterruptedException {
        String objIdIncScript =
                "(0..2**32).each {|i| ''.object_id }\n" +
                "puts \"new object id: #{''.object_id}\"\n";
        String throwScript =
                "begin\n" +
                "  throw :foo\n" +
                "rescue => e\n" +
                "  puts \"CAUGHT EXCEPTION: #{e}\"\n" +
                "end\n";

        ScriptingContainer sc = new ScriptingContainer(LocalContextScope.SINGLETHREAD);
        long start = System.currentTimeMillis();
        System.out.println("Beginning object id generation.");
        sc.runScriptlet(objIdIncScript);
        System.out.println("Finished object id generation in " + (System.currentTimeMillis() - start) + " ms.");

        Thread t = new Thread(new Runnable() {
            @Override
            public void run() {
                sc.runScriptlet(throwScript);
            }
        });
        t.start();
        t.join();
    }
}

So, once the object ID gets too high (via any ruby objects being created that cause it to increment), then after that, you can trigger it by creating a single new Java thread that encounters a throw.

@cprice404

This comment has been minimized.

Show comment
Hide comment
@cprice404

cprice404 Jun 30, 2016

Contributor

FWIW we have a fairly decent number of users who are running into this and have to restart our application once it starts happening. If there's anything we can do to help get a fix in (filing a PR or whatever), let us know and we are more than happy to help!

Contributor

cprice404 commented Jun 30, 2016

FWIW we have a fairly decent number of users who are running into this and have to restart our application once it starts happening. If there's anything we can do to help get a fix in (filing a PR or whatever), let us know and we are more than happy to help!

kares added a commit that referenced this issue Jun 30, 2016

@kares kares closed this in f2e250e Jun 30, 2016

@kares kares added this to the JRuby 1.7.26 milestone Jun 30, 2016

@cprice404

This comment has been minimized.

Show comment
Hide comment
@cprice404

cprice404 Jun 30, 2016

Contributor

@kares thank you!

Contributor

cprice404 commented Jun 30, 2016

@kares thank you!

@cprice404

This comment has been minimized.

Show comment
Hide comment
@cprice404

cprice404 Jul 21, 2016

Contributor

Any idea when 1.7.26 might be released? And/or is there another forum where I should ask that question? Thanks!

Contributor

cprice404 commented Jul 21, 2016

Any idea when 1.7.26 might be released? And/or is there another forum where I should ask that question? Thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment