jruby-1.7.3 Timeout is not working with long string and regexp. #792

Closed
Leeheng86 opened this Issue Jun 5, 2013 · 4 comments

@Leeheng86

The problem is easy to reproduce with the script below.
The same script works on Ruby 1.9.3, but on jruby-1.7.3 the timeout does not fire.
Please help!

require 'timeout'
posstr = "NRNRNNVVNRNNNNNNPUNNNNPUNRNNNNNNNNNNNNNRNNNNNRNNNRNNNNJJNNCCNNNRNNVVNNNNNRNNNNNRNNNNNRNNNNNRNNNRNNNNNNNNNNCCNNNNNNNNNNNNNNNRNNNNNNPUNNNNPUNNNNNNNRNNNNNNNRNNNNNNNNNNNNNNNNNNNNNNNNNRNNNNNNNNVVNNNNNNNNNNNNNNNNNNNNNRNNNNNNNNNRNNNNNNPUNNNNPUNRNNNNNNNNCCNNJJNN"
re= /((?:(?:(?:JJ|VA)(?:DEC)?)*|AS)?(?:(?:NN|NR|NT)+(?:PU|CC|PP)?)+)$/
begin
    Timeout::timeout(5) do
        posstr.scan(re)
    end
rescue Timeout::Error
    puts "Timed out (this line is never reached on jruby-1.7.3)."
end
@headius
Member
headius commented Jun 6, 2013

Ahh this is a nice test case for regexp interruption.

In 1.7.4 we added the ability to interrupt a regexp (in the "joni" implementation we ship), but did not wire it up to all places where a thread can be interrupted from Ruby. Timeout appears to be one of them.

It will not be hard to add this to our logic, and I had also planned to clean up thread interruption for 1.7.5.

A workaround for you, using thread interruption at the Java level:

posstr = "NRNRNNVVNRNNNNNNPUNNNNPUNRNNNNNNNNNNNNNRNNNNNRNNNRNNNNJJNNCCNNNRNNVVNNNNNRNNNNNRNNNNNRNNNNNRNNNRNNNNNNNNNNCCNNNNNNNNNNNNNNNRNNNNNNPUNNNNPUNNNNNNNRNNNNNNNRNNNNNNNNNNNNNNNNNNNNNNNNNRNNNNNNNNVVNNNNNNNNNNNNNNNNNNNNNRNNNNNNNNNRNNNNNNPUNNNNPUNRNNNNNNNNCCNNJJNN"
re= /((?:(?:(?:JJ|VA)(?:DEC)?)*|AS)?(?:(?:NN|NR|NT)+(?:PU|CC|PP)?)+)$/

re_thread = Thread.new do
    posstr.scan(re)
end

# get java.lang.Thread for our Regexp-matching thread
java_thread = JRuby.reference(re_thread).native_thread

# timeout after 5 seconds
sleep 5
java_thread.interrupt if re_thread.alive?

# wait for thread to die
Thread.pass while re_thread.alive?
puts "successfully interrupted thread!"
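
The workaround above can also be packaged as a small helper. The following is only a sketch under stated assumptions: `run_with_deadline` is a hypothetical name (it is not part of JRuby or the timeout library), it waits with `Thread#join` and a limit instead of `sleep`, and it assumes that `Thread#kill` is sufficient on non-JRuby runtimes. The JRuby branch uses exactly the calls from the workaround.

```ruby
# Hypothetical helper, not part of JRuby or the timeout library.
# Runs the block on a worker thread and waits at most `seconds` for it.
def run_with_deadline(seconds)
  worker = Thread.new { yield }

  # Thread#join(limit) returns the thread if it finished in time,
  # or nil if the limit expired first.
  return worker.value if worker.join(seconds)

  if defined?(JRuby) && JRuby.respond_to?(:reference)
    # Same Java-level interrupt as in the workaround above.
    JRuby.reference(worker).native_thread.interrupt
  else
    # Assumption: on other Rubies Thread#kill is enough to stop the worker.
    worker.kill
  end
  nil
end

# Usage: returns the scan result, or nil if the deadline passes.
result = run_with_deadline(5) { "NNNRNN".scan(/(?:NN|NR)+/) }
```

The helper returns `nil` on a missed deadline without waiting for the worker to die, so a caller that needs that guarantee would still have to poll `alive?` as in the workaround.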
@Leeheng86

Thanks for your reply!!!
It is very helpful. Interestingly, when I use thread.join to limit the running time of string.scan(re), it works.

But whichever way I monitor the thread's running time (polling re_thread.alive? or calling thread.join), my program suffers a serious performance problem and becomes very slow.

So I ran the following experiment:

require 'benchmark'
require 'timeout'
Benchmark.bmbm(10) do |x|
    x.report("timeout" ) { 5000.times { Timeout::timeout(5) { 1 + 1 } } }
    x.report("thread"  ) { 5000.times { t = Thread.new { 1 + 1 }; t.join(5) } }
end
hduser@server-828:~$ ruby test.rb
Rehearsal ----------------------------------------------
timeout      0.320000   0.570000   0.890000 (  0.768562)
thread       0.270000   0.610000   0.880000 (  0.756229)
------------------------------------- total: 1.770000sec
timeout      0.230000   0.560000   0.790000 (  0.697693)
thread       0.180000   0.500000   0.680000 (  0.563507)

On Ruby 1.9.3 there is little difference between the two. I guess that is because a Ruby thread is not a real thread.

hduser@server-828:~$ jruby-1.7.3 test.rb
Rehearsal ----------------------------------------------
timeout      0.690000   0.100000   0.790000 (  0.531000)
thread       2.710000   1.200000   3.910000 (  3.003000)
------------------------------------- total: 4.700000sec
timeout      0.270000   0.090000   0.360000 (  0.232000)
thread       2.520000   1.320000   3.840000 (  2.772000)

But on JRuby, the thread approach is much slower than the timeout approach for limiting the running time of a block.

So once interruption is added in JRuby 1.7.5, will it be possible to keep the efficiency of Timeout and still interrupt a regexp match inside the block?
Thanks again for your answer.

@headius
Member
headius commented Jun 6, 2013

It is interesting to see how much slower a thread is to spin up in JRuby. MRI threads are still "real" threads, but it is likely that the cost of creating structures around the thread is much lower. I will try to look into that issue as well.

@Leeheng86

Thank you very much. :)

@headius closed this in 042dd01 Jun 6, 2013