Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Mutexes should use .lock instead of .lockInterruptibly #4261

Closed
andrewvc opened this issue Nov 2, 2016 · 10 comments
Closed

Mutexes should use .lock instead of .lockInterruptibly #4261

andrewvc opened this issue Nov 2, 2016 · 10 comments

Comments

@andrewvc
Copy link

@andrewvc andrewvc commented Nov 2, 2016

Since ruby does not have checked exceptions the behavior here: https://github.com/jruby/jruby/blob/master/core/src/main/java/org/jruby/ext/thread/Mutex.java#L91 is very surprising. If users really need this behavior they should use a ReentrantLock directly.

The only question I have now is if its removal might be even more surprising to existing users. Thoughts?

@andrewvc
Copy link
Author

@andrewvc andrewvc commented Nov 2, 2016

To be clear, this caused a nasty bug for Logstash that took a while to show up.

@headius
Copy link
Member

@headius headius commented Nov 2, 2016

I was about to comment that if we make this change, Thread blocked on Mutex acquisition would no longer be interruptible as in MRI:

[] ~/projects/jruby $ cat blah.rb
m = Mutex.new

m.lock

t = Thread.new { puts 'before'; m.lock rescue puts 'after' }

sleep 1
1 while t.status != "sleep"

t.raise
t.join

[] ~/projects/jruby $ ruby23 blah.rb
before
after

However, JRuby does not appear to be actually interrupt the mutex correctly, and never reaches "after".

@headius headius added this to the JRuby 9.1.6.0 milestone Nov 2, 2016
@andrewvc
Copy link
Author

@andrewvc andrewvc commented Nov 2, 2016

@headius I suppose the confusion is that I'm not using Thread#raise in my own usage, but rather am calling #interrupt on the java thread itself.

I'm not sure if there's a way to disentangle those things, but I do think that'd make sense. That said, we learned our lesson and used ReentrantLock directly so its not a pressing concern for me. I do think that it is confusing if you are using java's Thread#interrupt, but I think its all arguable either way.

At the very least I think it'd be desirable for the situation where there was no ruby raise called to either not throw anything or throw an InterruptedException.

@headius
Copy link
Member

@headius headius commented Nov 2, 2016

It's difficult (maybe impossible) to know whether the interrupt is from someone in normal Java space doing Thread.interrupt or from JRuby trying to wake the thread for a raise or kill event.

@andrewvc: To clarify: you had a thread waiting on a mutex that was interrupted by Java code? I'd like to understand your bug better so we can figure out the best path forward.

Meanwhile, I will push the lock change to a branch, so we can see how CI likes it.

headius added a commit that referenced this issue Nov 2, 2016
@andrewvc
Copy link
Author

@andrewvc andrewvc commented Nov 2, 2016

@headius The gist of the issue is that the grok filter in logstash now monitors grokking threads for timeouts with a TimeoutEnforcer object/thread. This thread periodically sees if threads have been running for too long and interrupts ones that have relying on Joni's interrupt behavior. The threads coordinated with Mutexes which is where the problem lay, that mutex would sometimes be the thing that was interrupted.

Here was the actual bug we hit, + the fix (moving to direct use of a ReentrantLock). logstash-plugins/logstash-filter-grok#98 The ticket is well documented. The PR has a few commits ,but here's the only one that matters: logstash-plugins/logstash-filter-grok@82a75e9 Let me know if you'd like additional information added clarify the ticket.

@headius
Copy link
Member

@headius headius commented Nov 3, 2016

Well it looks like CI is ok.

So the remaining question for me (cc @enebo) is whether we should be looking to fix the uninterruptibility. As far as I can tell, MRI will not break out of a blocked lock request unless you raise in the target thread; simply doing wakeup does not work. Of course neither work for us, but this tells me that being able to interrupt a thread waiting to lock is not typically used by Rubyists, since they'd only be able to do it by raising an error.

I think we'll go with the patch you suggest for now and deal with any fallout.

@headius
Copy link
Member

@headius headius commented Nov 3, 2016

FWIW, here's what I get from MRI if I change that lock-interrupting script to try to wakeup instead of raise.

$ ruby23 blah.rb
before
blah.rb:11:in `join': No live threads left. Deadlock? (fatal)
    from blah.rb:11:in `<main>'
@headius headius closed this Nov 3, 2016
@headius
Copy link
Member

@headius headius commented Nov 3, 2016

Merged to master in 6a436e9.

@headius
Copy link
Member

@headius headius commented Nov 28, 2018

@andrewvc So... we have #5476 now because there's actually specs testing for interruptibility of locks. This is a synthetic motivator but I think we need to revisit how to make Mutex interruptible again without breaking your use case.

@headius
Copy link
Member

@headius headius commented Nov 28, 2018

Main thought that comes to mind would be going back to a hand-written park-based implementation of Mutex. I just don't like to have to go that far.

headius added a commit to headius/jruby that referenced this issue Apr 10, 2019
Fixes jruby#5476.

This reverts the original change from jruby#4261 that disabled lock
interrupts, and uses Task logic from RubyThread to handle the
interrupting properly.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Linked pull requests

Successfully merging a pull request may close this issue.

None yet
2 participants
You can’t perform that action at this time.