-
Notifications
You must be signed in to change notification settings - Fork 419
New ReentrantReadWriteLock class #381
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
New ReentrantReadWriteLock class #381
Conversation
Note the new |
9b1e5d0
to
b977602
Compare
Specs are coming soon... |
I cherry-picked the 2 commits from @jdantonio's PR which still seem useful, so if/when this PR is ready to merge, those will go in at the same time. |
|
||
if c == 0 # no readers OR writers running OR writers waiting | ||
# if we successfully swap the RUNNING_WRITER bit on, then we can go ahead | ||
if @Counter.compare_and_set(0, RUNNING_WRITER) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@alexdowad If I follow the code correctly, this is the bit that allows a thread with a read lock to "upgrade" to a write lock (as in the Rails lock class). Am I correct?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jdantonio, wow, you just made me realize that this code does not handle that "read lock upgrading to write lock" scenario properly! Thanks, I'll fix it!
@alexdowad This looks fantastic! Thank you very much for working on this! The tests are failing on all JRuby build because of the Java update to
This looks like it should be easy to fix. I can look into it if need be. The tests are also failing on Windows because of a With respect to the tests, it seems to me that we should be able to extract most of the tests for |
I'll fix the Java code. Sorry, this PR is still a bit rough. Regarding |
The Windows build will run automatically every time you update the PR. Right now we're testing both 32 and 64 bit versions of Ruby on Windows. There is a
|
e6526c5
to
64c9a57
Compare
You can see that the specs are underway now (and have already flushed out a number of bugs). Still lots more tests to add. A fun diversion has come up, though: I wrote a spec which causes MRI to segfault! To my trusty debugger... |
64c9a57
to
b223545
Compare
Sorry, just re-ordered some commits. |
5a10fc6
to
554a0be
Compare
We have a pretty good little suite of specs for I think a thorough code review by at least one either developer is needed before this goes into production. |
The tests hang with |
554a0be
to
cac958a
Compare
OK, I'm on it. |
If anyone is interested in the MRI segfault which was discovered while working on this PR, here is what it turned out to be: https://bugs.ruby-lang.org/issues/11404 |
c3f62ca
to
fcf02d1
Compare
The problem on Windows is sorted out. Unfortunately, it looks like def ns_wait(timeout = nil)
wchan = Rubinius::Channel.new
begin
@__Waiters__.push wchan
puts "Waiting.... #{self.inspect}"
puts "Locked = #{Rubinius.locked?(self)}"
puts Rubinius.unlock(self).inspect
puts "Locked = #{Rubinius.locked?(self)}"
signaled = wchan.receive_timeout timeout
ensure
Rubinius.lock(self)
if !signaled && !@__Waiters__.delete(wchan)
# we timed out, but got signaled afterwards,
# so pass that signal on to the next waiter
@__Waiters__.shift << true unless @__Waiters__.empty?
end
end
self
end That's a modified version of
Looking at the RBX source, it looks like @pitr-ch, your comments please! |
fcf02d1
to
dcba34d
Compare
I'm glad you were able to get things to work on Windows! I didn't have a chance to take a look last night as I had hoped. The one test that failed on one Windows build doesn't concern me. Tests that check the status of a thread are incredibly difficult to make work consistently. The Rbx bug is unfortunate, but I don't think this should stop is from merging this PR. There is nothing we can do about a buggy runtime. We can submit a bug to Rbx and add an Rbx-specific warning to the code. |
@jdantonio, of course you can do whatever you like, but I would suggest holding onto this one for just a few days to see if we can really make it right. |
@alexdowad I was under the impression that we couldn't do anything about this. If there is a way to fix it then by all means, we should. |
@alexdowad thanks for letting me know, I'll look asap. 👍 for waiting until fixed, we cannot released it if it's broken on Rubinius anyway, we should at least give them a chance to fix it if it's indeed bug in the runtime. |
@ReadQueue = Synchronization::Lock.new # used to queue waiting readers | ||
@WriteQueue = Synchronization::Lock.new # used to queue waiting writers | ||
@HeldCount = ThreadLocalVar.new(0) # indicates # of R & W locks held by this thread | ||
end |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it's more safe to have ensure_ivar_visibility!
here as in ReadWriteLock
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The question is, if we put a memory barrier there on Rubinius, are we also going to put a memory barrier at the beginning of every method which uses the ivars? If not, then is the memory barrier in the constructor actually worth anything?
Regarding the current problem with Rubinius.unlock
-- I have tried peppering the code with memory barriers and it didn't make any difference.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The reasoning for the barrier is as follows: the object is being constructed so no other thread has a reference to it (assuming proper construction, where reference does not escape during construction). The barrier is here so final field assignments of the constructed object cannot be reordered with an assignment of the constructed object to a publicly accessible variable where it can be read by other threads. If the barrier is omitted other threads can see the object instance before it's fully initialised because of the possible reorderings on thread performing the construction.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, makes sense!
I'd like to make this improvement without inheriting from Synchronization::Lock
. Lock
adds extra instance variables which will not be useful for ReentrantReadWriteLock
. This would increase the memory footprint of each instance.
What would you think of Rubinius.memory_barrier if Concurrent.on_rbx?
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For now i think its better to inherit so it's used standardly. However I am aware of this issue and I'll fix it later for all classes using Synchronisation::Object
. If here is some custom code here I can easily miss it.
61bc71d
to
4f47db6
Compare
@@ -8,13 +8,26 @@ module Synchronization | |||
class RbxObject < AbstractObject | |||
def initialize | |||
@__Waiters__ = [] | |||
@__Owner__ = nil |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sorry for nitpick: this should be @__owner__
since it's mutated.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, I didn't know your convention was caps for "final". Fixed.
… held Rubinius.lock counts how many times an object has been locked, and only releases the lock when it has been unlocked an equal number of times. Therefore, if Lock#wait is called on a Lock which was *already* locked, the call to Rubinius.unlock which occurs before sleeping does not actually release the lock. This makes it impossible for other threads to wake it up with #signal or #broadcast. A solution is to make sure that #synchronize never calls Rubinius.lock on an object which is already locked.
ReadWriteLock is a fast and very useful utility, but it is non-reentrant and this makes it unsuitable for some (well, a lot of) applications. You can never take an exclusive lock without being 100% sure that you are not holding it already. If you mess up, then game over. Your thread will deadlock hard and grind to a halt. Any other thread which tries to take the exclusive lock will deadlock too. When there are only a few sections of code which access a shared, mutable resource, avoiding this scenario is not so hard. But when you have a lot of places where locking is needed, and they call each other in complex ways, forget it. You WILL get it wrong. If you are building a library for other programmers to use, and you call into "consumer" code with locks held, that's even worse. (Such as public API methods which yield to a block...) You don't know what that "consumer" code will do. It may call back into your library again. Reentrant locks are so much more friendly. You can audit each method individually to check whether it acquires (and releases) all the locks that it needs in the right places. You don't have to check every single call site to see which locks may already be held when the method is entered. (This will save you some lost hair, which you can later tear out when dealing with lock acquisition order issues.) Acquiring and releasing a ReentrantReadWriteLock is about 50% more expensive (in CPU time) than a plain ReadWriteLock. But it's still fast. Don't worry about performance unless you expect to acquire/release it 10,000s of times per second. ReentrantReadWriteLock has another nice property: if you are holding a write lock, you can FIRST acquire a read lock and THEN drop the write lock. This allows lock "downgrading". "Upgrading" can be done in the converse way. If you think "ReentrantReadWriteLock" is too long to type (and it is), feel free to "alias" it by defining a constant. This is Ruby, after all.
4f47db6
to
5d4e385
Compare
@pitr-ch, they do seem similar, because the algorithm used for DRY is generally a good thing, but can be detrimental when taken too far. Why? Because removing duplication often means adding another layer of indirection. Too much indirection makes code harder to read, understand, and maintain. At the same time, higher LOC count also makes a project harder to understand and maintain. And unifying duplicated code has the advantage that bug fixes and performance optimizations on de-duplicated code can benefit all the places that use it. On the other hand, de-duplicating also means that you can break class A while trying to fix class B. You are creating dependencies where none existed before. There is more that could be said about pros and cons on both sides, but in short, I believe that readability, maintainability, and performance of this code will all be enhanced if we just keep the reentrant RW-lock classes separate. However, I would love to be proved wrong, if you (or anybody) can show a nice way to break out the pieces of the algorithms, which reduces duplication while enhancing readability and maintaining good performance. |
Ok at first glance it looked similar a lot, but as you say it just appears that way. So lets keep it, no need to delay this PR. I may look at it later. |
New ReentrantReadWriteLock class
Unfortunately, while @alexdowad was doing his excellent work to create a highly-performant lock class that was compatible with the dependency loader lock in Rails, they added new features to their lock. At this point it looks like their lock is highly specialized and probably not worth our time to support. :-( |
It is quite specialized indeed. I wonder how much they gain from those "loose upgrades". It seems to me the purpose could probably be achieved in a much simpler way, though I'm not deep in the Rails code so I don't know. |
@alexdowad they prevent us from deadlocking as soon as you have two concurrent requests, which both need an opportunity to do a load |
@alexdowad Did you get my email? |
On the bright side, this new lock class is an excellent addition! And we have an improved TLV, too! |
@matthewd, why do requests need to do a load? Autoload of Rails classes/other gem classes? |
Rails application classes (e.g., the UserController or Post classes) -- the lock is for the development mode, (re)load-on-demand, behaviour. |
Why does that reloading have to be done in the middle of a request? Can't you complete all outstanding requests, reload, then start accepting requests again? |
We unload between requests, and we load each class when it is referenced (via |
OK, fair enough. |
@matthewd Thank you for all your feedback! |
We could generalise our ReadWrite lock implementation to be able to cover the Rails use-case. A factory class could be introduced to CR, which would generate a Lock classes based on given rules. The rules would be a small set of purposes and compatibility rules. Based on that it should be able to generate performant implementation. For ReadWriteLock the rules would be: This is just an idea for improvement, since the lock is not used in Rails production, there will be little gain if Rails migrate to this more performant implementation. |
@pitr-ch It's an interesting idea, but do you think such an interface would be useful for anyone else aside from Rails? |
Another thing: we can cover the Rails use case much more simply: class SharedLock < Concurrent::ReentrantReadWriteLock
def with_write_lock_loose(&block)
if (@HeldCount.value & READ_LOCK_MASK) > 0
# we are already holding a read lock
release_read_lock
with_write_lock(&block)
acquire_read_lock
else
with_write_lock(&block)
end
end
end That doesn't cover the "purpose" thing, does it? But I think even that can be covered by adding a couple more custom methods which delegate to the existing methods on |
Yeah that sounds more practical. I cannot think of another example. I mainly wanted to put the idea somewhere if anybody is interested looking into it. |
See PR #372 for the background on this one.
I am opening this PR to allow for code review. It is NOT ready to merge yet.