
Add ZK::Locker.cleanup (closes #33)

1 parent ae40686 commit 84a1ac04343a6a4c02dd4b41218020c5f867d776 @slyphon slyphon committed May 25, 2012
Showing with 130 additions and 36 deletions.
  1. +6 −0 Guardfile
  2. +28 −14 README.markdown
  3. +21 −0 RELEASES.markdown
  4. +47 −16 lib/zk/locker.rb
  5. +0 −5 lib/zk/locker/locker_base.rb
  6. +5 −1 spec/shared/client_contexts.rb
  7. +23 −0 spec/zk/locker_spec.rb
@@ -13,6 +13,9 @@ end
guard 'rspec', :version => 2 do
watch(%r{^spec/.+_spec\.rb$})
+ # run all specs when the support files change
+ watch(%r{^spec/support}) { 'spec' }
+
watch(%r%^spec/support/client_forker.rb$%) { 'spec/zk/00_forked_client_integration_spec.rb' }
watch(%r{^lib/(.+)\.rb$}) do |m|
@@ -23,6 +26,9 @@ guard 'rspec', :version => 2 do
when 'zk/client/threaded'
["spec/zk/client_spec.rb", "spec/zk/zookeeper_spec.rb"]
+ when 'zk/locker'
+ 'spec/zk/locker_spec.rb'
+
when %r{^(?:zk/locker/locker_base|spec/shared/locker)}
Dir["spec/zk/locker/*_spec.rb"]
@@ -65,6 +65,34 @@ In addition to all of that, I would like to think that the public API the ZK::Cl
[zk-eventmachine]: https://github.com/slyphon/zk-eventmachine
## NEWS ##
+### v1.6.0 ###
+
+* Locker cleanup code!
+
+When a session is lost, it is likely that the locker's znode was left behind: for `zk.locker('foo')`, if the session is interrupted, the `/_zklocking/foo` znode will probably remain. A method has been added that lets you safely clean up these stale znodes:
+
+```ruby
+ZK.open('localhost:2181') do |zk|
+ ZK::Locker.cleanup(zk)
+end
+```
+
+`ZK::Locker.cleanup` will go through your locker nodes one by one and try to lock and unlock each of them. If the lock is acquired, it is naturally cleaned up as part of the normal teardown code; if the lock cannot be acquired, no harm is done, since that lock is still in use.
+
+### v1.5.3 ###
+
+* Fixed reconnect code. There was an occasional race/deadlock caused by the reopen call being made on the underlying connection's dispatch thread. Since closing the dispatch thread is part of reopen, this would deadlock in real-world use. The reconnect logic has been moved to a separate, single-purpose thread on ZK::Client::Threaded that watches for connection state changes.
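The shape of that fix can be sketched in plain Ruby. This is a hypothetical illustration, not ZK's actual implementation (the class and method names here are invented): a dedicated thread consumes connection-state events from a queue, so reopen never runs on the dispatch thread that reopen itself must shut down.

```ruby
require 'thread'

# Hypothetical sketch of a single-purpose reconnect thread. State-change
# notifications are only enqueued by the dispatch thread; the reopen work
# happens on this thread, avoiding the self-join deadlock.
class ReconnectWatcher
  def initialize(&reopen)
    @reopen = reopen
    @queue  = Queue.new
    @thread = Thread.new { run }
  end

  # called from the dispatch thread; just enqueues, never reopens inline
  def connection_state_changed(state)
    @queue << state
  end

  def shutdown
    @queue << :shutdown
    @thread.join
  end

  private

  def run
    while (state = @queue.pop) != :shutdown
      @reopen.call if state == :expired_session
    end
  end
end
```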
+
+* 'private' is not 'protected'. I've been writing Ruby for several years now, and apparently I'd forgotten that 'protected' does not work the way it does in Java. The visibility of these methods has been corrected, and all specs pass, so I don't expect issues, but please report if this change causes any bugs in user code.
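For readers coming from Java, a minimal standalone example (the class and methods are invented purely for illustration) of what Ruby's `protected` actually permits:

```ruby
# In Ruby, a protected method may be called with an explicit receiver,
# but only from within an instance of the same class (or subclass).
# A private method cannot be called on another object at all.
class Account
  def initialize(balance)
    @balance = balance
  end

  def richer_than?(other)
    balance > other.balance  # explicit receiver: legal because balance is protected
  end

  protected

  attr_reader :balance
end

a = Account.new(100)
b = Account.new(50)
a.richer_than?(b)  # => true
# a.balance        # => NoMethodError: protected method called from outside
```

Had `balance` been declared `private` instead, the `other.balance` call inside `richer_than?` would raise NoMethodError, which is the opposite of Java, where private members of another instance of the same class are accessible.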
+
+### v1.5.2 ###
+
+* Fix locker cleanup code to avoid a nasty race when a session is lost, see [issue #34](https://github.com/slyphon/zk/issues/34)
+
+* Fix potential deadlock in ForkHook code so the mutex is unlocked in the case of an exception
+
+* Do not hang forever when shutting down and the shutdown thread does not exit (wait 30 seconds).
+
### v1.5.1 ###
* Added a `:retry_duration` option to the Threaded client constructor which allows the user to specify, in the case of a connection loss, how long an operation should wait for the connection to be re-established before retrying. This can be set at a global level and overridden on a per-call basis. The default is to not retry (which may change at a later date). Generally speaking, a timeout of > 30s is probably excessive, and care should be taken because during a connection loss the server-side state may change without you being aware of it (i.e. events will not be delivered).
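The retry-until-deadline contract described above can be sketched independently of ZK. Everything here is hypothetical (ZK raises its own exception classes and the real implementation differs); the sketch only illustrates retrying a transient failure until a total duration elapses:

```ruby
# Hypothetical stand-in for a connection-loss exception; not a ZK class.
class ConnectionLossError < StandardError; end

# Retry the block on ConnectionLossError until `duration` seconds have
# passed since the first attempt, then re-raise the error to the caller.
def with_retry_duration(duration, interval: 0.01)
  deadline = Time.now + duration
  begin
    yield
  rescue ConnectionLossError
    raise if Time.now >= deadline  # retry window exhausted: give up
    sleep interval
    retry
  end
end

attempts = 0
result = with_retry_duration(1.0) do
  attempts += 1
  raise ConnectionLossError if attempts < 3  # transient failure, then success
  :ok
end
```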
@@ -118,20 +146,6 @@ zk.delete('/some/path', :ignore => :no_node)
* MASSIVE fork/parent/child test around event delivery and much greater stability expected for linux (with the zookeeper-1.0.3 gem). Again, please see the documentation on the wiki about [proper fork procedure](http://github.com/slyphon/zk/wiki/Forking).
-### v1.3.1 ###
-
-* [fix a bug][bug 1.3.1] where a forked client would not have its 'outstanding watches' cleared, so some events would never be delivered
-
-[bug 1.3.1]: https://github.com/slyphon/zk/compare/release/1.3.0...9f68cee958fdaad8d32b6d042bf0a2c9ab5ec9b0
-
-### v1.3.0 ###
-
-Phusion Passenger and Unicorn users are encouraged to upgrade!
-
-* __fork()__: ZK should now work reliably after a fork() if you call `reopen()` ASAP in the child process (before continuing any ZK work). Additionally, your event-handler (blocks set up with `zk.register`) will still work in the child. You will have to make calls like `zk.stat(path, :watch => true)` to tell ZooKeeper to notify you of events (as the child will have a new session), but everything should work.
-
-* See the fork-handling documentation [on the wiki](http://github.com/slyphon/zk/wiki/Forking).
-
## Caveats
@@ -1,5 +1,26 @@
This file notes feature differences and bugfixes contained between releases.
+### v1.6.0 ###
+
+* Locker cleanup code!
+
+When a session is lost, it is likely that the locker's znode was left behind: for `zk.locker('foo')`, if the session is interrupted, the `/_zklocking/foo` znode will probably remain. A method has been added that lets you safely clean up these stale znodes:
+
+```ruby
+ZK.open('localhost:2181') do |zk|
+ ZK::Locker.cleanup(zk)
+end
+```
+
+`ZK::Locker.cleanup` will go through your locker nodes one by one and try to lock and unlock each of them. If the lock is acquired, it is naturally cleaned up as part of the normal teardown code; if the lock cannot be acquired, no harm is done, since that lock is still in use.
+
+### v1.5.3 ###
+
+* Fixed reconnect code. There was an occasional race/deadlock caused by the reopen call being made on the underlying connection's dispatch thread. Since closing the dispatch thread is part of reopen, this would deadlock in real-world use. The reconnect logic has been moved to a separate, single-purpose thread on ZK::Client::Threaded that watches for connection state changes.
+
+* 'private' is not 'protected'. I've been writing Ruby for several years now, and apparently I'd forgotten that 'protected' does not work the way it does in Java. The visibility of these methods has been corrected, and all specs pass, so I don't expect issues, but please report if this change causes any bugs in user code.
+
+
### v1.5.2 ###
* Fix locker cleanup code to avoid a nasty race when a session is lost, see [issue #34](https://github.com/slyphon/zk/issues/34)
@@ -99,24 +99,55 @@ class << self
# the default root path we will use when a value is not given to a
# constructor
attr_accessor :default_root_lock_node
- end
- # Create a {SharedLocker} instance
- #
- # @param client (see LockerBase#initialize)
- # @param name (see LockerBase#initialize)
- # @return [SharedLocker]
- def self.shared_locker(client, name, *args)
- SharedLocker.new(client, name, *args)
- end
+ # Create a {SharedLocker} instance
+ #
+ # @param client (see LockerBase#initialize)
+ # @param name (see LockerBase#initialize)
+ # @return [SharedLocker]
+ def shared_locker(client, name, *args)
+ SharedLocker.new(client, name, *args)
+ end
+
+ # Create an {ExclusiveLocker} instance
+ #
+ # @param client (see LockerBase#initialize)
+ # @param name (see LockerBase#initialize)
+ # @return [ExclusiveLocker]
+ def exclusive_locker(client, name, *args)
+ ExclusiveLocker.new(client, name, *args)
+ end
- # Create an {ExclusiveLocker} instance
- #
- # @param client (see LockerBase#initialize)
- # @param name (see LockerBase#initialize)
- # @return [ExclusiveLocker]
- def self.exclusive_locker(client, name, *args)
- ExclusiveLocker.new(client, name, *args)
+ # Clean up dead locker directories. There are situations (particularly
+ # session expiration) where a lock's directory will never be cleaned up.
+ #
+  # It is intended to be run periodically (perhaps from cron).
+  #
+ # This implementation goes through each lock directory and attempts to
+ # acquire an exclusive lock. If the lock is acquired then when it unlocks
+ # it will remove the locker directory. This is safe because the unlock
+ # code is designed to deal with the inherent race conditions.
+ #
+ # @example
+ #
+ # ZK.open do |zk|
+  #     ZK::Locker.cleanup(zk)
+ # end
+ #
+ # @param client [ZK::Client::Threaded] the client connection to use
+ #
+ # @param root_lock_node [String] if given, use an alternate root lock node to base
+ # each Locker's path on. You probably don't need to touch this. Uses
+ # {Locker.default_root_lock_node} by default (if value is nil)
+ #
+ def cleanup(client, root_lock_node=default_root_lock_node)
+ client.children(root_lock_node).each do |name|
+ exclusive_locker(client, name, root_lock_node).tap do |locker|
+ locker.unlock if locker.lock
+ end
+ end
+ end
end
# @private
@@ -239,11 +239,6 @@ def digit_from(path)
self.class.digit_from_lock_path(path)
end
- # possibly lighter weight check to see if the lock path has any children
- # (using stat, rather than getting the list of children).
- def any_lock_children?
- end
-
def lock_children(watch=false)
zk.children(root_lock_path, :watch => watch)
end
@@ -9,14 +9,16 @@
before do
# logger.debug { "threaded client connection - begin before hook" }
-
@connection_string = connection_host
@base_path = '/zktests'
@zk = ZK::Client::Threaded.new(*connection_args).tap { |z| wait_until { z.connected? } }
@threadpool_exception = nil
@zk.on_exception { |e| @threadpool_exception = e }
@zk.rm_rf(@base_path)
+ @orig_default_root_lock_node = ZK::Locker.default_root_lock_node
+ ZK::Locker.default_root_lock_node = "#{@base_path}/_zklocking"
+
# logger.debug { "threaded client connection - end before hook" }
end
@@ -30,6 +32,8 @@
z.rm_rf(@base_path)
end
+ ZK::Locker.default_root_lock_node = @orig_default_root_lock_node
+
# logger.debug { "threaded client connection - end after hook" }
end
end
@@ -0,0 +1,23 @@
+require 'spec_helper'
+
+describe ZK::Locker do
+ include_context 'threaded client connection'
+
+ describe :cleanup do
+ it %[should remove dead lock directories] do
+ locker = @zk.locker('legit')
+ locker.lock
+ locker.assert!
+
+ bogus_lock_dir_names = %w[a b c d e f g]
+ bogus_lock_dir_names.each { |n| @zk.create("#{ZK::Locker.default_root_lock_node}/#{n}") }
+
+ ZK::Locker.cleanup(@zk)
+
+ lambda { locker.assert! }.should_not raise_error
+
+ bogus_lock_dir_names.each { |n| @zk.stat("#{ZK::Locker.default_root_lock_node}/#{n}").should_not exist }
+
+ end
+ end
+end
