Skip to content

HTTPS clone URL

Subversion checkout URL

You can clone with HTTPS or Subversion.

Download ZIP

Loading…

[fix #129] Fix register/de-register bug when process is restarted out-of-band. #142

Open
wants to merge 1 commit into from

3 participants

@apinstein

Fixes a bug when a process is restarted out-of-band wherby the
original pid was never unwatched and thus would trigger
false-positives by god on noticing process exits.

I am not 100% sure that this doesn't cause problems elsewhere. Not exactly sure how to test and don't know enough about the overall architecture to know what I don't know about how all this works across different process mechanisms.

In any case I'm going to use this on my prod box for a bit and see how it goes.

Would love to have additional feedback.

@eric
Collaborator

Nice! I agree the current logic isn't right.

I think we could change the definition of pid() to something like this:

      def pid
        @pid ||= self.pid_file ? File.read(self.pid_file).strip.to_i : self.watch.pid
      end

and leave the rest of the logic as-is and it would do what we intended.

@apinstein
@apinstein

@eric your idea only half worked; the pid needed to be cleared out on de-register so that it would re-bootstrap itself based on the watch/pid file. Otherwise it would continue monitoring the prior pid.

What do you think?

@chetan

I've been running into this issue for a while (and #127 as well) but came up with a slightly different fix which you can see here:

https://github.com/chetan/god/compare/dangling_process_exit

In addition, I changed my config to disable the clean_pid_file behavior and changed the process_exit condition to transition to :init instead of :start, so that it has a chance to pickup the new pid instead of just blindly trying to start and throw errors.

edit: I'm not 100% sure that the Monitor synchronization is necessary, but it does seem there are possible race conditions there otherwise.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Commits on Sep 7, 2013
  1. @apinstein
This page is out of date. Refresh to see the latest.
Showing with 4 additions and 1 deletion.
  1. +4 −1 lib/god/conditions/process_exits.rb
View
5 lib/god/conditions/process_exits.rb
@@ -28,7 +28,8 @@ def valid?
end
def pid
- self.pid_file ? File.read(self.pid_file).strip.to_i : self.watch.pid
+ # only read pid once since lazy evaluation can cause bugs due to changing external state
+ @pid ||= self.pid_file ? File.read(self.pid_file).strip.to_i : self.watch.pid
end
def register
@@ -50,6 +51,8 @@ def register
def deregister
pid = self.pid
+ @pid = nil # reset so that @pid will bootstrap itself again as needed
+
if pid
EventHandler.deregister(pid, :proc_exit)
Something went wrong with that request. Please try again.