[fix #129] Fix register/de-register bug when process is restarted out-of-band. #142

Open
wants to merge 1 commit into
from

Conversation

Projects
None yet
3 participants

Fixes a bug when a process is restarted out-of-band wherby the
original pid was never unwatched and thus would trigger
false-positives by god on noticing process exits.

I am not 100% sure that this doesn't cause problems elsewhere. Not exactly sure how to test and don't know enough about the overall architecture to know what I don't know about how all this works across different process mechanisms.

In any case I'm going to use this on my prod box for a bit and see how it goes.

Would love to have additional feedback.

Collaborator

eric commented Aug 6, 2013

Nice! I agree the current logic isn't right.

I think we could change the definition of pid() to something like this:

      def pid
        @pid ||= self.pid_file ? File.read(self.pid_file).strip.to_i : self.watch.pid
      end

and leave the rest of the logic as-is and it would do what we intended.

Yeah I suppose that could work as well. I was just trying to keep the concern separate since I didn't know what else was going on.

Thanks for reviewing!

On Aug 6, 2013, at 5:15 PM, Eric Lindvall notifications@github.com wrote:

Nice! I agree the current logic isn't right.

I think we could change the definition of pid() to something like this:

  def pid
    @pid ||= self.pid_file ? File.read(self.pid_file).strip.to_i : self.watch.pid
  end

and leave the rest of the logic as-is and it would do what we intended.


Reply to this email directly or view it on GitHub.

@eric your idea only half worked; the pid needed to be cleared out on de-register so that it would re-bootstrap itself based on the watch/pid file. Otherwise it would continue monitoring the prior pid.

What do you think?

chetan commented Jan 24, 2015

I've been running into this issue for a while (and #127 as well) but came up with a slightly different fix which you can see here:

https://github.com/chetan/god/compare/dangling_process_exit

In addition, I changed my config to disable the clean_pid_file behavior and changed the process_exit condition to transition to :init instead of :start, so that it has a chance to pickup the new pid instead of just blindly trying to start and throw errors.

edit: I'm not 100% sure that the Monitor synchronization is necessary, but it does seem there are possible race conditions there otherwise.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment