incorrect status "start" when the process is "up" #175

Open
drhenner opened this issue May 22, 2014 · 5 comments

Comments

@drhenner
Contributor

Scenario

  • The delayed_job processes have been killed.
  • I run god -c /etc/god.conf -D (normally without -D; it is here only to show the output).
  • I get the following:
I [2014-05-22 00:05:19]  INFO: delayed_job.0 start: RAILS_ENV=staging4  bundle exec ./script/delayed_job -n 2 start 
I [2014-05-22 00:05:45]  INFO: delayed_job.0 moved 'init' to 'start'
I [2014-05-22 00:05:45]  INFO: delayed_job.0 [ok] process is not running (ProcessRunning)
I [2014-05-22 00:05:45]  INFO: delayed_job.0 [ok] tries within bounds [1/5] (Tries)
I [2014-05-22 00:05:53]  INFO: delayed_job.0 [ok] process is not running (ProcessRunning)
I [2014-05-22 00:05:53]  INFO: delayed_job.0 [ok] tries within bounds [2/5] (Tries)
I [2014-05-22 00:06:01]  INFO: delayed_job.0 [ok] process is not running (ProcessRunning)
I [2014-05-22 00:06:01]  INFO: delayed_job.0 [ok] tries within bounds [3/5] (Tries)
I [2014-05-22 00:06:09]  INFO: delayed_job.0 [ok] process is not running (ProcessRunning)
I [2014-05-22 00:06:09]  INFO: delayed_job.0 [ok] tries within bounds [4/5] (Tries)
I [2014-05-22 00:06:17]  INFO: delayed_job.0 [ok] process is not running (ProcessRunning)
I [2014-05-22 00:06:17]  INFO: delayed_job.0 [trigger] tries exceeded [5/5] (Tries)
I [2014-05-22 00:06:17]  INFO: delayed_job.0 move 'start' to 'start'
I [2014-05-22 00:06:17]  INFO: delayed_job.0 before_start: no pid file to delete (CleanPidFile)
I [2014-05-22 00:06:17]  INFO: delayed_job.0 start: RAILS_ENV=staging4  bundle exec ./script/delayed_job -n 2 start

When I run god status, the process appears to be stuck in the start state:

~/apps/main_app$ god status
delayed_job:
  delayed_job.0: start

But the PIDs are there and the processes are running. Why doesn't god think this is 'up'?

Here is my god.conf

RAILS_ROOT = "/home/xyz/apps/main_app/current"
RUBY_BIN   = '/home/xyz/.rvm/rubies/ruby-2.0.0-p247/bin/ruby'
# /home/backops/apps/main_app/current/god/staging4/delayed_job.god

1.times do |num|
  God.watch do |w|
    w.name = "delayed_job.#{num}"
    w.group = 'delayed_job'
    w.interval = 300.seconds
    w.pid_file = File.join(RAILS_ROOT, "tmp/pids/delayed_job#{num}.pid")
    w.dir      = RAILS_ROOT
    w.start    = "RAILS_ENV=staging4  bundle exec ./script/delayed_job -n 2 start "

    ##  NOTE: do not specify uid or gid when not a root user
    #   https://github.com/mojombo/god/issues/43#issuecomment-1225470
    # w.uid = 'xyz'
    # w.gid = 'xyz'

    # clean pid files before start if necessary
    w.behavior(:clean_pid_file)

    # restart if memory gets too high
    w.transition(:up, :restart) do |on|
      on.condition(:memory_usage) do |c|
        c.above = 1500.megabytes
        c.times = 2
      end
    end

    # determine the state on startup
    w.transition(:init, { true => :up, false => :start }) do |on|
      on.condition(:process_running) do |c|
        c.running = true
      end
    end

    # determine when process has finished starting
    w.transition([:start, :restart], :up) do |on|
      on.condition(:process_running) do |c|
        c.running = true
        c.interval = 5.seconds
      end

      # failsafe
      on.condition(:tries) do |c|
        c.times = 5
        c.transition = :start
        c.interval = 5.seconds
      end
    end

    # start if process is not running
    w.transition(:up, :start) do |on|
      on.condition(:process_running) do |c|
        c.running = false
      end
    end
  end
end
@eric
Collaborator

eric commented May 22, 2014

When this happens could you check if File.join(RAILS_ROOT, "tmp/pids/delayed_job#{num}.pid") exists?

It sounds like it doesn't.
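
One way to run that check by hand is sketched below. `pid_file_status` is a hypothetical helper, not part of god; it reports whether the file exists and whether the PID recorded in it belongs to a live process.

```ruby
# Hypothetical helper (not part of god): inspect a pid file by hand.
def pid_file_status(path)
  return { exists: false, pid: nil, alive: false } unless File.file?(path)

  pid = File.read(path).strip.to_i
  # An empty or non-numeric file yields 0, which is not a usable PID.
  return { exists: true, pid: nil, alive: false } unless pid > 0

  alive = begin
    Process.kill(0, pid) # signal 0 only checks that the process exists
    true
  rescue Errno::ESRCH
    false                # no such process
  rescue Errno::EPERM
    true                 # process exists but is owned by another user
  end

  { exists: true, pid: pid, alive: alive }
end

# Usage, with the path from the config above and num = 0:
#   p pid_file_status("/home/xyz/apps/main_app/current/tmp/pids/delayed_job0.pid")
```

If `exists` comes back false while the workers are running, the path given to w.pid_file probably does not match the file the start command writes.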

@drhenner
Contributor Author

After some debugging, I found that the PID files did exist, but I think god was looking for them in the wrong place. I removed w.pid_file = File.join(RAILS_ROOT, "tmp/pids/delayed_job#{num}.pid") and everything started working. I think there is still an issue, but I found a way around it.

@drhenner
Contributor Author

BTW: after putting some print statements in the gem, I found that right before the check for !active?, pid_file was nil (or an empty string).

@eric
Collaborator

eric commented May 22, 2014

Whether or not to specify w.pid_file depends on whether god should be responsible for daemonizing the process, or whether the w.start command is going to daemonize (and write the pid file) itself.

If you specify w.pid_file god will expect the w.start command to create it.
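
The two setups can be sketched as alternative watches. This is an illustration under assumptions, not a tested config: the watch names are made up, and the pid file name and the foreground `run` subcommand are assumptions about delayed_job that should be checked against the installed version.

```ruby
# Sketch only: two alternative watches, not meant to be loaded together.
RAILS_ROOT = "/home/xyz/apps/main_app/current"

# Option A: the start command daemonizes itself and writes its own pid file.
# w.pid_file must then point at the exact file that command creates.
God.watch do |w|
  w.name     = "delayed_job.self_daemonized" # hypothetical name
  w.dir      = RAILS_ROOT
  w.start    = "RAILS_ENV=staging4 bundle exec ./script/delayed_job start"
  w.stop     = "RAILS_ENV=staging4 bundle exec ./script/delayed_job stop"
  # Assumption: a single worker started this way writes tmp/pids/delayed_job.pid.
  w.pid_file = File.join(RAILS_ROOT, "tmp/pids/delayed_job.pid")
  w.behavior(:clean_pid_file)
  w.keepalive
end

# Option B: the start command stays in the foreground and no w.pid_file is
# set, so god daemonizes the process and tracks its PID on its own.
God.watch do |w|
  w.name  = "delayed_job.god_daemonized" # hypothetical name
  w.dir   = RAILS_ROOT
  # Assumption: "run" keeps the worker in the foreground.
  w.start = "RAILS_ENV=staging4 bundle exec ./script/delayed_job run"
  w.keepalive
end
```

Mixing the two (setting w.pid_file to a path the start command never writes) would leave god waiting for a file that does not appear, which matches the repeated "process is not running" lines in the log above.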

@drhenner
Contributor Author

I could be wrong, but I think this was my use case:

With w.pid_file and w.start specified and no PIDs running, I call:

god -c /etc/god.conf
  • Then the PID files are in the location specified in w.pid_file.
  • w.start was the command that created the PID files (is that true?).
  • pid_file was nil (or an empty string), judging from the debugging I did.

If you think I messed up someplace else, feel free to close. Otherwise, I hope this helps.

Either way, all is working now that I have removed w.pid_file. The pid files are someplace else, but I'm OK with that.
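
For anyone hitting the same symptom, a quick way to test the "wrong place" theory is to compare the path given to w.pid_file with the pid files actually on disk. `compare_pid_files` below is a hypothetical helper, not part of god. One possibility this thread does not confirm: with -n 2, delayed_job may name its pid files per worker (for example delayed_job.0.pid), which would not match the configured delayed_job0.pid.

```ruby
# Hypothetical helper: compare the configured pid file with what is on disk.
def compare_pid_files(configured_path)
  dir = File.dirname(configured_path)
  {
    configured: configured_path,
    configured_exists: File.file?(configured_path),
    # Every *.pid file that actually lives next to the configured path.
    found: Dir.glob(File.join(dir, "*.pid")).sort
  }
end

# Usage, with the path from the config above and num = 0:
#   p compare_pid_files("/home/xyz/apps/main_app/current/tmp/pids/delayed_job0.pid")
```

If `configured_exists` is false while `found` lists other files, the watch is pointed at a name the start command never writes.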
