Fix god status and report process uptime: #55

Closed
wants to merge 10 commits into from

5 participants

@icy
icy commented Jul 15, 2011

Hey.

Sorry for the noise. I've made some changes so that "god status" always reports the real status of the process. I created the previous pull request #54, then closed it because I wanted to add some more commits to the pull.

Thank you for your consideration.

Logs:

* If a process is not alive, "god status" may still report "up".
  (This is because the :state is not updated at the time the
  process is killed. "god status" should rely on the real status of
  the running process; otherwise it will mislead anyone reading it.)
* "god status" now reports: status, pid, uptime (if the process is alive)
* Support a new metric "uptime" (process uptime, not system uptime),
  which may be used to define a new condition. For example, if some
  processes have been running for more than 10 days, they should be
  restarted. (I am not sure how useful this would be. At least the
  metric lets us report process uptime when typing "god status".)
  This feature needs some tests (TODO at the moment).
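
The first point, asking the operating system whether the process exists instead of trusting the recorded :state, can be sketched with signal 0. This is only an illustration using Ruby's standard library, not the code from this pull request; `pid_alive?` is a hypothetical helper name.

```ruby
# Hypothetical helper (not part of god): true if a process with this PID exists.
def pid_alive?(pid)
  Process.kill(0, pid)   # signal 0 only checks for existence; nothing is delivered
  true
rescue Errno::ESRCH      # no such process
  false
rescue Errno::EPERM      # the process exists but belongs to another user
  true
end

puts pid_alive?(Process.pid)   # the current process certainly exists
```

Note that a child that has exited but has not been reaped yet (a zombie) still counts as existing for this check.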

Sample output:

$ god status
foobar: up, pid 15906, uptime 28:17
foobar2: up, non alive
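
The uptime column above looks like minutes and seconds. A minimal sketch of such a formatter follows, assuming the layout resembles the `[[dd-]hh:]mm:ss` elapsed-time format printed by `ps`; `format_uptime` is a hypothetical name and the patch may compute this differently.

```ruby
# Hypothetical formatter (an assumption, not taken from the patch):
# turns a number of seconds into [[dd-]hh:]mm:ss.
def format_uptime(seconds)
  days, rest    = seconds.to_i.divmod(86_400)
  hours, rest   = rest.divmod(3_600)
  minutes, secs = rest.divmod(60)

  if days > 0
    format("%d-%02d:%02d:%02d", days, hours, minutes, secs)
  elsif hours > 0
    format("%02d:%02d:%02d", hours, minutes, secs)
  else
    format("%02d:%02d", minutes, secs)
  end
end

puts format_uptime(28 * 60 + 17)   # 28 minutes and 17 seconds
```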
icy added some commits Jul 12, 2011
@icy icy (god status) Print the pid of processes if they are up. Fix typo 4125288
@icy icy Clean up my debug code e4a51ab
@icy icy Clean up my debug code a794b48
@icy icy Fix god status and report process uptime 609e014
@icy icy Make the output of "god status" fancier b307e66
@icy icy Fix typo f9ddf07
@icy icy Fix error when getting exitcode f6f3f65
@icy icy Fix typo 878188b
@icy
icy commented Jul 15, 2011

Another output example

$ god status
    chef-expander: up, pid 20336, uptime 28:26
      chef-server: up, pid 20333, uptime 28:26
chef-server-webui: up, pid 20339, uptime 28:26
        chef-solr: up, pid 20331, uptime 28:26
           jarmon: up, pid 20459, uptime 27:26
@JDutil
JDutil commented Jul 19, 2011

+1 for nicer status output. I just started using this library so I'm not familiar with the incorrect "up" output, but I'd definitely want to see that fixed if that's the case.

@JDutil
JDutil commented Jul 19, 2011

After playing around some more, intentionally stopping services and checking the status, I see incorrect reports of the services being "up" too...

@icy
icy commented Jul 19, 2011

As far as I know, ":up" is the transition state that God has recorded. For example, with the following settings, God checks the status of the process every 5 minutes. If you start the process (through God) at 10:00 AM and then use kill to stop it, God still reports the process as ":up" until 10:05 AM. (My patch avoids this problem: "god status" should always report the real status of the process.)

  w.transition(:up, :start) do |on|
    on.condition(:process_running) do |c|
      c.interval = 5.minutes
      c.running = false
    end
  end

10:00 AM: god start foobar
10:01 AM: kill the process foobar (manually)
10:02 AM: god status still reports "foobar is up"
10:05 AM: god detects that the process has died and tries to start it.
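
The timeline can be reproduced with a toy model in which the recorded state changes only when a check is due, while the real liveness changes immediately. This is a sketch to illustrate the stale window; `PolledWatch` and its methods are hypothetical and do not use god's real classes.

```ruby
# Hypothetical model of a polled watch; not god's implementation.
class PolledWatch
  attr_reader :state

  def initialize(interval_minutes)
    @interval   = interval_minutes
    @state      = :up     # what a status command reading the recorded state prints
    @alive      = true    # what the operating system would say
    @last_check = 0
  end

  def kill!
    @alive = false
  end

  def alive?
    @alive
  end

  # Run the condition only when a full interval has passed since the last check.
  def tick(minute)
    return if minute - @last_check < @interval
    @last_check = minute
    @state = :start unless @alive
  end
end

watch = PolledWatch.new(5)   # minute 0 is 10:00 AM
watch.kill!                  # 10:01 AM: killed manually
watch.tick(2)                # 10:02 AM: no check is due yet
stale = [watch.state, watch.alive?]
watch.tick(5)                # 10:05 AM: the check runs and notices the death
fresh = [watch.state, watch.alive?]
```

Between the kill and the next check, `stale` holds `:up` together with `false`, which is exactly the mismatch that a live check in "god status" would remove.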

PS: Sorry for my bad English

@icy
icy commented Jul 19, 2011

So to avoid the problem, you may reduce "c.interval" in my example. For example, use "c.interval = 10.seconds" (which isn't a good setting in some cases).

@icy
icy commented Jul 25, 2011

Another case when "god status" reports the wrong status: the process is a daemon and the user has the wrong settings (by mistake, they don't provide the PID file).

@dukejones

I love the idea of richer status output.

@xanview
xanview commented Oct 17, 2012

Any updates on this? - this is a great idea :)

@zhaocai
zhaocai commented Oct 17, 2012

+1

@icy icy closed this Nov 6, 2012