Monitoring a specific process #7660

tinyhammers · 2020-01-02T14:24:30Z

Happy New Year! Hope you all had a nice break away from those pesky computers.
But, new year, new Netdata question.
Caveat: I opened my laptop this morning and couldn't remember what my job is, let alone how to do it, so I could be missing the obvious here.

I need to monitor a specific process (in this case, Icecast). I've added "icecast: *icecast*" to apps_groups.conf and I've got it showing as a dimension in the apps.processes and apps.threads charts. This is awesome.
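
For readers following along, the grouping step described above might look like the fragment below. This is a sketch rather than a full file: the comments reflect my understanding of the apps_groups.conf pattern syntax, not anything stated in this thread.

```
# apps_groups.conf: one group per line, in the form "group: pattern pattern ..."
# A pattern wrapped in asterisks is a substring match, so this group should
# collect every process whose command line contains "icecast".
icecast: *icecast*
```

Once the plugin picks the change up, the group appears as the icecast dimension mentioned above.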

What I need to do now is raise an alarm if the number of processes drops below 1.
I've stolen the template from #4614 and have sort of got an idea, but I can't figure out how to make it do what I want.

Currently I have this:

      on: apps.processes
      os: linux
   hosts: *
families: *
  lookup: min -1s unaligned of icecast
   units: processes
   every: 10s
    crit: $this == 0
   delay: up 10s down 1m multiplier 2 max 10m
    info: Icecast has died
      to: sysadmin

As you can see, it's pretty much copypasta from #4614. I removed the warning line as I don't want a warning, I just need to know if Icecast is running or not.
I assumed that crit: $this == 0 would then shout if there were no processes running.

It does not.

Was crossing my fingers that the lookup line would work as it did in issue 4614, but maybe the problem is there?
Totally confused, and still full of cheese from Xmas to be honest.


thiagoftsm · 2020-01-02T21:32:36Z

Hi @tinyhammers,

The line families: * is applied only when you are defining a template, but here you are configuring an alarm.
Do you have any information about this alarm in your error.log?
There is also an example in our documentation: https://docs.netdata.cloud/health/reference/#example-1.
You can also find a complete example in this thread: #873 (comment).

Best regards!
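
To make the alarm/template distinction above concrete, here is a header-only sketch. The name icecast_processes_example is hypothetical, both fragments are deliberately incomplete (no calc, lookup, or crit lines), and it assumes the chart id and the context are both called apps.processes.

```
# alarm: attaches to one specific chart, identified by its chart id.
# A families: line has no effect here.
   alarm: icecast_processes_example
      on: apps.processes

# template: attaches to every chart that shares the given context.
# families: narrows down which of those charts are matched.
template: icecast_processes_example
      on: apps.processes
families: *
```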

ilyam8 · 2020-01-02T22:29:13Z

@tinyhammers

Try this:

alarm: apps_icecast_processes
   on: apps.processes
 calc: $icecast
units: processes
every: 10s
 crit: $this == nan OR $this == 0
delay: up 10s down 1m multiplier 2 max 10m
 info: icecast has died
   to: sysadmin

tinyhammers · 2020-01-03T06:39:34Z

@ilyam8 ❤️
Best new years gift ever.
That is working perfectly, thank you so much.
Could you explain what the line crit: $this == nan OR $this == 0 is doing, so I can try and get a better understanding of how it works?
Thanks again for helping. You guys are amazing 😍

ilyam8 · 2020-01-03T12:32:22Z

@tinyhammers ☺️

Actually, it should be:

alarm: apps_icecast_processes
   on: apps.processes
 calc: $icecast
units: processes
every: 10s
 crit: $this == nan
delay: up 10s down 1m multiplier 2 max 10m
 info: icecast is not up
   to: sysadmin

$this == 0 doesn't work, because if there are no processes the value is nan, not 0.

   on: apps.processes
 calc: $icecast
 crit: $this == nan

You can read it as: the chart apps.processes has no icecast dimension.
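
For reference, here is the same alarm with each line annotated. Treat it as a sketch: the file name and path are hypothetical, and the comments on every and delay reflect my reading of the health configuration reference, not anything stated in this thread.

```
# /etc/netdata/health.d/icecast.conf  (hypothetical file name and path)

# Alarm name, as shown in the dashboard and in notifications.
alarm: apps_icecast_processes
# Chart the alarm is attached to.
   on: apps.processes
# $icecast is the latest value of the icecast dimension on that chart.
 calc: $icecast
units: processes
# Re-evaluate the alarm every 10 seconds.
every: 10s
# No icecast processes means no icecast value, so $this is nan, not 0.
 crit: $this == nan
# Delay notifications: 10s when the status worsens, 1m when it recovers,
# doubling on repeated changes up to a maximum of 10m.
delay: up 10s down 1m multiplier 2 max 10m
 info: icecast is not up
# Role that receives the notification.
   to: sysadmin
```

The earlier variant, crit: $this == nan OR $this == 0, also covers the case where the dimension still exists but reports zero.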

tinyhammers added the no changelog and question labels Jan 2, 2020

ilyam8 added the area/health label Jan 2, 2020

ilyam8 closed this as completed Jan 3, 2020
