Skip to content

HTTPS clone URL

Subversion checkout URL

You can clone with HTTPS or Subversion.

Download ZIP
System Health Monitor for Linux systems
branch: master

Fetching latest commit…

Cannot retrieve the latest commit at this time

Failed to load latest commit information.
COPYING
COPYRIGHT
README
system_health.py

README

System Health Monitor
Copyright 2004-2008 by Brian C. Lane <bcl@brianlane.com>
All Rights Reserved
Licensed under GPL v2.0

This python program monitors your system's network interfaces, memory, load,
running processes and disk space (and inode usage). It creates Round Robin
Databases of the information and generates graphs every 5 minutes. It
includes a user friendly interactive setup mode, and it generates HTML files
suitable for inclusion in a webpage or local viewing.


UPDATE - If you are upgrading from v0.5.1 to 0.6.0 you will need to delete
your meminfo.rrd file and create a new one with systemhealth.py --check,
this is due to the kernel folks adding a new line to /proc/meminfo in
v2.6.10 kernel.

UPDATE  v1.0
Added --new switch which will prompt for new processes to add to the system. This compares the processes being monitored with the current output of ps

v0.9
Changed the spacing on the COMMENT strings to work better with the larger font
used by more recent versions of rrdtool. Added a try/except check for external
config options since old versions won't have that.

v0.8
Changed ctime() call to ctime().replace(":","\\:") to escape the : character
for the rrdtool COMMENT sections.

v0.7
You can now add the output of an external program to System Health. Use the
--add argument and follow the prompts. You can pass arguments to the 
program and specify the title, rrd filename and type of value.

The output of the program must be a single number on a single line. Both
floating point and integer are accepted. This value will be inserted directly
into the RRD database for graphing.

The graph type of gauge is used for numbers that represent a level, like a
temperature or a system load. Counter type is for number that are always
incrementing, like a packet count.

When you are finished adding new programs you need to run systemhealth.py
with the --html argument to regenerate the html pages. If you have
hand-edited any of the pages you should back them up first and then re-edit.



REQUIREMENTS

This program requires a few things to work correctly.

1. Round Robin Database from Tobi Oetiker
   http://people.ee.ethz.ch/~oetiker/webtools/rrdtool/

2. Python
   http://www.python.org

3. A web server or browser for viewing the summary pages


Installation and Setup

system_health.py should be run as a normal user. I suggest creating a new
user named health (or tommy or bob -- the name doesn't matter) and running
system_health.py from that user's crontab.

As root do the following:

# useradd health
# chmod ugo+x /home/health
# cp -p system_health.py /home/health/bin/
# chown -R health.health  /home/health/bin
# su - health

This creates a new user, named health in this case, and a bin directory for
system_health.py to run from.

Switch to the new user and run setup, which will create the health_rrd and
health_html directories, scan the system for network interfaces, drives and
running processes and prompt for which ones to monitor.

$ ./bin/system_health.py --setup

follow the prompts to select what network interfaces, drive partitions, and 
processes to monitor. Directories will be prompted for and created if they 
do not already exist (with a prompt before and a check for writeability).

Hit return to accept the defaults. All networks and drive partitions have a 
default of 'Y' and processes have a default of 'N'

Watch out for drives, some of the devices that are mounted aren't actual
drives and should be skipped.

Look in the ~/health_rrd and ~/health_html directories to makes sure the rrd
files and html files were created correctly.

Setup the crontab for the new user:

$ crontab -e

and add the following line:

*/5 * * * *     /home/health/bin/system_health.py --log --graph 2> /dev/null >/dev/null

(that should all be on one line)

After it is running you can add other processes, drives and interfaces to
monitor by editing the ~/.syshealthrc file. For example to add monitoring of
the sshd daemon you would add the following line to the [processes] section:

sshd = sshd

And then regenerate the html and create the new rrd file by running this as
the health user:

./system_health --check --html

Or you can use the --new switch added in v1.0 to prompt you for new processes to add to the system.

One thing to watch out for is that running --html will rewrite all the HTML
files, so if you have made any customizations to them they will be lost.


If you have any questions, comments, suggestions, etc. contact me at
bcl@brianlane.com (if you want to make sure your email makes it through my
carrier-class spam filtering you should add the word slartibartfast to the
subject line).

v0.5.1 Was the initial release

v0.6 Changes the license to GPL v2.0 and fixes a problem with newer
     kernels (v2.6.10) that include a new entry in /proc/meminfo
     Fixed command line processing so that --log and --graph can be used
     at the same time.

v0.7 Added external program graphs. See the UPDATE at the top of this file


The License

I have changed (as of release v0.6) licenses to GPL v2.0, although
donations are still greatly appreciated and help to motivate me to improve
and maintain my software. Donations can be made via paypal payment sent to
bcl@brianlane.com

Thanks for using my software,

Brian C. Lane
January 20, 2008
Something went wrong with that request. Please try again.