Skip to content

Bakafish/DNX_Affinity

Repository files navigation

DNX README File
---------------

DNX - Distributed Nagios eXecutor is a Nagios Event Broker (NEB) plug-in 
module that distributes checks amongst several servers ("worker nodes") to 
reduce load and check latency introduced by a large Nagios installation.

DNX exists as two parts:

   - The DNX NEB module itself (dnxServer.so) and
   - The DNX Client (dnxClient).

dnxServer.so works as any other NEB module would; it is loaded into the same 
process address space as Nagios upon start up.

dnxClient resides on a separate host and sends requests to the thread started 
by the DNX NEB module on the Nagios server ("head node") for checks to 
perform and then returns the status data back to the module for Nagios' 
interpretation and subsequent actions (alerts, notifications, etc).

Management Interface
--------------------

DNX provides a simple management interface, which allows the administrator
to perform management tasks against installed DNX clients. The tool is
called 'dnxstats' and is installed (by default) into the $(prefix)/bin
directory on the DNX server (installed using install or install-server). 

The dnxstats utility has a simple interface, which you can discover using 
the --help command-line option:

  $ /usr/local/nagios/bin/dnxstats --help
  Usage: dnxstats [options]
  Where [options] are:
    -s <host>    specify target host name (default: localhost).
    -p <port>    specify target port number (default: 12480).
    -c <cmdstr>  send <cmdstr> to server. (Hint: Try sending "HELP".)
    -v           print version and exit.
    -h           print this help and exit.

The functionality provided by dnxstats is really provided by the stats
server (the DNX client, in this case). To find out the level of
functionality supported by a given server, send the "HELP" command:

  $ /usr/local/nagios/bin/dnxstats -s 10.1.1.1 -c "HELP"
  DNX Client Management Commands:
    SHUTDOWN
    RECONFIGURE
    DEBUGTOGGLE
    RESETSTATS
    GETSTATS stat-list
      stat-list is a comma-delimited list of stat names:
        jobsok      - number of successful jobs
        jobsfailed  - number of unsuccessful jobs
        thcreated   - number of threads created
        thdestroyed - number of threads destroyed
        thexist     - number of threads currently in existence
        thactive    - number of threads currently active
        reqsent     - number of requests sent to DNX server
        jobsrcvd    - number of jobs received from DNX server
        minexectm   - minimum job execution time
        avgexectm   - average job execution time
        maxexectm   - maximum job execution time
        avgthexist  - average threads in existence
        avgthactive - average threads processing jobs
        threadtm    - total thread life time
        jobtm       - total job processing time
      Note: Stats are returned in the order they are requested.
    GETCONFIG
    GETVERSION
    HELP

Thus, the dnxstats utility is really a dumb client, sending text 
specified by the user, and dumping raw response text to the console.

Except for single-word commands, all command strings should be
enclosed in double or single quotes, so that the shell interprets
the entire command string (text following the -c option) as a single
argument.


Nagios Support
--------------

Currently, DNX has been tested with Nagios 2.7, 2.8, 2.9, 2.10, 2.11, 3.0
and 3.0.1. As we continue to release new versions of DNX, we'll continue
to enhance support for the latest Nagios versions. 


Advanced Features
-----------------

Local Execution Only Checks

You may have some check which must be run from one host only (firewall 
issues, SAN connections, proprietary libraries, etc). To accomodate this, 
you may flag certain checks to not be distributed. They will execute in
the normal fashion as if DNX was not loaded. This could also be used to
perform checks on the nagios server itself (check_load, check_nagios, etc). 
This could be problematic in that it makes it harder to run these same checks
on the worker nodes. We recommend the use of nrpe for all such checks (nagios
engine server and worker nodes), if for no other reason than for simplicity. 
The worker nodes will use nrpe to check the nagios engine server as well as 
each other.

Plug-in Propagation

One concern with DNX is that the plugins will exist on different servers and 
must all be identical. Thus, included with DNX is a simple script 
(sync_plugins.pl) which will run like a plugin that dnxServer.so would execute 
upon startup. This script provides a mechanism to ensure that all of your 
plugins exist on each of the worker nodes (this is important because you can't 
be sure which node will actually perform the checks). The script uses rsync to 
push all of the plugins in your plugin directory (/usr/local/nagios/libexec, 
for example) from the nagios engine server (the plugin authority) to each 
worker node. For this to work you must set up SSH key sharing (as the nagios 
user) between your servers for the nagios user so that this rsync will work 
without a password. Note that this could be considered a security risk by some 
people/organizations. If you do not like this mechanism, you may write your 
own sync plugin, or disable DNX's interal ability to sync plugins and do so 
by your own devices.

Also note that each worker node needs to be set up similarly, if not 
identically to other worker nodes (any external libraries, perl modules, 
paths, etc). And lastly, you should be aware that the rsync may clobber meta
data of plugins on the first run, which you may want to fix manually. For 
example, check_icmp needs to be run as root with the set uid bit, if 
check_icmp is updated (or moved to the worker node for the first time) by 
rsync, it will loose ownership by root. Once the plugin is in place with the 
permissions correct, rsync will leave it alone, however.


Additional Information
----------------------

For information on building, installation and configuration, please refer to
the INSTALL file. 

For the latest changes, please refer to the NEWS file.

For detailed information on source changes between versions, please refer to 
the ChangeLog file.

Please see the COPYING file for details on the GNU General Public License, 
under which this software is released.

About

This is a fork of DNX to enable individual clients to be affined to specific hosts.

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages