Poller service #1561

Closed
wants to merge 101 commits into from
101 commits
810a1d6
add mysql lock for polling
clinta Jul 6, 2015
7890556
Merge branch 'master' into poller-lock
clinta Jul 6, 2015
54da139
add swp to gitignore for vim
clinta Jul 6, 2015
bbe65e0
remove debug echos
clinta Jul 6, 2015
29969eb
limit sql query to amount_of_workers
clinta Jul 6, 2015
0a3a561
make poller-service executable
clinta Jul 6, 2015
e216724
add shebang
clinta Jul 6, 2015
5e36127
change variables
clinta Jul 6, 2015
6b14ece
first attempt at looping logic
clinta Jul 6, 2015
505f9bc
working
clinta Jul 6, 2015
0305312
cleaned up
clinta Jul 6, 2015
fdd2d8c
more efficient sql queries, working well all night
clinta Jul 7, 2015
02eb7bf
fix bug on retrying failed devices
clinta Jul 7, 2015
d6dc96b
thread numbers are meaningless
clinta Jul 7, 2015
e607c37
add proper logging
clinta Jul 7, 2015
42778b6
log level from config
clinta Jul 7, 2015
93ee4fb
get all config from config.php
clinta Jul 7, 2015
9ba9f65
first attempt at upstart conf
clinta Jul 7, 2015
0cafd02
poller service upstart conf working
clinta Jul 7, 2015
6bb661e
start docs
clinta Jul 7, 2015
91f005b
add discovery support to service
clinta Jul 7, 2015
53b8d3a
discover integration working
clinta Jul 7, 2015
b731821
documentation written
clinta Jul 7, 2015
0b55425
more details about which cronjobs this replaces
clinta Jul 7, 2015
1f98f5e
more details indocs
clinta Jul 7, 2015
a818a44
document loglevel
clinta Jul 7, 2015
3fe0bf2
add license to docstring
clinta Jul 7, 2015
e8a7c74
pep8
clinta Jul 7, 2015
ef97d1b
get a lock when doing discovery
clinta Jul 7, 2015
39ddbe2
Merge branch 'master' into poller-service
clinta Jul 10, 2015
7372b36
log progress to db every 5 minutes
clinta Jul 10, 2015
9e9149c
fix stupid query
clinta Jul 10, 2015
043ebf2
sql stuff
clinta Jul 10, 2015
7ab0602
notes
clinta Jul 10, 2015
0170ca8
update sql schema to add primary key for pollers
clinta Jul 13, 2015
07d3ade
update sql schema to add primary key for pollers
clinta Jul 13, 2015
1592af0
add last_poll_attempted column to devices table
clinta Jul 13, 2015
aeabb67
store last attempted in sql for simpler down_retry
clinta Jul 13, 2015
6ae58de
don't check dont-retry'
clinta Jul 13, 2015
8f1884e
don't loop too fast
clinta Jul 13, 2015
c91afca
release lock for 1 device
clinta Jul 13, 2015
e666d90
remove queue locks on empty query
clinta Jul 13, 2015
985a07d
bad index
clinta Jul 13, 2015
8eb436c
test for null last_attempted
clinta Jul 13, 2015
24f0b86
bad sql
clinta Jul 13, 2015
81b13e8
change to 2 minute updates
clinta Jul 13, 2015
67d4924
update every minute
clinta Jul 13, 2015
a2aebf7
catch sql errors
clinta Jul 13, 2015
ea0738e
exit on mysql error
clinta Jul 13, 2015
cbbdf72
don't try to discover down devices
clinta Jul 13, 2015
b52a0ad
Merge branch 'master' into poller-service
clinta Jul 14, 2015
615b695
fix conflicts and adjust coding style
clinta Jul 15, 2015
a9e5139
move schema in prep for rebase
clinta Jul 15, 2015
c07c8f5
Merge branch 'master' into poller-service
clinta Jul 15, 2015
f4be501
remove unnecessary exit conditions
clinta Jul 16, 2015
8818d4f
Merge branch 'master' into poller-service
clinta Jul 16, 2015
2e5703e
attribution
clinta Jul 16, 2015
eeec0c1
index and limit
clinta Jul 17, 2015
93c74ca
Merge branch 'master' into poller-service
clinta Jul 17, 2015
8c73655
zero length field in format
clinta Jul 19, 2015
74531c6
make logging python3 compatible
clinta Jul 19, 2015
c87d626
upper case warning
clinta Jul 19, 2015
fc890b0
Revert "upper case warning"
clinta Jul 19, 2015
1511c6e
Revert "make logging python3 compatible"
clinta Jul 19, 2015
42f0627
decode proc
clinta Jul 19, 2015
6d67431
fix logging options
clinta Jul 19, 2015
803241d
better logging error checking
clinta Jul 19, 2015
89b2267
Merge branch 'master' into poller-service
clinta Jul 21, 2015
ada9a55
give discovery it's own lock
clinta Jul 22, 2015
d7d48be
check for schema_update lock when poller-service runs
clinta Jul 22, 2015
e995b40
Merge branch 'master' into poller-service
clinta Jul 22, 2015
2b08915
bail out if we can't get a lock on schema_update
clinta Jul 22, 2015
8967871
add function to check for a lock
clinta Jul 22, 2015
2b90c5c
wait for all locks to be free when updating schema
clinta Jul 22, 2015
3594af1
bail if schema is already up to date
clinta Jul 22, 2015
57db854
move lock checks after bail out
clinta Jul 22, 2015
d1db466
fix bailout comparison
clinta Jul 22, 2015
5bf3d96
return instead of exit
clinta Jul 22, 2015
c640745
release schema lock
clinta Jul 22, 2015
631c102
Merge branch 'poller-service' of github.com:clinta/librenms into poll…
clinta Jul 22, 2015
576bd8c
Merge branch 'schema-locks' into poller-service
clinta Jul 22, 2015
be4d8be
typo
clinta Jul 22, 2015
d3a7764
Merge branch 'master' into poller-service
clinta Jul 22, 2015
82a975b
Merge branch 'master' into poller-service
clinta Jul 22, 2015
97c4c0b
Merge branch 'master' into poller-service
clinta Jul 22, 2015
02ddc69
simplify poller-service.conf
clinta Jul 23, 2015
f9d135d
symlink instead of copy
clinta Jul 23, 2015
ddff192
add LSB script
clinta Jul 23, 2015
eb2593a
add lsb docs
clinta Jul 23, 2015
13acb51
space
clinta Jul 23, 2015
f436f3b
run as librenms and background
clinta Jul 23, 2015
1b3072b
make and remove pidfile
clinta Jul 23, 2015
9976445
no removepidfile
clinta Jul 23, 2015
79f788a
add note to reload initctl after linking upstart job
clinta Jul 23, 2015
f867b82
Merge branch 'master' into poller-service
clinta Jul 24, 2015
841ce2a
fix unused code
clinta Jul 28, 2015
8eef06d
prepare for merge
clinta Jul 28, 2015
c7b1755
Merge branch 'master' into poller-service
clinta Jul 28, 2015
d7d06a4
Merge branch 'master' into poller-service
clinta Jul 29, 2015
577e9d5
multi-master mysql docs
clinta Jul 30, 2015
0ea5dfa
Merge branch 'master' into poller-service
clinta Jul 30, 2015
1 change: 1 addition & 0 deletions .gitignore
@@ -10,3 +10,4 @@ nbproject
 .alerts.lock
 .ircbot.alert
 .metadata_never_index
+*.swp
4 changes: 3 additions & 1 deletion discovery.php
@@ -106,7 +106,9 @@
 }
 
 foreach (dbFetch("SELECT * FROM `devices` WHERE status = 1 AND disabled = 0 $where ORDER BY device_id DESC") as $device) {
-    discover_device($device, $options);
+    if (dbGetLock('discovering.' . $device['device_id'])) {
+        discover_device($device, $options);
+    }
 }
 
 $end = utime();
29 changes: 29 additions & 0 deletions doc/Extensions/Poller-Service.md
@@ -0,0 +1,29 @@
# Poller Service
The Poller service is an alternative to the polling and discovery cron jobs and provides support for distributed polling without memcached. It is multi-threaded and runs continuously, discovering and polling the devices with the oldest data first while attempting to honor the polling frequency configured in `config.php`. This service replaces all the required cron jobs except for `/opt/librenms/daily.sh` and `/opt/librenms/alerts.php`.

Configure the maximum number of threads for the service in `$config['poller_service_workers']`. Configure the minimum desired polling frequency in `$config['poller_service_poll_frequency']` and the minimum desired discovery frequency in `$config['poller_service_discover_frequency']`. Both values are ages in seconds: the service will not poll or discover a device whose data is newer than the configured age. Configure how frequently the service will attempt to poll devices which are down in `$config['poller_service_down_retry']`.

The poller service is designed to degrade gracefully. If not all devices can be polled within the configured frequency, the service polls continuously, refreshing each device as frequently as the configured number of threads allows.
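
The selection rule described above (oldest data first, skip devices that are still fresh, retry down devices on their own interval) can be sketched as follows. This is only an illustration and not the code in `poller-service.py`, which is not shown here. The function name `select_due_devices` and the dictionary keys are assumptions made for the example; `last_poll_attempted` mirrors the column this pull request adds to the `devices` table.

```python
def select_due_devices(devices, now, poll_frequency=300, down_retry=60, workers=16):
    """Return up to `workers` device ids that are due, oldest data first.

    `devices` is a list of dicts with the (assumed) keys device_id, status
    (1 = up, 0 = down), last_polled and last_poll_attempted (epoch seconds
    or None when the device has never been polled or attempted).
    """
    due = []
    for dev in devices:
        last_polled = dev.get("last_polled")
        last_attempt = dev.get("last_poll_attempted")
        if dev["status"] == 0:
            # Down device: retry only once down_retry seconds have passed
            # since the last attempt.
            if last_attempt is not None and now - last_attempt < down_retry:
                continue
        elif last_polled is not None and now - last_polled < poll_frequency:
            # Up device whose data is newer than the configured age.
            continue
        due.append(dev)

    # Never-polled devices sort first, then the oldest last_polled value.
    due.sort(key=lambda d: (d.get("last_polled") is not None, d.get("last_polled") or 0))
    return [d["device_id"] for d in due[:workers]]
```

Because the list is always sorted by age and truncated to the worker count, an overloaded poller under this scheme simply works through the most stale devices first instead of failing outright.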

The service logs to syslog. A loglevel of INFO will print status updates every 5 minutes. Loglevel of DEBUG will print updates on every device as it is scanned.
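
As a rough sketch of how a configured level name such as `"INFO"` or `"DEBUG"` might be turned into a Python logging level, the helper below maps the string to a constant from the standard `logging` module and falls back to INFO for anything it does not recognise. The name `resolve_loglevel` and the fallback behaviour are assumptions for illustration, not necessarily what the service does.

```python
import logging

# Level names accepted in this sketch; anything else falls back to the default.
_LEVELS = {
    "DEBUG": logging.DEBUG,
    "INFO": logging.INFO,
    "WARNING": logging.WARNING,
    "ERROR": logging.ERROR,
    "CRITICAL": logging.CRITICAL,
}


def resolve_loglevel(name, default=logging.INFO):
    """Map a configured level name to a logging constant, tolerating bad input."""
    if not isinstance(name, str):
        return default
    return _LEVELS.get(name.strip().upper(), default)
```

The result can be passed to `logging.getLogger().setLevel(...)`; a handler such as `logging.handlers.SysLogHandler` would then deliver the messages to syslog.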

## Configuration
```php
// Poller-Service settings
$config['poller_service_loglevel'] = "INFO";
$config['poller_service_workers'] = 16;
$config['poller_service_poll_frequency'] = 300;
$config['poller_service_discover_frequency'] = 21600;
$config['poller_service_down_retry'] = 60;
```

## Distributed Polling
Distributed polling is possible and uses the same configuration options as traditional distributed polling, except that the memcached options are not necessary. The database must be accessible from the distributed pollers and properly configured. Remote access to the RRD directory must also be configured as described in the Distributed Poller documentation. Concurrency is managed using MySQL `GET_LOCK` to ensure that each device is polled by only one poller at a time. The poller service is compatible with poller groups.
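
A minimal sketch of the `GET_LOCK` pattern, assuming a DB-API cursor from a MySQL driver that uses `%s` placeholders (MySQLdb and PyMySQL do). The helper names `try_lock`, `release_lock` and `poll_if_unlocked` are hypothetical; the lock name `polling.<device_id>` follows the names checked by the schema update query in this pull request.

```python
def try_lock(cursor, name, timeout=0):
    """Try to take a named MySQL lock; True only if GET_LOCK returned 1."""
    cursor.execute("SELECT GET_LOCK(%s, %s)", (name, timeout))
    row = cursor.fetchone()
    # GET_LOCK returns 1 on success, 0 on timeout and NULL on error.
    return row is not None and row[0] == 1


def release_lock(cursor, name):
    """Release a named lock held by this connection."""
    cursor.execute("SELECT RELEASE_LOCK(%s)", (name,))
    row = cursor.fetchone()
    return row is not None and row[0] == 1


def poll_if_unlocked(cursor, device_id, poll):
    """Run poll(device_id) only if no other poller holds the device's lock."""
    name = "polling.%d" % device_id
    if not try_lock(cursor, name):
        return False  # another poller is already working on this device
    try:
        poll(device_id)
    finally:
        release_lock(cursor, name)
    return True
```

Named locks belong to the connection that took them and are released when that connection closes. On MySQL versions before 5.7.5 a connection can hold only one named lock at a time, so in a sketch like this each worker thread would need its own connection.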

## Multi-Master MySQL considerations
Because locks are not replicated in Multi-Master MySQL configurations, if you are using such a configuration, you will need to make sure that all pollers are using the same MySQL server.

## Service Installation
An upstart configuration `poller-service.conf` is provided. To install, run `ln -s /opt/librenms/poller-service.conf /etc/init/poller-service.conf`. The service will start on boot and can be started manually by running `start poller-service`. If you receive an error that the service does not exist, run `initctl reload-configuration`. The service is configured to run as the user `librenms` and will fail if that user does not exist.

An LSB init script `poller-service.init` is also provided. To install, run `ln -s /opt/librenms/poller-service.init /etc/init.d/poller-service && update-rc.d poller-service defaults`.
34 changes: 34 additions & 0 deletions includes/dbFacile.php
@@ -59,6 +59,40 @@ function dbQuery($sql, $parameters=array()) {
 }//end dbQuery()
 
 
+/*
+ * Acquire a lock on a string
+ * */
+
+
+function dbGetLock($data, $timeout = 0) {
+    $sql = 'SELECT GET_LOCK(\'' . $data . '\',' . $timeout . ')';
+    $result = dbFetchCell($sql);
+    return $result;
+}
+
+/*
+ * Check a lock on a string
+ * */
+
+
+function dbCheckLock($data) {
+    $sql = 'SELECT IS_FREE_LOCK(\'' . $data . '\')';
+    $result = dbFetchCell($sql);
+    return $result;
+}
+
+/*
+ * Release a lock on a string
+ * */
+
+
+function dbReleaseLock($data) {
+    $sql = 'SELECT RELEASE_LOCK(\'' . $data . '\')';
+    $result = dbFetchCell($sql);
+    return $result;
+}
+
+
 /*
  * Passed an array and a table name, it attempts to insert the data into the table.
  * Check for boolean false to determine whether insert failed
18 changes: 18 additions & 0 deletions includes/sql-schema/update.php
@@ -84,6 +84,22 @@
 
 asort($filelist);
 
+if (explode('.', max($filelist), 2)[0] <= $db_rev) {
+    if ($debug) {
+        echo "DB Schema already up to date.\n";
+    }
+    return;
+}
+
+if (!dbGetLock('schema_update')) {
+    echo "Schema update already in progress. Exiting\n";
+    exit(1);
+} //end if
+
+do {
+    sleep(1);
+} while (@dbFetchCell('SELECT COUNT(*) FROM `devices` WHERE NOT IS_FREE_LOCK(CONCAT("polling.", device_id)) OR NOT IS_FREE_LOCK(CONCAT("queued.", device_id)) OR NOT IS_FREE_LOCK(CONCAT("discovering.", device_id))') > 0);
+
 foreach ($filelist as $file) {
     list($filename,$extension) = explode('.', $file, 2);
     if ($filename > $db_rev) {
@@ -145,3 +161,5 @@
 
     echo "-- Done\n";
 }
+
+dbReleaseLock('schema_update');
20 changes: 20 additions & 0 deletions poller-service.conf
@@ -0,0 +1,20 @@
# poller-service - SNMP polling service for LibreNMS

description "SNMP polling service for LibreNMS"
author "Clint Armstrong <clint@clintarmstrong.net>"

# When to start the service
start on runlevel [2345]

# When to stop the service
stop on runlevel [016]

# Automatically restart process if crashed
respawn

chdir /opt/librenms
setuid librenms
setgid librenms

# Start the process
exec /opt/librenms/poller-service.py
79 changes: 79 additions & 0 deletions poller-service.init
@@ -0,0 +1,79 @@
### BEGIN INIT INFO
# Provides:          poller-service
# Required-Start:    networking
# Required-Stop:     networking
# Default-Start:     2 3 4 5
# Default-Stop:      0 1 6
# Short-Description: The LibreNMS poller-service daemon
# Description:       The LibreNMS poller-service daemon
#                    This polls devices monitored by LibreNMS
### END INIT INFO

. /lib/lsb/init-functions

NAME=poller-service

DAEMON=/opt/librenms/poller-service.py

USER=librenms

PIDFILE=/var/run/poller-service.pid

test -x $DAEMON || exit 5

case $1 in

    start)
        # Check that the PID file exists and check the actual status of the process
        if [ -e $PIDFILE ]; then
            status_of_proc -p $PIDFILE $DAEMON "$NAME process" && status="0" || status="$?"
            # If the status is SUCCESS then there is no need to start again.
            if [ $status = "0" ]; then
                exit # Exit
            fi
        fi
        # Start the daemon.
        log_daemon_msg "Starting the process" "$NAME"
        # Start the daemon with the help of start-stop-daemon
        # Log the message appropriately
        if start-stop-daemon --start --quiet --oknodo --make-pidfile --pidfile $PIDFILE --exec $DAEMON --chuid $USER --background; then
            log_end_msg 0
        else
            log_end_msg 1
        fi
        ;;

    stop)
        # Stop the daemon.
        if [ -e $PIDFILE ]; then
            status_of_proc -p $PIDFILE $DAEMON "Stopping the $NAME process" && status="0" || status="$?"
            if [ "$status" = 0 ]; then
                start-stop-daemon --stop --quiet --oknodo --pidfile $PIDFILE
                /bin/rm -rf $PIDFILE
            fi
        else
            log_daemon_msg "$NAME process is not running"
            log_end_msg 0
        fi
        ;;
    restart)
        # Restart the daemon.
        $0 stop && sleep 2 && $0 start
        ;;

    status)
        # Check the status of the process.
        if [ -e $PIDFILE ]; then
            status_of_proc -p $PIDFILE $DAEMON "$NAME process" && exit 0 || exit $?
        else
            log_daemon_msg "$NAME process is not running"
            log_end_msg 0
        fi
        ;;

    *)
        # For invalid arguments, print the usage message.
        echo "Usage: $0 {start|stop|restart|status}"
        exit 2
        ;;
esac