We're running check_postgres.pl like this:
check_postgres.pl -u nagios --action=last_vacuum --exclude=~^pg --db=csi
The output is:
No matching tables found due to exclusion/inclusion options
But that can't be right. The query that ends up running is (reformatted):
SELECT current_database(), nspname, relname,
CASE WHEN v IS NULL THEN -1 ELSE round(extract(epoch FROM now()-v)) END,
CASE WHEN v IS NULL THEN '?' ELSE TO_CHAR(v, 'HH24:MI FMMonth DD, YYYY') END
FROM (
SELECT nspname, relname, GREATEST(
pg_stat_get_last_vacuum_time(c.oid),
pg_stat_get_last_autovacuum_time(c.oid)
) AS v
FROM pg_class c, pg_namespace n
WHERE relkind = 'r'
AND n.oid = c.relnamespace
AND n.nspname <> 'information_schema'
ORDER BY 3
) AS foo;
When I run that manually, I get rows like:
csi | pg_catalog | pg_authid | 390524 | 23:16 March 20, 2010
csi | public | foo | -1 | ?
csi | pg_catalog | pg_auth_members | -1 | ?
So we do have a table, "foo", that is not excluded. However, it has never been vacuumed, so the round column is -1. The message should not say that no matching tables were found because of the exclusion: a table was found and not excluded, it just has never been vacuumed. So it should probably say something like "no matching tables have ever been vacuumed" instead.
I think the reason it works that way is this bit of code in check_last_vacuum_analyze():
SLURP: while ($db->{slurp} =~ /(\S+)\s+\| (\S+)\s+\| (\S+)\s+\|\s+(\-?\d+) \| (.+)\s*$/gm) {
my ($dbname,$schema,$name,$time,$ptime) = ($1,$2,$3,$4,$5);
$maxtime = -3 if $maxtime == -1;
if (skip_item($name, $schema)) {
$maxtime = -2 if $maxtime < 1;
next SLURP;
}
So looking at the three rows returned above, it looks like:
- row one is excluded and $maxtime set to -2
- row two is not excluded, but $maxtime is -1 and so gets set to -3
- row three is excluded and $maxtime set to -2
Since the last row fetched set $maxtime to -2, this code then gets triggered:
if ($maxtime == -2) {
add_unknown msg('no-match-table');
}
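To make the walk-through above easier to check, here is a minimal standalone sketch that replays the unpatched loop over the three sample rows. It is not the real check_postgres.pl code path: skip_item_stub is a hypothetical stand-in for skip_item() that only mimics --exclude=~^pg, and the starting value of -1 for $maxtime is an assumption, since the initialization is not quoted here. The intermediate values depend on that assumption; the point is only that the final value is -2 whenever the last row processed is an excluded one.

```perl
use strict;
use warnings;

# The three sample rows from the manual query, in psql's pipe-separated form.
my $slurp = join "\n",
    'csi | pg_catalog | pg_authid | 390524 | 23:16 March 20, 2010',
    'csi | public | foo | -1 | ?',
    'csi | pg_catalog | pg_auth_members | -1 | ?';

# Hypothetical stand-in for skip_item(): only mimics --exclude=~^pg.
sub skip_item_stub {
    my ($name, $schema) = @_;
    return $name =~ /^pg/ ? 1 : 0;
}

my $maxtime = -1;    # ASSUMPTION: initial value, not shown in the quoted code
my @trace;

SLURP: while ($slurp =~ /(\S+)\s+\| (\S+)\s+\| (\S+)\s+\|\s+(\-?\d+) \| (.+)\s*$/gm) {
    my ($dbname, $schema, $name, $time, $ptime) = ($1, $2, $3, $4, $5);
    $maxtime = -3 if $maxtime == -1;
    if (skip_item_stub($name, $schema)) {
        $maxtime = -2 if $maxtime < 1;
        push @trace, "$schema.$name skipped, maxtime=$maxtime";
        next SLURP;
    }
    # Same comparison as in the real loop body (see the patch context).
    $maxtime = $time if $time > $maxtime;
    push @trace, "$schema.$name kept, maxtime=$maxtime";
}

print "$_\n" for @trace;
print "final maxtime=$maxtime\n";
```

Under these assumptions the loop finishes with $maxtime at -2, which is exactly the value that triggers msg('no-match-table'), even though public.foo was not skipped.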
But that's wrong. I think the code needs to know that unexcluded rows were returned (the second row in this example) but were never vacuumed. I'm not sure how you'd go about that using $maxtime as a flag; maybe it needs some other flag? Maybe something like this?
--- a/check_postgres.pl
+++ b/check_postgres.pl
@@ -3469,6 +3469,7 @@ sub check_last_vacuum_analyze {
my ($minrel,$maxrel) = ('?','?'); ## no critic
my $mintime = 0; ## used for MRTG only
my $count = 0;
+ my $unskipped;
SLURP: while ($db->{slurp} =~ /(\S+)\s+\| (\S+)\s+\| (\S+)\s+\|\s+(\-?\d+) \| (.+)\s*$/gm) {
my ($dbname,$schema,$name,$time,$ptime) = ($1,$2,$3,$4,$5);
$maxtime = -3 if $maxtime == -1;
@@ -3476,6 +3477,7 @@ sub check_last_vacuum_analyze {
$maxtime = -2 if $maxtime < 1;
next SLURP;
}
+ $unskipped ||= 1;
$db->{perf} .= " $dbname.$schema.$name=${time}s;$warning;$critical" if $time >= 0;
if ($time > $maxtime) {
$maxtime = $time;
@@ -3497,7 +3499,7 @@ sub check_last_vacuum_analyze {
}
if ($maxtime == -2) {
- add_unknown msg('no-match-table');
+ add_unknown msg($unskipped ? 'no-vacuumed-table' : 'no-match-table');
}
elsif ($maxtime < 0) {
add_unknown $type eq 'vacuum' ? msg('vac-nomatch-v') : msg('vac-nomatch-a');
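As a rough illustration of what the patch changes, the sketch below isolates the flag logic in a hypothetical helper, pick_message, which is not part of check_postgres.pl. It assumes $maxtime starts at -1, uses a plain regex in place of skip_item(), and returns message keys as strings instead of calling add_unknown. The key 'no-vacuumed-table' is the new one proposed in the patch and would still need a matching message text.

```perl
use strict;
use warnings;

# Hypothetical helper: returns the message key the patched logic would pick.
# $rows is an array ref of [schema, name, seconds_since_vacuum] triples;
# $exclude_re stands in for the --exclude option.
sub pick_message {
    my ($rows, $exclude_re) = @_;
    my $maxtime = -1;    # ASSUMPTION: initial value in the real sub
    my $unskipped;       # the new flag from the patch

    for my $row (@$rows) {
        my ($schema, $name, $time) = @$row;
        $maxtime = -3 if $maxtime == -1;
        if ($name =~ $exclude_re) {
            $maxtime = -2 if $maxtime < 1;
            next;
        }
        $unskipped ||= 1;
        $maxtime = $time if $time > $maxtime;
    }

    if ($maxtime == -2) {
        return $unskipped ? 'no-vacuumed-table' : 'no-match-table';
    }
    return 'vac-nomatch' if $maxtime < 0;    # stands for vac-nomatch-v / -a
    return 'ok';
}

my @sample = (
    [ 'pg_catalog', 'pg_authid',       390524 ],
    [ 'public',     'foo',             -1 ],
    [ 'pg_catalog', 'pg_auth_members', -1 ],
);

my $mixed    = pick_message(\@sample, qr/^pg/);
my $all_excl = pick_message([ $sample[0], $sample[2] ], qr/^pg/);

print "mixed rows:   $mixed\n";
print "all excluded: $all_excl\n";
```

In this sketch the sample rows yield 'no-vacuumed-table', while a result set containing only pg_ tables still yields 'no-match-table'. Note that the sketch remains order dependent: if the unexcluded, never-vacuumed row comes last, $maxtime ends at -1 and the elsif branch is taken instead.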
Thanks.
David