UNKNOWN Dynamic Smart Array P410i hpacucli #151

Closed
rajo opened this issue Oct 26, 2016 · 13 comments

Comments


rajo commented Oct 26, 2016

Hi there,

and another "unknown". Debug as follows:

root@b:~# /usr/lib/nagios/plugins/check_raid
UNKNOWN: hpacucli:[Smart Array P410i: Array A(OK)[LUN1:OK], Smart Array P411: ]
root@b:~# /usr/lib/nagios/plugins/check_raid  -d
DEBUG EXEC: /usr/sbin/hpacucli controller all show status at /usr/lib/nagios/plugins/check_raid line 435.
DEBUG EXEC: /usr/sbin/hpacucli controller slot=0 logicaldrive all show at /usr/lib/nagios/plugins/check_raid line 435.
DEBUG EXEC: /usr/sbin/hpacucli controller slot=3 logicaldrive all show at /usr/lib/nagios/plugins/check_raid line 435.
UNKNOWN: hpacucli:[Smart Array P410i: Array A(OK)[LUN1:OK], Smart Array P411: ]
root@b:~# /usr/sbin/hpacucli controller all show status

Smart Array P410i in Slot 0 (Embedded)
   Controller Status: OK
   Cache Status: OK
   Battery/Capacitor Status: OK

Smart Array P411 in Slot 3
   Controller Status: OK
   Cache Status: Not Configured


root@b:~# /usr/sbin/hpacucli controller slot=0 logicaldrive all show

Smart Array P410i in Slot 0 (Embedded)

   array A

      logicaldrive 1 (838.1 GB, RAID 5, OK)

root@b:~# /usr/sbin/hpacucli controller slot=3 logicaldrive all show

Error: The specified device does not have any logical drives.

root@b:~#

Thanks!

Owner

glensc commented Nov 17, 2016

so, it's expected that /usr/sbin/hpacucli controller slot=3 logicaldrive all show returns an error:

Error: The specified device does not have any logical drives.

what should check_raid do here?

Author

rajo commented Nov 17, 2016

Well, that's true for the second controller in Slot 3, but there is a working RAID in Slot 0 which is in state OK. So the question is how to behave in such a situation. My suggestion would be that the aggregate state of all controllers should be returned. If there is nothing configured on a controller, then ... well ... one could either ignore it, or have a command line switch (similar to the existing ones) where one can define the return value for such a case. Another option would be a command line switch for specifying which controller should be queried if multiple are found; so in this case one could restrict check_raid to slot=0 and ignore anything else, as it isn't used.

So in short, I'd expect an "OK" as result, question is how this can be achieved :-)
(Don't know which solution is the more feasible code-wise, what do you suggest?)

Thanks for your time

Owner

glensc commented Nov 17, 2016

well, i see your second controller as failing: the controller status view reports it as ok, but the detailed (logical drive) view returns an error, so the aggregate result is CRITICAL.

however, for your use case --hpacucli-option=slot=1 would make sense. a framework for plugin-specific options already exists; i'll see what can be done.

one option is perhaps to handle the very specific error "Error: The specified device does not have any logical drives." and use state OK. but then again, some may consider this an error situation: when a controller must have logical drives and they are suddenly gone, reporting OK is unacceptable.
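
The trade-off above (treat the "no logical drives" error as OK, or as a failure) can be made explicit by letting the caller choose the state. What follows is an illustrative Python sketch, not the plugin's Perl code: the function names, the noraid_state parameter, and the severity ordering are assumptions made for this example, and any logical drive status other than OK is simplified to CRITICAL.

```python
import re

# Standard Nagios exit codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3

# Assumed ordering for picking the "worst" state (hypothetical, for illustration):
# CRITICAL > WARNING > UNKNOWN > OK.
SEVERITY = {OK: 0, UNKNOWN: 1, WARNING: 2, CRITICAL: 3}

NO_LD_ERROR = "Error: The specified device does not have any logical drives."

# Matches lines such as: logicaldrive 1 (838.1 GB, RAID 5, OK)
LD_RE = re.compile(r"logicaldrive\s+\d+\s+\(([^)]*)\)")


def controller_state(output, noraid_state=UNKNOWN):
    """Map the text of 'logicaldrive all show' for one controller to a state."""
    if NO_LD_ERROR in output:
        # The caller decides what "no logical drives" means for this host.
        return noraid_state
    # The status is the last comma-separated field inside the parentheses.
    statuses = [m.group(1).split(",")[-1].strip() for m in LD_RE.finditer(output)]
    if not statuses:
        return UNKNOWN
    # Simplification: anything other than OK is treated as CRITICAL here.
    return OK if all(s == "OK" for s in statuses) else CRITICAL


def aggregate(states):
    """Return the most severe state; UNKNOWN if no controller was checked."""
    return max(states, key=SEVERITY.__getitem__, default=UNKNOWN)
```

With the outputs from this issue, slot 0 maps to OK and slot 3 maps to whatever noraid_state is set to, so the aggregate stays UNKNOWN by default and becomes OK only when the caller opts in with noraid_state=OK.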

@glensc glensc closed this as completed in 1c9e0a2 Nov 17, 2016
Owner

glensc commented Nov 17, 2016

once the travis build finishes, you can try the snapshot build:
https://github.com/glensc/nagios-plugin-check_raid/releases/tag/snapshot

Author

rajo commented Nov 18, 2016

First, thanks for the option but unfortunately it doesn't seem to work. I used the check_raid.pl from the above mentioned snapshot which reports its version as

./check_raid.pl -V
check_raid 4.0.2-61-g1c9e0a2

Considering that I can find the following comment in the file itself, I'd guess this is the correct version:

# if --plugin-option=hpacucli-target=slot=0 is specified
# filter only allowed values

So, I'm invoking the following command

./check_raid.pl -p hpacucli --plugin-option=hpacucli-target=slot=0
UNKNOWN: hpacucli:[Smart Array P410i: Array A(OK)[LUN1:OK], Smart Array P411: ]

Debug output gives:

root@xxx:~# ./check_raid.pl -p hpacucli --plugin-option=hpacucli-target=slot=0 -d
Visit <https://github.com/glensc/nagios-plugin-check_raid#reporting-bugs> how to report bugs

DEBUG EXEC: /usr/sbin/hpacucli controller all show status at ./check_raid.pl line 482.
DEBUG EXEC: /usr/sbin/hpacucli controller slot=3 logicaldrive all show at ./check_raid.pl line 482.
DEBUG EXEC: /usr/sbin/hpacucli controller slot=0 logicaldrive all show at ./check_raid.pl line 482.
UNKNOWN: hpacucli:[Smart Array P410i: Array A(OK)[LUN1:OK], Smart Array P411: ]
root@xxx:~#  /usr/sbin/hpacucli controller all show status

Smart Array P410i in Slot 0 (Embedded)
   Controller Status: OK
   Cache Status: OK
   Battery/Capacitor Status: OK

Smart Array P411 in Slot 3
   Controller Status: OK
   Cache Status: Not Configured


root@xxx:~# /usr/sbin/hpacucli controller slot=0 logicaldrive all show

Smart Array P410i in Slot 0 (Embedded)

   array A

      logicaldrive 1 (838.1 GB, RAID 5, OK)

root@xxx:~# /usr/sbin/hpacucli controller slot=3 logicaldrive all show

Error: The specified device does not have any logical drives.

root@xxx:~#

Also changing slot=0 to slot=3 or even something else doesn't change the output in any way. Or am I using a wrong syntax here? At least that's what I've understood from your commit message / help screen.

(BTW: should #139 be fixed with this build as well? If so, this doesn't seem to work either.)

@glensc glensc reopened this Nov 18, 2016
Owner

glensc commented Nov 18, 2016

the #139 problem was slot=0b: the code required the slot to be numeric only. it's fixed as far as i can test with the outputs given. open a new issue or post to the existing one if you have more details.
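
To illustrate the numeric-only pitfall mentioned here, the following Python sketch extracts slot identifiers from "controller all show status" output while accepting alphanumeric slots such as 0b. It is not the plugin's actual Perl code; the regular expression and the parse_controllers name are assumptions for this example.

```python
import re

# Controller header lines start at column 0, e.g.
#   Smart Array P410i in Slot 0 (Embedded)
# Indented detail lines ("   Controller Status: OK") are skipped by \S.
# \w+ (rather than \d+) lets a slot like "0b" be captured whole.
CONTROLLER_RE = re.compile(
    r"^(?P<model>\S.*?) in Slot (?P<slot>\w+)", re.MULTILINE
)


def parse_controllers(status_output):
    """Return a dict mapping slot identifier (as a string) to controller model."""
    return {
        m.group("slot"): m.group("model")
        for m in CONTROLLER_RE.finditer(status_output)
    }
```

Slots are kept as strings throughout, so a later comparison against a user-supplied value such as slot=0b does not depend on the identifier being an integer.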

Owner

glensc commented Nov 18, 2016

the commandline option seems correct:

[~/scm/nagios/check_raid (issue151)⚡] ➔ ./check_raid.sh -p hpacucli --plugin-option=hpacucli-target=slot=0 -d
$VAR1 = {
          'enable_plugins' => [
                                'hpacucli'
                              ],
          'hpacucli-target' => 'slot=0'
        };
Died at /home/glen/scm/nagios/check_raid/bin/check_raid.pl line 148.
[~/scm/nagios/check_raid (issue151)⚡] ➔ git diff
diff --git a/bin/check_raid.pl b/bin/check_raid.pl
index a7d5dfc..5e15f45 100755
--- a/bin/check_raid.pl
+++ b/bin/check_raid.pl
@@ -144,6 +144,9 @@ if (my $opts = $mp->opts->get('plugin-option')) {
        }
 }

+use Data::Dumper;
+print Dumper \%plugin_options; die;
+
 my $mc = App::Monitoring::Plugin::CheckRaid->new(%plugin_options);

 $App::Monitoring::Plugin::CheckRaid::Utils::debug = $mp->opts->debug;

glensc added a commit that referenced this issue Nov 18, 2016
Owner

glensc commented Nov 18, 2016

if it still doesn't work with current master, could you apply the following patch to a git checkout and run it:

diff --git a/lib/App/Monitoring/Plugin/CheckRaid/Plugins/hpacucli.pm b/lib/App/Monitoring/Plugin/CheckRaid/Plugins/hpacucli.pm
index c113eb3..44b1109 100644
--- a/lib/App/Monitoring/Plugin/CheckRaid/Plugins/hpacucli.pm
+++ b/lib/App/Monitoring/Plugin/CheckRaid/Plugins/hpacucli.pm
@@ -40,7 +40,10 @@ sub sudo {
 sub filter_targets {
        my ($this, $targets) = @_;

+       use Data::Dumper;
        my $cli_opts = $this->{options}{'hpacucli-target'};
+       print "Input Targets: ". Dumper $targets;
+       print "CLI: ". Dumper $cli_opts;
        if (!$cli_opts) {
                return $targets;
        }
@@ -54,6 +57,7 @@ sub filter_targets {
                        $this->critical->message("Controller $filter not found");
                }
        }
+       print "Return Targets". Dumper \%res;

        return \%res;
 }

Owner

glensc commented Nov 18, 2016

ok, found a bug in the commandline option parsing, so just use git master or the last snapshot build

Owner

glensc commented Nov 18, 2016

85423dd - controllers that return "Error: The specified device does not have any logical drives." are now reported with the state given by --noraid=STATE. you can also specify which slots to monitor with --plugin-option=hpacucli-target=slot=0

same as #145

use check_raid.pl from the snapshot release once the build finishes, or git master.

i plan to make a release soon anyway.
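
For readers who want the idea behind hpacucli-target in a few lines, here is a rough Python sketch of target filtering, loosely modelled on the filter_targets patch quoted earlier in this thread. It is not the plugin's implementation: the shape of targets (a dict keyed by strings like "slot=0") and the list-of-selectors argument are assumptions made for illustration.

```python
def filter_targets(targets, selectors):
    """Keep only the controllers named in selectors.

    targets:   dict such as {"slot=0": "Smart Array P410i"} (assumed shape)
    selectors: list such as ["slot=0"]; empty or None means "no filtering"
    Returns (kept_targets, error_messages).
    """
    if not selectors:
        # No filter given: every detected controller is checked.
        return dict(targets), []

    kept, errors = {}, []
    for selector in selectors:
        if selector in targets:
            kept[selector] = targets[selector]
        else:
            # A requested controller that is missing is reported, not ignored.
            errors.append(f"Controller {selector} not found")
    return kept, errors
```

Restricting the check to slot=0 on the host from this issue would leave the P411 in slot 3 out of the result entirely, while a selector that matches nothing yields an error message rather than a silent OK.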

Author

rajo commented Nov 21, 2016

Tested the latest snapshot (4.0.3-10-gf55300e) and can confirm that it is working and both my issues are fixed.

Thank you very much for your effort.

Owner

glensc commented Nov 21, 2016

you can just use the latest 4.0.3 release too


SecriiNicolae commented Sep 23, 2020

Help =(

Repository owner locked as off-topic and limited conversation to collaborators Sep 23, 2020