Consistency for --noraid: No controller vs No Volumes; software RAID implications #138

Closed
robbat2 opened this issue Mar 17, 2016 · 5 comments · Fixed by #172

Comments

@robbat2
Contributor

robbat2 commented Mar 17, 2016

This is to split out the discussion on issue #130 regarding the return state of the dm plugin when there are no RAID volumes.

The existing help for --noraid says:

 --noraid=STRING
   Return STATE if no RAID controller is found. Defaults to UNKNOWN

In the case of mdstat, dmraid, and dm, there is no controller per se, so special handling is needed.
The --noraid help should probably read: Return STATE if no RAID volumes are found.

Both the dmraid and mdstat plugins have their own active sub that doesn't report active unless there are ALSO volumes, which is probably a good thing to do for dm.
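
A minimal sketch of that "active only if there are also volumes" idea for dm, written in Python rather than the plugin's own language. It assumes `dmsetup status` prints one line per device in the form `name: start length target ...`, and it treats only the `raid` and `mirror` targets as RAID volumes. The function names, the target set, and the sample output are all hypothetical, not taken from check_raid.

```python
# Hypothetical sketch: a dm plugin counts as "active" only if dmsetup is
# present AND at least one device-mapper device uses a RAID-like target.
RAID_TARGETS = {"raid", "mirror"}  # assumption: targets counted as RAID


def raid_volumes(dmsetup_status):
    """Return the names of devices whose target type is RAID-like."""
    names = []
    for line in dmsetup_status.splitlines():
        name, sep, rest = line.partition(": ")
        fields = rest.split()
        # Assumed fields: start, length, target, then target-specific params.
        if sep and len(fields) >= 3 and fields[2] in RAID_TARGETS:
            names.append(name)
    return names


def dm_active(program_present, dmsetup_status):
    """Active only when the tool exists and RAID volumes were found."""
    return program_present and bool(raid_volumes(dmsetup_status))


# Made-up sample: plain LVM volumes only, no RAID targets.
SAMPLE_LVM_ONLY = "vg0-root: 0 41943040 linear\nvg0-swap: 0 8388608 linear\n"
# Made-up sample: one LVM RAID1 volume next to a linear one.
SAMPLE_WITH_RAID = (
    "vg0-root: 0 41943040 linear\n"
    "vg0-data: 0 2097152 raid raid1 2 AA 2097152/2097152 idle 0\n"
)

print(dm_active(True, SAMPLE_LVM_ONLY))
print(dm_active(True, SAMPLE_WITH_RAID))
```

With this rule, a host that only uses plain LVM would leave the dm plugin inactive instead of reporting "no devices".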

A further case that needs to be handled is the tools being installed but NO controllers being present; e.g., the megacli plugin presently reports OK in this case:

# check_raid.sh  -p megacli
OK: megacli:[Volumes(0): ; Devices(0): ]

Here's a list of possible cases; I see 12 possible combinations, of which 4 aren't valid.

  • program absent, controller absent OR present OR not-applicable, num-volumes = 0 (3 states) => plugin not active
  • program absent, controller absent OR present OR not-applicable, num-volumes > 0 (3 states, all invalid)
  • program present, controller absent, num-volumes = 0
  • program present, controller absent, num-volumes > 0 (invalid state)
  • program present, controller present, num-volumes = 0
  • program present, controller present, num-volumes > 0
  • program present, controller not-applicable, num-volumes = 0
  • program present, controller not-applicable, num-volumes > 0
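
The count above can be checked mechanically. This is a small Python sketch that enumerates the three dimensions and marks a combination invalid when volumes are reported without a program to report them, or without a controller to hold them. The names and the `is_valid` rule are my reading of the list, not code from check_raid.

```python
from itertools import product

PROGRAM = ("absent", "present")
CONTROLLER = ("absent", "present", "not-applicable")
VOLUMES = ("0", ">0")


def is_valid(program, controller, volumes):
    """Volumes cannot be seen without the program, nor exist without a controller."""
    if volumes == ">0" and program == "absent":
        return False
    if volumes == ">0" and controller == "absent":
        return False
    return True


combos = list(product(PROGRAM, CONTROLLER, VOLUMES))
invalid = [c for c in combos if not is_valid(*c)]

print(len(combos), "combinations,", len(invalid), "invalid")
for combo in combos:
    print(combo, "valid" if is_valid(*combo) else "invalid")
```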
@robbat2
Contributor Author

robbat2 commented Mar 17, 2016

If we change the definition to be about volumes rather than the controller, then the list of cases gets much simpler:

  • program absent, num-volumes = 0 => plugin not active
  • program absent, num-volumes > 0 => invalid state
  • program present, num-volumes = 0 => return --noraid state
  • program present, num-volumes > 0 => return actual volume state.

This is problematic: if you're using LVM at all, then dmsetup is present, regardless of whether you use LVM RAID/mirroring. A simple example would be a user with a hardware RAID controller, and LVM on top of that. Even if the hardware controller volumes are all good, dm would still presently report no devices.
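
The four-case rule and the LVM problem can be sketched in Python as follows. The function `plugin_state` and its parameters are hypothetical; the `UNKNOWN` default is taken from the --noraid help text quoted earlier, and `None` stands in for "plugin not active".

```python
def plugin_state(program_present, num_volumes, volume_state, noraid_state="UNKNOWN"):
    """Hypothetical volume-based rule for a single plugin.

    Returns None when the plugin is not active, the --noraid state when the
    program exists but finds no RAID volumes, else the real volume state.
    """
    if not program_present:
        if num_volumes > 0:
            raise ValueError("invalid state: volumes reported without the program")
        return None  # plugin not active
    if num_volumes == 0:
        return noraid_state
    return volume_state


# Hardware RAID with plain LVM on top: arcconf sees one healthy volume, while
# dmsetup is installed (because of LVM) but there are no dm RAID volumes.
print("arcconf:", plugin_state(True, 1, "OK"))
print("dm:", plugin_state(True, 0, "OK"))
```

Under this rule the dm plugin returns the --noraid state on such a host even though every hardware volume is healthy, which is exactly the case described above.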

@Napsty

Napsty commented Mar 21, 2016

Quoting @robbat2: "A simple example would be a user with a hardware RAID controller, and LVM on top of that.".
Well, that's me. On a new physical server with an Adaptec RAID controller (arcconf is installed), I get a "WARNING" state back from check_raid because I use logical volumes on this server.

./check_raid.pl -V
check_raid 4.0.2

./check_raid.4.0.2.pl -d
Visit <https://github.com/glensc/nagios-plugin-check_raid#reporting-bugs> how to report bugs

DEBUG EXEC: /usr/bin/arcconf GETSTATUS 1 at ./check_raid.4.0.2.pl line 474.
DEBUG EXEC: /usr/bin/arcconf GETCONFIG 1 AL at ./check_raid.4.0.2.pl line 474.
DEBUG EXEC: /sbin/dmsetup status --noflush at ./check_raid.4.0.2.pl line 474.
WARNING: arcconf:[Controller:Optimal, Logical Device 0(raid10):Optimal, Drives: WD-WMC4N0641212,WD-WMC4N0740277,WD-WCC4E1868847,WD-WCC4N0SUV4ET=Online]; dm:[No devices to check]

It works with 4.0.1:

./check_raid.pl -V
check_raid 4.0.1

./check_raid.pl -d
DEBUG EXEC: /usr/bin/arcconf GETSTATUS 1 at ./check_raid.pl line 474.
DEBUG EXEC: /usr/bin/arcconf GETCONFIG 1 AL at ./check_raid.pl line 474.
OK: arcconf:[Controller:Optimal, Logical Device 0(raid10):Optimal, Drives: WD-WMC4N0641212,WD-WMC4N0740277,WD-WCC4E1868847,WD-WCC4N0SUV4ET=Online]

@glensc
Owner

glensc commented Mar 21, 2016

@Napsty: 4.0.2 added the new dm plugin; it's the only change in that version. You can go back to 4.0.1 or add -p arcconf to use only the arcconf plugin until this is resolved.

@Napsty

Napsty commented Mar 21, 2016

Yes, saw it in the changelog. I'll use 4.0.1 for now and will test a future release again. Thanks.

@glensc glensc changed the title Consistency for --noraid: No controller vs No Volumes; software RAID implications Consistency for --noraid: No controller vs No Volumes; software RAID implications Nov 17, 2016
@glensc
Owner

glensc commented Aug 9, 2017

Will be resolved in #172.

@glensc glensc closed this as completed Aug 9, 2017