Skip to content
A monitoring plugin for Icinga (2), Nagios, Shinken, Naemon, etc. to check the Logstash API (Logstash v.5+)
Ruby
Branch: master
Clone or download
Fetching latest commit…
Cannot retrieve the latest commit at this time.
Permalink
Type Name Latest commit message Commit time
Failed to load latest commit information.
contrib
lib fix inflight event calculation, finally. Apr 11, 2019
spec
.gitignore
AUTHORS
LICENSE
README.md
Rakefile
check_logstash
check_logstash.spec
icingaexchange.yml

README.md

check_logstash

A monitoring plugin for Icinga (2), Nagios, Shinken, Naemon, etc. to check the Logstash API (Logstash v.5+)

While the plugin works with the first release of Logstash 5.0, there is still some work to be done and some tests to be made.

A word of warning Be sure to read the configuration check part of this readme since there is a problem with this feature.

Usage

Options

-H HOST                          Logstash host
-p, --hostname PORT              Logstash API port
    --file-descriptor-threshold-warn WARN
                                 The percentage relative to the process file descriptor limit on which to be a warning result.
    --file-descriptor-threshold-crit CRIT
                                 The percentage relative to the process file descriptor limit on which to be a critical result.
    --heap-usage-threshold-warn WARN
                                 The percentage relative to the heap size limit on which to be a warning result.
    --heap-usage-threshold-crit CRIT
                                 The percentage relative to the heap size limit on which to be a critical result.
    --cpu-usage-threshold-warn WARN
                                 The percentage of CPU usage on which to be a warning result.
    --cpu-usage-threshold-crit CRIT
                                 The percentage of CPU usage on which to be a critical result.
    --inflight-events-warn WARN  Threshold for inflight events to be a warning result. Use min:max for a range.
    --inflight-events-crit CRIT  Threshold for inflight events to be a critical result. Use min:max for a range.
-h, --help                       Show this message

Using default values

./check_logstash.rb -H [logstashhost]

Using your own thresholds

./check_logstash.rb -H 127.0.0.1 --file-descriptor-threshold-warn 40 --file-descriptor-threshold-crit 50 --heap-usage-threshold-warn 10 --heap-usage-threshold-crit 20

or

./check_logstash.rb -H 127.0.0.1 --inflight-events-warn 5 --inflight-events-crit 1:10

Sample Output

With default values

OK - Logstash looking healthy. | process.cpu.percent=0;;;0;100 mem.heap_used_percent=18;70;80;0;100 jvm.threads.count=23;;;0; process.open_file_descriptors=46;3400;3800;0;4096 pipeline.events.out=166c;;;0; inflight_events=0;;;0;
OK: Inflight events: 0
OK: Heap usage at 18.00% (382710568 out of 2077753344 bytes in use)
OK: Open file descriptors at 1.12%. (46 out of 4096 file descriptors are open)

With thresholds set

CRITICAL - Logstash is unhealthy - CRITICAL: Inflight events: 0 | process.cpu.percent=0;;;0;100 mem.heap_used_percent=16;70;80;0;100 jvm.threads.count=23;;;0; process.open_file_descriptors=46;3400;3800;0;4096 pipeline.events.out=164c;;;0; inflight_events=0;5;10;0;
CRITICAL: Inflight events: 0
OK: Heap usage at 16.00% (352959904 out of 2077753344 bytes in use)
OK: Open file descriptors at 1.12%. (46 out of 4096 file descriptors are open)

Finding viable thresholds

To set your thresholds for inflight events to a sensible value use baselining. Don't set thresholds from the beginning but let Graphite or other graphers create graphs for inflight events. Just add some percent to what Logstash usually processes and set this as threshold. Or use the generator plugin to put as many events through your Elastic stack as possible. Use some percent (e.g. 90%) from this maximum as a threshold. Keep in mind that changing your configuration might change the maximum inflight events.

Configuration check

Logstash 5.0 can automatically reload changed configuration from disk. This plugin checks if the last reload succeeded or failed. Unfortunately the first release of Logstash 5.0 does not provide a way to show that it recovered from an invalid configuration. If you had an error, the plugin will still show that there is a problem with the configuration even when this is already fixed. There is already an issue with Logstash pending to fix this behaviour: https://github.com/elastic/logstash/issues/6149

Default values

There are some default values defined in the plugin. Some values are merely put out as performance data and some are normally used for performance data but can be checked against thresholds.

Checks with defaults

  • -H: 127.0.0.1
  • -p: 9600
  • --file-descriptor-threshold-warn : 85
  • --file-descriptor-threshold-warn : 95
  • --heap-usage-threshold-warn : 70
  • --heap-usage-threshold-warn : 80

Optionally checked

  • --cpu-usage-threshold-warn
  • --cpu-usage-threshold-crit
  • --inflight-events-warn
  • --inflight-events-crit

Building

While check_logstash is the finished plugin you can use, you might want to change something and rebuild the script.

Simply issue the rake command (if you have Rake installed) and it will run tests (and rubocop when all cops are pleased) and create the script from the library files. *Don't change the check_logstash file since it will be overwritten by Rake, change the files in lib/ instead.

Repositories

You can find some rpm packaged versions of check_logstash at my repo. Be prepared that I might change the location to a more Netways or Icinga specific repository when my build process matured a bit.

You can’t perform that action at this time.