The Broken Hosts App for Splunk is a useful tool for monitoring data going into Splunk. It has the ability to alert when hosts stop sending data into Splunk, as well as inspect the last time the final combination of data was received by Splunk.
If the arrival of the final log for the index/sourcetype/host combination is later than expected, the Broken Hosts App will send an alert. This allows for quick status detection of the hosts and fast issue resolution.
The Broken Hosts App for Splunk is the app for monitoring missing data in Splunk. The app’s three main objectives include:
- Alerting when data has stopped sending to Splunk.
- Utilizing saved searches to facilitate rapid detection of the missing data.
- Creating dashboards for visualization to help with further investigations.
- Detects gaps in data being collected into Splunk
- Detects unexpected latency in data being collected into Splunk
- Generates statistics about data being collected into Splunk for other uses
- Includes dashboards for investigating broken data sources
- Use Splunk modular alert actions for sending alerts
- Lookup and Eventtype-based configuration
Starting with Broken Hosts 5.0.0 data source alert threshold tunings and suppressions have been separated into separate lookups.
Note that the below searches use the | outputlookup
command to update the lookups used by Broken Hosts. This command is labled as risky by default in Splunk,
and a warning message will be displayed if this has not been changed. Users can safely click through this warning. If you wish to permanently disable it,
Cloud customers can open a support case to remove it from the list of risky commands. Enterprise customers can add a commands.conf file to the
default/ directory in the Broken Hosts app to prevent the warning from popping up.
Existing alerting will still function until the following steps are completed, but issues may arise if the following steps are not followed. Additionally, you will not be able to add new suppressions to expectedTime after updating.
Steps to upgrade to version 5.0.0:
- Run the search
Broken Hosts - Populate bh_suppressions from expectedTime
- Run the search
Broken Hosts - Clear Permanent Suppressions expectedTime
- Enable the search
Broken Hosts - Auto Sort v5
- Disable the search
Broken Hosts - Auto Sort
- Enable the search
Broken Hosts - Purge and Sort bh_suppressions
The above searches will automatically populate the new bh_suppressions lookup with currently used suppression entries in expectedTime, clear expectedTime of all permanent suppressions, enable new expectedTime sorting logic, and schedule a search to automatically remove outdated entries from bh_suppressions.
- Install the
Broken Hosts App for Splunk
on your ad-hoc search head. - Use the
Broken Hosts
dashboard to determine appropriate baselines for all of your critical data. - Use the
Configure Tunings
and dashboard to configure your baselines and create suppressions. - Configure alert actions on the
Broken Hosts Alert Search
saved search in the Broken Hosts App for Splunk. - Enable the
Broken Hosts Alert Search
saved search in the Broken Hosts App for Splunk. - Run the search
Broken Hosts - Clear Permanent Suppressions from expectedTime
https://brokenhosts.hurricanelabs.com
- By default this app is configured and all configuration options are optional. The following macros are available to configure:
default_contact
default_expected_time
ignore_after
linuxoslog_index
min_count
search_additions
wineventlog_index
bh_volume_alerting_indexes
- You can also configure the requirement of a ticket number being in comments when updating the table on the Configure Broken Hosts Lookup page. This configuration is availabe on the Setup page in the app.