Skipped Searches in Splunk – Advanced SHC SOP (With Captain / Member / Deployer Roles)

This SOP provides a complete troubleshooting and RCA workflow for Skipped Searches in a Splunk Search Head Cluster (SHC). 
It includes exact responsibilities for Captain, Members, and Deployer.

---

1. Overview
Skipped searches occur when the Splunk scheduler deliberately does not run a scheduled search at its scheduled time because of resource, concurrency, or configuration limits.
Skips affect correlation searches, dashboards backed by scheduled searches, and Enterprise Security (ES) notable events.

---

2. Symptoms
• Correlation searches not generating notables
• Dashboard panels not updating
• ES Health Dashboard shows skipped searches
• scheduler.log contains status=skipped events, with a reason such as “The maximum number of concurrent historical scheduled searches on this cluster has been reached”

---

3. Root Causes
• Search concurrency limit exceeded (derived from max_searches_per_cpu, base_max_searches, and max_searches_perc in limits.conf)
• CPU / Memory / I/O exhaustion
• Long-running hung searches
• Too many Data Model Acceleration jobs
• ES correlation searches heavily overlapping
• SHC Captain overloaded

---

4. SHC Role Responsibility Matrix

Captain:
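• Runs the scheduler for the whole cluster
• Delegates scheduled searches (including correlation searches) to members; by default it also runs a share of them itself
• Schedules Data Model Acceleration (DMA) summarization searches
• Coordinates replication of search artifacts and configuration changes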

Members:
• Execute delegated jobs
• Handle ad‑hoc searches
• Maintain KV store replication

Deployer:
• Pushes configuration (savedsearches.conf, limits.conf, datamodels.conf)
• Central config authority for SHC

---

5. Diagnostics Commands

5.1 Identify SHC status (Any Member)
splunk show shcluster-status

5.2 Check skipped searches (Captain, or any member if _internal is forwarded to the indexers)
index=_internal sourcetype=scheduler status=skipped | stats count by savedsearch_name, reason
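
If the search interface is too slow to use during an incident, skipped events can also be counted directly from scheduler.log on disk. The following is a minimal Python sketch, assuming the usual key=value layout of scheduler.log lines. The function names and the sample lines are illustrative only, not real log output.

```python
import re
from collections import Counter

# Matches key=value and key="quoted value" pairs in a log line.
PAIR_RE = re.compile(r'(\w+)=(?:"([^"]*)"|([^,\s]+))')


def parse_fields(line):
    """Return the key=value pairs found in one log line as a dict."""
    fields = {}
    for match in PAIR_RE.finditer(line):
        key, quoted, bare = match.groups()
        fields[key] = quoted if quoted is not None else bare
    return fields


def count_skipped(lines):
    """Count status=skipped events per (savedsearch_name, reason)."""
    counts = Counter()
    for line in lines:
        fields = parse_fields(line)
        if fields.get("status") == "skipped":
            key = (fields.get("savedsearch_name", "unknown"),
                   fields.get("reason", "unknown"))
            counts[key] += 1
    return counts


# Illustrative lines only; real scheduler.log entries carry more fields.
SAMPLE = [
    'INFO SavedSplunker - savedsearch_name="Threat - Demo Rule", '
    'status=skipped, reason="concurrency limit reached"',
    'INFO SavedSplunker - savedsearch_name="Threat - Demo Rule", '
    'status=success, run_time=4.2',
    'INFO SavedSplunker - savedsearch_name="Access - Demo Rule", '
    'status=skipped, reason="concurrency limit reached"',
]

if __name__ == "__main__":
    for (name, reason), count in count_skipped(SAMPLE).items():
        print(f"{count}  {name}  ({reason})")
```

To run it against a real file, pass an open handle on $SPLUNK_HOME/var/log/splunk/scheduler.log to count_skipped instead of SAMPLE.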

5.3 Check concurrency (Captain only)
| rest splunk_server=local /services/server/status/limits/search-concurrency

5.4 Identify hung searches (Any Member)
| rest /services/search/jobs | search isDone=0 | table sid label author dispatchState runDuration

5.5 System resource validation (Captain mandatory)
top -c
iostat -xm 5
df -h

---

6. Detailed Troubleshooting Steps

Step 1 — Validate System Load (Captain)
• Check CPU, Memory, I/O
• If sustained CPU utilization (or load average relative to core count) stays above roughly 80%, expect skipped searches
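
The 80% check can be scripted. This is a minimal Python sketch that assumes "load" means the 1-minute load average divided by the number of CPU cores. That interpretation and the function names are assumptions made here; adjust the threshold to the environment.

```python
import os


def load_ratio(load_avg, cpu_count):
    """Load average normalized by core count (1.0 = all cores busy)."""
    if cpu_count < 1:
        raise ValueError("cpu_count must be at least 1")
    return load_avg / cpu_count


def is_overloaded(load_avg, cpu_count, threshold=0.80):
    """True when normalized load is above the threshold (assumed 80%)."""
    return load_ratio(load_avg, cpu_count) > threshold


if __name__ == "__main__":
    # os.getloadavg() is available on Unix-like systems only.
    one_min, _, _ = os.getloadavg()
    cpus = os.cpu_count() or 1
    state = "OVERLOADED" if is_overloaded(one_min, cpus) else "ok"
    print(f"load={one_min:.2f} cpus={cpus} "
          f"ratio={load_ratio(one_min, cpus):.2f} {state}")
```

A single sample can mislead; run it several times, or alongside iostat, before concluding that the host is saturated.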

Step 2 — Fix Scheduler Limits (Deployer)
Edit:
$SPLUNK_HOME/etc/shcluster/apps/<APP>/local/limits.conf

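[search]
max_searches_per_cpu = 1

[scheduler]
max_searches_perc = 50

The values shown are the defaults. Raise them only if the search heads have spare CPU and memory.

Push bundle (run on the Deployer, targeting any one member):
splunk apply shcluster-bundle -target https://<member>:8089 -auth <user>:<password>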
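
To see what a change to these limits actually buys, the concurrency ceiling can be worked out by hand. The sketch below, in Python, uses the documented relationship: maximum concurrent historical searches = max_searches_per_cpu × CPU count + base_max_searches, with the scheduler allowed max_searches_perc percent of that. base_max_searches and max_searches_perc are limits.conf settings with defaults of 6 and 50. The function name and the rounding down are assumptions of this sketch.

```python
def concurrency_limits(cpu_count, max_searches_per_cpu=1,
                       base_max_searches=6, max_searches_perc=50):
    """Return (max_hist_searches, max_scheduled_searches) for one search head.

    Defaults mirror the limits.conf defaults. Rounding down for the
    scheduler share is an assumption of this sketch.
    """
    if cpu_count < 1:
        raise ValueError("cpu_count must be at least 1")
    max_hist = max_searches_per_cpu * cpu_count + base_max_searches
    max_sched = (max_hist * max_searches_perc) // 100
    return max_hist, max_sched


if __name__ == "__main__":
    # 16 cores with defaults: 1*16 + 6 = 22 total, 50% of 22 = 11 scheduled.
    print(concurrency_limits(16))  # (22, 11)
    # Doubling max_searches_per_cpu on the same host.
    print(concurrency_limits(16, max_searches_per_cpu=2))  # (38, 19)
```

Raising the limits only moves the ceiling. If the hosts are already CPU-bound, more concurrent searches will run slower rather than stop being skipped.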

Step 3 — Kill Hung Searches (Any Member)
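Cancel a running job through the REST API (or from Activity > Jobs in Splunk Web):
curl -k -u <user>:<password> https://localhost:8089/services/search/jobs/<sid>/control -d action=cancel
Move stale artifacts out of the dispatch directory:
splunk clean-dispatch <destination_dir> -24h@h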
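
When many jobs are stuck, selecting candidates by hand is error-prone. This is a minimal Python sketch that picks unfinished jobs above a runtime threshold and builds the matching cancel request for the search/jobs/<sid>/control REST endpoint. It assumes job records shaped like the sid, isDone, and runDuration fields returned by /services/search/jobs. The helper names and the one-hour threshold are hypothetical, and the sketch only builds requests; it does not send them.

```python
from urllib.parse import quote


def is_running(job):
    """True when the job record reports that it has not finished."""
    return str(job.get("isDone", "0")).lower() in ("0", "false")


def find_long_running(jobs, max_runtime_s=3600):
    """Return sids of unfinished jobs running longer than max_runtime_s."""
    return [job["sid"] for job in jobs
            if is_running(job)
            and float(job.get("runDuration", 0)) > max_runtime_s]


def cancel_request(base_url, sid):
    """Build (url, form_data) for a POST that cancels one job."""
    url = (f"{base_url.rstrip('/')}/services/search/jobs/"
           f"{quote(sid, safe='')}/control")
    return url, {"action": "cancel"}


if __name__ == "__main__":
    # Illustrative records only.
    jobs = [
        {"sid": "job_a", "isDone": "0", "runDuration": "7200"},
        {"sid": "job_b", "isDone": "1", "runDuration": "9000"},
        {"sid": "job_c", "isDone": "0", "runDuration": "30"},
    ]
    for sid in find_long_running(jobs):
        print(cancel_request("https://localhost:8089", sid))
```

Review the selected sids before cancelling anything. A long runtime alone does not prove that a search is hung.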

Step 4 — Tune ES Correlation Searches (Deployer)
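• Disable unused ES correlation searches
• Lengthen cron intervals, or set schedule_window so the scheduler can shift start times
• Stagger cron schedules so searches do not all start on the same minute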

File:
savedsearches.conf
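
Staggering can be planned before savedsearches.conf is edited. As a minimal sketch in Python, the function below spreads a list of searches across the minutes of a repeating interval and emits one cron_schedule value per search. The function name and the search names are hypothetical. If the interval does not divide 60 evenly, the last gap in each hour is shorter.

```python
def staggered_crons(search_names, interval_minutes=5):
    """Map each search to a cron expression offset within the interval."""
    if not 1 <= interval_minutes <= 59:
        raise ValueError("interval_minutes must be between 1 and 59")
    schedules = {}
    for index, name in enumerate(search_names):
        # Wrap around so offsets stay inside the interval.
        offset = index % interval_minutes
        schedules[name] = f"{offset}-59/{interval_minutes} * * * *"
    return schedules


if __name__ == "__main__":
    names = ["Demo Rule A", "Demo Rule B", "Demo Rule C"]
    for name, cron in staggered_crons(names).items():
        print(f"[{name}]\ncron_schedule = {cron}\n")
```

With the default 5-minute interval, the three demo searches start at minutes 0, 1, and 2 of each cycle instead of all at minute 0.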

Step 5 — Fix Data Model Acceleration (Deployer)
Disable acceleration for unused data models (acceleration = false) in:
datamodels.conf

Step 6 — SHC Captain Load Fix
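• Validate captain performance (CPU, memory, I/O, scheduler delays)
• Consider transferring captaincy to a less loaded member:
splunk transfer shcluster-captain -mgmt_uri https://<new_captain>:8089 -auth <user>:<password>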

---

7. Permanent Fixes
• Increase CPU on all SHC members (any member can become Captain)
• Add more SHC members
• Tune savedsearches.conf on Deployer
• Reduce DMA load
• Disable unused searches

---

8. Validation (Captain only)
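index=_internal sourcetype=scheduler status=skipped earliest=-60m | stats count by savedsearch_name, reason
| rest splunk_server=local /services/server/status/limits/search-concurrency

The skipped count should fall to zero, or close to it, over the next few scheduling cycles.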

---

9. RCA Template
• Issue Summary:
• Impact:
• Root Cause:
• Evidence:
• Corrective Actions:
• Preventive Steps:

