Removing Excess Buckets from Cluster Master
Excess bucket copies are copies that exceed the cluster's replication factor (RF) or search factor (SF).
For example, if the RF is 3, each bucket should have exactly 3 copies across peer nodes.
If one bucket has 4 copies, then it has 1 excess copy.

Note: Excess copies do not affect cluster operations, but they consume extra disk space.
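The arithmetic in the example above can be written as a small helper. This is only an illustrative sketch: the function name `excess_copies` is hypothetical and not part of Splunk.

```shell
# Print how many copies of a bucket exceed the replication factor.
# Arguments: <copies_found> <replication_factor>
excess_copies() {
  local copies=$1 rf=$2
  local excess=$(( copies - rf ))
  # Fewer copies than RF means there is no excess (the bucket needs fix-up instead).
  if [ "$excess" -lt 0 ]; then
    excess=0
  fi
  echo "$excess"
}
```

For the example in the text, `excess_copies 4 3` prints 1.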

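Method 1: From GUI
Ensure all peers are communicating with the Cluster Master (CM).
Restart the splunkd service on the Cluster Master. Excess-bucket removal sometimes gets stuck on the CM, and a restart usually clears it.
On the Cluster Master, open Settings > Indexer clustering, select the Indexes tab, click Bucket Status, and use the Indexes With Excess Buckets tab to remove the excess copies (labels may vary by Splunk version).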

Method 2: From CLI
Ensure all peers are communicating with the Cluster Master.

Restart the splunkd service on the Cluster Master.

Run the following commands on the Cluster Master:

# List excess buckets for a specific index
splunk list excess-buckets <index-name>

# Remove excess buckets for a specific index
splunk remove excess-buckets <index-name>
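
When several indexes are affected, it can help to write out the command pairs first and review them before running anything. The sketch below only prints the two commands shown above for each index name. The function name and the index names in the usage comment are hypothetical.

```shell
# Print (do not execute) the list/remove command pair for each index name given.
excess_bucket_commands() {
  local idx
  for idx in "$@"; do
    printf 'splunk list excess-buckets %s\n' "$idx"
    printf 'splunk remove excess-buckets %s\n' "$idx"
  done
}

# Usage (index names are placeholders):
#   excess_bucket_commands main web_logs
```

Printing rather than executing keeps the list step visible, so the removal is a deliberate second action.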
Search Artifacts Dispatch Error
You may encounter the following warning:

The number of search artifacts in the dispatch directory is higher than recommended
(count=5360, warning threshold=5000). This may impact search performance.

Root Cause
The warning is related to the dispatch_dir_warning_size attribute in limits.conf.
It occurs when the number of search artifacts in the dispatch directory exceeds the recommended threshold.
Default threshold: 5000.
Attribute Details
dispatch_dir_warning_size = <int>
* Specifies the number of jobs in the dispatch directory that triggers a warning.
* Default: 5000
Observations
The dispatch directory is located at:
$SPLUNK_HOME/var/run/splunk/dispatch
This directory stores search artifacts.

A large number of searches can accumulate here, exceeding the threshold.
When exceeded, Splunk issues a performance warning.
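
To see how close a search head is to the threshold, the artifacts can be counted directly. The sketch below assumes each search job keeps its artifacts in one subdirectory of the dispatch directory. The function names are hypothetical, and the default of 5000 mirrors the default quoted above.

```shell
# Count search artifact directories directly under a dispatch directory.
count_dispatch_artifacts() {
  local dir=$1
  find "$dir" -mindepth 1 -maxdepth 1 -type d | wc -l | tr -d ' '
}

# Report whether the count exceeds a threshold (defaults to 5000).
check_dispatch_dir() {
  local dir=$1 threshold=${2:-5000}
  local count
  count=$(count_dispatch_artifacts "$dir")
  if [ "$count" -gt "$threshold" ]; then
    echo "WARN count=$count threshold=$threshold"
  else
    echo "OK count=$count threshold=$threshold"
  fi
}

# Usage:
#   check_dispatch_dir "$SPLUNK_HOME/var/run/splunk/dispatch"
```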
Resolution Steps
Locate the limits.conf file
vi /opt/splunk/etc/system/local/limits.conf
If you’re unsure of the exact location, run:

splunk btool limits list --debug | grep dispatch_dir_warning_size
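Modify the parameter
Increase the threshold (default is 5000). Raising it only changes when the warning appears; it does not reduce the number of artifacts in the dispatch directory. Example: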

[search]
dispatch_dir_warning_size = 10000
Restart Splunk

$SPLUNK_HOME/bin/splunk restart
After restarting, Splunk will use the updated threshold for search artifact warnings.

Logfile Data Not Coming in Correct Form
Issue
The log file contains Japanese text, but after indexing in Splunk it appears as garbled characters.
Root Cause
Splunk may not be detecting the correct character set encoding during data ingestion.
For Japanese logs, two commonly used encodings that Splunk supports are:

EUC-JP
SHIFT-JIS
Reference: Splunk Documentation - Configure character set encoding

Resolution
You need to manually specify the character set in props.conf.

Steps:
Open or create props.conf in your configuration directory:
   vi $SPLUNK_HOME/etc/system/local/props.conf
Add a new stanza for the sourcetype of your input. Example:

[mysourcetype]
CHARSET = EUC-JP
Or, if EUC-JP does not work, try:

[mysourcetype]
CHARSET = SHIFT-JIS
Save the file and restart Splunk:

$SPLUNK_HOME/bin/splunk restart
Notes
You may need to test both encodings (EUC-JP and SHIFT-JIS) to identify which one correctly parses your log data.
CHARSET is applied during the input phase, so set it in props.conf on the instance that reads the file (the forwarder monitoring the log, including a universal forwarder), not only on the indexers.
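
Before editing props.conf, a sample of the log file can be checked outside Splunk to narrow down the encoding. The sketch below assumes `iconv` is installed and prints every candidate under which the file decodes without error. Note that iconv spells the second encoding `SHIFT_JIS`, while the CHARSET value used above is `SHIFT-JIS`. The function name is hypothetical. A clean decode is only a hint, so confirm by reading the converted text.

```shell
# Print each candidate encoding under which the sample file decodes without error.
candidate_charsets() {
  local file=$1 enc
  for enc in EUC-JP SHIFT_JIS; do
    if iconv -f "$enc" -t UTF-8 "$file" >/dev/null 2>&1; then
      echo "$enc"
    fi
  done
}

# Usage (the path is a placeholder):
#   candidate_charsets /path/to/sample.log
```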

KV Store Changed Status to Failed (Process Terminated)
Error Details
splunkd.log
11-04-2020 10:26:13.265 +1100 ERROR MongodRunner - mongod exited abnormally (exit code 14, status: exited with code 14) - look at mongod.log to investigate.
11-04-2020 10:26:13.269 +1100 ERROR KVStoreBulletinBoardManager - KV Store process terminated abnormally (exit code 14, status exited with code 14). See mongod.log and splunkd.log for details.
11-04-2020 10:26:13.269 +1100 ERROR KVStoreBulletinBoardManager - KV Store changed status to failed. KVStore process terminated..
mongod.log
2020-10-26T00:34:50.369Z I CONTROL [initandlisten] ** WARNING: No SSL certificate validation can be performed since no CA file has been provided
2020-10-19T03:56:45.683Z I CONTROL [initandlisten] ** Please specify an sslCAFile parameter.
2020-11-03T23:26:13.263Z F - [main] Fatal Assertion 28652
***aborting after fassert() failure
Root Cause
The KV Store (MongoDB process) failed due to issues with the SSL certificate (server.pem). If the SSL certificate has expired or is invalid, MongoDB cannot start, causing KV Store failure.

Procedure to Renew SSL Certificate
Check if the SSL certificate is expired

openssl x509 -enddate -noout -in $SPLUNK_HOME/etc/auth/server.pem
Backup the existing certificate

cp $SPLUNK_HOME/etc/auth/server.pem $SPLUNK_HOME/etc/auth/server.pem.bak
Generate a new SSL certificate

$SPLUNK_HOME/bin/splunk createssl server-cert \
-d $SPLUNK_HOME/etc/auth \
-n server \
-c SplunkServerDefaultCert \
-l 2048
Restart Splunk

$SPLUNK_HOME/bin/splunk restart
Verify KV Store status

$SPLUNK_HOME/bin/splunk show kvstore-status
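
The expiry check in the first step can also be scripted with `openssl x509 -checkend`, which makes it possible to catch a certificate shortly before it expires. This is a minimal sketch. The function name and the 30-day window in the usage comment are assumptions, not Splunk recommendations.

```shell
# Return 0 if the certificate in <file> expires within <seconds>
# (default 0, meaning it has already expired), 1 if it stays valid,
# and 2 if the file cannot be read.
cert_expires_within() {
  local file=$1 seconds=${2:-0}
  [ -r "$file" ] || return 2
  # openssl -checkend exits 0 when the certificate is still valid after <seconds>.
  if openssl x509 -checkend "$seconds" -noout -in "$file" >/dev/null 2>&1; then
    return 1
  fi
  return 0
}

# Usage: warn if server.pem expires within 30 days (2592000 seconds).
#   if cert_expires_within "$SPLUNK_HOME/etc/auth/server.pem" 2592000; then
#     echo "server.pem expires soon or has expired"
#   fi
```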
Alternate Quick Fix
Rename or delete the existing server.pem file:

mv $SPLUNK_HOME/etc/auth/server.pem $SPLUNK_HOME/etc/auth/server.pem.old
Restart Splunk. Splunk will automatically generate a new server.pem on startup.

After this procedure, the KV Store should start successfully.

Indexer Down in Cluster (One Indexer Out of 6)
Issue
Out of 6 indexers in the cluster, one indexer shows as Down.

splunkd.log
12-09-2020 12:56:51.331 +0630 WARN CMSlave - Failed to register with cluster master reason: failed method=POST path=/services/cluster/master/peers/?output_mode=json master=10.75.49.21:8089 rv=0 gotConnectionError=0 gotUnexpectedStatusCode=1 actual_response_code=500 expected_response_code=2xx status_line="Internal Server Error" socket_error="No error" remote_error=Cannot add peer=10.75.49.25 mgmtport=8089 (reason: bucket already added as clustered, peer attempted to add again as standalone. guid=347A3164-2E3B-4D9D-8603-A17180A9E92E bid=tml_app_paloalto~170~347A3164-2E3B-4D9D-8603-A17180A9E92E).
Root Cause
The indexer attempted to register with the Cluster Master (CM) but failed because some buckets were detected as standalone buckets instead of clustered ones.

These standalone buckets are missing the originating peer's GUID at the end of their directory name.
This mismatch prevents the peer from rejoining the cluster.
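
The GUID and bucket ID needed for the fix are embedded in the error line. A small sketch for pulling them out of log lines on standard input is shown here. The function names are hypothetical, and the patterns assume the `guid=... bid=...)` layout seen in the message above.

```shell
# Extract the 36-character GUID that follows "guid=" in each input line.
extract_guid() {
  sed -n 's/.*guid=\([0-9A-Fa-f-]\{36\}\).*/\1/p'
}

# Extract the bucket ID that follows "bid=" (up to the closing parenthesis).
extract_bid() {
  sed -n 's/.*bid=\([^ )]*\)).*/\1/p'
}

# Usage:
#   grep 'Cannot add peer' "$SPLUNK_HOME/var/log/splunk/splunkd.log" | extract_guid | sort -u
```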
Resolution Steps
Search for standalone buckets
Navigate to the indexer data path:
cd /opt/splunk/var/lib/splunk/
Use the following command to list bucket directories. Standalone buckets are the ones whose names do not end in a GUID:

find /opt/splunk/var/lib/splunk -type d -name "db_*"
Standalone bucket naming convention:

db_<newest_time>_<oldest_time>_<bucketid>
Example:

db_1550812574_1550720467_53
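
To narrow the listing to standalone buckets only, the directory names can be filtered against the naming convention above. This is a sketch with a hypothetical function name. It treats any directory named exactly `db_<digits>_<digits>_<digits>` as standalone.

```shell
# List bucket directories under <root> whose names have no trailing GUID,
# i.e. match db_<newest_time>_<oldest_time>_<bucketid> exactly.
list_standalone_buckets() {
  local root=$1
  find "$root" -type d -name 'db_*' | grep -E '/db_[0-9]+_[0-9]+_[0-9]+$' | sort
}

# Usage:
#   list_standalone_buckets /opt/splunk/var/lib/splunk
```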
Append the peer GUID
Rename each standalone bucket by appending the GUID reported in the error message (the GUID of the peer that owns the bucket, not of the Cluster Master) to the end of the directory name.

From:

db_<newest_time>_<oldest_time>_<bucketid>
To:

db_<newest_time>_<oldest_time>_<bucketid>_<guid>
Example:

mv db_1550812574_1550720467_53 \
   db_1550812574_1550720467_53_347A3164-2E3B-4D9D-8603-A17180A9E92E
(Here, guid=347A3164-2E3B-4D9D-8603-A17180A9E92E)

Reference: Splunk Docs - Bucket names
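
When many buckets need renaming, a dry run that only prints the `mv` commands is a safer starting point than renaming in a loop. The sketch below reads bucket paths from standard input and skips any name that does not match the standalone convention. The function name is hypothetical, and the GUID is whatever the error message reports.

```shell
# Print (do not run) an mv command for each standalone bucket path read from stdin,
# appending _<guid> to the directory name.
print_rename_commands() {
  local guid=$1 path name
  while IFS= read -r path; do
    name=${path##*/}
    # Skip anything that is not a standalone bucket name.
    if printf '%s\n' "$name" | grep -Eq '^db_[0-9]+_[0-9]+_[0-9]+$'; then
      printf "mv '%s' '%s_%s'\n" "$path" "$path" "$guid"
    fi
  done
}

# Usage:
#   find /opt/splunk/var/lib/splunk -type d -name 'db_*' \
#     | print_rename_commands 347A3164-2E3B-4D9D-8603-A17180A9E92E
```

Review the printed commands before running any of them.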

Restart the Indexer

$SPLUNK_HOME/bin/splunk restart
The search peer should now rejoin the cluster.

Disable Maintenance Mode (if enabled on CM).

Allow Bucket Fix-up Tasks
Wait for the cluster to complete fix-up processing.

Important Notes
Ensure pass4SymmKey is the same across all indexers and the Cluster Master.

Verify that the instance GUID is present in:

$SPLUNK_HOME/etc/instance.cfg
After these steps, the indexer should rejoin the cluster successfully.
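
The instance GUID check can be done from the command line. The sketch below assumes instance.cfg contains a line of the form `guid = <value>`. The function name is hypothetical.

```shell
# Print the value of the guid key from an instance.cfg-style file.
read_instance_guid() {
  local file=$1
  awk -F'=' '/^[[:space:]]*guid[[:space:]]*=/ { gsub(/[[:space:]]/, "", $2); print $2; exit }' "$file"
}

# Usage:
#   read_instance_guid "$SPLUNK_HOME/etc/instance.cfg"
```

An empty result means no guid line was found in the file.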