CSPL-4281 Fix MC crash loop during scale-down #1627

kubabuczak · 2025-11-27T12:48:30Z

Description

Fixes the Monitoring Console (MC) crash loop during scale-down. The MC ConfigMap update was gated by a phase == PhaseReady check, so the peer config was not updated when resources scaled down. The MC kept trying to connect to deleted pods, which resulted in continuous restarts.

Key Changes

Moved ApplyMonitoringConsoleEnvConfigMap() outside the phase check in:

standalone.go
clustermanager.go / clustermaster.go
searchheadcluster.go
licensemanager.go / licensemaster.go

The ConfigMap is now updated immediately after StatefulSet changes, keeping the peer list synchronized.
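To illustrate why the gating mattered, here is a minimal, self-contained sketch. It is not the operator's code: peerConfig, syncPeers, reconcileGated, reconcileUngated, the pod names, and the PhaseScalingDown value are all hypothetical stand-ins, and syncPeers only models the effect that ApplyMonitoringConsoleEnvConfigMap() is described as having on the peer list.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// Phase is a hypothetical stand-in for the custom resource phase.
type Phase string

const (
	PhaseReady       Phase = "Ready"
	PhaseScalingDown Phase = "ScalingDown" // assumed name, for illustration only
)

// peerConfig is a hypothetical stand-in for the MC ConfigMap:
// the set of peers the Monitoring Console will try to reach.
type peerConfig map[string]bool

// syncPeers models the effect of ApplyMonitoringConsoleEnvConfigMap():
// the config is rewritten to match the pods that currently exist.
func syncPeers(cfg peerConfig, livePods []string) {
	for p := range cfg {
		delete(cfg, p)
	}
	for _, p := range livePods {
		cfg[p] = true
	}
}

// reconcileGated models the old behavior: the peer list is only
// refreshed once the resource has reached PhaseReady.
func reconcileGated(cfg peerConfig, phase Phase, livePods []string) {
	if phase == PhaseReady {
		syncPeers(cfg, livePods)
	}
}

// reconcileUngated models the new behavior: the peer list is refreshed
// on every pass; the phase no longer decides whether the sync runs.
func reconcileUngated(cfg peerConfig, phase Phase, livePods []string) {
	syncPeers(cfg, livePods)
}

// peers returns the configured peers as a sorted, comma-separated string.
func peers(cfg peerConfig) string {
	names := make([]string, 0, len(cfg))
	for p := range cfg {
		names = append(names, p)
	}
	sort.Strings(names)
	return strings.Join(names, ",")
}

func main() {
	before := peerConfig{"standalone-0": true, "standalone-1": true, "standalone-2": true}
	after := peerConfig{"standalone-0": true, "standalone-1": true, "standalone-2": true}
	live := []string{"standalone-0"} // pods left after scaling from 3 to 1

	reconcileGated(before, PhaseScalingDown, live)
	reconcileUngated(after, PhaseScalingDown, live)

	fmt.Println("gated:  ", peers(before))
	fmt.Println("ungated:", peers(after))
}
```

In this sketch the gated variant still lists all three pods while the resource is scaling down, which corresponds to the MC trying to reach deleted pods. The ungated variant lists only standalone-0.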

Testing and Verification

Automated: the existing managermc1 test validates the MC ConfigMap and the Ready state.

Related Issues

JIRA: CSPL-4281 - MC crashes on Standalone scale-down

PR Checklist

Code changes adhere to coding standards
Tests included and pass locally
Documentation updated
PR description follows guidelines

…mmediately

Monitoring Console crashed when resources scaled down because the peer config update was gated by a phase == PhaseReady check. During scale operations the phase changes and the update was skipped, leaving MC trying to connect to deleted pods.

Fix: Move ApplyMonitoringConsoleEnvConfigMap() outside the phase check in all reconcilers to update the peer list immediately after StatefulSet changes.

Affected: ApplyStandalone, ApplyClusterManager, ApplyClusterMaster, ApplyLicenseManager, ApplyLicenseMaster, ApplySearchHeadCluster

coveralls · 2025-11-27T13:00:18Z

Pull Request Test Coverage Report for Build 19736798702

Details

20 of 33 (60.61%) changed or added relevant lines in 6 files are covered.
1 unchanged line in 1 file lost coverage.
Overall coverage decreased (-0.01%) to 86.511%

Changes Missing Coverage	Covered Lines	Changed/Added Lines	%
pkg/splunk/enterprise/clustermanager.go	4	6	66.67%
pkg/splunk/enterprise/clustermaster.go	4	6	66.67%
pkg/splunk/enterprise/licensemanager.go	3	5	60.0%
pkg/splunk/enterprise/licensemaster.go	3	5	60.0%
pkg/splunk/enterprise/searchheadcluster.go	3	5	60.0%
pkg/splunk/enterprise/standalone.go	3	6	50.0%

Files with Coverage Reduction	New Missed Lines	%
pkg/splunk/enterprise/cp.go	1	33.33%

Totals
Change from base Build 19653987646:	-0.01%
Covered Lines:	10736
Relevant Lines:	12410

💛 - Coveralls

coveralls · 2025-11-27T13:00:18Z

Pull Request Test Coverage Report for Build 19759679833

Details

20 of 33 (60.61%) changed or added relevant lines in 6 files are covered.
No unchanged relevant lines lost coverage.
Overall coverage decreased (-0.003%) to 86.519%

Changes Missing Coverage	Covered Lines	Changed/Added Lines	%
pkg/splunk/enterprise/clustermanager.go	4	6	66.67%
pkg/splunk/enterprise/clustermaster.go	4	6	66.67%
pkg/splunk/enterprise/licensemanager.go	3	5	60.0%
pkg/splunk/enterprise/licensemaster.go	3	5	60.0%
pkg/splunk/enterprise/searchheadcluster.go	3	5	60.0%
pkg/splunk/enterprise/standalone.go	3	6	50.0%

Totals
Change from base Build 19653987646:	-0.003%
Covered Lines:	10737
Relevant Lines:	12410

💛 - Coveralls

kubabuczak changed the title ~~CSPL-4281 fix stale peer cm config~~ CSPL-4281 Fix MC crash loop during scale-down Nov 27, 2025

kubabuczak marked this pull request as draft November 27, 2025 16:38

kubabuczak force-pushed the CSPL-4281-fix-stale-peer-cm-config branch from f588c3a to f3da616 Compare November 28, 2025 09:29

kubabuczak marked this pull request as ready for review November 28, 2025 15:45
