@kubabuczak
Collaborator

Description

Fixes cluster manager reconciliation failures that occur when multisite detection fails during initial deployment. Previously, the cluster manager would get stuck in the Error phase because it tried to query its own pod before the pod was ready.

Key Changes

pkg/splunk/enterprise/clustermanager.go

  • Renamed VerifyCMisMultisiteCall → GetCMMultisiteEnvVarsCall
  • Added fallback error handling: returns basic cluster manager env vars when GetClusterInfo() fails
  • Prevents Error phase loop by allowing reconciliation to continue

Test Files

  • Updated function mocks in clustermanager_test.go and upgrade_test.go
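
For readers unfamiliar with function mocks in Go, one common pattern is a package-level function variable that tests swap out and later restore. The sketch below assumes that pattern; it may not match how clustermanager_test.go and upgrade_test.go are actually written. Only the name `GetCMMultisiteEnvVarsCall` comes from this PR; its signature here, `buildEnv`, and `withMock` are hypothetical.

```go
package main

import (
	"errors"
	"fmt"
)

// EnvVar is a stand-in for corev1.EnvVar.
type EnvVar struct{ Name, Value string }

// GetCMMultisiteEnvVarsCall is declared as a variable so tests can replace it.
// The signature is an assumption; this default just simulates an unreachable pod.
var GetCMMultisiteEnvVarsCall = func(cmName string) ([]EnvVar, error) {
	return nil, errors.New("would query the cluster manager pod")
}

// buildEnv is a hypothetical caller that goes through the variable.
func buildEnv(cmName string) ([]EnvVar, error) {
	return GetCMMultisiteEnvVarsCall(cmName)
}

// withMock installs a mock and returns a function that restores the original.
func withMock(mock func(string) ([]EnvVar, error)) (restore func()) {
	saved := GetCMMultisiteEnvVarsCall
	GetCMMultisiteEnvVarsCall = mock
	return func() { GetCMMultisiteEnvVarsCall = saved }
}

func main() {
	restore := withMock(func(cmName string) ([]EnvVar, error) {
		// Placeholder env var name, chosen only for illustration.
		return []EnvVar{{Name: "SPLUNK_CLUSTER_MASTER_URL", Value: cmName}}, nil
	})
	defer restore()

	envs, err := buildEnv("cm-service")
	fmt.Println(envs, err)
}
```

With this shape, a rename such as the one in this PR only requires updating the variable name wherever tests assign to it.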

Testing and Verification

Issue: Tests time out with the cluster manager stuck in the Error phase for 4+ hours
Fix: Returns fallback env vars so reconciliation continues even when the CM pod is not ready
Expected: The MC ConfigMap always receives the cluster manager URL; multisite info is added once the pod becomes ready

Related Issues

CSPL-4281: Fix stale peer CM config issue

PR Checklist

  • Code changes adhere to the project's coding standards.
  • Relevant unit and integration tests are included.
  • Documentation has been updated accordingly.
  • All tests pass locally.
  • The PR description follows the project's guidelines.

…heck

When GetClusterInfo fails (e.g., CM pod not ready), return basic
cluster manager environment variables to allow reconciliation to continue.
@kubabuczak kubabuczak changed the title CAPL-4281 Add fallback error handling for cluster manager multisite c… CSPL-4281 Add fallback error handling for cluster manager multisite c… Nov 28, 2025
@kubabuczak kubabuczak changed the title CSPL-4281 Add fallback error handling for cluster manager multisite c… CSPL-4281 Add fallback error handling for cluster manager multisite configuration Nov 28, 2025
@coveralls
Collaborator

coveralls commented Nov 28, 2025

Pull Request Test Coverage Report for Build 19761284538

Details

  • 2 of 13 (15.38%) changed or added relevant lines in 1 file are covered.
  • 10 unchanged lines in 1 file lost coverage.
  • Overall coverage decreased (-0.1%) to 86.389%

| Changes Missing Coverage | Covered Lines | Changed/Added Lines | % |
|---|---|---|---|
| pkg/splunk/enterprise/clustermanager.go | 2 | 13 | 15.38% |

| Files with Coverage Reduction | New Missed Lines | % |
|---|---|---|
| pkg/splunk/enterprise/clustermanager.go | 10 | 74.19% |

| Totals | Coverage Status |
|---|---|
| Change from base Build 19653987646: | -0.1% |
| Covered Lines: | 10726 |
| Relevant Lines: | 12416 |

💛 - Coveralls
