@danielmellado
Contributor

When service-CA rotated certificates, the cluster-monitoring-operator
wasn't watching service-CA generated secrets, so pods kept
using expired certificates until they were manually restarted.

This change detects updates to service-CA secrets in
monitoring namespaces (secrets whose names end with -tls or -cert) and
triggers reconciliation when they change, which in turn restarts the
affected pods so they pick up the new certificates.
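
For illustration only, the detection rule described above could be sketched roughly as follows. This is not the code from this PR: the function names (`isServiceCASecret`, `dataHash`, `needsReconcile`), the namespace set, and the example secret name are all assumptions made for the sketch, and it uses only the Go standard library rather than the operator's real informer machinery.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"sort"
	"strings"
)

// monitoringNamespaces is an assumed set of namespaces to watch;
// the real operator may derive this differently.
var monitoringNamespaces = map[string]bool{
	"openshift-monitoring":               true,
	"openshift-user-workload-monitoring": true,
}

// isServiceCASecret applies the name-suffix heuristic from the PR
// description: a secret in a monitoring namespace whose name ends
// with -tls or -cert. (Hypothetical helper.)
func isServiceCASecret(namespace, name string) bool {
	if !monitoringNamespaces[namespace] {
		return false
	}
	return strings.HasSuffix(name, "-tls") || strings.HasSuffix(name, "-cert")
}

// dataHash returns a digest of a secret's data that does not depend
// on map iteration order. Keys and values are length-prefixed so that
// different key/value splits cannot produce the same byte stream.
func dataHash(data map[string][]byte) string {
	keys := make([]string, 0, len(data))
	for k := range data {
		keys = append(keys, k)
	}
	sort.Strings(keys)

	h := sha256.New()
	for _, k := range keys {
		fmt.Fprintf(h, "%d:%s%d:", len(k), k, len(data[k]))
		h.Write(data[k])
	}
	return hex.EncodeToString(h.Sum(nil))
}

// needsReconcile reports whether an update to the given secret should
// trigger reconciliation: the secret must match the heuristic and its
// data must actually have changed. (Hypothetical helper.)
func needsReconcile(namespace, name string, oldData, newData map[string][]byte) bool {
	return isServiceCASecret(namespace, name) && dataHash(oldData) != dataHash(newData)
}

func main() {
	oldData := map[string][]byte{"tls.crt": []byte("old-cert"), "tls.key": []byte("key")}
	newData := map[string][]byte{"tls.crt": []byte("new-cert"), "tls.key": []byte("key")}

	// Rotated certificate in a monitoring namespace.
	fmt.Println(needsReconcile("openshift-monitoring", "example-tls", oldData, newData))
	// Same data: nothing to do.
	fmt.Println(needsReconcile("openshift-monitoring", "example-tls", oldData, oldData))
	// Changed secret outside the monitoring namespaces is ignored.
	fmt.Println(needsReconcile("default", "example-tls", oldData, newData))
}
```

Comparing a digest of the data, rather than reacting to every update event, is one way to avoid reconciling when only metadata changed; whether the actual change does this is not stated here.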

Fixes the issue where EUS clusters running 3+ years without upgrades
require manual intervention after service-CA rotation.

  • I added CHANGELOG entry for this change.
  • No user facing changes, so no entry in CHANGELOG was needed.

@openshift-ci openshift-ci bot requested review from rexagod and slashpai October 24, 2025 06:42
@openshift-ci
Contributor

openshift-ci bot commented Oct 24, 2025

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: danielmellado

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci openshift-ci bot added the approved label Oct 24, 2025
When service-CA rotates certificates, the cluster-monitoring-operator
wasn't watching service-CA generated secrets, causing pods to continue
using expired certificates until manually restarted.

This change detects updates to service-CA secrets in
monitoring namespaces (secrets ending with -tls or -cert) and triggers
reconciliation when they change, which triggers pod restarts and
certificate pickup.

Fixes the issue where EUS clusters running 3+ years without upgrades
require manual intervention after service-CA rotation.

Signed-off-by: Daniel Mellado <dmellado@fedoraproject.org>
@danielmellado danielmellado force-pushed the fix-service-ca-certificate-rotation branch from 30ab202 to afd86af Compare October 24, 2025 11:48
@danielmellado
Contributor Author

/retest-required

@openshift-ci
Contributor

openshift-ci bot commented Oct 25, 2025

@danielmellado: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

| Test name | Commit | Details | Required | Rerun command |
| --- | --- | --- | --- | --- |
| ci/prow/okd-scos-e2e-aws-ovn | afd86af | link | false | /test okd-scos-e2e-aws-ovn |
| ci/prow/versions | afd86af | link | false | /test versions |

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

@danielmellado
Contributor Author

/retitle OCPBUGS-63502: Fix service-CA certificate rotation not triggering pod restarts

@openshift-ci openshift-ci bot changed the title from "Fix service-CA certificate rotation not triggering pod restarts" to "OCPBUGS-63502: Fix service-CA certificate rotation not triggering pod restarts" Oct 28, 2025
@openshift-ci-robot openshift-ci-robot added the jira/severity-moderate, jira/valid-reference, and jira/invalid-bug labels Oct 28, 2025
@openshift-ci-robot
Contributor

@danielmellado: This pull request references Jira Issue OCPBUGS-63502, which is invalid:

  • expected the bug to target the "4.21.0" version, but no target version was set

Comment /jira refresh to re-evaluate validity if changes to the Jira bug are made, or edit the title of this pull request to link to a different bug.

The bug has been updated to refer to the pull request using the external bug tracker.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@danielmellado
Contributor Author

/jira refresh

@openshift-ci-robot openshift-ci-robot added the jira/valid-bug label and removed the jira/invalid-bug label Oct 28, 2025
@openshift-ci-robot
Contributor

@danielmellado: This pull request references Jira Issue OCPBUGS-63502, which is valid. The bug has been moved to the POST state.

3 validation(s) were run on this bug
  • bug is open, matching expected state (open)
  • bug target version (4.21.0) matches configured target version for branch (4.21.0)
  • bug is in the state New, which is one of the valid states (NEW, ASSIGNED, POST)

Requesting review from QA contact:
/cc @juzhao


@openshift-ci openshift-ci bot requested a review from juzhao October 28, 2025 08:21