SPLAT-2747: Updated kube cloud config controller to react to feature gate updates#489

Merged
openshift-merge-bot[bot] merged 2 commits into
openshift:mainfrom
vr4manta:SPLAT-2747
May 19, 2026

Conversation

@vr4manta
Contributor

@vr4manta vr4manta commented May 6, 2026

SPLAT-2747

Changes

  • Changed logic for accessing feature gates to be done directly through accessor to prevent race conditions
  • Fixed starting of FeatureGateAccessor to be in step w/ all required informers

Summary by CodeRabbit

  • New Features

    • Feature gates now synchronize dynamically at runtime without requiring restarts.
  • Improvements

    • Operator startup waits for feature-gate observation (with a brief timeout) before starting dependent controllers, preventing premature runs.
    • Clearer runtime warnings when feature gates are uninitialized and additional debug-level logs when evaluating gates.
    • Cloud-config management decisions now respect current feature-gate state and use debug logs instead of startup events.
  • Tests

    • Tests updated to use the runtime feature-gate accessor.

@openshift-ci-robot openshift-ci-robot added the jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. label May 6, 2026
@openshift-ci-robot

openshift-ci-robot commented May 6, 2026

@vr4manta: This pull request references SPLAT-2747 which is a valid jira issue.

Details

In response to this:

SPLAT-2747

Changes

  • Added logic to handle feature gate updates
  • Fixed starting of FeatureGateAccessor to be in step w/ all required informers

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@coderabbitai

coderabbitai Bot commented May 6, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Walkthrough

Reorders operator startup to start informers, run the feature-gate accessor, and block up to 1 minute for initial feature gates before constructing controllers. The kube-cloud-config controller now stores a featureGateAccessor (not a cached gates snapshot); controller methods were made pointer receivers and feature-gate checks re-fetch current gates and warn if initial gates are not observed.

Changes

Feature Gate Synchronization & Startup Sequencing

| Layer / File(s) | Summary |
| --- | --- |
| Controller data shape & constructor (pkg/operator/kube_cloud_config/controller.go) | KubeCloudConfigController replaces currentFeatureGates with featureGateAccessor; NewController stores the accessor and no longer initializes cached gates. |
| Controller behavior & feature checks (pkg/operator/kube_cloud_config/controller.go) | sync, shouldManageCloudConfig, and isFeatureGateEnabled converted to pointer receivers; isFeatureGateEnabled returns false and warns if initial gates not observed, otherwise calls featureGateAccessor.CurrentFeatureGates() (error ignored) and uses Enabled(gateName); skip-path logs use klog.V(4).Infof. |
| Operator startup & informer/gate sequencing (pkg/operator/starter.go) | Alias Kubernetes API errors as k8serrors and switch not-found checks; start informers before starting the feature-gate accessor, run featureGateAccessor.Run(ctx), block on InitialFeatureGatesObserved() with a 1-minute timeout, and construct controllers that depend on feature gates after observation. |
| Tests updated to use accessor (pkg/operator/kube_cloud_config/controller_test.go) | Unit tests updated to set featureGateAccessor on the controller instead of extracting and assigning currentFeatureGates. |
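To make the change in data shape concrete, here is a minimal, self-contained sketch of a controller that holds an accessor and re-reads the gates on every check instead of keeping a snapshot. Everything in it is an assumption for illustration: `gates`, `gateAccessor`, `fakeAccessor`, and `controller` are hypothetical stand-ins, not the operator's or library-go's actual types, and unlike the summary above (which says the error is ignored) this sketch treats an error as "gate disabled".

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// gates is a hypothetical, simplified feature-gate set (name -> enabled).
type gates map[string]bool

// gateAccessor is a hypothetical interface loosely modeled on the accessor
// described in the walkthrough; it is not the real library-go interface.
type gateAccessor interface {
	AreInitialFeatureGatesObserved() bool
	CurrentFeatureGates() (gates, error)
}

// fakeAccessor is an in-memory accessor used only for this sketch.
type fakeAccessor struct {
	mu       sync.RWMutex
	observed bool
	current  gates
}

func (f *fakeAccessor) AreInitialFeatureGatesObserved() bool {
	f.mu.RLock()
	defer f.mu.RUnlock()
	return f.observed
}

func (f *fakeAccessor) CurrentFeatureGates() (gates, error) {
	f.mu.RLock()
	defer f.mu.RUnlock()
	if !f.observed {
		return nil, errors.New("feature gates not yet observed")
	}
	return f.current, nil
}

// set simulates the accessor observing a new feature-gate state.
func (f *fakeAccessor) set(g gates) {
	f.mu.Lock()
	defer f.mu.Unlock()
	f.observed = true
	f.current = g
}

// controller keeps the accessor, not a cached copy of the gates.
type controller struct {
	featureGateAccessor gateAccessor
}

// isFeatureGateEnabled asks the accessor for the latest gates on each call.
func (c *controller) isFeatureGateEnabled(name string) bool {
	if !c.featureGateAccessor.AreInitialFeatureGatesObserved() {
		fmt.Printf("warning: feature gates not observed yet, treating %q as disabled\n", name)
		return false
	}
	current, err := c.featureGateAccessor.CurrentFeatureGates()
	if err != nil {
		return false
	}
	return current[name]
}

func main() {
	acc := &fakeAccessor{}
	c := &controller{featureGateAccessor: acc}

	fmt.Println(c.isFeatureGateEnabled("ExampleGate")) // gates not observed yet
	acc.set(gates{"ExampleGate": true})
	fmt.Println(c.isFeatureGateEnabled("ExampleGate")) // picks up the new state
	acc.set(gates{"ExampleGate": false})
	fmt.Println(c.isFeatureGateEnabled("ExampleGate")) // and the later change
}
```

Because the controller never stores the gates itself, there is no controller-owned field for a change handler to write concurrently; synchronization lives entirely inside the accessor.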
sequenceDiagram
    participant Startup as Startup Flow
    participant Informer as Informers
    participant FGGate as FeatureGateAccessor
    participant Controller as Controllers

    Startup->>Informer: Start dynamic/config/kube informers
    Informer-->>Startup: Informers running

    Startup->>FGGate: Start feature-gate accessor (Run)
    FGGate->>FGGate: Await InitialFeatureGatesObserved (<=1m)
    alt Initial observed
        FGGate-->>Startup: Initial gates available
    else Timeout
        FGGate-->>Startup: Return timeout error
    end

    Startup->>Controller: Construct feature-gate–dependent controllers (use accessor)
    Controller->>FGGate: Query CurrentFeatureGates on demand
    FGGate-->>Controller: Return latest gates
    Controller->>Controller: isFeatureGateEnabled re-fetches gates and logs result

Estimated Code Review Effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

Suggested labels

lgtm

Suggested reviewers

  • jstuever

Important

Pre-merge checks failed

Please resolve all errors before merging. Addressing warnings is optional.

❌ Failed checks (2 warnings, 1 inconclusive)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 33.33% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |
| Test Structure And Quality | ⚠️ Warning | Tests use context.TODO() without timeouts (lines 231, 379) and lack assertion messages on ~40% of assertions, violating custom check requirements on timeouts and assertion messages. | Add context.WithTimeout to both ctrl.sync calls and add descriptive messages to all assertions using assert.Equalf() consistently throughout test functions. |
| Ote Binary Stdout Contract | ❓ Inconclusive | No result was produced after verification. Marking as INCONCLUSIVE. | Re-run the check or adjust instructions to produce a final result. |
✅ Passed checks (9 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title accurately describes the main change: the controller was refactored to react to feature gate updates by using an accessor instead of caching gates, enabling dynamic responsiveness to feature gate changes. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Stable And Deterministic Test Names | ✅ Passed | Tests in the PR use standard Go testing framework (testing.T), not Ginkgo. The check is specifically for Ginkgo test names, which are not present. No Ginkgo tests found with dynamic information. |
| Microshift Test Compatibility | ✅ Passed | No new Ginkgo e2e tests are added in this PR. The PR modifies production code (controller and startup) and updates existing unit test initialization. The custom check is not applicable. |
| Single Node Openshift (Sno) Test Compatibility | ✅ Passed | This PR does not add any Ginkgo e2e tests. The modified files (controller.go, starter.go, and controller_test.go) use standard Go testing package only. The custom check is not applicable. |
| Topology-Aware Scheduling Compatibility | ✅ Passed | No scheduling constraints introduced. PR refactors internal feature gate patterns and operator startup without modifying workloads, affinity, node selectors, or topology-dependent configs. |
| Ipv6 And Disconnected Network Test Compatibility | ✅ Passed | This PR does not add any Ginkgo e2e tests. It only modifies standard Go unit tests in the cluster-config-operator repository. The check for IPv6 and disconnected network compatibility does not apply. |

Comment @coderabbitai help to get the list of available commands and usage tips.

@vr4manta
Contributor Author

vr4manta commented May 6, 2026

/retest

@vr4manta
Contributor Author

vr4manta commented May 6, 2026

/hold
looking at issue w/ startup

@openshift-ci openshift-ci Bot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label May 6, 2026

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@pkg/operator/kube_cloud_config/controller.go`:
- Around line 75-83: The field c.currentFeatureGates is written in
featureGateAccess.SetChangeHandler's callback and read concurrently by
isFeatureGateEnabled, causing a data race; add a sync.RWMutex (e.g.,
featureGatesMu) to the controller struct, use featureGatesMu.Lock/Unlock when
assigning c.currentFeatureGates in the SetChangeHandler callback and during the
initial assignment, and use featureGatesMu.RLock/RUnlock inside
isFeatureGateEnabled when reading c.currentFeatureGates to serialize access.
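The locking scheme this review comment proposes can be sketched as follows. It is a minimal illustration under assumptions: `cachingController`, `setFeatureGates`, and the `gates` map are hypothetical stand-ins, and the walkthrough earlier in this thread indicates the PR ultimately removed the cached field in favor of querying the accessor, so this shows the suggested alternative rather than what was merged.

```go
package main

import (
	"fmt"
	"sync"
)

// gates is a hypothetical, simplified feature-gate set (name -> enabled).
type gates map[string]bool

// cachingController keeps a cached snapshot guarded by a read/write mutex.
type cachingController struct {
	featureGatesMu      sync.RWMutex
	currentFeatureGates gates
}

// setFeatureGates is what a change-handler callback would call; it takes
// the write lock so readers never observe a concurrent assignment.
func (c *cachingController) setFeatureGates(g gates) {
	c.featureGatesMu.Lock()
	defer c.featureGatesMu.Unlock()
	c.currentFeatureGates = g
}

// isFeatureGateEnabled takes the read lock; many readers may hold it at once.
func (c *cachingController) isFeatureGateEnabled(name string) bool {
	c.featureGatesMu.RLock()
	defer c.featureGatesMu.RUnlock()
	return c.currentFeatureGates[name]
}

func main() {
	c := &cachingController{}
	var wg sync.WaitGroup

	// Concurrent writers and readers; safe because every access is locked.
	for i := 0; i < 4; i++ {
		wg.Add(2)
		go func(on bool) {
			defer wg.Done()
			c.setFeatureGates(gates{"ExampleGate": on})
		}(i%2 == 0)
		go func() {
			defer wg.Done()
			_ = c.isFeatureGateEnabled("ExampleGate")
		}()
	}
	wg.Wait()

	c.setFeatureGates(gates{"ExampleGate": true})
	fmt.Println(c.isFeatureGateEnabled("ExampleGate"))
}
```

Running a program like this with `go run -race` is one way to confirm that the unguarded version races and the guarded one does not.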

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository: openshift/coderabbit/.coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 2cc482a0-127a-4616-a243-4bd79f60a718

📥 Commits

Reviewing files that changed from the base of the PR and between 15a09d1 and 3d0b4e0.

📒 Files selected for processing (2)
  • pkg/operator/kube_cloud_config/controller.go
  • pkg/operator/starter.go

Comment thread pkg/operator/kube_cloud_config/controller.go Outdated

@coderabbitai coderabbitai Bot left a comment

🧹 Nitpick comments (2)
pkg/operator/kube_cloud_config/controller.go (1)

80-91: 💤 Low value

Consider using featureChange parameter directly instead of re-fetching.

The featureChange parameter passed to the handler likely contains the new feature gates directly (commonly as featureChange.New), which would avoid the redundant call to CurrentFeatureGates(). However, this is a minor optimization.

Also note: feature gate changes won't trigger an immediate sync - the controller relies on the 1-minute resync interval to pick up changes. This is acceptable given feature gates rarely change in production.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@pkg/operator/kube_cloud_config/controller.go` around lines 80 - 91, The
handler passed to featureGateAccess.SetChangeHandler should use the provided
featureChange argument instead of calling CurrentFeatureGates() again: extract
the new gates from featureChange (e.g., featureChange.New or the appropriate
field on featuregates.FeatureChange), acquire c.featureGatesMu, assign
c.currentFeatureGates to that new value, and unlock; optionally keep a fallback
to call featureGateAccess.CurrentFeatureGates() only if featureChange does not
contain the new gates. Update the code in the SetChangeHandler closure
(referencing featureGateAccess.SetChangeHandler, featureChange,
CurrentFeatureGates, c.featureGatesMu, and c.currentFeatureGates) accordingly.
pkg/operator/starter.go (1)

201-208: ⚡ Quick win

Handle context cancellation in the select for responsive shutdown.

The select statement doesn't handle ctx.Done(). If the operator receives a shutdown signal during startup, it will wait up to 1 minute before exiting. Adding a context cancellation case improves shutdown responsiveness.

♻️ Proposed fix
 	select {
 	case <-featureGateAccessor.InitialFeatureGatesObserved():
 		klog.V(4).Info("FeatureGates initialized")
 	case <-time.After(1 * time.Minute):
 		accessError := errors.New("timed out waiting for FeatureGate detection")
-		klog.Error(accessError, "unable to start operator")
+		klog.Errorf("unable to start operator: %v", accessError)
 		return accessError
+	case <-ctx.Done():
+		return ctx.Err()
 	}

Also: klog.Error(accessError, "unable to start operator") concatenates both arguments without formatting. Using klog.Errorf or klog.ErrorS would produce cleaner log output.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@pkg/operator/starter.go` around lines 201 - 208, The select waiting on
featureGateAccessor.InitialFeatureGatesObserved() currently ignores context
cancellation and uses klog.Error with concatenated args; update the select in
starter.go to include a case for <-ctx.Done() that logs the cancellation using
klog.ErrorS (or klog.Errorf) with the context error and immediately returns
ctx.Err(), and also change the timeout branch to log the timeout using
klog.ErrorS/klog.Errorf with the accessError (instead of klog.Error) so both
branches produce structured/ formatted log output; locate the select around
featureGateAccessor.InitialFeatureGatesObserved(), the accessError variable, and
use ctx.Done()/ctx.Err() for responsive shutdown.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository: openshift/coderabbit/.coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 7e9e6a22-fce2-4ad8-8bdf-488330eb3b03

📥 Commits

Reviewing files that changed from the base of the PR and between 3d0b4e0 and ccc9eed.

📒 Files selected for processing (2)
  • pkg/operator/kube_cloud_config/controller.go
  • pkg/operator/starter.go

@vr4manta
Contributor Author

/retest


@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@pkg/operator/kube_cloud_config/controller.go`:
- Line 44: Tests still construct the controller using the old field name
currentFeatureGates; update those struct literals to use the renamed field
featureGateAccessor instead (replace occurrences of currentFeatureGates with
featureGateAccessor in the controller test setup so the controller struct
literal matches the controller.go definition).
- Around line 104-105: The log call using klog.V(4).Infof in
KubeCloudConfigController is malformed: the first argument is a literal
("KubeCloudConfigController") not a format string, causing EXTRA output; fix by
making the first argument a proper format string that includes the platform
placeholder (e.g., "KubeCloudConfigController: Skipping kube-cloud-config
management for platform %s") and pass platformName as the single format
argument, or alternatively use Info with a concatenated message or InfoS with
structured key/value logging; update the klog.V(4).Infof invocation accordingly.
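The "EXTRA output" mentioned here is standard fmt behavior and can be shown with fmt.Sprintf standing in for klog.V(4).Infof, which is documented to handle arguments in the manner of fmt.Printf. The message text and the helper names `malformedMessage` and `fixedMessage` are assumptions for illustration; the exact arguments of the original call are not shown in this thread.

```go
package main

import "fmt"

// malformedMessage passes a plain tag where the format string belongs.
// The tag is held in a variable only so the sketch builds cleanly: go vet's
// printf check reports a constant format that has arguments but no verbs.
func malformedMessage(platformName string) string {
	tag := "KubeCloudConfigController"
	return fmt.Sprintf(tag, "Skipping kube-cloud-config management for platform", platformName)
}

// fixedMessage uses a real format string with a single %s placeholder.
func fixedMessage(platformName string) string {
	return fmt.Sprintf("KubeCloudConfigController: Skipping kube-cloud-config management for platform %s", platformName)
}

func main() {
	// Prints: KubeCloudConfigController%!(EXTRA string=Skipping kube-cloud-config management for platform, string=AWS)
	fmt.Println(malformedMessage("AWS"))

	// Prints: KubeCloudConfigController: Skipping kube-cloud-config management for platform AWS
	fmt.Println(fixedMessage("AWS"))
}
```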

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository: openshift/coderabbit/.coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 8dbd8198-1a60-4f2a-a50a-e07831809d46

📥 Commits

Reviewing files that changed from the base of the PR and between ccc9eed and 06ad272.

📒 Files selected for processing (2)
  • pkg/operator/kube_cloud_config/controller.go
  • pkg/operator/starter.go
🚧 Files skipped from review as they are similar to previous changes (1)
  • pkg/operator/starter.go

Comment thread pkg/operator/kube_cloud_config/controller.go
Comment thread pkg/operator/kube_cloud_config/controller.go Outdated
@vr4manta
Contributor Author

/retest

@vr4manta
Contributor Author

/unhold
Resolved the order of initialization and changed how feature gates are fetched, to prevent timing issues when setting the current feature gates.

@openshift-ci openshift-ci Bot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label May 12, 2026
@vr4manta
Contributor Author

@coderabbitai summary

@coderabbitai

coderabbitai Bot commented May 12, 2026

✅ Actions performed

Summary regeneration triggered.

@vr4manta
Contributor Author

/assign @JoelSpeed

@JoelSpeed
Contributor

/lgtm

@openshift-ci openshift-ci Bot added lgtm Indicates that a PR is ready to be merged. approved Indicates a PR has been approved by an approver from all required OWNERS files. labels May 13, 2026
@vr4manta
Contributor Author

/retest

@vr4manta
Contributor Author

/retest
/verified by @vr4manta

@openshift-ci-robot openshift-ci-robot added the verified Signifies that the PR passed pre-merge verification criteria label May 14, 2026
@openshift-ci-robot

@vr4manta: This PR has been marked as verified by @vr4manta.

Details

In response to this:

/retest
/verified by @vr4manta

@openshift-merge-bot
Contributor

/retest-required

Remaining retests: 0 against base HEAD 15a09d1 and 2 for PR HEAD e664206 in total

@vr4manta
Contributor Author

So the upgrade job is failing due to:

2026-05-14T14:59:13.907630376Z I0514 14:59:13.907583       1 starter.go:250] Started feature gate accessor
2026-05-14T14:59:13.907718089Z I0514 14:59:13.907704       1 simple_featuregate_reader.go:171] Starting feature-gate-detector
2026-05-14T14:59:13.914243540Z E0514 14:59:13.914183       1 simple_featuregate_reader.go:290] "Unhandled Error" err="cluster failed with : unable to determine features: missing desired version \"5.0.0-0.ci-2026-05-14-112559-test-ci-op-4xg02xdn-latest\" in featuregates.config.openshift.io/cluster"
2026-05-14T14:59:13.917143386Z E0514 14:59:13.917084       1 simple_featuregate_reader.go:290] "Unhandled Error" err="cluster failed with : unable to determine features: missing desired version \"5.0.0-0.ci-2026-05-14-112559-test-ci-op-4xg02xdn-latest\" in featuregates.config.openshift.io/cluster"
2026-05-14T14:59:13.919541419Z E0514 14:59:13.919495       1 simple_featuregate_reader.go:290] "Unhandled Error" err="cluster failed with : unable to determine features: missing desired version \"5.0.0-0.ci-2026-05-14-112559-test-ci-op-4xg02xdn-latest\" in featuregates.config.openshift.io/cluster"
2026-05-14T14:59:13.939708737Z E0514 14:59:13.939662       1 simple_featuregate_reader.go:290] "Unhandled Error" err="cluster failed with : unable to determine features: missing desired version \"5.0.0-0.ci-2026-05-14-112559-test-ci-op-4xg02xdn-latest\" in featuregates.config.openshift.io/cluster"
2026-05-14T14:59:13.980526747Z E0514 14:59:13.980460       1 simple_featuregate_reader.go:290] "Unhandled Error" err="cluster failed with : unable to determine features: missing desired version \"5.0.0-0.ci-2026-05-14-112559-test-ci-op-4xg02xdn-latest\" in featuregates.config.openshift.io/cluster"
2026-05-14T14:59:14.060832413Z E0514 14:59:14.060781       1 simple_featuregate_reader.go:290] "Unhandled Error" err="cluster failed with : unable to determine features: missing desired version \"5.0.0-0.ci-2026-05-14-112559-test-ci-op-4xg02xdn-latest\" in featuregates.config.openshift.io/cluster"
2026-05-14T14:59:14.221159927Z E0514 14:59:14.221092       1 simple_featuregate_reader.go:290] "Unhandled Error" err="cluster failed with : unable to determine features: missing desired version \"5.0.0-0.ci-2026-05-14-112559-test-ci-op-4xg02xdn-latest\" in featuregates.config.openshift.io/cluster"
2026-05-14T14:59:14.541970269Z E0514 14:59:14.541873       1 simple_featuregate_reader.go:290] "Unhandled Error" err="cluster failed with : unable to determine features: missing desired version \"5.0.0-0.ci-2026-05-14-112559-test-ci-op-4xg02xdn-latest\" in featuregates.config.openshift.io/cluster"
2026-05-14T14:59:15.183878110Z E0514 14:59:15.183819       1 simple_featuregate_reader.go:290] "Unhandled Error" err="cluster failed with : unable to determine features: missing desired version \"5.0.0-0.ci-2026-05-14-112559-test-ci-op-4xg02xdn-latest\" in featuregates.config.openshift.io/cluster"
2026-05-14T14:59:16.464523929Z E0514 14:59:16.464441       1 simple_featuregate_reader.go:290] "Unhandled Error" err="cluster failed with : unable to determine features: missing desired version \"5.0.0-0.ci-2026-05-14-112559-test-ci-op-4xg02xdn-latest\" in featuregates.config.openshift.io/cluster"
2026-05-14T14:59:19.025511519Z E0514 14:59:19.025458       1 simple_featuregate_reader.go:290] "Unhandled Error" err="cluster failed with : unable to determine features: missing desired version \"5.0.0-0.ci-2026-05-14-112559-test-ci-op-4xg02xdn-latest\" in featuregates.config.openshift.io/cluster"
2026-05-14T14:59:24.149787360Z E0514 14:59:24.149706       1 simple_featuregate_reader.go:290] "Unhandled Error" err="cluster failed with : unable to determine features: missing desired version \"5.0.0-0.ci-2026-05-14-112559-test-ci-op-4xg02xdn-latest\" in featuregates.config.openshift.io/cluster"
2026-05-14T14:59:34.390727978Z E0514 14:59:34.390666       1 simple_featuregate_reader.go:290] "Unhandled Error" err="cluster failed with : unable to determine features: missing desired version \"5.0.0-0.ci-2026-05-14-112559-test-ci-op-4xg02xdn-latest\" in featuregates.config.openshift.io/cluster"
2026-05-14T14:59:54.873020721Z E0514 14:59:54.872675       1 simple_featuregate_reader.go:290] "Unhandled Error" err="cluster failed with : unable to determine features: missing desired version \"5.0.0-0.ci-2026-05-14-112559-test-ci-op-4xg02xdn-latest\" in featuregates.config.openshift.io/cluster"
2026-05-14T15:00:13.908603663Z E0514 15:00:13.908543       1 starter.go:256] timed out waiting for FeatureGate detectionunable to start operator
2026-05-14T15:00:13.908603663Z W0514 15:00:13.908583       1 builder.go:138] graceful termination failed, controllers failed with error: timed out waiting for FeatureGate detection

This looks like an issue being caused by the new feature gate accessor being added. @JoelSpeed Not sure why this would be happening. Is this something that may be an issue w/ the accessor?

@vr4manta
Contributor Author

Current version in the featuregates CR:

  status:
    featureGates:
    - disabled:
      - <REDACTED>
      enabled:
      - <REDACTED>
      version: 5.0.0-0.ci-2026-05-14-112433-test-ci-op-4xg02xdn-initial

@openshift-ci-robot openshift-ci-robot removed the verified Signifies that the PR passed pre-merge verification criteria label May 15, 2026
@openshift-ci openshift-ci Bot removed the lgtm Indicates that a PR is ready to be merged. label May 15, 2026
@vr4manta
Contributor Author

The accessor was being started before the feature gate controller had started and processed the feature gate list for the upgrade. Moving the start of the accessor to after the feature gate controller should do the trick.

@openshift-ci-robot

openshift-ci-robot commented May 15, 2026

@vr4manta: This pull request references SPLAT-2747 which is a valid jira issue.

Details

In response to this:

SPLAT-2747

Changes

  • Changed logic for accessing feature gates to be done directly through accessor to prevent race conditions
  • Fixed starting of FeatureGateAccessor to be in step w/ all required informers


Comment thread pkg/operator/starter.go
@vr4manta
Contributor Author

/retest

Comment thread pkg/operator/starter.go Outdated
// Start the feature gate accessor and wait for it to observe initial feature gates
// This must happen before creating controllers that depend on feature gates
go featureGateAccessor.Run(ctx)
go featureGateController.Run(ctx, 1)
Contributor

Nit, this probably should come before the accessor, and also, can we have a comment explaining why the featuregate controller must be started early

// The featuregate controller must never be feature-gated itself. It is responsible for ensuring the featuregate
// object status contains the correct feature gate states for the current release.
// It must run before the featuregate accessor, as the accessor depends on the featuregate status being updated.

Contributor Author

sure, i can add a comment and change order.

@JoelSpeed
Contributor

/lgtm

@openshift-ci openshift-ci Bot added the lgtm Indicates that a PR is ready to be merged. label May 18, 2026
@openshift-ci
Contributor

openshift-ci Bot commented May 18, 2026

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: JoelSpeed, vr4manta

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@vr4manta
Contributor Author

/retest

@openshift-ci
Contributor

openshift-ci Bot commented May 18, 2026

@vr4manta: all tests passed!

Full PR test history. Your PR dashboard.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

@vr4manta
Contributor Author

/verified by @vr4manta

@openshift-ci-robot openshift-ci-robot added the verified Signifies that the PR passed pre-merge verification criteria label May 19, 2026
@openshift-ci-robot

@vr4manta: This PR has been marked as verified by @vr4manta.

Details

In response to this:

/verified by @vr4manta

@openshift-merge-bot openshift-merge-bot Bot merged commit 8524296 into openshift:main May 19, 2026
14 checks passed

Labels

approved Indicates a PR has been approved by an approver from all required OWNERS files. jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. lgtm Indicates that a PR is ready to be merged. verified Signifies that the PR passed pre-merge verification criteria

3 participants