OCPBUGS-86008: Gate Route watch on management cluster capability #8484

Open
smrtrfszm wants to merge 1 commit into openshift:main from smrtrfszm:smrtrfszm/skip-route-watch

Conversation

@smrtrfszm
Contributor

@smrtrfszm smrtrfszm commented May 12, 2026

What this PR does / why we need it:

The hosted-cluster-config-operator unconditionally watches route.openshift.io/v1 Routes against the management cluster to react to hostname changes on the metrics-proxy Route. On management clusters that do not expose the Routes API (e.g. non-OpenShift management clusters) this watch fails during controller setup and prevents HCCO from starting.

Detect the management cluster Route capability using the existing capabilities.DetectManagementClusterCapabilities helper and only register the watch when route.openshift.io is registered. This mirrors the pattern already used in other parts of the code.

Which issue(s) this PR fixes:

Fixes #OCPBUGS-86008

Special notes for your reviewer:

Checklist:

  • Subject and description added to both, commit and PR.
  • Relevant issues have been referenced.
  • This change includes docs.
  • This change includes unit tests.

Summary by CodeRabbit

  • Bug Fixes
    • Only monitor the Route resource when the management cluster reports Route capability, preventing operator startup failures on non-OpenShift management clusters.
    • Controller startup is now more resilient: it detects cluster capabilities and skips incompatible watches so the operator can run on a wider range of management clusters without errors.

@openshift-merge-bot
Contributor

Pipeline controller notification
This repo is configured to use the pipeline controller. Second-stage tests will be triggered either automatically or after lgtm label is added, depending on the repository configuration. The pipeline controller will automatically detect which contexts are required and will utilize /test Prow commands to trigger the second stage.

For optional jobs, comment /test ? to see a list of all defined jobs. To manually trigger all second-stage jobs, use the /pipeline required command.

This repository is configured in: LGTM mode

@openshift-ci
Contributor

openshift-ci Bot commented May 12, 2026

Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all

@openshift-ci openshift-ci Bot added do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. do-not-merge/needs-area labels May 12, 2026
@coderabbitai
Contributor

coderabbitai Bot commented May 12, 2026

📝 Walkthrough

The Setup function in the resources controller now conditionally registers a watch for OpenShift Route resources on the management cluster. It creates a Kubernetes discovery client, detects management-cluster capabilities via capabilities.DetectManagementClusterCapabilities, and registers the Route watch only if the Route capability is present. Errors from discovery-client creation or capability detection are returned to the caller. The gate replaces the previously unconditional Route watch, which failed on non-OpenShift management clusters.

Sequence Diagram(s)

sequenceDiagram
    participant Controller
    participant DiscoveryClient
    participant ManagementClusterAPI
    participant CapabilitiesDetector
    participant RouteAPI

    Controller->>DiscoveryClient: create discovery client
    DiscoveryClient->>ManagementClusterAPI: query API resources
    ManagementClusterAPI-->>DiscoveryClient: API resource list
    DiscoveryClient->>CapabilitiesDetector: provide resource list
    CapabilitiesDetector-->>Controller: capabilities (Route present / absent)
    alt Route capability present
        Controller->>RouteAPI: register watch for Route resources
        RouteAPI-->>Controller: watch started
    else Route capability absent
        Controller-->>Controller: skip Route watch
    end
🚥 Pre-merge checks | ✅ 11 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (11 passed)

| Check name | Status | Explanation |
|---|---|---|
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Stable And Deterministic Test Names | ✅ Passed | No Ginkgo tests are present in this PR. The PR only modifies resources.go to add capability detection for routes, with no new test files or changes to test files. The check does not apply. |
| Test Structure And Quality | ✅ Passed | Ginkgo test quality check not applicable. Tests use standard Go testing (*testing.T), not Ginkgo patterns (no It/Describe/BeforeEach blocks). |
| Microshift Test Compatibility | ✅ Passed | No new Ginkgo e2e tests are added. The PR modifies controller code to gate Route watch on capability detection, improving MicroShift compatibility. |
| Single Node Openshift (Sno) Test Compatibility | ✅ Passed | No new Ginkgo e2e tests were added in this PR. The check only applies when tests are added. This PR only modifies controller code. |
| Topology-Aware Scheduling Compatibility | ✅ Passed | PR introduces no scheduling constraints. Changes only gate Route API watch registration based on cluster capabilities, which is a controller setup change, not a scheduling/deployment manifest change. |
| Ote Binary Stdout Contract | ✅ Passed | PR introduces no stdout writes in process-level code. All logging uses logr.Logger (stderr). Changes only add discovery client creation and conditional Route watch with error handling via fmt.Errorf. |
| Ipv6 And Disconnected Network Test Compatibility | ✅ Passed | PR does not add Ginkgo e2e tests. It modifies a controller setup file to conditionally watch Route resources based on cluster capabilities. Custom check only applies when Ginkgo e2e tests are added. |
| Title check | ✅ Passed | The title clearly and specifically describes the main change: conditionally gating the Route watch based on management cluster capability detection. |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |


@openshift-ci
Contributor

openshift-ci Bot commented May 12, 2026

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: smrtrfszm
Once this PR has been reviewed and has the lgtm label, please assign jparrill for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci openshift-ci Bot added area/control-plane-operator Indicates the PR includes changes for the control plane operator - in an OCP release and removed do-not-merge/needs-area labels May 12, 2026
@smrtrfszm smrtrfszm force-pushed the smrtrfszm/skip-route-watch branch from daa5044 to 94461ec Compare May 12, 2026 06:24
@codecov

codecov Bot commented May 12, 2026

Codecov Report

❌ Patch coverage is 0% with 12 lines in your changes missing coverage. Please review.
✅ Project coverage is 39.99%. Comparing base (7f1af37) to head (f54aa29).
⚠️ Report is 19 commits behind head on main.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| ...rconfigoperator/controllers/resources/resources.go | 0.00% | 12 Missing ⚠️ |

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #8484      +/-   ##
==========================================
- Coverage   40.00%   39.99%   -0.01%     
==========================================
  Files         751      751              
  Lines       92863    92872       +9     
==========================================
  Hits        37147    37147              
- Misses      53024    53033       +9     
  Partials     2692     2692              

| Files with missing lines | Coverage Δ |
|---|---|
| ...rconfigoperator/controllers/resources/resources.go | 55.20% <0.00%> (-0.18%) ⬇️ |

| Flag | Coverage Δ |
|---|---|
| cmd-support | 34.09% <ø> (ø) |
| cpo-hostedcontrolplane | 40.56% <ø> (ø) |
| cpo-other | 40.11% <0.00%> (-0.03%) ⬇️ |
| hypershift-operator | 50.52% <ø> (ø) |
| other | 31.54% <ø> (ø) |


@hypershift-jira-solve-ci

All the facts are now clear. Here is the analysis:

Test Failure Analysis Complete

Job Information

Test Failure Analysis

Error

codecov/patch — 0.00% of diff hit (target 39.85%)

Patch coverage is 0% with 12 lines in your changes missing coverage.
File: control-plane-operator/hostedclusterconfigoperator/controllers/resources/resources.go
Patch: 0.00% — 12 Missing lines

Summary

The codecov/patch check failed because the PR adds 12 new executable lines to resources.go and none of them are exercised by any unit test. The PR introduces a capability-detection gate around the Route watch in the HCCO Setup() function — creating a discovery client, calling DetectManagementClusterCapabilities, and conditionally watching Routes. Codecov's default patch-coverage policy requires new/changed lines to meet at least the project-level coverage rate (39.85%), but the patch achieves 0.00%. This is a code coverage gap, not a product bug or infrastructure failure.

Root Cause

The PR modifies the Setup() function in control-plane-operator/hostedclusterconfigoperator/controllers/resources/resources.go to gate the routev1.Route watch behind a management-cluster capability check. All 12 new executable lines are completely uncovered by tests:

  1. Lines 314–316: discovery.NewDiscoveryClientForConfig(opts.CPCluster.GetConfig()) — creates a discovery client against the management (control-plane) cluster.
  2. Lines 317–319: capabilities.DetectManagementClusterCapabilities(cpDiscoveryClient) — probes the management cluster API to determine available capabilities.
  3. Line 320: mgmtCaps.Has(capabilities.CapabilityRoute) — checks whether the route.openshift.io API group is present.
  4. Lines 321–323: The conditional c.Watch(...) call and its error return.

The Setup() function is a controller-manager wiring function that creates real Kubernetes watches and caches. No existing test constructs a mock or fake environment that invokes this path, and the PR does not add one. The codecov.yml configuration does not define an explicit patch threshold, so Codecov defaults to requiring the patch to meet the project-level coverage (39.85%). Since 0% < 39.85%, the check fails.

This is strictly a missing-test-coverage issue. The functional logic itself is sound — gating the Route watch on CapabilityRoute prevents HCCO from failing to start on non-OpenShift management clusters.
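
The threshold arithmetic behind this failure can be worked through in a few lines of Go. This is a sketch of the comparison as described in this analysis (patch percentage versus a 39.85% target), not Codecov's implementation: the function names are hypothetical, and the handling of an empty patch and of rounding are assumptions.

```go
package main

import "fmt"

// patchCoverage returns the percentage of changed executable lines hit by tests.
func patchCoverage(covered, total int) float64 {
	if total == 0 {
		return 100 // assumption: an empty patch is treated as fully covered
	}
	return 100 * float64(covered) / float64(total)
}

// patchCheckPasses reports whether the patch meets the target percentage.
func patchCheckPasses(covered, total int, target float64) bool {
	return patchCoverage(covered, total) >= target
}

// minLinesToPass returns the smallest number of covered lines that satisfies
// the target, or -1 if no count up to total does.
func minLinesToPass(total int, target float64) int {
	for covered := 0; covered <= total; covered++ {
		if patchCheckPasses(covered, total, target) {
			return covered
		}
	}
	return -1
}

func main() {
	const target = 39.85 // target reported for the codecov/patch check
	fmt.Printf("patch coverage: %.2f%%, passes: %t\n",
		patchCoverage(0, 12), patchCheckPasses(0, 12, target))
	fmt.Printf("lines that must be covered to pass: %d of 12\n",
		minLinesToPass(12, target))
}
```

Under these assumptions, covering 4 of the 12 lines gives 33.33% and still fails, while 5 of 12 gives 41.67% and passes, so a test that exercises the happy path of the new block would likely be enough to clear the check.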

Recommendations
  1. Add a unit test for the capability-gating logic. The simplest approach is to extract the capability-detection + conditional-watch block into a helper function that can be tested independently with a fake discovery client. For example:

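    // shouldWatchRoutes reports whether the management cluster serves the
    // Route API, so the caller can decide whether to register the Route watch.
    func shouldWatchRoutes(discoveryClient discovery.DiscoveryInterface) (bool, error) {
        caps, err := capabilities.DetectManagementClusterCapabilities(discoveryClient)
        if err != nil {
            return false, err
        }
        return caps.Has(capabilities.CapabilityRoute), nil
    }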

    This function can be tested with the fake discovery client returned by fake.NewSimpleClientset().Discovery(), configured so that it either does or does not expose route.openshift.io.

  2. Alternatively, add an integration-style test for the Setup() function using envtest or a mock controller-manager that validates the watch is (or is not) registered based on the management cluster's API groups.

  3. If the team considers this code inherently untestable at the unit level (controller wiring), add an explicit patch threshold or ignore rule in codecov.yml:

    coverage:
      patch:
        default:
          target: auto
          threshold: 1%  # allow small drops

    However, adding tests is preferred over relaxing thresholds.

  4. No action needed on the functional change itself — the Route watch gating logic is correct and addresses a real startup failure on non-OpenShift management clusters.

Evidence

| Evidence | Detail |
|---|---|
| Check conclusion | failure (codecov/patch) |
| Patch coverage | 0.00% (0 of 12 new lines covered) |
| Project coverage | 39.85% (unchanged from base) |
| Target threshold | 39.85% (Codecov default: match project coverage) |
| File affected | control-plane-operator/hostedclusterconfigoperator/controllers/resources/resources.go |
| File coverage delta | 55.11% → 54.93% (−0.18%) |
| Lines added (executable) | 12 (capability detection, conditional watch, error handling) |
| Lines covered | 0 |
| Codecov config | codecov.yml: no explicit patch or project coverage targets defined; Codecov defaults apply |
| Codecov bot report | "Patch coverage is 0% with 12 lines in your changes missing coverage" |
| Base commit → head commit | 96a2bcb → 94461ec |

@smrtrfszm smrtrfszm force-pushed the smrtrfszm/skip-route-watch branch from 94461ec to f54aa29 Compare May 14, 2026 22:38
@smrtrfszm smrtrfszm marked this pull request as ready for review May 17, 2026 16:07
@openshift-ci openshift-ci Bot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label May 17, 2026
@openshift-ci openshift-ci Bot requested review from cblecker and csrwng May 17, 2026 16:08
@smrtrfszm smrtrfszm changed the title from "fix(HCCO): gate Route watch on management cluster capability" to "OCPBUGS-86008: Gate Route watch on management cluster capability" May 17, 2026
@openshift-ci-robot openshift-ci-robot added the jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. label May 17, 2026
@openshift-ci-robot

@smrtrfszm: This pull request references Jira Issue OCPBUGS-86008, which is valid. The bug has been moved to the POST state.

3 validation(s) were run on this bug
  • bug is open, matching expected state (open)
  • bug target version (5.0.0) matches configured target version for branch (5.0.0)
  • bug is in the state New, which is one of the valid states (NEW, ASSIGNED, POST)

The bug has been updated to refer to the pull request using the external bug tracker.


Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@openshift-ci-robot openshift-ci-robot added the jira/valid-bug Indicates that a referenced Jira bug is valid for the branch this PR is targeting. label May 17, 2026
@openshift-ci-robot

@smrtrfszm: This pull request references Jira Issue OCPBUGS-86008, which is valid.

3 validation(s) were run on this bug
  • bug is open, matching expected state (open)
  • bug target version (5.0.0) matches configured target version for branch (5.0.0)
  • bug is in the state POST, which is one of the valid states (NEW, ASSIGNED, POST)


Labels

  • area/control-plane-operator: Indicates the PR includes changes for the control plane operator - in an OCP release
  • jira/valid-bug: Indicates that a referenced Jira bug is valid for the branch this PR is targeting.
  • jira/valid-reference: Indicates that this PR references a valid Jira ticket of any type.
