MGMT-15349: Don't attempt to contact spoke while unbinding a day2 host #5383

Merged


carbonin
Member

If an agent is unbound while it is installing, the transition fails, but the agent spec is still left with no cluster deployment reference. The agent controller should not attempt to contact the spoke cluster in this case.

Previously, unbinding a day-2 host that was installing could cause a panic. This patch guards against that situation.
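A minimal sketch of the kind of guard described above, not the actual diff. All types and the `shouldContactSpoke` helper are hypothetical simplifications; the real controller works with the full Agent CR and backend host objects. The idea is only that a missing cluster deployment reference short-circuits any spoke access instead of being dereferenced.

```go
package main

import "fmt"

// ClusterReference is a simplified stand-in for a cluster deployment reference.
type ClusterReference struct {
	Namespace string
	Name      string
}

// AgentSpec is a hypothetical, trimmed-down agent spec.
// ClusterDeploymentName is nil once the agent has been unbound.
type AgentSpec struct {
	ClusterDeploymentName *ClusterReference
}

// Host is a hypothetical view of the backend host state.
type Host struct {
	Status string // may still be "installing" if the unbind transition failed
	Day2   bool
}

// shouldContactSpoke reports whether the controller has what it needs to
// reach the spoke cluster. Without a cluster deployment reference there is
// nothing to look up, so the caller must skip the spoke entirely.
func shouldContactSpoke(spec AgentSpec, host Host) bool {
	if spec.ClusterDeploymentName == nil {
		return false
	}
	return host.Day2
}

func main() {
	host := Host{Status: "installing", Day2: true}
	unbound := AgentSpec{}
	bound := AgentSpec{ClusterDeploymentName: &ClusterReference{Namespace: "spoke", Name: "cd"}}

	fmt.Println(shouldContactSpoke(unbound, host)) // false
	fmt.Println(shouldContactSpoke(bound, host))   // true
}
```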

This should also be covered by the agent validating webhook, which prevents agents from being unbound while they are installing, but the controller still shouldn't crash if such an update slips through.

List all the issues related to this PR

https://issues.redhat.com/browse/MGMT-15349

  • New Feature
  • Enhancement
  • Bug fix
  • Tests
  • Documentation
  • CI/CD

What environments does this code impact?

  • Automation (CI, tools, etc)
  • Cloud
  • Operator Managed Deployments
  • None

How was this code tested?

  • assisted-test-infra environment
  • dev-scripts environment
  • Reviewer's test appreciated
  • Waiting for CI to do a full test run
  • Manual (Elaborate on how it was tested)
  • No tests needed

Checklist

  • Title and description added to both, commit and PR.
  • Relevant issues have been associated (see [CONTRIBUTING] guide)
  • This change does not require a documentation update (docstring, docs, README, etc)
  • Does this change include unit-tests (note that code changes require unit-tests)

Reviewers Checklist

  • Are the title and description (in both PR and commit) meaningful and clear?
  • Is there a bug required (and linked) for this change?
  • Should this PR be backported?

@openshift-ci-robot openshift-ci-robot added the jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. label Jul 25, 2023

openshift-ci-robot commented Jul 25, 2023

@carbonin: This pull request references MGMT-15349 which is a valid jira issue.


@openshift-ci openshift-ci bot added the size/M Denotes a PR that changes 30-99 lines, ignoring generated files. label Jul 25, 2023
@openshift-ci openshift-ci bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Jul 25, 2023

codecov bot commented Jul 25, 2023

Codecov Report

Merging #5383 (5968d86) into master (6a583ef) will increase coverage by 0.23%.
Report is 5 commits behind head on master.
The diff coverage is 100.00%.

@@            Coverage Diff             @@
##           master    #5383      +/-   ##
==========================================
+ Coverage   67.67%   67.91%   +0.23%     
==========================================
  Files         226      226              
  Lines       33349    33582     +233     
==========================================
+ Hits        22570    22806     +236     
+ Misses       8745     8737       -8     
- Partials     2034     2039       +5     
Files Changed Coverage Δ
...nternal/controller/controllers/agent_controller.go 77.52% <100.00%> (+0.02%) ⬆️

... and 6 files with indirect coverage changes

@openshift-ci openshift-ci bot added the lgtm Indicates that a PR is ready to be merged. label Jul 26, 2023

openshift-ci bot commented Jul 26, 2023

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: carbonin, filanov



openshift-ci bot commented Jul 26, 2023

@carbonin: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

| Test name | Commit | Details | Required | Rerun command |
| --- | --- | --- | --- | --- |
| ci/prow/edge-e2e-ai-operator-ztp-capi | 5968d86 | link | false | `/test edge-e2e-ai-operator-ztp-capi` |


@openshift-merge-robot openshift-merge-robot merged commit 83852f8 into openshift:master Jul 26, 2023
14 of 15 checks passed
@carbonin
Member Author

/cherry-pick release-ocm-2.8

@carbonin carbonin deleted the day2-unbind-installing-host branch July 26, 2023 17:31
@openshift-cherrypick-robot

@carbonin: new pull request created: #5388


CrystalChun pushed a commit to CrystalChun/assisted-service that referenced this pull request Aug 25, 2023
danielerez pushed a commit to danielerez/assisted-service that referenced this pull request Oct 15, 2023