Bug 1990916: ci-operator/step-registry/ipi/install/install: Default OPENSHIFT_INSTALL_PRESERVE_BOOTSTRAP to empty #20978

Merged

Conversation

wking
Member

@wking wking commented Aug 7, 2021

c421970 (#20592) started setting the environment variable for all calls. It defaulted to false, apparently assuming that that meant "keep on deleting the bootstrap resources". But the installer actually treats any non-empty value as "please preserve".

This should avoid situations like Bugzilla bug 1990916, where the false default led the installer to say:

time="2021-08-05T21:44:40Z" level=warning msg="OPENSHIFT_INSTALL_PRESERVE_BOOTSTRAP is set, not destroying bootstrap resources. Warning: this should only be used for debugging purposes, and poses a risk to cluster stability."

which broke ingress on:

level=error msg=Cluster operator ingress Degraded is True with IngressDegraded: The "default" ingress controller reports Degraded=True: DegradedConditions: One or more other status conditions indicate a degraded state: LoadBalancerReady=False (SyncLoadBalancerFailed: The service-controller component is reporting SyncLoadBalancerFailed events like: Error syncing load balancer: failed to ensure load balancer: EnsureBackendPoolDeleted: failed to parse the VMAS ID : getAvailabilitySetNameByID: failed to parse the VMAS ID

which makes everything that's ingress-dependent (auth, console, ...) sad.
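The mismatch described above can be sketched in shell. This is only an illustration under stated assumptions: `preserve_by_presence` and `preserve_by_boolean` are hypothetical helper names, not code from the installer or the step registry. The first mirrors the "any non-empty value means preserve" behavior attributed to the installer; the second mirrors the boolean reading that the `false` default appears to have assumed.

```shell
#!/usr/bin/env bash
# Hypothetical helpers for illustration; neither is taken from the
# installer or from the step registry.

# Presence check: any non-empty value counts as "preserve", which is the
# behavior the description attributes to the installer.
preserve_by_presence() {
  if [ -n "${1-}" ]; then echo preserve; else echo destroy; fi
}

# Boolean reading: only the literal string "true" counts as "preserve",
# which is what the old "false" default appears to have assumed.
preserve_by_boolean() {
  if [ "${1-}" = "true" ]; then echo preserve; else echo destroy; fi
}

# Compare both readings for the empty default, "false", and "true".
for value in "" "false" "true"; do
  printf 'value=%-8s presence=%-9s boolean=%s\n' \
    "'${value}'" \
    "$(preserve_by_presence "$value")" \
    "$(preserve_by_boolean "$value")"
done
```

The two readings agree for the empty string and for `true`, and disagree only for `false`, which is the value the earlier default used.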

@openshift-ci openshift-ci bot added the bugzilla/severity-unspecified Referenced Bugzilla bug's severity is unspecified for the PR. label Aug 7, 2021
@openshift-ci
Contributor

openshift-ci bot commented Aug 7, 2021

@wking: This pull request references Bugzilla bug 1990916, which is invalid:

  • expected the bug to target the "4.9.0" release, but it targets "---" instead

Comment /bugzilla refresh to re-evaluate validity if changes to the Bugzilla bug are made, or edit the title of this pull request to link to a different bug.

In response to this:

Bug 1990916: ci-operator/step-registry/ipi/install/install: Default OPENSHIFT_INSTALL_PRESERVE_BOOTSTRAP to empty

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@openshift-ci openshift-ci bot added the bugzilla/invalid-bug Indicates that a referenced Bugzilla bug is invalid for the branch this PR is targeting. label Aug 7, 2021
@openshift-ci openshift-ci bot requested review from csrwng and deads2k August 7, 2021 17:50
@openshift-ci openshift-ci bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Aug 7, 2021
…ALL_PRESERVE_BOOTSTRAP to empty

c421970 (Preserve bootstrap node on single-node installations,
2021-07-26, openshift#20592) started setting the environment variable for all
calls.  It defaulted to 'false', apparently assuming that that meant
"keep on deleting the bootstrap resources".  But the installer
actually treats any non-empty value as "please preserve" [1].

This should avoid situations like [2,3], where the 'false' default
led the installer to say [4,5]:

  time="2021-08-05T21:44:40Z" level=warning msg="OPENSHIFT_INSTALL_PRESERVE_BOOTSTRAP is set, not destroying bootstrap resources. Warning: this should only be used for debugging purposes, and poses a risk to cluster stability."

which broke ingress on [4]:

    level=error msg=Cluster operator ingress Degraded is True with IngressDegraded: The "default" ingress controller reports Degraded=True: DegradedConditions: One or more other status conditions indicate a degraded state: LoadBalancerReady=False (SyncLoadBalancerFailed: The service-controller component is reporting SyncLoadBalancerFailed events like: Error syncing load balancer: failed to ensure load balancer: EnsureBackendPoolDeleted: failed to parse the VMAS ID : getAvailabilitySetNameByID: failed to parse the VMAS ID

which makes everything that's ingress-dependent (auth, console, ...)
sad.

[1]: https://github.com/openshift/installer/blob/6d778f911e79afad8ba2ff4301eda5b5cf4d8e9e/cmd/openshift-install/create.go#L133
[2]: https://bugzilla.redhat.com/show_bug.cgi?id=1949267#c3
[3]: https://bugzilla.redhat.com/show_bug.cgi?id=1990916
[4]: https://prow.ci.openshift.org/view/gs/origin-ci-test/logs/periodic-ci-openshift-release-master-nightly-4.9-e2e-azure/1423392049742221312
[5]: https://gcsweb-ci.apps.ci.l2s4.p1.openshiftapps.com/gcs/origin-ci-test/logs/periodic-ci-openshift-release-master-nightly-4.9-e2e-azure/1423392049742221312/artifacts/e2e-azure/ipi-install-install/artifacts/.openshift_install.log
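As a complementary sketch, and not the change made in this pull request (which defaults the variable to empty), a calling script could defensively translate boolean-looking strings into the set/unset convention before invoking the installer. The function name `normalize_preserve_bootstrap` and the list of accepted truthy spellings are assumptions made for this example.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper for illustration; not part of this pull request.

# Keep OPENSHIFT_INSTALL_PRESERVE_BOOTSTRAP set only for truthy values and
# unset it for everything else, including "false", "0", and the empty string.
# The truthy spellings accepted here are an assumption for this sketch.
normalize_preserve_bootstrap() {
  case "${OPENSHIFT_INSTALL_PRESERVE_BOOTSTRAP-}" in
    true|TRUE|True|1|yes)
      export OPENSHIFT_INSTALL_PRESERVE_BOOTSTRAP=true
      ;;
    *)
      unset OPENSHIFT_INSTALL_PRESERVE_BOOTSTRAP
      ;;
  esac
}

# Example: a "false" value ends up unset, so a presence check sees nothing.
OPENSHIFT_INSTALL_PRESERVE_BOOTSTRAP=false
normalize_preserve_bootstrap
echo "after normalizing 'false': '${OPENSHIFT_INSTALL_PRESERVE_BOOTSTRAP-}'"
```

Defaulting to empty, as this commit does, is the simpler fix; a wrapper like this only matters if callers might still pass `false` explicitly.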
@wking wking force-pushed the unset-preserve-bootstrap-on-false branch from b909290 to 9a5d667 Compare August 7, 2021 17:53
@wking
Member Author

wking commented Aug 7, 2021

/bugzilla refresh

@openshift-ci openshift-ci bot added bugzilla/severity-high Referenced Bugzilla bug's severity is high for the branch this PR is targeting. bugzilla/valid-bug Indicates that a referenced Bugzilla bug is valid for the branch this PR is targeting. and removed bugzilla/severity-unspecified Referenced Bugzilla bug's severity is unspecified for the PR. bugzilla/invalid-bug Indicates that a referenced Bugzilla bug is invalid for the branch this PR is targeting. labels Aug 7, 2021
@openshift-ci
Contributor

openshift-ci bot commented Aug 7, 2021

@wking: This pull request references Bugzilla bug 1990916, which is valid. The bug has been moved to the POST state. The bug has been updated to refer to the pull request using the external bug tracker.

3 validation(s) were run on this bug
  • bug is open, matching expected state (open)
  • bug target release (4.9.0) matches configured target release for branch (4.9.0)
  • bug is in the state ASSIGNED, which is one of the valid states (NEW, ASSIGNED, ON_DEV, POST, POST)

No GitHub users were found matching the public email listed for the QA contact in Bugzilla (liyao@redhat.com), skipping review request.

In response to this:

/bugzilla refresh

@wking
Member Author

wking commented Aug 7, 2021

CC @osherdp, @deads2k

@openshift-ci openshift-ci bot added the lgtm Indicates that a PR is ready to be merged. label Aug 7, 2021
@openshift-ci
Contributor

openshift-ci bot commented Aug 7, 2021

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: osherdp, wking

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci openshift-ci bot merged commit 21e7b60 into openshift:master Aug 7, 2021
@openshift-ci
Contributor

openshift-ci bot commented Aug 7, 2021

@wking: All pull requests linked via external trackers have merged:

Bugzilla bug 1990916 has been moved to the MODIFIED state.

In response to this:

Bug 1990916: ci-operator/step-registry/ipi/install/install: Default OPENSHIFT_INSTALL_PRESERVE_BOOTSTRAP to empty

@openshift-ci
Contributor

openshift-ci bot commented Aug 7, 2021

@wking: Updated the step-registry configmap in namespace ci at cluster app.ci using the following files:

  • key ipi-install-install-ref.yaml using file ci-operator/step-registry/ipi/install/install/ipi-install-install-ref.yaml
  • key ipi-install-install-stableinitial-ref.yaml using file ci-operator/step-registry/ipi/install/install/stableinitial/ipi-install-install-stableinitial-ref.yaml

@osherdp
Contributor

osherdp commented Aug 7, 2021

@wking sorry for that
I'm too used to boolean values being stored as true/false strings 😅

Labels
  • approved: Indicates a PR has been approved by an approver from all required OWNERS files.
  • bugzilla/severity-high: Referenced Bugzilla bug's severity is high for the branch this PR is targeting.
  • bugzilla/valid-bug: Indicates that a referenced Bugzilla bug is valid for the branch this PR is targeting.
  • lgtm: Indicates that a PR is ready to be merged.