[release-4.9] Bug 2060111: Set Upgradeable=False if default cert has no SAN #711

@Miciah Miciah commented Mar 2, 2022

This is a manual cherry-pick of #708 with an additional fix for an issue introduced in #709 (see #709 (comment)).


status: Use GetPlatformStatus

Don't use infraConfig.Status.PlatformStatus directly; infraConfig.Status could be nil.

Follow-up to #709.

  • pkg/operator/controller/ingress/status.go: Use GetPlatformStatus.

Set Upgradeable=False if default cert has no SAN

If an ingresscontroller's default certificate has a Common Name (CN) for the ingress domain but has no Subject Alternative Name (SAN) for the same, report that the cluster cannot be upgraded by setting the Upgradeable=False status condition on the ingresscontroller and clusteroperator.

Clients built using Go 1.17 reject certificates without SANs. OpenShift 4.10 is built using Go 1.17, which means that various operators that connect to routes that use the default certificate would reject the certificate and fail to complete the TLS handshake after upgrading to OpenShift 4.10 if the ingress operator didn't block the upgrade on a cluster with a problematic certificate.
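As a rough illustration of the rejection described above, the following sketch builds two throwaway self-signed certificates and calls the standard library's `VerifyHostname` on each. The `selfSigned` helper and the `apps.example.com` names are assumptions made up for this example; they are not taken from the operator.

```go
package main

import (
	"crypto/ecdsa"
	"crypto/elliptic"
	"crypto/rand"
	"crypto/x509"
	"crypto/x509/pkix"
	"fmt"
	"math/big"
	"time"
)

// selfSigned is a hypothetical helper that returns a parsed self-signed
// certificate with the given Common Name and DNS SANs.
func selfSigned(cn string, dnsNames []string) (*x509.Certificate, error) {
	key, err := ecdsa.GenerateKey(elliptic.P256(), rand.Reader)
	if err != nil {
		return nil, err
	}
	tmpl := &x509.Certificate{
		SerialNumber: big.NewInt(1),
		Subject:      pkix.Name{CommonName: cn},
		DNSNames:     dnsNames, // nil means no SAN extension is emitted
		NotBefore:    time.Now().Add(-time.Hour),
		NotAfter:     time.Now().Add(time.Hour),
	}
	der, err := x509.CreateCertificate(rand.Reader, tmpl, tmpl, &key.PublicKey, key)
	if err != nil {
		return nil, err
	}
	return x509.ParseCertificate(der)
}

func main() {
	const wildcard = "*.apps.example.com"     // assumed ingress domain
	const host = "console.apps.example.com" // assumed route host

	cnOnly, err := selfSigned(wildcard, nil)
	if err != nil {
		panic(err)
	}
	withSAN, err := selfSigned(wildcard, []string{wildcard})
	if err != nil {
		panic(err)
	}

	// With a Go 1.17 or later toolchain, hostname verification only
	// consults SANs, so the CN-only certificate is expected to fail.
	fmt.Println("CN only:   ", cnOnly.VerifyHostname(host))
	fmt.Println("CN and SAN:", withSAN.VerifyHostname(host))
}
```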

  • pkg/operator/controller/ingress/status.go (syncIngressControllerStatus): Get the default certificate secret and pass it to computeIngressUpgradeableCondition.
    (computeIngressUpgradeableCondition): Add a parameter for the default certificate secret. Use the argument value to call the new checkDefaultCertificate function to check for problematic certificates.
    (checkDefaultCertificate): New function. Return a non-nil error value if the default certificate in the provided secret has a CN for the ingress domain and no SAN for the same.
  • pkg/operator/controller/ingress/status_test.go (TestComputeIngressUpgradeableCondition): Verify that computeIngressUpgradeableCondition reports Upgradeable=False if the default certificate has a CN and no SAN for the ingress domain and reports Upgradeable=True otherwise.

Don't use infraConfig.Status.PlatformStatus directly; infraConfig.Status
could be nil.

Follow-up to commit 9d4a41b.

* pkg/operator/controller/ingress/status.go: Use GetPlatformStatus.
@openshift-ci openshift-ci bot added bugzilla/severity-urgent Referenced Bugzilla bug's severity is urgent for the branch this PR is targeting. bugzilla/invalid-bug Indicates that a referenced Bugzilla bug is invalid for the branch this PR is targeting. labels Mar 2, 2022
@openshift-ci
Contributor

openshift-ci bot commented Mar 2, 2022

@Miciah: This pull request references Bugzilla bug 2060111, which is invalid:

  • expected dependent Bugzilla bug 2059210 to be in one of the following states: VERIFIED, RELEASE_PENDING, CLOSED (ERRATA), CLOSED (CURRENTRELEASE), but it is POST instead

Comment /bugzilla refresh to re-evaluate validity if changes to the Bugzilla bug are made, or edit the title of this pull request to link to a different bug.

In response to this:

[release-4.9] Bug 2060111: Set Upgradeable=False if default cert has no SAN

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@openshift-ci openshift-ci bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Mar 2, 2022
@Miciah Miciah force-pushed the cherry-pick-710-to-release-4.9 branch from f305946 to 082ddfb Compare March 2, 2022 22:09
@Miciah
Contributor Author

Miciah commented Mar 3, 2022

TestConfigurableRouteNoConsumingUserNoRBAC failed; the same test failed on #707, so it looks like the test may be flaky.

/test e2e-aws-operator

@melvinjoseph86

Verified via pre-merge verification workflow, more references related to the test can be found in:
https://bugzilla.redhat.com/show_bug.cgi?id=2060111#c3
/label qe-approved

@openshift-ci openshift-ci bot added the qe-approved Signifies that QE has signed off on this PR label Mar 3, 2022
@melvinjoseph86

/test e2e-gcp-serial
/test e2e-aws-operator

@melvinjoseph86

/test e2e-aws-operator

@Miciah
Contributor Author

Miciah commented Mar 3, 2022

Must-gather failed.
/test e2e-aws-operator

@Miciah
Contributor Author

Miciah commented Mar 3, 2022

/assign @candita

@Miciah
Contributor Author

Miciah commented Mar 3, 2022

Installer failed:

level=error msg=Error: error updating LB Target Group (arn:aws:elasticloadbalancing:us-east-1:460538899914:targetgroup/ci-op-9ddchgy8-265e5-qlmhd-aint/638dbe06869a2be6) tags: error tagging resource (arn:aws:elasticloadbalancing:us-east-1:460538899914:targetgroup/ci-op-9ddchgy8-265e5-qlmhd-aint/638dbe06869a2be6): TargetGroupNotFound: Target groups 'arn:aws:elasticloadbalancing:us-east-1:460538899914:targetgroup/ci-op-9ddchgy8-265e5-qlmhd-aint/638dbe06869a2be6' not found
level=error msg=	status code: 400, request id: ca03fac1-2eea-451c-9a63-3d90143670ac
level=error
level=error msg=  on ../tmp/openshift-install-cluster-393099372/vpc/master-elb.tf line 45, in resource "aws_lb_target_group" "api_internal":
level=error msg=  45: resource "aws_lb_target_group" "api_internal" {
level=error
level=error
level=fatal msg=failed to fetch Cluster: failed to generate asset "Cluster": failed to create cluster: failed to apply Terraform: failed to complete the change 

/test e2e-aws-operator

			foundSAN = true
		}
	}
	if cert.Subject.CommonName == domain && !foundSAN {
Contributor

Is there always going to be a common name, even if there is a SAN?

Contributor Author

No, but what we really care about is the situation where a serving certificate is accepted by older clients and rejected by newer clients. Checking that the CN matches the ingresscontroller domain and that no SAN does is the best approach I've come up with to warn when there is a problem without causing false positives. Consider these four cases:

  • The certificate has no SAN and no CN; such a certificate wouldn't work at all as a serving certificate even for older clients, so we should ignore it.
  • The certificate has no CN and does have a SAN; this is fine: if the SAN is valid, the certificate will be accepted by both old and new clients (and if the SAN is invalid, it won't be accepted by either), so we should ignore it.
  • The certificate has a CN that doesn't match the ingresscontroller domain; in this case, the certificate is presumably not the serving certificate but rather an intermediate. In any case, as with the previous case, either it has a valid SAN, in which case both old and new clients will accept it, or it does not, in which case both will reject it, so we should ignore it.
  • The certificate has a CN that does match the ingresscontroller domain; in this case, the certificate most likely is the serving certificate, and we need to make sure it has a SAN with the same domain in order to still be accepted by newer Go clients.

The last case is really the only one that matters for upgrades, so it is the only one where we set Upgradeable=False if we find a mismatch.
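The four cases above can be written out as a small table, shown here as a hedged sketch over plain strings rather than parsed certificates. The `blocksUpgrade` predicate, the case labels, and the example domain are illustrative assumptions, not code from this PR.

```go
package main

import "fmt"

// blocksUpgrade reports whether a certificate with the given CN and SANs
// should block an upgrade: the CN matches the domain and no SAN does.
func blocksUpgrade(cn string, sans []string, domain string) bool {
	if cn != domain {
		return false // cases 1-3: no CN, or a CN for something else
	}
	for _, san := range sans {
		if san == domain {
			return false // case 4 with a matching SAN: fine
		}
	}
	return true // case 4 without a matching SAN: the problem case
}

func main() {
	const domain = "*.apps.example.com" // assumed ingress domain

	cases := []struct {
		name string
		cn   string
		sans []string
	}{
		{"no CN, no SAN", "", nil},
		{"no CN, has SAN", "", []string{domain}},
		{"CN for another name", "Example Intermediate CA", nil},
		{"CN matches, SAN matches", domain, []string{domain}},
		{"CN matches, no SAN", domain, nil},
	}

	for _, c := range cases {
		fmt.Printf("%-26s blocks upgrade: %t\n", c.name, blocksUpgrade(c.cn, c.sans, domain))
	}
}
```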

Comment on lines 587 to 588
// <https://bugzilla.redhat.com/show_bug.cgi?id=2057762>. This check can be
// removed after OpenShift 4.9.
Contributor

This is in 4.9, right?

Suggested change
- // <https://bugzilla.redhat.com/show_bug.cgi?id=2057762>. This check can be
- // removed after OpenShift 4.9.
+ // <https://bugzilla.redhat.com/show_bug.cgi?id=2057762>.

Contributor Author

Yes, this is the 4.9 backport. Usually I keep the backport as close as possible to the original change (in this case, #708) to minimize risk of breaking something and to simplify future backports.

Contributor

I was thinking that if someone saw "This check can be removed after OpenShift 4.9." that they might remove it in a later 4.9 version. Or it would at least cause some questions.

Contributor Author

Hm, I guess someone could interpret "after OpenShift 4.9" as including newer z-stream releases of 4.9. Maybe I should have said "in OpenShift 4.10 or later" in the original commit. However, even if someone interpreted "after OpenShift 4.9" in that way, I think the risk that someone would remove this in a 4.9 z-stream is low since we have extra levels of review and scrutiny for backports.

Contributor Author

I've amended the comment.

@candita
Contributor

candita commented Mar 4, 2022

level=error msg=Bootstrap failed to complete: Get "https://api.ci-op-5g2hxix6-265e5.origin-ci-int-aws.dev.rhcloud.com:6443/version?timeout=32s": dial tcp 23.22.248.35:6443: connect: connection refused
...
level=info msg=Pull failed. Retrying registry.build02.ci.openshift.org/ci-op-5g2hxix6/release@sha256:745b411dc13d6fe198c528e91f77c8c1be6f108e7d9a99283bc84ea5464dfedc...
level=info msg=Error: Error initializing source docker://registry.build02.ci.openshift.org/ci-op-5g2hxix6/release@sha256:745b411dc13d6fe198c528e91f77c8c1be6f108e7d9a99283bc84ea5464dfedc: Error reading manifest sha256:745b411dc13d6fe198c528e91f77c8c1be6f108e7d9a99283bc84ea5464dfedc in registry.build02.ci.openshift.org/ci-op-5g2hxix6/release: unauthorized: authentication required
level=fatal msg=Bootstrap failed to complete

/retest

If an ingresscontroller's default certificate has a Common Name (CN) for
the ingress domain but has no Subject Alternative Name (SAN) for the same,
report that the cluster cannot be upgraded by setting the Upgradeable=False
status condition on the ingresscontroller and clusteroperator.

Clients built using Go 1.17 reject certificates without SANs.  OpenShift
4.10 is built using Go 1.17, which means that various operators that
connect to routes that use the default certificate would reject the
certificate and fail to complete the TLS handshake after upgrading to
OpenShift 4.10 if the ingress operator didn't block the upgrade on a
cluster with a problematic certificate.

The commit is related to bug 2057762.

https://bugzilla.redhat.com/show_bug.cgi?id=2057762

* pkg/operator/controller/ingress/status.go (syncIngressControllerStatus):
Get the default certificate secret and pass it to
computeIngressUpgradeableCondition.
(computeIngressUpgradeableCondition): Add a parameter for the default
certificate secret.  Use the argument value to call the new
checkDefaultCertificate function to check for problematic certificates.
(checkDefaultCertificate): New function.  Return a non-nil error value if
the default certificate in the provided secret has a CN for the ingress
domain and no SAN for the same.
* pkg/operator/controller/ingress/status_test.go
(TestComputeIngressUpgradeableCondition): Verify that
computeIngressUpgradeableCondition reports Upgradeable=False if the default
certificate has a CN and no SAN for the ingress domain and reports
Upgradeable=True otherwise.
@Miciah Miciah force-pushed the cherry-pick-710-to-release-4.9 branch from 082ddfb to ffef526 Compare March 4, 2022 18:36
@candita
Contributor

candita commented Mar 4, 2022

/lgtm

@openshift-ci openshift-ci bot added the lgtm Indicates that a PR is ready to be merged. label Mar 4, 2022
@openshift-ci
Contributor

openshift-ci bot commented Mar 4, 2022

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: candita, Miciah

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-bot
Contributor

/retest-required

Please review the full test history for this PR and help us cut down flakes.

@openshift-bot
Contributor

/bugzilla refresh

Recalculating validity in case the underlying Bugzilla bug has changed.

@openshift-ci
Contributor

openshift-ci bot commented Mar 10, 2022

@openshift-bot: This pull request references Bugzilla bug 2060111, which is invalid:

  • expected dependent Bugzilla bug 2059210 to be in one of the following states: VERIFIED, RELEASE_PENDING, CLOSED (ERRATA), CLOSED (CURRENTRELEASE), but it is POST instead

Comment /bugzilla refresh to re-evaluate validity if changes to the Bugzilla bug are made, or edit the title of this pull request to link to a different bug.

In response to this:

/bugzilla refresh

Recalculating validity in case the underlying Bugzilla bug has changed.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.


@melvinjoseph86

/label cherry-pick-approved

@openshift-ci openshift-ci bot added the cherry-pick-approved Indicates a cherry-pick PR into a release branch has been approved by the release branch manager. label Mar 16, 2022
@lihongan
Contributor

/bugzilla refresh

@openshift-ci openshift-ci bot added bugzilla/valid-bug Indicates that a referenced Bugzilla bug is valid for the branch this PR is targeting. and removed bugzilla/invalid-bug Indicates that a referenced Bugzilla bug is invalid for the branch this PR is targeting. labels Mar 17, 2022
@openshift-ci
Contributor

openshift-ci bot commented Mar 17, 2022

@lihongan: This pull request references Bugzilla bug 2060111, which is valid.

6 validation(s) were run on this bug
  • bug is open, matching expected state (open)
  • bug target release (4.9.z) matches configured target release for branch (4.9.z)
  • bug is in the state POST, which is one of the valid states (NEW, ASSIGNED, ON_DEV, POST, POST)
  • dependent bug Bugzilla bug 2059210 is in the state VERIFIED, which is one of the valid states (VERIFIED, RELEASE_PENDING, CLOSED (ERRATA), CLOSED (CURRENTRELEASE))
  • dependent Bugzilla bug 2059210 targets the "4.10.0" release, which is one of the valid target releases: 4.10.0, 4.10.z
  • bug has dependents

Requesting review from QA contact:
/cc @melvinjoseph86

In response to this:

/bugzilla refresh

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@openshift-bot
Contributor

/retest-required

Please review the full test history for this PR and help us cut down flakes.

@melvinjoseph86

/test e2e-aws-operator

@openshift-bot
Contributor

/retest-required

Please review the full test history for this PR and help us cut down flakes.

@openshift-ci
Contributor

openshift-ci bot commented Mar 17, 2022

@Miciah: all tests passed!

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@openshift-merge-robot openshift-merge-robot merged commit 19c161a into openshift:release-4.9 Mar 17, 2022
@openshift-ci
Contributor

openshift-ci bot commented Mar 17, 2022

@Miciah: All pull requests linked via external trackers have merged:

Bugzilla bug 2060111 has been moved to the MODIFIED state.

In response to this:

[release-4.9] Bug 2060111: Set Upgradeable=False if default cert has no SAN

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. backport-risk-assessed Indicates a PR to a release branch has been evaluated and considered safe to accept. bugzilla/severity-urgent Referenced Bugzilla bug's severity is urgent for the branch this PR is targeting. bugzilla/valid-bug Indicates that a referenced Bugzilla bug is valid for the branch this PR is targeting. cherry-pick-approved Indicates a cherry-pick PR into a release branch has been approved by the release branch manager. lgtm Indicates that a PR is ready to be merged. qe-approved Signifies that QE has signed off on this PR

6 participants