
Bug 1989055: logins to the web console fail with custom oauth cert #571

Conversation

@florkbr (Contributor) commented Aug 3, 2021

The cluster-authentication-operator was recently updated to publish
custom certs to a managed config map `oauth-serving-cert`. The console
needs to trust this new cert before logins will work properly with
custom certs.

See openshift/cluster-authentication-operator#464

https://bugzilla.redhat.com/show_bug.cgi?id=1989055
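The fix amounts to adding the published CA bundle to the console's trust store. A minimal stdlib-only Go sketch of that idea, assuming the PEM data has already been read from the `oauth-serving-cert` ConfigMap (the helper name is hypothetical; the actual operator wires the ConfigMap into the console deployment rather than calling a function like this):

```go
package main

import (
	"crypto/x509"
	"fmt"
)

// buildTrustPool extends the system trust store with a custom OAuth CA
// bundle, e.g. the PEM bytes published in the oauth-serving-cert ConfigMap.
// Hypothetical sketch, not the console-operator's actual code.
func buildTrustPool(customCAPEM []byte) (*x509.CertPool, error) {
	pool, err := x509.SystemCertPool()
	if err != nil {
		// Fall back to an empty pool if the system store is unavailable.
		pool = x509.NewCertPool()
	}
	if len(customCAPEM) > 0 {
		if !pool.AppendCertsFromPEM(customCAPEM) {
			return nil, fmt.Errorf("no valid certificates in custom CA bundle")
		}
	}
	return pool, nil
}

func main() {
	if _, err := buildTrustPool(nil); err != nil {
		fmt.Println("unexpected error:", err)
		return
	}
	fmt.Println("trust pool built")
}
```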

@florkbr (Contributor, Author) commented Aug 3, 2021

/hold need to investigate if this breaks non-oauth logins and if there are any potential RBAC issues

@openshift-ci openshift-ci bot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Aug 3, 2021
@florkbr florkbr force-pushed the 1989055-web-console-login-fails-with-custom-cert branch from f81d0d3 to 59e0ca0 on August 3, 2021 14:57
openshift-ci bot (Contributor) commented Aug 3, 2021

@florkbr: An error was encountered querying GitHub for users with public email (yapei@redhat.com) for bug 1989055 on the Bugzilla server at https://bugzilla.redhat.com. No known errors were detected, please see the full error message for details.

Full error message. non-200 OK status code: 403 Forbidden body: "{\n \"documentation_url\": \"https://docs.github.com/en/free-pro-team@latest/rest/overview/resources-in-the-rest-api#abuse-rate-limits\",\n \"message\": \"You have triggered an abuse detection mechanism. Please wait a few minutes before you try again.\"\n}\n"

Please contact an administrator to resolve this issue, then request a bug refresh with /bugzilla refresh.

In response to this:

Bug 1989055: logins to the web console fail with custom oauth cert

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

Review thread on pkg/api/api.go (outdated, resolved)
@openshift-ci openshift-ci bot requested review from jhadvig and spadgett August 3, 2021 14:58
@florkbr florkbr force-pushed the 1989055-web-console-login-fails-with-custom-cert branch from 59e0ca0 to 46e1797 on August 3, 2021 15:10
@spadgett (Member) commented Aug 3, 2021

/cc @stlaz

@openshift-ci openshift-ci bot requested a review from stlaz August 3, 2021 16:41
@florkbr florkbr force-pushed the 1989055-web-console-login-fails-with-custom-cert branch 2 times, most recently from 4fcb0aa to ab2f46e on August 3, 2021 16:44
@spadgett (Member) left a comment

/lgtm

Thanks @florkbr

@openshift-ci openshift-ci bot added lgtm Indicates that a PR is ready to be merged. approved Indicates a PR has been approved by an approver from all required OWNERS files. labels Aug 3, 2021
@florkbr florkbr force-pushed the 1989055-web-console-login-fails-with-custom-cert branch from ab2f46e to 6bad965 on August 3, 2021 16:50
@openshift-ci openshift-ci bot removed the lgtm Indicates that a PR is ready to be merged. label Aug 3, 2021
@spadgett (Member) left a comment

/lgtm

@openshift-ci openshift-ci bot added the lgtm Indicates that a PR is ready to be merged. label Aug 3, 2021
@florkbr florkbr force-pushed the 1989055-web-console-login-fails-with-custom-cert branch from 6bad965 to 8a5ee89 on August 3, 2021 17:04
@openshift-ci openshift-ci bot removed the lgtm Indicates that a PR is ready to be merged. label Aug 3, 2021
@spadgett (Member) left a comment

/lgtm

@openshift-ci openshift-ci bot added the lgtm Indicates that a PR is ready to be merged. label Aug 3, 2021
@florkbr (Contributor, Author) commented Aug 3, 2021

Test image pushed to: `docker pull quay.io/bflorkie/console-operator:08032021`

@spadgett (Member) commented Aug 3, 2021

Upgrade test failed. I wonder if the console operator goes to unavailable if it rolls out before the new oauth-serving-cert config map is there, although there are other degraded operators, too.

https://prow.ci.openshift.org/view/gs/origin-ci-test/pr-logs/pull/openshift_console-operator/571/pull-ci-openshift-console-operator-master-e2e-agnostic-upgrade/1422604521401487360

We should check our handling of available status... I believe if any replicas are available the console should be considered available.

cc @jhadvig
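The availability rule suggested above is simple to state. A hypothetical sketch, not the operator's actual status computation:

```go
package main

import "fmt"

// consoleAvailable sketches the rule described above: the console counts
// as Available as long as at least one replica of a non-empty deployment
// is available, even mid-rollout. Hypothetical helper for illustration.
func consoleAvailable(desiredReplicas, availableReplicas int32) bool {
	return desiredReplicas > 0 && availableReplicas > 0
}

func main() {
	// Mid-upgrade, one of two replicas may be down; the console
	// should still be considered available.
	fmt.Println(consoleAvailable(2, 1))
	fmt.Println(consoleAvailable(2, 0))
}
```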

@florkbr (Contributor, Author) commented Aug 3, 2021

Confirmed this change works with the latest 4.9 build including the changes to the auth-operator:
[Three screenshots attached showing a successful web console login with the custom cert]

@florkbr (Contributor, Author) commented Aug 5, 2021

/retest-required

@openshift-ci openshift-ci bot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Aug 16, 2021
@florkbr (Contributor, Author) commented Aug 23, 2021

/bugzilla refresh

@openshift-ci openshift-ci bot added the bugzilla/severity-high Referenced Bugzilla bug's severity is high for the branch this PR is targeting. label Aug 23, 2021
openshift-ci bot (Contributor) commented Aug 23, 2021

@florkbr: This pull request references Bugzilla bug 1989055, which is valid.

3 validation(s) were run on this bug
  • bug is open, matching expected state (open)
  • bug target release (4.9.0) matches configured target release for branch (4.9.0)
  • bug is in the state POST, which is one of the valid states (NEW, ASSIGNED, ON_DEV, POST, POST)

Requesting review from QA contact:
/cc @yapei

In response to this:

/bugzilla refresh

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@openshift-ci openshift-ci bot added the bugzilla/valid-bug Indicates that a referenced Bugzilla bug is valid for the branch this PR is targeting. label Aug 23, 2021
@openshift-ci openshift-ci bot requested a review from yapei August 23, 2021 15:05
@florkbr (Contributor, Author) commented Aug 23, 2021

/retest-required

@florkbr (Contributor, Author) commented Aug 24, 2021

Flakes around cluster deployment/availability. Retesting.

@florkbr (Contributor, Author) commented Aug 24, 2021

/retest-required

@kdoberst commented

/retest

2 similar comments
@kdoberst commented

/retest

@kdoberst commented

/retest

@openshift-bot (Contributor) commented

/retest-required

Please review the full test history for this PR and help us cut down flakes.

@jhadvig (Member) commented Aug 26, 2021

=== RUN   TestEditUnmanagedConfigMap
    unmanaged_test.go:12: waiting for setup to reach settled state...
    unmanaged_test.go:13: changing console operator state to 'Unmanaged'...
    util.go:28: patching Data on the console ConfigMap
    util.go:35: polling for patched Data on the console ConfigMap
    unmanaged_test.go:41: error: timed out waiting for the condition
    unmanaged_test.go:18: waiting for cleanup to reach settled state...
    console-operator.go:370: waited 10 seconds to reach settled state...
--- FAIL: TestEditUnmanagedConfigMap (63.07s)

/retest

@florkbr (Contributor, Author) commented Aug 26, 2021

=== RUN   TestEditUnmanagedConfigMap
    unmanaged_test.go:12: waiting for setup to reach settled state...
    unmanaged_test.go:13: changing console operator state to 'Unmanaged'...
    util.go:28: patching Data on the console ConfigMap
    util.go:35: polling for patched Data on the console ConfigMap
    unmanaged_test.go:41: error: timed out waiting for the condition
    unmanaged_test.go:18: waiting for cleanup to reach settled state...
    console-operator.go:370: waited 10 seconds to reach settled state...
--- FAIL: TestEditUnmanagedConfigMap (35.08s)

/retest

@florkbr (Contributor, Author) commented Aug 26, 2021

=== RUN   TestEditUnmanagedConfigMap
    unmanaged_test.go:12: waiting for setup to reach settled state...
    unmanaged_test.go:13: changing console operator state to 'Unmanaged'...
    util.go:28: patching Data on the console ConfigMap
    util.go:35: polling for patched Data on the console ConfigMap
    unmanaged_test.go:41: error: timed out waiting for the condition
    unmanaged_test.go:18: waiting for cleanup to reach settled state...
    console-operator.go:370: waited 10 seconds to reach settled state...
--- FAIL: TestEditUnmanagedConfigMap (33.83s)

/retest

@florkbr (Contributor, Author) commented Aug 26, 2021

@jhadvig wondering if I should spend some time investigating these flakes or if we should disable the tests?

@florkbr (Contributor, Author) commented Aug 26, 2021

/retest

@openshift-bot (Contributor) commented

/retest-required

Please review the full test history for this PR and help us cut down flakes.

5 similar comments
@openshift-bot (Contributor) commented

/retest-required

Please review the full test history for this PR and help us cut down flakes.

@openshift-bot (Contributor) commented

/retest-required

Please review the full test history for this PR and help us cut down flakes.

@openshift-bot (Contributor) commented

/retest-required

Please review the full test history for this PR and help us cut down flakes.

@openshift-bot (Contributor) commented

/retest-required

Please review the full test history for this PR and help us cut down flakes.

@openshift-bot (Contributor) commented

/retest-required

Please review the full test history for this PR and help us cut down flakes.

@spadgett (Member) commented

https://prow.ci.openshift.org/view/gs/origin-ci-test/pr-logs/pull/openshift_console-operator/571/pull-ci-openshift-console-operator-master-e2e-aws-operator/1431161472309792768

nodes not ready: node "ip-10-0-129-57.us-east-2.compute.internal" not ready since 2021-08-27 08:36:25 +0000 UTC because NodeStatusUnknown (Kubelet stopped posting node status.)
	clusteroperator/machine-config is not available (Cluster not available for 4.9.0-0.ci.test-2021-08-27-075123-ci-op-jpr5h2ik-latest) because Failed to resync 4.9.0-0.ci.test-2021-08-27-075123-ci-op-jpr5h2ik-latest because: timed out waiting for the condition during waitForDaemonsetRollout: Daemonset machine-config-daemon is not ready. status: (desired: 6, updated: 6, ready: 4, unavailable: 2)
	clusteroperator/monitoring is not available (Rollout of the monitoring stack failed and is degraded. Please investigate the degraded status error.) because Failed to rollout the stack. Error: updating prometheus-k8s: waiting for Prometheus object changes failed: waiting for Prometheus openshift-monitoring/k8s: expected 2 replicas, got 1 updated replicas
	clusteroperator/network is degraded because DaemonSet "openshift-multus/multus" rollout is not making progress - last change 2021-08-27T08:36:26Z
DaemonSet "openshift-multus/multus-additional-cni-plugins" rollout is not making progress - last change 2021-08-27T08:36:26Z
DaemonSet "openshift-sdn/sdn" rollout is not making progress - last change 2021-08-27T08:36:26Z
	clusteroperator/openshift-apiserver is degraded because APIServerDeploymentDegraded: 1 of 3 requested instances are unavailable for apiserver.openshift-apiserver ()
	clusteroperator/storage is progressing: AWSEBSCSIDriverOperatorCRProgressing: AWSEBSDriverNodeServiceControllerProgressing: Waiting for DaemonSet to deploy node pods

Similar errors seem to be affecting a lot of jobs:

https://search.ci.openshift.org/?search=Kubelet+stopped+posting+node+status&maxAge=48h&context=1&type=junit&name=&excludeName=&maxMatches=5&maxBytes=20971520&groupBy=job

@openshift-bot (Contributor) commented

/retest-required

Please review the full test history for this PR and help us cut down flakes.

1 similar comment
@openshift-bot (Contributor) commented

/retest-required

Please review the full test history for this PR and help us cut down flakes.

@openshift-merge-robot openshift-merge-robot merged commit 1d80049 into openshift:master Aug 27, 2021
openshift-ci bot (Contributor) commented Aug 27, 2021

@florkbr: All pull requests linked via external trackers have merged:

Bugzilla bug 1989055 has been moved to the MODIFIED state.

In response to this:

Bug 1989055: logins to the web console fail with custom oauth cert

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. bugzilla/severity-high Referenced Bugzilla bug's severity is high for the branch this PR is targeting. bugzilla/valid-bug Indicates that a referenced Bugzilla bug is valid for the branch this PR is targeting. lgtm Indicates that a PR is ready to be merged.
7 participants