OCPBUGS-26762: Disable DNS resolving for CNO #3986
@kyrtapz: This pull request references Jira Issue OCPBUGS-26762, which is invalid:
Comment The bug has been updated to refer to the pull request using the external bug tracker. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.
/jira refresh
@kyrtapz: This pull request references Jira Issue OCPBUGS-26762, which is valid. The bug has been moved to the POST state. 3 validation(s) were run on this bug
Requesting review from QA contact: In response to this:
/retest
/lgtm
/cc @sjenning
@@ -265,7 +269,7 @@ type proxyResolver struct {

 func (d proxyResolver) Resolve(ctx context.Context, name string) (context.Context, net.IP, error) {
 	// Preserve the host so we can recognize it
-	if isCloudAPI(name) {
+	if isCloudAPI(name) || d.disableResolver {
This won't perform DNS resolution, but dialing would still go through the guest cluster, so the issue described in the Jira would remain. What am I missing?
To overcome the issue you could use resolve-from-management-cluster-dns, which causes dialing to happen through the management cluster, but wouldn't that defeat the purpose of the health check?
Has the CNO changed its health check approach, e.g. to run a pod on the guest cluster or similar?
Also, is there a way we can include health checking as part of our TestCreateClusterProxy validations?
This won't perform DNS resolution, but dialing would still go through the guest cluster, so the issue described in the Jira would remain. What am I missing?
That is intentional. The readiness check has to be performed from the guest cluster but we want to dial to the exact cluster-wide-proxy address.
I know the issue is confusing, so let's consider what was happening before:
- CNO dials the cluster-wide-proxy FQDN using the local socks5 proxy.
- The socks5 proxy resolves the FQDN to an IP address and sends that IP to the guest cluster through konnectivity.
- The konnectivity agent in the guest cluster has the cluster-wide-proxy FQDN in no_proxy, but not the cluster-wide-proxy IP. It therefore tries to dial the cluster-wide-proxy IP through the configured proxy, and the proxy server rejects the connection.
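The failure in the last step comes down to no_proxy matching by name: an entry for the FQDN does not cover the IP it resolves to. A minimal sketch of that effect, assuming a deliberately simplified matcher (the `bypassesProxy` helper, the example hostname, and the documentation-range IP are hypothetical and are not taken from konnectivity or this PR):

```go
package main

import (
	"fmt"
	"strings"
)

// bypassesProxy is a simplified, hypothetical NO_PROXY check: a target
// bypasses the proxy if it equals an entry or is a subdomain of one.
// Real implementations also handle CIDRs, ports, and wildcards.
func bypassesProxy(noProxy, target string) bool {
	for _, entry := range strings.Split(noProxy, ",") {
		// Normalize " .example.com" style entries to "example.com".
		entry = strings.TrimPrefix(strings.TrimSpace(entry), ".")
		if entry == "" {
			continue
		}
		if target == entry || strings.HasSuffix(target, "."+entry) {
			return true
		}
	}
	return false
}

func main() {
	// Assumed setup: the proxy FQDN is listed in no_proxy, its IP is not.
	noProxy := "proxy.example.com,.svc"
	for _, target := range []string{"proxy.example.com", "192.0.2.10"} {
		fmt.Printf("%-18s bypasses proxy: %v\n", target, bypassesProxy(noProxy, target))
	}
}
```

With these assumed values the FQDN bypasses the proxy while the resolved IP does not, which is the double-proxy situation described above.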
With this change, the behavior becomes:
- CNO dials the cluster-wide-proxy FQDN using the local socks5 proxy.
- The socks5 proxy doesn't resolve the FQDN and sends it unchanged to the guest cluster through konnectivity.
- The konnectivity agent in the guest cluster has the cluster-wide-proxy FQDN in no_proxy, so it dials the address directly.
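The two flows can be sketched in isolation. This is an illustration under stated assumptions, not the code from the PR: only the `disableResolver` field name and the `Resolve` signature come from the diff; `sketchResolver`, `hostKey`, `dialTarget`, `fakeLookup`, and `targetFor` are hypothetical, and DNS is replaced by a stub so the example runs without network access.

```go
package main

import (
	"context"
	"fmt"
	"net"
)

// hostKey is a hypothetical context key that carries the unresolved name.
type hostKey struct{}

// lookupFunc abstracts DNS resolution so the sketch needs no network.
type lookupFunc func(ctx context.Context, name string) (net.IP, error)

// sketchResolver mimics the shape of proxyResolver from the diff; every
// field other than disableResolver is an assumption for illustration.
type sketchResolver struct {
	disableResolver bool
	lookup          lookupFunc
}

// Resolve returns a nil IP and stores the name in the context when
// resolution is disabled, so the caller can forward the FQDN unchanged.
func (r sketchResolver) Resolve(ctx context.Context, name string) (context.Context, net.IP, error) {
	if r.disableResolver {
		return context.WithValue(ctx, hostKey{}, name), nil, nil
	}
	ip, err := r.lookup(ctx, name)
	return ctx, ip, err
}

// dialTarget picks what would be sent through the tunnel: the preserved
// FQDN if one is present in the context, otherwise the resolved IP.
func dialTarget(ctx context.Context, ip net.IP) string {
	if host, ok := ctx.Value(hostKey{}).(string); ok {
		return host
	}
	return ip.String()
}

// fakeLookup stands in for DNS and returns a documentation-range IP.
func fakeLookup(_ context.Context, _ string) (net.IP, error) {
	return net.ParseIP("192.0.2.10"), nil
}

// targetFor runs a name through the resolver and reports the dial target.
func targetFor(disable bool, name string) (string, error) {
	r := sketchResolver{disableResolver: disable, lookup: fakeLookup}
	ctx, ip, err := r.Resolve(context.Background(), name)
	if err != nil {
		return "", err
	}
	return dialTarget(ctx, ip), nil
}

func main() {
	for _, disable := range []bool{false, true} {
		target, err := targetFor(disable, "proxy.example.com")
		if err != nil {
			fmt.Println("resolve error:", err)
			continue
		}
		fmt.Printf("disableResolver=%v -> %s\n", disable, target)
	}
}
```

With resolution enabled the sketch forwards the stub IP; with it disabled the FQDN is forwarded as-is, which is what lets a name-based no_proxy entry in the guest cluster match.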
/lgtm
/approve
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: kyrtapz, sjenning The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
/retest
…socks5-proxy Signed-off-by: Patryk Diak <pdiak@redhat.com>
CNO uses the konnectivity proxy to perform cluster wide proxy readiness checks. It should always connect to the exact address and not to the IP to avoid double proxy issues. Signed-off-by: Patryk Diak <pdiak@redhat.com>
/lgtm
/retest-required
@kyrtapz: The following test failed, say
Full PR test history. Your PR dashboard. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.
Merged commit c8af049 into openshift:main
@kyrtapz: Jira Issue OCPBUGS-26762: All pull requests linked via external trackers have merged: Jira Issue OCPBUGS-26762 has been moved to the MODIFIED state. In response to this:
/cherry-pick release-4.15 release-4.14
@kyrtapz: new pull request created: #4015 In response to this:
Fix included in accepted release 4.16.0-0.nightly-2024-05-14-095225
CNO uses the konnectivity-socks5-proxy to perform cluster-wide-proxy readiness checks through the hosted cluster's network.
The cluster-wide-proxy address it uses should be passed through without being resolved to an IP, to avoid double-proxy issues.