OCPBUGS-26762: Disable DNS resolving for CNO #3986

kyrtapz · 2024-05-06T10:30:26Z

CNO uses the konnectivity-socks5-proxy to perform cluster-wide-proxy readiness checks through the hosted cluster's network.
The cluster-wide-proxy address it uses should not be resolved to avoid double-proxy issues.

openshift-ci-robot · 2024-05-06T10:30:31Z

@kyrtapz: This pull request references Jira Issue OCPBUGS-26762, which is invalid:

expected the bug to target the "4.16.0" version, but no target version was set

Comment /jira refresh to re-evaluate validity if changes to the Jira bug are made, or edit the title of this pull request to link to a different bug.

The bug has been updated to refer to the pull request using the external bug tracker.

In response to this:

CNO uses the konnectivity-socks5-proxy to perform cluster-wide-proxy readiness checks through the hosted cluster's network.
The cluster-wide-proxy address it uses should not be resolved to avoid double-proxy issues.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

kyrtapz · 2024-05-06T10:35:06Z

/jira refresh

openshift-ci-robot · 2024-05-06T10:35:12Z

@kyrtapz: This pull request references Jira Issue OCPBUGS-26762, which is valid. The bug has been moved to the POST state.

3 validation(s) were run on this bug

bug is open, matching expected state (open)
bug target version (4.16.0) matches configured target version for branch (4.16.0)
bug is in the state ASSIGNED, which is one of the valid states (NEW, ASSIGNED, POST)

Requesting review from QA contact:
/cc @anuragthehatter

In response to this:

/jira refresh

kyrtapz · 2024-05-06T13:51:47Z

/retest

stevekuznetsov · 2024-05-06T16:46:30Z

/lgtm

kyrtapz · 2024-05-06T17:44:17Z

/cc @sjenning

enxebre · 2024-05-08T08:23:37Z

konnectivity-socks5-proxy/main.go

@@ -265,7 +269,7 @@ type proxyResolver struct {

 func (d proxyResolver) Resolve(ctx context.Context, name string) (context.Context, net.IP, error) {
 	// Preserve the host so we can recognize it
-	if isCloudAPI(name) {
+	if isCloudAPI(name) || d.disableResolver {


This won't perform DNS resolution, but dialing would still go through the guest cluster, which preserves the issue described in the Jira. What am I missing?

To overcome the issue you could use resolve-from-management-cluster-dns, causing dialing to happen through the management cluster, but wouldn't that defeat the purpose of the health check?
Has the CNO changed the health check approach, e.g. to run a pod on the guest cluster or similar?

Also, is there a way we can include health checking as part of our TestCreateClusterProxy validations?

kyrtapz · 2024-05-08

> This won't perform DNS resolution, but dialing would still go through the guest cluster, which preserves the issue described in the Jira. What am I missing?

That is intentional. The readiness check has to be performed from the guest cluster, but we want to dial the exact cluster-wide-proxy address.
I know the issue is confusing, so let's consider what was happening before:

1. CNO dials the cluster-wide-proxy FQDN using the local socks5 proxy.
2. The socks5 proxy resolves the FQDN to an IP address and sends that IP to the guest cluster through konnectivity.
3. The konnectivity agent in the guest cluster has the cluster-wide-proxy FQDN in no_proxy, but the cluster-wide-proxy IP is not there. It tries to dial the cluster-wide-proxy IP through the configured proxy, and the proxy server rejects the connection.
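The mismatch in the last step can be illustrated with a small Go sketch. It is a simplified stand-in for no_proxy matching (exact and domain-suffix entries only, no CIDR or port handling), not the logic the konnectivity agent actually uses; `bypassesProxy` and the example hostnames and addresses are hypothetical.

```go
package main

import (
	"fmt"
	"strings"
)

// bypassesProxy reports whether host matches an entry in a comma-separated
// no_proxy list. Hypothetical and simplified: only exact host matches and
// domain-suffix matches are handled.
func bypassesProxy(host, noProxy string) bool {
	host = strings.ToLower(strings.TrimSuffix(host, "."))
	for _, entry := range strings.Split(noProxy, ",") {
		entry = strings.ToLower(strings.TrimSpace(entry))
		entry = strings.TrimPrefix(entry, ".")
		if entry == "" {
			continue
		}
		if host == entry || strings.HasSuffix(host, "."+entry) {
			return true
		}
	}
	return false
}

func main() {
	// Assumed example values: the proxy FQDN is listed, its IP is not.
	noProxy := "proxy.example.com,.svc,localhost"

	// The FQDN is listed, so the agent would dial it directly.
	fmt.Println(bypassesProxy("proxy.example.com", noProxy))

	// The resolved IP is not listed, so the dial would be sent through the proxy.
	fmt.Println(bypassesProxy("192.0.2.10", noProxy))
}
```

With these assumed values the first call reports a match and the second does not, which is the situation the pre-resolved IP runs into.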

With this change, the behavior becomes:

1. CNO dials the cluster-wide-proxy FQDN using the local socks5 proxy.
2. The socks5 proxy doesn't resolve the FQDN and sends it to the guest cluster through konnectivity unchanged.
3. The konnectivity agent in the guest cluster has the cluster-wide-proxy FQDN in no_proxy, so it dials the FQDN directly.

enxebre · 2024-05-08T09:49:07Z

/lgtm

sjenning · 2024-05-08T14:08:04Z

/approve

openshift-ci · 2024-05-08T14:08:20Z

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: kyrtapz, sjenning

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

~~OWNERS~~ [sjenning]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

kyrtapz · 2024-05-08T15:50:18Z

/retest

sjenning · 2024-05-08T16:36:23Z

/lgtm

openshift-ci-robot · 2024-05-08T18:08:42Z

/retest-required

Remaining retests: 0 against base HEAD 30fc457 and 2 for PR HEAD a41d9a7 in total

openshift-ci-robot · 2024-05-08T20:08:46Z

/retest-required

Remaining retests: 0 against base HEAD 37b99f5 and 1 for PR HEAD a41d9a7 in total

sjenning · 2024-05-08T21:02:05Z

/retest-required

openshift-ci · 2024-05-09T01:46:50Z

@kyrtapz: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name	Commit	Details	Required	Rerun command
ci/prow/e2e-azure	`a41d9a7`	link	false	`/test e2e-azure`

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

openshift-ci-robot · 2024-05-09T01:50:06Z

@kyrtapz: Jira Issue OCPBUGS-26762: All pull requests linked via external trackers have merged:

openshift/hypershift#3986

Jira Issue OCPBUGS-26762 has been moved to the MODIFIED state.

In response to this:

CNO uses the konnectivity-socks5-proxy to perform cluster-wide-proxy readiness checks through the hosted cluster's network.
The cluster-wide-proxy address it uses should not be resolved to avoid double-proxy issues.

kyrtapz · 2024-05-10T12:36:26Z

/cherry-pick release-4.15 release-4.14

openshift-cherrypick-robot · 2024-05-10T12:37:09Z

@kyrtapz: new pull request created: #4015

In response to this:

/cherry-pick release-4.15 release-4.14

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

openshift-merge-robot · 2024-05-14T16:58:18Z

Fix included in accepted release 4.16.0-0.nightly-2024-05-14-095225

openshift-ci-robot added the jira/severity-moderate (Referenced Jira bug's severity is moderate for the branch this PR is targeting.) and jira/valid-reference (Indicates that this PR references a valid Jira ticket of any type.) labels May 6, 2024

openshift-ci-robot added the jira/invalid-bug label (Indicates that a referenced Jira bug is invalid for the branch this PR is targeting.) May 6, 2024

openshift-ci bot added the do-not-merge/needs-area label May 6, 2024

openshift-ci bot requested review from csrwng and enxebre May 6, 2024 10:30

openshift-ci bot added the area/control-plane-operator label (Indicates the PR includes changes for the control plane operator - in an OCP release) and removed the do-not-merge/needs-area label May 6, 2024

openshift-ci-robot added the jira/valid-bug label (Indicates that a referenced Jira bug is valid for the branch this PR is targeting.) and removed the jira/invalid-bug label (Indicates that a referenced Jira bug is invalid for the branch this PR is targeting.) May 6, 2024

openshift-ci bot requested a review from anuragthehatter May 6, 2024 10:35

openshift-ci bot assigned stevekuznetsov May 6, 2024

openshift-ci bot added the lgtm label (Indicates that a PR is ready to be merged.) May 6, 2024

openshift-ci bot requested a review from sjenning May 6, 2024 17:44

enxebre reviewed May 7, 2024

kyrtapz force-pushed the disable_proxy_dns branch from 5511984 to 7acc542 Compare May 7, 2024 07:58

openshift-ci bot removed the lgtm label (Indicates that a PR is ready to be merged.) May 7, 2024

enxebre reviewed May 8, 2024

openshift-ci bot assigned enxebre May 8, 2024

openshift-ci bot added the lgtm label (Indicates that a PR is ready to be merged.) May 8, 2024

openshift-ci bot added the approved label (Indicates a PR has been approved by an approver from all required OWNERS files.) May 8, 2024

kyrtapz added 2 commits May 8, 2024 18:11

Add an option to globally disable the DNS resolution in konnectivity-socks5-proxy

f03aef9

Signed-off-by: Patryk Diak <pdiak@redhat.com>

Disable DNS resolution in konnectivity-socks5-proxy for CNO

a41d9a7

CNO uses the konnectivity proxy to perform cluster wide proxy readiness checks. It should always connect to the exact address and not to the IP to avoid double proxy issues.

Signed-off-by: Patryk Diak <pdiak@redhat.com>

kyrtapz force-pushed the disable_proxy_dns branch from 7acc542 to a41d9a7 Compare May 8, 2024 16:20

openshift-ci bot removed the lgtm label (Indicates that a PR is ready to be merged.) May 8, 2024

openshift-ci bot assigned sjenning May 8, 2024

openshift-ci bot added the lgtm label (Indicates that a PR is ready to be merged.) May 8, 2024

openshift-merge-bot bot merged commit c8af049 into openshift:main May 9, 2024
12 of 13 checks passed

openshift-cherrypick-robot mentioned this pull request May 10, 2024

[release-4.15] OCPBUGS-33526: Disable DNS resolving for CNO #4015

Closed

kyrtapz mentioned this pull request Jun 4, 2024

[release-4.15] OCPBUGS-33526: Disable DNS resolving for CNO #4148

Merged
