Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Aggregator uses the regular transport when handling upgrade requests #104985

Merged
merged 1 commit into from Sep 27, 2021

Conversation

caesarxuchao
Copy link
Member

@caesarxuchao caesarxuchao commented Sep 14, 2021

/sig api-machinery
/kind bug
/assign
/release-note none

Currently aggregator creates a SPDY transport to the remote if the request contains upgrade headers. This seems wrong to me:

  • what if the request is trying to upgrade not to SPDY but to websocket?
  • more fundamentally, aggregator as a proxy should only care about the transport level connectivity. Aggregator should create a transport connection to the aggregated apiserver and then stitch it together with the connection between the client and the aggregator, just like how it treats a non-upgrade request.

This PR let's the aggregator use the original normal transport for upgrade requests. As a side-effect, this fixes #71808, which points out that that the SPDY transport doesn't honor the custom dialer in the original transport. The consequence of #71808 is that upgraded requests to aggregated APIs do not work if SSH tunnels or Konnectivity is used. A real life issue we met is that virtctl console(similar to kubectl exec) doesn't work when Konnectivity is used. We verified that this PR fixed the virtctl console use case.

Side note: how upgrade requests for aggregated apis worked with the SPDY transport when there was NO ssh tunnels or konnectivity involved

It worked because SPDY transport was only used here:

func dialURL(ctx context.Context, url *url.URL, transport http.RoundTripper) (net.Conn, error) {
dialAddr := netutil.CanonicalAddr(url)
dialer, err := utilnet.DialerFor(transport)

utilnet.DialerFor will return an empty dialer and error, because it doesn't recognize a SPDY transport. dialURL then uses the default golang net.Dialer. So the SPDY transport never actually gets used.

(Supersedes #104763)

Fixes #71808

@k8s-ci-robot k8s-ci-robot added sig/api-machinery Categorizes an issue or PR as relevant to SIG API Machinery. do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. size/M Denotes a PR that changes 30-99 lines, ignoring generated files. do-not-merge/release-note-label-needed Indicates that a PR should not merge because it's missing one of the release note labels. kind/bug Categorizes issue or PR as related to a bug. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. needs-priority Indicates a PR lacks a `priority/foo` label and requires one. labels Sep 14, 2021
@fedebongio
Copy link
Contributor

/triage accepted

@k8s-ci-robot k8s-ci-robot added triage/accepted Indicates an issue or PR is ready to be actively worked on. and removed needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. labels Sep 14, 2021
@fedebongio
Copy link
Contributor

/cc @deads2k

@caesarxuchao caesarxuchao changed the title [WIP] Aggregator uses the regular transport even if the request requires upgrades \Aggregator uses the regular transport even if the request requires upgrades Sep 17, 2021
@k8s-ci-robot k8s-ci-robot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Sep 17, 2021
@caesarxuchao caesarxuchao changed the title \Aggregator uses the regular transport even if the request requires upgrades Aggregator uses the regular transport even if the request requires upgrades Sep 17, 2021
@@ -178,6 +174,8 @@ func (r *proxyHandler) ServeHTTP(w http.ResponseWriter, req *http.Request) {
}

handler := proxy.NewUpgradeAwareHandler(location, proxyRoundTripper, true, upgrade, &responder{w: w})
handler.InterceptRedirects = utilfeature.DefaultFeatureGate.Enabled(genericfeatures.StreamingProxyRedirects)
handler.RequireSameHostRedirects = utilfeature.DefaultFeatureGate.Enabled(genericfeatures.ValidateProxyRedirects)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These two flags only affect upgraded requests. I added them here because the removed maybeWrapForConnectionUpgrades function used to set them for the SPDY transport.

They should have no effect as they are deprecated in 1.22 (kubernetes/enhancements#1558). I keep them just for consistency.

@caesarxuchao
Copy link
Member Author

/assign @deads2k @sttts

/cc @cheftako

@caesarxuchao caesarxuchao changed the title Aggregator uses the regular transport even if the request requires upgrades Aggregator uses the regular transport when handling upgrade requests Sep 17, 2021
@jkh52
Copy link
Contributor

jkh52 commented Sep 17, 2021

/cc @jkh52

@k8s-ci-robot k8s-ci-robot added the area/dependency Issues or PRs related to dependency changes label Sep 17, 2021
followRedirects := utilfeature.DefaultFeatureGate.Enabled(genericfeatures.StreamingProxyRedirects)
requireSameHostRedirects := utilfeature.DefaultFeatureGate.Enabled(genericfeatures.ValidateProxyRedirects)
upgradeRoundTripper := spdy.NewRoundTripper(tlsConfig, followRedirects, requireSameHostRedirects)
wrappedRT, err := restclient.HTTPWrappersForConfig(restConfig, upgradeRoundTripper)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

doesn't this call set up the front proxy headers for the connections we re-establish?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is the front proxy header the same as the auth proxy headers that are set here? Since we now use the same proxyRoundTripper to handle both upgrade/non-upgrade requests, those auth proxy headers are applied to upgrade requests as well.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks. yes, that's the one I was looking for.

@aojea
Copy link
Member

aojea commented Sep 24, 2021

  • more fundamentally, aggregator as a proxy should only care about the transport level connectivity. Aggregator should create a transport connection to the aggregated apiserver and then stitch it together with the connection between the client and the aggregator, just like how it treats a non-upgrade request.

I think that this is the key, no?
for learning purposes, why and or when aggregator -> aggregated should use SPDY?

@caesarxuchao
Copy link
Member Author

@aojea I don't think aggregator->aggregated needs SPDY at any time. Even if the client request contains upgrade headers, aggregator's job is i) establishing a transport layer connection to the aggregated API server, ii) using http hijacking to get control of the transport layer connection from the client, and iii) then stitching these two connections together to form a tunnel.

Aggregator itself does not need to understand SPDY. Only the client and the aggregated API server need to speak SPDY.

@deads2k
Copy link
Contributor

deads2k commented Sep 27, 2021

/lgtm
/approve

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Sep 27, 2021
@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: caesarxuchao, deads2k

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Sep 27, 2021
@aojea
Copy link
Member

aojea commented Sep 27, 2021

it's missing the release note @caesarxuchao :)

@caesarxuchao
Copy link
Member Author

Thanks, @aojea
/release-note-none

@k8s-ci-robot k8s-ci-robot added release-note-none Denotes a PR that doesn't merit a release note. and removed do-not-merge/release-note-label-needed Indicates that a PR should not merge because it's missing one of the release note labels. labels Sep 27, 2021
@k8s-ci-robot k8s-ci-robot merged commit c647c56 into kubernetes:master Sep 27, 2021
@k8s-ci-robot k8s-ci-robot added this to the v1.23 milestone Sep 27, 2021
@liggitt
Copy link
Member

liggitt commented Sep 29, 2021

shouldn't there be a test exercising the issue this is fixing?

@caesarxuchao
Copy link
Member Author

Yeah, there should be a test :)

I had thought of two ways to test this but both seemed to be a lot of work.

Approach 1, we expand the sampleapiserver to handle websocket/spdy, then we exercise the upgrades in the existing aggregator e2e tests.

Approach 2, in the e2e test, we deploy the kubevirt aggregated server and verifies if the virtctl console works. This is basically reproducing the issue in #97763.

I think approach 1 makes more sense. Definitely please let me know if you see a cheaper 3rd option.

@liggitt
Copy link
Member

liggitt commented Sep 30, 2021

we've done upgrade proxy testing before in unit tests (e.g. in staging/src/k8s.io/apimachinery/pkg/util/proxy/upgradeaware_test.go)... my first thought would be to try to capture the heart of the bug in a unit test that exercises both the custom dialer (should fail on master) and direct dialer (should pass on master) cases, then make sure this PR fixes the custom dialer case and doesn't break the direct dialer case

@aojea
Copy link
Member

aojea commented Sep 30, 2021

we've done upgrade proxy testing before in unit tests (e.g. in staging/src/k8s.io/apimachinery/pkg/util/proxy/upgradeaware_test.go)... my first thought would be to try to capture the heart of the bug in a unit test that exercises both the custom dialer (should fail on master) and direct dialer (should pass on master) cases, then make sure this PR fixes the custom dialer case and doesn't break the direct dialer case

@liggitt @caesarxuchao let me see if I can do it, that sounds like a good opportunity to me to get familiar with this code

@caesarxuchao
Copy link
Member Author

@aojea thanks for offering help. Free free to ping me on Slack (handler is also "caesarxuchao") when you have a PR.

}
proxyRoundTripper := handlingInfo.proxyRoundTripper
upgrade := httpstream.IsUpgradeRequest(req)

proxyRoundTripper = transport.NewAuthProxyRoundTripper(user.GetName(), user.GetGroups(), user.GetExtra(), proxyRoundTripper)

// if we are upgrading, then the upgrade path tries to use this request with the TLS config we provide, but it does
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is this comment / block still accurate with this change? commenting out the transport.SetAuthProxyHeaders line doesn't fail any unit tests before or after this PR, so it's unclear whether it is needed / effective

@liggitt
Copy link
Member

liggitt commented Oct 11, 2021

If this PR is picked to 1.22, the tests added in #105475 and #105582 should be included as well

k8s-ci-robot added a commit that referenced this pull request Oct 14, 2021
…-#104985-#105475-#105582-upstream-release-1.21

Automated cherry pick of #104985: Aggregator uses the regular transport even if the request
#105475: apiserver aggregator upgrade unit test
#105582: Verifying the auth headers are set for upgraded aggregated
k8s-ci-robot added a commit that referenced this pull request Oct 14, 2021
…-#104985-#105475-#105582-upstream-release-1.22

Automated cherry pick of #104985: Aggregator uses the regular transport even if the request
#105475: apiserver aggregator upgrade unit test
#105582: Verifying the auth headers are set for upgraded aggregated
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. area/dependency Issues or PRs related to dependency changes cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/bug Categorizes issue or PR as related to a bug. lgtm "Looks good to me", indicates that a PR is ready to be merged. needs-priority Indicates a PR lacks a `priority/foo` label and requires one. release-note-none Denotes a PR that doesn't merit a release note. sig/api-machinery Categorizes an issue or PR as relevant to SIG API Machinery. size/M Denotes a PR that changes 30-99 lines, ignoring generated files. triage/accepted Indicates an issue or PR is ready to be actively worked on.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

upgraded requests to aggregated API servers do not work when using ssh tunnels
9 participants