Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

spdy: add optional periodic Pings on the connection #94170

Merged
merged 1 commit into from
Sep 22, 2020

Conversation

awly
Copy link
Contributor

@awly awly commented Aug 21, 2020

What type of PR is this?

/kind cleanup

What this PR does / why we need it:

When an SPDY connection goes over an intermediate box (proxy or load
balancer, e.g. AWS ELB), it can get interrupted due to idleness (default
in ELB is 60s). For example, this happens with kubectl exec sessions
are left open without any activity for a while.

TCP-level keep-alives are not sufficient for all intermediate boxes,
they may pay attention to application-layer traffic only. SPDY pings
make the connection appear active, letting it survive a period of
idleness.

Note: this commit adds support for pings in
k8s.io/apimachinery/pkg/util/httpstream/spdy, but doesn't enable it
anywhere in the calling code. There is no behavior change for existing
callers.

Which issue(s) this PR fixes:

N/A

Special notes for your reviewer:

This will be used in https://github.com/gravitational/teleport/, which acts as a man-in-the-middle auth proxy in for k8s API. When teleport is fronted by AWS ELB (same probably applies to other load balancers), kubectl exec sessions get disconnected after 60s of idleness. This PR addresses that.

Does this PR introduce a user-facing change?:

NONE

Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.:

N/A

@k8s-ci-robot k8s-ci-robot added release-note-none Denotes a PR that doesn't merit a release note. kind/cleanup Categorizes issue or PR as related to cleaning up code, process, or technical debt. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. needs-priority Indicates a PR lacks a `priority/foo` label and requires one. area/kubelet labels Aug 21, 2020
@k8s-ci-robot k8s-ci-robot added sig/api-machinery Categorizes an issue or PR as relevant to SIG API Machinery. sig/node Categorizes an issue or PR as relevant to SIG Node. and removed needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. labels Aug 21, 2020
@awly
Copy link
Contributor Author

awly commented Aug 21, 2020

/assign @smarterclayton

@fedebongio
Copy link
Contributor

/assign @caesarxuchao
(@Jefftree was safe because on vacay)

Copy link
Member

@caesarxuchao caesarxuchao left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The transition from SPDY to HTTP/2 is blocked on #7452 (comment), so in the mean time, I think we can accept PRs maintaining SPDY. @deads2k @lavalamp what do you think?

@awly How's the user going to enable the ping?

This is similar to #81179. cc @liggitt who reviewed that PR.

// actually closed.
} else {
// Count successful pings, for testing.
atomic.AddInt64(&c.pingsSent, 1)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we avoid this testing-only code? Dose the SPDY library provide any testing support that exposes if a ping frame is received?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unfortunately, the library handles incoming pings internally without any external hooks I can find.
This unexported field is the least intrusive way I could verify the behavior. Using sync/atomic, performance overhead should be negligible.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1, if you want to test this, add a member variable of type function for calling Ping and let the test swap that out. It's not performance, it's just a code smell.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fair enough, added a swap-able function field in connection

Copy link
Contributor Author

@awly awly left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How's the user going to enable the ping?

@caesarxuchao this PR doesn't expose it to k8s users.
My motivation is for https://github.com/gravitational/teleport/ which uses this library to intercept kubectl exec connections for audit purposes (recording session traffic). For kube-apiserver, pings are usually unnecessary since it's generally not fronted by a strict load balancer killing idle connections.

If you'd like, I can plumb a flag up to kube-apiserver. But it adds unnecessary complexity, if nobody is asking to use it there.

// actually closed.
} else {
// Count successful pings, for testing.
atomic.AddInt64(&c.pingsSent, 1)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unfortunately, the library handles incoming pings internally without any external hooks I can find.
This unexported field is the least intrusive way I could verify the behavior. Using sync/atomic, performance overhead should be negligible.

@lavalamp
Copy link
Member

I think I'm OK with adding this and I might even be OK with turning it on, as long as we do that early in the release cycle. But please don't change the function signatures. :)

Copy link
Contributor Author

@awly awly left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PTAL

// actually closed.
} else {
// Count successful pings, for testing.
atomic.AddInt64(&c.pingsSent, 1)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fair enough, added a swap-able function field in connection

@lavalamp
Copy link
Member

I think I'm OK with this now, @caesarxuchao can you do a final review?

@awly
Copy link
Contributor Author

awly commented Aug 26, 2020

/retest

@awly
Copy link
Contributor Author

awly commented Aug 31, 2020

@caesarxuchao could you take another look please?

Copy link
Member

@caesarxuchao caesarxuchao left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Two nits. Otherwise lgtm.

Proxier func(*http.Request) (*url.URL, error)
// ConnPingPeriod is a period for sending SPDY Pings on the connection.
// Optional.
ConnPingPeriod time.Duration
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: why not just naming it "PingPeriod"?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

renamed, here an in upgrade.go

go conn.Serve(c.newSpdyStream)
if pingPeriod > 0 {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also check if c.ping is nil?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

When an SPDY connection goes over an intermediate box (proxy or load
balancer, e.g. AWS ELB), it can get interrupted due to idleness (default
in ELB is 60s).  For example, this happens with `kubectl exec` sessions
are left open without any activity for a while.

TCP-level keep-alives are not sufficient for all intermediate boxes,
they may pay attention to application-layer traffic only. SPDY pings
make the connection appear active, letting it survive a period of
idleness.

Note: this commit adds support for pings in
`k8s.io/apimachinery/pkg/util/httpstream/spdy`, but doesn't enable it
anywhere in the calling code. There is no behavior change for existing
callers.
Copy link
Contributor Author

@awly awly left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @caesarxuchao, nits fixed

go conn.Serve(c.newSpdyStream)
if pingPeriod > 0 {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

Proxier func(*http.Request) (*url.URL, error)
// ConnPingPeriod is a period for sending SPDY Pings on the connection.
// Optional.
ConnPingPeriod time.Duration
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

renamed, here an in upgrade.go

}

// newConnection returns a new connection wrapping conn. newStreamHandler
// will be invoked when the server receives a newly created stream from the
// client.
func newConnection(conn *spdystream.Connection, newStreamHandler httpstream.NewStreamHandler) httpstream.Connection {
c := &connection{conn: conn, newStreamHandler: newStreamHandler}
func newConnection(conn *spdystream.Connection, newStreamHandler httpstream.NewStreamHandler, pingPeriod time.Duration, pingFn func() (time.Duration, error)) httpstream.Connection {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: given that this is a private function, and it looks like pingFn is always spdyConn.Ping. Would it be better to omit the pingFn argument and invoke c.conn.Ping directly in c.sendPings?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh I overlook the test..

@awly
Copy link
Contributor Author

awly commented Sep 14, 2020

Ping @caesarxuchao

@caesarxuchao
Copy link
Member

/retest
/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Sep 14, 2020
@awly
Copy link
Contributor Author

awly commented Sep 15, 2020

/milestone v1.20

@k8s-ci-robot
Copy link
Contributor

@awly: You must be a member of the kubernetes/milestone-maintainers GitHub team to set the milestone. If you believe you should be able to issue the /milestone command, please contact your and have them propose you as an additional delegate for this responsibility.

In response to this:

/milestone v1.20

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@awly
Copy link
Contributor Author

awly commented Sep 17, 2020

/priority backlog

@k8s-ci-robot k8s-ci-robot added priority/backlog Higher priority than priority/awaiting-more-evidence. and removed needs-priority Indicates a PR lacks a `priority/foo` label and requires one. labels Sep 17, 2020
@awly
Copy link
Contributor Author

awly commented Sep 17, 2020

@caesarxuchao this PR is still missing your approval, i think

@awly
Copy link
Contributor Author

awly commented Sep 22, 2020

ping @caesarxuchao

@lavalamp
Copy link
Member

/approve

@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: awly, lavalamp

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Sep 22, 2020
@k8s-ci-robot k8s-ci-robot merged commit 4332d34 into kubernetes:master Sep 22, 2020
@k8s-ci-robot k8s-ci-robot added this to the v1.20 milestone Sep 22, 2020
@awly awly deleted the spdy-pings branch September 23, 2020 18:08
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. area/kubelet cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/cleanup Categorizes issue or PR as related to cleaning up code, process, or technical debt. lgtm "Looks good to me", indicates that a PR is ready to be merged. priority/backlog Higher priority than priority/awaiting-more-evidence. release-note-none Denotes a PR that doesn't merit a release note. sig/api-machinery Categorizes an issue or PR as relevant to SIG API Machinery. sig/node Categorizes an issue or PR as relevant to SIG Node. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

7 participants