Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Introduce barrier for read deadline propagation of websocket alpn upgraded connection #41946

Closed
wants to merge 2 commits into from

Conversation

AntonAM
Copy link
Contributor

@AntonAM AntonAM commented May 23, 2024

This PR adds a barrier to connection read deadline propagation, when it's set in the past.
This fixes kubectl exec not working properly over L7 load balancer, because in that case connection was upgraded and hijacked twice. And golang http server during hijacking aborts its own pending read by setting read deadline to a past value. But since during second hijack we already have established underlying upgraded connection in cases of L7 load balancer, when http server calls SetReadDeadline() it goes to the very bottom and disrupts the chain.
This barrier now only cancels topmost read, it is done by decoupling Read() in our websocket wrapper from underlying connection read.

Changelog: Fix "kubectl exec" functionality when Teleport is running behind L7 load balancer.

Fixes #41014

…n upgraded connection.

This fixes kubectl exec not working properly over L7 load balancer, because in that case connection was upgraded and hijacked twice.
And golang http server during hicjaking aborts pending reads by setting read deadline to past value.
Second hijack went to the bottom and aborted our upgraded base connection.
@AntonAM AntonAM force-pushed the anton/fix-kube-exec-l7-lb branch from 213d1b5 to 1eabfcd Compare May 23, 2024 14:16
@AntonAM AntonAM requested review from greedy52 and tigrato May 23, 2024 16:00
@AntonAM AntonAM marked this pull request as ready for review May 23, 2024 16:00
Copy link
Contributor

@jakule jakule left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks like an important change that should have some test coverage.

@AntonAM
Copy link
Contributor Author

AntonAM commented May 23, 2024

@jakule yep, I'm working on adding tests.

@AntonAM AntonAM removed the request for review from EdwardDowling May 23, 2024 17:00
@AntonAM AntonAM force-pushed the anton/fix-kube-exec-l7-lb branch from 095bf36 to 18a82f6 Compare May 23, 2024 18:43
@AntonAM AntonAM force-pushed the anton/fix-kube-exec-l7-lb branch from 18a82f6 to 29e4130 Compare May 23, 2024 18:46
@AntonAM
Copy link
Contributor Author

AntonAM commented May 23, 2024

Added test 29e4130

Copy link
Contributor

@greedy52 greedy52 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As discussed offline, I think the approach is reasonable but the implementation is complex and a little bit hacky. Let me know what you think about this alternative approach to use the raw ws library instead of the gorilla websocket conn:
master...STeve/try_fix_kube_exec

This more aligns with our client implementation which uses ws:

for {
frame, err := ws.ReadFrame(c.Conn)
if err != nil {
return 0, trace.Wrap(err)
}
switch frame.Header.OpCode {
case ws.OpClose:
return 0, io.EOF
case ws.OpPing:
pong := ws.NewPongFrame(frame.Payload)
if err := c.writeFrame(pong); err != nil {
return 0, trace.Wrap(err)
}
case ws.OpBinary:
c.readBuffer = frame.Payload
return c.readLocked(b)
}

And less code than original impl....

I can certainly review the current approach (this PR) if we decide to go with it.

@AntonAM
Copy link
Contributor Author

AntonAM commented May 27, 2024

@greedy52 yeah, we could use your version of the fix 👍 Will you open a PR?

@greedy52
Copy link
Contributor

@greedy52 yeah, we could use your version of the fix 👍 Will you open a PR?

yes i will do that. may steal your UT though. Thanks

@greedy52
Copy link
Contributor

@greedy52 yeah, we could use your version of the fix 👍 Will you open a PR?

#42091 is up now. sorry for the delay. Spent a whole day on writing a new unit test 😭

@AntonAM
Copy link
Contributor Author

AntonAM commented May 30, 2024

Closed in favor of #42091

@AntonAM AntonAM closed this May 30, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Kubectl exec through Teleport Proxy results in "broken pipe" error
3 participants