Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ALPN SNI Proxy #7524

Merged
merged 39 commits into from
Sep 13, 2021
Merged

ALPN SNI Proxy #7524

merged 39 commits into from
Sep 13, 2021

Conversation

smallinsky
Copy link
Contributor

@smallinsky smallinsky commented Jul 12, 2021

ALPN SNI proxy service listener

ALPN SNI proxy service allows to handle incoming TLS connections and route received connections base on ServerName (SNI) NextProtos (ALPN) values to proper Teleport Proxy service.

CLI UX changes

  • tsh proxy ssh command that can be used as openSSH ProxyCommand in order to wrap ssh protocol into TLS connection and connect to a remote proxy.
    ssh -o "ForwardAgent yes" \
    -o "ProxyCommand tsh proxy ssh --user=%r %h:%p" \
    user@node.example.com
    
  • tsh proxy db [-p port] example starts local db proxy and wraps db protocol into TLS connection and establishes the connection to remote proxy.

Other Changes:

  • tsh db connect example (introduce in #7213) will detect if the remote proxy supports ALPN SNI listener and under the hood will spawn on demand local db proxy that will wrap db protocol into TLS connection and connect to remote proxy.
  • Extended KUBECONFIG file generated by tsh kube login command by cluster.server field use to set SNI value by kubectrl tool.
  • Extended Proxy Ping response proxy message by alpn_sni_listener_enabled filed.
  • Introduced a new alpn_sni_listener_enabled field in the tsh user profile to distinguish if proxy support ALPN SNI listener.

Backward compatibility

Client:

tsh proxy client uses proxy.alpn_sni_listener_enabled proxy response ping field to distinguish if remote Teleport Proxy supports TLS SNI ALPN listener. If ALPN SNI Proxy support is detected by the proxy client the traffic is wrapped into TLS and send to Web port where ALPN SNI Teleport Proxy service listen on.
If former version of a client is used to establish a connection to the remote Teleport Proxy service

Server

In order to support former proxy clients Teleport Proxy by default exposed Teleport Proxy Services in an unchanged way. Additional TLS SNI Teleport Proxy service is exposed on Web port.

Load Test IoT (connection through teleport proxy reverse tunnel)

Load Test 1K nodes teleport 7.0.0-beta.5 (For reference)

Screenshot 2021-08-09 at 00 02 51

Soak 1K nodes teleport 7.0.0-beta.5 (For reference)

tsh bench --interactive --duration=30m root@loadtest-5ddf88b955-w7sbn ps uax

* Requests originated: 17918
* Requests failed: 110
* Last error: ssh: unexpected packet in response to channel open: <nil>

Histogram

Percentile Response Duration
---------- -----------------
25         3745 ms
50         8223 ms
75         8367 ms
90         8679 ms
95         9071 ms
99         13039 ms
100        59455 ms
tsh bench --duration=30m root@loadtest-5ddf88b955-tq5d4 ls


* Requests originated: 17918
* Requests failed: 104
* Last error: connection error: desc = "transport: authentication handshake failed: EOF"

Histogram

Percentile Response Duration
---------- -----------------
25         3743 ms
50         8223 ms
75         8367 ms
90         8679 ms
95         9039 ms
99         12663 ms
100        60255 ms

Load Test 1K nodes SNI proxy:

Screenshot 2021-08-09 at 13 13 14

Soak 1K nodes teleport SNI Proxy

tsh bench --duration=30m root@loadtest-fbdb47d9f-zgs64 ls


* Requests originated: 17925
* Requests failed: 81
* Last error: ssh: unexpected packet in response to channel open: <nil>

Histogram

Percentile Response Duration
---------- -----------------
25         3807 ms
50         7455 ms
75         7507 ms
90         7735 ms
95         8839 ms
99         14551 ms
100        60607 ms
tsh bench --interactive --duration=30m root@loadtest-fbdb47d9f-xdm6v ps uax


* Requests originated: 17923
* Requests failed: 89
* Last error: ssh: unexpected packet in response to channel open: <nil>

Histogram

Percentile Response Duration
---------- -----------------
25         3815 ms
50         7755 ms
75         7811 ms
90         8035 ms
95         9079 ms
99         14655 ms
100        60479 ms

@smallinsky smallinsky force-pushed the smallinsky/alpnsniproxy branch 9 times, most recently from cf67e96 to 28131a2 Compare July 13, 2021 09:20
@smallinsky smallinsky marked this pull request as ready for review July 15, 2021 19:58
Copy link
Collaborator

@r0mant r0mant left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@smallinsky Don't have a lot of other feedback, just a few remaining cleanups/questions.

api/client/webclient/webclient.go Outdated Show resolved Hide resolved
Comment on lines 2421 to 2424
// add proxy as a backend second time to prevent situations where there are only 2 backend and internal
// reverse tunnel dialer makes 2 calls to connect to the tunnel (ProxyPing, factual Dial call). In that
// situations the second lb backend reverse tunnel address is actually never dialed.
lb.AddBackend(*utils.MustParseAddr(net.JoinHostPort(Loopback, strconv.Itoa(proxyReverseTunnelPort))))
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry I somehow missed this during previous passes. @smallinsky I'm not sure I fully understand why we're adding a second B backend here. How does it help with the test and the proxy?

lib/reversetunnel/transport.go Outdated Show resolved Hide resolved
lib/reversetunnel/agent.go Outdated Show resolved Hide resolved
lib/srv/alpnproxy/conn.go Outdated Show resolved Hide resolved
return nil, nil, trace.Wrap(err)
}

// Following TLS handshake fails on the server side with error: "no certificates configured" after server
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ohh I see now, it's cause you're using that fake tlsConn above. A couple of thoughts:

  • Can this failed fake handshake somehow affect the client and/or the future "real" handshake?
  • Can your fake tlsConn return a proper tls.Config so this handshake succeeds also? We have p.cfg.TLSConfig here?

tool/tsh/db.go Show resolved Hide resolved
tool/tsh/tsh.go Outdated Show resolved Hide resolved
@@ -2587,6 +2588,14 @@ func (process *TeleportProcess) setupProxyListeners() (*proxyListeners, error) {
listeners.web = listener
}
}

// Even if web service API was disabled create a web listener used for ALPN/SNI service as the master port
if cfg.Proxy.DisableWebService && !cfg.Proxy.DisableTLS && listeners.web == nil {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: Isn't cfg.Proxy.DisableWebService basically a redundant condition here?

@@ -3030,6 +2948,24 @@ func (process *TeleportProcess) initProxyEndpoint(conn *Connector) error {
if err != nil {
return trace.Wrap(err)
}

if alpnRouter != nil && !cfg.Proxy.DisableDatabaseProxy {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we'll ever be able drop other listeners since we'll still need to have support for separate listeners, right? But ok, that's fair.

Copy link
Collaborator

@r0mant r0mant left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm once we address the issue with the hacky integration test

@awly / @nklaassen: since you guys looked at this before, could you give this another look so we can get a second approval? Thanks.

Copy link
Contributor

@nklaassen nklaassen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks alright to me overall, my remaining question is related to the issue with multiple proxies behind a load balancer and clients choosing the port based on the wrong ping response. Would this be a serious problem for people turning on this feature or is there a workaround? Update all proxies at once?

@smallinsky
Copy link
Contributor Author

@nklaassen

Looks alright to me overall, my remaining question is related to the issue with multiple proxies behind a load balancer and clients choosing the port based on the wrong ping response. Would this be a serious problem for people turning on this feature or is there a workaround? Update all proxies at once?

The proxy update should work the proxy behind LB should have same config the issue is rated to proxy stagnation when even number of proxies are configured behind reverse tunnel LB. I have create a separate PR with fix for this case:
#8106

@smallinsky smallinsky merged commit c142b65 into master Sep 13, 2021
@smallinsky smallinsky deleted the smallinsky/alpnsniproxy branch September 13, 2021 09:54
zmb3 pushed a commit that referenced this pull request Sep 23, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants