[v10] Improve web ui ssh performance #19119

Merged
merged 3 commits into from Dec 7, 2022

Conversation

rosstimothy
Contributor

Backports #18656 and #18910 to branch/v10

@rosstimothy rosstimothy marked this pull request as ready for review December 6, 2022 22:32
Reduces the latency of creating ssh sessions via the web ui by:

1) No longer using `TeleportClient.SSH` to establish a session
2) Reusing the user auth client for the web session to perform the MFA ceremony
3) Ensuring that connection attempts follow the flow outlined in RFD 93

The web api server now uses the `proxy.Router` and `srv.SessionController`
directly, instead of indirectly via `TeleportClient.SSH`. Using
the `TeleportClient` required an ssh connection to be established from the web
api server to the proxy ssh server, even though both run in the same process. This
overhead can be avoided now that the routing and session control logic
exist in reusable components. Once the connection is established,
`client.NodeClient` is used to create an interactive session on the node. A new
constructor was added to simplify creating one and to remove duplicated creation
code, and a `RunInteractiveShell` receiver method was added to allow callers
outside of `lib/client` to spawn a session.

`TerminalHandler.issueSessionMFACerts` used to check whether per-session mfa was
enabled and perform the mfa ceremony via the `client.ProxyClient`, which was
constructed from the `TeleportClient` connected to the proxy ssh server.
Under the hood this dialed the Auth server directly, called `IsMFARequired`,
and performed the ceremony if required. However, each web session established via
the web ui already has an auth client with the credentials of the logged in user.
Reusing that existing auth client and performing the mfa ceremony manually
removes this overhead as well.

Finally, `TerminalHandler.makeClient` always attempted to perform the mfa ceremony
prior to returning the `TeleportClient`. As outlined in [RFD 93](https://github.com/gravitational/teleport/blob/master/rfd/0093-offline-access.md),
this adds latency and requires Auth connectivity in order to connect to nodes.
The connection flow now attempts to connect to the node first, and falls back to
performing the mfa ceremony and reconnecting only if the node denies access.

Partially addresses #15167

The agent was not being propagated when establishing an ssh session
via the web, which resulted in the error described in #18850. Providing
the agent was straightforward. However, due to the changes from #18656,
when the forward server is used, as is required with proxy recording mode,
the ssh connection is now performed directly over a `net.Pipe`.
Because `net.Pipe` is synchronous, this causes a deadlock when
performing the ssh handshake. To avoid the deadlock, `DualPipeNetConn`
was changed to use `syscall.Socketpair` instead of `net.Pipe`.

`TestTerminal` now has two cases, one for node recording mode and another
for proxy recording mode. For the proxy recording mode test to pass,
the fake clock used in the test needed to be propagated to the
`ssh.CertChecker` and `forward.ServerConfig`.

Fixes #18850
@@ -2177,15 +2224,25 @@
}

  h.log.Debugf("New terminal request for ns=%s, server=%s, login=%s, sid=%s, websid=%s.",
- 	req.Namespace, req.Server, req.Login, req.SessionID, ctx.GetSessionID())
+ 	req.Namespace, req.Server, req.Login, req.SessionID, sctx.GetSessionID())

Check failure

Code scanning / CodeQL

Log entries created from user input

This log entry depends on a [user-provided value](1).
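
For context, one common way to address this class of alert is to neutralize line breaks in user-provided values before they reach the logger. The sketch below is an assumption about a reasonable mitigation, not a record of how this PR resolved the finding, and `sanitizeForLog` is a hypothetical helper.

```go
package main

import (
	"fmt"
	"strings"
)

// sanitizeForLog strips CR and LF so that a user-supplied value cannot
// inject forged lines into the log output.
func sanitizeForLog(s string) string {
	return strings.NewReplacer("\r", "", "\n", "").Replace(s)
}

func main() {
	// A malicious server name that tries to start a fake log line.
	server := "node-1\nERRO forged entry"

	// Option 1: strip line breaks before formatting.
	fmt.Printf("New terminal request for server=%s\n", sanitizeForLog(server))

	// Option 2: let %q quote and escape the value instead.
	fmt.Printf("New terminal request for server=%q\n", server)
}
```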
@rosstimothy
Contributor Author

PTAL @xacrimon @probakowski

@rosstimothy rosstimothy enabled auto-merge (squash) December 7, 2022 18:01
@rosstimothy rosstimothy merged commit c499e86 into branch/v10 Dec 7, 2022
@github-actions github-actions bot removed the request for review from probakowski December 7, 2022 18:40
@rosstimothy rosstimothy deleted the tross/backport-18656/v10 branch December 7, 2022 18:48
fheinecke pushed a commit that referenced this pull request Dec 16, 2022
* Improve web ui ssh performance (#18656)

* Fix web ssh session with proxy recording mode (#18910)
