Description
What follows is a direct paste from a private gist used to workshop the issue a bit late last week; have completed tests/impl (from 1.17+) and will be pushing those shortly. (wanted the issue number set in stone first...:D)
We have a CVE for this issue: CVE-2018-7750.
Intro
Email from one Matthijs Kooijman (@matthijskooijman) dated 2018.03.02 notes that Paramiko's server implementation may be connected to by clients that do not implement the auth step, and happily serves up commands/etc to such un-authed clients. He found that AsyncSSH (another Python lib that does not use Paramiko) has the same issue. Finally, he states the RFC is unclear as to whether this is purposeful.
Let's double check both the RFCs and then our favorite reference implementation, OpenSSH.
Should neither provide a useful clue, my gut says the server implementation should track whether we've sent SSH_MSG_USERAUTH_SUCCESS (99% sure we already do track this) and default to rejecting any connection-level messages (like SSH_MSG_CHANNEL_OPEN or SSH_MSG_GLOBAL_REQUEST) unless that flag is True.
RFC scan
tl;dr it is indeed kinda vague, there are two kinda-disagreeing undercurrents, neither of which are ironclad:
- It is assumed that the connection protocol (which is where command exec occurs) runs on top of / after setting up, the transport (initial kex/handshake/etc) and auth (user auth) protocols.
- The specific server implementation, and/or operator of an instantiation of such an implementation, has significant leeway in how they implement and/or configure the server, re: when and how user-auth occurs.
Specifics:
- RFC 4251 (protocol arch)
  - 1: pretty clear that the intent is that a user auth step always occurs, followed by a service request:
    "The client sends a service request once a secure transport layer
    connection has been established. A second service request is sent after
    user authentication is complete."
  - 4.3: third bullet point re: policy issues that 'SHOULD' be addressed, highlights that auth specifics are up to the site/operator:
    "The authentication methods that are to be required by the server for
    each user. The server's policy MAY require multiple authentication for
    some or all users. The required algorithms MAY depend on the location
    from where the user is trying to gain access."
  - 9.4.3: this whole section arguably applies, but it's very vague. However it seems to back up my hunch that at core, this is up to the server implementer and/or operator, e.g:
    "At the discretion of the implementers, this default policy may be along
    the lines of anything-goes where there are no restrictions placed upon
    users [...]"
- RFC 4252 (auth protocol)
  - 4: more vague implications that the server can do whatever it wants, e.g. the below quote about "none" auth implies the authors at least partly considered servers that intentionally don't care about auth at all (though the specific discussion is about the actual, explicit use of the "none" auth type message, which is distinct from "did not submit auth at all"):
    "The "none" method is reserved, and MUST NOT be listed as supported.
    However, it MAY be sent by the client. The server MUST always reject
    this request, unless the client is to be granted access without any
    authentication, in which case, the server MUST accept this request."
  - 5.3: this (like 4251.1) states that the server should start up the requested service after sending auth-success. One could read this to imply that services SHOULD NOT start UNLESS auth has occurred, but it's not explicit...
  - 7: notes that implementations MUST implement public-key auth, though I note this is distinct from requiring that it is enabled (clearly, many real servers only offer password auth, for example.)
- RFC 4253 (transport protocol)
  - 10: implies client may request "a service" after initial (high level) kex, where that service is one of userauth or connection. The transport level of the protocol thus doesn't appear to actually care or enforce that one performs auth before connection.
- RFC 4254 (connection protocol)
  - 1: once again implies that connection is "designed to" occur after/on top of auth.
  - 11: again, it's 'assumed':
    "This protocol is assumed to run on top of a secure, authenticated
    transport. User authentication and protection against network-level
    attacks are assumed to be provided by the underlying protocols."
OpenSSH's implementation
My old friend and the only C codebase I have any familiarity with whatsoever, openssh-portable...
Synopsis
After all the below, the tl;dr seems to be:
- OpenSSH sets up a dispatch table to determine how it responds to protocol messages/packets
- This table gets reinitialized depending on 'phase' of execution: while awaiting auth, it is only set to respond to auth-related messages, then after successful auth, it retools the table to only respond to post-auth-related messages like channel opens.
- Thus, the case under test ends up being a simple "What even are you talking about? What's a channel open?" NotImplementedError style situation - no auth step, no idea how to handle anything beyond auth.
Deep dive
- Main SSH2 server loop is serverloop.c -> server_loop2()
- Which uses a dispatch table to handle inbound messages: https://github.com/openssh/openssh-portable/blob/71e48bc7945f867029e50e06c665c66aed6d3c64/serverloop.c#L393
- Which dispatches to other functions, so when it sees e.g. MSG_CHANNEL_OPEN it calls server_input_channel_open(): https://github.com/openssh/openssh-portable/blob/71e48bc7945f867029e50e06c665c66aed6d3c64/serverloop.c#L897
- Which is defined here: https://github.com/openssh/openssh-portable/blob/71e48bc7945f867029e50e06c665c66aed6d3c64/serverloop.c#L613
- If the user is asking for command exec, that's channel type session: https://github.com/openssh/openssh-portable/blob/71e48bc7945f867029e50e06c665c66aed6d3c64/serverloop.c#L630
- Which calls server_request_session: https://github.com/openssh/openssh-portable/blob/71e48bc7945f867029e50e06c665c66aed6d3c64/serverloop.c#L582
- Which calls session_open with a handle on the_authctxt: https://github.com/openssh/openssh-portable/blob/71e48bc7945f867029e50e06c665c66aed6d3c64/serverloop.c#L603
  - This is our first apparent reference to anything auth-related so far...
  - The only other apparent external context is the ssh object used to get the actual channel in play on the call prior: https://github.com/openssh/openssh-portable/blob/71e48bc7945f867029e50e06c665c66aed6d3c64/serverloop.c#L600
- That auth context object is extern'd at top of file: https://github.com/openssh/openssh-portable/blob/71e48bc7945f867029e50e06c665c66aed6d3c64/serverloop.c#L84
- We'll dig into that later if necessary but for now, let's assume it has handy ways of telling whether the user is authed or not, and the question is whether/how those are consulted.
- session_open is, bizarrely, defined in session.c: https://github.com/openssh/openssh-portable/blob/de1920d743d295f50e6905e5957c4172c038e8eb/session.c#L1757
  - It checks for authctxt->valid (or a null password entry) and gets mad if not true: https://github.com/openssh/openssh-portable/blob/de1920d743d295f50e6905e5957c4172c038e8eb/session.c#L1767
    - So yea, we gotta doublecheck what that ->valid member actually maps to.
- Authctxt struct is defined in auth.h here: https://github.com/openssh/openssh-portable/blob/71e48bc7945f867029e50e06c665c66aed6d3c64/auth.h#L55-L98
  - It has a bunch of flags, of which success, authenticated, and valid all seem relevant.
  - success is not documented; authenticated sounds like it maps to, well, authentication (user is who they claim to be) with valid mapping to (a generic level of) authorization (the user is actually allowed to login.)
- Those flags (esp valid) aren't set in too many places; the most useful and in retrospect most obvious place is in handling of userauth requests: https://github.com/openssh/openssh-portable/blob/71e48bc7945f867029e50e06c665c66aed6d3c64/auth2.c#L215
  - The core of this is actually use of Authmethod structs (format defined here) created by the various auth2-* modules (one for each implemented auth backend - kerberos, password, publickey, hostbased, etc)
    - These are simple name, userauth, enabled structures, with userauth being a pointer to an implementation function (so e.g. the one for password auth is referencing userauth_passwd in auth2-passwd.c (here.)
  - The per-method userauth func is called and the result stored as authenticated var: https://github.com/openssh/openssh-portable/blob/71e48bc7945f867029e50e06c665c66aed6d3c64/auth2.c#L287
  - Which bubbles down (after much state machine checking) to finalizing userauth - sending the success network message, updating authctxt->success = 1, etc: https://github.com/openssh/openssh-portable/blob/71e48bc7945f867029e50e06c665c66aed6d3c64/auth2.c#L352-L360
- The ->valid flag seems to actually just be "is the requested username a valid local system user": https://github.com/openssh/openssh-portable/blob/71e48bc7945f867029e50e06c665c66aed6d3c64/auth2.c#L236-L239
  - Note use of getpwnamallow, which (basically) wraps the libc function getpwnam aka "get password entry for user" (so authctxt->pw is specifically that structure and not just a password)
  - Which explains why it's distinct from success and authenticated.
Initial distillation
- Seems at first that "huh, the user can get a session as long as they exist locally, without necessarily passing auth" which would be bad but would also act like Paramiko.
- However, realized: that ->valid flag is set by input_userauth_request, meaning the client has to actually submit auth in order to set it. (Ditto ->pw.)
- So if a user attempts to send a channel open request without authing, authctxt->pw will be null and authctxt->valid will be unset (zero), and thus session_open should hit line 1767 and result in fatal("no user for session").
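The reasoning above can be modeled in a few lines of Python. This is only an illustrative sketch, not a port of the OpenSSH code: the class and function names mirror the C identifiers discussed above, but the bodies (including the `local_users` dict standing in for the passwd database and the `method_ok` argument) are assumptions made up for the example.

```python
from dataclasses import dataclass
from typing import Optional


class Fatal(Exception):
    """Stand-in for OpenSSH's fatal(), which aborts the connection."""


@dataclass
class Authctxt:
    # Simplified mirror of the three flags discussed above.
    success: bool = False        # userauth finished and success was sent
    authenticated: bool = False  # result of the per-method userauth function
    valid: bool = False          # requested username maps to a local account
    pw: Optional[dict] = None    # stand-in for the passwd entry


def input_userauth_request(ctx, user, local_users, method_ok):
    """Only this path populates valid/pw, so it requires an auth attempt."""
    ctx.pw = local_users.get(user)
    ctx.valid = ctx.pw is not None
    ctx.authenticated = ctx.valid and method_ok
    ctx.success = ctx.authenticated


def session_open(ctx):
    """Model of the guard described above: no pw entry or not valid is fatal."""
    if ctx.pw is None or not ctx.valid:
        raise Fatal("no user for session")
    return "session opened"
```

A context that never saw a userauth request keeps its defaults, so `session_open(Authctxt())` raises `Fatal("no user for session")` in this model. Note that the guard alone only looks at `valid` and `pw`, not at `authenticated`.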
Testing with live OpenSSH server
Proving this with a live install is interesting:
- Ran local docker container executing Ubuntu + OpenSSH 7.2 on port 2222 with nothing but root password auth by default
- Executed Matthijs's client-test.py with nothing but the port number changed
- Did not get expected no user for session but instead seem to have just confused the poor thing:
    debug1: SSH2_MSG_NEWKEYS received [preauth]
    debug1: KEX done [preauth]
    dispatch_protocol_error: type 90 seq 3 [preauth]
    debug1: Received SSH2_MSG_UNIMPLEMENTED for 3 [preauth]
Second dive
- The above protocol error log message comes out of here: https://github.com/openssh/openssh-portable/blob/151c6e433a5f5af761c78de87d7b5d30a453cf5e/dispatch.c#L45-L46 in dispatch_protocol_error()
- Which is stuffed into newly initialized dispatch tables by ssh_dispatch_init
  - Which is done via do_authentication2 here: https://github.com/openssh/openssh-portable/blob/de1920d743d295f50e6905e5957c4172c038e8eb/auth2.c#L172
- So the tl;dr is that the dispatch table gets all 255 slots filled initially by SSH2_MSG_UNIMPLEMENTED, then the table is filled in with what is intended to be responded to (e.g. in do_authentication2, the very next line is to say "ok and now respond to service requests")
  - E.g. slot 90 corresponds to SSH_MSG_CHANNEL_OPEN, aka what our test client was requesting: https://tools.ietf.org/html/rfc4250#section-4.1.2
- Looks like in the normal flow of things, the post-auth process ends up reinitializing the dispatch table with acceptable messages (including channel opens):
  - Server loop emits "Entering interactive session" log message one can see in a successful regular auth+shell: https://github.com/openssh/openssh-portable/blob/de1920d743d295f50e6905e5957c4172c038e8eb/serverloop.c#L375
  - Then calls server_init_dispatch: https://github.com/openssh/openssh-portable/blob/de1920d743d295f50e6905e5957c4172c038e8eb/serverloop.c#L393
  - Which does aforementioned dispatch reset & fill-in: https://github.com/openssh/openssh-portable/blob/de1920d743d295f50e6905e5957c4172c038e8eb/serverloop.c#L889-L910
End result
- The RFC isn't terrifically clear beyond "well, we kind of assume you're not gonna open channels and such unless you've already authed", but it's not a SHOULD or a MUST.
- OpenSSH has chosen to implement this as a strict "only respond to the messages you can handle at the current stage" setup, where auth comes before connection (as in the RFC.) Trying to do things out-of-order results in a simple SSH_MSG_UNIMPLEMENTED.
- Paramiko does not do things quite that way: instead, as one might guess from the bug description, it simply sets up all possible dispatch targets (anything implemented by the Transport across its two handler tables, and the AuthHandler handler table) and then dispatches depending on message type: paramiko/paramiko/transport.py, lines 1891 to 1917 in 27a8ed1.
Doing nothing certainly seems like a bad idea: this is clearly a massive security flaw, and the only reason I did all the above investigation is because software has an irritating history of "but I was relying on that bug / looseness in the spec / whatever!". Given the main reference implementation disallows it, I'm inclined to assume nobody could possibly rely on this.
So there's two obvious fixes for Paramiko:
- The "OpenSSH is the Bible" approach: update Transport's dispatching to be more like OpenSSH's and only enable certain message types depending on the state of self.is_authenticated (which, impressively, appears to only ever be used in __repr__...!!!)
  - This could be problematic given that Transport is frustratingly bimodal and is used both for server and client operations - we'd have to make sure that we're not preventing a not-yet-authed client from dispatching on necessary responses because it's not authed yet, for example.
  - Of note, AsyncSSH is taking this approach in their fix, and are simply leveraging the fact that RFC 4250 compatible protocol numbers mean one can just go "is the message identifier greater than the highest possible auth related number? Are you authed yet? No? Screw off!" - seems possible on our end, though doesn't really change the previous point about ensuring client-side cases are protected.
- The "that's too much work right now" approach: simply rub some more references to .auth_handler and/or .is_authenticated in specific spots such as Server.check_channel_request (here)
  - Except this, too, is problematic because of how Transport, Server and AuthHandler split up responsibilities & exposures to one another. By default a Server has no direct access to the Transport running it, or that Transport's AuthHandler (which, in server mode, is set up during rekeying [including initial kex].)
    - Changing this would be backwards incompatible (e.g. enforcing an actual __init__ on Server subclasses for the passing-in of a reference to one of the other objects, since right now Server doesn't even define one!) though I would like to examine it sometime. Now probably not the best time though.
  - Could still put specific, small-scoped changes in Transport though, such as around the check_channel_request calls here: paramiko/paramiko/transport.py, lines 2531 to 2537 in 27a8ed1
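The message-number check from the first approach is small enough to sketch. This is a hypothetical helper, not Paramiko's or AsyncSSH's actual code: the function name and its `server_mode` / `authenticated` parameters are assumptions, while the number ranges are the ones RFC 4250 assigns (50 to 79 for user authentication, 80 and up for the connection protocol).

```python
# RFC 4250 reserves 50-79 for the user authentication protocol; anything
# above that belongs to the connection protocol or later ranges.
HIGHEST_USERAUTH_MSG = 79

MSG_KEXINIT = 20
MSG_USERAUTH_REQUEST = 50
MSG_GLOBAL_REQUEST = 80
MSG_CHANNEL_OPEN = 90


def should_reject(ptype, server_mode, authenticated):
    """Return True if an inbound message must be refused.

    Only server-side transports are gated, so a not-yet-authed client
    can still process whatever the server sends it.
    """
    if not server_mode:
        return False
    return ptype > HIGHEST_USERAUTH_MSG and not authenticated
```

Keeping the `server_mode` test first is the part that addresses the bimodal-Transport worry: the gate never fires on the client side, regardless of auth state.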
My gut says to take a quick stab at the 1st approach but to fall back to the 2nd if the 1st cannot be done relatively painlessly.
Either way, the actual action to take seems poorly defined, especially given that OpenSSH simply spits out a bunch of question marks and not a "useful" error. The RFCs (4250, 4254) list only 4 default 'error' types, of which OPEN_FAILED_ADMINISTRATIVELY_PROHIBITED seems the closest fit. And indeed Paramiko uses it for e.g. bogus channel types, in some legacy tests.
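For reference, this is roughly what such a rejection looks like on the wire. The sketch builds the payload by hand from the RFC 4254 section 5.1 layout (message byte, recipient channel, reason code, description, language tag) using only the standard library; the helper names are made up for the example and it does not go through Paramiko's own message classes.

```python
import struct

# Values from RFC 4254 section 5.1 / RFC 4250.
SSH_MSG_CHANNEL_OPEN_FAILURE = 92
SSH_OPEN_ADMINISTRATIVELY_PROHIBITED = 1


def ssh_string(data: bytes) -> bytes:
    """Encode an SSH 'string': uint32 length followed by the raw bytes."""
    return struct.pack(">I", len(data)) + data


def channel_open_failure(recipient_channel: int,
                         reason: int = SSH_OPEN_ADMINISTRATIVELY_PROHIBITED,
                         description: str = "",
                         language: str = "") -> bytes:
    """Build the payload of an SSH_MSG_CHANNEL_OPEN_FAILURE message."""
    return (
        bytes([SSH_MSG_CHANNEL_OPEN_FAILURE])
        + struct.pack(">II", recipient_channel, reason)
        + ssh_string(description.encode("utf-8"))
        + ssh_string(language.encode("ascii"))
    )
```

The other three reason codes defined there are connect failed (2), unknown channel type (3) and resource shortage (4), none of which describes "you have not authenticated yet" any better.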