
Discussion: SSL backend #1145

Open
njsmith opened this issue Jul 10, 2019 · 15 comments
Labels
design discussion, potential API breaker, TLS (Relevant to our TLS/SSL implementation)

Comments

@njsmith
Member

njsmith commented Jul 10, 2019

There are two main TLS libraries people use in Python: the stdlib ssl module, and the third-party pyopenssl. So far Trio has stuck with the stdlib module. @dholth recently raised the question of supporting pyopenssl, and at the time of writing has made significant progress on an implementation: #1140. The trade-offs here are super complicated, and keep coming up in various ways, so it's probably worth having a proper thread to collect discussion.

Considerations

SSLStream is some of the most complicated code in Trio, and has some of the most complicated tests, and any bugs are potentially security-sensitive. We really don't want to have to maintain multiple copies of it.

There's no reasonable way to abstract away the difference between these libraries from users. Users have to be able to configure the openssl Context object directly, and these two libraries have different and incompatible ways of doing that. Some projects (e.g. Twisted) have tried to define their own TLS configuration API so they can hide the underlying library, but it's a huge amount of work, means users lose access to what meager documentation exists for the other systems, and will inevitably end up hiding away features that people need.
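
For a concrete taste of the incompatibility, here's the same single knob (ALPN) configured against each library's own API -- an illustrative snippet only, not code from either project:

import ssl
from OpenSSL import SSL

# stdlib ssl: strings, and a context that already comes configured for verification
stdlib_ctx = ssl.create_default_context()
stdlib_ctx.set_alpn_protocols(["h2", "http/1.1"])

# pyopenssl: a different constructor, a different method name, and it wants bytes
pyopenssl_ctx = SSL.Context(SSL.TLSv1_2_METHOD)
pyopenssl_ctx.set_alpn_protos([b"h2", b"http/1.1"])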

ssl is in the stdlib, so always available. pyopenssl does have solid wheel coverage, so it's not a major burden for most users to install. So I guess the only case where this would matter is if pip ever wants to depend on Trio, then we'll need to be able to run Trio without any extension modules. Currently that's how it works on Linux/macOS; only Windows needs C extensions (specifically cffi). This is probably not worth worrying about too much -- if/when it makes sense for pip to seriously consider using Trio, we can figure out our options then.

ssl is in many ways simpler and easier to use. For example, a critical feature in any TLS library is to validate hostnames, or else all this certificate infrastructure is useless. With ssl, this is enabled by default. With pyopenssl, you need to be an expert and use a third-party library to enable it. Super tricky.
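
To make that concrete (a small illustration, not Trio code): with the stdlib the safe thing is the default, while with pyopenssl the usual recipe is to call the third-party service_identity package by hand after the handshake:

import ssl

# stdlib: hostname checking and certificate verification are on by default
ctx = ssl.create_default_context()
assert ctx.check_hostname and ctx.verify_mode == ssl.CERT_REQUIRED

# pyopenssl: nothing built in; the usual recipe is service_identity, e.g.
#     from service_identity.pyopenssl import verify_hostname
#     verify_hostname(connection, "example.com")   # call this after the handshake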

pyopenssl is generally more featureful. ssl has gotten a lot more complete in recent years, so we can mostly get away with it, but historically the pattern has been that serious frameworks always have something they need that they can't get from ssl, so they end up reaching for pyopenssl. That's @dholth's situation... he wants OCSP stapling, which pyopenssl supports and ssl doesn't.

pyopenssl isn't tied to the CPython release cycle. Example of why this matters: openssl has some annoying bugs in TLS 1.3 that cause our tests to deadlock (openssl/openssl#7967). For now we're disabling TLS 1.3 in the tests. But since TLS 1.3 didn't exist when CPython 3.5 and 3.6 were released, they don't expose the flag you need to disable TLS 1.3 (the 3.7+ version of that knob is sketched below the bullet)! So if you have a CPython 3.5 or 3.6 that's built against openssl 1.1.1 w/ TLS 1.3 support, then our tests just break. CPython 3.8 will expose some more TLS 1.3 configuration knobs that might let us work around this issue (bpo-37120)... but earlier versions never will. And in general this will also apply to any future limitations we might run into -- if we stick with ssl then we have to cross our fingers that we won't run into anything too dire in the future, because if we do then we're stuck.

  • Bonus: RHEL 8 has CPython 3.6 and openssl 1.1.1, so unless we can convince the openssl devs to fix 1.1.1, we're probably going to be stuck supporting this combination for a long time.
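
For reference, the workaround on interpreters that do expose the knob (3.7+) is a one-liner -- a quick sketch, with nothing equivalent available on 3.5/3.6:

import ssl

ctx = ssl.create_default_context()
if hasattr(ssl, "TLSVersion"):  # only exists on Python 3.7+
    ctx.maximum_version = ssl.TLSVersion.TLSv1_2  # effectively disables TLS 1.3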

In the long run, it would be Very Nice to support platform-native TLS libraries (SChannel on Windows, SecureTransport on macOS), since these are the only libraries that know how to hook into the system trust store to properly evaluate certificates. (rustls is also super cool.) Of course this is in direct contradiction to several of the above points. I was hoping PEP 543 would come along and save us, but it seems to be stalled for now. And at least as currently conceived, it's only trying to expose a minimal common subset, so it probably won't help with situations like @dholth's OCSP stapling.

Options

The Trio community as a whole could only support ssl (= status quo), or only support pyopenssl. I suspect that this won't be viable in the long run -- even if the core Trio package only supports one, there are enough tradeoffs between these that someone will need to support the other somehow.

We could copy/paste trio's SSLStream code and tests, then hack them to make PyOpenSSLStream and tests. (This is what #1140 does for now.) And then:

  • ...ship both sets of code inside trio itself.
  • ...ship ssl support in Trio itself, while relegating pyopenssl to some kind of third-party library
  • ...ship pyopenssl support in Trio itself, while relegating ssl to some kind of third-party library

It's nice that it's easy to ship code to use Trio+your-favorite-TLS-library as a third-party package, but all of these options do lead to a lot of tricky duplicate code. In the latter two cases we outsource that to third-parties, but someone would still need to do the work.

We could take on making PEP 543 happen, but that's a ton of work and as mentioned above, it still doesn't help with @dholth's use case.

I guess I can imagine a different/more limited version of PEP 543, where we don't try to abstract over configuration, but just the minimal set of features that Trio needs to wire up a TLS library to its I/O model. Like, let users use whatever context object they want, as long as they tell us how to do the core encrypt/decrypt/WantRead/WantWrite/feed-the-underlying-buffer operations? I guess that could even be used as a driver to grow something like PEP 543 over time, and then eventually extract it...

A more limited version of this would be to invent a little abstraction over just ssl and pyopenssl, and use that to write SSLStream. (So the user experience would be: when you create an SSLStream, you can pass in either a ssl.Context or OpenSSL.SSL.Context, your choice.) The two libraries are very similar from our point of view, because they both expose openssl's model more-or-less directly. unwrap might need a bit of care (ssl doesn't expose SSL_shutdown directly but instead wraps some automatic retry semantics around it, while I think pyopenssl just gives you SSL_shutdown), but AFAICT everything else maps pretty-much one-to-one. The biggest API issue is all the accessors that we currently re-export:

trio/trio/_ssl.py

Lines 348 to 395 in b994108

    _forwarded = {
        "context",
        "server_side",
        "server_hostname",
        "session",
        "session_reused",
        "getpeercert",
        "selected_npn_protocol",
        "cipher",
        "shared_ciphers",
        "compression",
        "pending",
        "get_channel_binding",
        "selected_alpn_protocol",
        "version",
    }

    _after_handshake = {
        "session_reused",
        "getpeercert",
        "selected_npn_protocol",
        "cipher",
        "shared_ciphers",
        "compression",
        "get_channel_binding",
        "selected_alpn_protocol",
        "version",
    }

    def __getattr__(self, name):
        if name in self._forwarded:
            if name in self._after_handshake and not self._handshook.done:
                raise NeedHandshakeError(
                    "call do_handshake() before calling {!r}".format(name)
                )
            return getattr(self._ssl_object, name)
        else:
            raise AttributeError(name)

    def __setattr__(self, name, value):
        if name in self._forwarded:
            setattr(self._ssl_object, name, value)
        else:
            super().__setattr__(name, value)

    def __dir__(self):
        return super().__dir__() + list(self._forwarded)

These are all specific to ssl. One option would be to give up and just give users access to the underlying ssl.SSLObject/OpenSSL.SSL.Connection, so e.g. instead of doing ssl_stream.getpeercert() you'd do ssl_stream.ssl_object.getpeercert(). This would also remove all that dynamic stuff that makes mypy grumpy. OTOH, it would mean that the type of SSLStream.ssl_object was statically Union[ssl.SSLObject, OpenSSL.SSL.Connection], which will also be annoying to mypy users, in a different way! Mayyybe we could do something with generics and overloads, so if you pass a ssl.Context you get a SSLStream[ssl.SSLObject], but if you pass a OpenSSL.SSL.Context you get a SSLStream[OpenSSL.SSL.Connection]?
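
Here's a very rough sketch of what that generics-and-overloads idea could look like, written as a free-standing factory function rather than the real SSLStream constructor (all names here are illustrative, not an existing Trio API):

import ssl
from typing import Generic, TypeVar, overload

import OpenSSL.SSL

T = TypeVar("T", ssl.SSLObject, OpenSSL.SSL.Connection)

class SSLStream(Generic[T]):
    ssl_object: T  # users reach through this instead of the forwarded accessors

@overload
def wrap(transport, context: ssl.SSLContext) -> SSLStream[ssl.SSLObject]: ...
@overload
def wrap(transport, context: OpenSSL.SSL.Context) -> SSLStream[OpenSSL.SSL.Connection]: ...
def wrap(transport, context):
    ...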

@tiran: I don't have any specific questions but I guess you might be interested in this so I'll CC you and you can unsubscribe if you want :-)

@dholth

dholth commented Jul 18, 2019

Remember how great Python's SimpleHTTPServer was? It shows up in non-Python tutorials sometimes just because people want a convenient web server. Unfortunately, it doesn't support http/2 and a lot of other new things. My vision is to capture that same magic in an obnoxiously modern server: in one minute, you should be able to have your humorous platypus-based website or online pyweek game running, with https that doesn't give you that nagging feeling that you really should reconfigure it later with Mozilla's recommended cipher suite or something, with staggered automatic renewal and OCSP to be as polite as possible to the letsencrypt service. See also the caddy web server.

So that brings us to hacking on the ssl module. I wanted to implement tls-alpn-01, which for current versions of openssl requires another use case not anticipated by the standard library: sniffing the ssl handshake to do SNI based on both the servername and the ALPN protocol. Yes, this is the recommended way to do it according to the openssl (not pyopenssl) maintainers. I later wanted to do OCSP stapling too. I've tried the same find/replace ssl -> PyOpenSSL strategy for asyncio and got about as far, then gave trio a try: trio's ssl protocol is astonishingly shorter and has more comments and tests.

The basic find/replace ssl -> pyopenssl idea is simple enough, and it's not difficult to translate send/receive bytes. PyOpenSSL has better exceptions - you can check the exception type in more places instead of catching SSLError and checking an error code. The ssl module has getters/setters in more places, sometimes letting you read back the option you just set as a property, where PyOpenSSL might have a set() method only. PyOpenSSL hasn't been super actively maintained lately.

The refined implementation is harder because you have to figure out when and which exception is thrown based on ssl error conditions you've never heard of before and debug both sides of this new connection object while it hangs talking to itself in the tests.

While hacking away at this, I've discovered even more implementation options while looking for rare PyOpenSSL examples. urllib3 implements a stdlib-ssl-API-compatible wrapper for PyOpenSSL in its contrib/ module, covering the bits that urllib3 itself needs (sadly not including the bio interface). I used some of that code to implement some of those re-exported accessors. I am also very interested in PyPy. It contains a complete implementation of the standard library ssl API built on top of a vendored version of cryptography; cryptography is also the basis for PyOpenSSL. PyPy's ssl could be made to run on CPython and be loaded before / instead of the standard library module; it would be easier to extend if you're not great at C. Or we could just add the missing method to the standard library module and recompile.

It might be possible to convert an ssl context into a PyOpenSSL context; I thought I saw that somewhere. But not the other way around, due to the setters-without-getters nature of PyOpenSSL.

If you have more pyopenssl examples I'd like to see them.

https://github.com/urllib3/urllib3/blob/master/src/urllib3/contrib/pyopenssl.py

https://bitbucket.org/pypy/pypy/src/default/lib_pypy/_cffi_ssl/

https://github.com/twisted/twisted/blob/6ac66416c0238f403a8dc1d42924fb3ba2a2a686/src/twisted/protocols/tls.py

https://github.com/twisted/twisted/blob/6ac66416c0238f403a8dc1d42924fb3ba2a2a686/src/twisted/internet/_sslverify.py

# brand "X"
$ radon raw cpython/Lib/asyncio/sslproto.py
    LOC: 727
    LLOC: 422
    SLOC: 430
    Comments: 46
    Single comments: 47
    Multi: 135
    Blank: 115
    - Comment Stats
        (C % L): 6%
        (C % S): 11%
        (C + M % L): 25%

# trio
$ radon raw _ssl.py 
_ssl.py
    LOC: 880
    LLOC: 207
    SLOC: 267
    Comments: 359
    Single comments: 358
    Multi: 166
    Blank: 89
    - Comment Stats
        (C % L): 41%
        (C % S): 134%
        (C + M % L): 60%

@tiran

tiran commented Jul 18, 2019

You may want to ask @alex and @reaperhulk before you go for PyOpenSSL. As far as I know they want to phase out PyOpenSSL.

Also, PyOpenSSL does not work in restricted environments with tightened security. All callbacks require executable and writeable memory to create dynamic FFI trampolines. The same mechanism allows attackers to execute native machine code, which is why kernel security policies such as SELinux usually disallow execmem to protect against exploits.

@dholth

dholth commented Jul 18, 2019

Good to know. Did you know cffi has a new non-trampoline version of callbacks? It works exactly like it would in C, where you have to pass C a void* to keep track of what you really want to happen. https://bitbucket.org/dholth/kivyjoy/src/aaeab79b2891782209a1219cd65a4d9716cea669/_kivyjoy/cdefs.py#lines-20 https://bitbucket.org/dholth/kivyjoy/src/aaeab79b2891782209a1219cd65a4d9716cea669/kivyjoy/__init__.py#lines-7
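
For anyone following along, the pattern looks roughly like this -- a sketch only, where the progress_cb slot and the _callbacks module name are invented for illustration. The extern "Python" function is generated as real C code at build time, so no writable-and-executable trampoline page is needed at runtime:

# build step (cffi out-of-line API mode)
from cffi import FFI

ffibuilder = FFI()
ffibuilder.cdef("""
    typedef int (*progress_cb)(int percent, void *userdata);
    extern "Python" int on_progress(int, void *);
""")
ffibuilder.set_source("_callbacks", "")
# ffibuilder.compile() emits a static C-level on_progress symbol

# at runtime you attach the Python implementation and pass state through a void*:
#     from _callbacks import ffi, lib
#
#     @ffi.def_extern()
#     def on_progress(percent, userdata):
#         state = ffi.from_handle(userdata)  # recover the Python object behind the void*
#         state["percent"] = percent
#         return 0
#
#     handle = ffi.new_handle({"percent": 0})  # this is the void* you hand to C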

@tiran

tiran commented Jul 18, 2019

It so happens that I filed a bug report against cffi several years ago, worked with Armin, and finally Armin implemented the static callbacks at my request.

I ported cryptography to the new static callbacks and also started to fix PyOpenSSL. However, porting required massive changes, and I quickly stopped working on the fix.

@alex

alex commented Jul 19, 2019

Yeah, we'd love to kill pyOpenSSL, because the code is grody and sad-making and the APIs aren't very good. But it's still maintained and supported, so I don't have an opinion on this.

@njsmith
Member Author

njsmith commented Jul 19, 2019

Remember how great Python's SimpleHTTPServer was? It shows up in non-Python tutorials sometimes just because people want a convenient web server. Unfortunately, it doesn't support http/2 and a lot of other new things. My vision is to capture that same magic in an obnoxiously modern server

Yeah, I totally see why you would want the feature. The question is how to do it :-)

It might be possible to convert a ssl context into a PyOpenSSL context; I thought I saw that somewhere; but not the other way around due to the setters-without-getters nature of PyOpenSSL.

Unfortunately, ssl.SSLContext is also missing some important getters, e.g. it has set_alpn_protocols but there's no get_alpn_protocols. We could imagine fixing that in 3.8, but that doesn't help anytime soon. So I think we need to assume that there's no way to convert an ssl.SSLContext into an OpenSSL.SSL.Context or vice-versa. And users create these objects, not us – so by the time we can do anything, the user has already committed to which library they're using, and it's too late for us to try to do anything clever with urllib3's or pypy's code.

So my best idea so far is to define an abstraction over the two libraries. Like, imagine an AbstractSSLConn class with attributes:

  • want_read_error, unexpected_eof_error: exception types to be used in except clauses
  • write_incoming_data, write_incoming_eof, read_outgoing: methods to interact with the underlying read/write buffers.
  • do_handshake, read, write, unwrap: the core TLS operations

And then we have a get_ssl_conn(context, server_hostname, server_side) function, which sniffs the type of the context object and returns an implementation of this abstract type using either ssl or pyopenssl. Either way it's a very thin wrapper.
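
Spelled out a little more (all names hypothetical, just mirroring the list above), the interface might look something like:

import ssl
from abc import ABC, abstractmethod

class AbstractSSLConn(ABC):
    want_read_error: type        # raised when more incoming bytes are needed
    unexpected_eof_error: type   # raised on truncation before close_notify

    @abstractmethod
    def write_incoming_data(self, data: bytes) -> None: ...
    @abstractmethod
    def write_incoming_eof(self) -> None: ...
    @abstractmethod
    def read_outgoing(self) -> bytes: ...
    @abstractmethod
    def do_handshake(self) -> None: ...
    @abstractmethod
    def read(self, max_bytes: int) -> bytes: ...
    @abstractmethod
    def write(self, data: bytes) -> int: ...
    @abstractmethod
    def unwrap(self) -> None: ...

def get_ssl_conn(context, server_hostname, server_side):
    # Sniff the type of the user's context object and pick the thin wrapper.
    # _StdlibSSLConn / _PyOpenSSLConn don't exist; they stand in for the two
    # concrete implementations of AbstractSSLConn.
    if isinstance(context, ssl.SSLContext):
        return _StdlibSSLConn(context, server_hostname, server_side)
    try:
        from OpenSSL import SSL
    except ImportError:
        pass
    else:
        if isinstance(context, SSL.Context):
            return _PyOpenSSLConn(context, server_hostname, server_side)
    raise TypeError("unsupported TLS context type: {!r}".format(type(context)))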

I'm also pretty sure that at least most of this interface could be implemented for other TLS libraries too. There might be a few little details to work out, like from some skimming it looks like Apple's SecureTransport can send close_notify but can't wait for it, so you can't implement unwrap, but unwrap is not that important. And at least to start with we can make this a private internal API, so we can always tweak it later if we need to.

We could incrementally add generic getters too – I guess getpeercert(binary=True) and selected_alpn_protocol are pretty generic and cover about 95% of the cases where folks need to introspect a TLS connection.

@dholth

dholth commented Jul 19, 2019

I think I will try pulling the stdlib ssl module into a separate package. I've already tried two things. (Actually I also want to try repackaging the entire stdlib). I don't want pyOpenSSL per se, I only want an extra callback and setter. Although it would be a tremendous nuisance if that involved redistributing openssl.

Your idea of a minimum abstraction is better. It would be great even to have the exception cases explained for a single backend. Should I read the API pep again?

On board with skipping unwrap.

@tiran

tiran commented Jul 19, 2019

Unfortunately, ssl.SSLContext is also missing some important getters, e.g. it has set_alpn_protocols but there's no get_alpn_protocols.

OpenSSL may be lacking some of these getters, too. When OpenSSL 1.1.0 made all structs opaque, they didn't add setters/getters for all fields. Let's collect all getters and I'll file PR's with upstream.

We could incrementally add generic getters too – I guess getpeercert(binary=True) and selected_alpn_protocol are pretty generic and cover about 95% of the cases where folks need to introspect a TLS connection.

3.8 has a new internal API, SSLContext._msg_callback, that gives you full access to low-level TLS frames and raw data.

@njsmith are you at the core sprints? I'd like to discuss TLS stuff with you.

@dholth

dholth commented Jul 19, 2019

Recall that txsni wraps the SSL context (and possibly the connection) to compensate for the missing getters, so it can recreate a matching context with the certificate for the client's requested server name. For hypercorn this was less important because the library controls the context factory.

@njsmith
Member Author

njsmith commented Jul 20, 2019

It looks like oscrypto has an issue open for exposing a buffer-based wrapper around SChannel and SecureTransport, which is relevant here: wbond/oscrypto#1. It seems to be mostly blocked on uncertainty about what the API should look like, which we might be able to help with :-).

@dholth

I think I will try pulling the stdlib ssl module into a separate package. I've already tried two things. (Actually I also want to try repackaging the entire stdlib). I don't want pyOpenSSL per se, I only want an extra callback and setter. Although it would be a tremendous nuisance if that involved redistributing openssl.

The ssl module was definitely one of the ones I had in the back of my mind when I started this discussion: https://discuss.python.org/t/if-python-started-moving-more-code-out-of-the-stdlib-and-into-pypi-packages-what-technical-mechanisms-could-packaging-use-to-ease-that-transition/1738

That's largely orthogonal to the question of Trio supporting multiple SSL backends, though. Probably it would be better to keep this thread more focused on that.

Your idea of a minimum abstraction is better. It would be great even to have the exception cases explained for a single backend. Should I read the API pep again?

PEP 543 is certainly relevant, but it's trying to be a complete-ish designed-from-scratch abstract interface to all of TLS, which means it has a ton of stuff that we don't care about here. Our needs are much more narrow. Of course down the road this might develop into something more general like PEP 543, but the best way to do that is to start with the stuff we actually need and expand incrementally.

The semantics I had in mind are:

When you call the high-level operations (do_handshake, read, write, unwrap, or whatever we end up with), then they might do any or all of the following things:

  • consume data from the incoming data buffer
  • write data to the outgoing data buffer
  • raise an exception

If they raise want_read_error, that means that we need to add some data to the incoming data buffer by calling write_incoming_data or write_incoming_eof, and then re-try the original call.

If they raise unexpected_eof_error, that means that they encountered an EOF on the underlying connection, and it happened at a moment where the TLS spec says that EOF should not happen. (E.g., in the middle of the handshake, or after the handshake but before receiving a close_notify message.) This can only happen after you've called write_incoming_eof.

If they raise any other error, that means something nasty went wrong, like a bad certificate or something. In this case Trio will wrap the exception in a trio.BrokenResourceError, so it's OK for it to be some weird PyOpenSSL exception type or whatever – users won't be encountering it directly.
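
Putting those rules together, the driver loop inside SSLStream would look roughly like this (a sketch against the hypothetical interface above, where conn is an AbstractSSLConn and transport_stream is an ordinary Trio stream):

async def _retry(conn, transport_stream, op, *args):
    while True:
        try:
            result = op(*args)
        except conn.want_read_error:
            # First flush anything the TLS object queued up (handshake records, etc.)...
            pending = conn.read_outgoing()
            if pending:
                await transport_stream.send_all(pending)
            # ...then feed it more bytes from the wire and retry the operation.
            data = await transport_stream.receive_some(65536)
            if data:
                conn.write_incoming_data(data)
            else:
                conn.write_incoming_eof()
        else:
            pending = conn.read_outgoing()
            if pending:
                await transport_stream.send_all(pending)
            return result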

@tiran

OpenSSL may be lacking some of these getters, too. When OpenSSL 1.1.0 made all structs opaque, they didn't add setters/getters for all fields. Let's collect all getters and I'll file PR's with upstream.

It's probably a good thing to do, but given that it will take a ~decade before we can use them, I think for purposes of this issue we should look for solutions that don't require openssl changes :-).

@njsmith are you at the core sprints? I'd like to discuss TLS stuff with you.

Unfortunately I won't be at the core sprints this year :-(. Though I guess you have plenty of ways to talk to me online...

@dholth

dholth commented Jul 23, 2019

Do you think some of the difficulty in porting to OpenSSL is due to hidden retries in the standard library ssl? Will that kind of thing get in the way of a standard api?

@njsmith
Member Author

njsmith commented Jul 23, 2019

The only time I know about where the ssl module does hidden retries is in unwrap, which generally calls SSL_shutdown twice. This is a little inconvenient, because we often want to call SSL_shutdown just once. But it turns out not to be a problem, because of a lucky quirk: in the cases where we only want to call SSL_shutdown once, then the second call is effectively a no-op aside from raising SSLWantReadError, so we can just ignore this exception.

So it's annoying and subtle, but I think our abstraction layer will be able to hide it away just fine.
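
Concretely, the ssl-backed side of the abstraction can just swallow that exception -- a sketch, not the actual SSLStream code:

import ssl

def send_close_notify(ssl_object: ssl.SSLObject) -> None:
    try:
        ssl_object.unwrap()
    except ssl.SSLWantReadError:
        # unwrap() calls SSL_shutdown twice; when we only want to *send* our
        # close_notify (not wait for the peer's), the second call has nothing
        # to read and raises this -- which is exactly the case where it's safe
        # to ignore.
        pass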

@njsmith
Member Author

njsmith commented Aug 1, 2019

Here's a complication we'll eventually have to deal with if we want to support the native certificate store on Windows and macOS. (Which we probably do, because this is the only way to reliably handle stuff like internal corporate servers using a private CA.) On both of these platforms, the native cert checking code may go off and hit the network to download root certs and/or check revocation servers. So, if you want to do certificate checking, you have to somehow arrange to do it in a thread.

I think the basic approach is (roughly sketched in code right after this list):

  1. disable certificate checking
  2. do the handshake as normal
  3. extract the certificate
  4. manually invoke the cert-checking code in a thread. For macOS this is SecTrustEvaluate, or maybe SecTrustEvaluateWithError. For Windows this is uh well stare at this for a while and hopefully you can figure it out.
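
In Trio terms that flow would look roughly like the following, using today's trio.to_thread.run_sync and a check_cert_via_os placeholder for the SecTrustEvaluate / Windows wrapper that doesn't exist yet:

import trio

async def handshake_with_native_verify(ssl_stream, hostname):
    # step 1 happened when the context was created:
    #     ctx.check_hostname = False; ctx.verify_mode = ssl.CERT_NONE
    await ssl_stream.do_handshake()                          # step 2
    der_cert = ssl_stream.getpeercert(binary_form=True)      # step 3
    # step 4: the OS validator may hit the network, so push it into a thread
    await trio.to_thread.run_sync(check_cert_via_os, der_cert, hostname)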

This is totally doable, but if we're coming up with an abstraction layer over backends then our abstract interface will need to be able to support this flow. The little sketch I wrote a few comments up won't be enough.

This isn't required for a version 1, of course, but I wanted to write it down somewhere so we can find it again later.

There will also be some fun where servers will only want to do this if they're actually requesting a client cert (you don't want to spawn a new thread for every incoming connection if you can avoid it), and renegotiation adds more complications (because if you support renegotiation then you have to be prepared to validate new certificates after the handshake; maybe we don't want to support this but we need to at least think about it).

There's also an orthogonal question of whether you want to use OpenSSL everywhere, or use the native platform TLS library. You might think that using the native platform TLS library would make it easier to use the native platform trust store, but in practice this blocking issue means that this isn't really true – no matter which TLS library you use underneath, if you want to use the system trust store in an async library, you have to do the cert checking by hand.

The main advantage of using the native platform TLS library is that it gets security updates automatically with the OS, while if we use OpenSSL then it's going to be the one that was shipped inside the Python interpreter or statically linked inside a pyca/cryptography wheel, and many users are bad about upgrading those when new OpenSSL releases come out.

Unfortunately, on macOS they're deprecating the current TLS library (SecureTransport) and moving to a new one that has some limitations that make it mostly unusable for us: #1165. This process will take a while, so I guess we could support SecureTransport in the mean time, but it doesn't seem very attractive to add new code to support something we know is going away and will never support TLS 1.3.

On Windows, maybe it does make sense to use SChannel? I literally can't find any docs for it on MSDN, so I don't know how usable it is. But I know Steve Dower is very eager to get people to stop using OpenSSL on Windows :-). Grovelling through the oscrypto source will probably help us find some search terms to locate the right docs? I dunno.

@Tronic
Contributor

Tronic commented Sep 24, 2019

Remember how great Python's SimpleHTTPServer was? It shows up in non-Python tutorials sometimes just because people want a convenient web server. Unfortunately, it doesn't support http/2 and a lot of other new things. My vision is to capture that same magic in an obnoxiously modern server: in one minute, you should be able to have your humorous platypus-based website or online pyweek game running, with https that doesn't give you that nagging feeling that you really should reconfigure it later with Mozilla's recommended cipher suite or something, with staggered automatic renewal and OCSP to be as polite as possible to the letsencrypt service. See also the caddy web server.

I encourage you to join the Sanic project. I'm currently adding Trio and HTTP/2 support, and help in polishing other details of the young project would be more than welcome. Writing HTTP applications couldn't really get much simpler, plus it also runs extremely fast (unlike most of the gazillion other Python HTTP frameworks).

# hello.py
import sanic
from sanic.response import text, json, html

app = sanic.Sanic()

@app.get("/")
async def frontpage(request):
    # maybe await something here...
    return text(f"Hello {request.ip}!")

@app.get("/api/person/<name>/<age:int>")
def person(request, name, age):
    return json({name: age})

Then python3 -m sanic --cert cert.pem --key key.pem hello.app

https://sanic.readthedocs.io/en/latest/

If you only need quick serving of static files, it is hard to beat Node's npx http-server. Unless the new module becomes part of the standard library, it'll never be as convenient as python -m SimpleHTTPServer on any random box with Python installed.

@njsmith
Member Author

njsmith commented Oct 22, 2020

Just happened across this really interesting issue that Dart ran into, where they were using the platform-native cert validator on their main thread and it caused massive multi-second freezes: dart-lang/sdk#41519

One thing I learned from the thread: Apparently BoringSSL actually has built-in support for SSL_do_handshake returning not just WANT_READ and WANT_WRITE, but also WANT_CERTIFICATE_VERIFY. Unfortunately OpenSSL doesn't seem to have anything similar yet...
