New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

try loading trusted certs from a list of fallbacks #633

Merged
merged 17 commits into from Jun 29, 2017

Conversation

Projects
None yet
6 participants
@reaperhulk
Member

reaperhulk commented Jun 8, 2017

pyca/cryptography will shortly begin shipping a wheel. Since SSL_CTX_set_default_verify_paths uses a hardcoded path compiled into the library, this will start failing to load the proper certificates for users on many linux distributions. To avoid this we can use the Go solution of iterating over a list of potential candidates and loading it when found.

fixes #632

Update: The approach has been modified to use the default cert file and default cert dir to detect whether the installed cryptography is sourced from a manylinux1 wheel. If it is (and the OpenSSL env vars that override default dirs are not set) then we load the fallbacks. This should address most if not all of the previous concerns that have been raised.

"""
for cafile in _CERTIFICATE_FILE_LOCATIONS:
if os.path.isfile(cafile):
self.load_verify_locations(cafile)

This comment has been minimized.

@reaperhulk

reaperhulk Jun 8, 2017

Member

If the file exists but is invalid this will currently raise an exception. Do we want to catch this and silently swallow it?

This comment has been minimized.

@Lukasa

Lukasa Jun 8, 2017

Member

Welcome to the catch-22 of this approach.

If you throw an exception then users on, say, Gentoo, are going to be real mad that they can no longer use the system trust store just because they put some random data at "/etc/pki/tls/cacert.pem", a path that has no special meaning on their system. If you don't throw an exception then you're silently swallowing potentially important errors. Maybe log?

Fun follow-on: should you load all certs at all these places? Does this open the user up to risks about having unexpected or malicious root certs elsewhere in the system suddenly be trusted by Python but nothing else? Doesn't this lead to PyOpenSSL behaving fundamentally differently to their native applications? Does this fundamentally mean that we should all abandon PyOpenSSL and just use PEP 543 instead? ARE WE NOT ALL DOOMED TO DIE ALONE?

This comment has been minimized.

@tiran

tiran Jun 8, 2017

I don't like the approach, too. I could be persuaded if you detect the distribution first (e.g. by parsing /etc/os-release and using ID and ID_LIKE, https://www.freedesktop.org/software/systemd/man/os-release.html) and only handle distribution specific paths for cafile and capath.

This comment has been minimized.

@reaperhulk

reaperhulk Jun 8, 2017

Member

@Lukasa these are all system directories where I wouldn't expect users to be dropping random files very often. And, if they do, the same problem applies to Go, which honestly seems like a reasonable defense.

Maybe raising a UserWarning if a file is found but is invalid? Or we could continue the loop and look for something else.

@tiran It's impractical to enumerate every possible distribution. What makes you think Go's approach doesn't work? It's not like Go binaries are unusual things and every TLS client connection made by one is doing trust roots in this fashion.

This comment has been minimized.

@tiran

tiran Jun 8, 2017

Go does more work to find the actual location. It takes the env vars into account and stops looking for certs when it finds a directory that contains at least on cert. https://github.com/golang/go/blob/master/src/crypto/x509/root_unix.go#L63

This comment has been minimized.

@squeaky-pl

squeaky-pl Jun 8, 2017

Maybe related, I do something similar to in Portable PyPy which ships OpenSSL as well: https://github.com/squeaky-pl/portable-pypy/blob/master/ssl3.py.patch

30.000 downloads after this change and nobody complained, which of course doesn't mean it's a right thing to do.

@Lukasa

This comment has been minimized.

Member

Lukasa commented Jun 8, 2017

/cc @dstufft

@Lukasa Lukasa requested a review from dstufft Jun 8, 2017

@@ -701,6 +711,35 @@ def set_default_verify_paths(self):
"""
set_result = _lib.SSL_CTX_set_default_verify_paths(self._context)
_openssl_assert(set_result == 1)
num = self._check_num_store_objects()

This comment has been minimized.

@tiran

tiran Jun 8, 2017

SSL_CTX uses lazy loading for capath directory. The check will always return 0 when a system has only capath and not a cafile.

This comment has been minimized.

@reaperhulk

reaperhulk Jun 8, 2017

Member

Ah, that's definitely a problem!

This comment has been minimized.

@tiran

tiran Jun 8, 2017

You have to detect capath yourself. https://github.com/python/cpython/blob/master/Lib/ssl.py#L335 take the env vars into account. The four function return the hard-coded paths and the names of the env vars:

X509_get_default_cert_file_env()
X509_get_default_cert_file()
X509_get_default_cert_dir_env()
X509_get_default_cert_dir()

The names of the files in capath match the pattern [0-9a-f]{8}\.[0-9]

This comment has been minimized.

@reaperhulk

reaperhulk Jun 8, 2017

Member

The env var behavior is still present and functional because we call SSL_CTX_set_default_verify_paths before we try the fallback, but we need to fix the lazy loading issue.

Do you know of a way to force the X509_STORE to load the certs? That seems like an easier path (if possible) than trying to replicate the logic.

"""
for cafile in _CERTIFICATE_FILE_LOCATIONS:
if os.path.isfile(cafile):
self.load_verify_locations(cafile)

This comment has been minimized.

@tiran

tiran Jun 8, 2017

I don't like the approach, too. I could be persuaded if you detect the distribution first (e.g. by parsing /etc/os-release and using ID and ID_LIKE, https://www.freedesktop.org/software/systemd/man/os-release.html) and only handle distribution specific paths for cafile and capath.

@dstufft

This comment has been minimized.

Member

dstufft commented Jun 8, 2017

So I tried this with pip. Some of the problems there won't apply here, because you're able to pass both a cafile and a capath (which we couldn't, because requests won't allow it) but some are still going to matter:

  • Detecting by distribution doesn't work, because there are like a billion distributions.
  • People are going to have random files at locations that other platforms use, if you're lucky these will be symlinks, if you're unlucky they'll be entirely unrelated files or completely unmaintained copies of a ca bundle.
  • Some systems don't use the CAFile at all and they only use a CAPath, it looks like this only looks at CAFile?
  • Some systems use both CAFile and CAPath and expect you to load both of them.

Honestly, I suggest just depending on certifi.

@alex

This comment has been minimized.

Member

alex commented Jun 8, 2017

My personal suggestion is that we hew as closely to what go does as possible: https://github.com/golang/go/blob/master/src/crypto/x509/root_unix.go (with the addition of trying whatever's compiled into OpenSSL first) and if that doesn't work it's ths users fault for having a weirdass computer.

@dstufft

This comment has been minimized.

Member

dstufft commented Jun 8, 2017

(Notably what Go is doing does not match what we're doing here)

@reaperhulk

This comment has been minimized.

Member

reaperhulk commented Jun 8, 2017

@dstufft do you remember what systems only had CAPath? And wouldn't the CAPath and CAFile requirement be an OpenSSL specific bug? We're shipping our own OpenSSL so bugs like that wouldn't affect us.

@reaperhulk

This comment has been minimized.

Member

reaperhulk commented Jun 8, 2017

We don't need to care about non-linux because we won't be shipping a wheel that causes this problem on those platforms.

@reaperhulk

This comment has been minimized.

Member

reaperhulk commented Jun 8, 2017

It looks like

	"/etc/ssl/certs",               // SLES10/SLES11, https://golang.org/issue/12139
	"/etc/pki/tls/certs",           // Fedora/RHEL

are potentially relevant from Go's list.

@tiran

This comment has been minimized.

tiran commented Jun 8, 2017

On Fedora and RHEL, the /etc/pki/tls/certs directory does not contain any files that match the pattern for capath certs.

@squeaky-pl

This comment has been minimized.

squeaky-pl commented Jun 8, 2017

When I did a similar thing for Portable PyPy I found https://www.happyassassin.net/2015/01/12/a-note-about-ssltls-trusted-certificate-stores-and-platforms/ very helpful.

@codecov

This comment has been minimized.

codecov bot commented Jun 8, 2017

Codecov Report

Merging #633 into master will increase coverage by 0.24%.
The diff coverage is 100%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master     #633      +/-   ##
==========================================
+ Coverage   96.78%   97.02%   +0.24%     
==========================================
  Files          18       18              
  Lines        5625     6084     +459     
  Branches      390      497     +107     
==========================================
+ Hits         5444     5903     +459     
- Misses        121      122       +1     
+ Partials       60       59       -1
Impacted Files Coverage Δ
src/OpenSSL/SSL.py 94.9% <100%> (+0.13%) ⬆️
tests/test_ssl.py 99.11% <100%> (+0.01%) ⬆️
tests/test_crypto.py 98.57% <0%> (ø) ⬆️
src/OpenSSL/crypto.py 97.34% <0%> (+0.95%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update c9280c5...1959382. Read the comment docs.

@reaperhulk

This comment has been minimized.

Member

reaperhulk commented Jun 8, 2017

Thanks for the info @squeaky-pl !

@reaperhulk

This comment has been minimized.

Member

reaperhulk commented Jun 8, 2017

I have significantly modified the approach (and added a lot of comments to explain the logic). Coverage will be abysmal at the moment but I'd like to see if @tiran and @Lukasa are more comfortable with this.

@reaperhulk reaperhulk force-pushed the reaperhulk:default-path-fun branch 2 times, most recently from d014394 to 72f2ab9 Jun 9, 2017

reaperhulk added some commits Jun 8, 2017

try loading trusted certs from a list of fallbacks
pyca/cryptography will shortly begin shipping a wheel. Since
SSL_CTX_set_default_verify_paths uses a hardcoded path compiled into the
library, this will start failing to load the proper certificates for
users on many linux distributions. To avoid this we can use the Go
solution of iterating over a list of potential candidates and loading
it when found.
capath is lazy loaded so we need to do a lot more checks
This now checks to see if env vars are set as well as seeing if the
dir exists and has valid certs in it. If either of those are true (or
the number of certs is > 0) it won't load the fallback. If it does do
the fallback it will also attempt to load certs from a dir as a final
fallback

@reaperhulk reaperhulk force-pushed the reaperhulk:default-path-fun branch from 72f2ab9 to 5d9d7fe Jun 9, 2017

@reaperhulk reaperhulk force-pushed the reaperhulk:default-path-fun branch from 5d9d7fe to 4e1e7cc Jun 9, 2017

reaperhulk added some commits Jun 20, 2017

@@ -130,6 +132,19 @@ class _buffer(object):
SSL_CB_HANDSHAKE_START = _lib.SSL_CB_HANDSHAKE_START
SSL_CB_HANDSHAKE_DONE = _lib.SSL_CB_HANDSHAKE_DONE
# Taken from https://golang.org/src/crypto/x509/root_linux.go

This comment has been minimized.

@alex

alex Jun 20, 2017

Member

Do we care about the BSD, Plan9 (lol), or Solaris values?

This comment has been minimized.

@reaperhulk

reaperhulk Jun 21, 2017

Member

We won't ship a precompiled binary for those platforms so we shouldn't need to care.

file_env_var = _ffi.string(
_lib.X509_get_default_cert_file_env()
).decode("ascii")
if not self._verify_env_vars_set(dir_env_var, file_env_var):

This comment has been minimized.

@alex

alex Jun 20, 2017

Member

Please name this check_env_vars_set.

@@ -699,8 +714,98 @@ def set_default_verify_paths(self):
:return: None
"""
# This function will attempt to load certs from both a cafile and
# capath that are set at compile time. However, it will first check
# environment variables and, if present, load those paths instead

This comment has been minimized.

@alex

alex Jun 20, 2017

Member

This comment feels like it's in the wrong place. It checks the default paths before manualyl doing the env vars.

This comment has been minimized.

@reaperhulk

reaperhulk Jun 21, 2017

Member

SSL_CTX_set_default_verify_paths itself checks env vars. It's part of OpenSSL itself. The logic here should look like this:

  1. Attempt to load via SSL_CTX_set_default_verify_paths. This will use the env vars specified by OpenSSL preferentially, but if they're not set (as they are not in most cases) then it will load the CA file and CA dir that were set at compile time.
  2. If env vars are set we do not attempt to load any fallbacks, but if they are not we go to the fallback path.

This comment has been minimized.

@alex

alex Jun 21, 2017

Member

Ah, ok, I see the ambiguity, "This function" refers to SSL_CTX_set_default_verify_paths but I thought it meant "the function we are in". Can you change the comment?

# objects are present. However, the cert directory (capath) is
# lazily loaded and num will always be zero so we need to check if
# the dir exists and has valid file names in it to cover that case.
num = self._check_num_store_objects()

This comment has been minimized.

@alex

alex Jun 20, 2017

Member

What happens if you load your own roots and then call set_default? What should happen?

This comment has been minimized.

@reaperhulk

reaperhulk Jun 21, 2017

Member

Prior to this patch it would load the system trust store properly even if you already had certs added. With this patch it would fail to do so. That is not what we want. Ideas? We could potentially track calls to load_verify_locations and run the fallback if it hasn't been called previously, but that is pretty ugly.

What if, instead of checking the number of certs in the store, we checked to see if the default cert file path and default cert dir path don't exist. If both are not present then we'd use fallbacks.

This comment has been minimized.

@alex

alex Jun 21, 2017

Member

Count the number of certs at the start, and instead of checking if the number after calling SSL_CTX_set... is zero, check if it's the same as at the start. (And add a test for it!)

This comment has been minimized.

@reaperhulk

reaperhulk Jun 21, 2017

Member

That also works, although I'm wondering if it'd make more sense to just say if the file is there then we shouldn't need fallbacks.

Another idea would be that we could compile our OpenSSL such that the default dir and file is very unique. Like /pyca/cryptography/openssl/cert.pem. Then our entire fallback check could just be "is the default dir/file this value", because that would be a guard value to tell us we're a manylinux1 build.

This comment has been minimized.

@alex

alex Jun 21, 2017

Member

So you're saying dump most of this logic and instead check "is this is a pyca/cryptography OpenSSL, then try these known distro paths"?

This comment has been minimized.

@reaperhulk

reaperhulk Jun 21, 2017

Member

Yeah, I didn't think of that until just now so I'm not sure I've thought through the implications, but it seems like rather than trying to detect if a system has successfully loaded roots maybe we're better off only modifying our behavior when we know we're running under a manylinux1 cryptography build.

:return: bool
"""
return (
os.environ.get(file_env_var, None) is not None or

This comment has been minimized.

@alex

alex Jun 20, 2017

Member

You can drop the None arg to get

return any(
[re.match(b'^[0-9a-f]{8}\.[0-9]', x) is not None for x in l]
)
except OSError:

This comment has been minimized.

@alex

alex Jun 20, 2017

Member

Only the os.listdir should be wrapped in the except, and only ENOENT should be caught, anyhting else can be reraised. Maybe EPERM as well, not sure.

try:
dir_var = "CUSTOM_DIR_VAR"
file_var = "CUSTOM_FILE_VAR"
os.environ[dir_var] = "value"

This comment has been minimized.

@alex

alex Jun 20, 2017

Member

Use monkeypatch for fucking with teh env.

@alex

This comment has been minimized.

Member

alex commented Jun 21, 2017

@alex

This comment has been minimized.

Member

alex commented Jun 21, 2017

reaperhulk added some commits Jun 21, 2017

"/etc/ssl/certs", # SLES10/SLES11
]
_CRYPTOGRAPHY_MANYLINUX1_CA_DIR = "/pyca/cryptography/openssl/certs"

This comment has been minimized.

@reaperhulk

reaperhulk Jun 21, 2017

Member

These values will need to be set by using --openssldir=/pyca/cryptography/openssl --prefix=/pyca/cryptography/openssl as args to the configure script in pyca/infra#98

This comment has been minimized.

@dstufft

dstufft Jun 21, 2017

Member

Is the point of these values just to act as a sigil to detect if we're inside a manylinux1 wheel or not? Do we expect people to ever put anything inside of these directories?

This comment has been minimized.

@alex

alex Jun 21, 2017

Member

I think the answer is it's a sigil, we don't expect people to put stuff there, but as an intermediate hack it's maybe handy that someone could put something there in a pinch.

This comment has been minimized.

@dstufft

dstufft Jun 21, 2017

Member

If we expect people to maybe put something there in a pinch, maybe it should conform to a more standard location like /opt/pyca/cryptograpy/openssl/ or something? Not a big deal either way though, but it feels weird to have a top level /pyca/ path.

This comment has been minimized.

@reaperhulk

reaperhulk Jun 21, 2017

Member

I'll go ahead and make the change. Hopefully users will never need to put something in that path, but I guess it's not crazy to want the path to be sane so that it could be done.

@alex

Looks good to me. Would also like a review from @dstufft or someone else, and I think we should hold off on merging until we finalize the manylinux1 side of this.

@reaperhulk

This comment has been minimized.

Member

reaperhulk commented Jun 29, 2017

pyca/infra#98 has landed and pyca/cryptography#3736 is now in review. This should no longer be blocked.

@alex

alex approved these changes Jun 29, 2017

This looks good to me, but I'd like additional reviews from other folks!

@alex alex added this to the 17.0.0 milestone Jun 29, 2017

@reaperhulk reaperhulk modified the milestones: 17.0.0, 17.1.0 Jun 29, 2017

@alex alex merged commit 55fb341 into pyca:master Jun 29, 2017

3 checks passed

codecov/patch 100% of diff hit (target 96.78%)
Details
codecov/project 97.02% (+0.24%) compared to c9280c5
Details
continuous-integration/travis-ci/pr The Travis CI build passed
Details
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment