Added PEMSerializationBackend interface #1276

Merged
merged 5 commits into from Sep 8, 2014

Projects

None yet

7 participants

@alex
Python Cryptographic Authority member

No description provided.

@public
Python Cryptographic Authority member

If we are doing a backend that handles format differences for you, is there actually a good reason for it to be specific to PEM? I'm still not sure why it actually needs to be a backend interface either.

@coveralls

Coverage Status

Coverage remained the same when pulling e0e9541 on alex:pem-loading-backend into ad116e2 on pyca:master.

@alex
Python Cryptographic Authority member

Could I get some opinions on this? I'd love to run with this and get the PEM public key loading done, since many people have asked for that

@dreid dreid commented on an outdated diff Sep 8, 2014
docs/hazmat/backends/interfaces.rst
@@ -578,6 +578,23 @@ A specific ``backend`` may provide one or more of these interfaces.
:class:`~cryptography.hazmat.primitives.interfaces.EllipticCurvePublicKey`
provider.
+.. class:: PEMSerializationBackend
+
+ .. versionadded:: 0.6
+
+ A backend with methods for working with any PEM encoded keys.
+
+ .. method:: load_pem_private_key(data, password)
+
+ :param bytes data: PEM data to load.
+ :param bytes password: The password to use if the data is encrypted.
+ Should be ``None`` is the data is not encrypted.
@dreid
dreid Sep 8, 2014

if the data is not is the data is

@reaperhulk
Python Cryptographic Authority member

I think it makes sense for this to be a backend interface since we do backend capability discovery via the interfaces and it's possible for a backend to not support PEMSerialization at all. Is the plan to add a load_pem_public_key method to this interface in the future?

@alex
Python Cryptographic Authority member
@coveralls

Coverage Status

Coverage remained the same when pulling 99e61ea on alex:pem-loading-backend into e9d027a on pyca:master.

@reaperhulk reaperhulk merged commit 86dd834 into pyca:master Sep 8, 2014

1 check was pending

Details continuous-integration/travis-ci The Travis CI build is in progress
@alex alex deleted the alex:pem-loading-backend branch Sep 8, 2014
@ronf

Here are some thoughts on the API:

  • I like what you're proposing of having a generic "load" function which doesn't require the caller to know or care what the key type is. I wonder if it should even be specific to PEM, though. Are you planning to have other APIs to import DER format keys, for instance, or other forms such as RFC-4716 or OpenSSH? In most case, those formats could be auto-recognized just as you already appear to be planning to auto-recognize the difference between PKCS#1 and PKCS#8 PEM formats.

  • The "password" argument should probably be called "passphrase", to be more consistent with terminology used in OpenSSL and OpenSSH.

  • Something should probably be said about the expected type of the passphrase. Right now you show it as bytes, but Unicode data needs special handling which varies depending on the key type. For instance, PBKDF2 expects UTF-8 encoded bytes, but PKCS#12 key derivation uses UTF-16 bigendian encoding of the passphrase, and adds a null word (two null bytes) at the end. This type of encoding is used for some of the older PKCS#8 ciphers like DES, RC2, and RC4 with SHA1 when using PBE version 1. In my AsyncSSH code, I allow the passphrase to be passed as either bytes or a Unicode string, and I convert the string to bytes using the proper encoding automatically once I see the cipher being used.

  • You may want to think about having an API that can load a list of keys, when multiple PEM blocks are concatenated together into a single file/buffer. This is pretty common, and could be difficult for the caller to do if you're doing all of the PEM parsing for them.

If you'd like to see what I did in AsyncSSH in terms of an API for this, check out:

http://asyncssh.timeheart.net/api.html#public-key-support

This also includes export APIs on key objects to serialize them back out in various formats including multiple varieties of DER & PEM, OpenSSH, and RFC4716.

@public
Python Cryptographic Authority member

I think our current API is designed a lot around the limitations of the public OpenSSL key handling APIs. Combined with our reluctance to add PyASN1 as a dependency this makes doing some of the stuff you suggest rather tricky :( I would very much like pretty much all the features you suggested though. (Although handling DER OpenSSL keys is hard since they aren't self describing.)

I would actually quite like to do all the ASN1 parsing ourselves for key serialization because it would make the sort of API you describe very easy but the OpenSSL APIs for doing that are poorly documented and not terribly user friendly.

e.g. Password encoding is a thing we simply don't handle at all right now as you point out. This is because OpenSSL doesn't either. (Unicode password handling is commented out in the OpenSSL source.)

@ronf

Yeah, for the minimal amount of ASN.1 support I needed for this, I also didn't want to bring in all of PyASN1, as really all that was needed was basic DER support. So, instead of 3000+ lines, I coded a very simple DER implementation as a single standalone module which is around 500 lines of code that provides both encode & decode functions that convert serialized data to/from native Python types where they exist, with a few special classes to represent other types like bit strings and OIDs.

While I understand the reluctance to duplicate code already in OpenSSL, I looked at this code as being more helper functions than crypto code itself, so I thought it was a reasonable tradeoff. I'm happy to respect whatever you decide here, though, and I've managed to work around it in AsyncSSH by having my own DER/PEM code interface with PyCA at the public/private numbers API when importing keys.

If you did want to use my ASN.1 code, I'd be happy to provide a version under a compatible license.

Regarding the specific point about the DER keys not being self-describing, that's only a problem in the PKCS#1 case really, as PKCS#8 is pretty self-describing even in DER form once you extract the OIDs. For PKCS#1 keys the best I've been able to do is try each of the different key types until I find one where the structure of the DER data matches what is expected. That's a bit ugly, but so far I haven't come up with a better alternative and it does seem to work well in practice. Since these encodings are legacy and we probably won't see any new ones, I think it's probably pretty safe.

Regarding OpenSSL Unicode support, it might be possible to use the OpenSSL APIs by always passing it a byte array as the passphrase, but "peek" into the DER data in Python to see what the key type was to decide which Unicode encoding to use before making the call. So, the caller would provide a Unicode string to the Python wrapper, but OpenSSL would get the properly encoded binary version. It would only be needed for PBE-encrypted keys, and determining whether they are one of the PCKS#12 encodings wouldn't be all that difficult.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment