New version of VaultAES256: no padding, base64 output (plus some cleanups) #12130

Closed
wants to merge 5 commits into
from

Conversation

Projects
None yet
8 participants
@amenonsen
Contributor

amenonsen commented Aug 27, 2015

These don't alter any functionality, but they set the stage for comprehensive improvements in the AES256 cipher while maintaining backwards compatibility more easily than was possible before. These simplifications will eventually also make it simpler to complete the VaultFile prototype and enable transparent vault usage from playbooks.

@amenonsen

This comment has been minimized.

Show comment
Hide comment
@amenonsen

amenonsen Aug 28, 2015

Contributor

The ultimate point of this cleanup is explained in the discussion on the PR and issue linked above; this PR makes it possible to solve both problems in an elegant and backwards-compatible way.

Contributor

amenonsen commented Aug 28, 2015

The ultimate point of this cleanup is explained in the discussion on the PR and issue linked above; this PR makes it possible to solve both problems in an elegant and backwards-compatible way.

@graingert

This comment has been minimized.

Show comment
Hide comment

I think this should use https://cryptography.io/en/latest/hazmat/primitives/symmetric-encryption/#cryptography.hazmat.primitives.ciphers.Cipher because cryptography is a well known and trusted interface

@amenonsen

This comment has been minimized.

Show comment
Hide comment
@amenonsen

amenonsen Aug 28, 2015

Contributor

@graingert and what we use from PyCrypto isn't well-known and trusted? (It's a real question, I honestly don't know.) I've heard people speak about using cryptography whenever available, but I'm not familiar with the tradeoffs/maintenance status/etc.

Unless there's some special reason to avoid PyCrypto, I don't think switching from one AES-CTR+HMAC-SHA-256 implementation to another is particularly compelling. I see that cryptography implements Fernet (https://github.com/fernet/spec/blob/master/Spec.md), though. I haven't reviewed the implementation (in particular, I don't know how it generates its keys), and I really prefer AES-CTR to AES-CBC, but using an existing high-level authenticated encryption implementation does have certain attractions.

Contributor

amenonsen commented Aug 28, 2015

@graingert and what we use from PyCrypto isn't well-known and trusted? (It's a real question, I honestly don't know.) I've heard people speak about using cryptography whenever available, but I'm not familiar with the tradeoffs/maintenance status/etc.

Unless there's some special reason to avoid PyCrypto, I don't think switching from one AES-CTR+HMAC-SHA-256 implementation to another is particularly compelling. I see that cryptography implements Fernet (https://github.com/fernet/spec/blob/master/Spec.md), though. I haven't reviewed the implementation (in particular, I don't know how it generates its keys), and I really prefer AES-CTR to AES-CBC, but using an existing high-level authenticated encryption implementation does have certain attractions.

@graingert

This comment has been minimized.

Show comment
Hide comment
@graingert

graingert Aug 28, 2015

I think what's compelling is how simple the code becomes:

>>> import os
>>> from cryptography.hazmat.primitives.ciphers import Cipher, algorithms, modes
>>> from cryptography.hazmat.backends import default_backend
>>> backend = default_backend()
>>> key = os.urandom(32)
>>> iv = os.urandom(16)
>>> cipher = Cipher(algorithms.AES(key), modes.CTR(iv), backend=backend)
>>> encryptor = cipher.encryptor()
>>> ct = encryptor.update(b"a secret message") + encryptor.finalize()
>>> decryptor = cipher.decryptor()
>>> decryptor.update(ct) + decryptor.finalize()

Edit: Ah this doesn't do HMAC.

I think what's compelling is how simple the code becomes:

>>> import os
>>> from cryptography.hazmat.primitives.ciphers import Cipher, algorithms, modes
>>> from cryptography.hazmat.backends import default_backend
>>> backend = default_backend()
>>> key = os.urandom(32)
>>> iv = os.urandom(16)
>>> cipher = Cipher(algorithms.AES(key), modes.CTR(iv), backend=backend)
>>> encryptor = cipher.encryptor()
>>> ct = encryptor.update(b"a secret message") + encryptor.finalize()
>>> decryptor = cipher.decryptor()
>>> decryptor.update(ct) + decryptor.finalize()

Edit: Ah this doesn't do HMAC.

@amenonsen

This comment has been minimized.

Show comment
Hide comment
@amenonsen

amenonsen Aug 28, 2015

Contributor

Yeah, it doesn't do HMAC, and it doesn't use PBKDF2 for key generation. Besides, the whole encryptor()+encryptor.update()+encryptor.finalize() thing is actually more code than we have now.

Contributor

amenonsen commented Aug 28, 2015

Yeah, it doesn't do HMAC, and it doesn't use PBKDF2 for key generation. Besides, the whole encryptor()+encryptor.update()+encryptor.finalize() thing is actually more code than we have now.

@amenonsen amenonsen changed the title from Vault cleanups, phase #2 to New version of VaultAES256: no padding, base64 output (plus some cleanups) Sep 2, 2015

@amenonsen

This comment has been minimized.

Show comment
Hide comment
@amenonsen

amenonsen Sep 2, 2015

Contributor

I've changed the title of this PR (from "Vault cleanups, phase 2") and added the commit that introduces a new version of the VaultAES256 cipher that doesn't use padding and Base64-encodes output. This should make it easier to understand the point of the cleanups.

Contributor

amenonsen commented Sep 2, 2015

I've changed the title of this PR (from "Vault cleanups, phase 2") and added the commit that introduces a new version of the VaultAES256 cipher that doesn't use padding and Base64-encodes output. This should make it easier to understand the point of the cleanups.

@mgedmin

View changes

test/units/parsing/vault/test_vault_editor.py
@@ -207,7 +208,7 @@ def test_rekey_migration(self):
os.unlink(v10_file.name)
- assert vl.cipher_name == "AES256", "wrong cipher name set after rekey: %s" % vl.cipher_name
+ assert re.search(';AES256',fdata), "wrong cipher name set after rekey"

This comment has been minimized.

@mgedmin

mgedmin Sep 3, 2015

Contributor

You could do the same ';AES256' in fdata check here as in test_vault_editor.py, and avoid the re module.

@mgedmin

mgedmin Sep 3, 2015

Contributor

You could do the same ';AES256' in fdata check here as in test_vault_editor.py, and avoid the re module.

This comment has been minimized.

@amenonsen

amenonsen Sep 3, 2015

Contributor

Sorry, I meant to fix both, but forgot to commit one hunk. Now fixed.

@amenonsen

amenonsen Sep 3, 2015

Contributor

Sorry, I meant to fix both, but forgot to commit one hunk. Now fixed.

@graingert

View changes

lib/ansible/parsing/vault/__init__.py
+
+ CTR mode requires that the same key and the same counter are only ever
+ used once for encryption. Since we generate the key using a random salt
+ each time, it would be fine to always start the counter at 0.

This comment has been minimized.

@graingert

graingert Sep 3, 2015

Or just a random 16bit IV.

@graingert

graingert Sep 3, 2015

Or just a random 16bit IV.

This comment has been minimized.

@amenonsen

amenonsen Sep 3, 2015

Contributor

The "problem" with a random 16-byte (not bit) IV is that it'll have to be stored alongside the random 32-byte salt in the output. It wouldn't hurt anything, but since there's also no point to doing it, I don't prefer this solution.

@amenonsen

amenonsen Sep 3, 2015

Contributor

The "problem" with a random 16-byte (not bit) IV is that it'll have to be stored alongside the random 32-byte salt in the output. It wouldn't hurt anything, but since there's also no point to doing it, I don't prefer this solution.

@amenonsen

This comment has been minimized.

Show comment
Hide comment
@amenonsen

amenonsen Sep 8, 2015

Contributor

@abadger IIRC you had concerns about whether the version number should belong to the cipher or the container. Did you get a chance to think about this further? I can recast the PR accordingly.

FYI: I've squashed this down to three separate commits now for easier review (it could have been two, but I left the .version change separate in case it needs to be reworked).

@ldx since you recently contributed changes to make the vault faster, maybe you have some thoughts about this code too?

Contributor

amenonsen commented Sep 8, 2015

@abadger IIRC you had concerns about whether the version number should belong to the cipher or the container. Did you get a chance to think about this further? I can recast the PR accordingly.

FYI: I've squashed this down to three separate commits now for easier review (it could have been two, but I left the .version change separate in case it needs to be reworked).

@ldx since you recently contributed changes to make the vault faster, maybe you have some thoughts about this code too?

@abadger

This comment has been minimized.

Show comment
Hide comment
@abadger

abadger Sep 10, 2015

Member

VaultLib:

  • Perhaps we want to keep a vault file version in addition to the individual cipher versions. It would make it easier to change the algorithm that is used to transform the binary data into ascii in the future, for instance.
  • Need a method that parses the header information of a vault file and returns the cipher name and cipher version. This is needed by VaultEditor.edit_file()

_cipher:

  • Have this function return the cipher class (ie: Cipher). Make it the responsibility of the calling code to instantiate the class. that way the calling code can pass additional arguments to the constructor.

_format_output:

  • Since the cipher_name is more important than the cipher_version, put that first. Also make cipher_name a byte string. Slightly less efficient but makes the interface cleaner. New signature is: _format_output(self, b_cipher_name, b_cipher_version, b_data)
  • We can remove the check for cipher_name as that is now passed in as a required parameter instead of being set on the class.
  • _format_output has lost the ability to break the data into 80 column lines. Need to restore that.

encrypt:

  • Need to take a parameter for the cipher_version as well as cipher_name. Default None is fine and then we can let the cipher choose the version it wants to use.
  • Need to add cipher version to the CIPHER_WHITELIST so that we can remove individual versions of the cipher should they prove insecure.
  • With the changes to _cipher(), setting the cipher variable becomes:
    cipher = self._cipher(cipher_name)(version=cipher_version)

_parse_vault_file:

  • Return all byte strings. Make it the responsibility of the calling code to transform any of the strings to unicode type if necessary. Slightly less efficient but it makes the interface much cleaner.
  • Prepend the variables in this function that are holding byte strings with b_.

decrypt:

  • Rename version to b_version so we know it holds a byte string.
  • With the change to _parse_vault_file() to make callers responsible for transforming to unicode, we should be setting b_cipher_name from the return value of _parse_vault_file(). Then we'll set cipher_name = to_unicode(b_cipher_name)
  • Need to add cipher version information to CIPHER_WHITELIST and check for that here.
  • Might not want to use "encryption cipher" in the error message as we're decrypting here and that might confuse users. How about u"Cipher used to encrypt this vault file is unrecognized: {0}".format(cipher_name)
  • With the changes to _cipher(), setting the cipher variable becomes:
    cipher = self._cipher(cipher_name)(version=b_version)

VaultEditor:

_edit_file_helper:

  • Needs to send the cipher_name and cipher_version when it calls self.vault.encrypt()
  • cipher_name and cipher_version need to be passed to _edit_file_helper() so that it knows what to pass along to self.vault.encrypt()

encrypt_file:

  • Needs to take an cipher_name and cipher_version. Needs to pass those along to self.vault.encrypt()

create_file:

  • Needs to take a cipher_name and cipher_version. Needs to pass those along to _edit_file_helper()

edit_file:

  • No longer a self.vault.cipher_name attribute that it can use. Need to call a new VaultLb function to retrieve the cipher_name and cipher_version
  • Needs to compare both cipher_name and cipher_version to the CIPHER_WRITE_WHITELIST.

rekey_file:

  • Need to take a cipher_name and cipher_version as parameters and pass those on to new_vault.encrypt()

VaultAES:

init:

  • Needs to take version as a parameter. Needs to compare whether version is equal to b'1.1' and raise an error if it does not match.

decrypt:

  • Should not take a version parameter since we're setting that in init()

VaultAES256:

init:

  • Need to take version as a parameter. Needs to compare whether version is allowed (b'1.1' or b'1.2') raise an error if no match

encrypt:

  • Need to be able to write 1.1 compatible encrypted data in case the user needs to maintain compatibility with older /usr/bin/ansible on other machines.
  • Use base64.b64encode/b64decode instead of codecs.encode/decode('base64'). Upstream python has talked about not having encode/decode do any bytes<=>bytes or text<=>text transformations in the future so this is future-proofing.
    import base64
    base64.b64encode/decode

Other files:

  • Probably need to add a command line arg to ansible-vault to allow selecting different ciphers+cipher_version

Other thoughts:

  • Since I'll be out, if you think that we don't need to be able to write VaultAES256 1.1 compatible files, see if you can talk jimi-c into that viewpoint. I'll go along with him if he thinks it's okay to only support a single format for writing.
  • In the ciphers, version can either go into the init() and become an attribute (self.version) or it can be a parameter of both decrypt and encrypt. I think it makes more sense in init() but you could try to justify the other direction instead. If we go the other way, then the version parameter needs to be checked to be sure that it is valid wherever it is given as a parameter.
Member

abadger commented Sep 10, 2015

VaultLib:

  • Perhaps we want to keep a vault file version in addition to the individual cipher versions. It would make it easier to change the algorithm that is used to transform the binary data into ascii in the future, for instance.
  • Need a method that parses the header information of a vault file and returns the cipher name and cipher version. This is needed by VaultEditor.edit_file()

_cipher:

  • Have this function return the cipher class (ie: Cipher). Make it the responsibility of the calling code to instantiate the class. that way the calling code can pass additional arguments to the constructor.

_format_output:

  • Since the cipher_name is more important than the cipher_version, put that first. Also make cipher_name a byte string. Slightly less efficient but makes the interface cleaner. New signature is: _format_output(self, b_cipher_name, b_cipher_version, b_data)
  • We can remove the check for cipher_name as that is now passed in as a required parameter instead of being set on the class.
  • _format_output has lost the ability to break the data into 80 column lines. Need to restore that.

encrypt:

  • Need to take a parameter for the cipher_version as well as cipher_name. Default None is fine and then we can let the cipher choose the version it wants to use.
  • Need to add cipher version to the CIPHER_WHITELIST so that we can remove individual versions of the cipher should they prove insecure.
  • With the changes to _cipher(), setting the cipher variable becomes:
    cipher = self._cipher(cipher_name)(version=cipher_version)

_parse_vault_file:

  • Return all byte strings. Make it the responsibility of the calling code to transform any of the strings to unicode type if necessary. Slightly less efficient but it makes the interface much cleaner.
  • Prepend the variables in this function that are holding byte strings with b_.

decrypt:

  • Rename version to b_version so we know it holds a byte string.
  • With the change to _parse_vault_file() to make callers responsible for transforming to unicode, we should be setting b_cipher_name from the return value of _parse_vault_file(). Then we'll set cipher_name = to_unicode(b_cipher_name)
  • Need to add cipher version information to CIPHER_WHITELIST and check for that here.
  • Might not want to use "encryption cipher" in the error message as we're decrypting here and that might confuse users. How about u"Cipher used to encrypt this vault file is unrecognized: {0}".format(cipher_name)
  • With the changes to _cipher(), setting the cipher variable becomes:
    cipher = self._cipher(cipher_name)(version=b_version)

VaultEditor:

_edit_file_helper:

  • Needs to send the cipher_name and cipher_version when it calls self.vault.encrypt()
  • cipher_name and cipher_version need to be passed to _edit_file_helper() so that it knows what to pass along to self.vault.encrypt()

encrypt_file:

  • Needs to take an cipher_name and cipher_version. Needs to pass those along to self.vault.encrypt()

create_file:

  • Needs to take a cipher_name and cipher_version. Needs to pass those along to _edit_file_helper()

edit_file:

  • No longer a self.vault.cipher_name attribute that it can use. Need to call a new VaultLb function to retrieve the cipher_name and cipher_version
  • Needs to compare both cipher_name and cipher_version to the CIPHER_WRITE_WHITELIST.

rekey_file:

  • Need to take a cipher_name and cipher_version as parameters and pass those on to new_vault.encrypt()

VaultAES:

init:

  • Needs to take version as a parameter. Needs to compare whether version is equal to b'1.1' and raise an error if it does not match.

decrypt:

  • Should not take a version parameter since we're setting that in init()

VaultAES256:

init:

  • Need to take version as a parameter. Needs to compare whether version is allowed (b'1.1' or b'1.2') raise an error if no match

encrypt:

  • Need to be able to write 1.1 compatible encrypted data in case the user needs to maintain compatibility with older /usr/bin/ansible on other machines.
  • Use base64.b64encode/b64decode instead of codecs.encode/decode('base64'). Upstream python has talked about not having encode/decode do any bytes<=>bytes or text<=>text transformations in the future so this is future-proofing.
    import base64
    base64.b64encode/decode

Other files:

  • Probably need to add a command line arg to ansible-vault to allow selecting different ciphers+cipher_version

Other thoughts:

  • Since I'll be out, if you think that we don't need to be able to write VaultAES256 1.1 compatible files, see if you can talk jimi-c into that viewpoint. I'll go along with him if he thinks it's okay to only support a single format for writing.
  • In the ciphers, version can either go into the init() and become an attribute (self.version) or it can be a parameter of both decrypt and encrypt. I think it makes more sense in init() but you could try to justify the other direction instead. If we go the other way, then the version parameter needs to be checked to be sure that it is valid wherever it is given as a parameter.
@srgvg

This comment has been minimized.

Show comment
Hide comment
@srgvg

srgvg Sep 10, 2015

Member

@abadger is that a roadmap in your comment or are you just happy to see this pull request?

Member

srgvg commented Sep 10, 2015

@abadger is that a roadmap in your comment or are you just happy to see this pull request?

@amenonsen

This comment has been minimized.

Show comment
Hide comment
@amenonsen

amenonsen Sep 10, 2015

Contributor

@abadger @jimi-c didn't need much convincing:

 18:00  <jimi|ansible> crab: yes that's how we did it last time
 18:00  <jimi|ansible> vault reads all but writes latest

I'll work on the changes from that perspective.

Contributor

amenonsen commented Sep 10, 2015

@abadger @jimi-c didn't need much convincing:

 18:00  <jimi|ansible> crab: yes that's how we did it last time
 18:00  <jimi|ansible> vault reads all but writes latest

I'll work on the changes from that perspective.

@amenonsen

This comment has been minimized.

Show comment
Hide comment
@amenonsen

amenonsen Sep 12, 2015

Contributor

Hi @abadger.

Thanks for taking the time to comment on this PR. I went through your
list, and here are some thoughts.

The gist is that I don't disagree with anything you're proposing, but I
think they're generalisations that should be implemented when there's an
actual use-case for them (e.g. when GPG support is added). I don't think
it makes sense to include many of them in this PR.

Also note that the comments below are based on the perspective that we
stay with the "read any, write latest" strategy.

VaultLib:

  • Perhaps we want to keep a vault file version in addition to the
    individual cipher versions.

Perhaps, but it can be added later (in a backwards-compatible way).

  • Need a method that parses the header information of a vault file and
    returns the cipher name and cipher version. This is needed by
    VaultEditor.edit_file()

I don't understand why this is needed (or perhaps how it would be used),
but _parse_vault_file should be easily adaptable.

_cipher:

  • Have this function return the cipher class (ie: Cipher).

OK.

_format_output:

  • … New signature is: _format_output(self, b_cipher_name, b_cipher_version, b_data)
  • We can remove the check for cipher_name …

OK.

  • _format_output has lost the ability to break the data into 80
    column lines. Need to restore that.

I don't think this is needed, at least until there's an actual cipher
that needs this. Base64 encoding takes care of this already, and GPG
won't need it either.

On the other hand, I observe that base64.b64encode() doesn't wrap the
output, and the encodestring() method (which does) is described as being
a "legacy" interface. So I'll see what makes the most sense (though to
be honest, I don't see the point of wrapping at 80 chars).

encrypt:

  • Need to take a parameter for the cipher_version as well as
    cipher_name.

Not needed per "write latest".

  • Need to add cipher version to the CIPHER_WHITELIST so that we can
    remove individual versions of the cipher should they prove insecure.

Can be done later.

_parse_vault_file:

  • Return all byte strings. …
  • Prepend the variables in this function that are holding byte
    strings with b_.

OK.

decrypt:

  • Rename version to b_version so we know it holds a byte string.

OK.

  • With the change to _parse_vault_file() to make callers responsible
    for transforming to unicode, we should be setting b_cipher_name from
    the return value of _parse_vault_file(). Then we'll set cipher_name
    = to_unicode(b_cipher_name)

OK.

  • Need to add cipher version information to CIPHER_WHITELIST and
    check for that here.

Again, can be done later when there's an actual need for it.

  • Might not want to use "encryption cipher" in the error message as
    we're decrypting here and that might confuse users. How about
    u"Cipher used to encrypt this vault file is unrecognized:
    {0}".format(cipher_name)

OK.

VaultEditor:

As far as I can tell, all of these changes can be done later.

VaultAES: […move version from decrypt to init…]
VaultAES256: […ditto…]

OK.

encrypt:

  • Need to be able to write 1.1 compatible encrypted data in case the
    user needs to maintain compatibility with older /usr/bin/ansible on
    other machines.

Not needed per "write latest".

  • Use base64.b64encode/b64decode instead of codecs.encode/decode('base64').

OK.

Other files:

  • Probably need to add a command line arg to ansible-vault to allow
    selecting different ciphers+cipher_version

Can be done later, e.g. when GPG support is introduced.

I would be happy to resubmit with the changes marked OK above; let me
know if you'd be willing to merge that. As for the remaining changes, I
don't think they belong with this PR (and I think it's premature to do
them before there's a concrete use-case), and I don't think it's fair
to make them dependencies for the change that I proposed.

Contributor

amenonsen commented Sep 12, 2015

Hi @abadger.

Thanks for taking the time to comment on this PR. I went through your
list, and here are some thoughts.

The gist is that I don't disagree with anything you're proposing, but I
think they're generalisations that should be implemented when there's an
actual use-case for them (e.g. when GPG support is added). I don't think
it makes sense to include many of them in this PR.

Also note that the comments below are based on the perspective that we
stay with the "read any, write latest" strategy.

VaultLib:

  • Perhaps we want to keep a vault file version in addition to the
    individual cipher versions.

Perhaps, but it can be added later (in a backwards-compatible way).

  • Need a method that parses the header information of a vault file and
    returns the cipher name and cipher version. This is needed by
    VaultEditor.edit_file()

I don't understand why this is needed (or perhaps how it would be used),
but _parse_vault_file should be easily adaptable.

_cipher:

  • Have this function return the cipher class (ie: Cipher).

OK.

_format_output:

  • … New signature is: _format_output(self, b_cipher_name, b_cipher_version, b_data)
  • We can remove the check for cipher_name …

OK.

  • _format_output has lost the ability to break the data into 80
    column lines. Need to restore that.

I don't think this is needed, at least until there's an actual cipher
that needs this. Base64 encoding takes care of this already, and GPG
won't need it either.

On the other hand, I observe that base64.b64encode() doesn't wrap the
output, and the encodestring() method (which does) is described as being
a "legacy" interface. So I'll see what makes the most sense (though to
be honest, I don't see the point of wrapping at 80 chars).

encrypt:

  • Need to take a parameter for the cipher_version as well as
    cipher_name.

Not needed per "write latest".

  • Need to add cipher version to the CIPHER_WHITELIST so that we can
    remove individual versions of the cipher should they prove insecure.

Can be done later.

_parse_vault_file:

  • Return all byte strings. …
  • Prepend the variables in this function that are holding byte
    strings with b_.

OK.

decrypt:

  • Rename version to b_version so we know it holds a byte string.

OK.

  • With the change to _parse_vault_file() to make callers responsible
    for transforming to unicode, we should be setting b_cipher_name from
    the return value of _parse_vault_file(). Then we'll set cipher_name
    = to_unicode(b_cipher_name)

OK.

  • Need to add cipher version information to CIPHER_WHITELIST and
    check for that here.

Again, can be done later when there's an actual need for it.

  • Might not want to use "encryption cipher" in the error message as
    we're decrypting here and that might confuse users. How about
    u"Cipher used to encrypt this vault file is unrecognized:
    {0}".format(cipher_name)

OK.

VaultEditor:

As far as I can tell, all of these changes can be done later.

VaultAES: […move version from decrypt to init…]
VaultAES256: […ditto…]

OK.

encrypt:

  • Need to be able to write 1.1 compatible encrypted data in case the
    user needs to maintain compatibility with older /usr/bin/ansible on
    other machines.

Not needed per "write latest".

  • Use base64.b64encode/b64decode instead of codecs.encode/decode('base64').

OK.

Other files:

  • Probably need to add a command line arg to ansible-vault to allow
    selecting different ciphers+cipher_version

Can be done later, e.g. when GPG support is introduced.

I would be happy to resubmit with the changes marked OK above; let me
know if you'd be willing to merge that. As for the remaining changes, I
don't think they belong with this PR (and I think it's premature to do
them before there's a concrete use-case), and I don't think it's fair
to make them dependencies for the change that I proposed.

@amenonsen

This comment has been minimized.

Show comment
Hide comment
@amenonsen

amenonsen Sep 13, 2015

Contributor

I've pushed a new version with most of the changes discussed above, except for pushing unicode conversion to the callers of _format_output and _parse_vault_file (specifically, I've left the cipher name and version as unicode strings; pushing it to the callers made for more ugly boilerplate code). IMHO that's about as far as it's reasonable to clean things up in the context of this PR.

Contributor

amenonsen commented Sep 13, 2015

I've pushed a new version with most of the changes discussed above, except for pushing unicode conversion to the callers of _format_output and _parse_vault_file (specifically, I've left the cipher name and version as unicode strings; pushing it to the callers made for more ugly boilerplate code). IMHO that's about as far as it's reasonable to clean things up in the context of this PR.

@amenonsen

This comment has been minimized.

Show comment
Hide comment
@amenonsen

amenonsen Sep 19, 2015

Contributor

@abadger did you get a chance to look through these changes?

Contributor

amenonsen commented Sep 19, 2015

@abadger did you get a chance to look through these changes?

@amenonsen

This comment has been minimized.

Show comment
Hide comment
@amenonsen

amenonsen Sep 30, 2015

Contributor

@abadger I've pushed the changes you requested, to add a separate header version field (set to 1.2 here) in addition to the cipher version (which works as before). The test failures are due to unrelated breakage in devel caused by commit 79e364d, but I did update the tests for the changes in this PR (and those pass):

kea:~/extern/ansible $ PYTHONPATH=./lib nosetests -d -w test/units/parsing/vault -q
----------------------------------------------------------------------
Ran 14 tests in 0.640s

OK
Contributor

amenonsen commented Sep 30, 2015

@abadger I've pushed the changes you requested, to add a separate header version field (set to 1.2 here) in addition to the cipher version (which works as before). The test failures are due to unrelated breakage in devel caused by commit 79e364d, but I did update the tests for the changes in this PR (and those pass):

kea:~/extern/ansible $ PYTHONPATH=./lib nosetests -d -w test/units/parsing/vault -q
----------------------------------------------------------------------
Ran 14 tests in 0.640s

OK

@abadger abadger added this to the v2 milestone Oct 1, 2015

@abadger

This comment has been minimized.

Show comment
Hide comment
@abadger

abadger Oct 1, 2015

Member

Marking v2 as there's a few things in here that are fixes we'd really want for v2 as well.

Member

abadger commented Oct 1, 2015

Marking v2 as there's a few things in here that are fixes we'd really want for v2 as well.

@jimi-c jimi-c modified the milestones: next, v2 Oct 8, 2015

amenonsen and others added some commits Aug 27, 2015

VaultLib interface cleanups: remove .cipher_name, move .version
As noted in the comments in PR #12112, the cipher_name stored in the
VaultLib conflates the desired cipher for encryption with the cipher
read from an encrypted file header. This change makes the encryption
cipher a parameter to encrypt() and changes _split_header() to return
the extracted values so that decrypt() can use the appropriate cipher
name (which a new helper transforms into a VaultXXX class).

As a consequence, _split_header is renamed to _parse_vault_file, which
also becomes more strict about parsing vault headers. (This also means
that decrypt() doesn't have to check is_encrypted() separately, since
the validation is being done anyway.)

We also move b_version from VaultLib to the individual cipher classes so
that they can manage their own versioning; we also pass the version from
the file header to the cipher's decrypt() method.

This doesn't change any functionality, but it paves the way for easier
format changes without having to introduce ever more cipher classes.
Introduce an improved AES256 cipher version
Now we don't add spurious padding for AES-CTR (which is used as a stream
cipher, and doesn't need padding) and we use base64 to encode the output
message instead of blowing up its size by hexlify()ing everything twice.
We also don't unnecessarily use PBKDF2 to generate the initial value for
the encryption counter.

We always use the new format for output, but maintain read-compatibility
with a few lines of 1.1-specific decoding in decrypt(). (Note that we do
not force v1.1 files to be rewritten by ansible-vault edit unless their
contents have actually changed, in which case the new cipher version is
used automatically).

Closes #10634 from graingert (which removed the padding, but added yet
another cipher class for backwards compatibility)

Fixes #12121 from mgedmin (which reported the double-hexlify() problem)
Add a vault header version in addition to the cipher version
The header now looks like this: $ANSIBLE_VAULT;1.2;AES256;1.2

The vault header version is currently not used to convey any extra
information. It's included in the hope that it will help to escape
unforeseen problems.

The earlier _format_output method is renamed to _add_vault_header, and
is no longer responsible for wrapping the output at 80 columns (cipher
classes are now responsible for doing their own formatting; GPG would
need this anyway).

The _parse_vault_file method is renamed to _parse_vault_header, and is
no longer responsible for unwrapping the ciphertext (cipher classes are
responsible for undoing whatever formatting they originally did).

A new decrypt_with_metadata method is introduced primarily so that
VaultEditor can find the cipher_name to see if it wants to upgrade
to a new cipher or not.

These changes were made at the request of @abadger.
Four changes related to parsing the Vault File header
* Have internal VaultLib functions dealing with the header return byte strings
* Switch finding the Ciphers to use a mapping for safety
* If we recognize an invalid vault file while parsing the header, throw
  an exception there instead of making a higher level decide that.
* Make version comparison robust against double digit versions

@amenonsen amenonsen closed this Oct 9, 2015

@graingert

This comment has been minimized.

Show comment
Hide comment
@graingert

graingert Oct 9, 2015

Closed? WFT?

Closed? WFT?

@amenonsen

This comment has been minimized.

Show comment
Hide comment
@amenonsen

amenonsen Oct 10, 2015

Contributor

Since a number of people have now written to ask why I closed the PR, the main reasons are that (a) it won't be in v2: "[we] Decided to take it off the v2 milestone because we don't have anyone in-house to review it for security problems and it's late to be adding it for v2" (per @abadger on IRC), (b) this code has significant security and compatibility considerations, and if it's not going to be in v2, I don't want people to use it (e.g. because of possible divergence with core cipher names/versions, if something else gets merged in future); I know of people who want to, at least, and we know that people have used other unmerged features from PRs in the past (e.g. with the vault+gpg changes, to name just one), (c) it's blocking @mgedmin's python 3 compatibility work because he's been holding off on touching the vault code until this PR was merged.

On a personal note, I feel it reflects poorly on me as a contributor to have long-pending PRs (and bitrotting security code in particular is the worst). I know that not everyone will think this is a sensible feeling, and it's certainly not something I mean to make a big deal about (the reasons above, to me, are sufficient to motivate the decision to close), but it's how I feel anyway.

Contributor

amenonsen commented Oct 10, 2015

Since a number of people have now written to ask why I closed the PR, the main reasons are that (a) it won't be in v2: "[we] Decided to take it off the v2 milestone because we don't have anyone in-house to review it for security problems and it's late to be adding it for v2" (per @abadger on IRC), (b) this code has significant security and compatibility considerations, and if it's not going to be in v2, I don't want people to use it (e.g. because of possible divergence with core cipher names/versions, if something else gets merged in future); I know of people who want to, at least, and we know that people have used other unmerged features from PRs in the past (e.g. with the vault+gpg changes, to name just one), (c) it's blocking @mgedmin's python 3 compatibility work because he's been holding off on touching the vault code until this PR was merged.

On a personal note, I feel it reflects poorly on me as a contributor to have long-pending PRs (and bitrotting security code in particular is the worst). I know that not everyone will think this is a sensible feeling, and it's certainly not something I mean to make a big deal about (the reasons above, to me, are sufficient to motivate the decision to close), but it's how I feel anyway.

@graingert

This comment has been minimized.

Show comment
Hide comment
@graingert

graingert Oct 10, 2015

That's fair enough. Do you think there well be a 2.1 milestone to target?

That's fair enough. Do you think there well be a 2.1 milestone to target?

@jimi-c

This comment has been minimized.

Show comment
Hide comment
@jimi-c

jimi-c Oct 10, 2015

Member

@graingert that's pretty much what the next milestone is. Once 2.0 is out, we'll create a 2.1 milestone and move everything from next to 2.1.

Member

jimi-c commented Oct 10, 2015

@graingert that's pretty much what the next milestone is. Once 2.0 is out, we'll create a 2.1 milestone and move everything from next to 2.1.

@mgedmin

This comment has been minimized.

Show comment
Hide comment
@mgedmin

mgedmin Oct 12, 2015

Contributor

(c) it's blocking @mgedmin's python 3 compatibility work because he's been holding off on touching the vault code until this PR was merged.

FWIW that was mostly an excuse to give people an incentive to merge this quicker.

Contributor

mgedmin commented Oct 12, 2015

(c) it's blocking @mgedmin's python 3 compatibility work because he's been holding off on touching the vault code until this PR was merged.

FWIW that was mostly an excuse to give people an incentive to merge this quicker.

@amenonsen amenonsen deleted the amenonsen:vault-cleanups branch Dec 11, 2015

@h0nIg

This comment has been minimized.

Show comment
Hide comment
@h0nIg

h0nIg May 19, 2016

Contributor

could you please review this (and reopen, since 2.1 is RC and its still not included), because http://toroid.org/ansible-vault-strange-crypto points out some improvements regarding vault files and speed?

+1

Contributor

h0nIg commented May 19, 2016

could you please review this (and reopen, since 2.1 is RC and its still not included), because http://toroid.org/ansible-vault-strange-crypto points out some improvements regarding vault files and speed?

+1

@amenonsen

This comment has been minimized.

Show comment
Hide comment
@amenonsen

amenonsen May 19, 2016

Contributor

This is superseded by #15060, which I'll merge when I get some time to address review comments.

Contributor

amenonsen commented May 19, 2016

This is superseded by #15060, which I'll merge when I get some time to address review comments.

@ansibot ansibot added feature and removed feature_pull_request labels Mar 4, 2018

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment