New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Rework crypto layer #1345

Merged
merged 5 commits into from Dec 21, 2018

Conversation

Projects
None yet
3 participants
@plettich
Copy link
Contributor

plettich commented Dec 18, 2018

Working on the crypto layer to improve the interfaces.
Changed the signature of some function to accept/return (hexlified) unicode strings where appropriate. Some functions still accept/return bytes since they are called from several places with unclear code paths.
This is more a basis for discussion...

Merging #1345 into master will decrease coverage by <.01%.
The diff coverage is 93.75%.
Impacted file tree graph

@@            Coverage Diff             @@
##           master    #1345      +/-   ##
==========================================
- Coverage   96.37%   96.36%   -0.01%
==========================================
Files         144      144
Lines       17315    17312       -3
==========================================
- Hits        16687    16683       -4
- Misses        628      629       +1
Impacted Files Coverage Δ
privacyidea/lib/token.py 94.76% <ø> (ø) ⬆️
privacyidea/lib/security/aeshsm.py 38.09% <ø> (ø) ⬆️
privacyidea/lib/security/pkcs11.py 0% <0%> (ø) ⬆️
privacyidea/api/auth.py 99.11% <100%> (-0.01%) ⬇️
privacyidea/lib/utils.py 96.85% <100%> (+0.01%) ⬆️
privacyidea/lib/security/default.py 97.84% <100%> (-0.54%) ⬇️
privacyidea/lib/tokens/tantoken.py 100% <100%> (ø) ⬆️
privacyidea/lib/crypto.py 95.16% <96.66%> (-0.39%) ⬇️
privacyidea/models.py 98.82% <97.05%> (-0.08%) ⬇️
... and 1 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 2f14e9b...d709698. Read the comment docs.

  • I am fine with my comments!
    If Friedrich is fine with his points, I think this can be merged.
  • I still have some open comments -- but we could also keep them in mind for a subsequent PR to get this merged?
  •  File: privacyidea/lib/crypto.py:L74-80
  • I know, this is a new one: But it might make sense to define the CONFIG_KEY; TOKEN_KEY here and import them in e.g. lib/security/aeshsm.py ... so that we do not need to redefine the ints in all files.
  • I was thinking about that, too.
    Unfortunately they are mixed up:
    in lib/crypto.py:
    CONFIG_KEY = 1 TOKEN_KEY = 2 VALUE_KEY = 3
    in lib/security/aeshsm.py:
    TOKEN_KEY = 0 CONFIG_KEY = 1 VALUE_KEY = 2
    and in lib/security/default.py:
    TOKEN_KEY = 0 CONFIG_KEY = 1 VALUE_KEY = 2 DEFAULT_KEY = 3
    I'll check the code and fix it.
  • fixed in d709698
    There are also the functions encrypt() and decrypt() in crypto.py which pass the slot id, but it is actually unused.
  •  File: privacyidea/lib/crypto.py:L136-227
  • would it make sense and be safe to do
return _to_unicode(binascii.hexlify(_to_bytes(s)))

?

  • sure, I'll fix it.
  • fixed in d709698
  •  File: privacyidea/lib/crypto.py:L136-227
  • Where is the salt and the rounds gone? (OK, obviously we might drop passlib anyways later in favour for cryptography)
  • The documentation recommends keeping the default values. I am not sure what we would gain (in our case even reducing the number of rounds and the salt_size compared to the default).
    I would prefer sticking with the default since 'they know best'...
  • You are probably right. I also think the parameters where not used externally. We added them to be able to allow someone who "thinks he knows better", to more easily adapt this.
  •  File: privacyidea/lib/crypto.py:L326-378
  • I used this step before the return since it was easier to debug. What do you tink?
  • I think it is more 'pythonic' but You are right, debugging is easier :-)
  • fixed in d709698
  •  File: privacyidea/lib/security/default.py:L43-54
  • if we use the functions _to_bytes and _to_unicode outside of the original module I think we should calls them to_bytes and to_unicode.
  • I tried to keep these functions to the crypto.py module... I'll check if the import can be avoided.
  • Maybe they'll be useful for porting other modules. At least hexlify_and_unicode might be :-)
  • One Problem is, if we define something in lib/utils.py we can't use it in lib/crypto.py since the crypto module is imported into utils...
    I would prefer if the utils module would not import any privacyIDEA specific modules so that we can import it everywhere.
    We could move the functions that make use of the crypto module (specifically urandom() and geturandom()) into the crypto module, but i would like to do that in a different PR.
  • Yes, a privacyIDEA-independent utils module would be nice! But doing this in another PR is probably a good idea.
  • Right. The utils was ment to run independent of anything privacyidea specific.
  •  File: privacyidea/lib/crypto.py:L46-58
  • Please add some information to the doc string of this module, that parameter of functions are expected as a certain type and functions in this module usually return a certain type (is it hexlified byte string?)
  • on it.
  •  File: privacyidea/lib/security/default.py:L325-352
  • I wasn't really aware of this method :-) But I see it's used to encrypt encryption keys with a password in encrypt_enckey in pi-manage. I think encryption keys are just bytes. What do you think about making password_encrypt take bytes and return bytes, and password_decrypt take bytes and return bytes, in order to make it consistent with the other methods of DefaultSecurityModule?
    Also, I think we'll need to change encrypt_enckey in pi-manage: it reads the enckey file using open(encfile, "r") which I think will break in Python 3?
  • I stumbled upon this as well, we need to distinguish between functions that expect/return strings (hashing, encrypting/decrypting passwords/pins) and functions which handle arbitrary (binary) data. I am on it.
    And there are several open() calls which need to be changed to read the data in binary format. I'll fix them along the way.
  • This is all a little tricky: encrypt_enckey in pi-manage reads in an encKey file (which should be binary) and prints out a string (hexlified iv and hexlified encrypted data).
    This string will be read in by _get_secret() in the DefaultSecurityModule which then calls password_decrypt with the given string.
    I am not really sure how we should handle things here.
    We are going to rewrite these parts anyway when we switch to a different crypto-library (AES, padding, etc...), so maybe we should keep it that way for now and keep this in mind for the next rewrite.
  • I think adding a TODO at those functions should be fine.
  •  File: privacyidea/models.py:L393-402
  • Correct me if I'm wrong, but can't we remove this because encryptPin will call _to_bytes() anyway?
  • No need for correction :-)
  • fixed in d709698
  •  File: privacyidea/lib/crypto.py:L326-378
  • Hm, I see why you changed encrypt to return a hexstring, and I think that makes sense. But I think it's unfortunate that encrypt and decrypt are not inverses anymore (because decrypt expects bytes).
    Maybe we can keep the old encrypt/decrypt operating on bytes, and add a new encrypt_and_hexlify? But maybe there's a better way to fix it.
  • Yes, i don't like this either. I'll look into the usage of these functions again, maybe there is a better approach.
  • @fredreichbier : @plettich added a docstring, so that we can handle this later with issue #337
  • We can fix this in a later PR.

  

plettich added some commits Dec 14, 2018

DB tests with Python 2/3 compatibility
Changing the interface of the crypto library to accept unicode as well as
byte-string and return only unicode.
Rework of crypto layer
- most functions in lib/crypto.py now return an appropriate type
  (hexlified unicode strings or binary data). All calling functions are
  adjusted accordingly.
- add/fix a lot of comments regarding the parameter/return types

@plettich plettich requested a review from privacyidea/core Dec 18, 2018

@cornelinux cornelinux changed the title Rework crypt layer Rework crypto layer Dec 18, 2018

@codecov

This comment has been minimized.

Copy link

codecov bot commented Dec 18, 2018

Codecov Report

Merging #1345 into master will increase coverage by 0.32%.
The diff coverage is 97.45%.

Impacted file tree graph

@@            Coverage Diff            @@
##           master   #1345      +/-   ##
=========================================
+ Coverage   96.38%   96.7%   +0.32%     
=========================================
  Files         144     144              
  Lines       17315   17685     +370     
=========================================
+ Hits        16689   17103     +414     
+ Misses        626     582      -44
Impacted Files Coverage Δ
privacyidea/lib/security/pkcs11.py 0% <ø> (ø) ⬆️
privacyidea/lib/security/aeshsm.py 84.61% <ø> (+46.52%) ⬆️
privacyidea/lib/token.py 96.39% <ø> (+1.62%) ⬆️
privacyidea/models.py 99.04% <100%> (+0.14%) ⬆️
privacyidea/api/auth.py 99.11% <100%> (-0.01%) ⬇️
privacyidea/lib/utils.py 96.85% <100%> (+0.01%) ⬆️
privacyidea/lib/crypto.py 95.49% <100%> (-0.05%) ⬇️
privacyidea/lib/tokens/tantoken.py 100% <100%> (ø) ⬆️
privacyidea/lib/security/default.py 96.44% <92.85%> (-1.95%) ⬇️
privacyidea/lib/policydecorators.py 99.22% <0%> (-0.78%) ⬇️
... and 4 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 9a5a451...8b38de8. Read the comment docs.



# constant - later taken from the env?
CONFIG_KEY = 1

This comment has been minimized.

@cornelinux

cornelinux Dec 18, 2018

Member

I know, this is a new one: But it might make sense to define the CONFIG_KEY; TOKEN_KEY here and import them in e.g. lib/security/aeshsm.py ... so that we do not need to redefine the ints in all files.

This comment has been minimized.

@plettich

plettich Dec 19, 2018

Contributor

I was thinking about that, too.
Unfortunately they are mixed up:
in lib/crypto.py:
CONFIG_KEY = 1 TOKEN_KEY = 2 VALUE_KEY = 3

in lib/security/aeshsm.py:
TOKEN_KEY = 0 CONFIG_KEY = 1 VALUE_KEY = 2

and in lib/security/default.py:
TOKEN_KEY = 0 CONFIG_KEY = 1 VALUE_KEY = 2 DEFAULT_KEY = 3

I'll check the code and fix it.

This comment has been minimized.

@plettich

plettich Dec 19, 2018

Contributor

fixed in d709698
There are also the functions encrypt() and decrypt() in crypto.py which pass the slot id, but it is actually unused.

This comment has been minimized.

@cornelinux
:return: hexlified string converted to unicode
:rtype: unicode
"""
return _to_unicode(binascii.hexlify(s))

This comment has been minimized.

@cornelinux

cornelinux Dec 18, 2018

Member

would it make sense and be safe to do

return _to_unicode(binascii.hexlify(_to_bytes(s)))

?

This comment has been minimized.

@plettich

plettich Dec 19, 2018

Contributor

sure, I'll fix it.

This comment has been minimized.

@plettich

plettich Dec 19, 2018

Contributor

fixed in d709698

"""
key = get_app_config_value("PI_PEPPER", "missing")
pw_dig = passlib.hash.pbkdf2_sha512.encrypt(key + password, rounds=rounds,
salt_size=salt_size)
pw_dig = passlib.hash.pbkdf2_sha512.hash(key + password)

This comment has been minimized.

@cornelinux

cornelinux Dec 18, 2018

Member

Where is the salt and the rounds gone? (OK, obviously we might drop passlib anyways later in favour for cryptography)

This comment has been minimized.

@plettich

plettich Dec 19, 2018

Contributor

The documentation recommends keeping the default values. I am not sure what we would gain (in our case even reducing the number of rounds and the salt_size compared to the default).
I would prefer sticking with the default since 'they know best'...

This comment has been minimized.

@cornelinux

cornelinux Dec 19, 2018

Member

You are probably right. I also think the parameters where not used externally. We added them to be able to allow someone who "thinks he knows better", to more easily adapt this.
I am fine with that. For the last four years noone thought to know better ;-)

This comment has been minimized.

@cornelinux
'''
hsm = get_hsm()
ret = hsm.encrypt(data, iv, id)
return ret

This comment has been minimized.

@cornelinux

cornelinux Dec 18, 2018

Member

I used this step before the return since it was easier to debug. What do you tink?

This comment has been minimized.

@plettich

plettich Dec 19, 2018

Contributor

I think it is more 'pythonic' but You are right, debugging is easier :-)

This comment has been minimized.

@plettich

plettich Dec 19, 2018

Contributor

fixed in d709698

@@ -43,12 +43,11 @@
import os

from Crypto.Cipher import AES
from privacyidea.lib.crypto import zerome
from privacyidea.lib.crypto import zerome, _to_bytes, _to_unicode, hexlify_and_unicode

This comment has been minimized.

@cornelinux

cornelinux Dec 18, 2018

Member

if we use the functions _to_bytes and _to_unicode outside of the original module I think we should calls them to_bytes and to_unicode.

This comment has been minimized.

@plettich

plettich Dec 19, 2018

Contributor

I tried to keep these functions to the crypto.py module... I'll check if the import can be avoided.

This comment has been minimized.

@fredreichbier

fredreichbier Dec 19, 2018

Member

Maybe they'll be useful for porting other modules. At least hexlify_and_unicode might be :-)

This comment has been minimized.

@plettich

plettich Dec 20, 2018

Contributor

One Problem is, if we define something in lib/utils.py we can't use it in lib/crypto.py since the crypto module is imported into utils...
I would prefer if the utils module would not import any privacyIDEA specific modules so that we can import it everywhere.
We could move the functions that make use of the crypto module (specifically urandom() and geturandom()) into the crypto module, but i would like to do that in a different PR.

This comment has been minimized.

@fredreichbier

fredreichbier Dec 20, 2018

Member

Yes, a privacyIDEA-independent utils module would be nice! But doing this in another PR is probably a good idea.

This comment has been minimized.

@cornelinux

cornelinux Dec 20, 2018

Member

Right. The utils was ment to run independent of anything privacyidea specific.

@@ -46,12 +46,12 @@
from .log import log_with
from .error import HSMException
import binascii
import six

This comment has been minimized.

@cornelinux

cornelinux Dec 18, 2018

Member

Please add some information to the doc string of this module, that parameter of functions are expected as a certain type and functions in this module usually return a certain type (is it hexlified byte string?)

This comment has been minimized.

@plettich

plettich Dec 19, 2018

Contributor

on it.

@fredreichbier
Copy link
Member

fredreichbier left a comment

Some general additions to Cornelius' comments

:param text: The text to encrypt
:type text: str

This comment has been minimized.

@fredreichbier

fredreichbier Dec 19, 2018

Member

I wasn't really aware of this method :-) But I see it's used to encrypt encryption keys with a password in encrypt_enckey in pi-manage. I think encryption keys are just bytes. What do you think about making password_encrypt take bytes and return bytes, and password_decrypt take bytes and return bytes, in order to make it consistent with the other methods of DefaultSecurityModule?

Also, I think we'll need to change encrypt_enckey in pi-manage: it reads the enckey file using open(encfile, "r") which I think will break in Python 3?

This comment has been minimized.

@plettich

plettich Dec 19, 2018

Contributor

I stumbled upon this as well, we need to distinguish between functions that expect/return strings (hashing, encrypting/decrypting passwords/pins) and functions which handle arbitrary (binary) data. I am on it.

And there are several open() calls which need to be changed to read the data in binary format. I'll fix them along the way.

This comment has been minimized.

@plettich

plettich Dec 20, 2018

Contributor

This is all a little tricky: encrypt_enckey in pi-manage reads in an encKey file (which should be binary) and prints out a string (hexlified iv and hexlified encrypted data).
This string will be read in by _get_secret() in the DefaultSecurityModule which then calls password_decrypt with the given string.
I am not really sure how we should handle things here.
We are going to rewrite these parts anyway when we switch to a different crypto-library (AES, padding, etc...), so maybe we should keep it that way for now and keep this in mind for the next rewrite.

This comment has been minimized.

@cornelinux

cornelinux Dec 20, 2018

Member

I think adding a TODO at those functions should be fine.

This comment has been minimized.

@fredreichbier
elif hashed is False:
else:
if isinstance(pin, six.text_type):
upin = upin.encode('utf8')

This comment has been minimized.

@fredreichbier

fredreichbier Dec 19, 2018

Member

Correct me if I'm wrong, but can't we remove this because encryptPin will call _to_bytes() anyway?

This comment has been minimized.

@plettich

plettich Dec 19, 2018

Contributor

No need for correction :-)

This comment has been minimized.

@plettich

plettich Dec 19, 2018

Contributor

fixed in d709698

This comment has been minimized.

@fredreichbier
:type iv: bytes or str
:param id: contains the key id of the keyset which should be used
:type id: int
:return: encrypted and hexlified data

This comment has been minimized.

@fredreichbier

fredreichbier Dec 19, 2018

Member

Hm, I see why you changed encrypt to return a hexstring, and I think that makes sense. But I think it's unfortunate that encrypt and decrypt are not inverses anymore (because decrypt expects bytes).
Maybe we can keep the old encrypt/decrypt operating on bytes, and add a new encrypt_and_hexlify? But maybe there's a better way to fix it.

This comment has been minimized.

@plettich

plettich Dec 19, 2018

Contributor

Yes, i don't like this either. I'll look into the usage of these functions again, maybe there is a better approach.

This comment has been minimized.

@cornelinux

cornelinux Dec 20, 2018

Member

@fredreichbier : @plettich added a docstring, so that we can handle this later with issue #337 For me this is fine.

This comment has been minimized.

@fredreichbier

fredreichbier Dec 21, 2018

Member

We can fix this in a later PR.

This comment has been minimized.

@fredreichbier
Fixing some review remarks
- moving the slot id constants into the SecurityModule base class. The
  constants in crypto.py where wrong anyway.
- removed unnecessary encoding check
- added some necessary encoding conversions
- improved test coverage


# constant - later taken from the env?
CONFIG_KEY = 1

This comment has been minimized.

@cornelinux
:return: hexlified string converted to unicode
:rtype: unicode
"""
return _to_unicode(binascii.hexlify(s))

This comment has been minimized.

@cornelinux
'''
hsm = get_hsm()
ret = hsm.encrypt(data, iv, id)
return ret

This comment has been minimized.

@cornelinux
@cornelinux

This comment has been minimized.

Copy link
Member

cornelinux commented Dec 20, 2018

I am fine with my comments!
If Friedrich is fine with his points, I think this can be merged.

elif hashed is False:
else:
if isinstance(pin, six.text_type):
upin = upin.encode('utf8')

This comment has been minimized.

@fredreichbier
@fredreichbier

This comment has been minimized.

Copy link
Member

fredreichbier commented Dec 20, 2018

I still have some open comments -- but we could also keep them in mind for a subsequent PR to get this merged?

Fix comments and remarks from review
In some places the code/documentation is inconsistent. Some of this will be
addressed in different issues.
Also removed the import of 'private' functions for now.
@@ -46,12 +46,12 @@
from .log import log_with
from .error import HSMException
import binascii
import six

This comment has been minimized.

@cornelinux
@@ -43,12 +43,11 @@
import os

from Crypto.Cipher import AES
from privacyidea.lib.crypto import zerome
from privacyidea.lib.crypto import zerome, _to_bytes, _to_unicode, hexlify_and_unicode

This comment has been minimized.

@cornelinux

@cornelinux cornelinux merged commit 2afd3ba into master Dec 21, 2018

5 checks passed

ci/circleci Your tests passed on CircleCI!
Details
codecov/patch 97.45% of diff hit (target 96.38%)
Details
codecov/project 96.7% (+0.32%) compared to 9a5a451
Details
continuous-integration/travis-ci/pr The Travis CI build passed
Details
continuous-integration/travis-ci/push The Travis CI build passed
Details

@cornelinux cornelinux deleted the rework_crypt_layer branch Dec 21, 2018

@cornelinux cornelinux referenced this pull request Dec 21, 2018

Closed

Rework crypto layer #1340

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment