New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix several issues with Python 3 #1318

Merged
merged 2 commits into from Nov 26, 2018

Conversation

Projects
None yet
2 participants
@plettich
Contributor

plettich commented Nov 23, 2018

There are multiple issues which need attention In order to run the code with Python 2 and Python 3.

  • The type basestring does not exist in Python 3 so we use
    six.string_types
  • Iterator don't contain a .next() function anymore, instead the builtin
    function next() is used
  • raw_input() does not exist in Python 3 instead we use input() from
    six.moves which does not evaluate the input
  • Changes how a MetaClass is defined
  • long type does not exist in Python 3. Instead a little hack is used to
    redefine long as int in case of Python 3
  • Encoding strings with non-text encodings does not work anymore in
    Python 3. Instead the functionality from the binascii module is used.
  • The behaviour of the division operator changed in Python 3. To achieve
    the same behaviour in Python 2 we import division from __future__
  • The range() builtin function returns an iterator type in Python 3 thus
    we have to convert it to a list in some cases.
  • Improving conversion of a list of integers to bytes in aeshsm.py
  • The string class does not provide the method replace() anymore in
    Python 3, instead use the instance method.
  • If a class defines the __eq__() function it also has to define the
    __hash__() function in Python 3. Also the __nonzero__ function is
    called __bool__ in Python 3
  • Some more string encodings, checks and exception handling

  •  File: privacyidea/lib/auditmodules/sqlaudit.py:L325-352
  • Well, maybe we should treat all the encoding stuff in the following PRs. I'll revert it for now.
  •  General Comment
  • # Codecov Report

Merging #1318 into master will increase coverage by 0.03%.
The diff coverage is 84.5%.
Impacted file tree graph

@@            Coverage Diff            @@
##           master   #1318      +/-   ##
=========================================
+ Coverage   95.86%   95.9%   +0.03%
=========================================
Files         142     142
Lines       17285   17451     +166
=========================================
+ Hits        16571   16736     +165
- Misses        714     715       +1
Impacted Files Coverage Δ
privacyidea/lib/machines/base.py 94.11% <100%> (+0.11%) ⬆️
privacyidea/lib/tokens/mOTP.py 92.98% <100%> (ø) ⬆️
privacyidea/lib/tokens/totptoken.py 98.85% <100%> (ø) ⬆️
privacyidea/lib/tokens/vasco.py 96.72% <100%> (ø) ⬆️
privacyidea/lib/smsprovider/SmtpSMSProvider.py 100% <100%> (ø) ⬆️
privacyidea/lib/security/aeshsm.py 38.09% <100%> (+1.5%) ⬆️
privacyidea/lib/config.py 94.83% <100%> (ø) ⬆️
privacyidea/lib/auditmodules/sqlaudit.py 96.32% <100%> (+2.23%) ⬆️
privacyidea/lib/policy.py 99.81% <100%> (+0.4%) ⬆️
privacyidea/lib/resolvers/LDAPIdResolver.py 93.41% <100%> (+0.01%) ⬆️
... and 11 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 30d8338...5f12bca. Read the comment docs.

  •  File: privacyidea/lib/resolvers/LDAPIdResolver.py:L1067-1074
  • Hm, I'm not sure this does the right thing on Python 3: I think sha_hash.digest() and salt are both bytes, so having "{0}{1}".format(...) will create something like 'b"=\\xa5AU\\x99\\x18\\xa8\\x08\\xc2@+\\xbaP\\x12\\xf6\\xc6\\x0b\'f\\x1c"b\'\\x01\\x02\\x03\\x04\''. I guess we could just replace it with digest_b64 = binascii.b2a_base64(sha_hash.digest() + salt)?
  • You are right, missed it during tests.
  •  File: privacyidea/lib/security/aeshsm.py:L238-245
  • same as above
  •  File: privacyidea/lib/utils.py:L250-269
  • I guess this won't work on Python 3 because binascii.hexlify(s) is bytes, so all c will be int, but hex2ModDict has str keys :-/
    ... I guess we would need to change hexHexChars to a bytestring? But then we also need to change modhex_decode
  • Nice fix! There may be a remaining issue with the decoding function: I think modhex_decode will fail on Python 3 if its arguments is bytes. I think it makes sense to expect unicode strings only, but from reading the code, I'm not sure what type the arguments will actually have :-/
    ... maybe we'll just keep it like this for now, and test the yubikey token later on Python 3?
  • True, it fails with bytes.
    But by looking through the usages of modhex_decode (and modhex_encode) it looks like modhex_decode is only called with a string argument (not bytes explicitly).
    We could always add a check but i would prefer to migrate the unicode stuff and then fix the code where Python 3 is actually failing.
  •  File: privacyidea/lib/crypto.py:L709-722
  • Yes, You are right, text_type should be use here (and below). I'll fix it.
  •  File: privacyidea/lib/security/aeshsm.py:L186-199
  • The problem was that chr() returns unicode in Python 3 and thus the int_list_to_bytestring() function broke.
    Maybe we can use builtins from six, I'll check.
  •  File: privacyidea/lib/token.py:L2042-2050
  • Ok, actually i didn't check that :-) just the message from pylint...
  •  File: privacyidea/lib/security/aeshsm.py:L22-29
  • Unfortuantely six does not provide the builtins module from Python 3 but if we change the int_list_to_bytestring() stuff above we don't need it anyway.

  

Fix several issues with Python 3
- The type `basestring` does not exist in Python 3 so we use
  `six.string_types`
- Iterator don't contain a `.next()` function anymore, instead the builtin
  function `next()` is used
- `raw_input()` does not exist in Python 3 instead we use `input()` from
  `six.moves` which does not evaluate the input
- Changes how a MetaClass is defined
- `long` type does not exist in Python 3. Instead a little hack is used to
  redefine `long` as `int` in case of Python 3
- Encoding strings with non-text encodings does not work anymore in
  Python 3. Instead the functionality from the `binascii` module is used.
- The behaviour of the division operator changed in Python 3. To achieve
  the same behaviour in Python 2 we import `division` from `__future__`
- The `range()` builtin function returns an iterator type in Python 3 thus
  we have to convert it to a list in some cases.
- Improving conversion of a list of integers to bytes in `aeshsm.py`
- The `string` class does not provide the method `replace()` anymore in
  Python 3, instead use the instance method.
- If a class defines the `__eq__()` function it also has to define the
  `__hash__()` function in Python 3. Also the `__nonzero__` function is
  called `__bool__` in Python 3
- Some more string encodings, checks and exception handling

@plettich plettich requested a review from privacyidea/core Nov 23, 2018

@fredreichbier

some comments about Python 2/3 compatibility

le.privacyidea_server,
le.client,
le.loglevel,
le.clearance_level)

This comment has been minimized.

@fredreichbier

fredreichbier Nov 23, 2018

Member

So after this change _log_to_string returns unicode in Python 2, whereas it always returned bytes before. I think this is sensible, but it may break things?

This comment has been minimized.

@plettich

plettich Nov 23, 2018

Contributor

Well, maybe we should treat all the encoding stuff in the following PRs. I'll revert it for now.

:return: The signature of the string
:rtype: long
"""
if isinstance(s, string_types):

This comment has been minimized.

@fredreichbier

fredreichbier Nov 23, 2018

Member

I think this is fine on Python 3, but I'm not sure this does the right thing on Python 2. Assume s is of type bytes. Then, the if branch is entered, which calls s.encode('utf8'), which fails if s contains bytes outside the ASCII range.

Maybe string_types should be text_type?

Anyway, I'm not sure about this and I don't know if we ever call sign with non-ASCII bytestrings, so this may be totally theoretical.

This comment has been minimized.

@plettich

plettich Nov 23, 2018

Contributor

Yes, You are right, text_type should be use here (and below). I'll fix it.

This comment has been minimized.

@fredreichbier
"""
if isinstance(s, string_types):

This comment has been minimized.

@fredreichbier

fredreichbier Nov 23, 2018

Member

same as above

This comment has been minimized.

@fredreichbier
@@ -1066,8 +1067,7 @@ def _create_ssha(password):
sha_hash.update(salt)
# Create a base64 encoded string
digest_b64 = '{0}{1}'.format(sha_hash.digest(),
salt).encode('base64').strip()
digest_b64 = binascii.b2a_base64('{0}{1}'.format(sha_hash.digest(), salt)).strip()

This comment has been minimized.

@fredreichbier

fredreichbier Nov 23, 2018

Member

Hm, I'm not sure this does the right thing on Python 3: I think sha_hash.digest() and salt are both bytes, so having "{0}{1}".format(...) will create something like 'b"=\\xa5AU\\x99\\x18\\xa8\\x08\\xc2@+\\xbaP\\x12\\xf6\\xc6\\x0b\'f\\x1c"b\'\\x01\\x02\\x03\\x04\''. I guess we could just replace it with digest_b64 = binascii.b2a_base64(sha_hash.digest() + salt)?

This comment has been minimized.

@plettich

plettich Nov 23, 2018

Contributor

You are right, missed it during tests.

This comment has been minimized.

@fredreichbier
@@ -189,9 +186,13 @@ def random(self, length):
raise HSMException("Failed to generate random number after multiple retries.")
# convert the array of the random integers to a string
return int_list_to_bytestring(r_integers)
return bytes(r_integers)

This comment has been minimized.

@fredreichbier

fredreichbier Nov 23, 2018

Member

I think we might need a case distinction to keep it working on Python 2 here?
https://github.com/LudovicRousseau/PyKCS11/blob/951a9e941cab4f49537e29caf7c9250674e3f4ff/PyKCS11/__init__.py#L1527

I just noticed this isn't the Python bytes, but the builtins bytes, so this may be fine?

This comment has been minimized.

@plettich

plettich Nov 23, 2018

Contributor

The problem was that chr() returns unicode in Python 3 and thus the int_list_to_bytestring() function broke.
Maybe we can use builtins from six, I'll check.

This comment has been minimized.

@fredreichbier
@@ -211,9 +212,13 @@ def encrypt(self, data, iv, key_id=TOKEN_KEY):
if retries > self.max_retries:
raise HSMException("Failed to encrypt after multiple retries.")
return int_list_to_bytestring(r)
return bytes(r)

This comment has been minimized.

@fredreichbier

fredreichbier Nov 23, 2018

Member

same as above

This comment has been minimized.

@fredreichbier
@@ -233,7 +238,7 @@ def decrypt(self, data, iv, key_id=TOKEN_KEY):
if retries > self.max_retries:
raise HSMException("Failed to decrypt after multiple retries.")
return int_list_to_bytestring(r)
return bytes(r)

This comment has been minimized.

@fredreichbier

fredreichbier Nov 23, 2018

Member

same as above

This comment has been minimized.

@fredreichbier
@@ -2041,7 +2042,8 @@ def check_token_list(tokenobject_list, passw, user=None, options=None):
# Token is locked
pin_match = False
otp_count = -1
repl = {'message': tae.message}
_num, message = tae.args
repl = {'message': message}

This comment has been minimized.

@fredreichbier

fredreichbier Nov 23, 2018

Member

Hmm, Google told me e.message is deprecated for Exception, but wouldn't we be fine in case of TokenAdminError because we add the message attribute ourselves?

This comment has been minimized.

@plettich

plettich Nov 23, 2018

Contributor

Ok, actually i didn't check that :-) just the message from pylint...

This comment has been minimized.

@fredreichbier
@@ -250,21 +250,19 @@ def generate_password(size=6, characters=string.ascii_lowercase +
def modhex_encode(s):
return ''.join(
[hex2ModDict[c] for c in s.encode('hex')]
[hex2ModDict[c] for c in binascii.hexlify(s)]

This comment has been minimized.

@fredreichbier

fredreichbier Nov 23, 2018

Member

I guess this won't work on Python 3 because binascii.hexlify(s) is bytes, so all c will be int, but hex2ModDict has str keys :-/
... I guess we would need to change hexHexChars to a bytestring? But then we also need to change modhex_decode

This comment has been minimized.

@fredreichbier

fredreichbier Nov 26, 2018

Member

Nice fix! There may be a remaining issue with the decoding function: I think modhex_decode will fail on Python 3 if its arguments is bytes. I think it makes sense to expect unicode strings only, but from reading the code, I'm not sure what type the arguments will actually have :-/
... maybe we'll just keep it like this for now, and test the yubikey token later on Python 3?

This comment has been minimized.

@plettich

plettich Nov 26, 2018

Contributor

True, it fails with bytes.
But by looking through the usages of modhex_decode (and modhex_encode) it looks like modhex_decode is only called with a string argument (not bytes explicitly).
We could always add a check but i would prefer to migrate the unicode stuff and then fix the code where Python 3 is actually failing.

This comment has been minimized.

@fredreichbier
@@ -22,6 +22,7 @@
import binascii
import logging
from builtins import bytes

This comment has been minimized.

@fredreichbier

fredreichbier Nov 23, 2018

Member

Hm, I think not having another compatibility module in addition to six would be nice -- what do you think?

This comment has been minimized.

@plettich

plettich Nov 23, 2018

Contributor

Unfortuantely six does not provide the builtins module from Python 3 but if we change the int_list_to_bytestring() stuff above we don't need it anyway.

This comment has been minimized.

@fredreichbier
Fixes remarks from review
- In some cases, the `six.string_types` type is not suitable since it
  also contains bytestrings in Python 2. In this case, the `six.text_type`
  type is better (which contains only unicode strings)
- Formatting a bytestring does not work as expected in Python 3 since it
  includes the `b` prefix in the string
- The `bytes` type from Python 3 does not work as expected. The bytestring
  generation is reverted.
- Missed the added message attribute in `privacyIDEAError`
@codecov

This comment has been minimized.

codecov bot commented Nov 26, 2018

Codecov Report

Merging #1318 into master will increase coverage by 0.03%.
The diff coverage is 84.5%.

Impacted file tree graph

@@            Coverage Diff            @@
##           master   #1318      +/-   ##
=========================================
+ Coverage   95.86%   95.9%   +0.03%     
=========================================
  Files         142     142              
  Lines       17285   17451     +166     
=========================================
+ Hits        16571   16736     +165     
- Misses        714     715       +1
Impacted Files Coverage Δ
privacyidea/lib/machines/base.py 94.11% <100%> (+0.11%) ⬆️
privacyidea/lib/tokens/mOTP.py 92.98% <100%> (ø) ⬆️
privacyidea/lib/tokens/totptoken.py 98.85% <100%> (ø) ⬆️
privacyidea/lib/tokens/vasco.py 96.72% <100%> (ø) ⬆️
privacyidea/lib/smsprovider/SmtpSMSProvider.py 100% <100%> (ø) ⬆️
privacyidea/lib/security/aeshsm.py 38.09% <100%> (+1.5%) ⬆️
privacyidea/lib/config.py 94.83% <100%> (ø) ⬆️
privacyidea/lib/auditmodules/sqlaudit.py 96.32% <100%> (+2.23%) ⬆️
privacyidea/lib/policy.py 99.81% <100%> (+0.4%) ⬆️
privacyidea/lib/resolvers/LDAPIdResolver.py 93.41% <100%> (+0.01%) ⬆️
... and 11 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 30d8338...5f12bca. Read the comment docs.

@fredreichbier fredreichbier merged commit 4da61fa into master Nov 26, 2018

4 of 5 checks passed

codecov/patch 84.5% of diff hit (target 95.86%)
Details
ci/circleci Your tests passed on CircleCI!
Details
codecov/project 95.9% (+0.03%) compared to 30d8338
Details
continuous-integration/travis-ci/pr The Travis CI build passed
Details
continuous-integration/travis-ci/push The Travis CI build passed
Details

@fredreichbier fredreichbier deleted the python3_migrate_stuff branch Nov 26, 2018

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment