New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix dictionary usage for python3 compatibility #1312

Merged
merged 6 commits into from Nov 22, 2018

Conversation

Projects
None yet
3 participants
@plettich
Contributor

plettich commented Nov 16, 2018

The usage of dictionaries changed between python2 and python3.
See https://portingguide.readthedocs.io/en/latest/dicts.html#dictionaries

  •  General Comment
  • Regarding the inefficiencies of python3 idioms in python2 i have done same tests:
    Python2:

In [1]: D = dict(zip(range(100), range(100)))
In [4]: %timeit for k,v in D.iteritems(): pass
100000 loops, best of 3: 2.06 µs per loop
In [5]: %timeit for k,v in D.items(): pass
100000 loops, best of 3: 3.17 µs per loop
In [6]: %timeit for k,v in six.iteritems(D): pass
100000 loops, best of 3: 2.17 µs per loop
Python3:
In [255]: %timeit for k,v in D.items(): pass
1.93 µs ± 14.3 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
In [256]: %timeit for k,v in six.iteritems(D): pass
2.2 µs ± 43.7 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)

Merging #1312 into master will increase coverage by 0.09%.
The diff coverage is 96.92%.
Impacted file tree graph

@@            Coverage Diff             @@
##           master    #1312      +/-   ##
==========================================
+ Coverage   95.74%   95.83%   +0.09%
==========================================
Files         142      142
Lines       17254    17256       +2
==========================================
+ Hits        16519    16538      +19
+ Misses        735      718      -17
Impacted Files Coverage Δ
privacyidea/lib/eventhandler/base.py 95.04% <ø> (ø) ⬆️
privacyidea/api/token.py 98.6% <ø> (ø) ⬆️
privacyidea/lib/resolvers/SQLIdResolver.py 97.9% <ø> (ø) ⬆️
privacyidea/webui/login.py 89.47% <0%> (+10.52%) ⬆️
privacyidea/lib/resolver.py 100% <100%> (+3.37%) ⬆️
privacyidea/lib/machineresolver.py 100% <100%> (ø) ⬆️
privacyidea/api/event.py 100% <100%> (ø) ⬆️
privacyidea/lib/smsprovider/HttpSMSProvider.py 98.26% <100%> (ø) ⬆️
privacyidea/lib/apps.py 100% <100%> (ø) ⬆️
privacyidea/lib/subscriptions.py 87.7% <100%> (ø) ⬆️
... and 19 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update f21955e...deacecf. Read the comment docs.

keys = list(static_pol) + list(dynamic_pol)
  • Yes, You are right.
  • Done
  •  File: privacyidea/lib/policy.py:L761-770
  • We can drop this line, since the dict tmp_rights contains unique keys. Using list(set()) was used to make the entris in the list unique.
  • correct
  • What about the block before? Could we get duplicate entries from the
    current policies?
  • Which block do you mean?
    The method ui_get_rights simply returns a list of "rights" which is only used by the single page application to decide whether to display a certain UI component or not. It is not responsible for displaying data.
  • So it doesn't really matter if some 'rights' occur multiple times in the list?
    I meant the block in line 746:757 where we create the list of 'rights' from the current policies.
  • It shouldTM not.
    The rights are read in the auth API and returned by the /auth endpoint.
    Then the UI uses the checkRight-function to determine, if a user e.g. is allowed to set the Token PIN, so that we can display the "set PIN" button or not.
    However, it would also probably not hurt to avoid duplicate entries.
  • changed the code so that we have only unique elements in the rights list at the end of the function.
  •  File: privacyidea/lib/user.py:L597-609
  • You must not drop the for-loop. There could be other search parameters like email which you are now missing and which are not contained in the searchDict.
  • You are right, i am on it.
  • Fixed the loop, checks for username and user (in this order) are done after the loop.
  •  File: tests/test_api_lib_policy.py:L1802-1809
  • I understand that the keys of the response are now sorted for signature, so the signature changes but will stay the same?
  • Yes, as long as there are the same keys/values in the dictionary. Before we
    dumped the dictionary as a JSON-string without any order (which was fine
    since the order did not change, but it wasn't guaranteed).
    Now the the keys of the dictionary get sorted before the dump and thus the
    dumped JSON-String should be the same regardless of the sorting of the
    dictionary.
  •  File: tests/test_lib_policy.py:L497-504
  • is there a reason why this can not be compares as a list anymore?
    WOuld this change the program behaviour?
  • Since ordering is part of the list, we can not (safely) compare lists
    generated from dictionaries (since their order can not be guaranteed).
    We should check If we compare lists like that somewhere in the code.
    Unfortunately lists ans sets arent't the same since lists can contain
    multiple occurrences of the same value.
  • OK, I understand. The set does not have an order, so two sets are the same, if they have the same elements. No matter in which order, since there is no order. This would still be true:
self.assertEqual(set(rights), {"disable", "enable"})
  • 👍
  • Correct.
    BTW: {"disable", "enable"} denotes a set-literal, not a dictionary.
  •  File: privacyidea/lib/token.py:L2027-2034
  • Might be a bit nitpicking :-) But if the caller passes options, this change will modify the object in-place, which was not the case before. I think it would be safer to have something like
if options:
options = options.copy()
else:
options = {}
options.update(...)
  • oups I was not aware of the copy. I think I changed this also in my PR. But you are right.
  • No nitpicking, will fix.
  • Done
  •  File: privacyidea/lib/user.py:L597-619
  • Sorry for nitpicking and probably my original code was even worse.
    But should be do:
for key in param:
lval = param[key]
if key in ["realm", "resolver", "user", "username"]:
continue
searchDict[key] = lval
log.debug(...)

I was not aware of this earlier.

   

plettich added some commits Nov 13, 2018

Fixed dictionary handling
There are several differences with accessing dictionaries between Python 2
and 3.
- `dict.iteritems()` changed to `dict.items()`
- no need to get the keys for a check: `if key in dict` works.
- keys and values are not returned as lists anymore, they have to be
  converted: `list(dict.keys())`

Some of these changes might have some impact on the performance.

More information can be found here:
<https://python-future.org/compatible_idioms.html#dictionaries>
<https://portingguide.readthedocs.io/en/latest/dicts.html>
Fix usage of dictionaries
In Python the order of keys in dictionaries is not guaranteed.
This will be enforced in Python 3.3 - 3.7 so we need to fix the code where
we rely on the ordering of the keys.
This is the case where we calculate the signature of a response or the id
of the LDAP Resolver.
In `user.py` this would have randomly changed the output since it can not
be guaranteed that `username` is read before `user` in the `param`
dictionary.
More Information can be found here:
https://portingguide.readthedocs.io/en/latest/dicts.html#changed-key-order

These issues were found by running the tests with Python 2 and setting the
environment variable `PYTHONHASHSEED=random`

@plettich plettich requested a review from privacyidea/core Nov 16, 2018

@plettich

This comment has been minimized.

Contributor

plettich commented Nov 16, 2018

Regarding the inefficiencies of python3 idioms in python2 i have done same tests:
Python2:

In [1]: D = dict(zip(range(100), range(100)))
In [4]: %timeit for k,v in D.iteritems(): pass
100000 loops, best of 3: 2.06 µs per loop
In [5]: %timeit for k,v in D.items(): pass
100000 loops, best of 3: 3.17 µs per loop
In [6]: %timeit for k,v in six.iteritems(D): pass
100000 loops, best of 3: 2.17 µs per loop

Python3:

In [255]: %timeit for k,v in D.items(): pass
1.93 µs ± 14.3 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
In [256]: %timeit for k,v in six.iteritems(D): pass
2.2 µs ± 43.7 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)

@codecov

This comment has been minimized.

codecov bot commented Nov 16, 2018

Codecov Report

Merging #1312 into master will increase coverage by 0.08%.
The diff coverage is 98.43%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #1312      +/-   ##
==========================================
+ Coverage   95.79%   95.88%   +0.08%     
==========================================
  Files         142      142              
  Lines       17283    17284       +1     
==========================================
+ Hits        16557    16572      +15     
+ Misses        726      712      -14
Impacted Files Coverage Δ
privacyidea/lib/eventhandler/base.py 95.04% <ø> (ø) ⬆️
privacyidea/api/token.py 98.6% <ø> (ø) ⬆️
privacyidea/lib/resolvers/SQLIdResolver.py 97.9% <ø> (ø) ⬆️
privacyidea/webui/login.py 89.47% <0%> (+10.52%) ⬆️
privacyidea/lib/resolver.py 100% <100%> (+3.37%) ⬆️
privacyidea/lib/utils.py 96.77% <100%> (-0.01%) ⬇️
privacyidea/lib/machineresolver.py 100% <100%> (ø) ⬆️
privacyidea/api/event.py 100% <100%> (ø) ⬆️
privacyidea/lib/smsprovider/HttpSMSProvider.py 98.26% <100%> (ø) ⬆️
privacyidea/lib/apps.py 100% <100%> (ø) ⬆️
... and 18 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update dccc9f4...4ef97cc. Read the comment docs.

Show resolved Hide resolved privacyidea/api/policy.py Outdated
rights = static_rights + enroll_rights
tmp_rights = get_static_policy_definitions(scope)
tmp_rights.update(get_dynamic_policy_definitions(scope))
rights = list(tmp_rights)
# reduce the list
rights = list(set(rights))

This comment has been minimized.

@cornelinux

cornelinux Nov 17, 2018

Member

We can drop this line, since the dict tmp_rights contains unique keys. Using list(set()) was used to make the entris in the list unique.

This comment has been minimized.

@plettich

plettich Nov 19, 2018

Contributor

correct

This comment has been minimized.

@plettich

plettich Nov 19, 2018

Contributor

What about the block before? Could we get duplicate entries from the
current policies?

This comment has been minimized.

@cornelinux

cornelinux Nov 19, 2018

Member

Which block do you mean?
The method ui_get_rights simply returns a list of "rights" which is only used by the single page application to decide whether to display a certain UI component or not. It is not responsible for displaying data.

This comment has been minimized.

@plettich

plettich Nov 19, 2018

Contributor

So it doesn't really matter if some 'rights' occur multiple times in the list?
I meant the block in line 746:757 where we create the list of 'rights' from the current policies.

This comment has been minimized.

@cornelinux

cornelinux Nov 19, 2018

Member

It shouldTM not.
The rights are read in the auth API and returned by the /auth endpoint.

Then the UI uses the checkRight-function to determine, if a user e.g. is allowed to set the Token PIN, so that we can display the "set PIN" button or not.

However, it would also probably not hurt to avoid duplicate entries.

This comment has been minimized.

@plettich

plettich Nov 19, 2018

Contributor

changed the code so that we have only unique elements in the rights list at the end of the function.

@@ -597,20 +597,12 @@ def get_user_list(param=None, user=None):
searchDict = {"username": "*"}
param = param or {}
# we have to recreate a new searchdict without the realm key
# as delete does not work
for key in param:

This comment has been minimized.

@cornelinux

cornelinux Nov 17, 2018

Member

You must not drop the for-loop. There could be other search parameters like email which you are now missing and which are not contained in the searchDict.

This comment has been minimized.

@plettich

plettich Nov 19, 2018

Contributor

You are right, i am on it.

This comment has been minimized.

@plettich

plettich Nov 19, 2018

Contributor

Fixed the loop, checks for username and user (in this order) are done after the loop.

@@ -1802,7 +1802,7 @@ def test_07_sign_response(self):
new_response = sign_response(req, resp)
jresult = json.loads(new_response.data)
self.assertEqual(jresult.get("nonce"), "12345678")
self.assertEqual(jresult.get("signature"), "12286563956186851222763777429695584644324740652565581988587414544297432783523785532070055857619627542347973153465187816396752261730111152465977744264826200787508894389937562519240189748382275312824518790194963440380306202638365223197802490803597478514617706281242938294159192617108210623265084867605251009544178231020920999223970535034040822922068576537239452191722786808074622369000594984335557065543495017539992338002123756259365978375186660886179484370560915820484600337343618193501989005733515794341174572697366915377218917268995598532737466873248560512717129014931617597408250418461112752400750722207570913588830")
self.assertEqual(jresult.get("signature"), "11355158914966210201410734667484298031497086510917116878993822963793177737963323849914979806826759273431791474575057946263651613906587629736481370983420295626001055840803201448376203681672140726404056349423937599480275853513810616624349811159346536182220806878464577429106150903913526744093300868582898892977164229848617413618851794501457802670374543399415905458325601994002527427083792164898293507308423780001137468154518279116138010266341425663850327379848131113626641510715557748879427991785684858504631545256553961505159377600982900016536629720752767147086708626971940835730555782551222922985302674756190839458609")

This comment has been minimized.

@cornelinux

cornelinux Nov 17, 2018

Member

I understand that the keys of the response are now sorted for signature, so the signature changes but will stay the same?

This comment has been minimized.

@plettich

plettich Nov 19, 2018

Contributor

Yes, as long as there are the same keys/values in the dictionary. Before we
dumped the dictionary as a JSON-string without any order (which was fine
since the order did not change, but it wasn't guaranteed).
Now the the keys of the dictionary get sorted before the dump and thus the
dumped JSON-String should be the same regardless of the sorting of the
dictionary.

@@ -497,7 +497,7 @@ def test_17_ui_get_rights(self):
P = PolicyClass()
rights = P.ui_get_rights(SCOPE.USER, "realm2", "user")
# there was still another policy...
self.assertEqual(rights, ["enable", "disable"])
self.assertEqual(set(rights), {"enable", "disable"})

This comment has been minimized.

@cornelinux

cornelinux Nov 17, 2018

Member

is there a reason why this can not be compares as a list anymore?
WOuld this change the program behaviour?

This comment has been minimized.

@plettich

plettich Nov 19, 2018

Contributor

Since ordering is part of the list, we can not (safely) compare lists
generated from dictionaries (since their order can not be guaranteed).
We should check If we compare lists like that somewhere in the code.
Unfortunately lists ans sets arent't the same since lists can contain
multiple occurrences of the same value.

This comment has been minimized.

@cornelinux

cornelinux Nov 19, 2018

Member

OK, I understand. The set does not have an order, so two sets are the same, if they have the same elements. No matter in which order, since there is no order. This would still be true:

self.assertEqual(set(rights), {"disable", "enable"})

This comment has been minimized.

@plettich

plettich Nov 19, 2018

Contributor

Correct.
BTW: {"disable", "enable"} denotes a set-literal, not a dictionary.

@fredreichbier

Only have a minor comment to add to Cornelius' review.

@@ -2027,7 +2027,7 @@ def check_token_list(tokenobject_list, passw, user=None, options=None):
# add the user to the options, so that every token, that get passed the
# options can see the user
options = options or {}
options = dict(options.items() + {'user': user}.items())
options.update({'user': user})

This comment has been minimized.

@fredreichbier

fredreichbier Nov 19, 2018

Member

Might be a bit nitpicking :-) But if the caller passes options, this change will modify the object in-place, which was not the case before. I think it would be safer to have something like

if options:
    options = options.copy()
else:
    options = {}
options.update(...)

This comment has been minimized.

@cornelinux

cornelinux Nov 19, 2018

Member

oups I was not aware of the copy. I think I changed this also in my PR. But you are right.

This comment has been minimized.

@plettich

plettich Nov 19, 2018

Contributor

No nitpicking, will fix.

This comment has been minimized.

@plettich

plettich Nov 19, 2018

Contributor

Done

This comment has been minimized.

@fredreichbier
@@ -1802,7 +1802,7 @@ def test_07_sign_response(self):
new_response = sign_response(req, resp)
jresult = json.loads(new_response.data)
self.assertEqual(jresult.get("nonce"), "12345678")
self.assertEqual(jresult.get("signature"), "12286563956186851222763777429695584644324740652565581988587414544297432783523785532070055857619627542347973153465187816396752261730111152465977744264826200787508894389937562519240189748382275312824518790194963440380306202638365223197802490803597478514617706281242938294159192617108210623265084867605251009544178231020920999223970535034040822922068576537239452191722786808074622369000594984335557065543495017539992338002123756259365978375186660886179484370560915820484600337343618193501989005733515794341174572697366915377218917268995598532737466873248560512717129014931617597408250418461112752400750722207570913588830")
self.assertEqual(jresult.get("signature"), "11355158914966210201410734667484298031497086510917116878993822963793177737963323849914979806826759273431791474575057946263651613906587629736481370983420295626001055840803201448376203681672140726404056349423937599480275853513810616624349811159346536182220806878464577429106150903913526744093300868582898892977164229848617413618851794501457802670374543399415905458325601994002527427083792164898293507308423780001137468154518279116138010266341425663850327379848131113626641510715557748879427991785684858504631545256553961505159377600982900016536629720752767147086708626971940835730555782551222922985302674756190839458609")

This comment has been minimized.

@cornelinux
@@ -497,7 +497,7 @@ def test_17_ui_get_rights(self):
P = PolicyClass()
rights = P.ui_get_rights(SCOPE.USER, "realm2", "user")
# there was still another policy...
self.assertEqual(rights, ["enable", "disable"])
self.assertEqual(set(rights), {"enable", "disable"})

This comment has been minimized.

@cornelinux
Fixed issues from review
- removed unnecessary call to `.keys()` `api/policy.py`
- fixed issue of returning only unique actions in `lib/policy.py`
- fixed possible change of `options` dictionary in `lib/token.py`
- fixed creation of search dictionary in `lib/user.py` (Maybe we should add
  a test for that)
rights = static_rights + enroll_rights
tmp_rights = get_static_policy_definitions(scope)
tmp_rights.update(get_dynamic_policy_definitions(scope))
rights = list(tmp_rights)
# reduce the list
rights = list(set(rights))

This comment has been minimized.

@cornelinux
keys = static_pol.keys() + dynamic_pol.keys()
pol = {k: dict(static_pol.get(k, {}).items()
+ dynamic_pol.get(k, {}).items()) for k in keys}
keys = list(static_pol.keys()) + list(dynamic_pol.keys())

This comment has been minimized.

@cornelinux
@@ -497,7 +497,7 @@ def test_17_ui_get_rights(self):
P = PolicyClass()
rights = P.ui_get_rights(SCOPE.USER, "realm2", "user")
# there was still another policy...
self.assertEqual(rights, ["enable", "disable"])
self.assertEqual(set(rights), {"enable", "disable"})

This comment has been minimized.

@cornelinux
continue
if key == "user" or key == "username":
# If "user" or "username" in param we skip to handle the user afterwards
continue

This comment has been minimized.

@cornelinux

cornelinux Nov 19, 2018

Member

Sorry for nitpicking and probably my original code was even worse.
But should be do:

for key in param:
    lval = param[key]
    if key in ["realm", "resolver", "user", "username"]:
        continue
    searchDict[key] = lval
    log.debug(...)

I was not aware of this earlier.

Fixed creation of search dictionary
Remark from review.
Added the `PYTHONHASHSEED=random` variable to the tests run by travis to
check for implicit usage of dictionary ordering (see
<https://portingguide.readthedocs.io/en/latest/dicts.html#changed-key-order>).
Also improved the test coverage.
@@ -597,20 +597,12 @@ def get_user_list(param=None, user=None):
searchDict = {"username": "*"}
param = param or {}
# we have to recreate a new searchdict without the realm key
# as delete does not work
for key in param:

This comment has been minimized.

@cornelinux
searchDict[key] = lval
log.debug("Parameter key:{0!r}={1!r}".format(key, lval))
# update searchdict depending on existence of 'user' or 'username' in param

This comment has been minimized.

@cornelinux
@cornelinux

This comment has been minimized.

Member

cornelinux commented Nov 21, 2018

:shipit:

plettich added some commits Nov 21, 2018

Fix failing test `test_lib_utils.py`
This is a nice example how the assumed order of a dictionary leads to
(weird) errors:
In this case when looping through the dictionary of character classes, the
order was different and thus the returned comment string is different.
@cornelinux

This comment has been minimized.

Member

cornelinux commented Nov 21, 2018

@fredreichbier This looks good to me. My comments have been covered. If you have nother else we could merge this!

@fredreichbier fredreichbier merged commit 033aaaf into master Nov 22, 2018

5 checks passed

ci/circleci Your tests passed on CircleCI!
Details
codecov/patch 98.43% of diff hit (target 95.79%)
Details
codecov/project 95.88% (+0.08%) compared to dccc9f4
Details
continuous-integration/travis-ci/pr The Travis CI build passed
Details
continuous-integration/travis-ci/push The Travis CI build passed
Details

@fredreichbier fredreichbier deleted the python3_migrate_dicts branch Nov 22, 2018

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment