Token janitor now forbids --chunksize for modification actions #1323

Open
wants to merge 1 commit into base: branch-2.23 from

@fredreichbier
Member

fredreichbier commented Nov 27, 2018

For several combinations of filter criteria and actions, the --chunksize parameter resulted in matching tokens for which no action was performed.

This PR changes behavior to fix this: in 2.23, the default behavior of privacyidea-token-janitor was to query the database in chunks of 1000; the new default is to not use chunked search at all, though it can be enabled by passing --chunksize X. In addition, the token janitor now forbids --chunksize in conjunction with actions that modify the database. This fixes the issues described in #1322, but it also forbids some combinations of filter criteria and actions that would, in fact, work correctly, which may be cumbersome for users with a large number of tokens.

So ... I'm not sure if that's the best fix :-) Happy about any comments!
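For illustration, a minimal sketch of the kind of guard described above: reject --chunksize as soon as the requested action would modify the database. This is not the actual janitor code; the argparse wiring and the set of "modifying" action names are assumptions made for the example.

# Illustrative sketch only, not the real privacyidea-token-janitor code.
# The set of modifying action names below is an assumption for this example.
import argparse
import sys

MODIFYING_ACTIONS = {"delete", "disable", "unassign", "mark"}

parser = argparse.ArgumentParser(prog="privacyidea-token-janitor")
parser.add_argument("--action", default="list",
                    choices=sorted(MODIFYING_ACTIONS | {"list"}))
parser.add_argument("--chunksize", type=int, default=None,
                    help="query the database in chunks of this size "
                         "(chunked search is off by default)")
args = parser.parse_args()

if args.chunksize is not None and args.action in MODIFYING_ACTIONS:
    # Offset-based chunking can skip matching tokens once earlier chunks
    # have been modified, so this combination is refused outright.
    sys.exit("Error: --chunksize cannot be combined with actions that "
             "modify the database")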

Token janitor now forbids --chunksize for modification actions
For several combinations of filter criteria and actions, the --chunksize
parameter resulted in matching tokens for which no action was performed.

Fixes #1322

@fredreichbier fredreichbier requested a review from privacyidea/core Nov 27, 2018

@fredreichbier fredreichbier changed the base branch from master to branch-2.23 Nov 27, 2018

@fredreichbier fredreichbier reopened this Nov 27, 2018

@codecov

codecov bot commented Nov 27, 2018

Codecov Report

Merging #1323 into branch-2.23 will decrease coverage by 0.25%.
The diff coverage is n/a.

Impacted file tree graph

@@              Coverage Diff               @@
##           branch-2.23   #1323      +/-   ##
==============================================
- Coverage        96.06%   95.8%   -0.26%     
==============================================
  Files              142     142              
  Lines            18165   17203     -962     
==============================================
- Hits             17450   16482     -968     
- Misses             715     721       +6
Impacted Files Coverage Δ
privacyidea/lib/tokens/u2f.py 94.16% <0%> (-1.67%) ⬇️
privacyidea/lib/token.py 95.18% <0%> (-1.15%) ⬇️
privacyidea/api/lib/postpolicy.py 96.61% <0%> (-1.01%) ⬇️
privacyidea/api/token.py 98.59% <0%> (-0.72%) ⬇️
privacyidea/api/lib/prepolicy.py 96.59% <0%> (-0.59%) ⬇️
privacyidea/lib/policy.py 99.39% <0%> (-0.46%) ⬇️
privacyidea/lib/tokens/hotptoken.py 100% <0%> (ø) ⬆️
privacyidea/lib/tokens/vascotoken.py 96.55% <0%> (+1.39%) ⬆️

Continue to review full report at Codecov.

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 44c02e0...1382e49. Read the comment docs.

@plettich

Contributor

plettich commented Nov 29, 2018

Maybe pagination in this variant is not useful here, since we don't want to display any results with prev|next|last links.
Here is an interesting approach which deals with this kind of problem.
But we would have to build the requests by hand in this case, I guess.

@fredreichbier

Member

fredreichbier commented Dec 11, 2018

Ah, clever: Our current Pagination-based approach issues exactly the inefficient LIMIT ... queries mentioned in the post above:

SELECT ... FROM token WHERE ... LIMIT 0, 1000
SELECT ... FROM token WHERE ... LIMIT 1000, 1000
...

which may also cause problems if the set of tokens matching the WHERE clause changes between the first and the second query (e.g. because some are deleted).
We could instead do

SELECT ... FROM token WHERE ... ORDER BY id ASC LIMIT 0, 1001
SELECT ... FROM token WHERE id >= 1219 ... ORDER BY id ASC LIMIT 1001
...

which should work even if we delete tokens between query 1 and 2.

This is a good idea! However, as this would require quite some additional code, I'm not sure if we should implement this in branch-2.23. Maybe we should instead plan that for the 3.0 code cleanup. What do you think?
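As a rough sketch, the keyset approach described above could look like the following in Python/SQLAlchemy. Token and session stand in for the real privacyIDEA model and database session; the function name is hypothetical, not existing code.

def iter_tokens_keyset(session, Token, chunksize=1000):
    # Walk the token table in id order, remembering the last id seen
    # instead of using OFFSET, so deletions between chunks cannot cause
    # later chunks to skip matching rows.
    last_id = 0
    while True:
        chunk = (session.query(Token)
                 .filter(Token.id > last_id)
                 .order_by(Token.id.asc())
                 .limit(chunksize)
                 .all())
        if not chunk:
            break
        for token in chunk:
            yield token
        last_id = chunk[-1].id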

@plettich

Contributor

plettich commented Dec 11, 2018

Pagination is nice for displaying but not for chunk-wise processing.
Maybe we could introduce a get_tokens_chunk() function which also returns the id of the next starting point.
Or we cannibalize the get_tokens() function, which already has some kind of pagination feature (is it used anywhere else besides the token janitor?).
If this is not a pressing issue, we should plan this for 3.0/3.1.
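A hypothetical get_tokens_chunk() along those lines might return one chunk plus the id to start the next chunk from (None when exhausted). Again, Token and session are stand-ins, not the real get_tokens() machinery.

def get_tokens_chunk(session, Token, start_id=0, chunksize=1000):
    # Fetch one extra row so we know whether another chunk follows.
    rows = (session.query(Token)
            .filter(Token.id > start_id)
            .order_by(Token.id.asc())
            .limit(chunksize + 1)
            .all())
    more = len(rows) > chunksize
    chunk = rows[:chunksize]
    next_start = chunk[-1].id if more else None
    return chunk, next_start

The janitor could then loop, calling get_tokens_chunk() with the returned starting id until it is None, applying the action to each chunk as it goes.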
