Improve rule stacking #4014

magnumripper · 2019-05-30T22:32:04Z

Current rule stacking actually runs the preprocessor as well as the rule engine twice, which can't help performance. I'm thinking we could instead concatenate the two rules before PP and only run things once, although that will involve some less trivial shuffling (e.g. if the second rule has reject flags, they must be moved to just after the first rule's reject flags (if any), and/or dupes of such must be removed).

Example: rule1 ^[A-Z] and rule2 Az"[0-9][0-9]"

Current flow (using the first entry PP produces from each rule):

"word" -> apply ^A -> "Aword" -> apply Az"00" -> "Aword00"

New flow, a single rule fed to PP: ^[A-Z] Az"[0-9][0-9]"

"word" -> apply ^A Az"00" -> "Aword00"
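To make the equivalence concrete, here is a minimal Python sketch of the two flows for this trivial case. It is not John the Ripper code: `expand`, `apply_rule`, `stacked` and `concatenated` are hypothetical helpers, the toy preprocessor only understands `[...]` groups with simple ranges, and the toy rule engine only understands `^X` and `Az"STR"`.

```python
import re
from itertools import product


def expand_class(body):
    """Expand the inside of a [...] group, e.g. 'A-Z' or 'cl', into characters."""
    out = []
    i = 0
    while i < len(body):
        if i + 2 < len(body) and body[i + 1] == "-":
            out.extend(chr(c) for c in range(ord(body[i]), ord(body[i + 2]) + 1))
            i += 3
        else:
            out.append(body[i])
            i += 1
    return out


def expand(rule):
    """Toy preprocessor: one output rule per combination of [...] choices."""
    choices = []
    for part in re.split(r"(\[[^\]]*\])", rule):
        if part.startswith("[") and part.endswith("]"):
            choices.append(expand_class(part[1:-1]))
        else:
            choices.append([part])
    return ["".join(combo) for combo in product(*choices)]


def apply_rule(rule, word):
    """Toy rule engine: supports only ^X (prefix) and Az"STR" (append)."""
    i = 0
    while i < len(rule):
        if rule[i] == " ":
            i += 1
        elif rule[i] == "^":
            word = rule[i + 1] + word
            i += 2
        elif rule.startswith('Az"', i):
            end = rule.index('"', i + 3)
            word = word + rule[i + 3:end]
            i = end + 1
        else:
            raise ValueError(f"unsupported command at {i}: {rule[i:]!r}")
    return word


def stacked(rule1, rule2, word):
    """Current flow: expand and apply each rule separately (two passes)."""
    return [apply_rule(r2, apply_rule(r1, word))
            for r1 in expand(rule1) for r2 in expand(rule2)]


def concatenated(rule1, rule2, word):
    """Proposed flow: concatenate first, then expand and apply once."""
    return [apply_rule(r, word) for r in expand(rule1 + " " + rule2)]


rule1, rule2 = "^[A-Z]", 'Az"[0-9][0-9]"'
print(concatenated(rule1, rule2, "word")[0])
```

In this toy model both flows produce the same candidates in the same order; whether the single pass is actually cheaper in the real engine is exactly what would need measuring.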

That was easy. Now something less trivial:

Rule1 -p -c (?a 2 (?a c 1 [cl] and rule2 -c /?v V Q

Current flow (single mode, 2 words)

"john"+"smith" -> -p -c (?a 2 (?a c 1 c -> "JohnSmith" -> -c /?v V Q -> "JoHNSMiTH"

New flow: concatenate the rules as -p -c (?a 2 (?a c 1 [cl] + M -c /?v V Q (M is always added between them, and in this case also + because we have 1 and/or 2 in the first rule). Dropping the second, duplicate -c then results in -p -c (?a 2 (?a c 1 [cl] + M /?v V Q

"john"+"smith" -> -p -c (?a 2 (?a c 1 c + M /?v V Q -> "JoHNSMiTH"
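The merge step described above could be sketched roughly as follows. This is only an illustration in Python, not the actual implementation: `split_flags` and `concat_rules` are hypothetical names, and the sketch assumes rule commands are separated by whitespace and that reject flags are the leading tokens starting with `-` (real rules do not require whitespace between commands, so a real version would need a proper rule parser).

```python
def split_flags(rule):
    """Split a rule into (leading reject flags, remaining commands).

    Assumption: commands are whitespace-separated and reject flags are
    the leading tokens that start with '-'.
    """
    tokens = rule.split()
    n = 0
    while n < len(tokens) and tokens[n].startswith("-"):
        n += 1
    return tokens[:n], tokens[n:]


def concat_rules(rule1, rule2):
    """Concatenate two rules into one, before preprocessor expansion."""
    flags1, body1 = split_flags(rule1)
    flags2, body2 = split_flags(rule2)

    # Hoist the second rule's reject flags next to the first rule's,
    # dropping duplicates while keeping first-seen order.
    flags = list(dict.fromkeys(flags1 + flags2))

    # M is always inserted between the rules; + is added as well when the
    # first rule used the single-mode word selectors 1 and/or 2.
    glue = ["M"]
    if "1" in body1 or "2" in body1:
        glue = ["+", "M"]

    return " ".join(flags + body1 + glue + body2)


print(concat_rules("-p -c (?a 2 (?a c 1 [cl]", "-c /?v V Q"))
# -p -c (?a 2 (?a c 1 [cl] + M /?v V Q
```

As I understand it, the M is there so that a Q in the second rule compares against the output of the first rule rather than against the original word, which matches what the second pass sees today.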

A drawback with doing/fixing this is that the current --rules-stack runs the first word through all rules before going on to the next word (normal rules do the opposite). That ordering is sometimes wanted.
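The difference in ordering can be shown with two small Python generators. These are purely illustrative: `rule_major` and `word_major` are hypothetical names, and plain string methods stand in for real rules.

```python
def rule_major(words, rules):
    """Normal rules order: each rule is applied to every word in turn."""
    for rule in rules:
        for word in words:
            yield rule(word)


def word_major(words, rules):
    """Order described for --rules-stack: each word goes through every rule."""
    for word in words:
        for rule in rules:
            yield rule(word)


words = ["john", "smith"]
rules = [str.capitalize, str.upper]  # stand-ins for real rules

print(list(rule_major(words, rules)))  # ['John', 'Smith', 'JOHN', 'SMITH']
print(list(word_major(words, rules)))  # ['John', 'JOHN', 'Smith', 'SMITH']
```

Both produce the same set of candidates; only the order differs, which matters when the most likely words should be exhausted across all rules first.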


solardiz · 2019-05-31T18:17:00Z

> Current rule stacking actually runs the preprocessor as well as the rule engine twice, which can't help performance. I'm thinking we could instead concatenate the two rules before PP and only run things once

Wouldn't that one run be on rules twice as long, and thus of roughly the same total processing cost?

Overall, what you suggest here sounds to me like introducing complexity for no gain. But I could be wrong, especially given that I'm not very familiar with the current rule stacking.

magnumripper · 2019-05-31T18:23:25Z

> Wouldn't that one run be on twice longer rules, and thus of roughly the same total processing cost?

Maybe, I'm not sure yet.

> Overall, what you suggest here sounds to me like introducing complexity for no gain. But I could be wrong, especially given that I'm not very familiar with the current rule stacking.

Hopefully you are right. I just want to look at it when I get the time.

solardiz · 2019-05-31T18:32:37Z

BTW, don't we still run the preprocessor + rule engine across all rules an extra time just to pre-check the rules' syntax? I certainly do that in core. If we care about speeding things like this up, we should add a way to (partially) skip this checking - e.g., a john.conf setting limiting this checking to only the first N ruleset lines (pre-PP) and/or rules (post-PP) - e.g., 1 million by default. This would speed up startup, but would allow for postponed failure. Speaking of which, we might then also have a setting to make such failure non-fatal (optionally only when the failure is a postponed one? make the setting a tri-state? this gets tricky).
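A rough Python sketch of what such a bounded pre-check might look like, just to pin down the idea. Nothing here exists in John the Ripper: `precheck`, `run_rules`, `limit` and `postponed_fatal` are hypothetical names, `is_valid` stands in for the real syntax check, and the tri-state variant is not modelled.

```python
def precheck(rules, is_valid, limit):
    """Startup check: validate only the first `limit` rules; always fatal."""
    for n, rule in enumerate(rules[:limit], start=1):
        if not is_valid(rule):
            raise ValueError(f"rule {n} is invalid: {rule!r}")


def run_rules(rules, is_valid, limit=1_000_000, postponed_fatal=True):
    """Yield usable rules; rules past `limit` are only checked when reached."""
    precheck(rules, is_valid, limit)  # startup cost is bounded by `limit`
    for n, rule in enumerate(rules, start=1):
        if n > limit and not is_valid(rule):
            # Postponed failure: this rule was skipped by the startup check.
            if postponed_fatal:
                raise ValueError(f"rule {n} is invalid: {rule!r}")
            continue  # non-fatal mode: silently drop the bad rule
        yield rule


# Hypothetical validity test for the demo: anything starting with '??' is bad.
def demo_is_valid(rule):
    return not rule.startswith("??")


demo_rules = ["l", "u", "??bad", "c"]
print(list(run_rules(demo_rules, demo_is_valid, limit=2, postponed_fatal=False)))
# ['l', 'u', 'c']
```

With `limit=2` the bad third rule is only noticed mid-run; with `limit=4` the same ruleset would be rejected at startup, which is the trade-off being discussed.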

magnumripper added the enhancement label May 30, 2019
