Add separator matcher #115
The ideas thrown around in #12 (comment) may be better alternatives to strictly matching a new type of pattern and having that impact the entropy too heavily. |
Example output separator-matcher.mov |
Well shoot, I definitely forgot to work on the unit tests after I made more changes. I keep running into |
Hey, thanks for this pull request 👍 |
35ba365 to 2ec0768
@domosapien could you please resolve the conflicts so that I can review it once more and maybe merge it this time? |
* Fix using regex vs string as a base for the pattern
* Fix another instance of matching that was not using the correct regex
2ec0768 to a3963d7
The merge wasn't bad, but there were some failing unit tests for missing translations, which I had just put in as placeholders anyway. I've removed those translations / placeholders to allow for successful unit tests. If you would like to add them, please feel free. I'm not sure what the suggestion would be for separator usage anyway 🙃 |
I didn't like the scoring that was coming out of the demo. Still seems way too high. I don't know if this should be merged yet. |
I actually don't know if the separators are the issue, or the general issue of determining the lowest entropy path and score. Looking at the #12 issue, I think the other library does this better somehow. Also, the #143 issue still has the same output (or only ever so slightly lower) with the separators matching, which still doesn't seem good. |
Could you fix the linting, too? As you can see in your video above, the separator matcher that you implemented works just fine. It is splitting The scoring from the Java port is a different matter, as the author described. He implemented another matching algorithm, as the original doesn't seem quite right. This would be a separate issue for this library, but your separator matcher is working as intended :) |
* Replace char in password test to not trigger separator match
Fixes #12

Adds `separator` matching. Limits separators to common ones (`[ ,._-]`). Could be expanded to any non-ascii / digit.

To get this to work, I had to override the `bruteforce` and `repeat` matchers. They would both eat separators and cause the separator to be integrated into other matching. The example used in the issue (`buy by beer`) still uses the `repeat` pattern even when there is a separator matcher unless this is changed. Other examples like `correct horse battery staple` would use `bruteforce` instead of separators. The final result is to remove `bruteforce` matches if they have a separator in them. `repeats` now only match non-separator chars, so they are excluded by default from generating matches with them.

The scoring still feels off to me. I removed additional guesses from being added to repeat separators, as I figured they were enough of a pattern to be easily guessable. This is probably not the right response, and the score still feels high to me. In the issue, the same `buy by beer` has a `nbvcxz` result posted showing a much lower entropy. I seem to agree with that one more than the one I generate.

Regardless, I figured this may be a starting point for someone else to potentially point out problems or provide solutions.
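
For anyone skimming, here is a rough sketch of the two ideas described above: emitting a match for each separator character, and dropping `bruteforce` matches that contain a separator. It is not the code from this PR. The `Match` shape and the function names `matchSeparators` and `dropBruteforceWithSeparators` are made up for illustration, and emitting one match per separator character is an assumption about how the matcher could work.

```typescript
// Hypothetical match shape for illustration; not the library's actual types.
interface Match {
  pattern: "separator" | "bruteforce";
  token: string;
  i: number; // start index (inclusive)
  j: number; // end index (inclusive)
}

// The separator set named in the description: space, comma, dot, underscore, hyphen.
// Deliberately non-global so that .test() carries no lastIndex state between calls.
const SEPARATOR = /[ ,._-]/;

// Emit one "separator" match per separator character in the password.
function matchSeparators(password: string): Match[] {
  const matches: Match[] = [];
  for (let i = 0; i < password.length; i++) {
    const ch = password.charAt(i);
    if (SEPARATOR.test(ch)) {
      matches.push({ pattern: "separator", token: ch, i, j: i });
    }
  }
  return matches;
}

// Drop bruteforce matches whose token contains a separator, so that the
// separator matches are not swallowed by a longer bruteforce span.
function dropBruteforceWithSeparators(matches: Match[]): Match[] {
  return matches.filter(
    (m) => m.pattern !== "bruteforce" || !SEPARATOR.test(m.token)
  );
}
```

With `buy by beer`, `matchSeparators` reports the two spaces at indices 3 and 6, and a `bruteforce` match over `buy by` would be filtered out while one over `beer` would be kept.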