Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add separator matcher #115

Merged
merged 10 commits into from
May 2, 2023
Merged

Conversation

domosapien
Copy link
Contributor

Fixes #12

Adds separator matching. Limits separators to common ones ([ ,._-]). Could be expanded to any non-ascii / digit.

To get this to work, I had to override bruteforce and repeat matchers. They would both eat separators and cause the separator to be integrated into other matching. The example used in the issue (buy by beer) still uses the repeat pattern even when there is a separator matcher unless this is changed. Other examples like correct horse battery staple would use bruteforce instead of separators. The final result is to remove bruteforce matches if they have a separator in them. repeats now only match non-separator chars, so they are excluded by default from generating matches with them.

The scoring still feels off to me. I removed additional guesses from being added to repeat separators, as I figured they were enough of a pattern to be easily guessable. This is probably not the right response, and the score still feels high to me. In the issue, the same buy by beer has a nbvcxz result posted showing a much lower entropy. I seem to agree with that one more than the one I generate.

Regardless, I figured this may be a starting point for someone else to potentially point out problems or provide solutions.

@domosapien
Copy link
Contributor Author

The ideas thrown around #12 (comment) may be the better alternatives to strictly matching a new type of pattern and having that impact the entropy too highly.

@domosapien
Copy link
Contributor Author

Example output

separator-matcher.mov

@MrWook MrWook added the enhancement New feature or request label Jun 10, 2022
@domosapien
Copy link
Contributor Author

Well shoot, I definitely forgot to work on the unit tests after I made more changes. I keep running into No element indexed by 31 errors, and don't know how to diagnose them. I definitely messed up the repeat tests. I will try to fix that if I have time today.

@MrWook
Copy link
Collaborator

MrWook commented Jun 15, 2022

Hey, thanks for this pull request 👍
I think the unit test errors are because you changed the repeat matcher a little bit which results in a different behaviour than before with a different scoring even for cases without a separator.

@MrWook
Copy link
Collaborator

MrWook commented Apr 27, 2023

@domosapien could you please resolve the conflicts, so that i can review it once more and maybe merge it this time?

@domosapien
Copy link
Contributor Author

The merge wasn't bad, but there were some failing unit tests for missing translations, which I had just put in as placeholders anyway. I've removed those translations / placeholders to allow for successful unit tests. If you would like to add them, please feel free. I'm not sure what the suggestion would be for separator usage anyway 🙃

@domosapien
Copy link
Contributor Author

I didn't like the scoring that was coming out of the demo. Still seems way too high. I don't know if this should be merged yet.

@domosapien
Copy link
Contributor Author

domosapien commented Apr 28, 2023

I actually don't know if the separators are the issue, or the general issue of determining the lowest entropy path and score.

Looking at the #12 issue, I think the other library does this better somehow. Also, the #143 issue still has the same output (or only every so slightly lower) with the separators matching, which still doesn't seem good.

@MrWook
Copy link
Collaborator

MrWook commented Apr 28, 2023

Could you fix the linting, too?

As you can see in your video above this separator matcher that you implement works just fine. It is splitting buy by beer correctly and the example from #143 too.

The scoring from the java port is a different matter, as the author described. He implemented another matching algorithm as the original doesn't seem quite right. This would be another separated issue for this library but your separater matcher is working as indented :)

@MrWook MrWook merged commit cc9e0b4 into zxcvbn-ts:master May 2, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Separators screw with scoring
2 participants