-
Notifications
You must be signed in to change notification settings - Fork 68
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add Levenshtein Distance for dictionaries #43
Comments
I'd take a look at what I did for the Java "port" I maintain: https://github.com/GoSimpleLLC/nbvcxz LD calculations were extremely useful, but my god they slow things down. In nbvcxz I have it as a configuration option if we use an LD pass for our dictionary matching algorithm or not. Here is the meat of it: https://github.com/GoSimpleLLC/nbvcxz/blob/master/src/main/java/me/gosimple/nbvcxz/matching/DictionaryMatcher.java |
That's what i wanted to do anyway if i implement it :D Actually i thought exactly the same thing that this will slow down everything extremely. So i thought about making it optional and it would be a good idea to add a debounce zxcvbn function. Normally this library is used directly on an input field and the distance would make the input really laggy, i guess. So a debounce function which will trigger the real zxcvbn function only after a certain time after the user typed would improve this. |
So I optimized the LD pretty well for the common case in my implementation. For the vast majority of passwords, we don't even bother getting to that code path. Check all the cases in the DictionaryMatcher where I short circuit the LD code path because the password isn't a good candidate for it to matter. I only bother doing the LD calculation on the whole password rather than each individual part of the password, cutting down the amount of work required immensely. But your mention of how this is usually tied directly to a field makes that less of a difference. For my Java library since it's server side, I generally just get updates from the input box it's tied to every so often, and then send my estimate back to the client to update the "password strength meter" the calculation is tied to. As a user is typing, I don't necessarily get an update every keystroke to recalculate things with my web framework, so it doesn't cause issues with how I use it. I like the idea of having different code paths, one with very tight timing that will give some reasonably accurate result as the user is typing, and then a slower code path that is able to get things "right" once they have stopped mashing the keyboard. |
I was checking some SO topics on this matter and I found this. It might be worth exploring. |
I finally got the time to do a little bit and added the levenshtein distance. I don't like to create a dependency but this seems to be the best option for me. The library is super fast and i already saw some changes in the scoring, some comparison cases on the documentation page dropped by one point for the scoring. |
It would be nice if the dictionary matching has the ability to use Levenshtein Distance (LD) calculations to match passwords which are non-exact matches to a dictionary entry.
For example that something like this would match:
{someDictionary: ['Herbert', 'Dorothea']}
with the "password"Hebert
would match the first entry of the dictionary.The downside is that this would decrease the performance of the library.
The text was updated successfully, but these errors were encountered: