-
Notifications
You must be signed in to change notification settings - Fork 932
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Match tokens not accurate with spaces #216
Comments
Seconded. The password "one two three four five six" tokenizes to ["one", " two ", "three", " four ", "five", " six"], with " two ", " four ", and " six" evaluating to brute force. We could probably fix this by having an additional dictionary of word separators. |
I believe sentences and phrases are a common tactic, and touch typists will find it easier to add spaces than to type the same phrase without. I believe adding spaces between tokens should be a one-bit entropy addition; right now it adds a lot of unnecessary entropy. Perhaps call it 'language punctuation'; then you can abstract it into a language agnostic module that also includes other language appropriate punctuation (like commas and sentence-ending punctuation for English). |
I'm not sure if this is the same issue but "passwordpassword!1" is clearly a bad password and expectedly gets a score of 1. Yet simply separating the words by a space gives "password password!1" and is no better, but gets a score of 4. |
In my Java port, I handled this problem a bit differently. I attempt to extract a separator by looking at occurances of non-alphanumeric characters, and that becomes it's own match type, and the algorithm I implemented to find the best combination of matches takes that into account. |
Not sure whether zxcvbn is still in active development but I'd say this is quite an important bug to fix if so. To give another example: |
@BenKennish agreed that it really needs to be handled. I just tweeted Dropbox to see if I can get a response if they plan on continuing support for the library, since @lowe hasn't been active in quite a while =(. There are a bunch of issues which really deserve a response from the maintainer, as well as a number of pull requests just sitting out there. |
Hi folks -- I think the best workarounds for this issue currently is to either: a) remove spaces before feeding into Long-term I'd like to fix a bit more generally, so that a variety of word separators (space, hyphen, underscore, etc) are recognized. I have a plan for this but have yet to implement. I agree that the library is overdue for a number of changes, perhaps the biggest of which is removing coffeescript as a dependency. |
@lowe would you be open to a PR essentially transpiling the library to JS? Obviously it's not quite as simple as that (idk if the Coffeescript compiler preserves comments, for example), but if it's something you'd be open to, I'm sure someone from the community (maybe even me) could do that. |
using the password "buy by beer" for example, evaluates to three tokens, "bu" "y by b", and "eer". IMO, any whitespace separator should be used as a match.token separator. Otherwise, the password evaluation is confusing. It works sometimes. "mary had a little lamb" evaluates to "mary" " had a " "little" " " "lamb where the fourth token is just a space. But " had a " evaluates to a bruteforce evaluation.
The text was updated successfully, but these errors were encountered: