Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Match tokens not accurate with spaces #216

Open
pete4abw opened this issue Dec 26, 2017 · 8 comments
Open

Match tokens not accurate with spaces #216

pete4abw opened this issue Dec 26, 2017 · 8 comments

Comments

@pete4abw
Copy link

using the password "buy by beer" for example, evaluates to three tokens, "bu" "y by b", and "eer". IMO, any whitespace separator should be used as a match.token separator. Otherwise, the password evaluation is confusing. It works sometimes. "mary had a little lamb" evaluates to "mary" " had a " "little" " " "lamb where the fourth token is just a space. But " had a " evaluates to a bruteforce evaluation.
image

@TheHans255
Copy link

Seconded. The password "one two three four five six" tokenizes to ["one", " two ", "three", " four ", "five", " six"], with " two ", " four ", and " six" evaluating to brute force. We could probably fix this by having an additional dictionary of word separators.

@MyrddinE
Copy link

I believe sentences and phrases are a common tactic, and touch typists will find it easier to add spaces than to type the same phrase without. I believe adding spaces between tokens should be a one-bit entropy addition; right now it adds a lot of unnecessary entropy.

Perhaps call it 'language punctuation'; then you can abstract it into a language agnostic module that also includes other language appropriate punctuation (like commas and sentence-ending punctuation for English).

@chunty
Copy link

chunty commented Jun 12, 2018

I'm not sure if this is the same issue but "passwordpassword!1" is clearly a bad password and expectedly gets a score of 1.

Yet simply separating the words by a space gives "password password!1" and is no better, but gets a score of 4.

@Tostino
Copy link
Contributor

Tostino commented Jun 15, 2018

In my Java port, I handled this problem a bit differently.

https://github.com/GoSimpleLLC/nbvcxz/blob/master/src/main/java/me/gosimple/nbvcxz/matching/SeparatorMatcher.java

I attempt to extract a separator by looking at occurances of non-alphanumeric characters, and that becomes it's own match type, and the algorithm I implemented to find the best combination of matches takes that into account.

@BenKennish
Copy link

Not sure whether zxcvbn is still in active development but I'd say this is quite an important bug to fix if so.

To give another example:
"applebananacherry" - correctly regarded as 3 dictionary words.
"apple banana cherry" - "apple " is regarded as bruteforce, "banana" as a dictionary word, " " as bruteforce, and "cherry" as a dictionary word.

@Tostino
Copy link
Contributor

Tostino commented Feb 11, 2019

@BenKennish agreed that it really needs to be handled. I just tweeted Dropbox to see if I can get a response if they plan on continuing support for the library, since @lowe hasn't been active in quite a while =(. There are a bunch of issues which really deserve a response from the maintainer, as well as a number of pull requests just sitting out there.

@lowe
Copy link
Collaborator

lowe commented Feb 11, 2019

Hi folks -- I think the best workarounds for this issue currently is to either:

a) remove spaces before feeding into zxcvbn().
b) call zxcvbn() twice and take a minimum.

Long-term I'd like to fix a bit more generally, so that a variety of word separators (space, hyphen, underscore, etc) are recognized. I have a plan for this but have yet to implement.

I agree that the library is overdue for a number of changes, perhaps the biggest of which is removing coffeescript as a dependency.

@mkopinsky
Copy link

@lowe would you be open to a PR essentially transpiling the library to JS? Obviously it's not quite as simple as that (idk if the Coffeescript compiler preserves comments, for example), but if it's something you'd be open to, I'm sure someone from the community (maybe even me) could do that.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

8 participants