Match tokens not accurate with spaces #216

pete4abw · 2017-12-26T16:45:31Z

Using the password "buy by beer", for example, evaluates to three tokens: "bu", "y by b", and "eer". IMO, any whitespace should be treated as a match.token separator. Otherwise, the password evaluation is confusing. It works sometimes: "mary had a little lamb" evaluates to "mary", " had a ", "little", " ", "lamb", where the fourth token is just a space. But " had a " is evaluated as bruteforce.

TheHans255 · 2018-03-22T22:30:06Z

Seconded. The password "one two three four five six" tokenizes to ["one", " two ", "three", " four ", "five", " six"], with " two ", " four ", and " six" evaluating to brute force. We could probably fix this by having an additional dictionary of word separators.

MyrddinE · 2018-05-29T17:09:26Z

I believe sentences and phrases are a common tactic, and touch typists will find it easier to type a phrase with spaces than without them. I believe adding spaces between tokens should be a one-bit entropy addition; right now it adds a lot of unnecessary entropy.

Perhaps call it 'language punctuation'; then you can abstract it into a language agnostic module that also includes other language appropriate punctuation (like commas and sentence-ending punctuation for English).
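To make the one-bit suggestion concrete, here is a minimal sketch. It is not how zxcvbn scores passwords: `phrase_bits`, `word_ranks`, and `separator_bits` are hypothetical names, each word is assumed to cost log2 of its rank in a frequency-ordered dictionary, and the "one bit" is read as a single flat charge for using spaces at all (charging one bit per space would be another reading).

```python
import math
from typing import Dict


def phrase_bits(password: str, word_ranks: Dict[str, int],
                separator_bits: float = 1.0) -> float:
    """Estimate entropy for a whitespace-separated phrase of dictionary words.

    word_ranks maps each known word to its 1-based frequency rank
    (hypothetical data). Raises KeyError for a word not in the dictionary.
    """
    words = password.split()  # split on any run of whitespace
    bits = sum(math.log2(word_ranks[word]) for word in words)
    if len(words) > 1:
        # Flat charge for the decision to separate words (assumption).
        bits += separator_bits
    return bits


# Toy dictionary with made-up ranks, for illustration only.
ranks = {"mary": 4, "had": 8}
print(phrase_bits("mary had", ranks))
```

Under this reading, the spaces in "mary had" cost one bit on top of the two dictionary words, instead of turning part of the phrase into a bruteforce match.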

chunty · 2018-06-12T12:14:36Z

I'm not sure if this is the same issue, but "passwordpassword!1" is clearly a bad password and, as expected, gets a score of 1.

Yet simply separating the words by a space gives "password password!1" and is no better, but gets a score of 4.

Tostino · 2018-06-15T16:16:45Z

In my Java port, I handled this problem a bit differently.

https://github.com/GoSimpleLLC/nbvcxz/blob/master/src/main/java/me/gosimple/nbvcxz/matching/SeparatorMatcher.java

I attempt to extract a separator by looking at occurrences of non-alphanumeric characters, and that becomes its own match type, and the algorithm I implemented to find the best combination of matches takes that into account.
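A rough sketch of that idea follows. It is not the nbvcxz SeparatorMatcher code: `SeparatorMatch`, `find_separator`, and `separator_matches` are hypothetical names, and picking the single most frequent non-alphanumeric character is an assumption about how a separator might be chosen.

```python
from collections import Counter
from typing import List, NamedTuple, Optional


class SeparatorMatch(NamedTuple):
    start: int  # index of the separator character (inclusive)
    end: int    # same as start, since the match is one character long
    token: str


def find_separator(password: str) -> Optional[str]:
    """Return the most frequent non-alphanumeric character, if any."""
    counts = Counter(ch for ch in password if not ch.isalnum())
    if not counts:
        return None
    return counts.most_common(1)[0][0]


def separator_matches(password: str) -> List[SeparatorMatch]:
    """Emit one match per occurrence of the detected separator."""
    separator = find_separator(password)
    if separator is None:
        return []
    return [SeparatorMatch(i, i, ch)
            for i, ch in enumerate(password) if ch == separator]


print(separator_matches("apple banana cherry"))
```

Matches like these could then be offered to the match-combination search alongside dictionary matches, so a space between two words no longer has to be covered by a bruteforce match.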

BenKennish · 2019-02-05T14:32:36Z

Not sure whether zxcvbn is still in active development but I'd say this is quite an important bug to fix if so.

To give another example:
"applebananacherry" - correctly regarded as 3 dictionary words.
"apple banana cherry" - "apple " is regarded as bruteforce, "banana" as a dictionary word, " " as bruteforce, and "cherry" as a dictionary word.

Tostino · 2019-02-11T17:49:05Z

@BenKennish agreed that it really needs to be handled. I just tweeted Dropbox to see if I can get a response if they plan on continuing support for the library, since @lowe hasn't been active in quite a while =(. There are a bunch of issues which really deserve a response from the maintainer, as well as a number of pull requests just sitting out there.

lowe · 2019-02-11T18:19:44Z

Hi folks -- I think the best workarounds for this issue currently are to either:

a) remove spaces before feeding into zxcvbn().
b) call zxcvbn() twice and take a minimum.
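The two workarounds above can be combined in a small wrapper, sketched here in Python. The `estimate` parameter is a hypothetical stand-in for whatever zxcvbn binding is in use, assumed to return a comparable number (a score or a guess count); `min_estimate_ignoring_spaces` is not part of any zxcvbn API.

```python
from typing import Callable


def min_estimate_ignoring_spaces(password: str,
                                 estimate: Callable[[str], float]) -> float:
    """Score the password as typed and with whitespace removed; keep the minimum."""
    stripped = "".join(password.split())  # drop all whitespace
    if stripped == password:
        # No whitespace present, so a single call is enough.
        return estimate(password)
    return min(estimate(password), estimate(stripped))


# Toy estimator for illustration only: pretends that any password
# containing a space is strong, mimicking the behaviour reported above.
def toy_estimate(password: str) -> float:
    return 4 if " " in password else 1


print(min_estimate_ignoring_spaces("password password!1", toy_estimate))
```

Taking the minimum means the spaced form can never score higher than the same words typed without spaces.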

Long-term I'd like to fix this a bit more generally, so that a variety of word separators (space, hyphen, underscore, etc.) are recognized. I have a plan for this but have yet to implement it.

I agree that the library is overdue for a number of changes, perhaps the biggest of which is removing CoffeeScript as a dependency.

mkopinsky · 2019-02-11T21:34:03Z

@lowe would you be open to a PR essentially transpiling the library to JS? Obviously it's not quite as simple as that (idk if the CoffeeScript compiler preserves comments, for example), but if it's something you'd be open to, I'm sure someone from the community (maybe even me) could do that.

chunty mentioned this issue Jun 12, 2018: user_inputs argument #227 (Open)

This was referenced Apr 18, 2019:
Incorrect strength detection for password with spaces dwolfhub/zxcvbn-python#45 (Closed)
Demo thinks a well-known password is strong #237 (Open)

ulope mentioned this issue May 5, 2020: Score incongruous for repeated words #276 (Open)

MrWook mentioned this issue Jan 5, 2021: Separators screw with scoring zxcvbn-ts/zxcvbn#12 (Closed)
