Each entry in the match sequence needs to add some inherent entropy #48

Closed
pde opened this issue Aug 2, 2014 · 4 comments

Comments

@pde

pde commented Aug 2, 2014

zxcvbn decomposes each password into a match sequence, and then for each match says, "aha, I can find this part in an English dictionary (7 bits)", "this next piece is a name (4 bits)", "this is brute force (9 bits)".

There is an inherent entropy in switching models from one match to the next. It's probably not much (2-6 bits per entry in the match sequence, I'm guessing), but at the moment zxcvbn underestimates passwords that jump between several of these models.
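
To make the proposal concrete, here is a minimal sketch that adds a flat structural cost per match to the summed per-match entropy. The function name and the 3-bit figure are hypothetical, picked from inside the 2-6 bit guess above; this is not how zxcvbn computes anything.

```python
def total_entropy_bits(match_bits, structure_bits_per_match=0.0):
    """Sum per-match entropy and add a flat structural cost for each match.

    structure_bits_per_match is a hypothetical knob, not a zxcvbn parameter.
    """
    return sum(match_bits) + structure_bits_per_match * len(match_bits)


# The three matches quoted above: dictionary word, name, brute force.
matches = [7, 4, 9]

baseline = total_entropy_bits(matches)       # no structural term
adjusted = total_entropy_bits(matches, 3.0)  # assumed 3 bits per match

print(baseline, adjusted)
```

With these numbers the estimate for the three-match example rises from 20 to 29 bits.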

@pyramids

The original author referred to this entropy as structural entropy, and made a documented decision to ignore it ("It’s difficult to formulate a sound model for structural entropy; statistically, I don’t happen to know what structures people choose most, so I’d rather do the safe thing and underestimate", https://tech.dropbox.com/2012/04/zxcvbn-realistic-password-strength-estimation/).

@pde
Author

pde commented Aug 14, 2014

A very simple but decent model would be to observe the frequency of all
structures across a large password dataset. If Dropbox doesn't want to
do this, there are a few blog posts by people with very large password
databases who might help.
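
A rough sketch of that frequency model, assuming each password in the dataset has already been decomposed into a tuple of match types. The corpus below is a made-up toy, and the type names are placeholders rather than zxcvbn identifiers.

```python
import math
from collections import Counter


def structure_bits(structures):
    """Map each observed structure to -log2 of its relative frequency."""
    counts = Counter(structures)
    total = sum(counts.values())
    return {s: -math.log2(c / total) for s, c in counts.items()}


# Hypothetical toy corpus: one tuple of match types per password.
corpus = [
    ("dictionary", "digits"),
    ("dictionary", "digits"),
    ("dictionary",),
    ("name", "dictionary", "bruteforce"),
]

bits = structure_bits(corpus)
for structure, cost in bits.items():
    print(structure, cost)
```

A structure that never appears in the dataset gets no entry at all, which is one reason a real dataset would need to be large.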

A slightly better model would be to make the structure probabilities
conditional on the preceding structure in a given password.
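
One way to read "conditional on the preceding structure" is a first-order Markov chain over match types. The sketch below assumes that reading; the start/end markers, function names, and toy corpus are all hypothetical, and unseen transitions are simply scored as infinite rather than smoothed.

```python
import math
from collections import Counter, defaultdict

START, END = "<start>", "<end>"  # hypothetical boundary markers


def train_transitions(corpus):
    """Count how often each match type follows the previous one."""
    counts = defaultdict(Counter)
    for structure in corpus:
        path = (START, *structure, END)
        for prev, cur in zip(path, path[1:]):
            counts[prev][cur] += 1
    return counts


def markov_structure_bits(structure, counts):
    """Sum -log2 P(cur | prev) along the structure; inf if a step was never seen."""
    bits = 0.0
    path = (START, *structure, END)
    for prev, cur in zip(path, path[1:]):
        seen = counts[prev][cur]
        if seen == 0:
            return math.inf
        bits += -math.log2(seen / sum(counts[prev].values()))
    return bits


# Hypothetical toy corpus: one tuple of match types per password.
corpus = [
    ("dictionary", "digits"),
    ("dictionary", "digits"),
    ("dictionary",),
    ("name", "dictionary", "bruteforce"),
]
counts = train_transitions(corpus)
print(markov_structure_bits(("dictionary", "digits"), counts))
print(markov_structure_bits(("digits", "name"), counts))
```

Unlike a table of whole structures, this can score combinations that never occurred verbatim, as long as every individual transition was observed.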

Or you could go overboard and use PPM (prediction by partial matching).

Sounds like a fun project for an undergraduate thesis or an intern
somewhere :)

On Tue, Aug 12, 2014 at 12:25:22PM -0700, Björn Stein wrote:

The original author referred to this entropy as structural entropy, and made a documented decision to ignore it ("It’s difficult to formulate a sound model for structural entropy; statistically, I don’t happen to know what structures people choose most, so I’d rather do the safe thing and underestimate", https://tech.dropbox.com/2012/04/zxcvbn-realistic-password-strength-estimation/).



Peter Eckersley
Technology Projects Director, Electronic Frontier Foundation

@lowe
Collaborator

lowe commented Sep 24, 2015

Agreed. @pde, thanks for reporting, I know it's been a while :) Extra entropy for each entry in the match sequence is coming soon. I have a simpler scheme in mind than what you propose, and will update this thread with more soon.

@lowe
Collaborator

lowe commented Oct 24, 2015

After experimenting with different models over the last two weeks, a reasonable length penalty is now implemented in 4.0.1. Try it out, and check the docs in scoring.coffee to see how it works. Feedback appreciated!
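
This thread does not spell out the penalty, so the following is only a sketch of one plausible form: multiply the per-match guess counts, scale by the factorial of the sequence length, and add a term that grows with each extra match. The formula, the function name, and the constant are assumptions here; scoring.coffee is the authoritative description.

```python
import math

# Assumed constant for illustration; check scoring.coffee for the real value.
MIN_GUESSES_BEFORE_GROWING_SEQUENCE = 10_000


def sequence_guesses(match_guesses, d=MIN_GUESSES_BEFORE_GROWING_SEQUENCE):
    """Guess estimate for a match sequence with an assumed length penalty.

    Computes l! * product(guesses) + d ** (l - 1), where l is the number of
    matches, so longer sequences cost more than the plain product suggests.
    """
    l = len(match_guesses)
    return math.factorial(l) * math.prod(match_guesses) + d ** (l - 1)


# The 7-, 4- and 9-bit matches from the first comment, expressed as guesses.
guesses = sequence_guesses([2**7, 2**4, 2**9])
print(guesses, math.log2(guesses))
```

Under these assumptions the three-match example comes out several bits above the plain 20-bit sum, which is in the range originally guessed in this issue.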

lowe closed this as completed Oct 24, 2015