All 4 character classes not always used #4

Open
zbeekman opened this issue Oct 19, 2017 · 2 comments

Comments

@zbeekman

Generated passwords don’t always include all 4 character classes. It would be nice to have an option that changes the algorithm to guarantee that every class is drawn from. Obviously this must be done deterministically, and it will decrease the apparent entropy of the generated passwords: some classes, in particular the set of symbols, have fewer characters to draw from, so the probability of seeing characters from the smaller classes increases. But if your passwords are sufficiently long, this should not have any practical consequence.

@nicjansma
Owner

That's a smart idea, and I've hit this problem in the past too.

One challenge is that the algorithm relies on Base64, which has only two non-alphanumeric characters: + and / (- and _ in the base64url variation).

While I think most sites that require some sort of symbol would accept either of those characters, a generated password may not contain any of them.
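
To put a rough number on how often this happens, here is a minimal sketch that uses inclusion-exclusion to estimate the chance that a password contains every class. It assumes each character is drawn uniformly and independently from the Base64 alphabet (26 lowercase, 26 uppercase, 10 digits, 2 symbols); the function name and the 16-character length are hypothetical, not taken from the project.

```python
from itertools import combinations


def prob_all_classes(class_sizes, length):
    """Probability that a uniformly random string of `length` characters
    contains at least one character from every class."""
    total = sum(class_sizes)
    prob = 0.0
    # Inclusion-exclusion over the sets of classes that are entirely absent.
    for k in range(len(class_sizes) + 1):
        for excluded in combinations(class_sizes, k):
            allowed = total - sum(excluded)
            prob += (-1) ** k * (allowed / total) ** length
    return prob


# Base64 class sizes: lowercase, uppercase, digits, symbols.
BASE64_CLASSES = [26, 26, 10, 2]

if __name__ == "__main__":
    # Hypothetical 16-character password.
    print(prob_all_classes(BASE64_CLASSES, 16))
```

Under these assumptions the two-symbol class dominates the result, which is why a larger symbol set helps.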

I think we'd want to make two changes to satisfy sites that require specific character classes:

  • Allow Ascii85 or Z85 to be used instead of Base64; both have more string-safe symbols in them
  • If the generated password does not satisfy all requirements (e.g. it doesn't contain symbols), append an iteration number to the end of the string that is hashed, and increment it until the result satisfies all requirements, e.g.:
Salted Password = Trim(Base64(Hash(Master Password + Domain Name + Domain Phrase + Iteration#)))
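
The retry scheme in the second bullet could look roughly like the sketch below. Everything specific here is an assumption for illustration: SHA-256 as the hash, Python's `base64.b85encode` standing in for Ascii85/Z85, a 16-character trim, and the helper names `has_all_classes` and `salted_password`.

```python
import base64
import hashlib
import string

# Assumption: any ASCII punctuation character counts as a "symbol".
SYMBOLS = set(string.punctuation)


def has_all_classes(password):
    """True if the password has a lowercase letter, an uppercase letter,
    a digit and a symbol."""
    return (any(c.islower() for c in password)
            and any(c.isupper() for c in password)
            and any(c.isdigit() for c in password)
            and any(c in SYMBOLS for c in password))


def salted_password(master, domain, phrase, length=16, max_iterations=1000):
    """Hash, encode and trim; retry with an iteration number appended to
    the hashed string until every character class is present."""
    for iteration in range(max_iterations):
        # The first attempt hashes the plain inputs, so passwords that
        # already satisfy the requirements are unchanged.
        suffix = "" if iteration == 0 else str(iteration)
        material = (master + domain + phrase + suffix).encode("utf-8")
        digest = hashlib.sha256(material).digest()
        candidate = base64.b85encode(digest).decode("ascii")[:length]
        if has_all_classes(candidate):
            return candidate, iteration
    raise ValueError("no candidate satisfied all character classes")


if __name__ == "__main__":
    print(salted_password("master password", "example.com", "phrase"))
```

Because the iteration number is derived only from the inputs, the same inputs always yield the same password, and nothing extra has to be stored.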

Thoughts?

@zbeekman
Author

Adding an iteration number or nonce at the end until the draw comes up satisfactory would work and seems like a good approach. I don't know if you support, or want to support, non-ASCII characters; that would complicate things enormously.

I think your proposal is the simplest, most effective answer to this problem, provided you can draw from a larger class of symbols.

I can think of more complicated approaches but, realistically, I doubt they're worth it. (For example, you could create a pool of characters in which duplicates are allowed. To start, you could seed the pool with one of each allowed character in a predetermined order. You could then use the hash to index into the pool and draw characters. Depending on which character you're drawing (n of M total), you could increase the probability of drawing a character from an unsatisfied class by appending additional copies of that class's characters to the pool. The hash could then be used to shuffle the final output so that there isn't an increased probability of a character from one of the smaller classes (0-9 and symbols) at the end of the password. This is completely deterministic, but a PITA to program. The only advantage over your approach is that you could code it in such a way that it is guaranteed to succeed on the first try.)
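
A simplified sketch of the pool idea above, to show that a first-try guarantee is achievable. It is not the full weighting scheme: instead of gradually adding copies of unsatisfied classes, it restricts the pool to the missing classes only once the remaining positions are needed to cover them, then shuffles. SHA-512, the symbol set, and the names `CLASSES` and `pooled_password` are assumptions for illustration.

```python
import hashlib
import string

# Assumed character classes; the symbol set is purely illustrative.
CLASSES = [string.ascii_lowercase, string.ascii_uppercase,
           string.digits, "!#$%&*+-=?@^_"]


def pooled_password(master, domain, phrase, length=16):
    if length < len(CLASSES):
        raise ValueError("length must be at least the number of classes")
    material = (master + domain + phrase).encode("utf-8")
    # Treat the digest as one big integer and peel off indices with divmod.
    # 512 bits is ample for short passwords; much longer ones would need
    # a longer hash stream.
    state = int.from_bytes(hashlib.sha512(material).digest(), "big")

    def draw(n):
        """Return a deterministic index in range(n) taken from the hash."""
        nonlocal state
        state, index = divmod(state, n)
        return index

    chars = []
    for position in range(length):
        missing = [cls for cls in CLASSES
                   if not any(c in cls for c in chars)]
        remaining = length - position
        # Once every remaining slot is needed, draw from missing classes only.
        pool = "".join(missing if len(missing) >= remaining else CLASSES)
        chars.append(pool[draw(len(pool))])

    # Fisher-Yates shuffle driven by the same hash, so forced characters
    # are not concentrated at the end of the password.
    for i in range(len(chars) - 1, 0, -1):
        j = draw(i + 1)
        chars[i], chars[j] = chars[j], chars[i]
    return "".join(chars)


if __name__ == "__main__":
    print(pooled_password("master password", "example.com", "phrase"))
```

The number of missing classes can never exceed the number of remaining positions, so every class is covered without retries.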
