A very common password style is to take the first letter of each word in a sentence or phrase, possibly with some substitutions. This leads to a fairly random-looking password that is easy to remember but hard to brute force. The letters are not uniformly distributed, however, because they follow the frequencies of letters as the first letter of words. There are far more words starting with s, c, p than with x, z, y, q or numbers. Thus, instead of treating any lowercase letter as cardinality 26, treat each letter individually based on its rank in the list of 95 printable characters.
You could get this rank by using something like this:
$ cat /usr/share/dict/american-english | cut -c 1 | uniq -c | sort -n
or $ cat /usr/share/dict/american-english | cut -c 1 | tr A-Z a-z | sort | uniq -c | sort -n
Alternatively you could get the rank based on the character frequencies in the password lists, which would help with the frequencies for numbers and special characters.
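To make the idea concrete, here is a minimal Python sketch of how per-letter costs could be derived from first-letter counts. It is only an illustration of the proposal, not how the project estimates entropy: the function names `first_letter_bits` and `acronym_bits`, the tiny word list, and the fallback of log2(95) for characters that never start a word are all assumptions made for the example.

```python
import math
from collections import Counter

def first_letter_bits(words):
    """Map each first letter to its surprisal, -log2(p), in bits."""
    counts = Counter(w[0].lower() for w in words if w and w[0].isalpha())
    total = sum(counts.values())
    return {ch: -math.log2(n / total) for ch, n in counts.items()}

def acronym_bits(password, bits, default=math.log2(95)):
    """Sum per-character costs; characters not seen as a first letter
    fall back to the uniform cost over 95 printable characters."""
    return sum(bits.get(ch.lower(), default) for ch in password)

# Tiny illustrative word list (an assumption); a real run would read
# a dictionary file such as /usr/share/dict/american-english.
words = ["sun", "sea", "salt", "sand", "cat", "cup", "pen", "zoo"]
bits = first_letter_bits(words)

# Common first letters cost fewer bits than rare ones.
print(bits["s"], bits["c"], bits["z"])
print(acronym_bits("scz", bits))
```

With a real dictionary, frequent initials such as s would cost noticeably less than log2(26) bits and rare ones such as x would cost more, which is the skew the proposal wants to account for.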
Hi @tewalds, I'm going to close this as a wontfix, but thanks anyway for suggesting this approach. My issue is that this adds too big an assumption about how people choose passwords; I haven't seen data to support that this is a common strategy, and even if 10% of all passwords used this scheme, it would still be incorrect to apply to 90% of other passwords.
Instead of simply closing this as wontfix, why not check the character frequencies in passwords that don't follow simple patterns? You've got a large password database, so this should be pretty easy to check. If this is a valid way of reducing the entropy as shown in real passwords, then real password crackers probably use it effectively and it should be used here. If it doesn't reduce entropy much, then wontfix seems perfectly reasonable. I've read articles about password crackers downloading giant corpora (lyrics, movie quotes, Wikipedia, etc.) and using sentences from there fairly effectively, though I'm having trouble finding a reference for that. What I did find is a similar idea of using Markov chains, which should have similar effects: https://www.trustwave.com/Resources/SpiderLabs-Blog/Hashcat-Per-Position-Markov-Chains/ .
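The check suggested above could look roughly like the following Python sketch: measure the Shannon entropy of the character distribution in a password list and compare it with the uniform assumption. The function name `char_entropy_bits` and the sample list are hypothetical, and the step of filtering out passwords that match simple patterns is not shown.

```python
import math
from collections import Counter

def char_entropy_bits(passwords):
    """Shannon entropy, in bits per character, of the pooled
    character distribution across all given passwords."""
    counts = Counter(ch for pw in passwords for ch in pw)
    total = sum(counts.values())
    return -sum((n / total) * math.log2(n / total) for n in counts.values())

# Hypothetical sample; a real check would load the filtered password list.
sample = ["aabb", "abab"]
measured = char_entropy_bits(sample)
uniform = math.log2(95)  # bits per character if all 95 printables were equally likely

print(f"measured: {measured:.2f} bits/char, uniform: {uniform:.2f} bits/char")
```

If the measured value on real data sits close to the uniform figure for the relevant character classes, the adjustment buys little and wontfix is reasonable; a large gap would suggest the skew is worth modelling.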