feat(training): add some new features #19
- l33t. A DFS now finds all possible un-leeted forms of a password. l33t.ignore / l33t.found can also be used to accept or reject candidates early, which speeds up the search for l33ts.
- multiwords. Digits and other characters can now be split into multiwords. A DFS finds all possible multiword compositions and picks the one with the largest probability.
- context. Added some fixed collocations to the context detector.
- monte carlo. Added a Monte Carlo method. Note that a password may be given several probabilities, so I try as many structures as possible to find the largest probability of a given password. The result may not be accurate. However, in my tests under 10^10 guesses, the error between the Monte Carlo estimate and actually generating 10^10 candidate passwords was less than 0.1% (5 datasets, nearly 100 million).
- the logic of the code is unchanged
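The l33t item above can be sketched as a small depth-first search. This is only an illustration, not the code from this pull request: the substitution table `LEET_MAP`, the function name `unleet_candidates`, and the modelling of l33t.found / l33t.ignore as in-memory sets are all assumptions. For simplicity, every leet character is always replaced by a plain letter.

```python
from typing import Dict, List, Set

# Hypothetical substitution table; the table used in the PR is not shown here.
LEET_MAP: Dict[str, List[str]] = {
    "0": ["o"],
    "1": ["i", "l"],
    "3": ["e"],
    "4": ["a"],
    "5": ["s"],
    "7": ["t"],
    "@": ["a"],
    "$": ["s"],
}


def unleet_candidates(token: str, found: Set[str], ignore: Set[str]) -> List[str]:
    """Enumerate un-leeted readings of `token` with a depth-first search.

    `found` and `ignore` stand in for l33t.found / l33t.ignore (assumed to be
    plain sets here): a token in `ignore` is rejected before any search, and a
    token in `found` is accepted as-is without expanding it.
    """
    lowered = token.lower()
    if lowered in ignore:  # early reject
        return []
    if lowered in found:  # early accept
        return [lowered]

    results: List[str] = []

    def dfs(index: int, prefix: str) -> None:
        if index == len(lowered):
            results.append(prefix)
            return
        char = lowered[index]
        # A leet character branches into every plain letter it may stand for;
        # any other character is kept unchanged.
        for plain in LEET_MAP.get(char, [char]):
            dfs(index + 1, prefix + plain)

    dfs(0, "")
    return results
```

With this table, `unleet_candidates("h3l1o", set(), set())` branches once at `1` and yields two readings, which a dictionary lookup or a probability model would then have to rank.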
feat(training): l33t, multiwords, context sensitive
I'm still on the way to mastering English. Sorry for my poor presentation and documentation. Note that I didn't change any files in your repo except for .gitignore; I only added new files to the repo. You can reject the pull request I submitted last time. The newly added features are used in much the same way as the original ones. The difference is that sections_list is not changed in place; you get a new instance of sections_list instead.
Usages
Changes
fixes: #18
- the cause is that I changed the return value of extract_l33t but forgot to update the corresponding code in parse(password: str). Therefore, sorted(l33t_list, key=lambda x: x[1]) did not give the correct result.
- following operation of v4.1 and v4.1-with-l33t.
- intuitive.
- fix a bug in cs detection
- add corresponding test case
- add cli to segmntr
Some Known Bugs Fixed
The l33t-related code re-sorted the password incorrectly because I changed a return value but didn't update the code that uses it. This bug is now fixed.
context_sensitive_detection had a bug. I fixed it and added a corresponding test case to your code. This is the only change I made to your code. The other changes are all in new files and won't affect your original code.
Segmntr
Parses the passwords in the test set so we can see what the structure of each password is.
fixes: #18
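To show how a changed return shape can silently break a `key=lambda x: x[1]` sort, here is a minimal sketch. The tuple layouts are hypothetical, since the actual return value of extract_l33t is not shown in this thread. The point is only that index 1 stops meaning "position" once a field is inserted before it.

```python
# Hypothetical old shape: (matched_text, start_index)
old_items = [("c00l", 4), ("z3r0", 0)]

# Hypothetical new shape: (matched_text, unleeted_text, start_index)
new_items = [("c00l", "cool", 4), ("z3r0", "zero", 0)]

# With the old shape, x[1] is the start index, so segments come back
# in the order they appear in the password.
old_sorted = sorted(old_items, key=lambda x: x[1])

# With the new shape, x[1] is now a string, so the same call sorts
# alphabetically by the un-leeted text and the positional order is lost.
stale_sorted = sorted(new_items, key=lambda x: x[1])

# Updating the key to the field that holds the position restores the order.
fixed_sorted = sorted(new_items, key=lambda x: x[2])
```

No exception is raised in the stale case, which is why this kind of bug tends to surface only through a test that checks the resulting order.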
I'm really impressed by these changes, and I apologize, as I had some things pop up in my personal life that have limited my ability to focus on this. That being said, I really would like to get these changes integrated by next Thursday, in time for Defcon as well as the Crack Me If You Can competition. Also, your English is great, so no need to apologize for that. I appreciate the work you have put into this, as there are some features here I am very excited about. As a heads up, I will likely accept your pull request, but then push another commit that temporarily disables some of your additions. Then I can slowly add those features back in as I have more time to test and understand them. I'll make sure I give you credit for these features, but since other people download and use this toolset, I want to ensure I have a good handle on them and that they won't break any of the other tools such as the guesser and prince-ling.
1) Modified the unittests to support changes to the context_sensitive wordlist
2) Found a legacy bug in website_detection that had nothing to do with this pull request.
3) Moved entries around in the .gitignore to group the different types of ignore rules
4) Removed the version requirements from chardet in requirements.txt
I apologize once again, as when I started really going through your code I realized you had already done most of what I was thinking about with moving many of the changes into new programs for people to run. I really like what I see, and by doing things like running the unit-tests again I found some errors I wasn't aware of in my own code (nothing to do with your additions).
a password and pick the one with largest probability
probability, because a password can be generated several times.
For example, 1q2w3e4r may be treated as K8 or K7A1.
fixes: #18
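The idea of keeping the largest probability among several structures and then estimating a guess number by sampling can be sketched as follows. This uses a common Monte Carlo estimator (sum of 1 / (n * p) over sampled guesses that are more probable than the target). Whether the code in this pull request uses exactly this estimator is an assumption, and the toy `model` dictionary and all function names are hypothetical.

```python
import bisect
import random
from typing import Dict, List, Sequence


def sample_probabilities(model: Dict[str, float], n: int, rng: random.Random) -> List[float]:
    """Draw n guesses from a toy model and return their probabilities, ascending."""
    guesses = list(model)
    weights = [model[g] for g in guesses]
    drawn = rng.choices(guesses, weights=weights, k=n)
    return sorted(model[g] for g in drawn)


def estimate_guess_number(prob: float, sorted_sample: Sequence[float]) -> float:
    """Estimate how many guesses are strictly more probable than `prob`.

    Each sampled guess with probability p > prob contributes 1 / (n * p).
    """
    n = len(sorted_sample)
    first_larger = bisect.bisect_right(sorted_sample, prob)
    return sum(1.0 / (n * p) for p in sorted_sample[first_larger:])


def best_probability(structure_probs: Sequence[float]) -> float:
    """A password may parse under several structures; keep the largest probability."""
    return max(structure_probs)


# Usage sketch with made-up numbers: 1q2w3e4r parsed as K8 and as K7A1.
toy_model = {"123456": 0.5, "password": 0.3, "1q2w3e4r": 0.15, "qwerty1": 0.05}
sample = sample_probabilities(toy_model, 1000, random.Random(0))
target_prob = best_probability([0.15, 0.002])  # hypothetical K8 vs K7A1 probabilities
rank_estimate = estimate_guess_number(target_prob, sample)
```

Sorting the sample once and using a binary search keeps each estimate cheap, which matters when many passwords are scored against the same sample.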