-
Notifications
You must be signed in to change notification settings - Fork 0
A collection of tools to make it easier to generate wordlists and collect tripcodes
License
MIT, MIT licenses found
Licenses found
MIT
LICENSE
MIT
LICENSE-IWI
crypt3lx2k/Tripcode-Dictionary-Tools
Folders and files
Name | Name | Last commit message | Last commit date | |
---|---|---|---|---|
Repository files navigation
Tripcode Dictionary Tools is a collection of software that makes it easy to collect tripcodes, gather candidates for wordlists and operate on 4chan boards/pages/threads. First of all, if the only reason you're reading this is because you want to crack peoples tripcodes to be disruptive and annoying, fuck off. If you violate the terms of service of the 4chan API typically with excessive requests you might get banned. If you violate the rules of 4chan on 4chan itself you will get banned, this includes dumping peoples tripcodes. If you're a douche, fuck off. Second of all, this isn't a plug and play type of software. You most likely have to do some work to do what you want to do, that being said this package provides several programs that are designed to make the job easier for you. Remember that you can invoke all of these programs with either -h or --help to see a full list of what they offer individually. Primarily 6 programs are provided, all of these programs accept input in the form of 4chan links. A 'link' can either be a full thread URL like https://boards.4chan.org/g/thread/36491091 boards.4chan.org/g/thread/36491091 or something simpler like /g/thread/36491091 For a board you can either write https://boards.4chan.org/g/ boards.4chan.org/g/ /g/ or g If you specify a board as a link the entire board is scraped. You can also specify specific pages on boards, https://boards.4chan.org/g/1 boards.4chan.org/g/1 /g/1 or g/1 If you specify a page as a link every thread on that page is scraped. crack.py: This program uses the tripcodes/public.db3 and tripcodes/secure.db3 (optionally others, invoke with --help) tripcode databases and checks them versus posts, and prints cracked tripcodes to the screen. The databases does not come with the program, you have to generate them yourself, there are several programs that are designed to aid you with this task. dump_hashes.py: This program writes every unique tripcode found to a file. You can then use this file with an external cracking program like John the Ripper (1.8.0+ supports tripcodes out of the box, but without UTF-8/SJIS support or HTML replacement). dump_words.py: This program looks all over for potential words that can be used as tripcodes, this includes in subjects, filenames and e-mail fields and writes the unique words found to a file. The file is not sorted and contains one word per line, the words are potentially a lot longer than the maximum 8 characters used to create a tripcode. dump_ngrams.py: This program looks in comments for ngrams and writes them to a file, the filter used to generate 'tokens' for this program is a lot stricter than for the dump_words.py program. The file is sorted by occurrence and has the format <space separated list of n words> <number of occurrences> These 4 programs all potentially use a lot of bandwidth, in accordance with the 4chan API all of them buffer pages and use if-modified where applicable. So if you have cached a thread by operating on it earlier, you might avoid downloading the thread again, but a request is still made to 4chans servers. This is backed by a cache file bin/cache.bin (optionally something else), if you have downloaded a specific board of 4chan already and want to operate on that you can invoke all of the above programs with the --offline flag, this makes them only use the cache and no internet. To aid this practice, build_cache.py: This program just builds the cache and does nothing else, you can afterwards invoke all of the above programs with --offline. Note that when running a program in offline mode it will set the number of threads to 1, as multiple threads in Python only tend to slow things down when major blocking I/O (like downloading) isn't involved. So for instance if you want to dump all the hashes on /g/, also dump the words and a couple of ngrams you can do this. $ ./build_cache /g/ $ ./dump_hashes --offline tripcodes.txt /g/ $ ./dump_words --offline words.txt /g/ $ ./dump_ngrams --offline bigrams.txt 2 /g/ ... Note that all of the above programs also build the cache, so if you just write $ ./dump_hashes tripcodes.txt /g/ you will also have a cached version on /g/ on your machine. prune_cache.py: This program prunes 404'ed entries from the cache, if you run this sporadically you'll avoid the cache file growing too big, if you want to build an archive or something like that you obviously don't want to run this program as old threads are simply deleted. You can also run this program in offline mode, take care when using this as it uses the offline board thread lists to see what threads are unreachable, you could potentially prune a new thread if you have been working on it without touching the board thread list. The end user is also presented with 3 secondary programs, util/makesql: This program reads a file with tripcode/solution pairs and makes databases for use with the ./crack.py program, it optionally accepts a format string in the form of a regex, invoke with --help for details. util/johntosql: This program reads a john.pot file generated by John the Ripper and creates a database based on that. util/triptestertosql: This programs read a file generated by Tripcode-Tester (https://github.com/crypt3lx2k/Tripcode-Tester) and generates a database based on that. A small tripcode list to start off with is located at http://www.pageoftext.com/PH_plain&nm_page=secure_tripcode_dictionary it's small for a regular tripcode list but it's the most significant public collection of secure tripcodes I have found on the internet. As far as I can tell the list was created by user 'jeb3' on userscripts.org http://userscripts.org/users/77660 specifically for the script 'Tripcode Breaker' http://userscripts.org/scripts/review/68857 so all credit goes to 'jeb3' for compiling that list. I do not host that list so it might go offline whenever and without notice. If you want to start by using that list you can use util/makesql in the following manner, we start off by downloading the list $ wget 'http://www.pageoftext.com/PH_plain&nm_page=secure_tripcode_dictionary' \ -O tripcodes.txt then we generate a regular and a secure tripcode database based on it $ util/makesql --regex='\solution!\tripcode!!.{11}' tripcodes.txt \ tripcodes/public.db3 $ util/makesql --regex='\solution!.{10}!!\secure' tripcodes.txt \ tripcodes/secure.db3 If you then do $ ./crack.py /sp/ you might get some results, but it's not very likely due to the small size of the database. You're probably better off creating wordlists and using them with John the Ripper, you might also want to try it with large public leaks like the rockyou list. Last but not least I have to mention that you can of course use tdt from the Python shell itself as a module. The programs themselves serve as examples how to do this. FAQ: Q: Do you hate tripcode users? A: No, software like this actually helps tripcode users choose better tripcodes as it makes weak tripcodes very apparent and therefore easy to avoid. Q: Some douche leaked my tripcode on 4chan, what do I do now? A: First of all you report the post where the leak happened for a rule violation. Then you recover, you can keep using your tripcode knowing that it is public, or you can choose a better one, very rarely do people persist in imitating others for long periods of time. If you're a high quality contributor e.g. you often contribute in draw threads or write threads people will know when you're being imitated and when you're actually posting as your imitators lack the artistic skills to properly imitate you. Q: How do I choose a good tripcode? A: The short answer to this is use a secure tripcode. After using it once you then google the hash and if you get no results on google you're probably in the clear. If you want to use a regular tripcode the same rules apply as for passwords with a few special modifications. This should be apparent, don't use dictionary words, this includes phrases commonly posted on 4chan, don't use combinations of dictionary words. Use all 8 characters, avoid the characters "<>& as they expand to their corresponding HTML entities and make your tripcode effectively very short. If you use 8 characters and you choose from alphanumeric characters (A-Za-z0-9) there are (26+26+10)^8 = 62^8 = 218340105584896 possible combinations, against a GPU with 130 million tripcodes per second it will take about 19.4 days running non-stop to exhaust the keyspace. If you use certain Japanese characters in your tripcode you effectively expand the number of combinations to a number greater than 2^56, against the same GPU it will then take 17.5 years to exhaust the keyspace. So while it is possible to brute force tripcodes it still takes an unreasonable amount of time unless someone decides to dedicate a lot of resources to it. The most secure tripcodes are those generated by a program like MTY, as these are completely random and therefore model very strong passwords. So if you want to use a regular tripcode you might as well spend a little time generating one with a specific phrase in it.
About
A collection of tools to make it easier to generate wordlists and collect tripcodes
Resources
License
MIT, MIT licenses found
Licenses found
MIT
LICENSE
MIT
LICENSE-IWI
Stars
Watchers
Forks
Releases
No releases published
Packages 0
No packages published