Option to disable hash sorting and counting #1545
Every time hashcat loads a hash file it tries to count and sort the hashes. However, this process seems to be taking a very long time.
To speed up the whole process (especially when performing batch jobs), it would be really useful if additional flags were provided:
I think the only way something like this could work is a mechanism similar to the dictstat file (for wordlists): hashcat could remember that it has already parsed and sorted this set of hashes, and store some file timestamps to make sure the files didn't change in the meantime. It could even dump the sorted (prepared) hash list to a separate file on disk and pick it up when the user runs the same hash list again and again.
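To make the idea concrete, here is a rough sketch in Python (not hashcat's C code, and not how hashcat actually handles hash lists). It treats "preparing" as strip, deduplicate, and sort, dumps the result to a sidecar file, and reuses that file on the next run. The function names and the `.prepared` suffix are made up for illustration, and the check that the source file is unchanged is deliberately left out of this sketch.

```python
import os
import tempfile


def prepare_hashes(lines):
    """Strip whitespace, drop blank lines, deduplicate, and sort."""
    return sorted({line.strip() for line in lines if line.strip()})


def load_prepared(hash_path, cache_suffix=".prepared"):
    """Return (hashes, from_cache).

    Hypothetical helper: reuses a sidecar file next to the hash list if
    one exists, otherwise prepares the list and writes the sidecar.
    No freshness check is done here; a real design would need one.
    """
    cache_path = hash_path + cache_suffix
    if os.path.exists(cache_path):
        with open(cache_path, "r", encoding="utf-8") as fh:
            return fh.read().splitlines(), True

    with open(hash_path, "r", encoding="utf-8") as fh:
        hashes = prepare_hashes(fh)

    with open(cache_path, "w", encoding="utf-8") as fh:
        fh.write("\n".join(hashes) + ("\n" if hashes else ""))
    return hashes, False


if __name__ == "__main__":
    workdir = tempfile.mkdtemp()
    path = os.path.join(workdir, "hashes.txt")
    with open(path, "w", encoding="utf-8") as fh:
        fh.write("b\na\nb\n\n")
    print(load_prepared(path))  # first run: parsed and sorted
    print(load_prepared(path))  # second run: served from the sidecar
```

The second call skips the sort and dedupe entirely, which is where the time would be saved for very large lists.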
I don't think hashcat could trust that the user sorted the hash list correctly and that it wasn't changed afterwards :( Therefore the sorting needs to be done by hashcat and verified.
The disadvantage is that in the worst case hashcat would need to store a file at least as large as the original hash file (plus some metadata, like timestamps).
I like the idea of a cached hashlist blob. However, this could create a mess in the hashcat installation folder. The details are the open question: should we store such a blob in the same folder as the original hashlist, or use a dedicated blob folder? I think the first option sounds better. The blob also needs a different filename, ideally with a special suffix (e.g. .hcdump or so).
We'll also need some sort of header. Any ideas of the required attributes?
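As a starting point for that discussion, here is one possible header layout, sketched in Python with the `struct` module. Everything in it is an assumption for illustration: the magic bytes `HCDP`, the version field, and the choice of attributes (hash mode, size and modification time of the original hashfile, number of hashes in the blob). None of this is an existing hashcat format.

```python
import struct
from collections import namedtuple

# Hypothetical layout, little-endian, no padding:
#   4s  magic bytes
#   H   format version
#   I   hash mode the list was parsed with
#   Q   size of the original hashfile in bytes
#   Q   mtime of the original hashfile in nanoseconds
#   Q   number of hashes stored in the blob
HEADER_FORMAT = "<4sHIQQQ"
HEADER_SIZE = struct.calcsize(HEADER_FORMAT)
MAGIC = b"HCDP"
VERSION = 1

BlobHeader = namedtuple(
    "BlobHeader", "magic version hash_mode src_size src_mtime_ns hash_count"
)


def pack_header(hash_mode, src_size, src_mtime_ns, hash_count):
    """Serialize the header fields into a fixed-size byte string."""
    return struct.pack(
        HEADER_FORMAT, MAGIC, VERSION, hash_mode, src_size, src_mtime_ns, hash_count
    )


def unpack_header(data):
    """Parse and sanity-check a header; raise ValueError if it is unusable."""
    if len(data) < HEADER_SIZE:
        raise ValueError("blob too short for header")
    header = BlobHeader._make(struct.unpack(HEADER_FORMAT, data[:HEADER_SIZE]))
    if header.magic != MAGIC:
        raise ValueError("not a hashlist blob")
    if header.version != VERSION:
        raise ValueError("unsupported blob version")
    return header
```

A fixed-size header keeps the check cheap: the loader only has to read the first `HEADER_SIZE` bytes to decide whether the rest of the blob is worth touching.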
Oh, and what if the user changes the data in the original hashfile? We need to make sure nothing has changed. We could, for example, store the original hashfile's file stats (as @philsmd already said, like we do for dictstat) and compare them before we allow hashcat to load the blob.
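A minimal sketch of that comparison, in Python for brevity, could look like the following. It assumes that file size plus modification time is a good enough fingerprint; the function names are hypothetical, and which stat fields dictstat really compares is not something this sketch tries to mirror.

```python
import os


def file_fingerprint(path):
    """Return (size in bytes, mtime in nanoseconds) for the given file."""
    st = os.stat(path)
    return (st.st_size, st.st_mtime_ns)


def blob_is_fresh(hash_path, recorded):
    """True only if the hashfile still matches the recorded fingerprint.

    `recorded` is the (size, mtime_ns) pair stored when the blob was
    written. A missing or unreadable hashfile counts as not fresh.
    """
    try:
        return file_fingerprint(hash_path) == tuple(recorded)
    except OSError:
        return False
```

The trade-off is that an edit which keeps both the size and the timestamp identical would go unnoticed. A content checksum would catch that, but it means reading the whole hashfile again, which eats into the time the blob is supposed to save.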
Finally, the question is: is such a complex system worth it, and what's the benefit? The only users affected would be those with very large hashlists; for hashlists < 1M hashes this brings almost no advantage. I'd think most of the hashlists loaded by hashcat are small hashlists and single hashes, but I don't have any real statistics about that.