Always store a raw hash if possible #1469

Open
magnumripper opened this issue Jun 25, 2015 · 10 comments

Comments

@magnumripper
Member

Here's an idea from #1463

Any format that isn't using UTF-16 or something like that, and that is raw or "salted raw", could store its hash as raw. When loading the pot file, it would then need to look for raw hashes of the correct type and apply some logic to them.

Here's an example:

$ cat john.pot
$dynamic_0$488911ea30ce711dca20bf7413fc49cc:johnripper
$dynamic_1$488911ea30ce711dca20bf7413fc49cc$ripper:john
$dynamic_1$488911ea30ce711dca20bf7413fc49cc$ipper:johnr
$dynamic_4$488911ea30ce711dca20bf7413fc49cc$john:ripper
$dynamic_4$488911ea30ce711dca20bf7413fc49cc$jo:hnripper

With clever handling (and perhaps a revision of john.pot format) any of the formats would be able to use any one of the lines to produce a correct --show figure from an input hash.

For example, you load the hash $dynamic_1$488911ea30ce711dca20bf7413fc49cc$ripper. The format finds the line $dynamic_4$488911ea30ce711dca20bf7413fc49cc$john:ripper in john.pot and draws the conclusion the proper plaintext for dynamic_1 is "john".
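
A minimal sketch of that cross-format lookup, assuming dynamic_0 hashes the bare password, dynamic_1 appends the salt and dynamic_4 prepends it (which is what the pot lines above imply). The function names and the SALT_LAYOUT table are illustrative, not john's code, and salts that contain ":" or are hexified are not handled.

```python
from typing import Optional

# How each tag combines password and salt, as implied by the pot lines above.
# "none": md5(p), "append": md5(p + salt), "prepend": md5(salt + p)
SALT_LAYOUT = {"dynamic_0": "none", "dynamic_1": "append", "dynamic_4": "prepend"}


def parse_tagged(ciphertext: str):
    """Split '$dynamic_N$<hex>[$salt]' into (tag, hex digest, salt)."""
    _, tag, rest = ciphertext.split("$", 2)
    digest, _, salt = rest.partition("$")
    return tag, digest, salt


def preimage(tag: str, salt: str, password: str) -> str:
    """Rebuild the full string that was fed to MD5."""
    layout = SALT_LAYOUT[tag]
    if layout == "append":
        return password + salt
    if layout == "prepend":
        return salt + password
    return password


def password_from_preimage(tag: str, salt: str, full: str) -> Optional[str]:
    """Strip this format's salt from a known preimage, or give up."""
    layout = SALT_LAYOUT[tag]
    if layout == "none":
        return full
    if layout == "append" and full.endswith(salt):
        return full[: len(full) - len(salt)]
    if layout == "prepend" and full.startswith(salt):
        return full[len(salt):]
    return None


def translate(pot_line: str, wanted: str) -> Optional[str]:
    """Use a pot line from one format to answer a hash of another format."""
    pot_cipher, _, pot_password = pot_line.partition(":")
    pot_tag, pot_digest, pot_salt = parse_tagged(pot_cipher)
    tag, digest, salt = parse_tagged(wanted)
    if digest != pot_digest:
        return None
    full = preimage(pot_tag, pot_salt, pot_password)
    return password_from_preimage(tag, salt, full)


line = "$dynamic_4$488911ea30ce711dca20bf7413fc49cc$john:ripper"
print(translate(line, "$dynamic_1$488911ea30ce711dca20bf7413fc49cc$ripper"))  # john
```

The lookup only compares digests, so it never recomputes MD5; it trusts that the pot line is a genuine crack.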

But taking it a step further, all those formats (as well as the thick raw-md5's) should store their hash like this:

$md5$488911ea30ce711dca20bf7413fc49cc:johnripper

And then they'd have to apply logic to it when reading the pot file back. This would be simpler (and more efficient) than trying to parse all other formats.

@magnumripper
Member Author

I'm dynamic_1. I get this hash

$dynamic_1$488911ea30ce711dca20bf7413fc49cc$ripper

and then I read the pot file and see this hash

$md5$488911ea30ce711dca20bf7413fc49cc:johnripper

I see it's the same hash, so I'll bite. I know my salt is appended and is "ripper" so I strip that from the pot entry's string, and use the remainder of it: "john".
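
A rough sketch of that step, assuming the proposed $md5$ raw tag and a salt that is appended as-is (no hexifying). The helper names load_raw_pot and dynamic_1_show are made up for illustration and do not exist in john.

```python
from typing import Dict, Iterable, Optional


def load_raw_pot(lines: Iterable[str]) -> Dict[str, str]:
    """Map raw digest -> full cracked string for '$md5$<hex>:<plain>' lines."""
    table = {}
    for line in lines:
        cipher, sep, plain = line.rstrip("\r\n").partition(":")
        if sep and cipher.startswith("$md5$"):
            table[cipher[len("$md5$"):]] = plain
    return table


def dynamic_1_show(ciphertext: str, raw_pot: Dict[str, str]) -> Optional[str]:
    """Answer a '$dynamic_1$<hex>$<salt>' hash from a raw pot entry."""
    prefix = "$dynamic_1$"
    if not ciphertext.startswith(prefix):
        return None
    digest, sep, salt = ciphertext[len(prefix):].partition("$")
    full = raw_pot.get(digest)
    # The salt is appended, so it must be a suffix of the stored string.
    if not sep or full is None or not full.endswith(salt):
        return None
    return full[: len(full) - len(salt)]


pot = load_raw_pot(["$md5$488911ea30ce711dca20bf7413fc49cc:johnripper\n"])
print(dynamic_1_show("$dynamic_1$488911ea30ce711dca20bf7413fc49cc$ripper", pot))  # john
```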

@frank-dittrich
Collaborator

Given that some formats "hexify" their salts when they contain certain problematic characters, this can get messy.
Also, I'd like the pot file to contain the real passwords, i.e. johnripper, john and ripper, for loopback mode, --make-charset, generating Markov stats files, statistics on password frequency, etc.
Maybe this should be discussed on john-dev?

@magnumripper
Member Author

Maybe this should be discussed on john-dev?

Sure (and john-users is probably better) but that doesn't usually add anything. We're still just toying with the ideas.

@magnumripper
Member Author

Also, I'd like the pot file to contain the real passwords, i.e. johnripper, john and ripper, for loopback mode, --make-charset, generating Markov stats files, statistics on password frequency, etc.

This is a valid concern. I agree but I still think this is interesting. I'll ponder it for a while.

@davidbolvansky

@magnumripper How to parse just hashes and passwords out of john.pot file?

Can I reliably drop everything between first $ and second $?
“$dynamic_0$488911ea30ce711dca20bf7413fc49cc:johnripper”

@AlekseyCherepanov
Member

How to parse just hashes and passwords out of john.pot file?

Can I reliably drop everything between first $ and second $?
“$dynamic_0$488911ea30ce711dca20bf7413fc49cc:johnripper”

@davidbolvansky At the moment john stores an unambiguous hash plus the password in the .pot file. Hash and password are separated by :. (Passwords are not encoded as $HEX[] as in hashcat, so a \n in a password causes trouble. There are also some options regarding encodings; the default is to store as UTF-8.)

An unambiguous hash means that the hash format can be inferred correctly from the string. We call this the canonical form, i.e. john always identifies the format correctly when loading hashes in this form. There might be multiple formats that can load such a hash, but all of them should be able to crack it with the given password (the only exception is confusion between hmac-* formats and other formats with # in the hash; also, some "compatible" formats have different length limits, which prevents some of them from cracking a particular password, e.g. nt vs nt-long). E.g. $dynamic_0$ is for the raw-md5, raw-md5-opencl and dynamic_0 formats, which are different implementations of the same hash type.

We use the word "tag" for prefixes like $dynamic_0$. Simple formats just get a tag to express their format. Some formats do not modify the hash at all because the common form of the hash is already unambiguous (e.g. the sha256crypt format has only one form). But some formats modify hashes much more heavily (e.g. the scrypt format with its $9$ and $ScryptKDF.pm$ forms). Also, formats with very long hashes store a reduced form of the hash in the .pot file; the reduced form contains a $SOURCE_HASH$ substring (e.g. the 7z format uses it).

$ john --format=scrypt --list=format-tests
[...]
scrypt	6	$9$X9fA8mypebLFVj$Klp6X9hxNhkns0kwUIinvLRSIgWOvCwDhVTZqjsycyU	JtR
scrypt	7	$ScryptKDF.pm$16384*8*1*bjZkemVmZ3lWVi42*cmBflTPsqGIbg9ZIJRTQdbic8OCUH+904TFmNPBkuEA=	test123
[...]

We call 32 bare hex digits a bare hash. In particular, 32 hex digits could be raw-md5, raw-md4 or one of a few more predefined formats. Moreover, it might be md5(md5($p)). In some cases, a bare hash can be cracked as different formats, e.g. ec0405c5aef93e771cd80e0db180b88b is the raw-md5 of 900150983cd24fb0d6963f7d28e17f72 and also the md5(md5($p)) of abc. That's why tags are pretty important.

The problem is that conversion from some form into the canonical form is a one-way road. There is no code in john to do the reverse conversion. So you have a few options:
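
The ambiguity is easy to reproduce with Python's hashlib. This is only an illustration of the two readings of one bare hash; the helper md5_hex is a made-up name, and it assumes md5(md5($p)) hashes the lowercase hex text of the inner digest.

```python
import hashlib


def md5_hex(data: str) -> str:
    """MD5 of an ASCII string, as lowercase hex."""
    return hashlib.md5(data.encode("ascii")).hexdigest()


inner = md5_hex("abc")   # raw-md5 of the password "abc"
outer = md5_hex(inner)   # raw-md5 of the 32-character hex string

print(inner)  # 900150983cd24fb0d6963f7d28e17f72
print(outer)
# Compare against the bare hash quoted above.
print(outer == "ec0405c5aef93e771cd80e0db180b88b")
```

Both "the password is the hex string" and "the password is abc under md5(md5($p))" are valid cracks of the same bare hash, and only the tag says which one was meant.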

  • If you are interested in raw-md5 only then you may pick lines starting with $dynamic_0$ and remove the tag. That's safe at the moment (i.e. before this issue is implemented).

  • For given password files you may use the --show=formats option to get a list of hashes in canonical form and build a table for reverse lookup.

  • You may add unique usernames to the hashes in the password file and use the --show option to make john print username + password, relying on john's own conversion and matching.
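
The first option above can be sketched as follows. It assumes the pot file is read as text, one entry per line, and that the bare hash itself contains no ":"; the function name raw_md5_cracks is hypothetical.

```python
from typing import Dict, Iterable

TAG = "$dynamic_0$"


def raw_md5_cracks(pot_lines: Iterable[str]) -> Dict[str, str]:
    """Map bare MD5 hex -> password for pot lines tagged $dynamic_0$."""
    cracks = {}
    for line in pot_lines:
        line = line.rstrip("\r\n")
        if not line.startswith(TAG):
            continue
        # Split at the first ':' only; the password may itself contain ':'.
        bare, sep, password = line[len(TAG):].partition(":")
        if sep:
            cracks[bare] = password
    return cracks


sample_pot = [
    "$dynamic_0$7d793037a0760186574b0282f2f435e7:world\n",
    "$LM$cbc501a4d2227783:AAAAAAA\n",
]
print(raw_md5_cracks(sample_pot))
```

As noted above, passwords containing a newline are not escaped in the pot file, so a line-based reader like this one will mis-parse them.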

@davidbolvansky

Let me explain my situation..
I have the hashlist "md5.hash" with:

5d41402abc4b2a76b9719d911017c59a
aa4140bbbc4b2a76b9719d911017c592
7d793037a0760186574b0282f2f435e7
5d4140bbbc4b2a76b9719d911017c592

Now john has cracked one hash, so the john.pot file contains:

$ cat john.pot
$dynamic_0$7d793037a0760186574b0282f2f435e7:world
$ ./john --show --format=RAW-MD5 md5.hash
?:world

1 password hash cracked, 3 left

So what I need is direct mapping hash <--> found password. Yes, I was thinking about username trick, could work.

The --show=formats option you mentioned also looks useful:

$ ./john --format=RAW-MD5 --show=formats md5.hash
[{"lineNo":1,"ciphertext":"5d41402abc4b2a76b9719d911017c59a","rowFormats":[{"label":"Raw-MD5","prepareEqCiphertext":true,"canonHash":["$dynamic_0$5d41402abc4b2a76b9719d911017c59a"]}]},
{"lineNo":2,"ciphertext":"aa4140bbbc4b2a76b9719d911017c592","rowFormats":[{"label":"Raw-MD5","prepareEqCiphertext":true,"canonHash":["$dynamic_0$aa4140bbbc4b2a76b9719d911017c592"]}]},
{"lineNo":3,"ciphertext":"7d793037a0760186574b0282f2f435e7","rowFormats":[{"label":"Raw-MD5","prepareEqCiphertext":true,"canonHash":["$dynamic_0$7d793037a0760186574b0282f2f435e7"]}]},
{"lineNo":4,"ciphertext":"5d4140bbbc4b2a76b9719d911017c592","rowFormats":[{"label":"Raw-MD5","prepareEqCiphertext":true,"canonHash":["$dynamic_0$5d4140bbbc4b2a76b9719d911017c592"]}]}]

If you are interested in raw-md5 only then you may pick lines starting with $dynamic_0$ and remove the tag. That's safe at the moment (i.e. before this issue is implemented).

I am interested in any john hash type, so I need to find some general solution to properly match hash (input from me) to newly found password.
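
One way to get that mapping is to invert the --show=formats output shown above and match it against the pot file. The sketch below assumes the JSON shape printed above (ciphertext, rowFormats, canonHash) and that canonical hashes contain no ":"; canon_to_input and match_pot are illustrative names, not part of john.

```python
import json
from typing import Dict, Iterable, Iterator, Tuple


def canon_to_input(show_formats_json: str) -> Dict[str, str]:
    """Map every canonical hash to the input ciphertext it came from."""
    lookup = {}
    for row in json.loads(show_formats_json):
        for fmt in row.get("rowFormats", []):
            for canon in fmt.get("canonHash", []):
                lookup[canon] = row["ciphertext"]
    return lookup


def match_pot(pot_lines: Iterable[str],
              lookup: Dict[str, str]) -> Iterator[Tuple[str, str]]:
    """Yield (input hash, password) for pot lines whose hash is in lookup."""
    for line in pot_lines:
        canon, sep, password = line.rstrip("\r\n").partition(":")
        if sep and canon in lookup:
            yield lookup[canon], password


# One row copied from the output above, plus the matching pot line.
sample = json.dumps([
    {"lineNo": 3, "ciphertext": "7d793037a0760186574b0282f2f435e7",
     "rowFormats": [{"label": "Raw-MD5", "prepareEqCiphertext": True,
                     "canonHash": ["$dynamic_0$7d793037a0760186574b0282f2f435e7"]}]},
])
pot = ["$dynamic_0$7d793037a0760186574b0282f2f435e7:world\n"]
print(list(match_pot(pot, canon_to_input(sample))))
```

If a row lists several canonical hashes, each of them maps back to the same input line, so a match may describe only part of the original hash.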

@AlekseyCherepanov
Member

To print hash + password, it might be possible to copy hash as username. It would be elegant:

$ cat md5.hash
5d41402abc4b2a76b9719d911017c59a:5d41402abc4b2a76b9719d911017c59a
aa4140bbbc4b2a76b9719d911017c592:aa4140bbbc4b2a76b9719d911017c592
7d793037a0760186574b0282f2f435e7:7d793037a0760186574b0282f2f435e7
5d4140bbbc4b2a76b9719d911017c592:5d4140bbbc4b2a76b9719d911017c592

$ cat md5.pot
$dynamic_0$7d793037a0760186574b0282f2f435e7:world

$ ./john/run/john --pot=md5.pot md5.hash --format=raw-md5 --show
7d793037a0760186574b0282f2f435e7:world

1 password hash cracked, 3 left
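
A file like md5.hash above can be generated from a list of bare hashes with a few lines. This is only a sketch, assuming one hash per input line and no ":" inside the hashes; hash_as_username is a made-up helper name.

```python
from typing import Iterable, List


def hash_as_username(hashes: Iterable[str]) -> List[str]:
    """Turn bare hashes into 'hash:hash' lines (login field = the hash)."""
    stripped = (line.strip() for line in hashes)
    return [f"{h}:{h}" for h in stripped if h]


bare = ["5d41402abc4b2a76b9719d911017c59a\n", "7d793037a0760186574b0282f2f435e7\n"]
print("\n".join(hash_as_username(bare)))
```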

But there is a problem: some formats use the original username. john encodes username into canonical form, so it is possible to replace username after that. But the trick above will not work.

$ cat username.pw
abc:e99a18c428cb38d5f260853678922e03
def:e88ebfe1ae982a6da01436e48af6eb74

$ ./john/run/john --pot=username.pot username.pw --format='dynamic=md5($u.$p)' --mask=123
[...]
123              (abc)     
123              (def)     
[...]

$ cat username.pot
@dynamic=md5($u.$p)@e99a18c428cb38d5f260853678922e03$HEX$2455616263:123
@dynamic=md5($u.$p)@e88ebfe1ae982a6da01436e48af6eb74$HEX$2455646566:123

So it looks like the only generic and reliable way is to use --show=formats.
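
The trailing field in the username.pot lines above looks like a hex-encoded salt. A small sketch for inspecting it, assuming $HEX$ is simply followed by hex digits of the raw bytes; decode_hex_salt is a hypothetical helper, not a john function.

```python
def decode_hex_salt(field: str) -> bytes:
    """Decode a '$HEX$<hexdigits>' field; other fields pass through as bytes."""
    prefix = "$HEX$"
    if field.startswith(prefix):
        return bytes.fromhex(field[len(prefix):])
    return field.encode("utf-8")


print(decode_hex_salt("$HEX$2455616263"))  # b'$Uabc'
print(decode_hex_salt("$HEX$2455646566"))  # b'$Udef'
```

The decoded values carry the usernames abc and def behind a $U marker, which illustrates why a hash-as-username trick changes the canonical hash for formats that use $u.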

(BTW inline dynamic= formats do not match named formats, e.g. --format='dynamic=md5($p)' will not match cracks with $dynamic_0$ tag, and --format=raw-md5 will not match cracks with @dynamic=md5($p)@ prefix.)

Back to the topic of this issue: formats with $u might benefit from the same trick of storing the cracked raw hash that is proposed for salted formats.

@AlekseyCherepanov
Member

john encodes username into canonical form, so it is possible to replace username after that.

That's incorrect: john picks new username again, at least for ad-hoc dynamic formats.

There is one more nuance in parsing .pot files: some formats can split the original hash into multiple pieces. The most notable such format is LM: the original hash is 32 hex digits, but it is split into independent halves. That's why "canonHash" in --show=formats is an array. Such hashes can be cracked and shown partially. Another format with multiple pieces is descrypt (when the hash has length 24). I did not find other such formats. (These formats have a valid() method returning values greater than 1. Their split() method describes how to extract the parts.)

An incorrect synthetic LM hash can have a left half whose password is shorter than 7 chars. john will not treat such a half as cracked.

$ cat lm.pot
$LM$cbc501a4d2227783:AAAAAAA
$LM$1fb363feb834c12d:ZZZZZZ

$ cat lm.pw
cbc501a4d22277831fb363feb834c12d
AAAAAAAAAAAAAAAA1fb363feb834c12d
1fb363feb834c12d1fb363feb834c12d

$ ./john/run/john --format=lm --pot=lm.pot lm.pw --show
?:AAAAAAAZZZZZZ
?:???????ZZZZZZ
?:???????ZZZZZZ

4 password hashes cracked, 2 left
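
The joining of halves seen in that output can be imitated roughly as below. This is a sketch, not john's logic: it assumes a 32-hex-digit LM hash splits into two 16-digit pieces stored under the $LM$ tag, and it treats a left piece shorter than 7 characters as not cracked unless the right piece is empty. The names load_lm_pot and show_lm are hypothetical.

```python
from typing import Dict, Iterable, Optional

UNKNOWN = "???????"


def load_lm_pot(lines: Iterable[str]) -> Dict[str, str]:
    """Map 16-hex-digit LM half -> cracked piece for '$LM$<half>:<plain>' lines."""
    table = {}
    for line in lines:
        cipher, sep, plain = line.rstrip("\r\n").partition(":")
        if sep and cipher.startswith("$LM$"):
            table[cipher[len("$LM$"):].lower()] = plain
    return table


def show_lm(full_hash: str, pot: Dict[str, str]) -> Optional[str]:
    """Join both halves, using a placeholder for a half that is not cracked."""
    left = pot.get(full_hash[:16].lower())
    right = pot.get(full_hash[16:32].lower())
    # A left piece shorter than 7 characters only makes sense when the right
    # piece is empty; otherwise treat it as not cracked (assumption modelled
    # on the remark and output above).
    if left is not None and len(left) < 7 and right != "":
        left = None
    if left is None and right is None:
        return None
    left_text = UNKNOWN if left is None else left
    right_text = UNKNOWN if right is None else right
    return left_text + right_text


pot = load_lm_pot(["$LM$cbc501a4d2227783:AAAAAAA\n", "$LM$1fb363feb834c12d:ZZZZZZ\n"])
for h in ("cbc501a4d22277831fb363feb834c12d",
          "AAAAAAAAAAAAAAAA1fb363feb834c12d",
          "1fb363feb834c12d1fb363feb834c12d"):
    print(show_lm(h, pot))
```

With the lm.pot and lm.pw contents above, this prints AAAAAAAZZZZZZ once and ???????ZZZZZZ twice, matching the password column of the --show output.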

@magnumripper
Member Author

I think we should implement this logic for hashes lacking a login.
