Hashquines

Hashquines are files showing their own MD5.

It seems like an impossible magic trick: modifying the file's contents changes the hash, so the hash can't be known in advance. The strategy is therefore the opposite: make the file able to display any value, including its actual hash, without changing the overall hash of the file.

While the security risk they represent is debatable, they certainly show that MD5 is now just a fun toy.

Strategies

Self-check

'Cheating' by just computing the file's own hash value at runtime and displaying it. These files don't rely on hash collisions.

In Python:

import hashlib
import sys

# Hash this script's own bytes and print the hex digest.
with open(sys.argv[0], "rb") as f:
    print(hashlib.md5(f.read()).hexdigest())

In Batch:

md5sum %0

Read an encoded value

Chain enough fast collisions to encode a hash value without modifying the final hash value, then execute some code that reads the encoded value and displays it.

Some fixed offsets of FastColl collision blocks will be xored with 0x80, so that bit can be checked reliably - in this case, one collision equals 1 bit of information.

An MD5 hash is 128 bits, so 128 FastColl collisions are needed.
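The decoding step described above can be sketched as follows. This is only an illustration under stated assumptions: the stride between collisions, the offset of the tested byte, and both helper names are hypothetical, and the demo sets the bits directly in a simulated buffer, whereas a real hashquine would swap one colliding block for the other.

```python
import hashlib

BLOCK_STRIDE = 128  # assumption: collisions are laid out back to back, 2 x 64 bytes each
BIT_OFFSET = 19     # assumption: offset, inside each collision, of a byte whose top bit differs


def read_encoded_value(data, first_block, count=128):
    """Rebuild an integer from the top bit (0x80) of one byte per collision."""
    value = 0
    for i in range(count):
        byte = data[first_block + i * BLOCK_STRIDE + BIT_OFFSET]
        value = (value << 1) | (byte >> 7)
    return value


def write_encoded_value(buf, first_block, value, count=128):
    """Demo helper: set the same bits directly instead of swapping real collision blocks."""
    for i in range(count):
        pos = first_block + i * BLOCK_STRIDE + BIT_OFFSET
        bit = (value >> (count - 1 - i)) & 1
        buf[pos] = (buf[pos] & 0x7F) | (bit << 7)


# Simulated file: a 16-byte header followed by 128 collision-sized slots.
fake_file = bytearray(16 + 128 * BLOCK_STRIDE)
digest = hashlib.md5(b"stand-in content").digest()
write_encoded_value(fake_file, 16, int.from_bytes(digest, "big"))

decoded = read_encoded_value(fake_file, 16)
print(format(decoded, "032x"))
```

In an actual hashquine the displayed value matches the file's MD5 only because every bit choice leaves that MD5 unchanged; the simulated buffer above does not have that property.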

Abuse format parsing

Abuse the format structure to make the parser display one digit or another while keeping the same hash value, and repeat for each digit position and for every value that can be displayed (32*16=512).

This is like colliding 2 images or documents, except that the contents of many different elements collide at various positions, so that the file displays the right value of its actual hash without changing the overall file hash.

Depending on the format, a possible way is to use a collision as a switch to enable one character or another. In this case, N chained collisions display N+1 different objects in the same position.

To display the hex value of an MD5, each of the 32 digits must be able to show any of 16 characters, which takes 15 collisions per digit, so 480 (=32*15) collisions in total.
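The "collision as a switch" idea and the resulting count can be modelled in a few lines. The model is an assumption made for illustration: each switch either stops at its own glyph or hands over to the next switch, which is one way N switches can select among N+1 objects. The function names are hypothetical and no real collision is involved.

```python
HEX_DIGITS = "0123456789abcdef"


def selected_glyph(switches, glyphs=HEX_DIGITS):
    """Return the glyph selected by a chain of N switches over N+1 glyphs.

    Assumed model: a switch set to False stops at its own glyph,
    a switch set to True passes control on to the next switch.
    """
    if len(switches) != len(glyphs) - 1:
        raise ValueError("need one switch fewer than there are glyphs")
    for index, passed_on in enumerate(switches):
        if not passed_on:
            return glyphs[index]
    return glyphs[-1]


def switches_for(glyph, glyphs=HEX_DIGITS):
    """Switch settings that make the chain display the requested glyph."""
    index = glyphs.index(glyph)
    return [i < index for i in range(len(glyphs) - 1)]


collisions_per_digit = len(HEX_DIGITS) - 1    # 15 switches choose among 16 characters
total_collisions = 32 * collisions_per_digit  # 32 hex digits in an MD5
print(total_collisions)
```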

A way to reduce this number of collisions is to display the characters on a 7-segment display, in which case each segment needs to be toggled on or off (like a bit). In this case, 224 (=32*7) collisions are required.

 -
| |
 -
| |
 -
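As a rough illustration of the 7-segment approach, the sketch below renders hex digits with the same layout as the drawing above, one on/off toggle per segment. The segment table is the conventional 7-segment encoding of hexadecimal digits and the function names are hypothetical; the actual hashquines toggle segments through collisions, not through a lookup table.

```python
# Segment names: a=top, b=top right, c=bottom right, d=bottom,
# e=bottom left, f=top left, g=middle.
SEGMENTS = {
    "0": "abcdef", "1": "bc", "2": "abdeg", "3": "abcdg",
    "4": "bcfg", "5": "acdfg", "6": "acdefg", "7": "abc",
    "8": "abcdefg", "9": "abcdfg", "a": "abcefg", "b": "cdefg",
    "c": "adef", "d": "bcdeg", "e": "adefg", "f": "aefg",
}


def render_digit(char):
    """Return the 5 text rows of one hex digit drawn with 7 segments."""
    lit = SEGMENTS[char.lower()]

    def seg(name, glyph):
        return glyph if name in lit else " "

    return [
        " " + seg("a", "-") + " ",
        seg("f", "|") + " " + seg("b", "|"),
        " " + seg("g", "-") + " ",
        seg("e", "|") + " " + seg("c", "|"),
        " " + seg("d", "-") + " ",
    ]


def render_hex(text):
    """Draw several digits side by side."""
    digits = [render_digit(c) for c in text]
    return "\n".join(" ".join(d[row] for d in digits) for row in range(5))


toggles_needed = 32 * 7  # one collision per segment, 7 segments per digit
print(render_hex("66da5e07"))
```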

Examples

$ pdftotext -q md5text.pdf -
66DA5E07C0FD4C921679A65931FF8393

$ md5sum md5text.pdf
66da5e07c0fd4c921679a65931ff8393 *md5text.pdf
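The same check can be scripted. A minimal sketch, assuming the displayed text has already been extracted (for instance with pdftotext as above); the function name is hypothetical:

```python
import hashlib


def shows_own_md5(path, displayed_text):
    """Return True if the text displayed by the file equals the file's MD5 (case-insensitive)."""
    with open(path, "rb") as f:
        actual = hashlib.md5(f.read()).hexdigest()
    return displayed_text.strip().lower() == actual
```

The comparison ignores case and surrounding whitespace because, as in the example above, the file may display uppercase digits while md5sum prints lowercase.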

  • PoC||GTFO 0x14 is a polyglot file: simultaneously a JPG in PDF hashquine and a NES hashquine, along with a hidden cover, all while keeping the same MD5 - a classic collision, albeit the JPG picture has a lot of custom scans.

Cf self-descriptive image (here as PNG):

  • a PNG hashquine by David 'Retr0id' Buchanan

Notes

Custom FastColl

For these hashquines, new forms of hash collisions were introduced by Mako:

  • Rather than relying on UniColl, FastColl was modified to force the creation of a JPEG comment FF FE right before the collision difference. Also useable for standard JPEG collisions.

  • Similarly, in the PDF hashquine with text, 32b of the FastColl are forced to be Do(...) which is a valid PDF operator. It's very nice to have such a short text operator that can be abused in the middle of a collision block!
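To see why a comment marker placed right before the collision difference is useful, here is a small sketch of how a JPEG parser skips a comment segment (marker FF FE followed by a 2-byte big-endian length that counts itself). It assumes, for illustration only, that the byte that differs between the two colliding files is the first length byte; the helper names and buffer sizes are hypothetical and no real collision is computed.

```python
def skip_comment(data, pos):
    """Return the offset just past the JPEG comment segment that starts at pos."""
    if data[pos:pos + 2] != b"\xff\xfe":
        raise ValueError("no comment marker at this offset")
    # The 2-byte length is big-endian and includes the length field itself.
    length = int.from_bytes(data[pos + 2:pos + 4], "big")
    return pos + 2 + length


def make_file(first_length_byte):
    """Demo buffer: FF FE, a 2-byte length, then padding (not a real JPEG)."""
    buf = bytearray(0x9000)
    buf[0:4] = bytes([0xFF, 0xFE, first_length_byte, 0x10])
    return bytes(buf)


short_side = make_file(0x00)  # comment length 0x0010
long_side = make_file(0x80)   # same byte xored with 0x80: comment length 0x8010

print(hex(skip_comment(short_side, 0)), hex(skip_comment(long_side, 0)))
```

The two buffers differ by a single bit, yet the parser resumes at two different offsets, so different content can be placed at each landing point.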

Hiding collision blocks

As usual, hiding collision blocks can be tricky. Here are some introduced techniques.

  • Mako abused PDF name syntax: the last character before the collision block is a /, so the start of the collision block is parsed as a PDF name. The name has to be reused later in the file.

For example, the first collision block defines the atrocious but working name /öÃÝüúá.3�A�¢å¦�e�»ñæÀ¿Wæ��‚b—�»ò´óûàÊÄÃ’�\q±*��,ÆýH™æ�S�ùÞsp.

  • Retr0id put the collision blocks in plain sight: a custom palette is used to hide all colors but 0, and collision blocks containing null values are rejected.

Here's a crop of the picture around the 8 digit with a more revealing palette (black and red colors unmodified).

These are UniColl blocks that turn the red pixels on or off (color 0 or 1).

Misc

  • detectcoll in unsafe mode can enumerate all FastColl and UniColl occurrences present in the file - cf logs.

    • Knowing the offsets and the types of the collisions from just one file, it's then possible to modify or reset the collisions and alter what the files display without changing their MD5 or recomputing any collision, whether it's an 'encoding' hashquine (ex: for NES) or a 'format' hashquine (ex: JPG in PDF).
  • Since Unicoll only changes one byte, Retr0id's PNG font is made of one pixel per line:

  *
     *
*
       *
*
       *
  *
     *
*
       *
*
       *
  *
     *
  • Since correct Adler32 values are required for zlib streams and correct CRC32 values are required for PNG chunks, Retr0id had to forge these values while keeping the same MD5 hash: to do that, he added many FastColl blocks and swapped them in a clever meet-in-the-middle attack to reach a suitable value - cf this gist.

The Fastcoll blocks to forge the CRCs are 'visible' at the bottom of the picture (here, with a revealing palette).
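The general idea of reaching a target CRC32 by choosing one block out of each pair can be sketched with a toy meet-in-the-middle search. This is not Retr0id's code: the block pairs below are arbitrary stand-in bytes rather than MD5 collisions, the names are hypothetical, and only 16 pairs are used, so the demo picks a target that is known to be reachable.

```python
import hashlib
import zlib


def make_pairs(n):
    """Stand-in block pairs (NOT real MD5 collisions), deterministic for the demo."""
    return [(hashlib.sha256(b"a%d" % i).digest(),
             hashlib.sha256(b"b%d" % i).digest()) for i in range(n)]


def build(pairs, choice):
    """Concatenate one block per pair; bit i of `choice` selects the second block."""
    return b"".join(pair[(choice >> i) & 1] for i, pair in enumerate(pairs))


def find_choice(pairs, target_crc):
    """Meet-in-the-middle search for a block selection whose CRC32 is target_crc."""
    n = len(pairs)
    base = zlib.crc32(build(pairs, 0))
    # CRC32 is affine over equal-length inputs: swapping block i XORs a fixed delta.
    deltas = [zlib.crc32(build(pairs, 1 << i)) ^ base for i in range(n)]
    half = n // 2

    def xor_subset(mask, offset, count):
        acc = 0
        for i in range(count):
            if (mask >> i) & 1:
                acc ^= deltas[offset + i]
        return acc

    # Every XOR sum reachable with the first half of the blocks.
    left = {xor_subset(m, 0, half): m for m in range(1 << half)}
    # Try every second-half selection and look for a matching first half.
    for m in range(1 << (n - half)):
        need = target_crc ^ base ^ xor_subset(m, half, n - half)
        if need in left:
            return left[need] | (m << half)
    return None


pairs = make_pairs(16)
target = zlib.crc32(build(pairs, 0b1010011100101101))  # reachable by construction
found = find_choice(pairs, target)
print(found is not None and zlib.crc32(build(pairs, found)) == target)
```

With real colliding blocks, every selection would also leave the MD5 unchanged, which is what makes this kind of search compatible with a hashquine.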

More than hash values

Chained collisions can encode or display a hash value, but they can just as well encode anything else, even much larger data.

Retr0id combined parallelized FastColl with a Linux 4kb shellcode loader, generating a hashquine (benign), a rickroll (fun), a meterpreter (evil) or anything you want, all with the same final hash.

$ python3 monomorph.py bin/monomorph.linux.x86-64.benign benign
[...]
$ python3 monomorph.py bin/monomorph.linux.x86-64.benign hashquine sample_payloads/bin/hashquine.bin
[...]
$ benign
$ hashquine
My MD5 is: 3cebbe60d91ce760409bbe513593e401
$ md5sum benign hashquine
3cebbe60d91ce760409bbe513593e401  benign
3cebbe60d91ce760409bbe513593e401  hashquine