Wolves Among the Sheep
Some security tools still rely on MD5 to identify malware samples, years after practical collisions were demonstrated against the algorithm. This can be exploited by first showing these tools a harmless sample (Sheep) and then a malicious one (Wolf) that has the same MD5 hash. Please use this code to test whether the security products within your reach use MD5 internally to fingerprint binaries, and share your results by issuing a pull request updating the contents of RESULTS.md.
Works-on-a-different-machine-than-mine version, feedback is welcome!
- 32-bit Windows (virtual) machine (64-bit breaks stuff)
- Visual Studio 2012 to compile the projects (Express will do)
- Fastcoll for collisions
- Optional: Cygwin+MinGW to compile Evilize
Extract Fastcoll to the fastcoll directory. Name the executable fastcoll.exe.
Use shepherd.bat to generate wolf.exe and sheep.exe (in the VS Development Command Prompt):
> shepherd.bat YOURPASSWORD your_shellcode.raw
After this step you should have your two colliding binaries (sheep.exe and wolf.exe in the evilize directory).
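Before testing any product, it is worth confirming that the two binaries really collide. The following is a minimal sketch, not part of the project: the helper names `digests` and `is_md5_collision` are hypothetical, and the file paths in the usage note are assumptions about where the build leaves its output. It checks that two files share an MD5 digest while their SHA-256 digests differ, which means the contents are not identical.

```python
import hashlib
from pathlib import Path


def digests(path):
    """Return the (MD5, SHA-256) hex digests of a file."""
    data = Path(path).read_bytes()
    return hashlib.md5(data).hexdigest(), hashlib.sha256(data).hexdigest()


def is_md5_collision(path_a, path_b):
    """True only if the files have the same MD5 but different SHA-256."""
    md5_a, sha_a = digests(path_a)
    md5_b, sha_b = digests(path_b)
    return md5_a == md5_b and sha_a != sha_b
```

Assuming the output files are evilize/sheep.exe and evilize/wolf.exe, calling `is_md5_collision("evilize/sheep.exe", "evilize/wolf.exe")` should return `True` for a successful build. Identical files and files with different MD5 digests both yield `False`.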
For more information see the tutorial of Peter Selinger, older revisions of this document or the source code...
How does it work?
- shepherd.bat executes shepherd.exe with the user-supplied command-line arguments
- shepherd.exe generates a header file (sc.h) that contains the encrypted shellcode, the password and the CRC of the plain shellcode
- shepherd.bat executes the build process of sheep.exe
- sheep.exe is built with sc.h included by Visual Studio
- evilize.exe calculates a special IV for the chunk of sheep.exe right before the block where the collision will happen
- fastcoll.exe is run with the IV as a parameter
- fastcoll.exe generates two 128-byte colliding blocks
- evilize.exe replaces the original string buffers of sheep.exe so that they contain combinations of the colliding blocks
- The resulting files (evilize/wolf.exe and evilize/sheep.exe) have the same MD5 hashes but behave differently. The real code to be executed only appears in the memory of evilize/wolf.exe
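The reason the collision has to be computed for a specific IV, and why everything after the colliding blocks can stay identical, follows from MD5's iterated (Merkle-Damgård) structure. The sketch below illustrates the idea with a toy hash, not with MD5 itself: the block size, the 16-bit chaining state, and the functions `compress`, `toy_hash` and `find_collision` are all assumptions made for illustration, chosen so that a collision can be brute-forced instantly. Length padding is omitted because both inputs have the same length.

```python
import hashlib
from itertools import count

BLOCK = 8   # toy block size in bytes (MD5 uses 64)
STATE = 2   # toy chaining-state size in bytes (MD5 uses 16)


def compress(state, block):
    """Toy compression function: truncated MD5 of state || block."""
    return hashlib.md5(state + block).digest()[:STATE]


def toy_hash(data, iv=b"\x00" * STATE):
    """Feed fixed-size blocks through the compression function in order."""
    assert len(data) % BLOCK == 0
    state = iv
    for i in range(0, len(data), BLOCK):
        state = compress(state, data[i:i + BLOCK])
    return state


def find_collision(iv):
    """Brute-force two distinct blocks that collide under the given IV."""
    seen = {}
    for n in count():
        block = n.to_bytes(BLOCK, "big")
        out = compress(iv, block)
        if out in seen:
            return seen[out], block
        seen[out] = block


prefix = b"PREFIX__" * 2   # stands in for the file contents before the collision
suffix = b"SUFFIX__" * 3   # stands in for the identical remainder of both files

iv = toy_hash(prefix)                  # chaining state right before the collision
block_a, block_b = find_collision(iv)  # collision valid for this IV only

hash_a = toy_hash(prefix + block_a + suffix)
hash_b = toy_hash(prefix + block_b + suffix)
print(block_a != block_b, hash_a == hash_b)
```

Once the chaining state is equal after the colliding blocks, every later block is processed identically, so the final digests match even though the inputs differ.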
To test the security products in your reach you should generate two pairs of samples (SHEEP1-WOLF1 and SHEEP2-WOLF2), preferably with the same payload. Since samples (or their fingerprints) are usually uploaded to central repositories (or "the cloud") precompiled samples are not included to avoid conflicts between independent testers.
After the samples are ready follow the methodology shown on the diagram below:
(*) If the product is not able to detect the first malicious sample, there are more serious problems to worry about than crypto-fu. In fact, the simple cryptography included in the provided boilerplate code poses a hard challenge for various products... Try to use more obvious samples!
(**) The product most probably uses some trivial method to detect the boilerplate instead of the actual payload. You can try to introduce simple changes to the code, like removing debug strings.
Please don't forget to share your positive results by issuing a pull request to the RESULTS.md file!
- Poisonous MD5 - Wolves Among the Sheep
- Peter Selinger: MD5 Collision Demo
- How to make two binaries with same MD5
- Stop using MD5 now!
Licensed under GNU/GPL if not otherwise stated.