A image hashing library written in Python. ImageHash supports:
- average hashing (aHash)
- perception hashing (pHash)
- difference hashing (dHash)
- wavelet hashing (wHash)
Why can we not use md5, sha-1, etc.?
Unfortunately, we cannot use cryptographic hashing algorithms in our implementation. Due to the nature of cryptographic hashing algorithms, very tiny changes in the input file will result in a substantially different hash. In the case of image fingerprinting, we actually want our similar inputs to have similar output hashes as well.
Based on PIL/Pillow Image, numpy and scipy.fftpack (for pHash) Easy installation through pypi.
>>> from PIL import Image >>> import imagehash >>> hash = imagehash.average_hash(Image.open('test.png')) >>> print(hash) d879f8f89b1bbf >>> otherhash = imagehash.average_hash(Image.open('other.bmp')) >>> print(otherhash) ffff3720200ffff >>> print(hash == otherhash) False >>> print(hash - otherhash) 36
The demo script find_similar_images illustrates how to find similar images in a directory.
Source hosted at github: https://github.com/JohannesBuchner/imagehash
- 4.0: Changed binary to hex implementation, because the previous one was broken for various hash sizes. This change breaks compatibility to previously stored hashes; to convert them from the old encoding, use the "old_hex_to_hash" function.
- 3.5: image data handling speed-up
- 3.2: whash now also handles smaller-than-hash images
- 3.0: dhash had a bug: It computed pixel differences vertically, not horizontally.
- I modified it to follow dHash. The old function is available as dhash_vertical.
- 2.0: added whash
- 1.0: initial ahash, dhash, phash implementations.