🎇 Quickly search over billions of images


image-match is a simple (now Python 3!) package for finding approximate image matches from a corpus. It is similar, for instance, to pHash, but includes a database backend that easily scales to billions of images and supports sustained high rates of image insertion: up to 10,000 images/s on our cluster!

PLEASE NOTE: This algorithm is intended to find nearly duplicate images -- think copyright violation detection. It is NOT intended to find images that are conceptually similar. For more explanation, see this issue or this video.

Based on the paper "An image signature for any kind of image" by Wong et al. There is an existing reference implementation which may be more suited to your needs.
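To make "nearly duplicate" more concrete, here is a minimal, standard-library-only sketch. It is a toy, not the algorithm from the paper and not image-match's own code: the `toy_signature` function, the tiny 3x3 "images", and the `tolerance` parameter are all hypothetical and exist only for illustration. Each pixel is compared with its right and lower neighbors and the result is stored as -1, 0, or 1; two signatures are then compared with a normalized distance, assumed here to be the norm of the difference divided by the sum of the norms.

```python
import math


def toy_signature(pixels, tolerance=2):
    """Toy signature (hypothetical): encode how each pixel compares with
    its right and lower neighbors as -1 (darker), 0 (same), or 1 (brighter)."""
    rows, cols = len(pixels), len(pixels[0])
    signature = []
    for r in range(rows):
        for c in range(cols):
            for dr, dc in ((0, 1), (1, 0)):  # right neighbor, then lower neighbor
                nr, nc = r + dr, c + dc
                if nr < rows and nc < cols:
                    diff = pixels[nr][nc] - pixels[r][c]
                    if diff > tolerance:
                        signature.append(1)
                    elif diff < -tolerance:
                        signature.append(-1)
                    else:
                        signature.append(0)
    return signature


def normalized_distance(a, b):
    """Norm of (a - b) divided by the sum of the norms (assumed definition)."""
    norm_diff = math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    total = norm_a + norm_b
    return norm_diff / total if total else 0.0


# Tiny grayscale "images" (made-up values).
original = [
    [10, 50, 90],
    [20, 60, 100],
    [30, 70, 110],
]
# Same picture, uniformly brightened: a near duplicate.
brightened = [[value + 15 for value in row] for row in original]
# Gradients reversed in both directions: a different picture.
different = [
    [110, 70, 30],
    [100, 60, 20],
    [90, 50, 10],
]

near_duplicate_distance = normalized_distance(
    toy_signature(original), toy_signature(brightened)
)
different_distance = normalized_distance(
    toy_signature(original), toy_signature(different)
)
print(near_duplicate_distance, different_distance)
```

Because the signature records only relative brightness between neighbors, the uniformly brightened copy gets the same signature as the original, while the reversed image ends up far away. Nothing in such a signature captures what an image depicts, which is why this kind of matching finds near duplicates rather than conceptually similar images.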

The folks over at Pavlov have released an excellent containerized version of image-match for easy scaling and deployment.

Quick start

Install and set up image-match

Once you're up and running, read these two (short) sections of the documentation to get a feel for what image-match is capable of:

Image signatures

Storing and searching images
