Skip to content

A .NET core based API exposing the PDQ perceptual hashing algorithm by @facebook ("PDQ as a service")

License

Notifications You must be signed in to change notification settings

AiLECS/netPDQContainer

Repository files navigation

.NET PDQ Container

Description

This project hosts an implementation of Facebook's PDQ, a perceptual hashing algorithm used for measuring image similarity. A summary of the algorithm is available here.

The algorithm is accessible via either a RESTful API, or websockets.


Setup

Clone/download this project, and build the image:

foo@bar:/foo/netPDQContainer/$ docker build -t ailecs/netpdq:latest .

This can take a while (bandwidth/CPU depending - YMMV), because it installs build-essential and other large(ish) packages, plus cloning and making the PDQ binaries.

-t ailecs/netpdq:latest is optional, but does make the next step easier.

Once built, the image can be containerised and run using

foo@bar:~/$ docker run  -p 8080:8080 ailecs/netpdq:latest

If you didn't use -t, you'll need to work out the image ID. This can be done using

foo@bar:~/$ docker image ls

Note: This project comes with an ignorable hashset (based on a reduced subset of Google Open Images) as an example and also for your own use. You may find it easier to mount a local directory of hash files and map them to the container thus:

foo@bar:~/$ docker run  -p 8080:8080 -v /path/to/your/hashsets:/app/netPDQContainer/hashes ailecs/pdqhasher:latest

Usage

The container exposes a RESTful API, with swagger 2.0 compliant documentation currently being developed (watch this space).

To hash a file: Post a file (multipart - named file to http://localhost:8080/pdq . The hash (encoded as a hex string) plus the image quality is returned.


Licensing

This is released under an MIT licence. External dependencies for the software in this project may be subject to more restrictive arrangements.


Notes

The service utilises netMIH for accelerating lookups within the configured hamming distance. Otherwise, it uses linear search. You will encounter performance drops with linear search, particularly as your dataset grows!

About

A .NET core based API exposing the PDQ perceptual hashing algorithm by @facebook ("PDQ as a service")

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published