Image comparison by hash codes

Scala pHash


A Scala fork of the pHash library. It determines whether images are similar. You can try it on the demo page.

The original pHash uses the CImg library for image processing, but I could not find CImg for the JVM. Therefore I use java.awt and self-made functions for image processing. Consequently, the results of my library differ from those of the original pHash.

How to use

My library implements three perceptual hashing algorithms: Radial hash, DCT hash and Marr hash. More info about them.

sbt dependencies

resolvers += "Sonatype OSS Snapshots" at ""
libraryDependencies += "com.github.poslegm" %% "scala-phash" % "1.2.0"


There are three functions for each hashing algorithm. Consider DCT hash as an example:

  • def dctHash(image: BufferedImage): Either[Throwable, DCTHash]: computes the image's hash and returns any failure as a Left;
  • def unsafeDctHash(image: BufferedImage): DCTHash: computes the image's hash unsafely (may throw an exception);
  • def dctHashDistance(hash1: DCTHash, hash2: DCTHash): Long: compares the hashes of two images.

Similar functions are provided for the Marr and Radial hash algorithms.

The entire public API is described, with scaladoc, in object PHash.


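import scalaphash.PHash._
import java.io.File
import javax.imageio.ImageIO

// Load both images as BufferedImage instances
val img1 = ImageIO.read(new File("img1.jpg"))
val img2 = ImageIO.read(new File("img2.jpg"))

// Hash both images; the first failure short-circuits into a Left
val radialDistance: Either[Throwable, Double] = for {
  img1rad <- radialHash(img1)
  img2rad <- radialHash(img2)
} yield radialHashDistance(img1rad, img2rad)

// On success, compare the distance with a threshold
radialDistance.foreach {
  case distance if distance > 0.95 => println("similar")
  case _ => println("not similar")
}

// On failure, print the error message
radialDistance.left.foreach(e => println(e.getMessage))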

The radial distance is higher when images are similar, whereas the DCT and Marr distances are lower. It is recommended to treat images as similar only when at least two of the hashes pass their thresholds.

radial: 0.9508017124330319
dct: 13
marr: 0.5052083333333334

radial: 0.3996241672331173
dct: 41
marr: 0.4704861111111111
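
The recommendation to require at least two passing hashes can be sketched as a simple vote. This is only an illustration, not part of the library: `SimilarityVote`, `Distances` and `Thresholds` are hypothetical names, the radial threshold of 0.95 is taken from the example above, and the DCT and Marr thresholds are assumed placeholders that should be tuned on your own images.

```scala
// Hypothetical helper, not part of scala-phash: a two-out-of-three vote over hash distances.
object SimilarityVote {
  // The three distances computed for one pair of images.
  final case class Distances(radial: Double, dct: Long, marr: Double)

  // radialMin = 0.95 comes from the README example;
  // dctMax and marrMax are assumed placeholder values, not recommendations.
  final case class Thresholds(
      radialMin: Double = 0.95,
      dctMax: Long = 20L,
      marrMax: Double = 0.45
  )

  // Number of hashes whose distance passes its threshold.
  // Radial must be above its threshold; DCT and Marr must be below theirs.
  def passes(d: Distances, t: Thresholds = Thresholds()): Int =
    List(d.radial > t.radialMin, d.dct < t.dctMax, d.marr < t.marrMax).count(identity)

  // Images count as similar when at least two hashes pass.
  def isSimilar(d: Distances, t: Thresholds = Thresholds()): Boolean =
    passes(d, t) >= 2
}
```

With these assumed thresholds, the first set of sample values above passes on the radial and DCT distances and is classified as similar, while the second set passes on none and is classified as not similar.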


My results are not compatible with the original pHash. Use the original library if you have the opportunity.
Also, this library is much slower than the C++ version (about 5-7 times).