Skip to content


Subversion checkout URL

You can clone with HTTPS or Subversion.

Download ZIP
SimHashPhp is a PHP 5.3 library to use the SimHash algorithm.
branch: master


What is SimHashPhp ?

SimHashPhp is a PHP library that port the SimHash algorithm created by Moses Charikar. This algorithm provide an efficient way to compare two texts or two data sets.

See "SimHash or the way to compare quickly two datasets" for more informations.

Build Status

How to use it ?

The library use PSR-0 naming convention so you can easily use it with an autoloader. However, basically, you can use a simple require:


require 'Leg/SimHash/SimHash.php';
require 'Leg/SimHash/SimHashFactory.php';

use Leg\SimHash\SimHash;
use Leg\SimHash\SimHashFactory;

$factory = new SimHashFactory();

$fingerprint1 = $factory->run('Mary-jane is very tall. She was in the 9th grade.');
$fingerprint2 = $factory->run('John is in high school. He is not so tall.');

// This method return a float between 0 and 1
// where 0 is completely different strings and
// 1 is completely equals strings.


This library is under the MIT license (see


The SimHashPhp is mainly developed by Titouan Galopin.

Reporting an issue or a feature request

Issues and feature requests are tracked in the Github issue tracker.

Something went wrong with that request. Please try again.