SimHashPhp is a PHP 5.3 library to use the SimHash algorithm.

SimHashPhp

What is SimHashPhp?

SimHashPhp is a PHP library that ports the SimHash algorithm created by Moses Charikar. The algorithm provides an efficient way to compare two texts or two data sets.

See "SimHash or the way to compare quickly two datasets" for more information.
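To give a feel for how a SimHash-style comparison works, here is a minimal sketch. It is not the library's implementation: the function names `sketchSimHash` and `sketchSimilarity`, the whitespace tokenizer, the use of `crc32` as the per-token hash, and the fixed 32-bit fingerprint width are all assumptions made for illustration.

```php
<?php

// Illustrative sketch only (hypothetical helpers, not part of SimHashPhp).
// Builds a 32-bit fingerprint: every token votes +1 or -1 on each bit.
function sketchSimHash($text)
{
    $weights = array_fill(0, 32, 0);
    $tokens = preg_split('/\s+/', strtolower($text), -1, PREG_SPLIT_NO_EMPTY);

    foreach ($tokens as $token) {
        $hash = crc32($token); // assumed token hash, 32 bits
        for ($i = 0; $i < 32; $i++) {
            // Vote +1 if bit $i of the token hash is set, -1 otherwise.
            $weights[$i] += (($hash >> $i) & 1) ? 1 : -1;
        }
    }

    // A fingerprint bit is set when the votes for it are positive overall.
    $fingerprint = 0;
    for ($i = 0; $i < 32; $i++) {
        if ($weights[$i] > 0) {
            $fingerprint |= (1 << $i);
        }
    }

    return $fingerprint;
}

// Similarity as 1 minus the normalized Hamming distance of two fingerprints.
function sketchSimilarity($a, $b)
{
    $diff = $a ^ $b;
    $distance = 0;
    for ($i = 0; $i < 32; $i++) {
        $distance += ($diff >> $i) & 1;
    }

    return 1.0 - $distance / 32;
}
```

Similar texts share most tokens, so most bit votes agree and the fingerprints differ in only a few bits. Comparing two fingerprints is then a fixed-cost bit operation, regardless of the length of the original texts.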

How to use it?

The library follows the PSR-0 naming convention, so you can easily use it with an autoloader. For a quick start, however, a simple require is enough:

<?php

require 'Leg/SimHash/SimHash.php';
require 'Leg/SimHash/SimHashFactory.php';

use Leg\SimHash\SimHash;
use Leg\SimHash\SimHashFactory;


$factory = new SimHashFactory();

$fingerprint1 = $factory->run('Mary-jane is very tall. She was in the 9th grade.');
$fingerprint2 = $factory->run('John is in high school. He is not so tall.');

// compareWith() returns a float between 0 and 1,
// where 0 means the strings are completely different
// and 1 means they are identical.
$similarity = $fingerprint1->compareWith($fingerprint2);
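As an alternative to the explicit require calls, a minimal PSR-0 autoloader could look like the sketch below. The helper name `psr0RelativePath` and the `src` base directory are assumptions for illustration; point `$baseDir` at whichever directory contains the `Leg` folder in your project.

```php
<?php

// Hypothetical helper: maps a fully qualified class name to a
// PSR-0 relative file path.
function psr0RelativePath($class)
{
    $class = ltrim($class, '\\');
    $path = '';

    $pos = strrpos($class, '\\');
    if ($pos !== false) {
        // Namespace separators become directory separators.
        $namespace = substr($class, 0, $pos);
        $class = substr($class, $pos + 1);
        $path = str_replace('\\', DIRECTORY_SEPARATOR, $namespace) . DIRECTORY_SEPARATOR;
    }

    // Underscores in the class name (not the namespace) also become directories.
    return $path . str_replace('_', DIRECTORY_SEPARATOR, $class) . '.php';
}

// Assumed location of the Leg/ directory; adjust to your layout.
$baseDir = __DIR__ . DIRECTORY_SEPARATOR . 'src';

spl_autoload_register(function ($class) use ($baseDir) {
    $file = $baseDir . DIRECTORY_SEPARATOR . psr0RelativePath($class);
    if (is_file($file)) {
        require $file;
    }
});
```

With such an autoloader registered, `new Leg\SimHash\SimHashFactory()` would load `Leg/SimHash/SimHashFactory.php` on first use, and the two require lines would no longer be needed.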

License

This library is released under the MIT license (see LICENSE.md).

About

SimHashPhp is mainly developed by Titouan Galopin.

Reporting an issue or a feature request

Issues and feature requests are tracked in the GitHub issue tracker.
