A little PHP library implementing the Aho–Corasick algorithm
The original paper cound be found here
一个纯PHP实现的 Aho-Corasick算法
算法的原论文可以看这里
百度搜出来的AC算法的中文讲解就那么几篇,转载来转载去的,但我表示看不懂。
索性一怒之下看原始的论文,然后根据论文中的算法写了这个PHP实现。
改天我也写篇中文讲解,争取比那几篇写得更容易懂一些。
PHP 7.0 or higher
composer require tianhe1986/fatahocorasick
and then in your code
require_once __DIR__ . '/vendor/autoload.php';
use FatAhoCorasick\FatAhoCorasick;
$ac = new FatAhoCorasick();
//add keyword, string or array
$ac->addKeyword(['art', 'cart']);
$ac->addKeyword('ted');
//compute info
$ac->compute(false); //not using next array
//$ac->compute(true); using next array
//search
$result = $ac->search('a carted mart lot one blue ted');
$result
would be like follows:
(
[0] => Array
(
[0] => cart
[1] => 2
)
[1] => Array
(
[0] => art
[1] => 3
)
[2] => Array
(
[0] => ted
[1] => 5
)
[3] => Array
(
[0] => art
[1] => 10
)
[4] => Array
(
[0] => ted
[1] => 27
)
)
For each item in $result
, item[0] means the keyword found, item[1] means its start location.
Without next
array:
$ac = new FatAhoCorasick();
//add keyword, string or array
$ac->addKeyword(['art', 'cart']);
$ac->addKeyword('ted');
//compute info
$ac->compute(false); //not using next array
//get output, goto and failure, then you can store them
$output = $ac->getOutput();
$goto = $ac->getGoto();
$failure = $ac->getFailure();
$nac = new FatAhoCorasick();
$result = $nac->searchByFailure('a carted mart lot one blue ted', $output, $goto, $failure);
With next
array:
$ac = new FatAhoCorasick();
//add keyword, string or array
$ac->addKeyword(['art', 'cart']);
$ac->addKeyword('ted');
//compute info
$ac->compute(true); //not using next array
//get output, and next, then you can store them
$output = $ac->getOutput();
$next = $ac->getNext();
$nac = new FatAhoCorasick();
$result = $nac->searchByNext('a carted mart lot one blue ted', $output, $next);