A PHP implementation of the Aho-Corasick string search algorithm. Mirror from https://gerrit.wikimedia.org/g/AhoCorasick.

AhoCorasick

AhoCorasick is a PHP implementation of the Aho-Corasick string search algorithm, which is an efficient way of searching a body of text for multiple search keywords.

Here is how you use it:

use AhoCorasick\MultiStringMatcher;

$keywords = new MultiStringMatcher( array( 'ore', 'hell' ) );

$keywords->searchIn( 'She sells sea shells by the sea shore.' );
// Result: array( array( 15, 'hell' ), array( 34, 'ore' ) )

$keywords->searchIn( 'Say hello to more text. MultiStringMatcher objects are reusable!' );
// Result: array( array( 4, 'hell' ), array( 14, 'ore' ) )
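
Each result entry pairs an offset with the keyword found there. As a small sketch of how such a result could be consumed, the snippet below hard-codes the first result shown above instead of calling the library, and assumes the offsets are zero-based positions into the input string, as the example values suggest. The variable names are illustrative only.

```php
<?php
$text = 'She sells sea shells by the sea shore.';

// Hard-coded copy of the first result above, so this runs without the library.
$matches = [ [ 15, 'hell' ], [ 34, 'ore' ] ];

$verified = [];
foreach ( $matches as list( $offset, $keyword ) ) {
    // Slice the input at the reported offset; it should equal the keyword.
    $found = substr( $text, $offset, strlen( $keyword ) );
    $verified[] = ( $found === $keyword );
    echo "'$keyword' found at offset $offset\n";
}
```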

Features

The algorithm works by constructing a finite-state machine from the set of search keywords. The time it takes to construct the machine is proportional to the sum of the lengths of the search keywords. Once constructed, the machine can locate all occurrences of all search keywords in any body of text in a single pass, making exactly one state transition per input character.
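
To make the idea concrete, here is a minimal, self-contained sketch of the textbook Aho-Corasick construction. It is not the source of `MultiStringMatcher`; the function names `sketchBuild` and `sketchSearch` are hypothetical. It is also a simplification: it follows failure links during the search instead of precomputing one transition per state and character, so a single input character may trigger several fallback steps, although the total work stays linear in the length of the text plus the number of matches.

```php
<?php
// Illustrative sketch only; not the library's implementation.

/** Build the keyword trie, failure links and per-state outputs. */
function sketchBuild( array $keywords ) {
    $goto = [ [] ]; // $goto[state][char] => next state; state 0 is the root
    $fail = [ 0 ];  // $fail[state] => state to fall back to on a mismatch
    $out = [ [] ];  // $out[state] => keywords that end at this state

    // Phase 1: insert every keyword into the trie.
    foreach ( $keywords as $keyword ) {
        $state = 0;
        for ( $i = 0, $len = strlen( $keyword ); $i < $len; $i++ ) {
            $ch = $keyword[$i];
            if ( !isset( $goto[$state][$ch] ) ) {
                $goto[] = [];
                $fail[] = 0;
                $out[] = [];
                $goto[$state][$ch] = count( $goto ) - 1;
            }
            $state = $goto[$state][$ch];
        }
        $out[$state][] = $keyword;
    }

    // Phase 2: breadth-first walk; children of the root keep fail = 0.
    $queue = array_values( $goto[0] );
    for ( $head = 0; $head < count( $queue ); $head++ ) {
        $state = $queue[$head];
        foreach ( $goto[$state] as $ch => $next ) {
            $queue[] = $next;
            // Find the longest proper suffix that is also a trie path.
            $f = $fail[$state];
            while ( $f !== 0 && !isset( $goto[$f][$ch] ) ) {
                $f = $fail[$f];
            }
            $fail[$next] = isset( $goto[$f][$ch] ) ? $goto[$f][$ch] : 0;
            // Inherit keywords that end at the fallback state.
            $out[$next] = array_merge( $out[$next], $out[$fail[$next]] );
        }
    }
    return [ $goto, $fail, $out ];
}

/** Scan $text once and return a list of [ offset, keyword ] pairs. */
function sketchSearch( array $automaton, $text ) {
    list( $goto, $fail, $out ) = $automaton;
    $results = [];
    $state = 0;
    for ( $i = 0, $len = strlen( $text ); $i < $len; $i++ ) {
        $ch = $text[$i];
        // Fall back until some state accepts $ch, or we reach the root.
        while ( $state !== 0 && !isset( $goto[$state][$ch] ) ) {
            $state = $fail[$state];
        }
        if ( isset( $goto[$state][$ch] ) ) {
            $state = $goto[$state][$ch];
        }
        foreach ( $out[$state] as $keyword ) {
            $results[] = [ $i - strlen( $keyword ) + 1, $keyword ];
        }
    }
    return $results;
}

$automaton = sketchBuild( [ 'ore', 'hell' ] );
$hits = sketchSearch( $automaton, 'She sells sea shells by the sea shore.' );
// $hits is [ [ 15, 'hell' ], [ 34, 'ore' ] ]
```

The automaton is built once and can then be passed to `sketchSearch` for any number of input strings, which mirrors the reusability of matcher objects shown in the usage example.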

Contribute

Support

If you are having issues, please let us know.

License

The project is licensed under the Apache license.