A PHP implementation of the English (Porter 2) Stemmer.

Porter 2 Stemmer for PHP

A PHP library for stemming words using the English Porter 2 algorithm.

Background

A stemmer takes a given word and follows a set of rules to reduce it to a stem that is usable in a search index (as opposed to the actual word root). For example, aggravate, aggravated, and aggravates all reduce to "aggrav," so a search for any one of them can match the others.

Martin Porter's English (Porter 2) algorithm improves on the original Porter stemmer.

Basic Usage

The included /demo/index.php file contains a conversion form demonstration.

Make your code aware of the Porter2 class via your favorite method (e.g., use or require).

Then pass a word to the class's static stem method:

$text = Porter2::stem('consistently');
echo $text; // consist

$text = Porter2::stem('consisting');
echo $text; // consist

$text = Porter2::stem('consistency');
echo $text; // consist
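
To illustrate how shared stems create the commonality described under Background, here is a minimal sketch that groups word forms by stem. The groupByStem function and the $toyStemmer closure are hypothetical names introduced for this example and are not part of the library. The toy closure is not Porter 2; it only strips a few suffixes so the sketch runs on its own. With the library loaded, you would presumably pass a callable for Porter2::stem instead, assuming the class is in scope.

```php
<?php
// Group words under a shared stem so that a lookup for one form finds the others.
// $stemmer is any callable that maps a word to its stem. With this library it
// could be [Porter2::class, 'stem'] (assumption: the class is loaded and in scope).
function groupByStem(array $words, callable $stemmer): array
{
    $index = [];
    foreach ($words as $word) {
        $stem = $stemmer(strtolower($word));
        $index[$stem][] = $word;
    }
    return $index;
}

// Stand-in stemmer so the sketch runs without the library. It is NOT Porter 2:
// it only strips three suffixes to mimic the "aggrav" example.
$toyStemmer = function (string $word): string {
    return preg_replace('/(ates|ated|ate)$/', '', $word);
};

$index = groupByStem(['aggravate', 'aggravated', 'aggravates'], $toyStemmer);
print_r($index); // a single "aggrav" key holding all three words
```

Taking the stemmer as a callable keeps the grouping logic independent of how the Porter2 class is loaded or namespaced in your project.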

Stemmer Resources

Tests

A verification list of 29,000 words and their expected stems can be run via phpunit (after running composer install).