tokenizer

The Tokenizer class splits a string into tokens. Unlike other classes, it is based on regular expressions. The 'match' function is the most important function of the class: it accepts a regular expression as its parameter and returns the next token that matches it. For example:

// splits a string into 'words'
$t = new Tokenizer("Lorem ipsum dolor sit amet");
while (list($token) = $t->match("\w+")) {
    echo "$token-";
}

Note that you do not need to write an explicit regular expression. In the above example, instead of typing "/^\s*\w+/" we can write "\w+". In this case, the function skips any leading whitespace and starts searching from the current offset position. You can still use an explicit regular expression if you prefer:

// uses an explicit regular expression
$t = new Tokenizer("I'm 35 years old");
if (list($years) = $t->match("/\d+/")) {
    echo "You are $years years old";
}
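
The difference between the short form and the explicit form can be illustrated with a small stand-in class. This is only a sketch of one way the described behaviour could work, not the library's actual code: `SketchTokenizer`, its private properties, the `null` return value and the "wrapped in slashes means explicit" rule are all assumptions made for illustration. The short form is expanded to an anchored pattern that skips leading whitespace, while an explicit pattern is used exactly as written, so it can match anywhere after the current offset.

```php
<?php
// Hypothetical stand-in for illustration only; the real Tokenizer may differ.
final class SketchTokenizer
{
    private string $input;
    private int $offset = 0; // position where the next search starts

    public function __construct(string $input)
    {
        $this->input = $input;
    }

    /** Returns the matched groups (index 0 is the token), or null if nothing matches. */
    public function match(string $pattern): ?array
    {
        // Assumption: a pattern wrapped in slashes is "explicit" and used as written.
        $isExplicit = preg_match('/^\/.*\/[a-zA-Z]*$/s', $pattern) === 1;

        // Short form: anchor at the offset and skip leading whitespace.
        // \K drops the skipped whitespace from the reported match.
        // (A short pattern containing "/" would need escaping in this sketch.)
        $regex = $isExplicit ? $pattern : '/^\s*\K(?:' . $pattern . ')/';

        $rest = substr($this->input, $this->offset);
        if (preg_match($regex, $rest, $found, PREG_OFFSET_CAPTURE) !== 1) {
            return null;
        }

        [$text, $position] = $found[0];
        if ($text === '') {
            return null; // refuse empty matches so loops always terminate
        }

        // Advance past the token so the next call continues after it.
        $this->offset += $position + strlen($text);

        // Keep only the matched text of each group, dropping the offsets.
        return array_map(fn (array $group) => $group[0], $found);
    }
}

$t = new SketchTokenizer("Lorem ipsum dolor sit amet");
$words = [];
while (($m = $t->match('\w+')) !== null) {
    $words[] = $m[0];
}
echo implode('-', $words), "\n"; // prints "Lorem-ipsum-dolor-sit-amet"
```

In this sketch the explicit form is not anchored, which is why a pattern such as '/\d+/' can skip over "I'm " and still find the number, whereas the short form '\d+' would fail at that position because the text at the offset does not start with a digit.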