SYNOPSIS

use Text::WordCounter;

my $counter = Text::WordCounter->new();

my $word_count = $counter->word_count( $text );

DESCRIPTION

Text::WordCounter counts the words in a text. Word splitting is quite heuristic: for example, '-' and digits that occur inside a word are treated as word characters. See the tests to find out how all the special cases are resolved.

The features parameter should be a hashref and is an accumulator for found features.
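As a rough illustration of the heuristic and of the accumulator idea, here is a minimal sketch in plain Perl. It is not the module's code: `count_words_sketch` is a hypothetical helper, and the regular expression is only one possible reading of "'-' and digits inside a word count as word characters". The real special cases are defined by the tests.

```perl
use strict;
use warnings;
use utf8;

# Hypothetical helper, not part of Text::WordCounter: counts the words in
# $text and adds them to the $features hashref, which acts as an accumulator.
sub count_words_sketch {
    my ( $text, $features ) = @_;
    $features ||= {};

    # Assumed rule: a word starts with a letter; letters, digits and '-'
    # may follow, but the word has to end in a letter or a digit.
    while ( $text =~ /(\p{L}(?:[\p{L}\p{N}-]*[\p{L}\p{N}])?)/g ) {
        $features->{ lc $1 }++;
    }
    return $features;
}

# The same hashref is passed twice, so the counts add up across calls.
my $features = {};
count_words_sketch( 'A state-of-the-art mp3 player', $features );
count_words_sketch( 'The player plays mp3 files',    $features );
```

Passing the same hashref to several calls is what makes it an accumulator: counts from the second text are added to those from the first.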

ATTRIBUTES

stemming

If set, stemming via Lingua::Stem is performed on the words. We never managed to make it work sanely on multilingual texts.

stopwords

A hashref with words to discard.
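The exact shape of the hashref is not spelled out here. A common Perl convention, assumed in the sketch below, is that the keys are the words to discard and the values are any true value. The filtering with `grep` is only an illustration, not the module's implementation.

```perl
use strict;
use warnings;

# Assumed shape: keys are the words to discard, values are any true value.
my %stopwords = map { $_ => 1 } qw( the a of and );

# Discard every word that appears as a key in the stopword hash.
my @words = qw( the count of the words );
my @kept  = grep { !$stopwords{$_} } @words;
```

A hash lookup keeps the check cheap no matter how long the stopword list grows.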

INSTANCE METHODS

`is_stop_word`

`normalize`

Lowercases words and stems them if the stemming attribute is true.

`split_scripts`

`word_count`

Returns a hashref with word counts.
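A short sketch of how such a hashref might be consumed, assuming the keys are words and the values are their counts. The data below is a made-up stand-in, not output from the module.

```perl
use strict;
use warnings;

# Stand-in for the hashref returned by word_count; the values are made up.
my $word_count = { player => 2, mp3 => 2, files => 1 };

# Sort by descending count, breaking ties alphabetically.
my @ranked = sort {
    $word_count->{$b} <=> $word_count->{$a} || $a cmp $b
} keys %$word_count;

# Print one word per line, followed by its count.
printf "%-10s %d\n", $_, $word_count->{$_} for @ranked;
```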

LIMITATIONS

Of the languages that do not separate words with spaces, only Chinese is currently supported (using Lingua::ZH::MMSEG).
