Add simple Japanese tokenizer, based on TinySegmenter [LUCENE-2522] #3596

@asfimport

TinySegmenter (http://www.chasen.org/~taku/software/TinySegmenter/) is a tiny Japanese segmenter.

It was ported to Java/Lucene by Kohei TAKETA <k-tak@void.in>,
and is under friendly license terms (BSD; some files explicitly disclaim copyright to the source code, giving a blessing instead).

Koji knows the author and has already contacted him about incorporating it into Lucene:

I've contacted Takeda-san, who is the creator of the Java version of
TinySegmenter. He said he is happy if his program is part of Lucene.
He is a co-author of my book about Solr published in Japan, BTW. ;-)

Migrated from LUCENE-2522 by Robert Muir (@rmuir), updated May 09 2016
Attachments: LUCENE-2522.patch (versions: 3)
