This is an open-source effort for making Hebrew properly searchable by various IR software libraries, while maintaining decent recall, precision and relevancy in retrievals. Includes Hebrew Analyzer for Lucene, and already produces results for Hebrew texts which are much better than the default Lucene implementation. Available for Java and .NET …

synhershko/HebMorph

HebMorph is an open-source effort for making Hebrew properly searchable by various IR software libraries, while maintaining decent recall, precision and relevancy in retrievals. All code and files are released under the GNU Affero General Public License version 3.

More details at http://code972.com/HebMorph

Lucene / Elasticsearch compatibility

Since March 2017, hebmorph-lucene has been released for every Lucene/Solr version with a matching major version number (and, most of the time, a matching minor version as well). Matching Elasticsearch plugin versions are also available; see https://github.com/synhershko/elasticsearch-analysis-hebrew.

| hebmorph-lucene version | Lucene version | Elasticsearch version | Release date |
|---|---|---|---|
| 6.2.x | 6.1.x -> 6.2.x | 5.1.x -> 5.2.x | 3/2017 |
| 6.0.0 | 6.0.x | 5.0.x | 1/2017 |
| 2.4.0 | 5.5.x | 2.4.x | |
| 2.3.x | 5.4.x | 2.2.x -> 2.3.x | 4/2/2016 |
| 2.2.x | 5.3.x | 2.0.x -> 2.1.x | 4/2/2016 |
| 2.1.x | 4.10.4 | 1.6 -> 1.7.x | 4/2/2016 |
| 2.0.x | 4.10.x | 1.4.x, 1.5.x | 24/3/2015 |
| 1.5.0 | 4.9.0 | 1.3.x | 9/9/2014 |
| 1.4.x | 4.8.x | 1.x -> 1.2.x | August 2014 |
| 1.3.x | 4.6.x | 0.90.8 -> 0.90.13 | June 2014 |
| 1.2.0 | 4.5.x | 0.90.6, 0.90.7 | 10/11/2013 |
| 1.1.0 | 4.4.0 | 0.90.3 -> 0.90.5 | |
| 1.0.0 | <= 4.3.0 | <= 0.90.2 | |
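
The version-matching rule described above can be turned into a quick sanity check for a build. The following is a minimal sketch only: `HebMorphVersionCheck`, `major`, and `sameMajor` are hypothetical names invented for illustration and are not part of HebMorph or Lucene. It compares only the major version component, so it applies to the 6.x releases; for earlier releases the numbers differ (for example, hebmorph-lucene 2.4.0 targets Lucene 5.5.x) and the table is the reference.

```java
import java.util.Objects;

/** Hypothetical helper for illustration; not part of HebMorph or Lucene. */
final class HebMorphVersionCheck {

    /** Returns the leading numeric component of a version string such as "6.2.x". */
    static int major(String version) {
        Objects.requireNonNull(version, "version");
        // Split at the first dot only; the head is the major version.
        String head = version.trim().split("\\.", 2)[0];
        return Integer.parseInt(head);
    }

    /** True when both versions share the same major version number. */
    static boolean sameMajor(String hebmorphVersion, String luceneVersion) {
        return major(hebmorphVersion) == major(luceneVersion);
    }

    public static void main(String[] args) {
        // A 6.x pairing from the table: major versions agree.
        System.out.println(sameMajor("6.2.0", "6.1.0"));
        // An older pairing from the table: major versions differ, so consult the table instead.
        System.out.println(sameMajor("2.4.0", "5.5.0"));
    }
}
```

A check like this only flags a likely mismatch; it does not replace the table, since minor versions match only most of the time.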

A tutorial on integrating HebMorph with Elasticsearch is available at http://code972.com/blog/2013/08/129-hebrew-search-with-elasticsearch-and-hebmorph

Get it from Maven Central

For analyzer support, add hebmorph-lucene as a dependency:

        <dependency>
            <groupId>com.code972.hebmorph</groupId>
            <artifactId>hebmorph-lucene</artifactId>
            <version>6.6.0</version>
            <scope>compile</scope>
        </dependency>

Lucene.NET compatibility

The .NET version of the library is compatible with Lucene.NET version 3.0.3, but has some known bugs that were fixed in the Java version and haven't been ported back yet.

License

HebMorph is copyright (C) 2010-2015, Itamar Syn-Hershko. HebMorph currently relies on Hspell, copyright (C) 2000-2013, Nadav Har'El and Dan Kenigsberg (http://hspell.ivrix.org.il/).

It is released to the public under the GNU Affero General Public License v3. See the LICENSE file included in this distribution. Note that not only the programs in the distribution, but also the dictionary files and the generated word lists, are licensed under the AGPL. There is no warranty of any kind for the contents of this distribution.
