mystem wrapper for JVM languages

A Scala wrapper for the Yandex.MyStem morphological analyzer

Introduction

Details about the algorithm can be found in I. Segalovich «A fast morphological algorithm with unknown word guessing induced by a dictionary for a web search engine», MLMTA-2003, Las Vegas, Nevada, USA.

The wrapper's code is under the MIT license, but please remember that Yandex.MyStem itself is not open source and is licensed under the conditions of the Yandex License.

System Requirements

The wrapper should work on at least Ubuntu Linux 12.04+ and Windows 7+.

Install

Sbt

resolvers += Resolver.bintrayRepo("cnsa", "oss-maven")

libraryDependencies += "org.nsa.nlp" %% "mystem-scala" % "0.1.8" 

Maven

Maven central

<dependency>
  <groupId>ru.stachek66.nlp</groupId>
  <artifactId>mystem-scala</artifactId>
  <version>0.1.4</version>
</dependency>

Issues

Mystem 3.0 and 3.1 are currently supported. Please open an issue for compatibility problems and other requests.

Examples

Probably the most important thing to remember when working with mystem-scala is that you should have just one MyStem instance per mystem/mystem.exe file in your application.

Scala

import java.io.File

import org.nsa.nlp.mystem.holding.{Factory, MyStem, Request}

object MystemSingletonScala {

  val mystemAnalyzer: MyStem =
    new Factory("-igd --eng-gr --format json --weight")
      .newMyStem(
        "3.0",
        Option(new File("/home/coolguy/coolproject/3dparty/mystem"))).get
}

object AppExampleScala extends App {

  MystemSingletonScala
    .mystemAnalyzer
    .analyze(Request("Есть большие пассажиры мандариновой травы"))
    .info
    .foreach(info => println(info.initial + " -> " + info.lex))
}

Java

import org.nsa.nlp.mystem.holding.Factory;
import org.nsa.nlp.mystem.holding.MyStem;
import org.nsa.nlp.mystem.holding.MyStemApplicationException;
import org.nsa.nlp.mystem.holding.Request;
import org.nsa.nlp.mystem.model.Info;
import scala.Option;
import scala.collection.JavaConverters;

import java.io.File;

public class MyStemJavaExample {

    private final static MyStem mystemAnalyzer =
            new Factory("-igd --eng-gr --format json --weight")
                    .newMyStem("3.0", Option.<File>empty()).get();

    public static void main(final String[] args) throws MyStemApplicationException {

        final Iterable<Info> result =
                JavaConverters.asJavaIterable(
                        mystemAnalyzer
                                .analyze(Request.apply("И вырвал грешный мой язык"))
                                .info()
                                .toIterable());

        for (final Info info : result) {
            System.out.println(info.initial() + " -> " + info.lex() + " | " + info.rawResponse());
        }
    }
}
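
The one-instance-per-executable rule mentioned at the top of this section can also be enforced mechanically. Below is a minimal sketch, not part of mystem-scala: `SingleInstanceRegistry` and `forExecutable` are hypothetical names, and the analyzer is created by a caller-supplied function, so the sketch assumes nothing about the wrapper's API.

```java
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.function.Function;

/**
 * Hypothetical helper (not part of mystem-scala): caches one analyzer
 * per executable path so that repeated lookups reuse the same instance.
 */
final class SingleInstanceRegistry<T> {

    private final ConcurrentMap<Path, T> instances = new ConcurrentHashMap<>();
    private final Function<Path, T> creator;

    SingleInstanceRegistry(final Function<Path, T> creator) {
        this.creator = creator;
    }

    /** Returns the cached instance for this executable, creating it on first use. */
    T forExecutable(final String executablePath) {
        // Normalize so that "/opt/./mystem" and "/opt/mystem" share one instance.
        final Path key = Paths.get(executablePath).toAbsolutePath().normalize();
        // computeIfAbsent runs the creator at most once per key, even under concurrency.
        return instances.computeIfAbsent(key, creator);
    }
}
```

In an application, the creator function would presumably wrap the `Factory(...).newMyStem(...)` call from the examples above, so that every caller asking for the same mystem/mystem.exe path gets the same `MyStem` object.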

Contacts

Anton Alekseev anton.m.alexeyev@gmail.com

Thanks for reviews and contributions

  • Vladislav Dolbilov, @darl
  • Mikhail Malchevsky
  • @anton-shirikov

Also please see
