Sakha: apertium-sah

This is an Apertium monolingual language package for Sakha. You can use this language package for:

  • Morphological analysis of Sakha
  • Morphological generation of Sakha
  • Part-of-speech tagging of Sakha

Requirements

You will need the following software installed:

  • lttoolbox (>= 3.3.0)
  • apertium (>= 3.3.0)
  • vislcg3 (>= 0.9.9.10297)
  • hfst (>= 3.8.2)

If none of this makes sense, we recommend you start at: apertium.org
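
As a quick sanity check before compiling, the sketch below reports which of the required tools are not on your PATH. The helper name missing_tools is made up for this example, and the mapping from packages to executables (lt-proc for lttoolbox, cg-proc for vislcg3, hfst-proc for hfst) is an assumption that may vary with how your distribution packages them. It does not check the minimum versions listed above.

```shell
# Print the name of every command given as an argument that is not found on PATH.
# (missing_tools is a hypothetical helper, not part of Apertium.)
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}

# Assumed executables: lt-proc (lttoolbox), apertium (apertium),
# cg-proc (vislcg3), hfst-proc (hfst).
missing_tools lt-proc apertium cg-proc hfst-proc
```

If nothing is printed, all four commands were found.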

Compiling

Once the requirements are installed, you should be able to run:

$ ./configure
$ make

If you're compiling from a source checkout (where ./configure has not been generated yet), use ./autogen.sh instead of ./configure.

If you're doing development, you don't have to install the data; you can use it directly from this directory.

If you are installing this language package as a prerequisite for an Apertium translation pair, then run (typically as root or with sudo):

$ make install

You can give a --prefix to ./configure to install as a non-root user, but make sure to use the same prefix when installing the translation pair and any other language packages.

Testing

If you are in the source directory after running make, the following commands should work:

  • Morphological analysis:

     $ echo "Бу сахалыы морфологическай ырытыы." | apertium -d . sah-morph
     ^Бу/бу<prn><dem><nom>/бу<mod>$ ^сахалыы/саха<n>+лыы<post>/сахалыы<adv><cop><aor><p3><sg>/сахалыы<adv>$ ^морфологическай/морфологическай<adj><cop><aor><p3><sg>/морфологическай<adj>$ ^ырытыы/ырытыы<n><nom><cop><aor><p3><sg>/ырытыы<n><nom>/ырытыы<n><attr>/ырыт<v><tv><ger><nom><cop><aor><p3><sg>/ырыт<v><tv><ger><nom>$^./.<sent>$

  • Tagging (analysis + disambiguation):

     $ echo "Бу сахалыы морфологическай ырытыы." | apertium -d . sah-tagger
     ^Бу/Бу<prn><dem><nom>$ ^сахалыы/саха<n>+лыы<post>$ ^морфологическай/морфологическай<adj>$ ^ырытыы/ырытыы<n><nom><cop><aor><p3><sg>$^./.<sent>$

  • Morphological generation:

     $ echo "^бу<prn><dem><nom>$ ^саха<n>+лыы<post>$ ^морфологическай<adj>$ ^ырытыы<n><nom>+э<cop><aor><p3><sg>$" | apertium -d . -f none sah-gener
     бу сахалыы морфологическай ырытыы
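
The analyser output above is easier to inspect with one lexical unit per line. The following is a minimal sketch that uses only standard text tools, not anything shipped with this package. The function names split_units and count_analyses are made up for illustration. It assumes each unit has the form ^surface/analysis/...$ as in the examples above, and it does not handle escaped slashes or treat the marking of unknown words specially.

```shell
# Split an Apertium stream into one lexical unit per line,
# dropping the ^ and $ delimiters. (Hypothetical helper.)
split_units() {
  grep -o '\^[^$]*\$' | sed -e 's/^\^//' -e 's/\$$//'
}

# Print "surface form<TAB>number of analyses" for each unit.
# Fields are separated by "/", and the first field is the surface form.
count_analyses() {
  split_units | awk -F'/' '{ printf "%s\t%d\n", $1, NF - 1 }'
}

# Sample taken from the sah-morph example above, shortened to three units.
# Single quotes keep the shell from interpreting the $ characters.
morph='^Бу/бу<prn><dem><nom>/бу<mod>$ ^сахалыы/саха<n>+лыы<post>/сахалыы<adv><cop><aor><p3><sg>/сахалыы<adv>$^./.<sent>$'
printf '%s\n' "$morph" | count_analyses
```

For this shortened sample the sketch reports 2 analyses for Бу, 3 for сахалыы and 1 for the full stop. To run it on live output, pipe the sah-morph command shown above into count_analyses instead of using the stored sample.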

Files and data

For more information

Citing

When referencing this work in an academic publication, we ask that you cite the following source:

  • Ivanova, Sardana, Jonathan N. Washington, and Francis M. Tyers (2022). A Free/Open-Source Morphological Analyser and Generator for Sakha. Presented at LREC 2022. Poster, Paper.

The transducer also appeared as the following:

Help and support

If you need help using this language package or data, you can contact:

See also the file AUTHORS, included in this distribution.