# Installing the Voikko library (libvoikko) on a Mac

I wanted to use the Python libvoikko bindings to preprocess some Finnish documents. The bindings require the native Voikko library to be installed, and the installation can be tested with the Python example code from [here](https://www.puimula.org/htp/doc/python/libvoikko.html). Unfortunately, the installation was painful because there is very little documentation about it, so I am recording the process here in case someone runs into the same issues I did.

TL;DR

To get libvoikko working on a Mac, do the following:
- Install the Python libvoikko bindings with the command ```pip install voikko```
- Install dependencies: pkg-config, gettext, libxml++ 2, libarchive, libz, libtool, hfst-ospell
- Build libvoikko
- Install voikko-fi if you want to use libvoikko to preprocess Finnish text.

Below are the gory details of my painful journey to get libvoikko working on a Mac.

### Install with homebrew: failed

```$ brew install libvoikko```

With this installation, the Python code fails with **dlopen: symbol not found: __zn7hfst_ol10transducer6lookupepc**.

My guess is that the dlopen error means **the installation did not complete**.
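
One way to see whether the shared library itself loads, independently of the Python bindings, is to open it with `ctypes`. The following is only a minimal sketch: the helper name `try_load` is made up for illustration, and it assumes the library can be found under the short name `voikko`.

```python
import ctypes
import ctypes.util


def try_load(name="voikko"):
    """Try to load a shared library by its short name.

    Returns (True, path) on success, or (False, reason) on failure.
    The default name "voikko" is an assumption about how the library
    is registered on the system.
    """
    path = ctypes.util.find_library(name)
    if path is None:
        return False, "library '%s' not found on the default search path" % name
    try:
        # dlopen failures are reported by ctypes as OSError
        ctypes.CDLL(path)
    except OSError as exc:
        return False, str(exc)
    return True, path
```

Calling `try_load()` and printing the result shows either the path that was loaded or the loader's error message.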

### Install from source: failed

```$ PREFIX=/usr/local```

```$ CPPFLAGS="-I$PREFIX/include" CXXFLAGS="-L$PREFIX/lib" PKG_CONFIG_PATH="/usr/local/lib/pkgconfig" ./configure --prefix=$PREFIX --with-dictionary-path=$PREFIX/lib/voikko```

```$ make```

    clang: error: argument unused during compilation: '-L/usr/local/lib' [-Werror,-Wunused-command-line-argument]

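This error appears because `-L` is a linker option, and passing it through `CXXFLAGS` makes clang complain at compile time when warnings are treated as errors. Dropping `CXXFLAGS` avoids that error but leads to a different one: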

```$ PREFIX=/usr/local```

```$ CPPFLAGS="-I$PREFIX/include" PKG_CONFIG_PATH="/usr/local/lib/pkgconfig" ./configure --prefix=$PREFIX --with-dictionary-path=$PREFIX/lib/voikko```

```$ make```

    ./morphology/HfstAnalyzer.hpp:52:3: error: use of undeclared identifier 'hfst_ol'
                    hfst_ol::Transducer *t;
                    ^
    morphology/AnalyzerFactory.cpp:77:10: error: cannot initialize return object of type 'libvoikko::morphology::Analyzer *'
          with an rvalue of type 'libvoikko::morphology::HfstAnalyzer *'
                    return new HfstAnalyzer(morPath);
                           ^~~~~~~~~~~~~~~~~~~~~~~~~
                       
However, I couldn't find a way to solve this, even after struggling with it for a while.

[Ref](http://voikko.puimula.org/source-mac.html)

### Compiling voikko with hfst: succeeded

#### Install dependencies: pkg-config, gettext, libxml++ 2, libarchive, libz

I was only missing libz, so that is the only one I installed.

```$ brew install zlib```

#### Install hfst-ospell

```$ brew install hfstospell```

#### Build LibVoikko

```$ git clone https://github.com/voikko/corevoikko.git```

```$ cd corevoikko/libvoikko```

```$ ./autogen.sh```

    Cleaning autotools files...
    cp: /usr/share/gettext/config.rpath: No such file or directory
    Creating autotools files...
    configure.ac:47: error: possibly undefined macro: AC_LIBTOOL_WIN32_DLL
          If this token and others are legitimate, please use m4_pattern_allow.
          See the Autoconf documentation.
    autoreconf: /usr/local/Cellar/autoconf/2.69/bin/autoconf failed with exit status: 1

To solve the problems above:

1. Find the actual location of the file **gettext/config.rpath** and replace ```cp /usr/share/gettext/config.rpath``` in the autogen.sh file with the path you found, e.g. ```cp /usr/local/Cellar/gettext/0.19.8.1/share/gettext/config.rpath```

2. Install libtool, which provides the missing `AC_LIBTOOL_WIN32_DLL` macro: ```$ brew install libtool```
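
Finding **config.rpath** by hand can be tedious, so here is a small sketch that searches for it. The function name `find_config_rpath` is hypothetical, and the directories you pass in are assumptions about where gettext lives on your machine (the Homebrew Cellar path comes from my own setup).

```python
import glob
import os


def find_config_rpath(prefixes):
    """Return every gettext/config.rpath found below the given directories."""
    matches = []
    for prefix in prefixes:
        # "**" with recursive=True matches any number of intermediate directories
        pattern = os.path.join(prefix, "**", "gettext", "config.rpath")
        matches.extend(glob.glob(pattern, recursive=True))
    return sorted(set(matches))
```

For example, `find_config_rpath(["/usr/local/Cellar/gettext"])` lists the candidates to put into autogen.sh. An empty list means gettext is probably installed somewhere else.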

```$ ./autogen.sh```

```$ ./configure --with-dictionary-path=/Users/lifa08/Library/Spelling/voikko:/usr/local/share/voikko:/usr/local/lib/voikko```

    Libvoikko was configured with the following options
      * VFST support:                   yes
      *   Experimental VFST features:   no
      * HFST support:                   yes
      * Experimental VISLCG3 support:   no
      * Experimental Lttoolbox support: no
      * Morphology compilers:           yes
      * Simple client programs:         yes
      * Fallback dictionary path:       /Users/lifa08/Library/Spelling/voikko:/usr/local/share/voikko:/usr/local/lib/voikko
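
The fallback dictionary path is a colon-separated list of directories. To see which of them actually exist, something like the following sketch can help. The helper `check_dictionary_path` is my own illustration, not part of libvoikko.

```python
import os


def check_dictionary_path(value):
    """Split a colon-separated dictionary path and flag the directories that exist."""
    entries = [entry for entry in value.split(":") if entry]
    return [(entry, os.path.isdir(entry)) for entry in entries]


if __name__ == "__main__":
    # The value passed to --with-dictionary-path in this post
    path = "/Users/lifa08/Library/Spelling/voikko:/usr/local/share/voikko:/usr/local/lib/voikko"
    for directory, exists in check_dictionary_path(path):
        print("ok     " if exists else "missing", directory)
```

Directories reported as missing are not a problem at this point; they only show where a dictionary could be installed later.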

```$ make```

    Making all in java
    make[2]: Nothing to be done for `all'.
    Making all in test
    make[2]: Nothing to be done for `all'.
    Making all in cs
    make[2]: Nothing to be done for `all'.
    Making all in cl
    make[2]: Nothing to be done for `all'.
    make[2]: Nothing to be done for `all-am'.
    
```$ sudo make install```

    Making install in java
    make[2]: Nothing to be done for `install-exec-am'.
    make[2]: Nothing to be done for `install-data-am'.
    Making install in test
    make[2]: Nothing to be done for `install-exec-am'.
    make[2]: Nothing to be done for `install-data-am'.
    Making install in cs
    make[2]: Nothing to be done for `install-exec-am'.
    make[2]: Nothing to be done for `install-data-am'.
    Making install in cl
    make[2]: Nothing to be done for `install-exec-am'.
    make[2]: Nothing to be done for `install-data-am'.
    make[2]: Nothing to be done for `install-exec-am'.
    make[2]: Nothing to be done for `install-data-am'.
    
[Ref](http://divvun.no/doc/infra/CompilingVoikkoWithHfst.html)

Now run the Python code:

    In [1]: from voikko.libvoikko import Voikko

    In [2]: v = Voikko("fi")

    VoikkoException: Initialization of Voikko failed: No valid dictionaries were found

No valid dictionaries were found, so let's install the **voikko-fi** dictionary.

## Install voikko-fi

Building voikko-fi from source requires **foma**, libvoikko, Python (version 3 or later) and GNU make.

So install **foma** first: 

Download foma from [here](https://bitbucket.org/mhulden/foma/downloads/foma-0.9.18_OSX.tar.gz) and unpack it.

```$ cd OSX```

```$ sudo cp ./foma /usr/local/bin/.```

```$ sudo cp ./flookup /usr/local/bin/.```

[Ref](https://blogs.cornell.edu/finitestatecompling/2016/08/24/installing-foma/)
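
Before building, it is worth confirming that the required tools are on `PATH`. This is a minimal sketch with a made-up helper name, `missing_tools`. The tool list is my assumption, based on the requirements above plus `voikkovfstc`, which shows up in the voikko-fi build log.

```python
import shutil


def missing_tools(names):
    """Return the names that cannot be found on PATH."""
    return [name for name in names if shutil.which(name) is None]


if __name__ == "__main__":
    # Assumed tool list: foma/flookup copied above, voikkovfstc from the build log
    required = ["foma", "flookup", "voikkovfstc", "make", "python3"]
    missing = missing_tools(required)
    print("missing tools:", ", ".join(missing) if missing else "none")
```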

Then build **voikko-fi** from the `voikko-fi` directory of the corevoikko repository cloned earlier:

```$ make vvfst```

    foma -f vvfst/main.foma 2>&1 | grep -v "defined but not used"
    Root...15, Sanasto...13, Sanasto_ee...2, Sanasto_em...2, Sanasto_ep...2, Sanasto_es...2, Sanasto_h...1, Sanasto_l...2, Sanasto_n...2, Sanasto_nl...2, Sanasto_t...2, Sanasto_p...1, Sanasto_a...2, Sanasto_s...2, Sanasto_c...1, Sanasto_laatusanat...2, Sanasto_nimisanat...3, Sanasto_nimisanat_ja_nl_vertailumuodot...
    .
    .
    .
    uhdesana...Building lexicon...
    25
    Determinizing...
    Minimizing...
    Done!
    7.5 MB. 430498 states, 491152 arcs, Cyclic.
    defined Lexicon: 7.5 MB. 430498 states, 491152 arcs, Cyclic.
    defined ItoE: 517 bytes. 3 states, 10 arcs, Cyclic.
    defined Lengthening: 1.9 kB. 9 states, 90 arcs, Cyclic.
    defined HV: 2.5 kB. 17 states, 125 arcs, Cyclic.
    variable flag-is-epsilon = ON
    7.5 MB. 432053 states, 493608 arcs, Cyclic.
    7.5 MB. 432053 states, 493608 arcs, Cyclic.
    defined Lexicon2: 7.5 MB. 432053 states, 493608 arcs, Cyclic.
    7.5 MB. 432046 states, 493521 arcs, Cyclic.
    7.5 MB. 432046 states, 493521 arcs, Cyclic.
    Writing AT&T file: vvfst/all.att
    ! grep ']]' vvfst/all.att
    cat vvfst/all.att | sort -n | voikkovfstc -o vvfst/mor.vfst
    Symbols: 288
    Transitions: 493521
    Final states: 1
    Overflow cells: 1
    foma -f vvfst/autocorrect.foma 2>&1 | grep -v "defined but not used"
    Root...Building lexicon...
    228
    Determinizing...
    Minimizing...
    Done!
    22.8 kB. 1188 states, 1411 arcs, 227 paths.
    defined Lexicon: 22.8 kB. 1188 states, 1411 arcs, 227 paths.
    variable flag-is-epsilon = ON
    22.8 kB. 1188 states, 1411 arcs, 227 paths.
    22.8 kB. 1188 states, 1411 arcs, 227 paths.
    Writing AT&T file: vvfst/autocorrect.att
    cat vvfst/autocorrect.att | sort -n | voikkovfstc -o vvfst/autocorr.vfst
    Symbols: 22
    Transitions: 1411
    Final states: 1
    Overflow cells: 0

```$ sudo make vvfst-install DESTDIR=/usr/local/lib/voikko```

    install -m 755 -d /usr/local/lib/voikko/5/mor-standard
    install -m 644 vvfst/index.txt vvfst/mor.vfst vvfst/autocorr.vfst /usr/local/lib/voikko/5/mor-standard
    
[Ref](https://github.com/voikko/corevoikko/tree/master/voikko-fi)
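
To double-check that the dictionary landed where expected, the install log above can be turned into a quick check. This sketch assumes the three file names and the target directory shown in that log; `missing_dictionary_files` is a hypothetical helper.

```python
import os

# File names taken from the "make vvfst-install" output in this post
EXPECTED_FILES = ("index.txt", "mor.vfst", "autocorr.vfst")


def missing_dictionary_files(directory, expected=EXPECTED_FILES):
    """Return the expected dictionary files that are absent from the directory."""
    return [
        name for name in expected
        if not os.path.isfile(os.path.join(directory, name))
    ]


if __name__ == "__main__":
    # Target directory as printed by the install step on my machine
    print(missing_dictionary_files("/usr/local/lib/voikko/5/mor-standard"))
```

An empty list means all three files are in place.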

Then run the Python code again. This time there are no errors!

    In [5]: v = Voikko("fi")

    In [7]: v.analyze(u"kissa")
    [{'BASEFORM': 'kissa',
      'CLASS': 'nimisana',
      'FSTOUTPUT': '[Ln][Xp]kissa[X]kiss[Sn][Ny]a',
      'NUMBER': 'singular',
      'SIJAMUOTO': 'nimento',
      'STRUCTURE': '=ppppp',
      'WORDBASES': '+kissa(kissa)'}]

    In [8]: v.spell(u"kissa")
    True

    In [9]: v.suggest(u"kisssa")
    ['kissa', 'kissaa', 'kisassa', 'kisussa']

    In [10]: v.hyphenate(u"kissa")
    'kis-sa'

    In [11]: v.terminate()
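
Since the session ends with an explicit `terminate()` call, it can be convenient to wrap that in a context manager so the call is not forgotten when something fails halfway. This is just a sketch of that idea: `voikko_session` is a name I made up, and it only assumes that the object it is given has a `terminate()` method, as the `Voikko` instance above does.

```python
from contextlib import contextmanager


@contextmanager
def voikko_session(factory):
    """Yield the object built by factory() and call its terminate() afterwards."""
    instance = factory()
    try:
        yield instance
    finally:
        # Runs even if the body of the with-block raises
        instance.terminate()


# Intended usage (requires a working libvoikko and dictionary):
#
#   from voikko.libvoikko import Voikko
#   with voikko_session(lambda: Voikko("fi")) as v:
#       print(v.spell(u"kissa"))
```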