simple cg output from hfst-ospell, expects one word per line as input

hfst-ospell-cg

Description

This program spell-checks using hfst-ospell. It expects one word per line as input and outputs readings in Constraint Grammar (CG) format.

Prerequisites

  • gcc >=5.0.0 with libstdc++-5-dev (or similarly recent version of clang, with full C++11 support)
  • hfst-ospell-dev >=0.4.3

Tested with gcc-5.4.0. On Mac OS X, the newest Xcode includes a modern C++ compiler.

Building

./autogen.sh
./configure  # optionally with argument --prefix=$HOME/my/prefix
make
make install # with sudo if you didn't specify a prefix

On OS X, you may have to do this:

export CC=clang CXX=clang++ "CXXFLAGS=-std=gnu++11 -stdlib=libc++"
./autogen.sh
./configure  LDFLAGS=-L/opt/local/lib # optionally with argument --prefix=$HOME/my/prefix
make
make install # with sudo if you didn't specify a prefix

Usage

Takes one argument: an hfst-ospell zhfst archive

src/hfst-ospell-cg se.zhfst < input > output
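
Since the input must have one word per line, running text needs to be split first. A minimal sketch, assuming plain whitespace splitting is good enough (punctuation stays attached to the words); `words_per_line` is a hypothetical helper name, not part of this program:

```shell
# Hypothetical helper: turn running text on stdin into one token per line.
# Splits on whitespace only; it does no real tokenisation.
words_per_line() {
  # Map every whitespace character to a newline, squeezing repeats,
  # then drop any empty line left over from leading whitespace.
  tr -s '[:space:]' '\n' | sed '/^$/d'
}

# Placeholder input, just to show the shape of the result.
printf 'foo  bar\nbaz\n' | words_per_line
```

The result can then be piped into the program, e.g. `words_per_line < text.txt | src/hfst-ospell-cg se.zhfst > output`, where `text.txt` is a placeholder file name.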

You can give a maximum weight (inclusive, based on the error model) with -w, or a maximum number of suggestions with -n, e.g.

src/hfst-ospell-cg -w 30000 se.zhfst < input > output
src/hfst-ospell-cg -n 3000 se.zhfst < input > output

You can also give a max analysis weight with -W, since the analyser may have its own weights (e.g. compound tags have higher weights):

src/hfst-ospell-cg -W 10000 se.zhfst < input > output

Note that FST weights are multiplied by 1000 and cast to integers, since CG expects numerical tags to be integral. Comparisons are inclusive (greater-than-or-equals).
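
To pick a sensible value for -w or -W, it can help to see how an FST weight maps to the integer scale. The sketch below is an illustration only: `scale_weight` and `within_max` are hypothetical helpers, truncation is assumed for the integer cast, and a reading is assumed to pass when its scaled weight is less than or equal to the given maximum.

```shell
# Hypothetical helper: multiply an FST weight by 1000 and truncate to an integer.
# Truncation is an assumption about how the cast behaves.
scale_weight() {
  awk -v w="$1" 'BEGIN { printf "%d\n", int(w * 1000) }'
}

# Hypothetical helper: succeed if the scaled weight is within the maximum.
# $1 is the FST weight, $2 is the integer maximum as passed to -w or -W.
within_max() {
  [ "$(scale_weight "$1")" -le "$2" ]
}

scale_weight 2.5
if within_max 30 30000; then echo "kept"; else echo "dropped"; fi
```

Under these assumptions, an FST weight of 30 sits exactly on a maximum of 30000 and still passes, because the comparison is inclusive.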

Troubleshooting

If you get

terminate called after throwing an instance of 'std::regex_error'
  what():  regex_error

or

util.hpp:36:19: fatal error: codecvt: No such file or directory
 #include <codecvt>
                   ^
compilation terminated.

then your C++ compiler is too old. See Prerequisites.
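
A quick way to check a gcc version against the prerequisite is to compare its major version number. This is a rough sketch: `gcc_new_enough` is a hypothetical helper, it only looks at the major version, and it does not cover clang.

```shell
# Hypothetical helper: succeed if a gcc version string has major version >= 5.
gcc_new_enough() {
  major=${1%%.*}   # keep everything before the first dot
  [ "$major" -ge 5 ]
}

if gcc_new_enough "5.4.0"; then echo "ok"; else echo "too old"; fi
```

On a real system the version string can come from the compiler itself, e.g. `gcc_new_enough "$(gcc -dumpversion)"`.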

Progress [2/4]

This should:

  • [X] load a zhfst bin
  • [X] output CG analyses
  • [ ] do NUL-flushing, outputting STREAMCMD:FLUSH
  • [ ] deal with subreadings