Release 3.14.0

@eaxelson eaxelson released this Mar 23, 2018 · 36 commits to master since this release

Noteworthy changes in 3.14.0

  • Numerous improvements to pmatching and tokenization:

    • pmatch now supports the default symbol

    • pmatch now supports reading word embeddings in the binary format

    • improvements to pmatch runtime context handling: a bug affecting expression-initial contexts is fixed, and pmatch now supports Ins() arcs inside RC() and NRC() contexts

    • in pmatch, bugs affecting multiple Ins() arcs, in particular nested ones, are fixed

  • Implement variable 'retokenize' in hfst-xfst

The file hfst-3.14.0.tar.gz contains the source code.
For installation instructions and alternative installation methods, see our download page.

Release 3.13.0

@eaxelson eaxelson released this Sep 23, 2017 · 202 commits to master since this release

Noteworthy changes in 3.13.0

  • Numerous improvements to pmatching and tokenization:

    • "[].with(X = Y)" feature in pmatch. This provides support in the pmatch2fst compiler to define "global flags".

    • Add a variable "xerox-composition", which defaults to "on".

    • Consider list symbols ("@l..." and "@x...") to be special.

    • Fix runtime handling of contexts and compilation of negative contexts.

    • Make Like() and Unlike() much faster by not sorting the whole vocabulary and only calculating each comparison key once.

    • Keep track of weights along context checking paths and unify weight handling.

    • In blankline-separated mode, keep blanklines in output too.

    • Round weight to zero decimals, in non-scientific notation.

    • Make hfst-tokenise usable as a lib; include simple string-to-string function.

    • Use libreadline in hfst-pmatch when available.

    • Clean up the remainder of the pmatch test suite; all the tests now pass.

  • Python interface:

    • Support reading several twolc files.

    • Add functions 'compose' and 'cross_product' that take a list of transducers.

    • Allow empty string as input for hfst.fst and hfst.fsa and interpret it as epsilon.

    • Perform fsmbook tests also via python API.

    • Add option --local-hfst to setup.py.

    • Include pre-swig-generated wrappers to pypi source distribution.

  • Compilation:

    • Use c++11 unordered_map and unordered_set by default, unless otherwise specified.

    • Add an option --without-c++11 (defaults to 'no') to compile hfst without c++11 support.

    • Require libc++ and osx version >= 10.7 with clang.

  • New functions and options:

    • Add function HfstBasicTransducer::remove_final_weight.

    • Add function HfstTransducer::negate() for automata.

    • Add option --restricted-mode (-R) to hfst-xfst.

    • Flag diacritics: support getting a list of operations involved with a particular feature.

    • Allow creating HfstInputStream objects from std::istream objects.

  • Fix issues #341 and #353, make workarounds for issue #358.
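
The Like() and Unlike() speedup listed above rests on two general ideas: avoid sorting the entire vocabulary when only the best few entries are needed, and compute each comparison key only once. The following is a minimal Python sketch of those ideas, not HFST's actual implementation. The function names, the toy two-dimensional vectors, and the choice of cosine similarity as the comparison key are all assumptions made for illustration.

```python
import heapq
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def most_similar(word, vectors, k):
    """Return the k words closest to `word` without sorting the whole vocabulary."""
    target = vectors[word]
    # Compute each comparison key exactly once and keep it next to its word.
    keyed = ((cosine(target, vec), other)
             for other, vec in vectors.items() if other != word)
    # heapq.nlargest tracks only k candidates at a time instead of fully sorting.
    return [other for _, other in heapq.nlargest(k, keyed)]


def least_similar(word, vectors, k):
    """Return the k words farthest from `word`, again without a full sort."""
    target = vectors[word]
    keyed = ((cosine(target, vec), other)
             for other, vec in vectors.items() if other != word)
    return [other for _, other in heapq.nsmallest(k, keyed)]


# Hypothetical toy embeddings, for illustration only.
toy_vectors = {
    "cat": [1.0, 0.0],
    "dog": [0.9, 0.1],
    "car": [0.0, 1.0],
    "bus": [0.1, 0.9],
}
```

With the toy data, most_similar("cat", toy_vectors, 2) ranks "dog" ahead of "bus", and least_similar("cat", toy_vectors, 1) picks "car". Selecting k items this way costs roughly O(n log k) comparisons rather than the O(n log n) of a full sort.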

The file hfst-3.13.0.tar.gz below contains the source code.
For installation instructions and alternative installation methods, see our download page.

Release 3.12.2

@eaxelson eaxelson released this Apr 11, 2017 · 367 commits to master since this release

Noteworthy changes in 3.12.2

  • Changes to configure:

    • Disable lexc and foma wrappers as well as hfst-train-tagger tool unless explicitly requested

    • Enable hfst-calculate and hfst-xfst by default

    • Add experimental options --with-openfst-log=lean and --with-sfst=lean which support a limited number of operations for these types (reading, writing, converting between types and deleting)

    • Require at least automake version 1.12 unless compiling from pre-yacc-generated sources

  • Implement hfst-twolc as a single program instead of a script. Get rid of hfst-twolc-system and hfst-twolc-loc tools.

  • Improve pmatch compilation and error handling mechanisms

  • Improve hfst-tokenize tool

  • Add transliterate output mode (--transliterate) to hfst-proc

  • Changes to python interface:

    • Support twolc and sfst compilation

    • Improve HfstBasicTransducer iteration mechanism

    • Add experimental support for python version 2

    • Support apply up and apply down commands in function start_xfst

The file hfst-3.12.2.tar.gz below contains the source code.
For installation instructions and alternative installation methods, see our download page.

Release 3.12.1

@eaxelson eaxelson released this Dec 1, 2016 · 517 commits to master since this release

Noteworthy changes in 3.12.1

  • Fix flag elimination bug (reported in issue #342)
  • Do not allow unescaped dots in regular expressions
  • Improvements to pmatch and tokenization tools:
    • Search for included files under scriptdir, not working dir
    • Add experimental two-vector model for word sense
    • Handle Apertium-style superblanks in --giella-cg format
  • Rename the PyPI package to 'hfst' (available at https://pypi.python.org/pypi/hfst)
  • Update foma back-end

The file hfst-3.12.1.tar.gz below contains the entire source code.
The file libhfst-3.12.1.tar.gz is a minimal source code package which contains only the hfst library, the openfst back-end, and the python bindings. It is licensed under the LGPL.
The files hfst-3.12.1_py3.3-3.4_win64 contain 64-bit python bindings for python versions 3.3 and 3.4 for windows.
For installation instructions and alternative installation methods, see our download page.

Release 3.12.0

@eaxelson eaxelson released this Nov 14, 2016 · 543 commits to master since this release

Noteworthy changes in 3.12.0

  • fixes to memory leaks and efficiency
  • fixes to numerous warnings
  • changes in Python bindings:
    • rename hfst.rules to hfst.sfst_rules
    • get rid of hfst.types and offer implementation types in class ImplementationType
    • add Xerox-type rules in module hfst.xerox_rules
    • improve documentation
    • tentatively add partial support for pypi installation
  • improvements to pmatch tools and hfst-proc

The file hfst-3.12.0.tar.gz below contains the entire source code.
The file libhfst-3.12.0.tar.gz is a minimal source code package which contains only the hfst library, the openfst back-end, and the python bindings. It is licensed under the LGPL.
For binaries, see our download page.

Release 3.11.0

@eaxelson eaxelson released this Jun 17, 2016 · 672 commits to master since this release

Noteworthy changes in 3.11.0

  • Add docstrings to Python API
  • Changes and improvements to pmatch tools, hfst-tokenize, hfst-optimized-lookup, hfst-lookup and hfst-xfst
  • Fix bugs in flag elimination

The file hfst-3.11.0.tar.gz below contains the entire source code.
The file libhfst-3.11.0.tar.gz is a minimal source code package which contains only the hfst library, the openfst back-end, and the python bindings. It is licensed under the LGPL.
For binaries, see our download page.

Release 3.10.0

@eaxelson eaxelson released this Apr 13, 2016 · 850 commits to master since this release

Noteworthy changes in 3.10.0

  • Swap directions of 'apply up' and 'apply down' in hfst-xfst, so that these commands work in the same way as in foma and xfst.
  • Add a new tool called hfst-flookup. It does lookup from right to left, in the same way as foma's flookup and xerox's lookup. The tools hfst-lookup and hfst-optimized-lookup stay as they are.
  • Improvements to pmatch and optimized lookup
  • Make hfst-fst2fst print a more informative error message if a gzipped native foma transducer is given as input.
  • Changes and improvements in Python interface:
    • Transducer functions returning a reference to the transducer now return void.
    • Tentatively add some pmatch functions.
    • Add some two-level rule functions.
    • Support lookup for transducers not in optimized-lookup format.
    • Support Python 2 unicode strings in lookup.
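
The direction swap described above follows the xfst convention, in which 'apply up' maps a lower-side (surface) string to the matching upper-side strings and 'apply down' maps an upper-side string to the matching lower-side strings. The following toy sketch in plain Python models a transducer's relation as a set of (upper, lower) string pairs. It does not use the hfst API, and the string pairs and function names are hypothetical.

```python
# A transducer's relation modelled as (upper, lower) string pairs.
# Hypothetical toy data: analyses on the upper side, surface forms on the lower side.
RELATION = {
    ("cat+N+Sg", "cat"),
    ("cat+N+Pl", "cats"),
    ("walk+V+3Sg", "walks"),
    ("walk+N+Pl", "walks"),
}


def apply_down(upper, relation=RELATION):
    """Map an upper-side string to all matching lower-side strings."""
    return sorted(lo for up, lo in relation if up == upper)


def apply_up(lower, relation=RELATION):
    """Map a lower-side string to all matching upper-side strings."""
    return sorted(up for up, lo in relation if lo == lower)
```

Here apply_down("cat+N+Pl") yields the surface form "cats", while apply_up("walks") yields both analyses that share that surface form.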

The file hfst-3.10.0.tar.gz below contains the entire source code.
The file libhfst-3.10.0.tar.gz is a minimal source code package which contains only the hfst library, the openfst back-end, and the python bindings. It is licensed under the LGPL.
For binaries, see our download page.

Release 3.9.2

@eaxelson eaxelson released this Mar 22, 2016 · 893 commits to master since this release

Noteworthy changes in 3.9.2

  • Improvements to tokenization and pmatch tools

The file hfst-3.9.2.tar.gz below contains the entire source code.
The file libhfst-3.9.2.tar.gz is a minimal source code package which contains only the hfst library, the openfst back-end, and the python bindings. It is licensed under the LGPL.
For binaries, see our download page.