
v1.8.6

@blackwinter blackwinter released this 09 Feb 10:29
· 122 commits to master since this release
  • Lingo::Attendee::VectorFilter learned pos option to print position and
    byte offset with each word.
  • Lingo::Attendee::VectorFilter learned tfidf option to sort results based
    on their tf–idf score; the document frequencies are calculated over the
    "corpus" of all files processed during a single program invocation.
  • Lingo::Attendee::VectorFilter learned tokens option to filter on
    Lingo::Language::Token in addition to Lingo::Language::Word.
  • Lingo::Attendee::VectorFilter no longer supports debug (as well as
    prompt and preamble); use Lingo::Attendee::DebugFilter instead.
  • Lingo::Attendee::TextReader no longer removes line endings; option chomp
    is obsolete.
  • Lingo::Attendee::TextReader passes byte offset to the following attendee.
  • Lingo::Attendee::Tokenizer records each token's byte offset.
  • Lingo::Attendee::Tokenizer records each token's sequence position.
  • Lingo::Attendee::Tokenizer learned skip-tags option to skip over
    specified tags' contents.
  • Lingo::Attendee subclasses warn when invalid or obsolete options or names
    are used.
  • Changed German infix substitution /en to ch/chen in order to prevent
    overly aggressive identifications.
  • Internal refactoring and API changes.
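
To illustrate what the tfidf entry above means by document frequencies calculated over the "corpus" of all files in one invocation, here is a minimal sketch in plain Ruby. It does not use Lingo's API: the tfidf_rank helper, the file names, and the word lists are hypothetical, and the weighting shown (relative term frequency times the natural log of inverse document frequency) is one common tf–idf variant, not necessarily the exact formula VectorFilter applies.

```ruby
# Hypothetical helper, not part of Lingo: ranks the words of each document by
# tf-idf, taking document frequencies over all documents passed in one call.
def tfidf_rank(docs)
  # df[word] = number of documents that contain the word at least once
  df = Hash.new(0)
  docs.each_value { |words| words.uniq.each { |word| df[word] += 1 } }

  n = docs.size.to_f

  docs.transform_values do |words|
    total = words.size.to_f

    scores = words.tally.map do |word, count|
      # relative term frequency times log inverse document frequency
      [word, (count / total) * Math.log(n / df[word])]
    end

    # highest score first; ties are broken alphabetically
    scores.sort_by { |word, score| [-score, word] }
  end
end

# Assumed sample input: one word list per processed file.
corpus = {
  'a.txt' => %w[lingo indexes german text],
  'b.txt' => %w[lingo indexes english text text]
}

tfidf_rank(corpus).each do |file, scores|
  puts "#{file}: #{scores.map(&:first).join(' ')}"
end
```

Words that occur in every file of the run get a score of zero under this variant, so they sort behind words specific to a single file. That is also why the ranking depends on which files are processed together in a single invocation.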