Releases: finalfusion/finalfrontier

Skip SSE tests on non-SSE platforms

27 Jul 09:03

Tests failed on some non-x86_64 platforms because they do not have SSE. This minor release feature-gates those tests.

Directly save to other formats

02 Jul 10:32

Do you use fastText, but would also like to get your hands on structured skipgram, directional skipgram, or dependency embedding models? This is now possible, since finalfrontier 0.9.1 adds support for saving trained embeddings in the fastText format 🎉.

With the new --format flag, you can save embeddings to other formats in addition to finalfusion. The options are fasttext, word2vec (binary), text, and textdims.

Target vocabulary sizes and run-time selection of SIMD code paths

24 Jun 07:00
  • Add support for training with a target vocabulary size. This is an alternative for setting a minimum token count and will attempt to create a vocabulary with the given size. Target vocabulary sizes are enabled through the --context-target-size, --target-size, and --ngram-target-size options. (@sebpuetz)
  • SIMD code paths are now dynamically selected at run-time. It is therefore no longer necessary to compile finalfrontier with specific target features to use code paths for newer SIMD instruction sets. (@danieldk)
  • Add dot product implementation using FMA (fused multiply-add). (@danieldk)
  • Enable training with the fastText indexer. With future changes in finalfusion-rust and finalfusion-convert, this will allow you to create fastText embeddings with finalfrontier! (@sebpuetz)
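
To illustrate the run-time selection of SIMD code paths and the FMA dot product mentioned above, here is a minimal sketch. It is not finalfrontier's actual implementation: the function names (dot, dot_fma, dot_scalar) and the choice of AVX plus FMA as the accelerated path are assumptions made for this example. It only relies on the standard library's is_x86_feature_detected! macro and std::arch intrinsics.

```rust
/// Portable fallback: plain scalar dot product.
fn dot_scalar(a: &[f32], b: &[f32]) -> f32 {
    a.iter().zip(b).map(|(x, y)| x * y).sum()
}

/// Dot product using 256-bit fused multiply-add (hypothetical helper).
/// Callers must check that the CPU supports AVX and FMA first.
#[cfg(target_arch = "x86_64")]
#[target_feature(enable = "avx,fma")]
unsafe fn dot_fma(a: &[f32], b: &[f32]) -> f32 {
    use std::arch::x86_64::*;

    let len = a.len().min(b.len());
    let chunks = len / 8;

    // Accumulate eight lanes at a time: acc = x * y + acc.
    let mut acc = _mm256_setzero_ps();
    for i in 0..chunks {
        let x = _mm256_loadu_ps(a.as_ptr().add(i * 8));
        let y = _mm256_loadu_ps(b.as_ptr().add(i * 8));
        acc = _mm256_fmadd_ps(x, y, acc);
    }

    // Horizontal sum of the eight lanes.
    let mut lanes = [0f32; 8];
    _mm256_storeu_ps(lanes.as_mut_ptr(), acc);
    let mut sum: f32 = lanes.iter().sum();

    // Handle the remaining elements that do not fill a full vector.
    for i in chunks * 8..len {
        sum += a[i] * b[i];
    }
    sum
}

/// Pick the fastest available code path when the function is called,
/// so the binary does not need to be compiled with special target features.
pub fn dot(a: &[f32], b: &[f32]) -> f32 {
    #[cfg(target_arch = "x86_64")]
    {
        if is_x86_feature_detected!("avx") && is_x86_feature_detected!("fma") {
            // Safe because the required features were just detected.
            return unsafe { dot_fma(a, b) };
        }
    }
    dot_scalar(a, b)
}
```

Because the feature check happens at run-time, the same binary falls back to the scalar path on CPUs (or architectures) without FMA.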

CoNLL-U dependencies and improved error messages

23 Jun 11:02
  • Update the dependency format from CoNLL-X to CoNLL-U.
  • Improve error handling and error messages.
  • Remove the use of end-of-sentence markers.
  • Upgrade to finalfusion 0.12.

Explicit n-grams, single finalfrontier command

08 Nov 08:48
  • The most user-visible change is that ff-train-deps and ff-train-skipgram have been merged into one command, finalfrontier. Dependency and skipgram embeddings can be trained with finalfrontier deps and finalfrontier skipgram, respectively.

  • Support for training explicit subwords has been added.

    Thus far, finalfrontier has followed the same subword approach as fastText: each subword (n-gram) is mapped to an embedding using the FNV-1 hash function. This approach reduces the number of embeddings when the corpus contains a large number of possible n-grams, at the cost of collisions. With the --subwords ngrams option, finalfrontier uses an (explicit) n-gram vocabulary instead.

  • The hogwild and finalfrontier-utils crates have been merged into the finalfrontier crate. Consequently, finalfrontier now consists of a single crate.

  • When the number of threads is not specified, finalfrontier has traditionally used half the logical CPUs. This has been refined to use half the number of logical CPUs, capped at 20 threads. Using more than 20 threads can slow convergence drastically on typical corpora.
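
The difference between hashed and explicit subwords described above can be sketched as follows. This is an illustration under assumptions, not finalfrontier's code: the 64-bit FNV-1 variant, the modulo bucket mapping, the "<" and ">" word boundary markers, and all function names are chosen for this example.

```rust
use std::collections::HashMap;

/// Extract character n-grams of length min_n..=max_n from a word that is
/// bracketed with "<" and ">" (the bracketing is an assumption here).
fn char_ngrams(word: &str, min_n: usize, max_n: usize) -> Vec<String> {
    let chars: Vec<char> = format!("<{}>", word).chars().collect();
    let mut ngrams = Vec::new();
    for n in min_n..=max_n {
        if n == 0 || n > chars.len() {
            continue;
        }
        for window in chars.windows(n) {
            ngrams.push(window.iter().collect());
        }
    }
    ngrams
}

/// 64-bit FNV-1: multiply by the FNV prime, then XOR in the next byte.
fn fnv1_64(bytes: &[u8]) -> u64 {
    let mut hash: u64 = 0xcbf2_9ce4_8422_2325;
    for &b in bytes {
        hash = hash.wrapping_mul(0x0000_0100_0000_01b3);
        hash ^= u64::from(b);
    }
    hash
}

/// Hashed subwords: every n-gram gets an index, but distinct n-grams
/// can collide in the same bucket.
fn bucket_index(ngram: &str, n_buckets: u64) -> u64 {
    fnv1_64(ngram.as_bytes()) % n_buckets
}

/// Explicit subwords: only n-grams in the vocabulary get an index,
/// and there are no collisions.
fn explicit_index(vocab: &HashMap<String, usize>, ngram: &str) -> Option<usize> {
    vocab.get(ngram).copied()
}
```

With hashing, the embedding matrix size is fixed by the number of buckets regardless of the corpus; with an explicit vocabulary, it grows with the number of n-grams that are kept, and unseen n-grams simply have no embedding.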

0.6.1

21 Jun 19:24

Update to finalfusion 0.7.1 to correctly enable storage of norms.

Directional skipgram

14 Jun 07:24

This release has the following changes:

  • Add support for the directional skip-gram model (Song et al., 2018).
  • Store norms in a finalfusion chunk, making it possible to retrieve the unnormalized embeddings.
  • Better defaults for skip-gram models: context size 5 -> 10, dimensions 100 -> 300, epochs 5 -> 15.
  • Improved command-line option handling.
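
Why storing norms makes the unnormalized embeddings recoverable can be shown with a small sketch. It assumes that embeddings are l2-normalized row by row and that one norm is stored per row; the function names are hypothetical and this is not the finalfusion chunk format itself.

```rust
/// Normalize each embedding to unit length in place and return the
/// original l2 norms, one per row.
fn normalize_rows(rows: &mut [Vec<f32>]) -> Vec<f32> {
    let mut norms = Vec::with_capacity(rows.len());
    for row in rows.iter_mut() {
        let norm = row.iter().map(|v| v * v).sum::<f32>().sqrt();
        // Leave all-zero rows untouched to avoid dividing by zero.
        if norm > 0.0 {
            for v in row.iter_mut() {
                *v /= norm;
            }
        }
        norms.push(norm);
    }
    norms
}

/// Recover the unnormalized embedding by scaling with the stored norm.
fn unnormalized(row: &[f32], norm: f32) -> Vec<f32> {
    row.iter().map(|v| v * norm).collect()
}
```

For example, normalizing the row [3.0, 4.0] stores the norm 5.0, and multiplying the normalized row by 5.0 again yields approximately [3.0, 4.0] (up to floating-point rounding).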

Dependency embeddings

25 Apr 15:51

The addition in this release is support for dependencies as context. This makes it possible to train dependency embeddings as described by Levy & Goldberg, 2014. The dependency embedding model can be tuned in fine-grained detail (such as the depth of the relations).

  • Add dependency relations.
  • Refactor training to make it easier to add different context types.
  • Precompiled releases, including a MUSL target.
  • Migration to Rust 2018.
  • ff-train has been renamed to ff-train-skipgram.
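
As a rough illustration of dependencies as context in the style of Levy & Goldberg (2014), the sketch below derives (word, context) pairs from a parsed sentence at depth 1. The Token struct, the 0-based head indices, and the "form/relation" and "form/relation-1" context labels are assumptions for this example; finalfrontier's own context representation and its tuning options may differ.

```rust
/// A token in a dependency-parsed sentence (hypothetical representation).
/// `head` is the 0-based index of the head token, or None for the root.
struct Token<'a> {
    form: &'a str,
    head: Option<usize>,
    relation: &'a str,
}

/// Produce (focus word, context) pairs from direct dependency relations.
/// A dependent sees its head through the inverse relation (suffix "-1"),
/// and the head sees the dependent through the regular relation.
fn dependency_contexts(sentence: &[Token]) -> Vec<(String, String)> {
    let mut pairs = Vec::new();
    for token in sentence {
        // Skip the root and any head index that is out of range.
        let head = match token.head.and_then(|idx| sentence.get(idx)) {
            Some(head) => head,
            None => continue,
        };
        pairs.push((
            token.form.to_string(),
            format!("{}/{}-1", head.form, token.relation),
        ));
        pairs.push((
            head.form.to_string(),
            format!("{}/{}", token.form, token.relation),
        ));
    }
    pairs
}
```

For a sentence such as "scientist discovers star", where "scientist" is the nsubj of "discovers", this gives "scientist" the context "discovers/nsubj-1" and "discovers" the context "scientist/nsubj". Unlike a linear window, contexts are determined by syntactic relations rather than by distance in the sentence.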

Switch from rust2vec to finalfusion

03 Apr 11:28
  • The rust2vec crate is renamed to finalfusion. This minor release changes finalfrontier to use finalfusion as a dependency.
  • This is the first release that provides builds for Linux (both glibc and a static MUSL binary) and macOS.

Directly to finalfusion

11 Mar 07:25

The most important change in this release is that finalfrontier stores trained embeddings in the finalfusion format, which is implemented by rust2vec and finalfusion-python. This format is more generic than the old finalfrontier format and easier to implement readers for.

As a result of these changes, finalfrontier is now only for training embeddings. To actually use the embeddings in your own program, use rust2vec.

Summary of changes since v0.2.0:

  • Store trained embeddings in finalfusion format.
  • Remove ff-similarity, ff-convert, and ff-compute-accuracy. This functionality is provided by rust2vec.