v0.5.0

@LoicGrobol LoicGrobol released this 13 May 11:27
· 154 commits to main since this release
39dc80e

The performance of the contemporary models in this release has improved, most notably for models
that do not use BERT.

Added

  • The scripts/zenodo_upload.py script, a helper for uploading files to a Zenodo deposit.

Changed

  • The CharRNN lexer now represents words with the last hidden (instead of cell) state of the LSTM
    and does not run on padding anymore.
  • Minimal PyTorch version is now 1.9.0.
  • Minimal Transformers version is now 4.19.0.
  • Use torch.inference_mode instead of torch.no_grad in all the parser methods.
  • BERT lexer batches no longer have an obsolete, always-zero word_indices attribute.
  • DependencyDataset no longer has lexicon attributes (ito(lab|tag) and their inverses) since we
    don't need these anymore.
  • The train_model script now skips incomplete runs with a warning.
  • The train_model script has nicer logging, including progress bars to help keep track of the
    experiments.
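
As an illustration of the CharRNN and torch.inference_mode changes above, here is a minimal sketch, not the parser's actual code. The function name encode_words and the tensor sizes are hypothetical. It packs the character sequences so the LSTM never runs on padding, takes the last hidden state (rather than the cell state) as the word representation, and runs under torch.inference_mode.

```python
import torch
from torch.nn.utils.rnn import pack_padded_sequence


def encode_words(
    lstm: torch.nn.LSTM, chars: torch.Tensor, lengths: torch.Tensor
) -> torch.Tensor:
    """Return one vector per word: the last hidden state of the top LSTM layer.

    chars: (n_words, max_len, char_dim) padded character embeddings
    lengths: (n_words,) true number of characters per word (CPU tensor)
    """
    with torch.inference_mode():
        # Packing makes the LSTM stop at each word's true length,
        # so padded positions are never processed.
        packed = pack_padded_sequence(
            chars, lengths, batch_first=True, enforce_sorted=False
        )
        _, (h_n, _c_n) = lstm(packed)
        # h_n has shape (num_layers, n_words, hidden); keep the top layer.
        return h_n[-1]


# Hypothetical sizes, for illustration only.
lstm = torch.nn.LSTM(input_size=8, hidden_size=16, batch_first=True)
lstm.eval()
chars = torch.randn(3, 5, 8)
lengths = torch.tensor([5, 3, 2])
word_repr = encode_words(lstm, chars, lengths)  # shape: (3, 16)
```

Because the sequences are packed, changing the values stored at padded positions leaves word_repr unchanged.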

Fixed

  • The first word in the word embeddings lexer vocabulary is not used as padding anymore and has a
    real embedding.
  • BERT embeddings are now correctly computed with an attention mask to ignore padding.
  • The root token embedding coming from BERT lexers is now an average of non-padding words'
    embeddings.
  • FastText embeddings are now computed by averaging over non-padding subwords' embeddings.
  • In server mode, models are now correctly in eval mode and processing is done
    in torch.inference_mode.
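
The padding-aware averaging mentioned in the fixes above can be sketched as follows. This is an assumed, simplified illustration, not the library's implementation; masked_mean is a hypothetical helper name, and the mask plays the role of an attention mask marking real (non-padding) positions.

```python
import torch


def masked_mean(embeddings: torch.Tensor, mask: torch.Tensor) -> torch.Tensor:
    """Average embeddings over the positions where mask is True.

    embeddings: (batch, seq_len, dim)
    mask: (batch, seq_len) boolean, False on padding
    """
    weights = mask.unsqueeze(-1).to(embeddings.dtype)
    summed = (embeddings * weights).sum(dim=1)
    # Guard against division by zero for fully padded rows.
    counts = weights.sum(dim=1).clamp(min=1.0)
    return summed / counts


# One sentence with two real words and one padding position.
embeddings = torch.tensor([[[1.0, 2.0], [3.0, 4.0], [100.0, 100.0]]])
mask = torch.tensor([[True, True, False]])
root = masked_mean(embeddings, mask)  # tensor([[2., 3.]])
```

A plain mean over all three positions would be pulled toward the padding vector; masking keeps the average over the two real words only.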

Full Changelog: v0.4.2...v0.5.0