v0.5.0
The performance of the contemporary models is improved in this release, most notably for models
that do not use BERT.
Added
- The `scripts/zenodo_upload.py` script, a helper for uploading files to a Zenodo deposit.
Changed
- The CharRNN lexer now represents words with the last hidden (instead of cell) state of the LSTM
  and does not run on padding anymore.
- The minimum PyTorch version is now `1.9.0`.
- The minimum Transformers version is now `4.19.0`.
- Use `torch.inference_mode` instead of `torch.no_grad` over all the parser methods.
- BERT lexer batches no longer have an obsolete, always zero `word_indices` attribute.
- `DependencyDataset` does not have lexicon attributes (`ito(lab|tag)` and their inverse) since we
  don't need these anymore.
- The `train_model` script now skips incomplete runs with a warning.
- The `train_model` script has nicer logging, including progress bars to help keep track of the
  experiments.
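
As an illustration of the CharRNN and `torch.inference_mode` changes above, here is a minimal sketch, not the parser's actual code. It assumes a toy character vocabulary, made-up layer sizes, and right-padded input with index 0 as padding. Packing the batch keeps the LSTM from running on padding positions, and the word representation is read from the last hidden state rather than the cell state.

```python
import torch
from torch import nn
from torch.nn.utils.rnn import pack_padded_sequence

torch.manual_seed(0)

# Hypothetical toy sizes, not the parser's actual hyperparameters.
embedding = nn.Embedding(num_embeddings=32, embedding_dim=8, padding_idx=0)
lstm = nn.LSTM(input_size=8, hidden_size=16, batch_first=True)

# Two words encoded as character indices, right-padded with 0.
chars = torch.tensor([[5, 3, 9, 2], [7, 4, 0, 0]])
lengths = torch.tensor([4, 2])  # number of real characters per word

# inference_mode disables autograd tracking, as no_grad does.
with torch.inference_mode():
    # Packing tells the LSTM where each sequence ends, so padding is skipped.
    packed = pack_padded_sequence(
        embedding(chars), lengths, batch_first=True, enforce_sorted=False
    )
    # h_n holds the last hidden states, c_n the last cell states.
    _, (h_n, c_n) = lstm(packed)
    word_vectors = h_n[-1]  # shape: (batch, hidden_size)
```

Because the batch is packed, the vector for the second word is the same as the one obtained by running the LSTM on its two real characters alone.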
Fixed
- The first word in the word embeddings lexer vocabulary is not used as padding anymore and has a
  real embedding.
- BERT embeddings are now correctly computed with an attention mask to ignore padding.
- The root token embedding coming from BERT lexers is now an average of non-padding words'
  embeddings.
- FastText embeddings are now computed by averaging over non-padding subwords' embeddings.
- In server mode, models are now correctly in eval mode and processing is done in
  `torch.inference_mode`.
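
The averaging fixes above both come down to a mean that ignores padding positions. The following is a rough sketch of that idea under assumed toy shapes. The `masked_mean` helper and the example tensors are hypothetical and are not taken from the parser's code.

```python
import torch

# Hypothetical toy batch: 2 sequences, 3 positions, embedding size 2.
embeddings = torch.tensor(
    [
        [[1.0, 2.0], [3.0, 4.0], [100.0, 100.0]],  # last position is padding
        [[2.0, 2.0], [4.0, 4.0], [6.0, 6.0]],
    ]
)
# 1 marks a real position, 0 marks padding.
mask = torch.tensor([[1, 1, 0], [1, 1, 1]])


def masked_mean(embeddings: torch.Tensor, mask: torch.Tensor) -> torch.Tensor:
    """Average embeddings over the sequence dimension, skipping padding."""
    weights = mask.unsqueeze(-1).to(embeddings.dtype)
    summed = (embeddings * weights).sum(dim=1)
    # Clamp so a fully padded row does not divide by zero.
    counts = weights.sum(dim=1).clamp(min=1.0)
    return summed / counts


averaged = masked_mean(embeddings, mask)
# averaged is [[2.0, 3.0], [4.0, 4.0]]: the padded [100.0, 100.0] row is ignored.
```

A plain `embeddings.mean(dim=1)` would instead fold the padding vector into the first average, which is the kind of error these fixes address.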
Full Changelog: v0.4.2...v0.5.0