0.3.0

@tanaysoni tanaysoni released this 28 Oct 13:50
· 538 commits to master since this release

Major Changes

Adding RoBERTa & XLNet

Welcome RoBERTa and XLNet on the FARM 🎉!
We refactored FARM extensively to make it easier to add more language models. However, we will only add models that offer clear advantages. One of the next models to follow will very likely be ALBERT ...

For now, we support RoBERTa/XLNet for (multilabel) text classification, text regression and NER. QA will follow soon.

⚠️ Breaking Change - Loading of language models has changed:
Bert.load("bert-base-cased") -> LanguageModel.load("bert-base-cased")
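
A single generic `load` entry point implies some way of picking the right model class from the name. The sketch below shows one way this could work and is not FARM's actual code: all class names are hypothetical stand-ins, and the substring-matching rule is an assumption. It also shows why the order of checks matters, since "roberta" contains "bert".

```python
# Toy sketch of name-based dispatch. All class names here are hypothetical
# stand-ins for illustration; they are not FARM's real classes.

class ToyLanguageModel:
    """Base class exposing one generic entry point for all model types."""

    def __init__(self, name):
        self.name = name

    @classmethod
    def load(cls, name):
        lowered = name.lower()
        # "roberta" must be checked before "bert" because the former
        # contains the latter as a substring.
        candidates = (("xlnet", ToyXLNet), ("roberta", ToyRoberta), ("bert", ToyBert))
        for key, model_cls in candidates:
            if key in lowered:
                return model_cls(name)
        raise ValueError(f"Cannot infer model type from name: {name!r}")


class ToyBert(ToyLanguageModel):
    pass


class ToyRoberta(ToyLanguageModel):
    pass


class ToyXLNet(ToyLanguageModel):
    pass
```

With this pattern, callers only ever reference the base class, so supporting a new model type means adding a subclass and one entry in the lookup.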

Migrating to tokenizers from the transformers repo.

Pros:

  • It's quite easy to add a tokenizer for any of the models implemented in transformers.
  • We would rather support the development there than build something in parallel
  • The additional metadata during tokenization (offsets, start_of_word) is still created via tokenize_with_metadata
  • We can use encode_plus to add model-specific special tokens (CLS, SEP ...)
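
To make the metadata mentioned above concrete, here is a toy sketch of what per-token offsets and start_of_word flags look like. It is not FARM's tokenize_with_metadata: the function name, the fixed-length splitting rule, the "##" continuation prefix and the return format are all assumptions chosen for illustration.

```python
def toy_tokenize_with_metadata(text, max_piece_len=4):
    """Split text into fake subword pieces and record metadata per piece.

    Hypothetical helper: a real subword tokenizer uses a learned vocabulary,
    whereas this one simply cuts each word into fixed-length chunks.
    """
    tokens, offsets, start_of_word = [], [], []
    pos = 0
    for word in text.split():
        # Character position of this word in the original text.
        word_start = text.index(word, pos)
        for i in range(0, len(word), max_piece_len):
            piece = word[i:i + max_piece_len]
            # Mark continuation pieces with "##", as WordPiece does.
            tokens.append(piece if i == 0 else "##" + piece)
            offsets.append(word_start + i)
            start_of_word.append(i == 0)
        pos = word_start + len(word)
    return {"tokens": tokens, "offsets": offsets, "start_of_word": start_of_word}


result = toy_tokenize_with_metadata("FARM tokenizes text")
# result["tokens"]        -> ["FARM", "toke", "##nize", "##s", "text"]
# result["offsets"]       -> [0, 5, 9, 13, 15]
# result["start_of_word"] -> [True, True, False, False, True]
```

Metadata of this kind lets token-level predictions, for example in NER, be mapped back to words and character positions in the original text.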

Cons:

  • We had to deprecate our attribute "never_split_chars", which allowed adjusting the BasicTokenizer of BERT.
  • Custom vocab is now implemented by increasing vocab_size instead of replacing unused tokens
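
The difference between the two custom-vocab strategies can be sketched on a plain dictionary. This is a minimal illustration, not FARM's implementation: both function names are hypothetical, and the vocabulary is assumed to map tokens to contiguous ids starting at 0, with placeholder entries named like "[unused0]".

```python
def replace_unused_tokens(vocab, new_tokens):
    """Old strategy: overwrite placeholder entries, so the size stays constant."""
    updated = dict(vocab)
    # Collect placeholder slots in id order.
    unused = sorted((idx, tok) for tok, idx in vocab.items() if tok.startswith("[unused"))
    if len(new_tokens) > len(unused):
        raise ValueError("Not enough unused slots for the new tokens")
    for (idx, old_tok), new_tok in zip(unused, new_tokens):
        del updated[old_tok]
        updated[new_tok] = idx  # the new token inherits the placeholder's id
    return updated


def extend_vocab(vocab, new_tokens):
    """New strategy: append tokens at the end, so the size grows."""
    updated = dict(vocab)
    for tok in new_tokens:
        if tok not in updated:
            # Assumes ids are contiguous, so the next free id equals the size.
            updated[tok] = len(updated)
    return updated


vocab = {"[PAD]": 0, "[unused0]": 1, "[unused1]": 2, "hello": 3}
replaced = replace_unused_tokens(vocab, ["farm"])  # "farm" takes id 1, size stays 4
extended = extend_vocab(vocab, ["farm"])           # "farm" takes id 4, size becomes 5
```

One consequence of the second strategy is that the vocabulary size changes, so the model's embedding matrix generally has to be resized to match.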

⚠️ Breaking Change - Loading of tokenizers has changed:
BertTokenizer.from_pretrained("bert-base-cased") -> Tokenizer.load("bert-base-cased")

⚠️ Breaking Change - never_split_chars:
never_split_chars is no longer supported as an argument for the Tokenizer
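
For codebases with many call sites, the three breaking changes above could be handled with a small helper along these lines. This is only a sketch: the `migrate_source` function is hypothetical and not part of FARM, it rewrites the two call patterns listed above by plain text substitution, and it only flags (does not remove) uses of never_split_chars. Import statements are left untouched and need to be updated by hand.

```python
import re

# Old call pattern -> new call pattern, taken from the breaking changes above.
RENAMES = [
    (re.compile(r"\bBertTokenizer\.from_pretrained\("), "Tokenizer.load("),
    (re.compile(r"\bBert\.load\("), "LanguageModel.load("),
]


def migrate_source(source):
    """Return (rewritten_source, warnings) for a string of Python code."""
    for pattern, replacement in RENAMES:
        source = pattern.sub(replacement, source)
    warnings = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        # The argument has no replacement, so a human has to decide what to do.
        if "never_split_chars" in line:
            warnings.append(f"line {lineno}: never_split_chars is no longer supported")
    return source, warnings


old_code = (
    'model = Bert.load("bert-base-cased")\n'
    'tok = BertTokenizer.from_pretrained("bert-base-cased", never_split_chars=["-"])\n'
)
new_code, warnings = migrate_source(old_code)
```

Because the substitution is purely textual, the result should be reviewed before committing, especially where the old names appear in comments or strings.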


Modelling:

  • [enhancement] Add Roberta, XLNet and redesign Tokenizer #125
  • [bug] fix loading of old tokenizer style #129

Data Handling:

  • [bug] Fix name of squad labels in experiment config #121
  • [bug] change arg in squadprocessor from labels to label_list #123

Inference:

  • [enhancement] Add option to disable multiprocessing in Inferencer(#117) #128
  • [bug] Fix logging verbosity in Inferencer (#117) #122

Other:

  • [enhancement] Tutorial update #116
  • [enhancement] Update docs for api/ui docker #118