
GH-873: PyTorch-Transformers update #941

Merged · 6 commits merged into master from GH-873-pytorch-transformers on Aug 7, 2019
Conversation

stefan-it (Member) commented Aug 1, 2019

Hi,

this PR updates the old pytorch-pretrained-BERT library to the latest version of pytorch-transformers to support various new Transformer-based architectures for embeddings.

A total of 7 (new/updated) embeddings can be used in Flair now:

from flair.embeddings import (
    BertEmbeddings,
    OpenAIGPTEmbeddings,
    OpenAIGPT2Embeddings,
    TransformerXLEmbeddings,
    XLNetEmbeddings,
    XLMEmbeddings,
    RoBERTaEmbeddings,
)

bert_embeddings = BertEmbeddings()
gpt1_embeddings = OpenAIGPTEmbeddings()
gpt2_embeddings = OpenAIGPT2Embeddings()
txl_embeddings = TransformerXLEmbeddings()
xlnet_embeddings = XLNetEmbeddings()
xlm_embeddings = XLMEmbeddings()
roberta_embeddings = RoBERTaEmbeddings()
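
For reference, a minimal usage sketch showing how any of these embedding classes is then applied to a sentence (standard Flair usage; the sentence text is just an example):

from flair.data import Sentence

# create an example sentence (arbitrary text, for illustration only)
sentence = Sentence("Berlin is a beautiful city .")

# embed all tokens in the sentence with the BERT embeddings instantiated above
bert_embeddings.embed(sentence)

for token in sentence:
    print(token.text, token.embedding.shape)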

Detailed benchmarks on the downsampled CoNLL-2003 NER dataset for English can be found in #873. This PR is the first working attempt to include various new Transformer-based embeddings.

Unit tests can be executed with pytest --runslow tests. The unit tests for the Transformer embeddings take ~4 minutes on a GPU.

stefan-it changed the title from "GH-873: PyTorch-Transformers update" to "WIP: GH-873: PyTorch-Transformers update" on Aug 2, 2019
The following Transformer-based architectures are now supported
via pytorch-transformers:

- BertEmbeddings (Updated API)
- OpenAIGPTEmbeddings (Updated API, various fixes)
- OpenAIGPT2Embeddings (New)
- TransformerXLEmbeddings (Updated API, tokenization fixes)
- XLNetEmbeddings (New)
- XLMEmbeddings (New)
- RoBERTaEmbeddings (New, via torch.hub module)

It is also possible to use a scalar mix of specified layers from the
Transformer-based models. Scalar mix was proposed by Liu et al. (2019).
The scalar mix implementation is copied and slightly modified from
the allennlp repo (Apache 2.0 license).
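
A small sketch of how the scalar mix option might be enabled when instantiating one of the embeddings; the parameter names layers and use_scalar_mix are assumptions based on this description and may differ from the final API:

from flair.embeddings import XLNetEmbeddings

# "layers" (assumed name): comma-separated indices of the Transformer layers to use
# "use_scalar_mix" (assumed name): enables the learned scalar mixture (Liu et al., 2019)
xlnet_embeddings = XLNetEmbeddings(layers="1,2,3,4", use_scalar_mix=True)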
stefan-it changed the title from "WIP: GH-873: PyTorch-Transformers update" to "GH-873: PyTorch-Transformers update" on Aug 4, 2019
try:
    self.model = torch.hub.load("pytorch/fairseq", model)
except:
    log_line(log)
Collaborator commented:
log_line needs to be imported, otherwise this fails.

from flair.training_utils import log_line
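
For clarity, a sketch of what the loading block looks like with the suggested import in place (assuming log is the module-level logger already used in flair/embeddings.py; catching Exception and re-raising are illustrative tweaks, not part of the original hunk):

from flair.training_utils import log_line

try:
    # load the RoBERTa model via the torch.hub fairseq entry point
    self.model = torch.hub.load("pytorch/fairseq", model)
except Exception:
    # log_line is now imported, so the error path no longer raises a NameError
    log_line(log)
    # re-raise so the failure is not silently swallowed (assumption; the original hunk is truncated here)
    raise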

stefan-it (Member, Author) replied:
Fixed :) I think I have to change my PyCharm default theme...

# Use the dict.copy() method to avoid modifying the original state.
state = self.__dict__.copy()
# Remove the unpicklable entries.
state["model"] = None
Collaborator commented:
I think this line does nothing, since "model" is part of "_modules". However, the saved model is still huge, which is strange because in __setstate__ the RoBERTa model is re-loaded from torch.hub.
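
For context, the idiom under discussion looks roughly like this; a sketch of the __getstate__/__setstate__ pattern on a plain class (for an nn.Module subclass the submodule lives in _modules, which is exactly the point above), not the actual Flair implementation:

import torch

class HubBackedEmbeddings:
    """Sketch: drop a torch.hub model before pickling and reload it on unpickling."""

    def __init__(self, model_name: str = "roberta.base"):
        self.model_name = model_name
        self.model = torch.hub.load("pytorch/fairseq", model_name)

    def __getstate__(self):
        # copy the dict so the live object is not modified
        state = self.__dict__.copy()
        # drop the heavy / unpicklable model before serialization
        state["model"] = None
        return state

    def __setstate__(self, state):
        self.__dict__.update(state)
        # re-load the model from torch.hub after unpickling
        self.model = torch.hub.load("pytorch/fairseq", self.model_name)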

stefan-it (Member, Author) commented Aug 7, 2019:
@alanakbik I compared the model sizes for BERT and RoBERTa:

BERT (base): 429MB
RoBERTa (base): 487MB

I will check the state["model"] now.

However, there's an upcoming PR in the PyTorch-Transformers repo that adds RoBERTa 🔥 So in the near future it won't be necessary to use the torch.hub wrapper here :)

Collaborator replied:
Ah ok, in this case don't spend too much time on this. It works already, so no need to fix something that will be fixed upstream :)

stefan-it (Member, Author) replied:
Great, I'll just leave it as it is now and will update the RoBERTaEmbeddings implementation whenever it is available in pytorch-transformers!

alanakbik (Collaborator):
👍

yosipk (Collaborator) commented Aug 7, 2019:
👍

yosipk merged commit 4db491d into master on Aug 7, 2019
yosipk deleted the GH-873-pytorch-transformers branch on August 7, 2019 at 09:35
alanakbik (Collaborator):
Awesome - thank you @stefan-it!!

Hellisotherpeople:
Yes I love this!!!!!

MarcioPorto mentioned this pull request on Aug 9, 2019
berfubuyukoz:
You are great! @stefan-it Thank you for your generosity!!!
