Hi all, and thanks for the cool contribution!
Now that the PR is merged into transformers, I am trying to include your model in the simpletransformers repository so I can use it in my project.
I have read in the README that BERTweet has a BERT-base configuration (and shares its pre-training procedure with RoBERTa). So why is it associated with a RobertaConfig in src/transformers/tokenization_auto.py (in TOKENIZER_MAPPING, see the changed files in the PR)? Shouldn't we use a BertConfig instead?
Also, when loading the weights to fine-tune on a downstream text classification task, should we use BertForSequenceClassification or RobertaForSequenceClassification?
Thanks a lot in advance.
Hi, BERTweet has the same architecture as BERT-base (and thus as RoBERTa-base), i.e. the same number of layers, hidden size, and so on. However, BERTweet follows RoBERTa's pre-training procedure, so you should use RobertaConfig and RobertaForSequenceClassification.
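For concreteness, here is a minimal sketch of what that looks like with the RoBERTa classes, as suggested above. The checkpoint name "vinai/bertweet-base" and num_labels=2 are illustrative assumptions, not details stated in this thread:

```python
# Minimal sketch: load BERTweet with the RoBERTa classes for sequence classification.
# "vinai/bertweet-base" and num_labels=2 are assumed here for illustration.
from transformers import AutoTokenizer, RobertaConfig, RobertaForSequenceClassification

model_name = "vinai/bertweet-base"

# AutoTokenizer resolves BERTweet's own tokenizer for this checkpoint.
tokenizer = AutoTokenizer.from_pretrained(model_name)

# Use the RoBERTa config/model classes, since BERTweet follows RoBERTa's
# pre-training procedure (only the architecture sizes match BERT-base).
config = RobertaConfig.from_pretrained(model_name, num_labels=2)
model = RobertaForSequenceClassification.from_pretrained(model_name, config=config)

# Quick sanity check on a single example.
inputs = tokenizer("SC has first two presumptive cases of coronavirus", return_tensors="pt")
outputs = model(**inputs)
print(outputs.logits.shape)  # torch.Size([1, 2]) with num_labels=2
```

The same idea should carry over to simpletransformers by registering BERTweet under the RoBERTa model type rather than BERT.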