Hi all, and thanks for the cool contribution!
Now that the PR is merged into transformers, I am trying to include your model in the simpletransformers repository so I can use it in my project.
I have read in the README that BERTweet has a BERT-base configuration (and shares its pre-training procedure with RoBERTa). So why is it associated with a RobertaConfig in src/transformers/tokenization_auto.py (in TOKENIZER_MAPPING, see the changed files in the PR)? Shouldn't we use a BertConfig instead?
Also, when loading the weights to fine-tune on a downstream text classification task, should we use BertForSequenceClassification or RobertaForSequenceClassification?
Thanks a lot in advance.
Hi, BERTweet has the same architecture as BERT-base (and thus as RoBERTa-base), i.e. the same number of layers, hidden size, and so on. However, BERTweet follows RoBERTa's pre-training procedure, so you should use RobertaConfig and RobertaForSequenceClassification.
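For concreteness, here is a minimal sketch of what that looks like with the RoBERTa classes, as suggested above. The checkpoint name "vinai/bertweet-base" and num_labels=2 are illustrative assumptions, not details stated in this thread:

```python
# Minimal sketch: load BERTweet with the RoBERTa classes for sequence classification.
# "vinai/bertweet-base" and num_labels=2 are assumed here for illustration.
from transformers import AutoTokenizer, RobertaConfig, RobertaForSequenceClassification

model_name = "vinai/bertweet-base"

# AutoTokenizer resolves BERTweet's own tokenizer for this checkpoint.
tokenizer = AutoTokenizer.from_pretrained(model_name)

# Use the RoBERTa config/model classes, since BERTweet follows RoBERTa's
# pre-training procedure (only the architecture sizes match BERT-base).
config = RobertaConfig.from_pretrained(model_name, num_labels=2)
model = RobertaForSequenceClassification.from_pretrained(model_name, config=config)

# Quick sanity check on a single example.
inputs = tokenizer("SC has first two presumptive cases of coronavirus", return_tensors="pt")
outputs = model(**inputs)
print(outputs.logits.shape)  # torch.Size([1, 2]) with num_labels=2
```

The same idea should carry over to simpletransformers by registering BERTweet under the RoBERTa model type rather than BERT.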