Config and SequenceClassification #17

Closed
manueltonneau opened this issue Sep 27, 2020 · 1 comment

@manueltonneau

Hi all, and thanks for the cool contribution,

Now that the PR has been merged into transformers, I am trying to add your model to the simpletransformers repository so that I can use it in my project.

  • I read in the README that BERTweet has a BERT-base configuration (and shares its pre-training procedure with RoBERTa). Why, then, is it associated with RobertaConfig in src/transformers/tokenization_auto.py (in TOKENIZER_MAPPING; see the changed files in the PR)? Shouldn't it use BertConfig instead?
  • When loading the weights to fine-tune on a downstream text classification task, should we use BertForSequenceClassification or RobertaForSequenceClassification?

Thanks a lot in advance.

@datquocnguyen
Member

datquocnguyen commented Sep 27, 2020

Hi, BERTweet has the same architecture as BERT-base (and as RoBERTa-base), i.e. the same number of layers, hidden size, etc. However, BERTweet follows RoBERTa's pre-training procedure, so you should use both RobertaConfig and RobertaForSequenceClassification.

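For anyone landing here later, a minimal sketch of what that advice looks like in code. The vinai/bertweet-base checkpoint name and num_labels=2 (an illustrative binary task) are assumptions; adapt both to your own setup.

```python
import torch
from transformers import AutoTokenizer, RobertaConfig, RobertaForSequenceClassification

MODEL_NAME = "vinai/bertweet-base"  # assumed checkpoint name on the Hugging Face hub

# BERTweet follows RoBERTa's pre-training procedure, so the Roberta* classes apply.
# num_labels=2 is an assumption for a binary classification task.
config = RobertaConfig.from_pretrained(MODEL_NAME, num_labels=2)
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = RobertaForSequenceClassification.from_pretrained(MODEL_NAME, config=config)

# Quick sanity check: the (randomly initialized) classification head produces
# one logit per label; fine-tuning on labeled tweets is what trains this head.
inputs = tokenizer("SC has first two presumptive cases of coronavirus", return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)
logits = outputs[0]  # shape: (batch_size, num_labels)
print(logits.shape)
```

The same reasoning applies in whichever wrapper library you use: treat BERTweet as a RoBERTa model, not a BERT one.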