Did you try to fine-tune transformers LM with Ranger? #13

Recent transformer architectures are very prominent in NLP: BERT, GPT-2, RoBERTa, XLNet. Did you try to fine-tune any of them on an NLP task? If so, what were the best Ranger hyper-parameters and learning-rate scheduler?
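The thread never settles on best settings, but for concreteness, here is a minimal sketch of wiring Ranger into a Transformers fine-tuning setup with a flat-then-cosine learning-rate schedule (a schedule often paired with Ranger). The model name, hyper-parameters, and step counts are illustrative assumptions, not values confirmed in this thread; `Ranger` is assumed to be the optimizer class exported by this repository.

```python
import math

from torch.optim.lr_scheduler import LambdaLR
from transformers import AutoModelForSequenceClassification

from ranger import Ranger  # assumed: the optimizer class from this repository

model = AutoModelForSequenceClassification.from_pretrained("bert-base-uncased")

# Illustrative hyper-parameters -- not tuned values from this thread.
optimizer = Ranger(model.parameters(), lr=1e-5, alpha=0.5, k=6)

# Flat-then-cosine schedule: hold the learning rate for the first 70% of
# training, then anneal it to zero along a half cosine.
total_steps = 10_000
flat_steps = int(0.7 * total_steps)

def flat_then_cosine(step: int) -> float:
    """Multiplier applied to the base learning rate at each step."""
    if step < flat_steps:
        return 1.0
    progress = (step - flat_steps) / max(1, total_steps - flat_steps)
    return 0.5 * (1.0 + math.cos(math.pi * progress))

scheduler = LambdaLR(optimizer, lr_lambda=flat_then_cosine)
# In the training loop, call optimizer.step() then scheduler.step() once per batch.
```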
Comments
Testing on XLNet should be prioritised, as it is the current state of the art.
@avostryakov I tried fine-tuning a BERT-based model for joint NER and relation classification. It performs about 1.5% worse on my tasks than the AdamW implementation in Transformers.

[training curves: AdamW]
[training curves: Ranger]

It is possible that with more tuning I might be able to close the gap. If anyone else has any tips for fine-tuning BERT with Ranger, please let me know!
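For anyone who wants to reproduce this kind of comparison, here is a minimal sketch of the optimizer swap, again assuming the `Ranger` class from this repository. The model checkpoint, label count, learning rate, and dummy batch are illustrative assumptions, not the commenter's actual setup.

```python
import torch
from transformers import AutoModelForTokenClassification, AutoTokenizer

from ranger import Ranger  # assumed: the optimizer class from this repository

# Illustrative stand-in for a BERT-based NER model (not the commenter's
# joint NER + relation-classification architecture).
model = AutoModelForTokenClassification.from_pretrained(
    "bert-base-cased", num_labels=9
)
tokenizer = AutoTokenizer.from_pretrained("bert-base-cased")

# Baseline the commenter compared against:
# from transformers import AdamW
# optimizer = AdamW(model.parameters(), lr=3e-5, weight_decay=0.01)

# The swap: same parameters handed to Ranger instead. The base learning rate
# likely needs retuning -- reusing AdamW's value is exactly the kind of
# under-tuning that could explain part of the reported ~1.5% gap.
optimizer = Ranger(model.parameters(), lr=3e-5)

# Dummy batch so the loop below actually runs; real training would iterate
# over a DataLoader.
batch = tokenizer(["Ranger versus AdamW"], return_tensors="pt")
batch["labels"] = torch.zeros_like(batch["input_ids"])

model.train()
for _ in range(3):  # stand-in for the real training loop
    optimizer.zero_grad()
    loss = model(**batch).loss
    loss.backward()
    optimizer.step()
```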
I'm working with DETR, which does object detection with a transformer internally, and will test Ranger there soon.
How does Ranger perform for DETR?