Fine-tune with Transformer-XL pretrained models #31
Comments
Yes, I believe it's a good direction. Transformer-XL is presumably good for document-level representations thanks to its ability to handle long context. On short text, Transformer-XL might also have an edge (see the results on One Billion Word).
Hi! Thank you so much for your time! I'm trying to learn more about Transformer-XL. Could you please help answer this beginner question? Is it possible to change the loss function and fine-tune the Transformer-XL model to do classification, similar to BERT? And is it the case that for Transformer-XL there isn't one general pretrained model as with BERT, but instead many pretrained models for different tasks? Thank you so much for your time, and God bless!
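For concreteness, here is a minimal sketch of what such a fine-tuning setup could look like, assuming the Hugging Face `transformers` package; `TransfoXLModel` and the `transfo-xl-wt103` checkpoint stand in for whatever pretrained weights are actually used. The idea is the same as BERT fine-tuning: pool the backbone's hidden states and train a linear head with a cross-entropy loss.

```python
# Minimal sketch (not the authors' code): a classification head on a pretrained
# Transformer-XL backbone. Assumes Hugging Face `transformers`; the checkpoint
# name is an illustrative assumption.
import torch
import torch.nn as nn
from transformers import TransfoXLModel, TransfoXLTokenizer

class TransfoXLClassifier(nn.Module):
    def __init__(self, num_labels: int):
        super().__init__()
        self.backbone = TransfoXLModel.from_pretrained("transfo-xl-wt103")
        self.head = nn.Linear(self.backbone.config.d_model, num_labels)

    def forward(self, input_ids):
        # last_hidden_state: [batch, seq_len, d_model]
        hidden = self.backbone(input_ids=input_ids).last_hidden_state
        pooled = hidden.mean(dim=1)  # mean-pool: Transformer-XL has no [CLS] token
        return self.head(pooled)     # logits for a cross-entropy loss

tokenizer = TransfoXLTokenizer.from_pretrained("transfo-xl-wt103")
model = TransfoXLClassifier(num_labels=2)
input_ids = tokenizer("an example document to classify", return_tensors="pt")["input_ids"]
loss = nn.functional.cross_entropy(model(input_ids), torch.tensor([1]))
loss.backward()  # gradients flow into the pretrained weights: end-to-end fine-tuning
```

Mean-pooling is one choice among several (last-token or max-pooling would also work); it is used here only because Transformer-XL, unlike BERT, was not pretrained with a dedicated classification token.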
@BoPengGit did you manage to create a Transformer-XL-based classifier like BERT?
I don't remember; this was a long time ago.
Is it possible to classify a document of ~30k tokens/words using Transformer-XL?
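In principle this is what the segment-level recurrence is for: feed the document in fixed-size segments and carry the `mems` cache between them, so each segment attends over cached context from earlier ones. A rough sketch, again assuming the Hugging Face `TransfoXLModel` (the segment size of 512 and the mean-pooling are illustrative choices, not tuned recommendations):

```python
# Sketch: processing a very long document in segments while carrying
# Transformer-XL's recurrence memory across them. Checkpoint name and
# segment size are assumptions for illustration.
import torch
from transformers import TransfoXLModel

backbone = TransfoXLModel.from_pretrained("transfo-xl-wt103")
input_ids = torch.randint(0, backbone.config.vocab_size, (1, 30_000))  # dummy 30k-token document

mems = None
with torch.no_grad():  # inference only; fine-tuning would need truncated backprop per segment
    for segment in input_ids.split(512, dim=1):  # fixed-size segments
        out = backbone(input_ids=segment, mems=mems)
        mems = out.mems  # cached hidden states become context for the next segment
pooled = out.last_hidden_state.mean(dim=1)  # representation to feed a classification head
```

Note that the memory length bounds how far back the effective context reaches, so a 30k-token document is processed incrementally rather than attended over all at once.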
Hi, thanks for your excellent work. Transformer-XL is the most elegant model for long sequences to date. Do you plan to fine-tune the pretrained models for document classification, as was done with BERT?