Pretraining instruction #41
Comments
@JiachengLi1995 It seems the steps to pretrain are fairly straightforward. You can execute:

```
python -m luke.cli build-dump-db enwiki-latest-pages-articles.xml.bz2 ./enwiki_latest
python -m luke.cli build-entity-vocab ./enwiki_latest ./output_ent-vocab.jsonl
```

Do note that I am using the roberta-base tokenizer; you can use others, such as bert-base-cased.

```
python -m luke.cli build-wikipedia-pretraining-dataset ./enwiki_latest roberta-base ./output_ent-vocab.jsonl ./wikipedia_pretrain_dataset
```
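Before launching the long-running steps below, a quick sanity check of the entity vocabulary can save a wasted run. A minimal sketch, assuming the `.jsonl` extension means one JSON object per line (the exact schema is whatever `build-entity-vocab` emitted):

```python
# Sanity-check the entity vocabulary produced by build-entity-vocab.
import json

with open("output_ent-vocab.jsonl", encoding="utf-8") as f:
    entries = [json.loads(line) for line in f]

print(f"{len(entries)} vocabulary entries")
print("first entry:", entries[0])
```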
```
python -m luke.cli pretrain \
    ./wikipedia_pretrain_dataset \
    luke-roberta-base \
    --bert-model-name roberta-base \
    --entity-emb-size 300 \
    --batch-size 28 \
    --gradient-accumulation-steps 1 \
    --learning-rate 0.0005 \
    --warmup-steps 10000 \
    --log-dir logs/luke-roberta-base
```
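The `--batch-size` and `--gradient-accumulation-steps` flags trade memory for steps: the effective batch per optimizer update is their product (times the number of GPUs). A minimal sketch of that arithmetic, with the paper's target batch size as an assumption to verify against Appendix A:

```python
# Effective-batch arithmetic behind the two flags above.
per_gpu_batch = 28    # matches --batch-size above
num_gpus = 1          # assumption: single GPU
target_batch = 2048   # assumption: the effective batch reported in the paper

accum_steps = max(1, round(target_batch / (per_gpu_batch * num_gpus)))
print(f"--gradient-accumulation-steps {accum_steps}")  # -> 73
```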
@theblackcat102 Thank you for providing the pretraining instructions for LUKE! Following these steps with the hyperparameters specified in the original paper should produce a model that performs similarly to our pretrained model.
Hi @ikuyamada, sorry to bother you again. I wanted to fine-tune the model on the CoNLL-2003 dataset to see if it runs. On the entire dataset it begins the training phase, but it takes 50 hours per epoch. So I took only the first few samples of the CoNLL dataset, but then I get an AssertionError.
Hi @mgong023, |
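For anyone hitting the same AssertionError: one plausible cause (an assumption, not confirmed in this thread) is that slicing the CoNLL file by raw line count splits a document or sentence in half. CoNLL-2003 files delimit documents with `-DOCSTART-` lines, so a hypothetical helper can truncate at document boundaries instead:

```python
# Hypothetical helper: keep the first n_docs documents of a CoNLL-2003
# file rather than an arbitrary number of lines, so no document or
# sentence is cut in half. Assumes the standard format in which each
# document begins with a "-DOCSTART-" line.
def take_documents(src, dst, n_docs=20):
    kept = 0
    with open(src, encoding="utf-8") as fin, open(dst, "w", encoding="utf-8") as fout:
        for line in fin:
            if line.startswith("-DOCSTART-"):
                kept += 1
                if kept > n_docs:
                    break
            fout.write(line)

take_documents("eng.train", "eng.train.small")
```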
Alright, your recommendation gave me an insight, so I have fixed that error. I now encounter a CUDA out-of-memory error in the second epoch. I use 2x Nvidia Tesla K80. Are the GPUs not powerful enough, or does something go wrong? I also get the following warning at the start:
Hi @mgong023, |
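On hardware like the 2x K80 mentioned above (two dies of roughly 12 GB each), an out-of-memory error usually means the per-step batch is too large; lowering `--batch-size` while raising `--gradient-accumulation-steps` keeps the effective batch constant with less memory. A small sketch to inspect what each GPU actually has, assuming PyTorch is installed:

```python
import torch

# Report per-GPU memory; helps distinguish "GPU too small" from a
# leak that only surfaces in a later epoch.
for i in range(torch.cuda.device_count()):
    props = torch.cuda.get_device_properties(i)
    total = props.total_memory / 1024**3
    used = torch.cuda.memory_allocated(i) / 1024**3
    print(f"GPU {i} ({props.name}): {used:.2f} / {total:.2f} GiB allocated")
```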
Hi, I used the exact same data for the train, validation, and test sets to check whether the model actually learns. However, all scores (precision, recall, F1) stay at zero after fine-tuning. It seems like the model doesn't learn anything at all. Is there a way to fix this?
@mgong023 Hi, did you use the CoNLL-2003 dataset to test the model? If not, I am sorry, but I cannot provide support for a specific use case. Also, this issue may not be related to your question; it is common practice to open a new issue if no existing issue relates to your problem.
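One way to narrow down an all-zero F1 (a debugging suggestion, not from the maintainers): score a handful of predictions with an independent span-level scorer such as seqeval, to separate "the model predicts nothing" from "the evaluation is miswired":

```python
# seqeval computes entity-span precision/recall/F1 from IOB tags.
# If identical sequences score 1.0 here while the training pipeline
# still reports 0.0, the bug is in the evaluation plumbing rather
# than in the model.
from seqeval.metrics import f1_score

y_true = [["B-PER", "I-PER", "O", "B-LOC"]]
y_pred = [["B-PER", "I-PER", "O", "B-LOC"]]
print(f1_score(y_true, y_pred))  # -> 1.0
```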
@theblackcat102 Following your instructions, I can train a new LUKE on my own dataset. But the weights seem to mismatch when converting them to Hugging Face Transformers. There is no
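For reference, the officially released checkpoints load through the Luke classes in Hugging Face Transformers; a checkpoint pretrained with this repo's CLI would first need its state-dict keys mapped onto that layout (the mapping itself is what is at issue in the comment above). A minimal loading sketch using the public studio-ousia/luke-base weights:

```python
# Load the officially released LUKE weights via Transformers.
from transformers import LukeModel, LukeTokenizer

tokenizer = LukeTokenizer.from_pretrained("studio-ousia/luke-base")
model = LukeModel.from_pretrained("studio-ousia/luke-base")

# Character spans of "Beyoncé" and "Los Angeles" in the input text.
inputs = tokenizer("Beyoncé lives in Los Angeles.",
                   entity_spans=[(0, 7), (17, 28)],
                   return_tensors="pt")
outputs = model(**inputs)
print(outputs.last_hidden_state.shape)         # word-token representations
print(outputs.entity_last_hidden_state.shape)  # entity representations
```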
Hi @theblackcat102 and @ikuyamada, |
Hi @svjan5, We are working on the updated pretraining code and documentation and plan to release them soon. Additionally, as described in our paper (Appendix A), we adopt two-stage pretraining. We freeze the BERT weights (by specifying
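Whatever the exact flag, stage one amounts to freezing the transformer body so that only the entity embeddings train. A generic PyTorch sketch of the idea, using the Transformers LukeModel (the parameter naming is an assumption about how the repo handles this internally):

```python
from transformers import LukeModel

model = LukeModel.from_pretrained("studio-ousia/luke-base")

# Stage one of two-stage pretraining, sketched: keep the pretrained
# transformer weights frozen and train only the entity embeddings.
for name, param in model.named_parameters():
    param.requires_grad = name.startswith("entity_embeddings")

trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
print(f"trainable parameters: {trainable:,}")
```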
Thank you for your prompt reply, @ikuyamada. I will try the two-stage pretraining and will wait for the updated code and instructions.
You meant, Please confirm. Regards |
Hi @patelrajnath,
@svjan5 Thanks! I will let you know when the pretraining instruction is available. |
While you wait, @svjan5, here are instructions I created for pretraining LUKE myself. Note that these are for Icelandic and therefore use data and models suited to that language, something you will probably want to change.
Thanks a lot, @bennigeir. I will definitely give it a try. |
The pretraining instruction is available here. |
Great thanks a lot @ikuyamada ! |
Hi authors,
Awesome work! Thanks for your code and instructions. I recently wanted to pretrain a new LUKE model on my own dataset. Could you write a pretraining instruction so I can learn? Thank you!