Question about QA-fine-tuning #16
Hi @czh17, thanks for your interest. Can you give more details on how you trained/pretrained, i.e. the exact commands you ran and the dataset processing you did? For the results reported in the paper, you need to pretrain on the MetaQA KG, not on Wikidata5M, and subsequently finetune on the QA dataset.
Thank you for your reply.
Yes, I realized the problem you pointed out, so I reproduced the whole experiment again. However, it did not work very well. The details and the exact commands are as follows.

For KGC pretraining on the MetaQA KG:
In this stage I used main_accelerate.py on the main branch for training. I observed that the loss did not decrease monotonically; it would go from small to large and then back to small again, for example (epoch loss: 100 → 500 → 2000 → 90 → 400). I set the learning rate to 1e-5 as well as 1e-6, but the problem did not seem to be alleviated.

For KBQA fine-tuning on MetaQA:
In this stage I also used main_accelerate.py on the main branch for training. For inference, each QA pair in qa_test.text is converted to the form `predict answer: Topic Entity token | question token with NE |\t answer token`. Meanwhile, I rewrote the eval function based on eval_accelerate.py on the apoorv-dump branch; its evaluation criterion is that if the token generated by the model is in the answer list, the answer is judged to be correct. Please let me know if there are any mistakes or details I should have noticed in the above training process. Thanks again for your reply.
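For clarity, here is a minimal sketch of the evaluation criterion described above (generated answer counts as correct if it appears in the gold answer list). The function names `is_correct` and `accuracy` are hypothetical, not taken from the repo, and exact string matching after normalization is an assumption:

```python
def is_correct(predicted: str, gold_answers: list[str]) -> bool:
    """Judge a prediction correct if the generated string matches any
    answer in the gold answer list (exact match after normalization)."""
    pred = predicted.strip().lower()
    return any(pred == ans.strip().lower() for ans in gold_answers)

def accuracy(predictions: list[str], gold: list[list[str]]) -> float:
    """Fraction of predictions judged correct by is_correct."""
    correct = sum(is_correct(p, g) for p, g in zip(predictions, gold))
    return correct / len(predictions)
```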
Hmm, it seems weird that the loss fluctuates like that. Can you please post the exact commands you executed?
Also, I would suggest you take a look at #11 as well, for details on how you can train the model in one go (concatenating the QA and KGC lines). I will try to post the pretrained checkpoints soon as well.
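The "concatenating qa and kgc lines" step could be done with something as simple as the sketch below. The function name and file paths are hypothetical, assuming line-based training files as used with `--split train_kgc_lines`:

```python
def concat_training_lines(kgc_path: str, qa_path: str, out_path: str) -> None:
    """Merge KGC and QA training lines into one file so both
    objectives are seen in a single training run."""
    with open(out_path, "w") as out:
        for path in (kgc_path, qa_path):
            with open(path) as f:
                for line in f:
                    # ensure every line ends with a newline in the merged file
                    out.write(line if line.endswith("\n") else line + "\n")
```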
Yes, this loss fluctuation phenomenon is very confusing to me. Here is the command I executed:

```
python main_accelerate.py --save_prefix MetaQA_kgc_200_epoch --model_size base --dataset data_kgqa/MetaQA_1hop_half --split train_kgc_lines --batch_size 64 --save_steps 5000 --loss_steps 500 --learning_rate 0.0001
```

In this experiment, I changed line 139 of main_accelerate.py to `T5ForConditionalGeneration.from_pretrained('t5-base')`.
Let me try and get back to you, sorry for the delay |
Would you mind sharing the code for the KBQA fine-tuning? This is very important for my research work; thanks again.
Hi Apoorv, nice work. I have an issue with the QA fine-tuning.
I experimented with the MetaQA dataset using the code under the apoorv-dump branch with the following training details:
However, the best accuracy of my model on the qa_test set was only 40.7% / 12.9% / 26.6% (1-hop / 2-hop / 3-hop).
Am I missing some details in the experiment that make it less accurate? Please let me know. It would be great if you could share a checkpoint with high accuracy.