
Question about QA-fine-tuning #16

Closed
czh17 opened this issue Aug 4, 2022 · 7 comments

@czh17

czh17 commented Aug 4, 2022

Hi Apoorv, nice work! I have a question about QA fine-tuning.
I experimented with the MetaQA dataset using the code on the apoorv-dump branch, with the following training details:

  1. model_size: T5-small
  2. checkpoint: 3330000.pt (KGC results on Wikidata5M: 21.6 Hits@1)
  3. epochs: 60, batch_size: 64
  4. INPUT: 'predict answer: Topic Entity token | question token with NE |' OUTPUT: 'answer token' (built as sketched below)
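Concretely, a minimal sketch of how these input/output strings are built (function and field names are illustrative, not the repo's exact code):

```python
def format_qa_example(question_with_ne: str, topic_entity: str, answer: str):
    """Build a (source, target) pair for T5 QA fine-tuning."""
    source = f"predict answer: {topic_entity} | {question_with_ne} |"
    target = answer
    return source, target

# Example (MetaQA 1-hop style):
src, tgt = format_qa_example(
    question_with_ne="what movies are about [ginger rogers]",
    topic_entity="ginger rogers",
    answer="Top Hat",
)
print(src)  # predict answer: ginger rogers | what movies are about [ginger rogers] |
print(tgt)  # Top Hat
```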

However, the best accuracy of my model on the QA test set was only 40.7%/12.9%/26.6% (1-hop/2-hop/3-hop).
Am I missing some detail in the experiment that makes it less accurate? Please let me know. It would also be great if you could share a checkpoint with high accuracy.

@apoorvumang
Owner

Hi @czh17, thanks for your interest.

Can you give more details on how you trained/pretrained, i.e. the exact commands you ran and the dataset processing you did?

For the results reported in the paper, you need to pretrain on the MetaQA KG, not on Wikidata5M. Subsequently, you need to fine-tune on the QA dataset.

@czh17
Author

czh17 commented Aug 11, 2022

Thank you for your reply.

> For the results reported in the paper, you need to pretrain on the MetaQA KG, not on Wikidata5M. Subsequently, you need to fine-tune on the QA dataset.

Yes, I realized the problem you pointed out, so I re-ran the whole experiment. However, it still did not work very well. The details of the experiment and the exact commands are as follows.

For KGC pretraining on the MetaQA KG:

  • dataset: 'data_kgqa/MetaQA_1hop_half/train_kgc_lines.txt' (only)
  • optimizer: Adafactor
  • learning_rate: 1e-4
  • epochs: 200
  • model_size: small

In this stage, I used main_accelerate.py from the main branch for training. I observed that the loss did not decrease steadily; instead it oscillated from small to large and back again, e.g. epoch loss: 100 -> 500 -> 2000 -> 90 -> 400. I also tried learning rates of 1e-5 and 1e-6, but the problem was not alleviated.
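For reference, my optimizer setup roughly follows the usual transformers Adafactor recipe (a sketch with illustrative names, not the repo's exact code):

```python
from transformers import T5ForConditionalGeneration
from transformers.optimization import Adafactor

model = T5ForConditionalGeneration.from_pretrained("t5-small")

# A fixed learning rate requires relative_step=False; transformers' Adafactor
# cannot combine a manual lr with its default relative_step=True schedule.
optimizer = Adafactor(
    model.parameters(),
    lr=1e-4,
    scale_parameter=False,
    relative_step=False,
    warmup_init=False,
)
```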

For KBQA fine-tuning on MetaQA:

  • dataset: f'data_kgqa/MetaQA_{hops}hop_half/train.txt' (hops in [1, 2, 3]) and 'qa_test.txt'
  • optimizer: Adafactor
  • learning_rate: 1e-4
  • epochs: 60
  • checkpoint: (the KGC checkpoint with the smallest loss).pt, restored as sketched below
  • beam_size: 1
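The checkpoint is restored before fine-tuning roughly as follows (a sketch; the file name and state-dict layout are assumptions, not the repo's exact code):

```python
import torch
from transformers import T5ForConditionalGeneration

# Load the pretrained KGC weights into a fresh T5 before QA fine-tuning.
model = T5ForConditionalGeneration.from_pretrained("t5-small")
state_dict = torch.load("checkpoints/smallest_loss_kgc.pt", map_location="cpu")
model.load_state_dict(state_dict)
```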

In this stage, I also used main_accelerate.py from the main branch for training. For inference, each QA pair in qa_test.txt is converted to the form 'predict answer: Topic Entity token | question token with NE |\t answer token'. Meanwhile, I rewrote the eval function based on eval_accelerate.py from the apoorv-dump branch; its criterion is that an answer is judged correct if the token generated by the model is in the answer list.
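The rewritten eval logic is roughly the following (a sketch assuming a plain Hugging Face tokenizer/model rather than the repo's exact classes):

```python
import torch
from transformers import T5ForConditionalGeneration, T5Tokenizer

tokenizer = T5Tokenizer.from_pretrained("t5-small")
model = T5ForConditionalGeneration.from_pretrained("t5-small").eval()

def is_correct(question_input: str, gold_answers: set) -> bool:
    # Decode with beam size 1; a prediction counts as a hit
    # if it appears in the gold answer set.
    inputs = tokenizer(question_input, return_tensors="pt")
    with torch.no_grad():
        out = model.generate(**inputs, num_beams=1, max_length=64)
    prediction = tokenizer.decode(out[0], skip_special_tokens=True).strip()
    return prediction in gold_answers
```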

Please let me know if there are any mistakes or details that I should have noticed in the above training process. Thanks again for your reply.

@apoorvumang
Owner

Hmm, it seems weird that the loss fluctuates like that. Can you please post the exact commands you executed?

@apoorvumang
Owner

Also, I would suggest you take a look at #11 as well, for details on how you can train the model in one go (concatenating the QA and KGC lines), roughly as sketched below.
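A rough sketch of that concatenation (paths are illustrative; see #11 for the actual recipe):

```python
# Merge the QA and KGC training lines into a single training file.
qa_path = "data_kgqa/MetaQA_1hop_half/train.txt"
kgc_path = "data_kgqa/MetaQA_1hop_half/train_kgc_lines.txt"

with open("data_kgqa/MetaQA_1hop_half/train_combined.txt", "w") as out:
    for path in (qa_path, kgc_path):
        with open(path) as f:
            out.writelines(f)
```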

I will try to post the pretrained checkpoints soon as well.

@czh17
Author

czh17 commented Aug 15, 2022

> Hmm, it seems weird that the loss fluctuates like that. Can you please post the exact commands you executed?

Yes, this loss fluctuation is very confusing to me. Here is the command I executed:

python main_accelerate.py --save_prefix MetaQA_kgc_200_epoch --model_size base --dataset data_kgqa/MetaQA_1hop_half --split train_kgc_lines --batch_size 64 --save_steps 5000 --loss_steps 500 --learning_rate 0.0001

For this experiment, I changed line 139 of main_accelerate.py to T5ForConditionalGeneration.from_pretrained('t5-base').
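i.e. the model construction at that line now reads (variable name illustrative):

```python
from transformers import T5ForConditionalGeneration

# Line 139: load the base-size T5 instead of the default size.
model = T5ForConditionalGeneration.from_pretrained('t5-base')
```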

@apoorvumang
Owner

Let me look into it and get back to you; sorry for the delay.

@czh17
Author

czh17 commented Sep 29, 2022

Would you mind sharing the code for the KBQA fine-tuning? It is very important for my research work. Thanks again.

@czh17 czh17 closed this as completed Oct 28, 2022