Could Anybody Reproduce the Results? The BERT and RoBERT results? (Resolved) #12
Comments
Hi! There are two probable reasons for this issue:
Since our model and its default hyperparameter settings are tuned for fine-tuning BERT-base and RoBERTa-base, it is easier to reproduce the results on the X-base models. We suggest you try this first.
When I was running CDR, I also encountered this problem: the performance was very low, around 62. I did not change any code, only some paths, and I do not know why this happened. Have you solved this problem?
Hi buddy, there are many possible reasons for this situation. I suggest you re-check the following steps; if you have any problems, feel free to contact us:
I hope these tips help you reproduce the results. Thanks!
First of all, thank you very much for your quick reply! Yes, for CDR I did use SciBERT, but I only have one GPU, so I changed the batch size to 2; the other hyperparameters are unchanged, and I ran run_cdr.sh. Can the batch size really affect the results this much? Looking forward to your advice!
I think the major reason may be the batch size. You can watch the loss to check whether the model converges; training for more steps may also give better performance. Besides, you can use fp16 to fit a larger batch size.
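A common workaround when a single GPU cannot hold the intended batch size (not from the repo itself, just a generic PyTorch sketch) is gradient accumulation: average the loss over several small batches before each optimizer step, so the effective batch size matches the paper's setting. The model, optimizer, and data below are toy placeholders standing in for the repo's training loop.

```python
import torch

# Toy stand-ins for the real model, optimizer, and dataloader.
model = torch.nn.Linear(8, 2)
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-3)
accum_steps = 2  # effective batch size = per-GPU batch size * accum_steps

data = [(torch.randn(2, 8), torch.randint(0, 2, (2,))) for _ in range(4)]

optimizer.zero_grad()
for step, (x, y) in enumerate(data, start=1):
    loss = torch.nn.functional.cross_entropy(model(x), y)
    # Divide so the accumulated gradient equals that of one large batch.
    (loss / accum_steps).backward()
    if step % accum_steps == 0:
        optimizer.step()       # one update per effective batch
        optimizer.zero_grad()
```

With `per-GPU batch size = 2` and `accum_steps = 2`, this approximates training with batch size 4 (BatchNorm-style statistics aside, which BERT models do not use).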
Thanks for the advice, but what I don't understand is how the batch size could matter this much. Would it affect the score by 10%+? Have you tried a similar experiment?
Maybe deep learning is just such a hyperparameter-sensitive methodology; we don't want this to happen either. We will try to conduct an analysis of batch size in the future.
Hello, I have the same question as you! I use 1 GPU with batch_size=4 and get F1=0.64, which is even lower than ATLOP with the same hyperparameters.
Hello, do you use the default experimental setting? Some other researchers have already reproduced this performance and even obtained much better results with hyperparameter tuning (such as #13 (comment)). Maybe the following accounts for it: do you use SciBERT-base as the pre-trained language model? If you have any questions, feel free to contact us.
Thanks for your work and reply!
I ran their scripts and only got lower F1 scores: `dev_result: {'dev_F1': 61.39554434636402, 'dev_F1_ign': 59.42344205967282, 'dev_re_p': 63.68710211912444, 'dev_re_r': 59.2631664367443, 'dev_average_loss': 0.3790786044299603}`.