
Question about experimental results #19

Closed
Vincent131499 opened this issue Jul 24, 2020 · 3 comments

Comments

@Vincent131499

Vincent131499 commented Jul 24, 2020

Hello, and thank you for this excellent work.
I ran an experiment on the Ant Financial semantic-similarity corpus from GLUE, with finetune_epochs = 20, distill_epochs = 10, learning_rate = 2e-5, and dev_speed = 0.5. After distillation, dev_acc on the dev set consistently hovers around 0.725.
If I want the post-distillation dev_acc to reach 0.9, should I increase the number of training epochs, or are there other factors at play?
Thanks in advance for your help!
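For readers unfamiliar with the dev_speed = 0.5 setting above: in FastBERT-style self-distillation, "speed" is an uncertainty threshold that lets easy samples exit at an early transformer layer instead of running all 12 layers. A minimal sketch of that mechanism is below, assuming entropy-based exit as described in the FastBERT paper; the function names and the plain-list representation of per-layer classifier outputs are illustrative, not the repository's actual API.

```python
import math

def normalized_entropy(probs):
    """Normalized entropy of a probability distribution (0 = fully certain, 1 = uniform)."""
    n = len(probs)
    h = -sum(p * math.log(p) for p in probs if p > 0)
    return h / math.log(n)

def early_exit_layer(per_layer_probs, speed=0.5):
    """Return the index of the first layer whose student classifier is confident enough.

    A sample exits at the first layer where normalized entropy falls below the
    speed threshold; otherwise it runs through every layer and uses the last one.
    Lower speed values force more samples through the full network (slower, more
    accurate); higher values let more samples exit early.
    """
    for i, probs in enumerate(per_layer_probs):
        if normalized_entropy(probs) < speed:
            return i
    return len(per_layer_probs) - 1
```

Note that this threshold only trades speed against accuracy within what the backbone can achieve: if the fully fine-tuned 12-layer model tops out around 0.725 on this corpus, no choice of speed or distill_epochs will push the distilled model to 0.9, which is the point made in the replies below.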

@autoliuweijie
Owner


May I ask: after fine-tuning only (without self-distillation), with every sample passing through the full 12 layers, what accuracy do you get? First make sure the backbone model's accuracy meets your expectations.

@autoliuweijie
Owner

As far as I know, the state-of-the-art result on the Ant Financial semantic-similarity corpus from GLUE is only around 0.73... so distilling up to 0.9 is essentially impossible.

@Vincent131499
Author

I've been busy these past two days and missed the reply... Yes, after my experiment I checked the GLUE leaderboard and found that the reported accuracy there is also quite low.
Thanks again for open-sourcing this work!
