
Question: have you observed ELECTRA converging faster than BERT/RoBERTa? #2

Closed
nbcc opened this issue Mar 25, 2020 · 7 comments

Comments

@nbcc

nbcc commented Mar 25, 2020

Comparing checkpoints at different pretraining steps: at the same step, the paper shows that ELECTRA's advantage of learning from labels at 100% of positions makes its fine-tuning results significantly better than BERT's.

Did your reproduction confirm this? We are working on a similar strategy, and the convergence speedup is not as pronounced as in the paper.
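A minimal sketch (my illustration, not code from this thread) of the "100% label" point: BERT's MLM loss is computed only at the ~15% masked positions, while ELECTRA's replaced-token-detection (RTD) discriminator receives a training signal at every position.

```python
# Toy comparison of how many positions contribute to the pretraining loss.
# Assumption: the standard 15% masking rate for MLM; sequence length is arbitrary.
import random

random.seed(0)
seq_len = 128
mask_rate = 0.15

# BERT/MLM: loss is computed only at the positions that were masked out.
mlm_positions = [i for i in range(seq_len) if random.random() < mask_rate]

# ELECTRA/RTD: the discriminator classifies EVERY position as original vs. replaced.
rtd_positions = list(range(seq_len))

print(f"MLM positions with loss: {len(mlm_positions)} / {seq_len}")
print(f"RTD positions with loss: {len(rtd_positions)} / {seq_len}")
```

This per-position signal density is the intuition behind ELECTRA's sample efficiency claim; whether it translates into faster *fine-tuning* convergence at matched steps is exactly what this issue asks about.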

@ymcui
Owner

ymcui commented Mar 25, 2020

If you mean Figure 4 in the paper (the performance comparison with BERT at different checkpoints), we don't have that yet.

@nbcc
Author

nbcc commented Mar 25, 2020

> If you mean Figure 4 in the paper (the performance comparison with BERT at different checkpoints), we don't have that yet.

I feel that comparison would be the key validation of the strategy's core idea.

@nbcc
Author

nbcc commented Mar 25, 2020

ELECTRA is still quite interesting.

@ymcui
Owner

ymcui commented Mar 25, 2020

Compared with RoBERTa-small at the 1M-step checkpoint, ELECTRA does come out ahead. For example, on the CMRC 2018 dev set (EM / F1):

ELECTRA-small: 63.4 / 80.8
RoBERTa-small: 58.5 / 80.0

@nbcc
Author

nbcc commented Mar 25, 2020

> Compared with RoBERTa-small at the 1M-step checkpoint, ELECTRA does come out ahead. For example, on the CMRC 2018 dev set (EM / F1):
>
> ELECTRA-small: 63.4 / 80.8
> RoBERTa-small: 58.5 / 80.0

👍. How many total steps was this RoBERTa-small trained for? I'd guess the final results end up roughly the same.

@ymcui
Owner

ymcui commented Mar 25, 2020

The total number of steps was fixed at 1M.
(I rewrote this comment.)
For classification we tested LCQMC and BQ Corpus: ELECTRA-small is better on LCQMC, while RoBERTa-small is better on BQ Corpus. So ELECTRA does not always win, but taken across these tasks it still has a clear edge.

LCQMC
ELECTRA-small: dev 86.7, test 85.9
RoBERTa-small: dev 85.3, test 84.9

BQ Corpus
ELECTRA-small: dev 83.5, test 82.0
RoBERTa-small: dev 84.3, test 83.2
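As a hypothetical sketch of the comparison the opening question asks for (Figure 4-style): fine-tune the checkpoint saved at each pretraining step and see which model leads at matched steps. The intermediate dev scores below are made-up placeholders; only the 1M-step EM values (63.4 vs 58.5) come from the CMRC 2018 numbers in this thread.

```python
# Given per-step dev-score curves, report which model leads at each matched step.
def leader_at_each_step(curve_a, curve_b):
    """Return {step: 'A' or 'B'} for pretraining steps present in both curves."""
    return {
        step: ("A" if curve_a[step] >= curve_b[step] else "B")
        for step in sorted(curve_a.keys() & curve_b.keys())
    }

# Placeholder curves keyed by pretraining step. Only the 1M-step values are
# from the thread (CMRC 2018 dev EM); earlier points are invented for shape.
electra_small = {100_000: 60.0, 500_000: 63.0, 1_000_000: 63.4}
roberta_small = {100_000: 55.0, 500_000: 58.0, 1_000_000: 58.5}

print(leader_at_each_step(electra_small, roberta_small))
```

If ELECTRA's convergence advantage holds, model "A" should lead by a wider margin at early checkpoints and the gap should narrow as both approach 1M steps.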

@nbcc nbcc closed this as completed Mar 26, 2020
@nbcc
Author

nbcc commented Mar 26, 2020

> The total number of steps was fixed at 1M.
> For classification we tested LCQMC and BQ Corpus: ELECTRA-small is better on LCQMC, while RoBERTa-small is better on BQ Corpus. So ELECTRA does not always win, but taken across these tasks it still has a clear edge.
>
> LCQMC
> ELECTRA-small: dev 86.7, test 85.9
> RoBERTa-small: dev 85.3, test 84.9
>
> BQ Corpus
> ELECTRA-small: dev 83.5, test 82.0
> RoBERTa-small: dev 84.3, test 83.2

Thanks for sharing!
