
When will the 6-layer RoBERTa model be released? #22

Closed

Jethu1 opened this issue Sep 11, 2019 · 10 comments

Comments

@Jethu1

Jethu1 commented Sep 11, 2019

No description provided.

@brightmart
Owner

Aiming to release it within this week.

@Jethu1
Author

Jethu1 commented Sep 16, 2019

Can it be released this week? I'm quite eager to try out this small model and see how it performs.

@brightmart
Owner

It has been delayed.

@Shuryne

Shuryne commented Sep 26, 2019

Is there a confirmed release date yet for the 6-layer model and the training corpus? I'd like to try out the small model's performance. Thanks 🙏

@brightmart
Owner

Yes, the 6-layer model is being trained and will be released within the next couple of days.
There are also plans to release a Chinese version of ALBERT soon, the small, high-performance model recently published by Google.

@Shuryne

Shuryne commented Sep 27, 2019

That's great! I only came across ALBERT yesterday; it seems to have been trained with a large amount of TPU compute. It would be wonderful if it could be released. Thank you very much!

@brightmart
Owner

@csy1998 @Jethu1 An ultra-small model is now available to try: its parameter count and model size are about one tenth of BERT's, and training is roughly twice as fast.
https://github.com/brightmart/albert_zh
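To put the "one tenth of BERT" size claim in perspective, here is a minimal sketch using the Hugging Face transformers library (not part of this repo) that counts parameters for a standard BERT-base configuration versus an ALBERT-base-style configuration with factorized embeddings and cross-layer parameter sharing. The hyperparameters below are illustrative assumptions, not the exact albert_zh release configuration.

```python
# Minimal sketch: compare parameter counts of BERT-base vs an ALBERT-base-style
# configuration. Requires: pip install torch transformers
# NOTE: the hyperparameters below are illustrative assumptions, not the exact
# configuration used by the albert_zh checkpoints.
from transformers import AlbertConfig, AlbertModel, BertConfig, BertModel

def count_params(model):
    return sum(p.numel() for p in model.parameters())

# Standard BERT-base (12 layers, hidden size 768, roughly 100M parameters).
bert = BertModel(BertConfig(
    vocab_size=21128,          # bert-base-chinese vocabulary size
    hidden_size=768,
    num_hidden_layers=12,
    num_attention_heads=12,
    intermediate_size=3072,
))

# ALBERT-base-style config: factorized embeddings (embedding_size=128) and
# cross-layer parameter sharing shrink the model to roughly 1/10 of BERT-base.
albert = AlbertModel(AlbertConfig(
    vocab_size=21128,
    embedding_size=128,
    hidden_size=768,
    num_hidden_layers=12,
    num_attention_heads=12,
    intermediate_size=3072,
))

print(f"BERT-base params:   {count_params(bert):,}")
print(f"ALBERT-base params: {count_params(albert):,}")
```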

@KunWangR

KunWangR commented Oct 8, 2019

Hi, may I ask roughly when the 6-layer RoBERTa model will be released?

@brightmart
Owner

@Jethu1 @csy1998 @KunWangR The 6-layer RoBERTa model (preview version) has been released; feel free to try it out and see how it performs. Could you report its results on your tasks?

@KunWangR

On my text similarity matching dataset, the 6-layer RoBERTa model improves accuracy by 2%, and the 6-layer ALBERT model by about 1%. The 4-layer ALBERT model not only fails to improve the task, but actually reduces accuracy by about 0.5%. However, judging from CPU inference speed, only BERT-style models with 4 or fewer layers can meet our CPU latency requirements, and I am worried that distillation would hurt model quality. I hope the author and others can look into the accuracy and performance of 3- to 4-layer RoBERTa models.
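As a rough way to reproduce the CPU latency observation above, here is a minimal sketch (assuming PyTorch and the Hugging Face transformers library, which are not part of this repo) that times CPU inference for randomly initialized BERT-style encoders with different layer counts. The batch size, sequence length, and repetition count are arbitrary assumptions for illustration.

```python
# Minimal sketch: time CPU inference of BERT-style encoders with different
# numbers of layers. Requires: pip install torch transformers
# NOTE: batch size, sequence length, and repetition count are arbitrary
# assumptions; real latency depends on hardware, inputs, and runtime settings.
import time
import torch
from transformers import BertConfig, BertModel

def cpu_latency_ms(num_layers, batch_size=1, seq_len=64, repeats=20):
    config = BertConfig(
        vocab_size=21128,        # bert-base-chinese vocabulary size
        hidden_size=768,
        num_hidden_layers=num_layers,
        num_attention_heads=12,
        intermediate_size=3072,
    )
    model = BertModel(config).eval()
    input_ids = torch.randint(0, config.vocab_size, (batch_size, seq_len))
    with torch.no_grad():
        model(input_ids)                      # warm-up pass
        start = time.perf_counter()
        for _ in range(repeats):
            model(input_ids)
    return (time.perf_counter() - start) / repeats * 1000

for layers in (3, 4, 6, 12):
    print(f"{layers:2d} layers: {cpu_latency_ms(layers):.1f} ms per forward pass")
```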
