
LightNER few-shot experiments on Chinese datasets #127

Closed
kevinuserdd opened this issue Jul 18, 2022 · 5 comments
Labels
question Further information is requested

Comments

@kevinuserdd

Have the authors run few-shot experiments on Chinese datasets? Training on the msra dataset, the best F1 I can reach is only 0.75, while the paper reports about 0.9 on the English CoNLL dataset. In the few-shot setting, my 20-shot F1 is only 0.3, and even at 500-shot it is no better than the 20-shot 0.3. Is such a large gap on Chinese datasets expected?

@kevinuserdd added the question (Further information is requested) label on Jul 18, 2022
@flow3rdown
Contributor

LightNER has not yet been tested on Chinese datasets. I'd suggest tuning the batch size and learning rate, since few-shot performance is quite sensitive to both. In the full-data setting you can also try not freezing bart, and additionally try setting learn_weights to False.

@kevinuserdd
Author

> LightNER has not yet been tested on Chinese datasets. I'd suggest tuning the batch size and learning rate, since few-shot performance is quite sensitive to both. In the full-data setting you can also try not freezing bart, and additionally try setting learn_weights to False.

Do you mean setting the freeze_plm and learn_weights parameters in the yaml file to False?

@flow3rdown
Contributor

> Do you mean setting the freeze_plm and learn_weights parameters in the yaml file to False?

Yes.
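
A minimal sketch of what that change might look like in the training yaml. Only freeze_plm and learn_weights are confirmed by this thread; the file name, the remaining keys, and the values shown for them are illustrative assumptions:

```yaml
# Hypothetical excerpt of the LightNER training config (e.g. train.yaml).
# Only freeze_plm and learn_weights are named in the thread above;
# the other keys and values are assumptions for illustration.
freeze_plm: False     # full-data setting: do not freeze the bart backbone
learn_weights: False  # as suggested by the contributor above
batch_size: 16        # few-shot results are sensitive to this; tune per dataset
learning_rate: 1e-5   # likewise worth tuning
```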

@kevinuserdd
Author

How many epochs are typically needed in the full-data setting? On the Chinese data, precision is above 0.8, but recall stays below 0.5. Do you know what might cause this?

@flow3rdown
Contributor

The CoNLL dataset contains 12.7k training instances; we trained for 30 epochs.
