
How can I avoid freezing the BERT parameters? #4

Closed
VioletJKI opened this issue Jul 6, 2019 · 5 comments

Comments

@VioletJKI

I changed `train_vars` in `grads = tf.gradients(self.loss, train_vars)` to `tvars`, but the results came out as all zeros.

@yumath (Owner) commented Jul 13, 2019

@VioletJKI The BERT parameters are already frozen in the model.
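
For context, here is a minimal sketch of the freezing pattern being discussed, assuming the usual convention that the BERT variables live under a `bert` name scope and are excluded from the list passed to `tf.gradients`. The variable names are illustrative, not the repo's exact code:

```python
import tensorflow as tf

tvars = tf.trainable_variables()

# Frozen BERT: gradients are computed only for the non-BERT variables,
# so the pretrained weights never move during training.
train_vars = [v for v in tvars if not v.name.startswith("bert")]
grads = tf.gradients(self.loss, train_vars)  # self.loss is the model's loss tensor

# Un-frozen BERT (the change described above): compute gradients for
# every trainable variable instead.
# grads = tf.gradients(self.loss, tvars)
```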

yumath closed this as completed Jul 18, 2019

@airship-explorer

> I changed `train_vars` in `grads = tf.gradients(self.loss, train_vars)` to `tvars`, but the results came out as all zeros.

I tried the same approach to update the BERT parameters, and my results were also all zeros. Did you manage to solve it?

@yumath (Owner) commented Nov 12, 2019

@Guohai93 That is indeed the case. My understanding is that BERT has far too many parameters to converge on such a small NER dataset. If you need to update the BERT parameters, I'd suggest first training BERT on your own dataset, and then using it for NER.

@airship-explorer

> @Guohai93 That is indeed the case. My understanding is that BERT has far too many parameters to converge on such a small NER dataset. If you need to update the BERT parameters, I'd suggest first training BERT on your own dataset, and then using it for NER.

I ran a few experiments and found that the 1e-3 learning rate used to train the LSTM is too large: it wrecks the pretrained BERT layers underneath. With a smaller learning rate, such as 1e-4 or lower, the results are no longer all zeros.
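
One way to act on this observation, sketched here under the assumption that the BERT variables sit under a `bert` name scope as in the standard checkpoint (this is not the repo's actual code), is to give the BERT variables their own much smaller learning rate while the LSTM/CRF head keeps the original 1e-3:

```python
import tensorflow as tf

tvars = tf.trainable_variables()
bert_vars = [v for v in tvars if v.name.startswith("bert")]
task_vars = [v for v in tvars if not v.name.startswith("bert")]

# Gradients of the model's loss tensor (self.loss in the snippet above)
# with respect to both groups at once.
grads = tf.gradients(self.loss, bert_vars + task_vars)
bert_grads, task_grads = grads[:len(bert_vars)], grads[len(bert_vars):]

bert_opt = tf.train.AdamOptimizer(learning_rate=1e-5)  # gentle updates for pretrained BERT
task_opt = tf.train.AdamOptimizer(learning_rate=1e-3)  # original rate for the LSTM/CRF head

train_op = tf.group(
    bert_opt.apply_gradients(zip(bert_grads, bert_vars)),
    task_opt.apply_gradients(zip(task_grads, task_vars)),
)
```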

@yumath (Owner) commented Nov 12, 2019

@Guohai93 👍
