
Exercise 4-8 #14

Open
fecet opened this issue Mar 30, 2020 · 3 comments

Comments

@fecet

fecet commented Mar 30, 2020

Initializing w to 0 leaves the neurons in a layer with nothing to distinguish them: they all compute the same gradient and therefore receive the same weight update.
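This symmetry is easy to check numerically. A minimal sketch (hypothetical two-layer net with tanh hidden units and squared-error loss; the constant 0.1 and the layer sizes are illustrative choices, not from the exercise):

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=(4, 3))        # 4 samples, 3 features
y = rng.normal(size=(4, 1))        # regression targets

# Constant initialization: every weight in a layer has the same value.
W1 = np.full((3, 5), 0.1); b1 = np.zeros(5)
W2 = np.full((5, 1), 0.1); b2 = np.zeros(1)

# Forward pass: all 5 hidden units compute the identical activation.
h = np.tanh(x @ W1 + b1)
pred = h @ W2 + b2
dpred = 2 * (pred - y) / len(x)

# Backward pass.
dW2 = h.T @ dpred
dh = dpred @ W2.T
dz1 = dh * (1 - h**2)
dW1 = x.T @ dz1

# Every hidden unit receives exactly the same gradient,
# so one update step cannot break the symmetry.
print(np.allclose(dW1, dW1[:, [0]]))  # True: all 5 columns identical
print(np.allclose(dW2, dW2[0]))       # True: all 5 rows identical
```

Since the gradients are identical column by column, the units stay identical after the update, and the argument repeats at every step.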

@YanHao22

YanHao22 commented Oct 6, 2021

Could this also be a reason? If the loss function has no regularization term, for example a plain cross-entropy loss, then initializing to zero makes the update step zero, so the parameters cannot move at all.
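This is almost right and can be checked directly. A hedged sketch (hypothetical one-hidden-layer net with tanh units and binary cross-entropy; all names are illustrative): with all-zero weights the hidden activations are tanh(0) = 0, so every weight gradient vanishes exactly; only the output bias still gets a nonzero update.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=(8, 4))
y = np.ones((8, 1))               # fixed targets, for a deterministic check

# All-zero initialization, cross-entropy loss, no regularization.
W1 = np.zeros((4, 6)); b1 = np.zeros(6)
W2 = np.zeros((6, 1)); b2 = np.zeros(1)

h = np.tanh(x @ W1 + b1)          # tanh(0) = 0, so h is all zeros
p = 1 / (1 + np.exp(-(h @ W2 + b2)))   # sigmoid(0) = 0.5

# Gradients of binary cross-entropy.
dz2 = (p - y) / len(x)
dW2 = h.T @ dz2                   # zero, because h == 0
db2 = dz2.sum(0)                  # nonzero!
dz1 = (dz2 @ W2.T) * (1 - h**2)   # zero, because W2 == 0
dW1 = x.T @ dz1                   # zero

print(np.allclose(dW1, 0), np.allclose(dW2, 0), float(db2[0]))
# True True -0.5
```

So the claim holds for all the weights, though the output bias does move; the network is still stuck, because a bias alone cannot fit the data.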

@NneurotransmitterR

If the parameters are initialized to all zeros or to the same constant, then in the forward pass every hidden neuron computes the same activation, and in backpropagation every neuron receives the same update. The hidden neurons therefore never differentiate, and the network is effectively equivalent to one with a single hidden neuron.
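To see that the symmetry survives an entire training run, here is a small sketch (assumed architecture, data, and learning rate, for illustration only) that runs plain gradient descent from a constant initialization and then checks that the hidden units are still exact clones of each other:

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.normal(size=(32, 3))
y = np.sin(x.sum(1, keepdims=True))

# Same constant everywhere within each layer.
W1 = np.full((3, 8), 0.05); b1 = np.zeros(8)
W2 = np.full((8, 1), 0.05); b2 = np.zeros(1)
lr = 0.1

for _ in range(200):                       # plain gradient descent
    h = np.tanh(x @ W1 + b1)
    pred = h @ W2 + b2
    dpred = 2 * (pred - y) / len(x)
    dW2 = h.T @ dpred; db2 = dpred.sum(0)
    dz1 = (dpred @ W2.T) * (1 - h**2)
    dW1 = x.T @ dz1;   db1 = dz1.sum(0)
    W1 -= lr * dW1; b1 -= lr * db1
    W2 -= lr * dW2; b2 -= lr * db2

# After 200 updates the 8 hidden units are still identical,
# i.e. the net has the capacity of a single hidden neuron.
print(np.allclose(W1, W1[:, [0]]))  # True
```

Identical units produce identical gradients at every step, so no amount of training breaks the tie; only asymmetric (e.g. random) initialization does.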

@NneurotransmitterR

The distribution used for random initialization also needs care; a poorly scaled distribution can cause vanishing or exploding gradients.
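A rough illustration of why the scale matters (the width of 100, depth of 20, and tanh activation are arbitrary choices for this sketch): pushing data through many layers, a too-small weight scale drives the activations (and with them the backpropagated gradients) toward zero, while a variance-preserving scale such as Xavier's 1/fan_in keeps their magnitude roughly stable:

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=(256, 100))

def forward_stats(scale):
    """Push data through 20 tanh layers; record activation std per layer."""
    h, stds = x, []
    for _ in range(20):
        W = rng.normal(0, scale, size=(100, 100))
        h = np.tanh(h @ W)
        stds.append(h.std())
    return stds

small = forward_stats(0.01)                # activations shrink toward 0
xavier = forward_stats(np.sqrt(1 / 100))   # roughly stable across depth
print(small[-1] < 1e-6, 0.1 < xavier[-1] < 1.0)
```

With scale 0.01 each layer multiplies the activation std by about sqrt(100) x 0.01 = 0.1, so after 20 layers the signal is numerically gone; the same shrinking hits the gradients on the way back, which is the vanishing-gradient problem mentioned above.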
