Weight initialization #43

Closed
xiatutu opened this issue Sep 12, 2018 · 4 comments

Comments

@xiatutu

xiatutu commented Sep 12, 2018

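I don't see any weight-initialization statements in the code before the network is trained. If the weights aren't initialized, wouldn't they all be 0? How does training work then?
Or does some part of the code initialize the weights?
Thanks~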

@MorvanZhou
Owner

    def reset_parameters(self):
        stdv = 1. / math.sqrt(self.weight.size(1))
        self.weight.data.uniform_(-stdv, stdv)
        if self.bias is not None:
            self.bias.data.uniform_(-stdv, stdv)

This is part of the Linear() layer: every time a layer is constructed it calls reset_parameters, which initializes the weight and bias from a uniform distribution.
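
The rule in `reset_parameters` can be reproduced without PyTorch. This is only a minimal sketch: `uniform_init` is a hypothetical helper (not a PyTorch function), and it assumes the weight has shape (out_features, in_features), so that `weight.size(1)` corresponds to `in_features`.

```python
import math
import random


def uniform_init(out_features, in_features, seed=None):
    """Hypothetical helper mirroring the quoted reset_parameters rule."""
    rng = random.Random(seed)
    # Assumed weight shape: (out_features, in_features), so size(1) == in_features.
    stdv = 1.0 / math.sqrt(in_features)
    # Every weight and bias entry is drawn from U(-stdv, stdv).
    weight = [[rng.uniform(-stdv, stdv) for _ in range(in_features)]
              for _ in range(out_features)]
    bias = [rng.uniform(-stdv, stdv) for _ in range(out_features)]
    return weight, bias, stdv


weight, bias, stdv = uniform_init(out_features=10, in_features=100, seed=0)
print(stdv)  # 0.1
print(any(w != 0.0 for row in weight for w in row))  # True
```

The values start out small and random rather than all zero, which is why training can proceed without an explicit initialization call.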

@xiatutu
Author

xiatutu commented Sep 12, 2018

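What about the CNN code later on? The convolutional layers aren't explicitly initialized there either, but as far as I can see conv2d doesn't initialize itself automatically the way Linear does.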

@MorvanZhou
Owner

MorvanZhou commented Sep 12, 2018

This is the source code that conv2d ultimately calls:

class _ConvNd(Module):

    def __init__(self, in_channels, out_channels, kernel_size, stride,
                 padding, dilation, transposed, output_padding, groups, bias):
        super(_ConvNd, self).__init__()
        ......
        self.reset_parameters()

    def reset_parameters(self):
        n = self.in_channels
        for k in self.kernel_size:
            n *= k
        stdv = 1. / math.sqrt(n)
        self.weight.data.uniform_(-stdv, stdv)
        if self.bias is not None:
            self.bias.data.uniform_(-stdv, stdv)
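
To make the bound concrete, the fan-in computation in the excerpt can be worked through by hand. This is a rough sketch: `conv_uniform_bound` is a hypothetical helper (not part of PyTorch), and the layer shapes are illustrative values rather than ones taken from this thread.

```python
import math


def conv_uniform_bound(in_channels, kernel_size):
    """Hypothetical helper: the bound computed by the quoted reset_parameters."""
    n = in_channels
    for k in kernel_size:
        n *= k  # fan-in = in_channels * product of the kernel dimensions
    return 1.0 / math.sqrt(n)


# Illustrative layer shapes (assumed for this example).
print(conv_uniform_bound(1, (5, 5)))   # 0.2
print(conv_uniform_bound(16, (5, 5)))  # 0.05
```

Layers with a larger fan-in get a narrower uniform range, so their initial weights are smaller.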

@xiatutu
Author

xiatutu commented Sep 12, 2018

Thanks!
