
About your implementation #20

Closed
HuangZiliAndy opened this issue Mar 8, 2018 · 3 comments

Comments

@HuangZiliAndy

Hi, thanks for your wonderful code! I have a small question about your implementation. When you normalize the w matrix, I notice you used ww = w.renorm(2,1,1e-5).mul(1e5). I think a more natural implementation would be ww = F.normalize(w, dim=0). Is there a special reason for your choice? Thank you very much!
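For what it's worth, the two expressions should agree whenever every column's norm is well above 1e-5: renorm(2, 1, 1e-5) clamps each sub-tensor along dim 1 (each column of a 2-D matrix) to L2 norm at most 1e-5, and the subsequent mul(1e5) rescales those columns to unit norm, which is exactly what F.normalize(w, dim=0) computes. A minimal sketch (the shapes here are just for illustration, not taken from the repo):

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
w = torch.randn(512, 10)  # e.g. feature_dim x num_classes (illustrative shape)

# renorm clamps each column (sub-tensor along dim 1) to L2 norm <= 1e-5,
# then mul(1e5) rescales those columns to unit norm
ww_renorm = w.renorm(2, 1, 1e-5).mul(1e5)

# direct per-column L2 normalization
ww_normalize = F.normalize(w, dim=0)

print(torch.allclose(ww_renorm, ww_normalize, atol=1e-4))  # True
```

One edge case: renorm only shrinks norms, so a column whose norm is already below 1e-5 would be left untouched by renorm and end up with norm below 1 after mul, whereas F.normalize would still scale it to (approximately) unit norm. With randomly initialized weights this essentially never happens.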

@PkuRainBow

I have a similar concern, and I tried to use the function below:

nn.utils.weight_norm()

I read the doc; it says:

Weight normalization is a reparameterization that decouples the magnitude
of a weight tensor from its direction. This replaces the parameter specified
by name (e.g. "weight") with two parameters: one specifying the magnitude
(e.g. "weight_g") and one specifying the direction (e.g. "weight_v").

In fact, I want to normalize the weight vectors (L2 norm = 1), so I am wondering whether the following call ensures that the weights' L2 norms equal 1, or whether it will learn a magnitude factor "weight_g". How can I ensure that "weight_g" equals 1?

nn.utils.weight_norm(nn.Conv2d(512, num_classes, kernel_size=1, stride=1, padding=1, dilation=1, bias=False), name='weight')
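If I understand the docs correctly, weight_norm does learn weight_g by default, so the call above alone does not constrain the weights to unit norm. One way to pin the magnitudes is to overwrite weight_g with 1 and freeze it; a sketch under that assumption (num_classes and the input shape are placeholders, and padding is dropped since it does not affect the weights):

```python
import torch
import torch.nn as nn

num_classes = 10  # placeholder; use the repo's actual value
conv = nn.utils.weight_norm(
    nn.Conv2d(512, num_classes, kernel_size=1, stride=1, bias=False),
    name='weight')

# weight_g holds one learnable magnitude per output filter
# (shape [num_classes, 1, 1, 1] with the default dim=0)
with torch.no_grad():
    conv.weight_g.fill_(1.0)         # set every magnitude to 1
conv.weight_g.requires_grad_(False)  # keep it fixed during training

# the forward pre-hook recomputes weight = weight_g * weight_v / ||weight_v||
_ = conv(torch.randn(1, 512, 4, 4))
filter_norms = conv.weight.view(num_classes, -1).norm(dim=1)
```

After the forward pass, each entry of filter_norms should be 1 up to float precision, so the layer behaves as a pure direction (cosine-style) classifier.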

@clcarwin
Owner

There is no special reason for using renorm; I just found it in PyTorch's docs.

@douyh

douyh commented Oct 24, 2018

> Hi, thanks for your wonderful code! I have a small question about your implementation. When you normalize the w matrix, I notice you used ww = w.renorm(2,1,1e-5).mul(1e5). I think a more natural implementation would be ww = F.normalize(w, dim=0). Is there a special reason for your choice? Thank you very much!

Have you tried F.normalize instead?


4 participants