
StyleLoss dividing by n twice #90

Closed
rrshaban opened this issue Dec 7, 2015 · 2 comments

rrshaban commented Dec 7, 2015

In StyleLoss:updateOutput (neural_style.lua:398):

function StyleLoss:updateOutput(input)
  -- Gram matrix of the layer's features, normalized by the element count
  self.G = self.gram:forward(input)
  self.G:div(input:nElement())
  -- MSECriterion averages over its input's elements again
  self.loss = self.crit:forward(self.G, self.target)
  ...
end

Our criterion (self.crit = nn.MSECriterion()) already divides by n after computing the squared error: nn.MSECriterion. So it seems to me that we are dividing by n twice: once before passing G to the criterion, and once again inside it.
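The effect is easy to check in a small NumPy sketch (Python here rather than the repo's Lua/Torch; the feature-map sizes are illustrative, not from the repo). Assuming the target Gram matrix is normalized the same way during the capture pass, the explicit division only rescales each layer's loss by the constant 1/n²:

```python
import numpy as np

# Hypothetical feature map for one layer: C channels, H x W spatial.
C, H, W = 4, 8, 8
n = C * H * W                      # input:nElement() in the Lua code

rng = np.random.default_rng(0)
feat = rng.standard_normal((C, H * W))
style_feat = rng.standard_normal((C, H * W))

def gram(f):
    # C x C Gram matrix of flattened features
    return f @ f.T

def mse(a, b):
    # nn.MSECriterion with sizeAverage: mean over all elements
    return np.mean((a - b) ** 2)

# As in StyleLoss: both G and the target are divided by n first.
G = gram(feat) / n
T = gram(style_feat) / n
loss = mse(G, T)

# Dividing both inputs by n just scales the loss by 1/n^2:
loss_unnormalized = mse(gram(feat), gram(style_feat))
assert np.isclose(loss, loss_unnormalized / n ** 2)
```

So per layer the division is a constant factor, which is why it only matters when comparing losses across layers with different n.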

Is this intentional, or am I missing something?

@jcjohnson
Owner

You're right: we do divide by the same thing twice. Per layer this is basically a no-op, since it just scales that layer's style loss by a constant.

However, since we use the same style loss weight for all layers, the normalization changes the relative contributions of the style losses from different layers, because n differs per layer. I don't really have a good theoretical justification for this particular normalization, but empirically it tends to give good results.
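To see how the 1/n² factor re-weights layers, here is a small sketch with hypothetical VGG-style layer shapes (the channel and spatial sizes below are illustrative, assuming roughly a 256×256 input; they are not taken from the repo):

```python
# Hypothetical (channels, height, width) per style layer.
layers = {
    "relu1_1": (64, 256, 256),
    "relu2_1": (128, 128, 128),
    "relu3_1": (256, 64, 64),
    "relu4_1": (512, 32, 32),
}

# Each layer's style loss picks up a constant factor of 1/n^2,
# where n = C * H * W, so deeper (smaller) layers are boosted.
scales = {name: 1.0 / (c * h * w) ** 2 for name, (c, h, w) in layers.items()}

# Relative weight of each layer compared to relu1_1:
rel = {name: scales[name] / scales["relu1_1"] for name in layers}
# relu2_1 is weighted 4x relu1_1, relu3_1 16x, relu4_1 64x.
```

With a single shared style weight, this implicit re-weighting is what the normalization actually changes.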

@rrshaban
Author

Thanks for the detailed reply, as well as the great code!
