Adds nn.Normalize #341

Merged
merged 1 commit into torch:master from fmassa:norm on Sep 14, 2015

Conversation

fmassa (Contributor) commented Aug 1, 2015

Generalizes #260 from @karpathy to accept arbitrary L_p norms.
A few remarks:

  • Maybe LpNormalize or Normalize would be a better name?
  • It only accepts 1D/2D inputs. @nicholas-leonard proposed adding a dim parameter to make it equivalent to torch.norm. Is it worth it, given that we have SpatialBatchNormalization?
  • updateGradInput is quite memory-consuming for large d.

cc: @bamos
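
For reference, a minimal usage sketch of the module as proposed here; the shapes and values are illustrative only:

require 'nn'

-- L2-normalize each row of a batch of vectors
local module = nn.Normalize(2)
local input  = torch.randn(8, 100)       -- batch of 8 vectors, 100 dimensions each
local output = module:forward(input)

print(output:norm(2, 2))                 -- per-row L2 norms, all close to 1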

szagoruyko (Member) commented Aug 1, 2015

Definitely the name has to be Normalize, not Norm.
Is it hard to extend it to other dimensions? SpatialBatchNormalization is different.

fmassa (Contributor) commented Aug 2, 2015

@szagoruyko Maybe it's not too difficult to extend it to other dimensions, by viewing the input as 2D with the last dimension being the one to normalize, but one has to take care of batched/non-batched inputs as well. Maybe we could add a setNumInputDims function, as in nn.View?
I'll spend some more time trying to make it more generic.

Is everybody ok with renaming it to Normalize?
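
As an illustration of the reshape idea (a hypothetical sketch with hard-coded sizes, not part of this PR), one could already wrap the 2D module in nn.View to normalize over the last dimension of a larger tensor:

require 'nn'

local b, c, h, w = 4, 3, 5, 7
local net = nn.Sequential()
net:add(nn.View(b * c * h, w))     -- collapse everything but the last dimension
net:add(nn.Normalize(2))           -- L2-normalize each length-w row
net:add(nn.View(b, c, h, w))       -- restore the original shape

local output = net:forward(torch.randn(b, c, h, w))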

karpathy (Contributor) commented Aug 2, 2015

Why is it obvious that the name should be Normalize and not Norm? As far as I can tell, almost all operations from torch are ported to nn layers without name changes. E.g., analogous to the max operation there is nn.Max, so why should the norm operation become nn.Normalize? Shouldn't we have nn.Maximize then? My first instinct would be to stick with the current naming for consistency.

soumith (Member) commented Aug 2, 2015

torch.norm returns the p-norm. Isn't this module also normalizing the input by the returned norm?

karpathy (Contributor) commented Aug 2, 2015

Good point. I'd almost suggest that there should be both an nn.Norm that does exactly what norm does, and an nn.Normalize that also does a div right afterwards. But perhaps that gets a bit too hairy then :)
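
For illustration, the two behaviours under discussion, in plain tensor operations (a rough sketch, not code from this PR):

local x = torch.randn(100)

-- what torch.norm (and a hypothetical nn.Norm) computes: a single number
local n = x:norm(2)

-- what the module in this PR computes: the input divided by that norm
local y = torch.div(x, x:norm(2))   -- y has unit L2 norm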

soumith (Member) commented Aug 2, 2015

That's actually a good idea :)

soumith changed the title from Adds Norm to Adds Normalize Sep 4, 2015

soumith changed the title from Adds Normalize to Adds nn.Normalize Sep 4, 2015

soumith (Member) commented Sep 4, 2015

@fmassa looks like it's good to go, is it?

fmassa (Contributor) commented Sep 4, 2015

It's good. I just need to squash the commits.
I didn't have the time to add a dimension parameter, though; it would complicate the logic a bit because of the setNumInputDims function. But it could be added later if needed.

soumith (Member) commented Sep 11, 2015

squash, good to go.

fmassa (Contributor) commented Sep 12, 2015

@soumith I just squashed the commits.

soumith added a commit that referenced this pull request Sep 14, 2015

soumith merged commit 5fa1a8b into torch:master Sep 14, 2015

1 check passed

continuous-integration/travis-ci/pr The Travis CI build passed

soumith (Member) commented Sep 14, 2015

Thanks a lot, Francisco, for the excellent PR.

fmassa deleted the fmassa:norm branch Sep 14, 2015

ffmpbgrnn commented Oct 3, 2015

Lots of memory is needed in the backprop of this module. One reason might be creating the eyeExpand matrix and then doing the multiplication. In my case, with a batch size of 64 and an input dimension of 4800, two Normalize layers run out of memory on a 4 GB GPU. Any idea how to implement a more space-efficient Normalize layer?
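
For a rough sense of scale, assuming the intermediate buffer is batchSize x d x d of 32-bit floats (an assumption about the implementation, not something stated above):

local batchSize, d = 64, 4800
local gib = batchSize * d * d * 4 / 2^30   -- size of one such float32 buffer, in GiB
print(gib)                                 -- about 5.5 GiB, more than a 4 GB card can hold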

fmassa (Contributor) commented Oct 3, 2015

@ffmpbgrnn here is a version of Normalize which uses much less memory (it no longer depends on the batch size). It should be slower on GPU. The tests pass, so it should be fine. Use it with fastMode(false).
fmassa@015ba9c
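
For the p = 2 case, one way to see why the backward pass does not need a large intermediate is that the Jacobian-vector product collapses to a few vector operations per sample; the sketch below is a hypothetical illustration of that, not the code in fmassa@015ba9c (zero-norm inputs are not handled).

local function l2NormalizeBackward(input, gradOutput)
  local norm = input:norm(2)
  local output = torch.div(input, norm)       -- y = x / ||x||
  local dot = torch.dot(output, gradOutput)   -- <y, dL/dy>
  -- dL/dx = (dL/dy - y * <y, dL/dy>) / ||x||, no d x d matrix required
  return gradOutput:clone():add(-dot, output):div(norm)
end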

ffmpbgrnn commented Oct 3, 2015

@fmassa thanks for your work. I will give it a try!

fmassa (Contributor) commented Oct 5, 2015

@ffmpbgrnn so, does this patch work fine for you? Is it much slower than the previous version?
Maybe we could push this simplified version to master (taking out the faster mode to keep things simple)?
cc @soumith

ffmpbgrnn commented Oct 8, 2015

@fmassa, sorry for my late response. Yes, it passes the test. But when I test fastMode(false) on GPU, it's even faster than fastMode(true).

require 'nn'
require 'cutorch'
require 'cunn'

local module = nn.Normalize(2):cuda()
module:fastMode(false)
local input = torch.rand(64, 2400):cuda()
local t = torch.Timer()
for i = 1, 100 do                -- 100 forward + backward passes
    module:forward(input)
    module:backward(input, input)
    print(i)
end
print(t:time().real/100)         -- average seconds per iteration

fastMode(false), output: 0.09
fastMode(true), output: 0.12

dineshj1 commented on Normalize.lua in b80bda2 Oct 24, 2015

This line could cause division by 0 when self.p < 2, e.g. for L1 normalization. Perhaps clamp it to have a minimum value of eps? This could be done only for self.p < 2 if necessary.
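
For illustration, the kind of clamp being suggested, as a self-contained hypothetical sketch (not the code from Normalize.lua; the eps value is arbitrary):

local eps = 1e-10
local x = torch.Tensor{{0, 0, 0}, {1, 2, 3}}
local norms = x:norm(1, 2)                  -- per-row L1 norms: 0 and 6
norms:clamp(eps, math.huge)                 -- keep the norm away from zero
local y = torch.cdiv(x, norms:expandAs(x))  -- the all-zero row stays 0 instead of NaN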

fmassa (Contributor) replied Oct 24, 2015

Thanks for the remark. I've added a clamping factor; check #444.
