Fixing harsh upgrade_proto for `"BatchNorm"` layer #5184
Conversation
@shelhamer would you please have a look at this issue/proposed fix? Thanks.
shelhamer self-assigned this on Jan 17, 2017.
shelhamer referenced this pull request on Jan 19, 2017: Cannot explicitly "name" BatchNorm parameters for sharing (Siamese network) #5171 (closed).
Switching to zeroing the …
shelhamer merged commit bc0d680 into BVLC:master on Jan 20, 2017 (1 check passed).
@shelhamer Thanks for merging this PR!
@shaibagon Thanks, Shai, for the fix. I'm not sure about the internal structure, so just a quick question: does the upgraded proto of the BN layer have the same interface as before this upgrade?
@antran89 There is no interface change. The actions …
shaibagon deleted the shaibagon:fix_batch_norm_param_upgrade branch on Apr 18, 2017.
Jiangfeng-Xiong commented on May 9, 2017:

@shaibagon @shelhamer What happens if we share parameters in a batch-norm layer? The mean and variance are computed from the input, so during training of a Siamese network there are two inputs, and therefore two means and two variances. Which of them is used as the layer's parameter, or are they simply averaged?
@Jiangfeng-Xiong You obviously cannot have two means and two variances in the same layer; it makes no sense.
shaibagon commented on Jan 15, 2017:
This PR attempts to fix issues #5171 and #5120, caused by PR #4704.

PR #4704 completely removes all `param` arguments of `"BatchNorm"` layers and resets them to `param { lr_mult: 0 }`. This "upgrade" is too harsh: it also discards a `"name"` argument that might have been set by the user.

This PR fixes `upgrade_proto.cpp` for the `"BatchNorm"` layer to be more conservative: it leaves `"name"` in the `param`, and only sets `lr_mult` and `decay_mult` to zero.

Example of such an upgrade:

Input prototxt:

"Upgraded" prototxt:

As you can see, `lr_mult` and `decay_mult` are set to zero, leaving `name` intact when explicitly set by the user.
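The prototxt snippets referenced in the description did not survive extraction. The following is a hedged reconstruction of what such an upgrade looks like, based on the description above; the layer name, blob names, and `bottom`/`top` identifiers are illustrative, not copied from the PR:

```
# Input prototxt (before upgrade): the user explicitly names the
# BatchNorm parameter blobs so they can be shared across layers,
# e.g. between the two branches of a Siamese network.
layer {
  name: "bn1"
  type: "BatchNorm"
  bottom: "conv1"
  top: "conv1"
  param { name: "bn1_mean" }
  param { name: "bn1_var" }
  param { name: "bn1_scale" }
}

# "Upgraded" prototxt (with this PR's fix): lr_mult and decay_mult
# are forced to zero, but the user-supplied names are preserved.
layer {
  name: "bn1"
  type: "BatchNorm"
  bottom: "conv1"
  top: "conv1"
  param { name: "bn1_mean" lr_mult: 0 decay_mult: 0 }
  param { name: "bn1_var" lr_mult: 0 decay_mult: 0 }
  param { name: "bn1_scale" lr_mult: 0 decay_mult: 0 }
}
```

Zeroing `lr_mult` and `decay_mult` is what keeps the solver from updating the running statistics as if they were learned weights, while keeping `name` preserves explicit parameter sharing.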