I've been experimenting with SELUs and found they improve training-time computation compared to batchnorm — thank you for your work.
I just have a question about the effect of bias in linear layers. As I understand it, every neuron should have zero mean in order to stay in the self-normalizing regime, but a bias term shifts precisely that mean. In my experiments, however, I didn't see much of an effect from either removing or adding biases. I see that bias is used in the tutorial notebook, and I wonder whether you've considered the issue.
Dear ptrcarta, thanks, good point! We have experimented a lot with SNNs with and without bias units. In wide networks they hardly play a role. My hypothesis is that it is due to the following: a) SELUs counter the bias shift well and keep activations close to zero mean which is good for learning and b) in wide layers, any unit can learn to take the role of a bias unit. However, at the output layer, bias units can help especially if you have unbalanced data. Hope this helps!
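The effect described above is easy to check empirically. The sketch below (a minimal NumPy illustration, not the repository's code; the layer width, bias scale, and sample count are arbitrary choices) passes zero-mean, unit-variance inputs through one SELU layer with LeCun-normal weights, with and without a small bias, and prints the activation statistics. With SELU's fixed-point constants, the bias-free activations stay close to mean 0 and variance 1, and a modest bias only nudges them slightly, consistent with the observation that bias units hardly matter in wide layers.

```python
import numpy as np

# SELU constants (alpha, scale) from the SNN paper's fixed-point derivation
ALPHA = 1.6732632423543772
SCALE = 1.0507009873554805

def selu(x):
    """Scaled exponential linear unit."""
    return SCALE * np.where(x > 0, x, ALPHA * np.expm1(x))

rng = np.random.default_rng(0)
n_samples, n_in, n_out = 10_000, 512, 512

# Zero-mean, unit-variance inputs, as the self-normalizing property assumes
x = rng.standard_normal((n_samples, n_in))

# LeCun-normal init: std = 1/sqrt(fan_in), the init the SELU derivation assumes
W = rng.standard_normal((n_in, n_out)) / np.sqrt(n_in)

h_no_bias = selu(x @ W)

# Hypothetical small bias (scale 0.1 is an arbitrary illustration choice)
b = 0.1 * rng.standard_normal(n_out)
h_bias = selu(x @ W + b)

print("no bias: mean=%.4f std=%.4f" % (h_no_bias.mean(), h_no_bias.std()))
print("bias:    mean=%.4f std=%.4f" % (h_bias.mean(), h_bias.std()))
```

Both runs should report statistics close to (0, 1); repeating the forward pass through several stacked layers shows the same attraction back toward the fixed point, which is why a small bias shift gets absorbed rather than accumulating.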