
regularization on biases #40

Open
GuangQin1995 opened this issue Apr 28, 2017 · 3 comments

Comments

@GuangQin1995

```python
if regularizer is not None:
    regularizers = sum([tf.nn.l2_loss(variable) for variable in self.variables])
    loss += (regularizer * regularizers)
```

It seems that you also apply regularization to the biases, since self.variables includes the biases:

```python
variables = []
for w1, w2 in weights:
    variables.append(w1)
    variables.append(w2)

for b1, b2 in biases:
    variables.append(b1)
    variables.append(b2)
```
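For reference, a minimal sketch of how the penalty could be restricted to the weight variables only, assuming the weights list, regularizer coefficient, and loss from the snippets above; this is just one possible way to do it, not the project's implementation:

```python
import tensorflow as tf  # TensorFlow 1.x API, as in the snippets above

# Hypothetical sketch: build the L2 penalty from the weight variables only,
# leaving the bias variables out of the regularization term.
weight_variables = []
for w1, w2 in weights:
    weight_variables.append(w1)
    weight_variables.append(w2)

if regularizer is not None:
    regularizers = sum(tf.nn.l2_loss(w) for w in weight_variables)
    loss += regularizer * regularizers
```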
@jakeret
Owner

jakeret commented Apr 30, 2017

Yes, you're right. Have you checked whether it makes a big difference?

According to deeplearning.stanford.edu: "Applying weight decay to the bias units usually makes only a small difference to the final network."

But it might be worth investigating.

@meijie0401

I think when the batch size is not 1, we should divide regularizers by 2 * batch_size, like the following. What's your idea?

```python
regularizers = sum([tf.nn.l2_loss(variable) for variable in self.variables]) / (2 * batch_size)
```
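A minimal sketch of how this scaled penalty would be folded into the loss, assuming batch_size is a plain Python integer; note that tf.nn.l2_loss(t) already returns sum(t ** 2) / 2 per variable, so the extra factor of 2 here simply follows the formula as suggested:

```python
# Sketch of the suggestion above: scale the L2 penalty by the batch size.
# tf.nn.l2_loss(v) already includes a factor of 1/2 per variable.
if regularizer is not None:
    regularizers = sum(tf.nn.l2_loss(v) for v in self.variables) / (2 * batch_size)
    loss += regularizer * regularizers
```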

@jakeret
Owner

jakeret commented Sep 8, 2018

Sorry for the very late reply.
The strength of the regularizer is a hyperparameter, just like the batch size. We can cover their relation in the parameter search instead of encoding it implicitly in the code, can't we?
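For illustration, a sketch of what such a parameter search could look like; train_and_evaluate is a hypothetical placeholder for the project's training routine, and the value grids are arbitrary:

```python
# Hypothetical sketch: treat regularization strength and batch size as
# independent hyperparameters and explore their combinations in a small grid.
for batch_size in [1, 4, 16]:
    for reg in [1e-4, 1e-3, 1e-2]:
        # train_and_evaluate stands in for the actual training/evaluation code
        train_and_evaluate(batch_size=batch_size, regularizer=reg)
```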
