Regularization #14
Comments
I think the regularizers themselves should just accept expressions and return expressions. This way the L1 regularization can be applied both to model parameters and to activations, for example. We might add convenience methods such as:

```python
def regularize_weights(layer, regularizer):
    return regularizer(layer.get_non_bias_params())

def regularize_all_weights(layer, regularizer):
    return regularizer(nntools.layers.get_all_non_bias_params(layer))

def regularize_output(layer, regularizer, inputs=None):
    return regularizer(layer.get_output(inputs))
```

So one can write: [...]

Again, there should be a way to use networks with multiple output layers, I guess... we could swap the argument order so we can define [...]

/edit: Some regularizers will take additional parameters (such as a sparsity target), so the convenience functions probably should be:

```python
def regularize_weights(layer, regularizer, *args, **kwargs):
    return regularizer(layer.get_non_bias_params(), *args, **kwargs)
```
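To make the intended usage concrete, here is a minimal sketch under the API proposed above. The expression-level `l2` regularizer, the `l_hidden` layer, and the weighting factor are made up for illustration; only `regularize_weights` and `get_non_bias_params` come from this thread:

```python
import theano.tensor as T

def l2(params):
    # Expression-level regularizer: takes a list of Theano expressions
    # (here, weight matrices) and returns a scalar penalty expression.
    return sum(T.sum(T.sqr(p)) for p in params)

def regularize_weights(layer, regularizer):
    # Convenience wrapper as proposed above.
    return regularizer(layer.get_non_bias_params())

# Hypothetical usage: add a weighted L2 penalty to an existing cost.
# cost = cost + 0.0001 * regularize_weights(l_hidden, l2)
```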
Agreed here. I also suggested in #110 that rather than layers having just [...]
What types of expressions? Expressions that augment (are added to) the original cost function? These would then be differentiable. I ask because in #84 I'm toying with the idea of hard weight norm constraints, which can't be written as part of the cost function. They can really only be written as updates on values that violate the hard constraint.
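For illustration, here is a minimal sketch of what such an update-based hard constraint could look like, assuming a dict of parameter updates as produced by an SGD routine. The `max_norm_constraint` helper is hypothetical, not anything from nntools or #84:

```python
import theano.tensor as T

def max_norm_constraint(updates, param, max_norm=3.0, eps=1e-7):
    # Hard constraint expressed as an update rather than a cost term:
    # after the gradient step, rescale any column of the weight matrix
    # whose L2 norm exceeds max_norm.
    stepped = updates[param]
    norms = T.sqrt(T.sum(T.sqr(stepped), axis=0, keepdims=True))
    clipped = T.clip(norms, 0, max_norm)
    # eps guards against division by zero; columns already within the
    # norm budget are left (almost) untouched.
    updates[param] = stepped * (clipped / (norms + eps))
    return updates
```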
There are many different ways to regularize things that hook into various parts of the code. I think what's meant is that, in general, we want any tools we write to help with regularization to operate on Theano data types only, so that they are maximally reusable (and so that they can be used without the rest of the library). The current regularization code takes a layer instance and then calls [...]
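As a sketch of that principle (the `l1` function here is hypothetical), an expression-level regularizer needs nothing but Theano, so it works on a bare shared variable with no Layer instance anywhere:

```python
import numpy as np
import theano
import theano.tensor as T

def l1(x):
    # Operates on any Theano expression: a parameter, an activation, ...
    return T.sum(abs(x))

# Usable entirely in isolation from the rest of the library:
W = theano.shared(np.random.randn(784, 100).astype(theano.config.floatX))
penalty = l1(W)
```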
See also #86.
Now that our test coverage and documentation are in a pretty good state, fixing up the regularization module is one of the last few things we need to do before we're ready for release. Some good ideas for a new API have been discussed in this thread. This discussion predates the [...] We just need to turn this idea into a PR now :) Any takers?
I made a start at it, see #285
We should implement some commonly used regularizers (L1, L2, sparsity penalties on the activations as in sparse autoencoders, ...). How should we do this? The `nntools.regularization` module I included in the initial commit was an afterthought and should be treated as more of a placeholder. In #11 @f0k already mentioned that it's probably a good idea to make the regularization module operate on Theano expressions, not `Layer` instances, so that it can be used in isolation. Any ideas? We should also take into account that some regularizers operate on model parameters (e.g. L1, L2), while others operate on activations (the autoencoder sparsity penalty) and are data-dependent.
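To make the activation-based, data-dependent case concrete, here is a sketch of a sparse-autoencoder-style sparsity penalty written against a Theano expression for the hidden activations. The `kl_sparsity` name and signature are made up for illustration, and it assumes sigmoid activations in (0, 1):

```python
import theano.tensor as T

def kl_sparsity(activations, target=0.05):
    # Mean activation of each hidden unit over the data batch; this
    # dependence on the batch is what makes the penalty data-dependent,
    # unlike L1/L2 penalties on the parameters.
    rho_hat = T.mean(activations, axis=0)
    rho = target
    # KL divergence between the target sparsity level and the observed
    # mean activations, summed over hidden units.
    return T.sum(rho * T.log(rho / rho_hat)
                 + (1 - rho) * T.log((1 - rho) / (1 - rho_hat)))
```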