LxReg #273
What is up with the LxReg class? Does it actually work? How are you meant to make the "variables" argument to the init method actually get driven by the training or monitoring data?

Comments
I wrote this class to generalize L1 regularization, L2 regularization, and so on (hence the name LxReg) and to make it compatible with the whole cost framework. It's not a real 'cost', since it does not take the training or monitoring data into account, but it still composes with other costs, so you can write an expression like cost = NLL + L2(weights). I've used it a couple of times and it works well. You provide it with variables (say, your weight matrices) and the order 'x' of the regularization you want, and it builds the symbolic expression for you when you call it. Maybe I should change the docstring to explain what it does more clearly?
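For concreteness, here is a minimal sketch of what such a class could look like with Theano; the constructor signature and the use of `__call__` are inferred from the description above, not taken from the actual pull request.

```python
import theano.tensor as T

class LxReg(object):
    """Symbolic Lx penalty over a list of Theano variables (sketch)."""

    def __init__(self, variables, x):
        # 'variables' are typically shared variables (e.g. weight matrices);
        # 'x' is the order of the norm: 1 for L1, 2 for L2, and so on.
        self.variables = variables
        self.x = x

    def __call__(self):
        # Build the symbolic expression sum_i sum(|v_i| ** x).
        cost = 0
        for v in self.variables:
            cost = cost + T.sum(T.abs_(v) ** self.x)
        return cost
```

Combined with another cost, this yields expressions of the form mentioned above, e.g. `total = nll + 0.01 * LxReg([W1, W2], 2)()`.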
Ah, OK. Have you just been using it to regularize parameters directly, i.e., to put an L2 norm on the weights? Is there a risk of it silently doing the wrong thing if someone tried to use it to put an L1 penalty on something data-dependent, like the activations of an MLP? Should it enforce that everything in the "variables" list is a shared variable, or something like that?
Yes, only to regularize direct parameters. Currently there is no safeguard to prevent people from using it on something like the activations of an MLP; I assumed that since it is obvious this 'cost' is meant to be a regularizer, people would use it appropriately. As long as requiring that the things in the 'variables' list be shared variables is not too restrictive (I can't think of any use case involving something other than shared variables for now), I think it could be a good idea. It all depends on how much freedom we want the user to have. Maybe a warning saying "This cost was intended to be used on shared variables, such as weights in a neural net; make sure your use of data-dependent variables is intended" would be sufficient?
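As a sketch of what that warning could look like (the helper name and the exact wording are illustrative, not from the merged code), assuming Theano's `SharedVariable` class:

```python
import warnings
import theano

def warn_on_non_shared(variables):
    # LxReg is meant for parameters stored in shared variables; warn when
    # a regularized quantity is something else, such as an expression
    # that depends on input data.
    for v in variables:
        if not isinstance(v, theano.compile.SharedVariable):
            warnings.warn(
                "This cost was intended to be used on shared variables, "
                "such as weights in a neural net; make sure your use of "
                "the data-dependent variable %s is intended." % v
            )
```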
What would go wrong if they used it as a penalty on activations? That may be a desirable thing from a machine learning point of view. What is the Theano-level problem? --Yoshua
That's a good point. I personally don't see any Theano-level problem with using it on activations. --Vincent
The question is how actual data gets plugged into the back-end of that Theano graph, and what code is responsible for constructing these variables, etc.
I'm not sure I see what problems can arise from a Theano point of view. Can you explain further?
Currently, what would happen if we used LxReg on activations is most likely an error in Theano saying that some inputs to the training or monitoring function (depending on whether LxReg is used as a training cost or a monitoring cost) were not provided. If some models need to use Lx regularization on data-driven quantities (whether they depend on inputs, targets, both, or something else), a new kind of cost that declares those data dependencies would be needed.
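To illustrate the failure mode described above, here is a small sketch (the variable names are made up): an expression built only from shared variables compiles into a function that needs no inputs, while one that depends on a symbolic input does not.

```python
import numpy as np
import theano
import theano.tensor as T

W = theano.shared(np.zeros((784, 100)), name='W')  # parameter: carries its own storage
X = T.matrix('X')                                  # data: must be supplied at call time

weight_penalty = T.sum(T.abs_(W))                # depends only on W
activation_penalty = T.sum(T.abs_(T.dot(X, W)))  # depends on the data X

f = theano.function([], weight_penalty)  # compiles fine; no inputs needed
# theano.function([], activation_penalty) raises a MissingInputError,
# because X is an input the compiled function was never given.
```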
OK, I see the issue, thanks! Would it be bad design if we made LxReg compatible with data-driven quantities but also used it to compute non-data-driven ones?
Has anyone had time to look at my pull request for this issue?
I'm back now and working through the PRs. See the PR for comments.
The pull request was merged, closing the issue.