
LxReg #273

Closed

goodfeli opened this issue May 5, 2013 · 12 comments

@goodfeli
Contributor

goodfeli commented May 5, 2013

What is up with the LxReg class? Does it actually work? How are you meant to make the "variables" argument to the init method actually get driven by the training or monitoring data?

@vdumoulin
Member

I wrote this class to generalize L1 regularization, L2 regularization, and so on (hence the name LxReg), and to make it compatible with the whole cost framework. It's not a real 'cost', since it does not take the training or monitoring data into account, but it still composes with other costs, so you can have an expression like

cost = NLL + L2(weights)

I've used it a couple of times and it works well. You provide it with variables (say, your weight matrices) and the order 'x' of the regularization you want, and it computes the symbolic expression for you when you call it.
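
To make that concrete, here is a minimal sketch of the kind of symbolic expression such a cost builds, assuming Theano; the class name and signature below are illustrative, not the actual pylearn2 API:

import theano.tensor as T

class LxRegSketch(object):
    # Illustrative only: sums the elementwise |v| ** x over a list of variables.
    def __init__(self, variables, x=2):
        self.variables = variables  # e.g. a model's weight matrices
        self.x = x                  # order of the regularization (1 for L1, 2 for L2, ...)

    def __call__(self):
        # Symbolic penalty: sum_i T.sum(|v_i| ** x)
        return sum(T.sum(abs(v) ** self.x) for v in self.variables)

With something like that, the combined cost above is just nll_expr + LxRegSketch([W1, W2], x=2)().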

Maybe I should change the docstring to explain what it does more clearly?

@goodfeli
Contributor Author

Ah, OK. Have you just been using it to regularize parameters directly? i.e., put an L2 norm on the weights? Is there a risk of it silently doing the wrong thing if someone tried to use it to put an L1 penalty on something data-dependent, like the activations of an MLP? Should it enforce that things in the "variables" list are shared variables or something like that?

@vdumoulin
Member

Yes, only to regularize parameters directly. Currently there is no safeguard to prevent people from using it on something like the activations of an MLP; I figured that since it is obvious this 'cost' is meant as a regularizer, people would use it appropriately.

As long as requiring that the things in the 'variables' list be shared variables is not too restrictive (I can't think of a use case involving anything other than shared variables for now), I think it could be a good idea. It all depends on how much freedom we want the user to have. Maybe a warning saying "This cost was intended to be used on shared variables, such as weights in a neural net; make sure your use of data-dependent variables is intended" would be sufficient?
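
A hypothetical form of that safeguard (the helper name and exact wording are made up for illustration) could be:

import warnings
from theano.compile import SharedVariable

def warn_on_non_shared(variables):
    # Hypothetical check: warn rather than fail when an entry is not a
    # shared variable (e.g. an MLP activation slipped into 'variables').
    for v in variables:
        if not isinstance(v, SharedVariable):
            warnings.warn("This cost was intended to be used on shared "
                          "variables, such as weights in a neural net; "
                          "make sure your use of data-dependent variables "
                          "is intended.")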

@yoshua

yoshua commented May 15, 2013

What would go wrong if they used it as a penalty on activations? That may be a desirable thing from a machine learning point of view. What is the Theano-level problem?

--Yoshua


@vdumoulin
Member

That's a good point. I personally don't see any Theano-level problem with that.


@dwf
Contributor

dwf commented May 15, 2013

The question is how actual data gets plugged into the back-end of that Theano graph, and what code is responsible for constructing these variables, etc.

@vdumoulin
Member

I'm not sure I see what problems can arise from a Theano point of view. Can you explain further?

@lamblin
Member

lamblin commented Jun 17, 2013

Currently, if we used LxReg on activations, the likely result is a Theano error saying that some inputs to the training or monitoring function (depending on whether LxReg is used as a training cost or a monitoring cost) were not provided.
The reason is that all data-driven costs have to derive their expression from the data argument to their expr method, as that is the variable that will hold the actual data during training or monitoring.
This data argument conforms to the (space, source) data_specs defined in get_data_specs. In this case, LxReg.get_data_specs() returns (NullSpace(), ''), which means this cost uses no data at all. Without data, activations (for instance) cannot be expressed.
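
To make the mechanism concrete, here is a sketch of the contract described above, assuming pylearn2's Cost interface (expr(model, data) plus get_data_specs(model)); the details are illustrative:

from pylearn2.costs.cost import Cost
from pylearn2.space import NullSpace

class LxRegSketch(Cost):
    # Illustrative: a 'cost' that declares it consumes no data at all.
    def __init__(self, variables, x=2):
        self.variables = variables
        self.x = x

    def expr(self, model, data, **kwargs):
        # 'data' is ignored: the penalty depends only on the variables
        # captured at construction time (typically shared variables).
        return sum((abs(v) ** self.x).sum() for v in self.variables)

    def get_data_specs(self, model):
        # (NullSpace(), '') requests no data, so nothing data-dependent
        # (like activations) can appear in expr().
        return (NullSpace(), '')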

If some models need Lx regularization on data-driven quantities (whether they depend on inputs, targets, both, or something else), a new Cost class will probably have to be defined, either for that particular case or in a more general setting, though the general version could get complex quickly.
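
For contrast, a hypothetical data-driven variant would have to request data through its data_specs. Reusing the imports from the sketch above, and noting that the fprop call is an assumption about the model's interface rather than a fixed pylearn2 API:

class ActivationLxRegSketch(Cost):
    # Hypothetical: penalize quantities computed from the input data.
    def __init__(self, x=1):
        self.x = x

    def expr(self, model, data, **kwargs):
        # Here 'data' actually carries the inputs, matching get_data_specs.
        activations = model.fprop(data)  # assumed hook for the activations
        return (abs(activations) ** self.x).sum()

    def get_data_specs(self, model):
        # Request the model's input space and source so the training or
        # monitoring function knows to feed real data into this graph.
        return (model.get_input_space(), model.get_input_source())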

@vdumoulin
Member

Ok, I see the issue, thanks!

Would it be bad design to make LxReg compatible with data-driven quantities while still letting it compute non-data-driven quantities too?

@vdumoulin
Member

Has anyone had time to look at my pull request for this issue?

@goodfeli
Contributor Author

I'm back now and working through the PRs. See the PR for comments.

@vdumoulin
Member

Pull request was merged, closing the issue.
