I just realized that using regularization on the recurrent nets could have an unintended side effect: get_params (optionally) returns the initial state vectors for the recurrent layer. This is because people sometimes want to optimize the initial state vectors, so we allow them to be returned and passed as updates to the optimization function (SGD, etc.). The current L2 regularizer optionally calls either get_all_params or get_all_non_bias_params, both of which could return the initial state param, but I don't think anyone would want to regularize it. The same may be true of other layers: get_params can return parameters which don't make sense to regularize.
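To make the problem concrete, here is a minimal mock (the class and values are illustrative, not actual Lasagne code) showing how a regularizer that blindly sums over whatever get_params returns ends up penalizing the learnable initial state:

```python
# Hypothetical stand-in for a recurrent layer whose get_params()
# also returns the learnable initial state vector.
class MockRecurrentLayer:
    def __init__(self):
        self.W = [0.5, -0.5]        # recurrent weights
        self.b = [0.1]              # bias
        self.hid_init = [2.0, 2.0]  # learnable initial state

    def get_params(self):
        # Returned so hid_init can be optimized alongside W and b.
        return [self.W, self.b, self.hid_init]


def l2_penalty(params):
    # A regularizer that sums squares over everything it is handed.
    return sum(x * x for p in params for x in p)


layer = MockRecurrentLayer()
# hid_init contributes 8.0 of the 8.51 total, even though
# regularizing an initial state rarely makes sense.
print(l2_penalty(layer.get_params()))  # → 8.51
```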
I think that instead of making the regularizers operate on a layer and call get_all_params or get_all_non_bias_params, we should require the user to supply the params they want to regularize. Each layer would then have separate methods like get_weight_params, get_bias_params, and, where appropriate, get_init_params (for recurrence); get_params would just combine the output of all of these individual functions. This keeps the convenience of getting all of the params for updates, while letting the user control exactly what they want to regularize. Any thoughts?
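A quick sketch of what this split could look like (method names follow the suggestion above; the class itself is hypothetical, not existing Lasagne code):

```python
# Sketch of a layer exposing per-kind parameter getters, with
# get_params() as the convenience union for the update step.
class RecurrentLayer:
    def __init__(self, W, b, hid_init):
        self.W, self.b, self.hid_init = W, b, hid_init

    def get_weight_params(self):
        return [self.W]

    def get_bias_params(self):
        return [self.b]

    def get_init_params(self):
        # Initial-state vectors, for recurrent layers only.
        return [self.hid_init]

    def get_params(self):
        # Everything, e.g. for passing to SGD updates.
        return (self.get_weight_params()
                + self.get_bias_params()
                + self.get_init_params())


def l2(params):
    # The regularizer no longer inspects a layer; the caller
    # decides exactly which parameters to penalize.
    return sum(x * x for p in params for x in p)


layer = RecurrentLayer(W=[0.5, -0.5], b=[0.1], hid_init=[2.0, 2.0])
penalty = l2(layer.get_weight_params())  # weights only: 0.5
updates_on = layer.get_params()          # all three params, incl. hid_init
```

Passing get_weight_params() to the regularizer and get_params() to the optimizer cleanly separates the two concerns.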
Agreed. This is also more in line with our goal of transparency: "Functions and methods should return Theano expressions and standard Python / numpy data types where possible." This will reduce cognitive overhead (what methods take Theano expressions? What methods take layers?) and increase interoperability with other libraries and custom Theano code.
As I've mentioned before in #86, what's currently there should probably be thrown away. This is already being discussed in #14 so maybe we should move there.