
Example of Using L2 Regularizer #86

Closed
cancan101 opened this issue Jan 3, 2015 · 6 comments

@cancan101

Currently none of the examples show using the L2 regularizer.

An example of using it in a model would be helpful.

@benanne
Member

benanne commented Jan 3, 2015

The current code for regularization was always intended as a placeholder. For now, the easiest thing to do is just to construct the regularization terms manually. I don't think what we have right now is worth documenting; it needs to be thoroughly thought through and reimplemented first.
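For illustration, a minimal sketch of building such a term by hand in plain Theano. The shared weight matrices W1 and W2 are stand-ins for whatever parameters your own model penalizes, and the 1e-4 coefficient is just an example value:

```python
import numpy as np
import theano
import theano.tensor as T

# Stand-in weight matrices; in practice these would be your layers' W parameters.
W1 = theano.shared(np.random.randn(784, 256).astype(theano.config.floatX))
W2 = theano.shared(np.random.randn(256, 10).astype(theano.config.floatX))

# Manual L2 penalty: sum of squared weights (biases are usually left out).
l2_penalty = T.sum(W1 ** 2) + T.sum(W2 ** 2)

# total_loss = data_loss + 1e-4 * l2_penalty  # add the weighted penalty to your data loss
```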

@cancan101
Author

Other than dropout and early-stopping, what other forms of regularization are currently implemented in Lasagne?

@benanne
Member

benanne commented Jan 5, 2015

I don't think we have any code for early stopping in the library right now; there's no code for training loops yet. I guess @dnouri may have implemented this as part of his Lasagne extensions in https://github.com/dnouri/nolearn, though.

Personally I usually stick to dropout and not much else, which is why that's all there is for now. Are you looking for anything in particular? As long as you can implement it in Theano, you can use it with Lasagne :)
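For reference, a minimal sketch of adding dropout with lasagne.layers.DropoutLayer; the layer sizes are arbitrary, and the deterministic=True flag in the comment is how current Lasagne versions switch dropout off at evaluation time:

```python
import lasagne

l_in = lasagne.layers.InputLayer(shape=(None, 784))
l_hid = lasagne.layers.DenseLayer(l_in, num_units=256)
l_drop = lasagne.layers.DropoutLayer(l_hid, p=0.5)  # zero out 50% of activations during training
l_out = lasagne.layers.DenseLayer(l_drop, num_units=10,
                                  nonlinearity=lasagne.nonlinearities.softmax)

# For evaluation, build a deterministic expression so dropout is switched off:
# test_output = lasagne.layers.get_output(l_out, deterministic=True)
```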

@dimatura
Contributor

The regularization does work, though. If your final output layer is l_out, then reg = lasagne.regularization.l2(l_out) returns an expression for the L2 norm of all the non-bias parameters in your network. If loss is a symbolic Theano expression for your loss (e.g. cross entropy), then a regularized objective would be loss + 0.0001*reg. You can then take the gradient of this as usual.
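In code, that usage looks roughly like the sketch below. l_out and loss stand for your own output layer and unregularized loss expression (not constructed here), 1e-4 is an arbitrary regularization strength, and the l2 call follows the signature described in this comment (the regularization API changed in later Lasagne releases):

```python
import theano.tensor as T
import lasagne

reg = lasagne.regularization.l2(l_out)        # L2 norm of all non-bias parameters
regularized_loss = loss + 1e-4 * reg          # weighted penalty added to the data term

params = lasagne.layers.get_all_params(l_out)
grads = T.grad(regularized_loss, params)      # plug into your update rule as usual
```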

@saskra

saskra commented Jun 26, 2018

I don't think we have any code for early stopping in the library right now

Has this changed by now / will it change someday?

@f0k
Member

f0k commented Jun 26, 2018

Has this changed by now / will it change someday?

No / probably not. There are some useful discussions in #756, but there was no fully satisfying implementation at the time, and now there's hardly anyone to work on or review this. Writing a training loop for a specific experiment is easy enough, and writing a general training loop that covers all use cases and strikes the right balance between user-friendliness and transparency is very difficult. I'd rather have no training code at all than an incomplete solution.
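As an illustration of such an experiment-specific loop, here is a minimal early-stopping sketch. train_fn and val_fn stand for compiled Theano functions returning batch losses, iterate_minibatches for a batching helper like the one in Lasagne's MNIST example, and l_out for the network's output layer; all of these are assumptions, not part of the library:

```python
import numpy as np
import lasagne

best_val, best_params = float('inf'), None
patience, bad_epochs = 10, 0

for epoch in range(500):
    for Xb, yb in iterate_minibatches(X_train, y_train, 128):
        train_fn(Xb, yb)

    val_loss = np.mean([val_fn(Xb, yb)
                        for Xb, yb in iterate_minibatches(X_val, y_val, 128)])
    if val_loss < best_val:
        best_val, bad_epochs = val_loss, 0
        best_params = lasagne.layers.get_all_param_values(l_out)  # snapshot the best weights
    else:
        bad_epochs += 1
        if bad_epochs >= patience:  # no improvement for `patience` epochs in a row
            break

lasagne.layers.set_all_param_values(l_out, best_params)  # restore the best weights
```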
