
power_t: does it make sense for this parameter to have negative values #22178

reshamas opened this issue Jan 10, 2022 · 1 comment
@reshamas
Member

sklearn/linear_model/_stochastic_gradient.py

Ref #22115

I do not expect a negative power_t to be mathematically meaningful, but apparently our code accepts it without crashing... so I'm OK with documenting it.

Originally posted by @ogrisel in #22115 (comment)
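
For context, the documented `invscaling` schedule in `SGDClassifier`/`SGDRegressor` is `eta = eta0 / pow(t, power_t)`, so a negative `power_t` makes the learning rate grow with `t` instead of decaying. A minimal sketch (hyperparameter values here are purely illustrative) of what "accepts it without crashing" looks like in practice, at least in the scikit-learn version current at the time of this issue:

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import SGDClassifier

X, y = make_classification(n_samples=200, n_features=20, random_state=0)

# With learning_rate="invscaling", eta = eta0 / pow(t, power_t).
# A negative power_t therefore makes eta *grow* with t, yet fit() still runs.
clf = SGDClassifier(
    learning_rate="invscaling",
    eta0=0.01,
    power_t=-0.5,  # negative exponent: eta ~ eta0 * sqrt(t)
    max_iter=10,
    random_state=0,
)
clf.fit(X, y)

# The schedule itself, for a few update counts t:
eta0, power_t = 0.01, -0.5
for t in (1, 10, 100, 1000):
    print(t, eta0 / t**power_t)  # 0.01, ~0.032, 0.1, ~0.32 -> increasing
```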

@thomasjpfan

I feel like this is a case where documenting -inf will lead to more people trying it out. If it is not mathematically meaningful, then we could be promoting bad practice?

cc: @glemaitre

@urbanophile

My $0.02 is that this should be restricted to (0, inf). I have only heard of people wanting to increase the learning rate when doing something like cosine annealing, a linear warmup and decay, or a cyclic learning rate for much more complicated deep neural networks. In the context of a linear classifier, I think you want the whole method to be well-behaved, and the algorithm will not converge if the learning rate keeps increasing.

So, I'm happy to change this and submit a pull request if other people are agreeable.
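
If the consensus lands on restricting the range, here is a standalone sketch of the check being proposed (a hypothetical helper, not scikit-learn's actual parameter-validation code):

```python
def check_power_t(power_t: float) -> float:
    """Hypothetical validation: require power_t to lie in the open interval (0, inf)."""
    if not power_t > 0:
        raise ValueError(f"power_t must be in (0, inf), got {power_t!r}")
    return float(power_t)

check_power_t(0.5)     # passes
# check_power_t(-0.5)  # would raise ValueError
```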
