[MRG+2] Adding Implementation of SAG - next episode #4738

Closed · wants to merge 11 commits into master

Conversation

TomDLT (Member) commented May 19, 2015

I took over the great work of @dsullivan7 in #3814.

I removed the merges with master, squashed all the commits and rebased on master.

dsullivan7 (Contributor) commented May 19, 2015

Awesome, @TomDLT! There was talk of having this implemented as a solver for LogisticRegression and RidgeRegression; have you looked into that at all?

amueller (Member) commented May 19, 2015

Travis is still unhappy ;) Thanks for picking this up!

TomDLT (Member) commented May 19, 2015

I re-ran the classifier benchmark on two large datasets, RCV1 and Alpha (cf. here).
The plots show convergence as log10(|loss - loss_optimal|).

Result on Alpha (500,000 x 500, dense): [plot: diffloss]
Result on RCV1 (804,414 x 47,152, sparse): [plot: diffloss]
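
Below is a minimal sketch of the kind of convergence comparison described above; it is not the actual benchmark script, and the synthetic dataset, solver list, and loss computation are illustrative assumptions.

import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import log_loss

X, y = make_classification(n_samples=5000, n_features=50, random_state=0)
C = 1.0

def penalized_loss(clf):
    # L2-penalized logistic loss, the objective LogisticRegression minimizes (up to scaling by C)
    prob = clf.predict_proba(X)
    return log_loss(y, prob, normalize=False) + 0.5 / C * np.dot(clf.coef_.ravel(), clf.coef_.ravel())

losses = {}
for solver in ["liblinear", "newton-cg", "lbfgs", "sag"]:
    clf = LogisticRegression(solver=solver, C=C, tol=1e-8, max_iter=200).fit(X, y)
    losses[solver] = penalized_loss(clf)

loss_optimal = min(losses.values())
for solver, loss in losses.items():
    print(solver, np.log10(abs(loss - loss_optimal) + 1e-16))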

agramfort (Member) commented May 19, 2015

as discussed I vote for adding 'sag' solver to LogisticRegression and RidgeRegression that would call plain sag_logistic and sag_ridge functions.
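
For reference, a quick sketch of what a 'sag' solver value looks like from the estimator API once merged; the hyperparameter values here are arbitrary, and the plain sag_logistic / sag_ridge helper names discussed above changed during review.

from sklearn.linear_model import LogisticRegression, Ridge

# 'sag' becomes just another value of the existing solver parameter
log_reg = LogisticRegression(solver="sag", max_iter=100)
ridge = Ridge(solver="sag", max_iter=1000, tol=1e-3)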

amueller (Member) commented May 19, 2015

newton-cg is faster than liblinear? I'm surprised! Anyhow, SAG seems to kick ass. I'd be +1 on adding a solver to the classifiers, as this seems like a good default.

amueller (Member) commented May 19, 2015

(Though I dream of the day when the default LogisticRegression is multinomial, not OvR ;)

TomDLT (Member) commented May 19, 2015

newton-cg is faster than liblinear? I'm surprised!

Actually, newton-cg is not faster.
In the previous example, with fit_intercept=True, liblinear and newton-cg do not converge to the same minimum, since liblinear regularizes the intercept, whereas newton-cg and SAG don't.

I tried using the same regularization in SAG, and it converges to the same minimum as liblinear.
However, it makes more sense not to regularize the intercept.

Finally, with fit_intercept=False, we see that liblinear is not slower than newton-cg.
[plot: diffloss_no_intercept]
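
A small hedged check of the point above, on a stand-in dataset rather than the benchmark from the plot: with fit_intercept=True, liblinear also penalizes the intercept, so its objective differs slightly from newton-cg's; with fit_intercept=False the objectives coincide and the solutions agree.

from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

X, y = make_classification(n_samples=2000, n_features=20, random_state=0)

for fit_intercept in (True, False):
    coefs = {}
    for solver in ("liblinear", "newton-cg"):
        clf = LogisticRegression(solver=solver, fit_intercept=fit_intercept,
                                 C=1.0, tol=1e-10, max_iter=1000).fit(X, y)
        coefs[solver] = clf.coef_.ravel()
    gap = abs(coefs["liblinear"] - coefs["newton-cg"]).max()
    print("fit_intercept=%s  max|coef difference| = %.2e" % (fit_intercept, gap))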

amueller (Member) commented May 19, 2015

Thanks for the explanation.

TomDLT (Member) commented May 20, 2015

I implemented sag_logistic as a solver in LogisticRegression, and changed some of the tests accordingly.

Currently, in order to match the LogisticRegression API, and compared to the previous SAGClassifier:

  • eta is forced to 'auto'
  • we lose warm_start
  • we lose parallel processing for multiclass
  • the multiclass behavior with class weights changes (it is now the same as in the other LogisticRegression solvers with the 'OvR' strategy; see the sketch below)
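
To make the last bullet concrete, here is a hedged sketch of the one-vs-rest strategy using the public meta-estimator; the solver handles OvR internally, so this illustrates the idea rather than the actual code path.

from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.multiclass import OneVsRestClassifier

X, y = make_classification(n_samples=300, n_features=10, n_informative=5,
                           n_classes=3, random_state=0)
ovr = OneVsRestClassifier(LogisticRegression(solver="sag", max_iter=1000))
ovr.fit(X, y)
print(len(ovr.estimators_))  # one binary classifier per class -> 3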

amueller (Member) commented May 20, 2015

why do we lose warm_start?

agramfort (Member) commented May 21, 2015

There is no reason not to support warm_start.

I would put all SAG-related code in one file called sag.py (i.e. no sag_class.py).

agramfort (Member) commented May 21, 2015

Travis is not happy.

TomDLT (Member) commented May 21, 2015

For warm_start, how should I pass the option without adding parameters to LogisticRegression?

agramfort (Member) commented May 21, 2015

agramfort (Member) commented May 21, 2015

ping us when ready to merge.

thx

amueller (Member) commented May 21, 2015

I also thought there was a warm_start for LogisticRegressionCV... hum

amueller (Member) commented May 21, 2015

needs a rebase (probably for whatsnew)

TomDLT (Member) commented May 21, 2015

I implemented sag_ridge as a solver in Ridge, and changed some of the tests accordingly.

Currently, in order to match the Ridge API, and compared to the previous SAGRegressor:

  • eta is forced to 'auto'
  • we lose random_state and warm_start

agramfort (Member) commented May 21, 2015

amueller (Member) commented May 21, 2015

yeah

amueller (Member) commented May 21, 2015

shouldn't LogisticRegression have a random_state for liblinear? Or is that only for hinge-loss?

agramfort (Member) commented May 21, 2015

TomDLT (Member) commented May 21, 2015

No, the LogisticRegression class has a random_state, and so does sag_logistic.
It is missing in the Ridge class, so it is currently also missing in sag_ridge.

agramfort (Member) commented May 21, 2015

TomDLT (Member) commented May 22, 2015

So I added random_state in sag_ridge.

TomDLT (Member) commented May 22, 2015

By the way, _fit_liblinear has a random_state parameter, but it is not used in LogisticRegression with solver='liblinear'.
Actually, LogisticRegression has a random_state parameter that was never used (before SAG).

agramfort (Member) commented May 22, 2015

TomDLT (Member) commented May 27, 2015

Here is the implementation of SAG as two solvers (sag_ridge and sag_logistic).
I also updated the docs and the tests.

I checked the speed, compared to the previous class implementation (SAGClassifier):
Classifier on the RCV1 dataset: [plot: classifier_rcv1_dloss]
Regressor on the Alpha dataset: [plot: regressor_alpha_dloss_nointercept_alone]

TomDLT (Member) commented May 27, 2015

Please tell me what you think of the code.

Also, do we want to change the API for Ridge and LogisticRegression, in order to add a warm_start parameter?

amueller (Member) commented May 27, 2015

Why is there a change in speed? Shouldn't it be the same?

Review thread on doc/modules/linear_model.rst

@@ -696,6 +696,10 @@ be better calibrated than the default "one-vs-rest" setting.
L-BFGS and newton-cg cannot optimize L1-penalized models, though,
so the "multinomial" setting does not learn sparse models.
The solver "sag" uses a Stochastic Average Gradient descent [3]_. It does not handle "multinomial" case, and is limited to L2-penalized models, yet it is

amueller (Member) commented May 27, 2015: It uses ovr for multi-class, right?

TomDLT (Member) commented May 27, 2015: Yes

amueller (Member) commented May 27, 2015: Well, then I would say that ;)

agramfort (Member) commented May 29, 2015: line too long

Review thread on doc/modules/linear_model.rst

@@ -696,6 +696,10 @@ be better calibrated than the default "one-vs-rest" setting.
L-BFGS and newton-cg cannot optimize L1-penalized models, though,
so the "multinomial" setting does not learn sparse models.
The solver "sag" uses a Stochastic Average Gradient descent [3]_. It does not handle "multinomial" case, and is limited to L2-penalized models, yet it is
faster than other solvers for large datasets, when both the number of sample

amueller (Member) commented May 27, 2015: samples

Review thread on doc/modules/linear_model.rst

@@ -696,6 +696,10 @@ be better calibrated than the default "one-vs-rest" setting.
L-BFGS and newton-cg cannot optimize L1-penalized models, though,
so the "multinomial" setting does not learn sparse models.
The solver "sag" uses a Stochastic Average Gradient descent [3]_. It does not handle "multinomial" case, and is limited to L2-penalized models, yet it is
faster than other solvers for large datasets, when both the number of sample
and the number of feature are large.

amueller (Member) commented May 27, 2015: features

@@ -204,7 +205,7 @@ def set_fast_parameters(estimator):
and estimator.__class__.__name__ != "TSNE"):
estimator.set_params(n_iter=5)
if "max_iter" in params:
# NMF
warnings.simplefilter("ignore", ConvergenceWarning)

amueller (Member) commented May 27, 2015: Why did you remove the nmf comment?

dsullivan7 (Contributor) commented May 27, 2015: Because "max_iter" is now in the params of SAG, this no longer applies exclusively to NMF. I suppose it would have been wiser to instead make the comment "NMF or SAG".

amueller (Member) commented May 27, 2015: hum, max_iter is also in many other estimators, right? Not sure why I put NMF here then... well, never mind

amueller (Member) commented May 27, 2015

Does this implementation actually sample the data point to update randomly? I thought usually people just shuffle after each iteration. Have you compared the two approaches (or did anyone before)?
Do we want liblinear to be the default solver still?

amueller (Member) commented May 27, 2015

If you want to add warm_start, I'd be fine with that. It should also be supported for L-BFGS, right?

TomDLT (Member) commented May 28, 2015

Does this implementation actually sample the data point to update randomly?

Yes, each data point is chosen at random, and the same data point can be chosen several times in a row ("draw with replacement").

I thought usually people just shuffle after each iteration. Have you compared the two approaches (or did anyone before)?

In Mark Schmidt's presentation (co-author of SAG), slide 76:
"Does re-shuffling and doing full passes work better? NO!"
It decreases speed. I did not test it myself.

Do we want 'liblinear' to be the default solver still?

I think we should keep it, since the 'sag' solver is less efficient when the dataset is small.
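
A tiny sketch in plain NumPy (not the Cython code of this PR) contrasting the two index orders being compared, for one "epoch" of n_samples updates:

import numpy as np

rng = np.random.RandomState(0)
n_samples = 10

# SAG as implemented here: draw with replacement, so an index can repeat
with_replacement = rng.randint(n_samples, size=n_samples)

# The alternative raised above: re-shuffle and do one full pass per epoch
full_pass = rng.permutation(n_samples)

print(with_replacement)  # repeated indices are possible
print(full_pass)         # a permutation of 0..n_samples-1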

TomDLT (Member) commented May 28, 2015

If you want to add warm_start, I'd be fine with that. It should also be supported for L-BFGS, right?

Actually, there are two kinds of warm starting for SAG:
(1) reuse the coef as the initial guess
(2) reuse the coef as the initial guess, and also store and reuse the intermediate computations (SAG paper, section 4.4) (better)

In LogisticRegressionCV, warm starting (1) is already used for SAG, LBFGS and Newton-CG.
I can modify it slightly in order to use warm starting (2) for SAG (easy).

If we want warm starting in LogisticRegression, we only need to add a warm_start parameter, and warm starting (1) can be used for SAG, LBFGS and Newton-CG. If we want warm starting (2) for SAG, we need to store the intermediate computations, of size (n_samples + n_features).

In Ridge, the other solvers do not handle warm starting. If we want warm starting (1) for SAG, we only need to add a warm_start parameter. If we want warm starting (2) for SAG, we need to store the intermediate computations, of size (n_samples + n_features).

RidgeCV currently does not handle SAG.
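
A hedged sketch of flavor (1) through the public API; warm_start did not exist on LogisticRegression when this comment was written and was only added later, and flavor (2), which keeps SAG's per-sample gradient memory between calls, is an internal detail not shown here.

from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

X, y = make_classification(n_samples=1000, n_features=20, random_state=0)

clf = LogisticRegression(solver="sag", warm_start=True, max_iter=5)
clf.fit(X, y)                      # cold start; may stop before convergence
coef_after_first_fit = clf.coef_.copy()
clf.fit(X, y)                      # flavor (1): restarts from the previous coef_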

amueller (Member) commented May 28, 2015

What do the benchmarks on small datasets look like?

ogrisel (Member) commented Aug 27, 2015

I believe @TomDLT is on vacation at the moment. He should be back online soon :)

agramfort (Member) commented Aug 27, 2015

fabianp (Member) commented Aug 27, 2015

Some more feedback (I started using it, and some usability issues came to mind).

The function signatures of sag_ridge and sag_logistic are quite similar, but their outputs are very different, which feels strange. The first one returns coef_, n_iter while the second one returns a dict with many keys (including coef) and n_iter. I understand that sag_logistic might want to return more information, but to make it more consistent I would make it such that both return a triplet (coef_, n_iter, warm_start_mem), where the last one contains the information in warm_start_mem from sag_logistic except for coef_. What do you think?
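
A hedged stub of the convention proposed above; the signatures and the keys of warm_start_mem are illustrative assumptions, not the merged code.

import numpy as np

def sag_ridge(X, y, alpha=1.0, max_iter=1000, tol=1e-3, warm_start_mem=None):
    n_samples, n_features = X.shape
    coef_ = np.zeros(n_features)            # stub: the actual solver loop is omitted
    n_iter = 0
    warm_start_mem = {"sum_gradient": np.zeros(n_features),
                      "gradient_memory": np.zeros(n_samples)}
    return coef_, n_iter, warm_start_mem    # same triplet from both helpers

def sag_logistic(X, y, alpha=1.0, max_iter=1000, tol=1e-3, warm_start_mem=None):
    return sag_ridge(X, y, alpha, max_iter, tol, warm_start_mem)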

agramfort (Member) commented Aug 27, 2015

ogrisel (Member) commented Aug 27, 2015

+1 as well for consistency.

TomDLT (Member) commented Sep 8, 2015

Thanks a lot for all your reviews!

Changing the signatures of sag_logistic and sag_ridge would have led to two extremely similar functions, so I decided to merge them into one function, sag_solver.

About the default alpha value: I changed it to 1, to be consistent with Ridge's default alpha and LogisticRegression's default C.

About the scaling of alpha by n_samples, the confusion comes from:
alpha_ridge = 1. / C_logistic
alpha_sgd = 1. / C_logistic / n_samples

For example, in this benchmark I need to scale alpha_sgd to obtain the same minimum.

alpha_sag was initially equal to alpha_sgd.
I chose to change it to alpha_sag = alpha_ridge = 1. / C_logistic, and to scale it by n_samples internally.
Choosing C_sag = C_logistic = 1. / alpha_ridge would be identical.

To avoid confusion, I renamed the internal variable to alpha_scaled.

As suggested, I also added a boolean parameter return_n_iter to the public ridge_regression function, to avoid changing its signature.
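
A small numeric restatement of the relations above; the values are arbitrary, and the relations are exactly those written in the comment.

n_samples = 1000
C_logistic = 1.0                          # LogisticRegression's C
alpha_ridge = 1.0 / C_logistic            # Ridge-style alpha
alpha_sgd = 1.0 / C_logistic / n_samples  # SGD-style alpha, scaled per sample

alpha_sag = alpha_ridge                   # convention chosen for the sag solver
alpha_scaled = alpha_sag / n_samples      # rescaled internally by n_samples
assert alpha_scaled == alpha_sgd          # same effective regularization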

fabianp (Member) commented Sep 9, 2015

Great! Thanks for the changes, all looks good to me now. +1 from me.

fabianp changed the title from [MRG+1] Adding Implementation of SAG - next episode to [MRG+2] Adding Implementation of SAG - next episode on Sep 9, 2015

TomDLT (Member) commented Sep 9, 2015

Thanks again for the review!

FYI I am working on a multinomial version of SAG, but it will be in another PR.

agramfort (Member) commented Sep 9, 2015

Review thread on doc/whats_new.rst

@@ -156,6 +158,12 @@ Enhancements
visible with extra trees and on datasets with categorical or sparse
features. By `Arnaud Joly`_.
- Added optional parameter ``random_state`` in :class:`linear_model.Ridge`
, to set the seed of the pseudo random generator used in ``sag`` solver.By `Tom Dupre la Tour`_.

amueller (Member) commented Sep 9, 2015: space before "By" ;)

amueller (Member) commented Sep 9, 2015

@ogrisel we want this in the release, right?

amueller (Member) commented Sep 9, 2015

Great work everybody :)

amueller added this to the 0.17 milestone on Sep 9, 2015

ogrisel (Member) commented Sep 9, 2015

@ogrisel we want this in the release, right?

I am not opposed to having it in :)

ogrisel (Member) commented Sep 10, 2015

FYI I am working on a multinomial version of SAG, but it will be in another PR.

It would be great to also consider adding support for sample_weight to LogisticRegression and LogisticRegressionCV while you are at it.

ogrisel (Member) commented Sep 10, 2015

This PR needs a rebase on top of the current master.

amueller (Member) commented Sep 10, 2015

I'll rebase, squash and merge in a bit unless anyone complains.

amueller (Member) commented Sep 10, 2015

Pushed as 94eb619. Thanks for the great work!

amueller closed this on Sep 10, 2015

agramfort (Member) commented Sep 10, 2015

ogrisel (Member) commented Sep 11, 2015

🍻!

dsullivan7 (Contributor) commented Sep 11, 2015

Awesome!!

TomDLT (Member) commented Sep 11, 2015

Nice!

fabianp (Member) commented Sep 11, 2015

Yeah! @TomDLT deserves extra kudos for patience and perseverance :-)

TomDLT (Member) commented Sep 11, 2015

Thanks :)
