
Different output than scikit-learn's LASSO on a weird example #173

Closed
rpetit opened this issue Dec 9, 2020 · 5 comments
rpetit commented Dec 9, 2020

Hi!

I am not sure this is the best place to report this, but I noticed a difference between the output produced by your solver and scikit-learn's. I had a really hard time coming up with a minimal example, so sorry if the one below is not very informative. I am using Python 3.7.4 and the celer development version on Ubuntu.

import numpy as np
import sklearn.linear_model
import celer


X = np.array([[0, 0, 0, 0, 0, 0, 0, 0.001, 0, 0, 0.015, 0, 0, 0.046, 0, 0, 0.061, 0, 0, 0.062]]).T
y = np.array([0.008, 0, 0.001, 0.02, 0, 0.001, 0.024, 0.001, 0.001, 0.023,
              0.006, 0, 0.011, 0.032, 0, 0.002, 0.056, 0.001, 0.001, 0.062])

lasso_sklearn = sklearn.linear_model.Lasso(alpha=1e-4)
lasso_celer = celer.Lasso(alpha=1e-4)

lasso_sklearn.fit(X, y)
lasso_celer.fit(X, y)

The coefficient I get from scikit-learn's solver is approximately 0.55 (the dual gap is 0), while the one I get from your solver is 0 (the dual gap is approximately 1e-4).

I know this is a really degenerate use case, so maybe there is no need to worry about it, but I wanted to report it just in case, and to ask whether there is any reason celer should not be used in such situations.

Thanks in advance for your help!

mathurinm (Owner) commented Dec 9, 2020

Hi Romain,
It's the correct place! I'm very happy to get your feedback.

I think both solvers are right: each returns a coef_ whose objective value is within some tolerance (a rescaled version of clf.tol) of the optimal one. sklearn performs one iteration, while celer performs none (it evaluates the duality gap before doing any update).

With
true_sol = sklearn.linear_model.Lasso(alpha=1e-4, tol=1e-14).fit(X, y)

In [44]: p_celer = norm(y - X @ celer_sol.coef_ - celer_sol.intercept_) ** 2 / (2 * len(y)) + celer_sol.alpha * norm(celer_sol.coef_, ord=1)

In [45]: p_celer
Out[45]: 0.000165375

In [46]: p_star = norm(y - X @ true_sol.coef_ - true_sol.intercept_) ** 2 / (2 * len(y)) + true_sol.alpha * norm(true_sol.coef_, ord=1)

In [47]: p_star
Out[47]: 0.00010331658481530062

In [48]: p_sklearn = norm(y - X @ sklearn_sol.coef_ - sklearn_sol.intercept_) ** 2 / (2 * len(y)) + sklearn_sol.alpha * norm(sklearn_sol.coef_, ord=1)

In [49]: p_sklearn
Out[49]: 0.00010331658481530062

In [50]: p_celer - p_star
Out[50]: 6.205841518469937e-05

In [51]: p_sklearn - p_star
Out[51]: 0.0

Notice that your objective at 0 is quite low (and within the default tolerance, 1e-4, of the optimal value):

In [16]: from numpy.linalg import norm

In [17]: norm(y - y.mean()) ** 2 / (2 * len(y))  # primal at w = 0
Out[17]: 0.000165375

If you decrease tol you get the same results:

In [18]: lasso_celer = celer.Lasso(alpha=1e-4, tol=1e-10).fit(X, y)

In [19]: lasso_sklearn = sklearn.linear_model.Lasso(alpha=1e-4, tol=1e-10).fit(X, y)

In [20]: lasso_celer.coef_, lasso_celer.intercept_
Out[20]: (array([0.55034622]), 0.007409297501753958)

In [21]: lasso_sklearn.coef_, lasso_sklearn.intercept_
Out[21]: (array([0.55034622]), 0.0074092975017539565)

mathurinm (Owner) commented:

Also be aware that both celer and sklearn fit an intercept by default.


rpetit commented Dec 9, 2020

Thank you very much for your quick reply! I'll make sure to set the tolerance carefully from now on.

> Also be aware that both celer and sklearn fit an intercept by default.

Thanks for pointing this out!

rpetit closed this as completed Dec 9, 2020
mathurinm (Owner) commented:

The tolerance should be scaled with respect to norm(y) ** 2 or norm(y) ** 2 / n_samples (as is done in sklearn) to make it easier to set. This has been on my to-do list for a while: #125

Keep me posted if there's any feature you're missing. This solver should be much faster than sklearn's if your data is high-dimensional.
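To illustrate the scaling point on the data from this thread: the natural scale of the objective here is the primal value at w = 0, which is itself of order 1e-4, so an absolute tolerance of 1e-4 is very coarse on this data. A sketch (the rescaling convention at the end is hypothetical, not celer's current behavior):

```python
import numpy as np

y = np.array([0.008, 0, 0.001, 0.02, 0, 0.001, 0.024, 0.001, 0.001, 0.023,
              0.006, 0, 0.011, 0.032, 0, 0.002, 0.056, 0.001, 0.001, 0.062])
n_samples = len(y)

# Primal objective at w = 0 (with an intercept, the best constant model is
# y.mean()): this is the natural scale of the problem on this data.
scale = np.linalg.norm(y - y.mean()) ** 2 / (2 * n_samples)
print(scale)  # ≈ 0.000165375, the same order of magnitude as the default tol of 1e-4

# A tolerance chosen relative to this scale is data-independent, e.g.:
tol = 1e-6 * scale
```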


rpetit commented Dec 9, 2020

> The tolerance should be scaled with respect to norm(y) ** 2 or norm(y) ** 2 / n_samples (as is done in sklearn) to make it easier to set. This has been on my to-do list for a while: #125

Good to know!

> Keep me posted if there's any feature you're missing. This solver should be much faster than sklearn's if your data is high-dimensional.

I actually want to solve problems with a large number of samples in a small dimension. However, I have to use a weighted l1 norm in the objective, which is why I am interested in celer; it turns out there are not many packages that allow doing this easily!
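As an aside on the weighted l1 penalty: the problem min ||y - Xw||² / (2n) + alpha · Σⱼ cⱼ|wⱼ| reduces to a plain Lasso by dividing each column of X by its weight and mapping the solution back, so results can be sanity-checked with scikit-learn alone. A minimal sketch with made-up data and weights (variable names are illustrative):

```python
import numpy as np
from sklearn.linear_model import Lasso

# Made-up data: 50 samples, 3 features, sparse ground truth.
rng = np.random.default_rng(0)
X = rng.standard_normal((50, 3))
y = X @ np.array([1.0, 0.0, -2.0]) + 0.01 * rng.standard_normal(50)

alpha = 0.01
weights = np.array([1.0, 5.0, 0.5])  # hypothetical per-feature l1 weights

# min ||y - Xw||^2 / (2n) + alpha * sum_j weights[j] * |w[j]| is equivalent
# to a plain Lasso on X / weights; the solution maps back as w = v / weights.
clf = Lasso(alpha=alpha, fit_intercept=False, tol=1e-12).fit(X / weights, y)
coef_weighted = clf.coef_ / weights
print(coef_weighted)
```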

Thanks again for your help and for maintaining this package; it's really pleasant to use!
