
Add Penalty factors for each coefficient in enet (similar to the R's glmnet library) #11566

Open
doaa-altarawy opened this issue Jul 16, 2018 · 17 comments
Labels
API help wanted Moderate Anything that requires some knowledge of conventions and best practices module:linear_model Needs Decision Requires decision

Comments

@doaa-altarawy

doaa-altarawy commented Jul 16, 2018

Description

This is a feature request to allow more flexibility in elastic net: let the user apply a separate penalty factor to each coefficient of the L1 term. The default value of this penalty factor is 1, which reproduces the regular elastic net behavior. A penalty factor of zero means the feature is not penalized at all, i.e. the user wants this feature to always remain in the model.

This feature is very useful in bioinformatics and systems biology (that's why it is in Stanford's R package glmnet). With it, the user can run feature selection on a set of genes while making sure some genes stay in the model and are not penalized (because of prior knowledge that they are involved in the system).

Here is the glmnet documentation explaining the penalty factor; mainly, it controls the selection weight on the lasso term:

https://web.stanford.edu/~hastie/glmnet/glmnet_alpha.html#lin
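For concreteness, the glmnet-style objective can be sketched in NumPy. In glmnet the penalty factor multiplies the whole per-coefficient penalty, so a factor of 0 removes both the L1 and the L2 term for that coefficient (the function and variable names below are illustrative, not an existing API):

```python
import numpy as np

def enet_objective(coef, X, y, alpha, l1_ratio, penalty_factor):
    """Elastic-net objective with per-coefficient penalty factors
    (glmnet-style: a factor of 1 gives the default behavior, a factor
    of 0 leaves that coefficient completely unpenalized)."""
    n = X.shape[0]
    resid = y - X @ coef
    mse = 0.5 * (resid @ resid) / n
    l1 = alpha * l1_ratio * np.sum(penalty_factor * np.abs(coef))
    l2 = 0.5 * alpha * (1 - l1_ratio) * np.sum(penalty_factor * coef ** 2)
    return mse + l1 + l2

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))
coef_true = np.array([1.0, 0.0, -2.0])
y = X @ coef_true  # noise-free, so the data term vanishes at coef_true
pf = np.array([1.0, 1.0, 0.0])  # third coefficient unpenalized
obj = enet_objective(coef_true, X, y, alpha=0.1, l1_ratio=0.5, penalty_factor=pf)
# only the first coefficient contributes: 0.05 * |1.0| + 0.025 * 1.0**2
```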

This feature is used in several papers; I've implemented it in scikit-learn for one of my own. I've had requests from other biologists who want to use this method in Python without having to recompile scikit-learn. I can open a pull request with this feature.

Papers using this feature:

  • Altarawy, Doaa, Fatma-Elzahraa Eid, and Lenwood S. Heath. "PEAK: Integrating Curated and Noisy Prior Knowledge in Gene Regulatory Network Inference." Journal of Computational Biology 24.9 (2017): 863-873.
  • Greenfield, Alex, Christoph Hafemeister, and Richard Bonneau. "Robust data-driven incorporation of prior knowledge into the inference of dynamic regulatory networks." Bioinformatics 29.8 (2013): 1060-1067.
  • Friedman, Jerome, Trevor Hastie, and Rob Tibshirani. "Regularization paths for generalized linear models via coordinate descent." Journal of statistical software 33.1 (2010): 1.
@doaa-altarawy doaa-altarawy changed the title Add Penalty factors for each coefficient (similar to the R's glmnet library) Add Penalty factors for each coefficient in enet (similar to the R's glmnet library) Jul 16, 2018
doaa-altarawy added a commit to doaa-altarawy/scikit-learn that referenced this issue Jul 16, 2018
@amueller
Member

Sounds like a reasonable addition, please open the PR. I can't guarantee that it'll get accepted, but it makes sense to me.

doaa-altarawy added a commit to doaa-altarawy/scikit-learn that referenced this issue Jul 19, 2018
doaa-altarawy added a commit to doaa-altarawy/scikit-learn that referenced this issue Jul 24, 2018
doaa-altarawy added a commit to doaa-altarawy/scikit-learn that referenced this issue Jul 24, 2018
doaa-altarawy added a commit to doaa-altarawy/scikit-learn that referenced this issue Jul 24, 2018
doaa-altarawy added a commit to doaa-altarawy/scikit-learn that referenced this issue Jul 24, 2018
@lorentzenchr
Member

This feature is also implemented in #9405, where you can exclude coefficients from the L1 as well as from the L2 penalty term.

@hermidalc
Contributor

If possible, could penalty factors also be added to LogisticRegression (which is also part of glmnet)? They would be just as useful as for ElasticNet, for classification problems where you need to include unpenalized covariates.

@cmarmo cmarmo added module:linear_model help wanted Needs Benchmarks A tag for the issues and PRs which require some benchmarks labels Jan 17, 2022
@lorentzenchr lorentzenchr added Moderate Anything that requires some knowledge of conventions and best practices Needs Decision - API and removed Needs Benchmarks A tag for the issues and PRs which require some benchmarks labels Feb 1, 2022
@lorentzenchr
Member

How do we want the API?

  1. Allow specifying an array-like for the penalization strength? Same for l1_ratio?
  2. Could we already permit working with feature names? Then a dict could work: {"feature_1": 0.5, "feature_2": 2.5}

Among the estimators are:

  • LogisticRegression(C=, l1_ratio=)
  • Lasso(alpha=)
  • ElasticNet(alpha=, l1_ratio=)
  • Ridge(alpha=)

What do we do with the CV variants? I would leave them untouched for the moment.
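Until such a parameter exists, a per-feature L1 penalty with strictly positive factors can be emulated with stock Lasso by rescaling columns: substituting c_j = v_j * b_j turns alpha * sum(v_j * |b_j|) into a plain L1 penalty on c. A sketch (the helper name is mine, and the trick is exact only for the pure L1 case, not for elastic net's L2 term):

```python
import numpy as np
from sklearn.linear_model import Lasso

def lasso_with_penalty_factors(X, y, alpha, penalty_factor):
    """Emulate per-feature L1 penalty factors with stock Lasso by
    rescaling columns. Requires strictly positive factors (a factor
    of 0, i.e. an unpenalized feature, cannot be expressed this way)."""
    pf = np.asarray(penalty_factor, dtype=float)
    # column j divided by its factor; the fitted coef_ is c = pf * b
    est = Lasso(alpha=alpha, fit_intercept=False).fit(X / pf, y)
    return est.coef_ / pf  # map c back to the original coefficients b

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))
y = X @ np.array([2.0, 0.0, -1.5]) + 0.01 * rng.normal(size=100)
coef = lasso_with_penalty_factors(X, y, alpha=0.05, penalty_factor=[1.0, 1.0, 1.0])
```

With all factors equal to 1 this reduces to plain Lasso; a factor of 5 on a feature shrinks its coefficient as if alpha were five times larger for that feature only.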

@scikit-learn/core-devs as info

@agramfort
Member

agramfort commented Feb 1, 2022 via email

@xiaowei1234

Can I get a review on my pull request that addresses this issue?

@thomasjpfan
Member

Ridge already accepts alpha as an array of (n_targets,). If GLMs support an array of (n_features,) the two APIs would be inconsistent.
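For reference, the existing per-target behavior looks like this: an alpha of shape (n_targets,) applies a different strength to each output, which is exactly the shape collision with a hypothetical (n_features,) array:

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import Ridge

X, y = make_regression(n_samples=100, n_features=5, n_targets=2, random_state=0)

# One regularization strength per *target*, not per feature:
reg = Ridge(alpha=np.array([0.5, 100.0])).fit(X, y)
print(reg.coef_.shape)  # (2, 5): one row of coefficients per target
```

Each target is effectively fitted with its own alpha, so row k of coef_ matches a single-target Ridge fit on y[:, k].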

@jnothman
Member

jnothman commented Mar 6, 2022

Can we accept shape (n_targets,), (1, n_targets), (n_features, 1) or (n_features, n_targets)?

@lorentzenchr
Member

Personally, I find the multi-output story for penalties a bit unfortunate (maybe I'm blind or unaware of good use cases). Given that we can't (easily) change that, what about introducing new parameters P1 and P2, as in https://glum.readthedocs.io/en/latest/glm.html (and as in #9405), for the L1 and L2 penalty matrices?

In particular, P2 would be nice to have, as it generalizes the penalty to coef @ P2 @ coef. This can come in very handy, e.g. for penalizing differences of coefficients, or simply for a different penalty strength per feature.
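To illustrate why a full P2 matrix is attractive: the generalized ridge problem ||y - X @ b||^2 + b @ P2 @ b has the closed form b = (X'X + P2)^{-1} X'y, so a difference penalty is just one choice of P2. A NumPy sketch (illustrative, not a proposed scikit-learn API):

```python
import numpy as np

def ridge_general_penalty(X, y, P2):
    """Minimize ||y - X @ b||^2 + b @ P2 @ b in closed form:
    b = (X'X + P2)^{-1} X'y, for P2 symmetric positive semidefinite."""
    return np.linalg.solve(X.T @ X + P2, X.T @ y)

rng = np.random.default_rng(0)
n, p = 200, 6
X = rng.normal(size=(n, p))
y = X @ np.linspace(1.0, 2.0, p) + 0.1 * rng.normal(size=n)

# First-difference matrix D: row i is e_{i+1} - e_i, so ||D @ b||^2
# penalizes differences of adjacent coefficients (a smoothness penalty) ...
D = np.diff(np.eye(p), axis=0)
b_smooth = ridge_general_penalty(X, y, 10.0 * D.T @ D)

# ... while a diagonal P2 is just a per-feature ridge strength.
b_perfeat = ridge_general_penalty(X, y, np.diag([0.0, 0.0, 5.0, 5.0, 5.0, 5.0]))
```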

Further advantages:

  • unified way to specify feature-wise penalties across all (linear) estimators: always the same parameter name
  • clear distinction between l1 and l2
  • no mixing/confusion with n_targets

@xiaowei1234

What is n_targets? Is that not the number of features?

@thomasjpfan
Member

It's the number of targets in y:

from sklearn.datasets import make_regression
from sklearn.linear_model import Ridge

X, y = make_regression(n_targets=2, random_state=0)

print(y.shape)
# (100, 2)

reg = Ridge().fit(X, y)
print(reg.coef_.shape)
# (2, 100)

@xiaowei1234

Ah ok, thanks. I haven't ever had the need to fit a regression with multiple targets before.

@agramfort
Member

agramfort commented Mar 8, 2022 via email

@lorentzenchr
Member

But to me P1 cannot be a generic matrix as otherwise the solvers will be a lot more complicated.

Correct, a 1d array for the diagonals of a P1 matrix suffices and keeps things solvable within our codebase.
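To illustrate why a diagonal P1 keeps the solver simple: in coordinate descent, a per-feature L1 weight only changes the soft-thresholding level for that coordinate. A minimal sketch (not scikit-learn's actual Cython solver):

```python
import numpy as np

def cd_lasso_penalty_factors(X, y, alpha, penalty_factor, n_iter=500):
    """Coordinate descent for (1/2n)*||y - X @ b||^2 + alpha * sum(v_j * |b_j|).
    The only change vs plain Lasso is the per-feature soft-threshold
    level alpha * v_j, which is why a 1d (diagonal) P1 stays tractable."""
    n, p = X.shape
    v = np.asarray(penalty_factor, dtype=float)
    b = np.zeros(p)
    col_norm = (X ** 2).sum(axis=0) / n
    for _ in range(n_iter):
        for j in range(p):
            # correlation of column j with the partial residual
            rho = X[:, j] @ (y - X @ b + X[:, j] * b[j]) / n
            b[j] = np.sign(rho) * max(abs(rho) - alpha * v[j], 0.0) / col_norm[j]
    return b

rng = np.random.default_rng(0)
X = rng.normal(size=(80, 4))
y = X @ np.array([1.5, 0.0, -1.0, 0.2]) + 0.05 * rng.normal(size=80)
b = cd_lasso_penalty_factors(X, y, alpha=0.1, penalty_factor=[1.0, 1.0, 1.0, 1.0])
# with all factors equal to 1 this coincides with plain Lasso(alpha=0.1)
```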

@lorentzenchr
Member

As discussed in the dev meeting 28 March 2022, new parameter(s) seems fine. The questions are:

  1. One or two parameters?
  2. Naming

One vs two

Having one parameter like alpha_features would work for pretty much all linear models; it keeps the L1 and L2 penalties in sync.
Having two individual ones, like P1 and P2 for ||P1 * w||_1 and w' @ diag(P2) @ w, would allow more control. It also opens the opportunity to later allow P2 to be a 2d array.
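The two options can be written down directly as penalty functions. A sketch using the names from this thread (P1, P2, and alpha_features are proposals here, not existing scikit-learn parameters):

```python
import numpy as np

def penalty_two_params(coef, P1, P2):
    """Two-parameter variant: sum(P1 * |w|) for L1 plus
    0.5 * w @ diag(P2) @ w for L2, with P1 and P2 1d per-feature arrays."""
    coef = np.asarray(coef, dtype=float)
    return np.sum(P1 * np.abs(coef)) + 0.5 * np.sum(P2 * coef ** 2)

def penalty_one_param(coef, alpha, l1_ratio, alpha_features):
    """One-parameter variant: a single per-feature factor scales both
    penalty terms in sync, i.e. the special case P1 and P2 proportional."""
    af = np.asarray(alpha_features, dtype=float)
    return penalty_two_params(coef, alpha * l1_ratio * af,
                              alpha * (1 - l1_ratio) * af)
```

The one-parameter form is the special case where P1 and P2 share the same per-feature profile; two parameters decouple them, e.g. L1 on everything but no L2 on main effects.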

@agramfort
Member

agramfort commented Mar 29, 2022 via email

@lorentzenchr
Member

There are several use cases:

  • I might not want to penalize some coefficients at all, in particular very strong main effects.
  • I might consider adding L1 to all coefficients, but excluding main effects from L2, which acts more strongly on large coefficients.
  • With two different parameters for L1 and L2, we could later allow the L2 parameter to be a 2d array. With this one can construct:


9 participants