elastic_net implementation #9428
Sometimes elastic_net gives a different result than the l1 and l1_cvxopt solvers:
```python
import numpy as np
import statsmodels.api as sm
from sklearn.linear_model import LogisticRegression

# Load and prepare data
FACTOR = 1
X = np.array([[1, 4], [2, 1], [3, 3], [4, 2]] * FACTOR)
y = np.array([0, 0, 1, 1] * FACTOR)
X_c = sm.add_constant(X)
n = len(X)
ALPHA = .1
alpha_mask = np.ones(X_c.shape[1])
alpha_mask[0] = 0  # do not penalize the intercept

# Sklearn with elastic net
model = LogisticRegression(
    penalty="elasticnet",
    solver="saga",
    max_iter=10000,
    C=1 / ALPHA / n,
    l1_ratio=1.0,
    warm_start=True,
    tol=1e-8,
)
model.fit(X, y)
print("sklearn saga:", model.intercept_, model.coef_)
coef = np.concatenate([model.intercept_, model.coef_[0]])
y_pred = model.predict_proba(X)[:, 1]
loss = -np.mean(y * np.log(y_pred) + (1 - y) * np.log(1 - y_pred))
print("Logistic regression loss:", loss + ALPHA * np.abs(model.coef_).sum())

# Statsmodels Logit with L1 (cvxopt)
model = sm.Logit(y, X_c)
results = model.fit_regularized(
    method="l1_cvxopt_cp", alpha=ALPHA * n * alpha_mask, disp=False, maxiter=100
)
print("Logit l1_cvxopt_cp:", results.params)
loss = -model.loglikeobs(results.params).mean()
print("Logit l1_cvxopt_cp loss:", loss + ALPHA * np.abs(results.params[1:]).sum())

# Statsmodels Logit with L1
model = sm.Logit(y, X_c)
results = model.fit_regularized(method="l1", alpha=ALPHA * n * alpha_mask, disp=False)
print("Logit l1:", results.params)
loss = -model.loglikeobs(results.params).mean()
print("Logit l1 loss:", loss + ALPHA * np.abs(results.params[1:]).sum())

# Statsmodels GLM with elastic net
model = sm.GLM(y, X_c, family=sm.families.Binomial())
results = model.fit_regularized(
    method="elastic_net", L1_wt=1.0, alpha=ALPHA * alpha_mask, maxiter=10000
)
print("GLM elastic_net (corrected):", results.params)
loss = -model.loglike(results.params) / n
print("GLM elastic_net loss:", loss + ALPHA * np.abs(results.params[1:]).sum())
```
Output:

```
sklearn saga: [-5.22609283] [[2.02421525 0.07529035]]
Logistic regression loss: 0.3804874930269413
Logit l1_cvxopt_cp: [-5.22611898  2.02422238  0.07529335]
Logit l1_cvxopt_cp loss: 0.38048749302397933
Logit l1: [-5.22610827  2.02422335  0.07528856]
Logit l1 loss: 0.3804874930259116
GLM elastic_net (corrected): [-5.09474961  2.03789985  0.        ]
GLM elastic_net loss: 0.3808717013601821
```
It is interesting because the objective values are very close, yet the set of nonzero coefficients differs: the `elastic_net` path zeros out the second coefficient, while sklearn and both L1 solvers keep it.
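One way to confirm this is a genuine suboptimality rather than a penalty-accounting difference is to evaluate the same penalized objective (mean negative log-likelihood plus the L1 penalty on the non-intercept coefficients) at both printed solutions. A minimal sketch, using only NumPy and the parameter vectors from the output above:

```python
import numpy as np

X_c = np.column_stack([np.ones(4), [1, 2, 3, 4], [4, 1, 3, 2]])
y = np.array([0.0, 0.0, 1.0, 1.0])
ALPHA = 0.1

def penalized_objective(params):
    # mean negative log-likelihood of the logistic model,
    # computed stably as log(1 + exp(z)) - y*z, plus the
    # L1 penalty on the non-intercept coefficients
    z = X_c @ params
    nll = np.mean(np.logaddexp(0.0, z) - y * z)
    return nll + ALPHA * np.abs(params[1:]).sum()

l1_params = np.array([-5.22611898, 2.02422238, 0.07529335])  # l1_cvxopt_cp
en_params = np.array([-5.09474961, 2.03789985, 0.0])         # elastic_net

print(penalized_objective(l1_params))  # ≈ 0.3804875
print(penalized_objective(en_params))  # ≈ 0.3808717
```

Both values reproduce the losses printed above, and the L1 solution attains a strictly lower value of the shared objective, so the `elastic_net` result is not just at a different corner of an equivalent optimum.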