# Regularized Regression Models

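Basic idea 💡: Linear regression learns a weight (coefficient) for every feature (independent variable) and uses those weights to predict.

However, when there are too many features, or when the features are highly correlated, the model tends to overfit.

Regularization adds a penalty so that the weights do not grow too large, which helps prevent overfitting.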

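- Ridge: uses L2 regularization; adds the sum of the squared weights to the cost function as a penalty
- Lasso: uses L1 regularization; adds the sum of the absolute values of the weights to the cost function as a penalty
- ElasticNet: combines the Ridge and Lasso penalties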
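The three penalties in the list above can be sketched in a few lines of NumPy. This is only an illustration: the helper names are hypothetical, the weight vector is made up, and the ElasticNet mix follows the form given in scikit-learn's `ElasticNet` documentation (`alpha * l1_ratio * L1 + 0.5 * alpha * (1 - l1_ratio) * L2`). The intercept is left out because it is normally not penalized.

```python
import numpy as np

def l2_penalty(w, alpha):
    """Ridge-style penalty: alpha times the sum of squared weights."""
    return alpha * np.sum(w ** 2)

def l1_penalty(w, alpha):
    """Lasso-style penalty: alpha times the sum of absolute weights."""
    return alpha * np.sum(np.abs(w))

def elastic_net_penalty(w, alpha, l1_ratio):
    """Mix of both penalties, weighted as in scikit-learn's ElasticNet docs."""
    l1_part = alpha * l1_ratio * np.sum(np.abs(w))
    l2_part = 0.5 * alpha * (1 - l1_ratio) * np.sum(w ** 2)
    return l1_part + l2_part

# Made-up weight vector, for illustration only
w = np.array([3.0, -4.0, 0.0])

ridge_pen = l2_penalty(w, alpha=1.0)                         # 9 + 16 + 0 = 25.0
lasso_pen = l1_penalty(w, alpha=1.0)                         # 3 + 4 + 0 = 7.0
enet_pen = elastic_net_penalty(w, alpha=0.1, l1_ratio=0.5)   # about 0.975
print(ridge_pen, lasso_pen, enet_pen)
```

Large weights are punished much harder by the squared (L2) term than by the absolute-value (L1) term, while the L1 term keeps pushing small weights all the way to zero.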

In [1]:
from sklearn.linear_model import Ridge, Lasso, ElasticNet

In [2]:
import pandas as pd
from sklearn.model_selection import train_test_split

boston_df = pd.read_csv('./data/boston.csv')
X = boston_df.drop(columns='target')
y = boston_df['target']

X_train, X_test, y_train, y_test = train_test_split(X,y,test_size=0.3,random_state=156)

In [18]:
ridge = Ridge(alpha= 1.0)
ridge.fit(X_train, y_train)
pred_ridge = ridge.predict(X_test)

from sklearn.metrics import mean_squared_error, r2_score
mse = mean_squared_error(y_test, pred_ridge)
r2  = r2_score(y_test, pred_ridge)
mse, r2

(np.float64(17.206833316177086), np.float64(0.758490701135722))

In [17]:
lasso = Lasso(alpha=1.0)
lasso.fit(X_train, y_train)
pred_lasso = lasso.predict(X_test)

from sklearn.metrics import mean_squared_error, r2_score
# Score the Lasso predictions (pred_lasso), not the earlier Ridge ones
mse = mean_squared_error(y_test, pred_lasso)
r2  = r2_score(y_test, pred_lasso)
mse, r2

(np.float64(22.137098759778436), np.float64(0.68929116112626))

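MSE: the mean of the squared differences between predicted and actual values; it measures the size of the error, so lower is better.

R2: how well the model explains the variance in the actual data; the closer to 1, the better.

On the same test set the two move in opposite directions, because R2 = 1 - MSE / Var(y): a lower MSE means a higher R2.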
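To make the link between the two metrics concrete, here is a minimal sketch that computes both by hand with NumPy. The four target and prediction values are invented for illustration; `np.var` uses the population variance, which is what makes the identity exact.

```python
import numpy as np

# Toy values, for illustration only
y_true = np.array([3.0, 5.0, 7.0, 9.0])
y_pred = np.array([2.0, 5.0, 8.0, 9.0])

# MSE: mean of squared errors
mse = np.mean((y_true - y_pred) ** 2)            # 0.5

# R2: 1 - (residual sum of squares / total sum of squares)
ss_res = np.sum((y_true - y_pred) ** 2)
ss_tot = np.sum((y_true - y_true.mean()) ** 2)
r2 = 1 - ss_res / ss_tot                         # about 0.9

# Same number obtained from the MSE and the variance of the targets
r2_from_mse = 1 - mse / np.var(y_true)

print(mse, r2, r2_from_mse)
```

Because Var(y) is fixed for a given test set, ranking models by MSE or by R2 on that set gives the same order.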

Choosing alpha with cross-validation (RidgeCV, LassoCV)

In [5]:
from sklearn.linear_model import RidgeCV, LassoCV
alphas = [0.001,0.01,0.1,1,10,100]
ridge_cv = RidgeCV(alphas= alphas, cv = 5)
ridge_cv.fit(X_train, y_train)
ridge_preds = ridge_cv.predict(X_test)

from sklearn.metrics import mean_squared_error, r2_score
ridge_mse = mean_squared_error(y_test, ridge_preds)
ridge_r2 = r2_score(y_test, ridge_preds)
print(ridge_mse,ridge_r2)

17.296260101779946 0.7572355369870505


In [20]:
alphas = [0.001,0.01,0.1,1,10,100]
lasso_cv = LassoCV(alphas= alphas, cv = 5)
lasso_cv.fit(X_train, y_train)
lasso_preds = lasso_cv.predict(X_test)  # keep ridge_preds intact; store Lasso predictions separately

from sklearn.metrics import mean_squared_error, r2_score
lasso_mse = mean_squared_error(y_test, lasso_preds)
lasso_r2 = r2_score(y_test, lasso_preds)
print(lasso_mse,lasso_r2)

17.283108558822182 0.7574201276005788


In [22]:
ridge_cv.alpha_

np.float64(0.001)
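The selected alpha, 0.001, is the smallest value in the candidate list, so the search may simply have hit the edge of the grid. One possible follow-up, sketched here as a suggestion rather than something the notebook does, is to build the grid with `np.logspace` and extend it toward smaller values. The names `alphas_same` and `alphas_wide` are hypothetical.

```python
import numpy as np

# Reproduces the hand-written grid [0.001, 0.01, 0.1, 1, 10, 100]
alphas_same = np.logspace(-3, 2, 6)

# A wider, finer grid that also covers values below 0.001
alphas_wide = np.logspace(-5, 2, 15)

print(alphas_same)
print(alphas_wide.min(), alphas_wide.max(), len(alphas_wide))
```

Either array can be passed as the `alphas` argument of `RidgeCV` or `LassoCV`. If the best alpha keeps landing on the lower edge, the regularization is probably contributing little on this data.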

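| Aspect | Ridge | Lasso |
| -------- | --------------- | ------------ |
| Regularization | L2 (squared weights) | L1 (absolute values) |
| Feature removal | ❌ No | ✅ Removes some features |
| Mitigates multicollinearity | Yes | Yes |
| Interpretability | Medium | High |
| Typical use | Many features with high correlation | When feature selection is needed |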
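The "feature removal" row of the table can be seen on synthetic data. The sketch below uses `make_regression` with only 3 informative features out of 10; the dataset, the `_demo` names, and the alpha values are assumptions chosen for illustration and are unrelated to the Boston data used in this notebook.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import Lasso, Ridge

# Synthetic data: 10 features, only 3 of which actually drive the target
X_demo, y_demo = make_regression(
    n_samples=200, n_features=10, n_informative=3, noise=5.0, random_state=0
)

ridge_demo = Ridge(alpha=1.0).fit(X_demo, y_demo)
lasso_demo = Lasso(alpha=1.0).fit(X_demo, y_demo)

# Count coefficients that are exactly zero in each model
n_zero_ridge = int(np.sum(ridge_demo.coef_ == 0))
n_zero_lasso = int(np.sum(lasso_demo.coef_ == 0))

print("Ridge zero coefficients:", n_zero_ridge)
print("Lasso zero coefficients:", n_zero_lasso)
```

Ridge shrinks the uninformative coefficients toward zero but leaves them nonzero, whereas Lasso sets at least some of them to exactly zero, which is what makes it usable for feature selection.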


Lasso alpha=0.1

In [23]:
lasso = Lasso(alpha=0.1)  # a smaller alpha means weaker regularization
lasso.fit(X_train, y_train)
pred_lasso = lasso.predict(X_test)

print("\n[Lasso regression]")
print("MSE:", mean_squared_error(y_test, pred_lasso))
print("R2:", r2_score(y_test, pred_lasso))


[Lasso regression]
MSE: 17.828795683730707
R2: 0.7497610474831454


LassoCV

In [24]:
lasso_cv = LassoCV(alphas=alphas, cv=5, max_iter=10000)
lasso_cv.fit(X_train, y_train)
lasso_preds = lasso_cv.predict(X_test)
lasso_mse = mean_squared_error(y_test, lasso_preds)
lasso_r2 = r2_score(y_test, lasso_preds)

print(f"[Tuned Lasso] best alpha: {lasso_cv.alpha_}")
print(f"MSE: {lasso_mse:.3f}, R2: {lasso_r2:.3f}")

[Tuned Lasso] best alpha: 0.001
MSE: 17.283, R2: 0.757


In [25]:
ridge_cv.coef_  # no coefficient is exactly zero; the Lasso model (next cell) did not remove any feature either

array([-1.12661593e-01,  6.56008407e-02,  3.30878096e-02,  3.02619848e+00,
       -1.94682482e+01,  3.35481221e+00,  5.71239360e-03, -1.73664047e+00,
        3.55109529e-01, -1.43215477e-02, -9.16008532e-01,  1.04167858e-02,
       -5.66946541e-01])

In [26]:
lasso_cv.coef_

array([-1.12661593e-01,  6.56008407e-02,  3.30878096e-02,  3.02619848e+00,
       -1.94682482e+01,  3.35481221e+00,  5.71239360e-03, -1.73664047e+00,
        3.55109529e-01, -1.43215477e-02, -9.16008532e-01,  1.04167858e-02,
       -5.66946541e-01])
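A quick way to check which features a Lasso fit has dropped is to list the names whose coefficient is exactly zero. The helper `zeroed_features` below is hypothetical, and the placeholder names `x0` to `x12` stand in for the real column names, which are not shown in this notebook; the coefficient values are copied from the output above.

```python
import numpy as np

def zeroed_features(feature_names, coefs, tol=0.0):
    """Return the names whose coefficient magnitude is at most tol."""
    return [name for name, c in zip(feature_names, np.asarray(coefs)) if abs(c) <= tol]

# Coefficients copied from the lasso_cv.coef_ output above
lasso_coefs = np.array([
    -1.12661593e-01,  6.56008407e-02,  3.30878096e-02,  3.02619848e+00,
    -1.94682482e+01,  3.35481221e+00,  5.71239360e-03, -1.73664047e+00,
     3.55109529e-01, -1.43215477e-02, -9.16008532e-01,  1.04167858e-02,
    -5.66946541e-01,
])

# Placeholder names; with the real data one would pass X_train.columns instead
names = [f"x{i}" for i in range(len(lasso_coefs))]

removed = zeroed_features(names, lasso_coefs)
print(removed)  # prints []
```

The empty list agrees with the observation that nothing was removed: at alpha=0.001 the L1 penalty is too weak to push any of the 13 coefficients to zero.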

ElasticNet

In [None]:
enet = ElasticNet(alpha= 0.1, l1_ratio= 0.5)
enet.fit(X_train, y_train)

enet_pred = enet.predict(X_test)
print(mean_squared_error(y_test, enet_pred))
print(r2_score(y_test, enet_pred))

18.116738279436163
0.7457195824951278


Model performance

In [14]:
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import PolynomialFeatures
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression

In [15]:
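# Compare polynomial degrees 1 to 4: each pipeline expands the features, then fits ordinary least squares.
results = []

for degree in range(1, 5):
    model_poly = Pipeline([
        ("poly", PolynomialFeatures(degree=degree, include_bias=False)),
        ("linear", LinearRegression())
    ])

    model_poly.fit(X_train, y_train)
    pred_poly = model_poly.predict(X_test)

    # Record the test-set metrics for this degree
    results.append({
        "degree": degree,
        "MSE": mean_squared_error(y_test, pred_poly),
        "R2": r2_score(y_test, pred_poly),
    })

# After the loop, model_poly and pred_poly belong to the last model (degree=4).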

In [21]:
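# pred_poly is the degree-4 model from the loop above; the Lasso row in the output matches the alpha=1.0 fit.
results = pd.DataFrame({
    'model': ['Polynomial regression', 'Ridge regression', 'Lasso regression', 'ElasticNet regression'],
    'MSE': [mean_squared_error(y_test, pred_poly),
            mean_squared_error(y_test, pred_ridge),
            mean_squared_error(y_test, pred_lasso),
            mean_squared_error(y_test, enet_pred)
    ],
    'R2': [r2_score(y_test, pred_poly),
           r2_score(y_test, pred_ridge),
           r2_score(y_test, pred_lasso),
           r2_score(y_test, enet_pred),]
})

results

|   | model | MSE | R2 |
| --- | --- | --- | --- |
| 0 | Polynomial regression | 170599.948638 | -2393.48324 |
| 1 | Ridge regression | 17.206833 | 0.758491 |
| 2 | Lasso regression | 22.137099 | 0.689291 |
| 3 | ElasticNet regression | 18.116738 | 0.74572 |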
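One plausible reason for the very poor polynomial score is the number of columns that `PolynomialFeatures` generates. Since `pred_poly` comes from the last loop iteration (degree 4), the unregularized linear model is fitted on thousands of expanded features. The sketch below counts them with the standard combinatorial formula, assuming 13 input features (the length of the `coef_` arrays printed earlier); `n_poly_features` is a hypothetical helper.

```python
from math import comb

# Assumption: 13 input features, matching the 13 coefficients shown earlier
n_features = 13

def n_poly_features(n_features, degree):
    """Columns produced by PolynomialFeatures(degree, include_bias=False)."""
    # All monomials of total degree <= degree, minus the constant (bias) term
    return comb(n_features + degree, degree) - 1

counts = {degree: n_poly_features(n_features, degree) for degree in range(1, 5)}
for degree, n_cols in counts.items():
    print(degree, n_cols)
# prints:
# 1 13
# 2 104
# 3 559
# 4 2379
```

If the CSV is the usual 506-row Boston dataset (an assumption, since the row count is not shown), a 70% training split has only a few hundred rows, far fewer than 2379 columns. Plain least squares can then fit the training data almost perfectly and generalize badly, which is the situation where the regularized models above are expected to help.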
