### 규제
* 학습이 과대적합도는 것을 방지하고자 하는 알고리즘
* 라쏘(Lasso)
    - L1규제를 추가한 모형
    - 영향력이 크지 않은 회귀계수 값을 0으로 만드는 특성이 있다.
        * 회귀계수 : 독립변수의 값이 변화함에 따라 종속변수에 미치는 영향력 크기
    - alpha를 이용하여 가중치 제어. alpha값에 따라 과적합될 우려가 있다.
    - 영향력이 작은 회귀계수를 0으로 만듦으로써 모델에서 가장 중요한 특성이 무엇인지 알 수 있다
* 릿지(Ridge)
    - L2규제를 추가한 모형
    - 계수값을 0이 아닌 작게 만드는 특성이 있다.
    - alpha를 이용하여 가중치 제어. alpha값에 따라 과적합될 우려가 있다.
* 엘라스틱넷(ElasticNet)
    - L1, L2를 함께 결합한 모형
    - 피처가 많은 데이터세트에 적용
    - L1 규제로 feature의 수를 줄이고 L2규제로 계수값의 크기를 조정
    - 파라미터
        * alpha : L1규제의 alpha(a) + L2규제의 alpha(b). L1과 L2의 alpha를 합처논 것이다.
        * l1_ratio = 0 : 0에 가까워 질수록 L2규제와 동일
        * l1_ratio = 1 : 1에 가까워 질수록 L1규제와 동일
        * 0 < l1_ratio < 1 : L1과 L2규제를 적절히 적용
* 계수 : 계산해서 얻은 값

In [1]:
import pandas as pd
import numpy as np
import warnings
warnings.filterwarnings("ignore")

In [2]:
df = pd.read_csv("data/boston.csv")
df.head()

Unnamed: 0,CRIM,ZN,INDUS,CHAS,NOX,RM,AGE,DIS,RAD,TAX,PTRATIO,B,LSTAT,PRICE
0,0.00632,18.0,2.31,0.0,0.538,6.575,65.2,4.09,1.0,296.0,15.3,396.9,4.98,24.0
1,0.02731,0.0,7.07,0.0,0.469,6.421,78.9,4.9671,2.0,242.0,17.8,396.9,9.14,21.6
2,0.02729,0.0,7.07,0.0,0.469,7.185,61.1,4.9671,2.0,242.0,17.8,392.83,4.03,34.7
3,0.03237,0.0,2.18,0.0,0.458,6.998,45.8,6.0622,3.0,222.0,18.7,394.63,2.94,33.4
4,0.06905,0.0,2.18,0.0,0.458,7.147,54.2,6.0622,3.0,222.0,18.7,396.9,5.33,36.2


In [3]:
df.columns

Index(['CRIM', 'ZN', 'INDUS', 'CHAS', 'NOX', 'RM', 'AGE', 'DIS', 'RAD', 'TAX',
       'PTRATIO', 'B', 'LSTAT', 'PRICE'],
      dtype='object')

In [4]:
f =['CRIM', 'ZN', 'INDUS', 'CHAS', 'NOX', 'RM', 'AGE', 'DIS', 'RAD', 'TAX',
       'PTRATIO', 'B', 'LSTAT']
label='PRICE'
x, y = df[f], df[label]

In [5]:
from sklearn.model_selection import train_test_split

x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.3)
x_train.shape, x_test.shape

((354, 13), (152, 13))

In [7]:
from sklearn.metrics import mean_squared_error
from sklearn.linear_model import Lasso
# alpha : 규제 강도
# 수치가 높을수록 강한 강도이다
# 강도가 높다 : 영향력이 높은 것들도 0으로 만든다
lasso = Lasso(alpha=0.07)
lasso.fit(x_train, y_train)
train_pred = lasso.predict(x_train)
test_pred = lasso.predict(x_test)

print("score : ", lasso.score(x_train, y_train), "mse : ", mean_squared_error(train_pred, y_train))
print("score : ", lasso.score(x_test, y_test), "mse : ", mean_squared_error(test_pred, y_test))

score :  0.754387398246946 mse :  22.14984766184343
score :  0.6356374725562111 mse :  25.686031596635438


In [9]:
alphas = [0.07, 0.1, 0.5, 1.3, 2]
for a in alphas : 
    lasso = Lasso(alpha=0.07)
    lasso.fit(x_train, y_train)
    train_pred = lasso.predict(x_train)
    test_pred = lasso.predict(x_test)

    print("alpha : ", a)
    print("score : ", lasso.score(x_test, y_test), "mse : ", mean_squared_error(test_pred, y_test))
    print("score : ", lasso.score(x_train, y_train), "mse : ", mean_squared_error(train_pred, y_train))
    print("-"*50)

alpha :  0.07
score :  0.6356374725562111 mse :  25.686031596635438
score :  0.754387398246946 mse :  22.14984766184343
--------------------------------------------------
alpha :  0.1
score :  0.6356374725562111 mse :  25.686031596635438
score :  0.754387398246946 mse :  22.14984766184343
--------------------------------------------------
alpha :  0.5
score :  0.6356374725562111 mse :  25.686031596635438
score :  0.754387398246946 mse :  22.14984766184343
--------------------------------------------------
alpha :  1.3
score :  0.6356374725562111 mse :  25.686031596635438
score :  0.754387398246946 mse :  22.14984766184343
--------------------------------------------------
alpha :  2
score :  0.6356374725562111 mse :  25.686031596635438
score :  0.754387398246946 mse :  22.14984766184343
--------------------------------------------------


In [13]:
from sklearn.model_selection import GridSearchCV

params = {"alpha": [0.07, 0.1, 0.5, 1.3, 2]}
lasso = Lasso()
grid_cv = GridSearchCV(lasso, param_grid = params, cv=5)
grid_cv.fit(x_train, y_train)
print("최적의 하이퍼 파라미터 : ", grid_cv.best_params_ )
print("train : ", grid_cv.score(x_train, y_train))
print("test : ", grid_cv.score(x_test, y_test))

최적의 하이퍼 파라미터 :  {'alpha': 0.1}
train :  0.7533303195515548
test :  0.6352812434132087


In [15]:
lasso = Lasso(alpha=0.07)
lasso.fit(x_train, y_train)
print(x_train.columns )
lasso.coef_

Index(['CRIM', 'ZN', 'INDUS', 'CHAS', 'NOX', 'RM', 'AGE', 'DIS', 'RAD', 'TAX',
       'PTRATIO', 'B', 'LSTAT'],
      dtype='object')


array([-0.08454171,  0.03441552, -0.01150775,  1.207332  , -0.        ,
        4.64719588, -0.01801698, -0.99796159,  0.22591742, -0.01297881,
       -0.87534062,  0.00965054, -0.50586973])

In [16]:
alphas

[0.07, 0.1, 0.5, 1.3, 2]

In [19]:
coeff_df = pd.DataFrame(index = x_train.columns)
#coeff_df
for idx, alpha in enumerate(alphas):
    print(idx, " : ", alpha)
    lasso = Lasso(alpha=alpha)
    lasso.fit(x_train, y_train)
    col_name = "alpha : " + str(alpha)
    coeff_df[col_name] = lasso.coef_
coeff_df

0  :  0.07
1  :  0.1
2  :  0.5
3  :  1.3
4  :  2


Unnamed: 0,alpha : 0.07,alpha : 0.1,alpha : 0.5,alpha : 1.3,alpha : 2
CRIM,-0.084542,-0.08421,-0.072227,-0.04624,-0.023143
ZN,0.034416,0.034504,0.034865,0.034615,0.027755
INDUS,-0.011508,-0.005774,-0.0,-0.0,-0.0
CHAS,1.207332,0.737256,0.0,0.0,0.0
NOX,-0.0,-0.0,-0.0,-0.0,-0.0
RM,4.647196,4.563962,3.293072,0.742676,0.0
AGE,-0.018017,-0.016242,-0.0,0.024831,0.040407
DIS,-0.997962,-0.97823,-0.765847,-0.385394,-0.0
RAD,0.225917,0.230215,0.229945,0.215064,0.160249
TAX,-0.012979,-0.013276,-0.013584,-0.013341,-0.011466


In [20]:
from sklearn.linear_model import Ridge

alphas = [0.01, 0.1, 1, 10, 100]

In [21]:
for alpha in alphas : 
    ridge = Ridge(alpha = alpha)
    ridge.fit(x_train, y_train)
    
    train_pred = ridge.predict(x_train)
    test_pred = ridge.predict(x_test)
    
    train_score = ridge.score(x_train, y_train)
    test_score = ridge.score(x_test, y_test)
    
    train_mse = mean_squared_error(train_pred, y_train)
    test_mse = mean_squared_error(test_pred, y_test)
    
    print("alpha : ", a)
    print("score : ", train_score, "mse : ", train_mse)
    print("score : ", test_score, "mse : ", test_mse)
    print("-"*50)

alpha :  2
score :  0.7653220336843106 mse :  21.163739834123188
score :  0.6493473161348786 mse :  24.719545065166
--------------------------------------------------
alpha :  2
score :  0.7652574884608364 mse :  21.169560654622824
score :  0.6490914178698826 mse :  24.7375848206983
--------------------------------------------------
alpha :  2
score :  0.7631333430331375 mse :  21.361120441455085
score :  0.6461006699750316 mse :  24.948420016797222
--------------------------------------------------
alpha :  2
score :  0.7563754740851001 mse :  21.970558909389364
score :  0.6445264188422062 mse :  25.059398125941645
--------------------------------------------------
alpha :  2
score :  0.7351018306257955 mse :  23.88905966412561
score :  0.6555448692391059 mse :  24.282643537520386
--------------------------------------------------


In [22]:
coeff_df = pd.DataFrame(index=x_train.columns)
for alpha in alphas : 
    ridge = Ridge(alpha = alpha)
    ridge.fit(x_train, y_train)
    col_name = "alpha : " + str(alpha)
    coeff_df[col_name] = ridge.coef_
coeff_df

Unnamed: 0,alpha : 0.01,alpha : 0.1,alpha : 1,alpha : 10,alpha : 100
CRIM,-0.095028,-0.094334,-0.090681,-0.087442,-0.087136
ZN,0.032042,0.03222,0.033266,0.036034,0.043777
INDUS,0.044237,0.039225,0.012466,-0.016826,-0.027987
CHAS,2.317086,2.307035,2.215432,1.617031,0.47635
NOX,-16.576186,-15.367057,-8.890763,-1.720201,-0.186454
RM,4.641523,4.651692,4.685271,4.358257,2.363852
AGE,-0.007706,-0.008717,-0.01395,-0.016666,-0.000543
DIS,-1.312298,-1.292747,-1.188362,-1.077145,-0.996421
RAD,0.257292,0.254463,0.240154,0.237402,0.286789
TAX,-0.011007,-0.011109,-0.011691,-0.012889,-0.015191


In [23]:
from sklearn.linear_model import ElasticNet

# l1_ratios : 0에 가까울수록 L2(릿지)규제에 가깝다
# l1_ratios : 1에 가까울수록 L1(릿소)규제에 가깝다
ratios = [0.2, 0.5, 0.8]
alphas = [0.1, 0.7, 1.5]

In [24]:
el = ElasticNet(alpha=0.7, l1_ratio=0.2)
el.fit(x_train, y_train)

print("train : ", el.score(x_train, y_train))
print("test : ", el.score(x_test, y_test))

train :  0.7157433952616225
test :  0.6472823272856025


In [28]:
params ={
    "alpha" : alphas,
    "l1_ratio" : ratios
}
el = ElasticNet()
grid_cv = GridSearchCV(el, param_grid=params, cv=5)
grid_cv.fit(x_train, y_train)

print("최적의 하이퍼 파라미터 : ", grid_cv.best_params_ )
print("train : ", grid_cv.score(x_train, y_train))
print("test : ", grid_cv.score(x_test, y_test))

최적의 하이퍼 파라미터 :  {'alpha': 0.1, 'l1_ratio': 0.2}
train :  0.7505247777008414
test :  0.6502819930131112
