##### Lasso
- sklearn.linear_model.Lasso(alpha=1.0, *, fit_intercept=True, precompute=False, copy_X=True, max_iter=1000, tol=0.0001, warm_start=False, positive=False, random_state=None, selection='cyclic')

#### Elastic Net

- sklearn.linear_model.ElasticNet(alpha=1.0, *, l1_ratio=0.5, fit_intercept=True, precompute=False, max_iter=1000, copy_X=True, tol=0.0001, warm_start=False, positive=False, random_state=None, selection='cyclic')

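- l1_ratio sets how much of the penalty is the Lasso (L1) term: 1.0 is pure Lasso, 0.0 is pure Ridge (L2), and lowering it shifts weight from the L1 penalty to the L2 penalty.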

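- Lasso can drive coefficients to exactly 0, which removes those variables from the model; Ridge only shrinks coefficients toward 0 without eliminating them.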
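
The two points above can be checked with a small sketch. It is not part of the original notebook: the synthetic data from make_regression, the value demo_alpha = 5.0, and the helper count_zero_coefs are all assumptions made only for illustration. Note also that Ridge scales its alpha differently from Lasso and ElasticNet, so the same number is not the same regularization strength across models.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import ElasticNet, Lasso, Ridge

# Synthetic data (an assumption for illustration): 10 features, only 3 informative.
X_demo, y_demo = make_regression(n_samples=200, n_features=10, n_informative=3,
                                 noise=5.0, random_state=0)

demo_alpha = 5.0  # hypothetical regularization strength, chosen only for this demo

def count_zero_coefs(model):
    """Fit the model on the demo data and count coefficients that are exactly zero."""
    model.fit(X_demo, y_demo)
    return int(np.sum(model.coef_ == 0))

# Count exact zeros for Ridge, ElasticNet at several l1_ratio values, and Lasso.
zero_counts = {'Ridge': count_zero_coefs(Ridge(alpha=demo_alpha))}
for ratio in [0.1, 0.5, 0.9]:
    zero_counts[f'ElasticNet(l1_ratio={ratio})'] = count_zero_coefs(
        ElasticNet(alpha=demo_alpha, l1_ratio=ratio))
zero_counts['Lasso'] = count_zero_coefs(Lasso(alpha=demo_alpha))

for name, n_zero in zero_counts.items():
    print(f'{name}: {n_zero} zero coefficients out of {X_demo.shape[1]}')

# Lasso is the l1_ratio=1.0 special case of ElasticNet, so the two fits should agree.
lasso_coef = Lasso(alpha=demo_alpha).fit(X_demo, y_demo).coef_
enet_l1_coef = ElasticNet(alpha=demo_alpha, l1_ratio=1.0).fit(X_demo, y_demo).coef_
print(np.allclose(lasso_coef, enet_l1_coef))
```

On data like this, Ridge is expected to keep every coefficient nonzero while Lasso zeroes out some of the uninformative features; the exact counts depend on the data and on alpha.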

In [1]:
import pandas as pd

# The raw file has no header row and is whitespace-delimited, so column names are supplied explicitly.
column_names = ['CRIM', 'ZN', 'INDUS', 'CHAS', 'NOX', 'RM', 'AGE', 'DIS', 'RAD', 'TAX', 'PTRATIO', 'B', 'LSTAT', 'MEDV']
boston_df = pd.read_csv('./datasets/boston_housing.csv', header=None, delimiter=r"\s+", names=column_names)
boston_df.head(3)

# Split into the target (MEDV, median home value) and the feature matrix.
y_target = boston_df['MEDV']
x_data = boston_df.drop(['MEDV'], axis=1)

In [10]:
from sklearn.linear_model import Ridge, Lasso, ElasticNet
from sklearn.model_selection import cross_val_score
import numpy as np

alpha = [0, 0.1, 1, 10, 100]  # not used by get_linear_reg_eval; each call passes its own list

def get_linear_reg_eval(model_name, params=None, x_data_n=None, y_target_n=None, verbose=True):
    """Print the 5-fold mean RMSE for each alpha and return the fitted coefficients per alpha."""
    coeff_df = pd.DataFrame()
    if verbose: print('#####', model_name, '#####')
    for param in params:
        if model_name == 'Ridge': model = Ridge(alpha=param)
        elif model_name == 'Lasso': model = Lasso(alpha=param)
        elif model_name == 'ElasticNet': model = ElasticNet(alpha=param, l1_ratio=0.7)
        else: raise ValueError(f'Unknown model_name: {model_name}')

        # scoring returns negative MSE, so flip the sign before taking the square root
        neg_mse_scores = cross_val_score(model, x_data_n, y_target_n,
                                         scoring='neg_mean_squared_error', cv=5)
        avg_rmse = np.mean(np.sqrt(-1 * neg_mse_scores))
        print(f'alpha {param}, mean RMSE over folds: {avg_rmse}')

        # cross_val_score fits clones, so refit on the full data passed in to read coef_
        model.fit(x_data_n, y_target_n)
        coeff = pd.Series(data=model.coef_, index=x_data_n.columns)
        colname = 'alpha:' + str(param)
        coeff_df[colname] = coeff
    return coeff_df

In [11]:
lasso_alphas=[0.07,0.1,0.5,1,3]

coeff_lasso_df = get_linear_reg_eval('Lasso', params=lasso_alphas, x_data_n = x_data, y_target_n= y_target)

##### Lasso #####
alpha 0.07, mean RMSE over folds: 5.612284267526676
alpha 0.1, mean RMSE over folds: 5.615116035266935
alpha 0.5, mean RMSE over folds: 5.6691234095948975
alpha 1, mean RMSE over folds: 5.776020813823375
alpha 3, mean RMSE over folds: 6.188763210800905


In [12]:
sort_column = 'alpha:'+str(lasso_alphas[0])
coeff_lasso_df.sort_values(by=sort_column, ascending=False)

feature,alpha:0.07,alpha:0.1,alpha:0.5,alpha:1,alpha:3
RM,3.789725,3.703202,2.498212,0.949811,0.0
CHAS,1.434343,0.95519,0.0,0.0,0.0
RAD,0.270936,0.274707,0.277451,0.264206,0.061864
ZN,0.049059,0.049211,0.049544,0.049165,0.037231
B,0.010248,0.010249,0.009469,0.008247,0.00651
NOX,-0.0,-0.0,-0.0,-0.0,0.0
AGE,-0.011706,-0.010037,0.003604,0.02091,0.042495
TAX,-0.01429,-0.01457,-0.015442,-0.015212,-0.008602
INDUS,-0.04212,-0.036619,-0.005253,-0.0,-0.0
CRIM,-0.098193,-0.097894,-0.083289,-0.063437,-0.0


In [13]:
elastic_alphas=[0.07,0.1,0.5,1,3]
coeff_elastic_df = get_linear_reg_eval('ElasticNet', params=elastic_alphas,
                                       x_data_n=x_data, y_target_n=y_target)

##### ElasticNet #####
alpha 0.07, mean RMSE over folds: 5.541654347348139
alpha 0.1, mean RMSE over folds: 5.52592849629491
alpha 0.5, mean RMSE over folds: 5.466748649445586
alpha 1, mean RMSE over folds: 5.596874445109748
alpha 3, mean RMSE over folds: 6.068121638621163


In [14]:
sort_column = 'alpha:'+str(elastic_alphas[0])
coeff_elastic_df.sort_values(by=sort_column, ascending=False)

feature,alpha:0.07,alpha:0.1,alpha:0.5,alpha:1,alpha:3
RM,3.574162,3.414154,1.918419,0.938789,0.0
CHAS,1.330724,0.979706,0.0,0.0,0.0
RAD,0.27888,0.283443,0.300761,0.289299,0.146846
ZN,0.050107,0.050617,0.052878,0.052136,0.038268
B,0.010122,0.010067,0.009114,0.00832,0.00702
AGE,-0.010116,-0.008276,0.00776,0.020348,0.043446
TAX,-0.014522,-0.014814,-0.016046,-0.016218,-0.011417
INDUS,-0.044855,-0.042719,-0.023252,-0.0,-0.0
CRIM,-0.099468,-0.099213,-0.08907,-0.073577,-0.019058
NOX,-0.175072,-0.0,-0.0,-0.0,-0.0
