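## [Assignment Focus]
Use the Lasso and Ridge models in sklearn to train on various datasets. Make sure you understand the **data type** of what you feed into the model for training, and what each model parameter means.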

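There are many kinds of machine learning models, but the training data they expect mostly follows a fixed format. Make sure you understand that format, so that you can start training quickly whenever you pick up a new model!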
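As a minimal sketch of that fixed format: sklearn estimators expect the features as a 2-D array of shape `(n_samples, n_features)` and the target as a 1-D array of shape `(n_samples,)`. The `load_diabetes` dataset and the choice of column 2 are assumptions made for illustration only; they are not part of the original exercise.

```python
import numpy as np
from sklearn import datasets, linear_model

# Assumption: load_diabetes is used only because it ships with scikit-learn.
diabetes = datasets.load_diabetes()
X, y = diabetes.data, diabetes.target

# Features: 2-D array (n_samples, n_features). Target: 1-D array (n_samples,).
print(type(X), X.dtype, X.shape)
print(type(y), y.dtype, y.shape)

# Selecting one column gives a 1-D array, which fit() rejects as features.
one_feature = X[:, 2]
# np.newaxis restores the column axis so the shape becomes (n_samples, 1).
one_feature_2d = one_feature[:, np.newaxis]
print(one_feature.shape, one_feature_2d.shape)

model = linear_model.LinearRegression()
model.fit(one_feature_2d, y)
print(model.coef_.shape)
```

A pandas DataFrame with numeric columns is accepted in place of the 2-D array, which is what the Boston cells in this notebook rely on.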

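## Practice Time
Try other datasets from sklearn datasets (boston, ...) to train your own linear regression model, then add suitable regularization and observe how training behaves.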
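One way to approach the exercise, sketched under assumptions: the diabetes dataset stands in for "another dataset", and the `alpha` values are illustrative rather than tuned. The split settings mirror the ones used for Boston in this notebook.

```python
from sklearn import datasets, linear_model
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

# Assumption: diabetes is the stand-in dataset for this sketch.
X, y = datasets.load_diabetes(return_X_y=True)
x_train, x_test, y_train, y_test = train_test_split(
    X, y, test_size=0.1, random_state=4
)

# Plain least squares plus the two regularized variants.
# The alpha values are illustrative, not tuned.
models = {
    "LinearRegression": linear_model.LinearRegression(),
    "Lasso": linear_model.Lasso(alpha=0.1),
    "Ridge": linear_model.Ridge(alpha=0.1),
}

results = {}
for name, model in models.items():
    model.fit(x_train, y_train)
    y_pred = model.predict(x_test)
    results[name] = mean_squared_error(y_test, y_pred)
    print("%s: test MSE = %.2f" % (name, results[name]))
```

Comparing the three test errors side by side shows whether the penalty helps or hurts on a given split.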

In [4]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from sklearn import datasets, linear_model
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error, r2_score

In [6]:
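# Note: load_boston was deprecated in scikit-learn 1.0 and removed in 1.2,
# so this cell needs an older scikit-learn release to run.
boston = datasets.load_boston()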

In [7]:
boston.keys()

dict_keys(['data', 'target', 'feature_names', 'DESCR'])

In [33]:
# Features go into a DataFrame with named columns; the target stays a NumPy array
col_names = boston['feature_names']
X = pd.DataFrame(boston.data, columns = col_names)
Y = boston.target
print(X.head())

      CRIM    ZN  INDUS  CHAS    NOX     RM   AGE     DIS  RAD    TAX  \
0  0.00632  18.0   2.31   0.0  0.538  6.575  65.2  4.0900  1.0  296.0   
1  0.02731   0.0   7.07   0.0  0.469  6.421  78.9  4.9671  2.0  242.0   
2  0.02729   0.0   7.07   0.0  0.469  7.185  61.1  4.9671  2.0  242.0   
3  0.03237   0.0   2.18   0.0  0.458  6.998  45.8  6.0622  3.0  222.0   
4  0.06905   0.0   2.18   0.0  0.458  7.147  54.2  6.0622  3.0  222.0   

   PTRATIO       B  LSTAT  
0     15.3  396.90   4.98  
1     17.8  396.90   9.14  
2     17.8  392.83   4.03  
3     18.7  394.63   2.94  
4     18.7  396.90   5.33  


In [36]:
# Linear regression (no regularization)
x_train, x_test, y_train, y_test = train_test_split(X, Y, test_size = 0.1, random_state = 4)
regr = linear_model.LinearRegression()
regr.fit(x_train, y_train)
y_pred = regr.predict(x_test)
print(regr.coef_)

print("Mean squared error: %.2f"
      % mean_squared_error(y_test, y_pred))

[-1.24793110e-01  4.83961673e-02  1.88111508e-02  3.08800922e+00
 -1.73655165e+01  3.60982405e+00  2.27233321e-03 -1.49381500e+00
  3.19455416e-01 -1.27236845e-02 -9.28369630e-01  9.60925451e-03
 -5.34508193e-01]
Mean squared error: 17.03


In [42]:
# LASSO (L1 penalty; alpha sets the penalty strength)
x_train, x_test, y_train, y_test = train_test_split(X, Y, test_size = 0.1, random_state = 4)
lasso = linear_model.Lasso(alpha = 1)
lasso.fit(x_train, y_train)

y_pred = lasso.predict(x_test)

print(lasso.coef_)
print("Mean squared error: %.2f"
      % mean_squared_error(y_test, y_pred))

[-0.07256167  0.049677   -0.          0.         -0.          0.80504721
  0.02330318 -0.68471274  0.26857502 -0.01526236 -0.71722423  0.00834102
 -0.77160917]
Mean squared error: 23.25


In [40]:
# Ridge (L2 penalty; alpha sets the penalty strength)
x_train, x_test, y_train, y_test = train_test_split(X, Y, test_size = 0.1, random_state = 4)
ridge = linear_model.Ridge(alpha = 0.5)
ridge.fit(x_train, y_train)

y_pred = ridge.predict(x_test)

print(ridge.coef_)
print("Mean squared error: %.2f"
      % mean_squared_error(y_test, y_pred))

[-1.22530460e-01  4.90837392e-02 -2.89976602e-04  2.97744734e+00
 -1.27417020e+01  3.64539011e+00 -1.97242234e-03 -1.42793561e+00
  3.07976650e-01 -1.30288126e-02 -8.80266996e-01  9.83570193e-03
 -5.40607841e-01]
Mean squared error: 17.19
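
The runs above use one `alpha` each: Lasso with `alpha = 1` sets three coefficients (INDUS, CHAS, NOX) to zero and raises the test MSE from 17.03 to 23.25, while Ridge with `alpha = 0.5` keeps every coefficient and lands at 17.19. To see how `alpha` drives this behaviour, the sketch below sweeps several values. It assumes the diabetes dataset, since `load_boston` is unavailable in recent scikit-learn releases, and the `alphas` list is arbitrary.

```python
import numpy as np
from sklearn import datasets, linear_model

# Assumption: diabetes replaces Boston so the sketch runs on current releases.
X, y = datasets.load_diabetes(return_X_y=True)

alphas = [0.001, 0.01, 0.1, 1.0, 10.0]  # arbitrary grid for illustration
lasso_nonzero = []  # number of coefficients Lasso keeps at each alpha
ridge_norms = []    # L2 norm of the Ridge coefficient vector at each alpha

for alpha in alphas:
    lasso = linear_model.Lasso(alpha=alpha, max_iter=100000).fit(X, y)
    ridge = linear_model.Ridge(alpha=alpha).fit(X, y)
    lasso_nonzero.append(int(np.count_nonzero(lasso.coef_)))
    ridge_norms.append(float(np.linalg.norm(ridge.coef_)))
    print("alpha=%g  lasso nonzero=%d  ridge ||coef||=%.2f"
          % (alpha, lasso_nonzero[-1], ridge_norms[-1]))
```

Lasso tends to drop features entirely as `alpha` grows, whereas Ridge shrinks all coefficients toward zero without removing any.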


In [8]:
print(boston.DESCR)

Boston House Prices dataset

Notes
------
Data Set Characteristics:  

    :Number of Instances: 506 

    :Number of Attributes: 13 numeric/categorical predictive
    
    :Median Value (attribute 14) is usually the target

    :Attribute Information (in order):
        - CRIM     per capita crime rate by town
        - ZN       proportion of residential land zoned for lots over 25,000 sq.ft.
        - INDUS    proportion of non-retail business acres per town
        - CHAS     Charles River dummy variable (= 1 if tract bounds river; 0 otherwise)
        - NOX      nitric oxides concentration (parts per 10 million)
        - RM       average number of rooms per dwelling
        - AGE      proportion of owner-occupied units built prior to 1940
        - DIS      weighted distances to five Boston employment centres
        - RAD      index of accessibility to radial highways
        - TAX      full-value property-tax rate per $10,000
        - PTRATIO  pupil-teacher ratio by town
      