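## [Assignment Focus]
Use the Lasso and Ridge models in scikit-learn to train on a variety of datasets. Make sure you understand the **data type** of what is fed into the model for training, and also the meaning of each model parameter.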

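There are many kinds of machine learning models, but the training data they expect mostly follows a fixed format. Make sure you understand that format so that, when you apply a new model, you can start training as quickly as possible!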
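As a rough illustration of that fixed format, the sketch below inspects one of the datasets bundled with scikit-learn. The choice of `load_diabetes` is an assumption made only for this example; any tabular regression dataset would show the same layout of a 2-D feature array and a 1-D target array.

```python
from sklearn import datasets

# load_diabetes is used here only as a stand-in dataset (an assumption for this sketch)
data = datasets.load_diabetes()
X, y = data["data"], data["target"]

# Features: a 2-D NumPy array, one row per sample and one column per feature
print(type(X).__name__, X.dtype, X.shape)

# Targets: a 1-D NumPy array with one value per sample
print(type(y).__name__, y.dtype, y.shape)
```

Most scikit-learn estimators accept this pair directly as `fit(X, y)`, which is why checking `shape` and `dtype` first tends to save time when switching models.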

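## Practice Time
Try another dataset from sklearn datasets (boston, ...) to train your own linear regression model, then add suitable regularization and observe how training behaves.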

In [1]:
from sklearn import datasets,linear_model
import numpy as np
import matplotlib.pyplot as plt
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error,r2_score

In [2]:
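# Note: load_boston was deprecated in scikit-learn 1.0 and removed in 1.2,
# so this cell only runs on older releases.
dta = datasets.load_boston()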

In [4]:
X = dta['data']
Y = dta['target']

In [5]:
train_X,test_X,train_Y,test_Y = train_test_split(X,Y,test_size=0.1)

In [6]:
reg = linear_model.LinearRegression().fit(train_X,train_Y)

In [7]:
# Signature is (y_true, y_pred); MSE is symmetric, so the value is unchanged
mean_squared_error(test_Y, reg.predict(test_X))

24.445031396382863
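An MSE value on its own is hard to interpret because it is in squared target units. The sketch below, which assumes the bundled `load_diabetes` dataset and a fixed `random_state=0` (both choices made only for this example), also reports RMSE and R², and pins the split so that repeated runs give the same numbers.

```python
import numpy as np
from sklearn import datasets, linear_model
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import train_test_split

# Stand-in dataset (assumption for this sketch)
X, y = datasets.load_diabetes(return_X_y=True)

# Fixing random_state makes the train/test split reproducible
train_X, test_X, train_y, test_y = train_test_split(
    X, y, test_size=0.1, random_state=0
)

reg = linear_model.LinearRegression().fit(train_X, train_y)
pred = reg.predict(test_X)

mse = mean_squared_error(test_y, pred)  # squared target units
rmse = np.sqrt(mse)                     # same units as the target
r2 = r2_score(test_y, pred)             # not symmetric: y_true must come first

print(f"MSE={mse:.2f}  RMSE={rmse:.2f}  R2={r2:.3f}")
```

Without a fixed `random_state`, each run draws a different 10% test set, so small differences in MSE between models may reflect the split rather than the model.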

In [10]:
print(dta['DESCR'])

Boston House Prices dataset

Notes
------
Data Set Characteristics:  

    :Number of Instances: 506 

    :Number of Attributes: 13 numeric/categorical predictive
    
    :Median Value (attribute 14) is usually the target

    :Attribute Information (in order):
        - CRIM     per capita crime rate by town
        - ZN       proportion of residential land zoned for lots over 25,000 sq.ft.
        - INDUS    proportion of non-retail business acres per town
        - CHAS     Charles River dummy variable (= 1 if tract bounds river; 0 otherwise)
        - NOX      nitric oxides concentration (parts per 10 million)
        - RM       average number of rooms per dwelling
        - AGE      proportion of owner-occupied units built prior to 1940
        - DIS      weighted distances to five Boston employment centres
        - RAD      index of accessibility to radial highways
        - TAX      full-value property-tax rate per $10,000
        - PTRATIO  pupil-teacher ratio by town
      

In [8]:
print(reg.coef_)

[ -1.10875668e-01   4.86511566e-02   1.53473902e-02   3.13156507e+00
  -1.71649825e+01   3.82776288e+00  -4.72393125e-03  -1.47800136e+00
   3.04358452e-01  -1.23748730e-02  -9.59300597e-01   8.28569622e-03
  -5.03400662e-01]


In [12]:
len(X[1])

13

In [23]:
L_reg = linear_model.Lasso(alpha=0.3).fit(train_X,train_Y)

In [24]:
# Signature is (y_true, y_pred); MSE is symmetric, so the value is unchanged
mean_squared_error(test_Y, L_reg.predict(test_X))

24.75925544651481

In [25]:
print(L_reg.coef_)

[-0.09299811  0.05209235 -0.01740225  0.         -0.          3.12029741
 -0.00638848 -1.06672632  0.28785032 -0.015716   -0.80677436  0.00967053
 -0.59474551]
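The exact zeros in the Lasso coefficients above come from the L1 penalty, and a larger `alpha` tends to zero out more of them. A minimal sketch of that effect follows, assuming the bundled `load_diabetes` dataset and an arbitrary grid of `alpha` values chosen only for illustration.

```python
import numpy as np
from sklearn import datasets, linear_model

# Stand-in dataset (assumption for this sketch)
X, y = datasets.load_diabetes(return_X_y=True)

alphas = [0.01, 0.1, 1.0, 10.0]  # illustrative grid, not a recommendation
zero_counts = []
for alpha in alphas:
    model = linear_model.Lasso(alpha=alpha).fit(X, y)
    # Count coefficients that the L1 penalty has driven exactly to zero
    n_zero = int(np.sum(model.coef_ == 0))
    zero_counts.append(n_zero)
    print(f"alpha={alpha:<5} zero coefficients: {n_zero} of {X.shape[1]}")
```

Once `alpha` is large enough, every coefficient is zero and the model predicts only the intercept, so in practice `alpha` is usually chosen by validation rather than by hand.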


In [29]:
R_reg  = linear_model.Ridge(alpha=0.1).fit(train_X,train_Y)

In [30]:
R_reg.coef_

array([ -1.10131468e-01,   4.87887021e-02,   1.03074980e-02,
         3.12460086e+00,  -1.59570752e+01,   3.83287043e+00,
        -5.83869349e-03,  -1.46081920e+00,   3.01732240e-01,
        -1.24815201e-02,  -9.47762196e-01,   8.38474122e-03,
        -5.04764806e-01])

In [31]:
# Signature is (y_true, y_pred); MSE is symmetric, so the value is unchanged
mean_squared_error(test_Y, R_reg.predict(test_X))

24.495936309675127
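Unlike Lasso, Ridge shrinks coefficients toward zero without setting them exactly to zero, which is why the Ridge coefficients above stay close to the plain linear regression ones at `alpha=0.1`. The sketch below shows the shrinkage more clearly by tracking the L2 norm of the coefficient vector; the `load_diabetes` dataset and the `alpha` grid are assumptions made only for this example.

```python
import numpy as np
from sklearn import datasets, linear_model

# Stand-in dataset (assumption for this sketch)
X, y = datasets.load_diabetes(return_X_y=True)

# Baseline: unregularized least squares
ols_norm = np.linalg.norm(linear_model.LinearRegression().fit(X, y).coef_)
print(f"LinearRegression  ||coef||={ols_norm:.1f}")

alphas = [0.01, 0.1, 1.0, 10.0]  # illustrative grid
ridge_norms = []
for alpha in alphas:
    model = linear_model.Ridge(alpha=alpha).fit(X, y)
    norm = np.linalg.norm(model.coef_)  # L2 norm of the coefficient vector
    ridge_norms.append(norm)
    print(f"Ridge alpha={alpha:<5} ||coef||={norm:.1f}")
```

Both penalties act on raw coefficient sizes, so features on very different scales are penalized unevenly; standardizing features before fitting is a common precaution.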