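## [Assignment focus]
Use the linear models in scikit-learn to train on a variety of datasets. Make sure you understand the **data types** of what is fed into the model for training, and what each of the model's parameters means.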
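As a quick way to check both points, the sketch below inspects the arrays returned by a dataset loader and lists an estimator's constructor parameters. `load_wine` is used only as an example dataset, and the variable names are illustrative.

```python
from sklearn import datasets, linear_model

wine = datasets.load_wine()

# Features: a 2-D float array of shape (n_samples, n_features).
print(type(wine.data), wine.data.dtype, wine.data.shape)

# Labels: a 1-D array with one entry per sample.
print(type(wine.target), wine.target.dtype, wine.target.shape)

# Every scikit-learn estimator exposes its constructor parameters via get_params().
reg = linear_model.LinearRegression()
params = reg.get_params()
print(params)
```

The same three checks (type, dtype, shape) apply to any other loader in `sklearn.datasets`.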

## Assignment
Try training your own linear regression model on other datasets from sklearn datasets (wine, boston, ...).

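### HINT: Check the type of the label to determine whether the dataset's target calls for classification or regression, then train with the appropriate model!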
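One possible way to follow the hint is scikit-learn's `type_of_target` helper, which reports whether a label array looks binary, multiclass, or continuous. In the sketch below, the `make_regression` data is a synthetic stand-in for a regression target and is not one of the datasets used in this notebook.

```python
from sklearn import datasets
from sklearn.utils.multiclass import type_of_target

wine = datasets.load_wine()
breast_cancer = datasets.load_breast_cancer()

# Synthetic regression data, used here only as a stand-in for a continuous target.
_, y_reg = datasets.make_regression(
    n_samples=50, n_features=3, noise=1.0, random_state=0)

kinds = {
    "wine": type_of_target(wine.target),
    "breast_cancer": type_of_target(breast_cancer.target),
    "synthetic regression": type_of_target(y_reg),
}

for name, kind in kinds.items():
    print(f"{name:>22}: {kind}")
```

`type_of_target` only looks at the values in the array, so treat its answer as a hint and confirm it against the dataset description (`wine.DESCR`, for example).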

In [1]:
from sklearn import datasets, linear_model
from sklearn.model_selection import train_test_split
from sklearn.metrics import (
    classification_report,
    mean_squared_error,
    r2_score,
)
from sklearn.exceptions import ConvergenceWarning
import matplotlib.pyplot as plt
%matplotlib inline

import warnings
warnings.filterwarnings(
    action='ignore',
    category=ConvergenceWarning
)
# Note: Matplotlib 3.6 renamed this style to 'seaborn-v0_8'; the old name was removed in 3.8.
plt.style.use('seaborn')

### Dataset: wine
#### Classification

In [2]:
wine = datasets.load_wine()

x_train, x_test, y_train, y_test = train_test_split(
    wine.data, wine.target, test_size=0.1, random_state=9487)

clf = linear_model.LogisticRegression(solver='lbfgs', multi_class='ovr')
clf.fit(x_train, y_train)
pred = clf.predict(x_test)

print(classification_report(y_test, pred))

              precision    recall  f1-score   support

           0       1.00      1.00      1.00         5
           1       1.00      1.00      1.00        10
           2       1.00      1.00      1.00         3

    accuracy                           1.00        18
   macro avg       1.00      1.00      1.00        18
weighted avg       1.00      1.00      1.00        18



### Dataset: boston
#### Regression

In [3]:
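# Note: load_boston was deprecated in scikit-learn 1.0 and removed in 1.2,
# so this cell only runs on older releases.
boston = datasets.load_boston()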
x_train, x_test, y_train, y_test = train_test_split(
    boston.data, boston.target, test_size=0.1, random_state=9487)

reg = linear_model.LinearRegression()
reg.fit(x_train, y_train)
pred = reg.predict(x_test)

print(
    f"{'R2-Score':>10}: {r2_score(y_test, pred)}\n"
    f"{'MSE':>10}: {mean_squared_error(y_test, pred)}"
)


  R2-Score: 0.6914226667689838
       MSE: 32.026191470103655
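To make the two reported metrics concrete, here is a minimal sketch that computes MSE and R2 by hand with NumPy and compares them with the scikit-learn functions used above. The four-element arrays are made-up toy values, not results from the boston model.

```python
import numpy as np
from sklearn.metrics import mean_squared_error, r2_score

# Toy values chosen for illustration only.
y_true = np.array([3.0, -0.5, 2.0, 7.0])
y_pred = np.array([2.5, 0.0, 2.0, 8.0])

# MSE: mean of the squared residuals.
mse_manual = np.mean((y_true - y_pred) ** 2)

# R2: 1 - (residual sum of squares) / (total sum of squares around the mean).
ss_res = np.sum((y_true - y_pred) ** 2)
ss_tot = np.sum((y_true - y_true.mean()) ** 2)
r2_manual = 1.0 - ss_res / ss_tot

print(f"{'MSE':>10}: {mse_manual} (sklearn: {mean_squared_error(y_true, y_pred)})")
print(f"{'R2-Score':>10}: {r2_manual} (sklearn: {r2_score(y_true, y_pred)})")
```

MSE is expressed in squared units of the target, while R2 is unitless and compares the model against always predicting the mean of `y_true`.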


### Dataset: breast_cancer
#### Classification

LogisticRegression

In [4]:
breast_cancer = datasets.load_breast_cancer()

x_train, x_test, y_train, y_test = train_test_split(
    breast_cancer.data, breast_cancer.target, test_size=0.1, random_state=9487)

clf = linear_model.LogisticRegression(solver='lbfgs')
clf.fit(x_train, y_train)
pred = clf.predict(x_test)

print(classification_report(y_test, pred))

              precision    recall  f1-score   support

           0       0.87      0.91      0.89        22
           1       0.94      0.91      0.93        35

    accuracy                           0.91        57
   macro avg       0.91      0.91      0.91        57
weighted avg       0.91      0.91      0.91        57



LogisticRegressionCV

In [5]:
clf = linear_model.LogisticRegressionCV(solver='lbfgs', cv=3)
clf.fit(x_train, y_train)
pred = clf.predict(x_test)

print(f"C={clf.C_}\n-")
print(classification_report(y_test, pred))

C=[1291.54966501]
-
              precision    recall  f1-score   support

           0       0.88      0.95      0.91        22
           1       0.97      0.91      0.94        35

    accuracy                           0.93        57
   macro avg       0.92      0.93      0.93        57
weighted avg       0.93      0.93      0.93        57
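The selected `C` is the inverse regularization strength, and `LogisticRegressionCV` picks it from a grid. With the default `Cs=10`, that grid is ten values spaced logarithmically between 1e-4 and 1e4, which is why the value printed above, 1291.54966501, equals 10^(28/9), the ninth grid point. The sketch below inspects the grid on a fitted model. The `StandardScaler` step and `max_iter=1000` are additions made in this sketch to help `lbfgs` converge and are not part of the cell above, so the selected `C` may differ.

```python
import numpy as np
from sklearn import datasets
from sklearn.linear_model import LogisticRegressionCV
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = datasets.load_breast_cancer(return_X_y=True)

# Scaling and a higher max_iter are assumptions of this sketch, not the original setup.
model = make_pipeline(
    StandardScaler(),
    LogisticRegressionCV(Cs=10, cv=3, max_iter=1000),
)
model.fit(X, y)
clf = model[-1]  # the fitted LogisticRegressionCV step

# The default grid: ten log-spaced candidates from 1e-4 to 1e4.
default_grid = np.logspace(-4, 4, 10)
best_C = np.atleast_1d(clf.C_)[0]

print("candidate Cs:", clf.Cs_)
print("selected C:  ", best_C)
```

Passing an explicit array, for example `Cs=np.logspace(-2, 2, 20)`, searches a narrower or finer grid.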

