# scikit-learn Training ♨

## DNN Edition

## [Table of Contents](TableOfContents.ipynb)

## References
開発基盤部会 Wiki
- Data Mining (DM) - Python - DL  
https://dotnetdevelopmentinfrastructure.osscons.jp/index.php?%E3%83%87%E3%83%BC%E3%82%BF%E3%83%9E%E3%82%A4%E3%83%8B%E3%83%B3%E3%82%B0%EF%BC%88DM%EF%BC%89-%20Python%20-%20DL

## Environment Setup

In [None]:
import io
import requests

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

from sklearn import metrics
from sklearn.metrics import confusion_matrix as cm
from sklearn.model_selection import train_test_split

import tensorflow as tf
from tensorflow.keras.layers import BatchNormalization
print(tf.__version__)

import keras
print(keras.__version__)
# Model definition
from keras.models import Model, Sequential, model_from_json
from keras.layers import Dense, Input, Activation, Flatten, Dropout, LSTM
from keras.layers import Conv2D, MaxPool2D
from keras.callbacks import EarlyStopping, ModelCheckpoint
from keras import optimizers
from keras.optimizers import SGD, Adam
# Miscellaneous
from keras.applications.vgg16 import VGG16
from keras.utils import np_utils

import warnings
warnings.filterwarnings('ignore')
%matplotlib inline

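## DNN Algorithms and Models

### Multiple Regression with a DNN

#### Data

##### Generation
Use the same data as in [Multiple Regression](ScikitLearnTraining1.ipynb). The cells below assume it is already loaded as a DataFrame `df` whose target column is `medv`.

##### Processing
...

##### Understanding
...

##### Preparation

###### Splitting Explanatory and Target Variables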

In [None]:
x_org = np.array(df.drop(['medv'], axis=1))
y_org = np.array(df.loc[:, ['medv']])

###### Normalization
With `axis=0`, the mean and standard deviation are computed per column (that is, per variable).

In [None]:
mean = x_org.mean(axis=0)
std = x_org.std(axis=0)
x = (x_org - mean) / std
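A hedged aside: the cell above computes `mean` and `std` over all rows, before the data is split into training and test sets. A common alternative is to fit the statistics on the training rows only and reuse them for the test rows. The sketch below illustrates that idea with NumPy on a small made-up array; `standardize_fit` and `standardize_apply` are hypothetical helper names and are not part of this notebook.

```python
import numpy as np

def standardize_fit(x_train):
    """Return per-column mean and std (axis=0) computed from training rows only."""
    mean = x_train.mean(axis=0)
    std = x_train.std(axis=0)
    # Guard against constant columns (std == 0) to avoid division by zero.
    std = np.where(std == 0, 1.0, std)
    return mean, std

def standardize_apply(x, mean, std):
    """Apply previously fitted statistics to any array with the same columns."""
    return (x - mean) / std

# Made-up toy data: 4 training rows and 2 test rows, 2 features each.
toy_train = np.array([[1.0, 10.0], [2.0, 20.0], [3.0, 30.0], [4.0, 40.0]])
toy_test = np.array([[2.5, 25.0], [5.0, 50.0]])

toy_mean, toy_std = standardize_fit(toy_train)
toy_train_z = standardize_apply(toy_train, toy_mean, toy_std)
toy_test_z = standardize_apply(toy_test, toy_mean, toy_std)

print(toy_train_z.mean(axis=0))  # each column is centered at (approximately) 0
print(toy_train_z.std(axis=0))   # each column has standard deviation 1
```

The test rows are transformed with the training statistics, so they are generally not centered at exactly 0.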

###### Train/Test Split (Hold-out Method)

In [None]:
x_train, x_test, y_train, y_test = train_test_split(x, y_org, test_size = 0.3, random_state = 0)
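To make the hold-out method concrete, here is a minimal NumPy sketch of the idea: shuffle the row indices once, then cut them into a test part and a training part. It is an illustration only, not how `train_test_split` is implemented internally. `holdout_split` is a hypothetical name, and its shuffle will not select the same rows as `random_state = 0` above.

```python
import numpy as np

def holdout_split(x, y, test_size=0.3, seed=0):
    """Shuffle row indices once and split them into train and test parts."""
    n = len(x)
    n_test = int(np.ceil(test_size * n))   # number of held-out rows
    rng = np.random.default_rng(seed)
    idx = rng.permutation(n)               # random order of row indices
    test_idx, train_idx = idx[:n_test], idx[n_test:]
    return x[train_idx], x[test_idx], y[train_idx], y[test_idx]

# Made-up toy data: row i of demo_x is [2i, 2i+1], and demo_y[i] is i.
demo_x = np.arange(16, dtype=float).reshape(8, 2)
demo_y = np.arange(8, dtype=float).reshape(8, 1)

dx_train, dx_test, dy_train, dy_test = holdout_split(demo_x, demo_y, test_size=0.25)
print(dx_train.shape, dx_test.shape)  # (6, 2) (2, 2)
```

Because the same index arrays select rows from both `x` and `y`, each feature row stays paired with its target.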

#### Modeling

##### Defining the DNN

In [None]:
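model = keras.Sequential([
    # Two hidden layers of 64 ReLU units; input width = number of features
    keras.layers.Dense(64, activation='relu',
                       input_shape=(x_train.shape[1],)),
    keras.layers.Dense(64, activation='relu'),
    # Single linear output unit for the regression target
    keras.layers.Dense(1)
])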
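The network above stacks two 64-unit ReLU layers and a single linear output unit. As a rough illustration of what such a model computes, the NumPy sketch below runs one forward pass with random, untrained weights. The input width of 13 and all names (`dense`, `forward`, `params`) are assumptions made for this example; Keras's own weight initialization and training are not reproduced.

```python
import numpy as np

def dense(x, w, b, relu=False):
    """Affine transform x @ w + b, optionally followed by ReLU."""
    z = x @ w + b
    return np.maximum(z, 0.0) if relu else z

def forward(x, params):
    """Forward pass: Dense(64, relu) -> Dense(64, relu) -> Dense(1, linear)."""
    (w1, b1), (w2, b2), (w3, b3) = params
    h1 = dense(x, w1, b1, relu=True)
    h2 = dense(h1, w2, b2, relu=True)
    return dense(h2, w3, b3)

rng = np.random.default_rng(0)
n_features = 13  # assumed input width, for illustration only
sizes = [(n_features, 64), (64, 64), (64, 1)]
# Random small weights and zero biases (untrained).
params = [(rng.normal(scale=0.1, size=s), np.zeros(s[1])) for s in sizes]

batch = rng.normal(size=(5, n_features))  # 5 made-up input rows
out = forward(batch, params)
print(out.shape)  # (5, 1)
```

The final layer has no activation, so the output can take any real value, which is what a regression target needs.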

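##### Compile
- For regression, the loss function is the mean squared error (`mse`, shorthand for `mean_squared_error`).
- Specify [optimizer=Adam](TensorFlowAndKeras0.ipynb).
- The metric is the mean absolute error (`mae`, shorthand for `mean_absolute_error`).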

In [None]:
model.compile(loss='mse', optimizer=Adam(), metrics=['mae'])
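For reference, the loss and the metric chosen here can be written out directly. The sketch below computes both with NumPy on made-up numbers; `mse` and `mae` are plain helper functions defined for this example, not the Keras objects.

```python
import numpy as np

def mse(y_true, y_hat):
    """Mean squared error: average of squared differences."""
    return float(np.mean((y_true - y_hat) ** 2))

def mae(y_true, y_hat):
    """Mean absolute error: average of absolute differences."""
    return float(np.mean(np.abs(y_true - y_hat)))

# Made-up values; the errors are 1, 0, -2, 0.
toy_true = np.array([3.0, 5.0, 2.0, 8.0])
toy_hat = np.array([2.0, 5.0, 4.0, 8.0])

print(mse(toy_true, toy_hat))  # 1.25
print(mae(toy_true, toy_hat))  # 0.75
```

Squaring makes the single error of 2 dominate the MSE, while the MAE weights all errors linearly and stays in the same units as the target.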

##### Check

In [None]:
model.summary()
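`model.summary()` lists the number of trainable parameters per layer. For a `Dense` layer this is inputs × units weights plus one bias per unit, so the totals can be checked by hand. The sketch below assumes 13 input features, a hypothetical value chosen for illustration; use `x_train.shape[1]` for the real one.

```python
def dense_param_count(n_inputs, n_units):
    """Weights (n_inputs * n_units) plus one bias per unit."""
    return n_inputs * n_units + n_units

n_features = 13            # assumed input width, for illustration only
layer_units = [64, 64, 1]  # units of the three Dense layers

counts = []
n_in = n_features
for units in layer_units:
    counts.append(dense_param_count(n_in, units))
    n_in = units           # this layer's output feeds the next layer

print(counts)       # [896, 4160, 65]
print(sum(counts))  # 5121
```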

##### Execution

###### Training

In [None]:
batch_size = 20
n_epoch = 200
hist = model.fit(x_train,
                 y_train,
                 epochs=n_epoch,
                 validation_data=(x_test, y_test),
                 verbose=0,
                 batch_size=batch_size)
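With mini-batch training, each epoch performs one weight update per batch, and the last batch may be smaller than `batch_size`. The sketch below works out the update count for the settings above. The training-set size of 354 rows is an assumed figure for illustration; use `len(x_train)` for the real one.

```python
import math

def updates_per_epoch(n_samples, batch_size):
    """Number of mini-batches (weight updates) needed to cover all samples once."""
    return math.ceil(n_samples / batch_size)

n_train_rows = 354  # assumed training-set size, for illustration only
steps = updates_per_epoch(n_train_rows, 20)

print(steps)        # 18 updates per epoch (17 full batches plus one of 14 rows)
print(steps * 200)  # 3600 updates over 200 epochs
```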

###### Inference

In [None]:
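# Predict over the full standardized dataset (training and test rows together)
y_pred = model.predict(x)
# Display the predictions as a 1-D array; y_pred itself keeps its (n, 1) shape
y_pred.flatten()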
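`model.predict` returns one row per sample and one column per output unit, so the predictions here have shape `(n, 1)`. `flatten()` returns a new 1-D copy and leaves the original array untouched, so the call above only changes what the cell displays. A small NumPy illustration with made-up values:

```python
import numpy as np

# Made-up predictions with the same layout as a single-output model: shape (3, 1).
toy_pred = np.array([[21.5], [30.1], [18.7]])

flat = toy_pred.flatten()  # new 1-D array; toy_pred is unchanged
print(toy_pred.shape)      # (3, 1)
print(flat.shape)          # (3,)

flat[0] = 0.0              # modifying the copy...
print(toy_pred[0, 0])      # 21.5  (...does not affect the original)
```

To keep the 1-D version for later use, assign the result to a name, as `flat` does here.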

##### Evaluation

###### Plot Actual and Predicted Values

In [None]:
plt.plot(y_org, color='blue') # actual values
plt.plot(y_pred, color='red') # predicted values
plt.show()

###### [Show Scores](https://dotnetdevelopmentinfrastructure.osscons.jp/index.php?%E3%83%87%E3%83%BC%E3%82%BF%E3%83%9E%E3%82%A4%E3%83%8B%E3%83%B3%E3%82%B0%EF%BC%88DM%EF%BC%89-%20CRISP-DM#uf759972)
- Mean Absolute Error (MAE)
- Mean Squared Error (MSE)

In [None]:
# evaluate() returns [loss, metric], i.e. [mse, mae] for the compile settings above
train_mse_score, train_mae_score = model.evaluate(x_train, y_train)
test_mse_score, test_mae_score = model.evaluate(x_test, y_test)
print("train_mae_score: ",train_mae_score)
print("test_mae_score: ",test_mae_score)
print("train_mse_score: ",train_mse_score)
print("test_mse_score: ",test_mse_score)

###### Show Training History

In [None]:
def plot_history_loss(hist):
    plt.plot(hist.history['loss'],label="loss for training")
    plt.plot(hist.history['val_loss'],label="loss for validation")
    plt.title('model loss')
    plt.xlabel('epoch')
    plt.ylabel('loss')
    plt.legend(loc='best')
    plt.show()

In [None]:
plot_history_loss(hist)
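One common way to read such a curve is to look for the epoch where the validation loss is lowest. If the validation loss rises after that point while the training loss keeps falling, the model is probably overfitting. The sketch below finds that epoch in a made-up dictionary shaped like `hist.history`; the numbers are illustrative only and `best_epoch` is a hypothetical helper.

```python
import numpy as np

def best_epoch(history, key='val_loss'):
    """Return (1-based epoch, value) where the monitored quantity is lowest."""
    values = np.asarray(history[key], dtype=float)
    idx = int(np.argmin(values))
    return idx + 1, float(values[idx])

# Made-up loss values, NOT results from this notebook.
toy_history = {
    'loss':     [90.0, 40.0, 20.0, 12.0, 9.0, 7.0],
    'val_loss': [95.0, 45.0, 25.0, 18.0, 19.0, 21.0],
}

epoch, value = best_epoch(toy_history)
print(epoch, value)  # 4 18.0
```

With the real run, `best_epoch(hist.history)` would give the corresponding epoch. The `EarlyStopping` callback imported during setup is one way to stop training automatically once a monitored quantity stops improving.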

### DNN Classifier