# Sprint 15 Assignment: Deep Learning Frameworks 2

## Purpose of this assignment

- Become able to read framework code
- Become able to keep learning new frameworks
- Run the models whose theory you already know in a framework

## Official examples

Deep learning frameworks publish official example code for a wide range of models.

## [Problem 1] Divide up and run the official examples

Divide up the official TensorFlow examples and run them.

Each person picks one of the following, runs it, and briefly presents the result.

research

A wide variety of code is published here, from standard models to the latest ones.

[models/research at master · tensorflow/models](https://github.com/tensorflow/models/tree/master/research)

tutorials

Contains simple models prepared as TensorFlow tutorials.

[models/tutorials at master · tensorflow/models](https://github.com/tensorflow/models/tree/master/tutorials)

**Run on Google Colab because a GPU was needed; kept in a separate file.**  
[Sprint15 Problem 1 on GitHub](https://github.com/yuuhi-s/diveintocode-ml/blob/master/diveintocode-term2/sprint15/sprint15-dnn-framework2_question1.ipynb)

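## Rewriting in a different framework

We port the TensorFlow code written in Sprint14 for the following four tasks to a different framework.

- Iris (binary classification using only Iris-versicolor and Iris-virginica)
- Iris (multi-class classification using all three species as the target)
- House Prices
- MNIST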

## Rewriting in Keras

For Keras, use the tf.keras module included in TensorFlow.

Keras models can be written in several styles, such as the Sequential model or the Functional API; no particular style is required here.

## [Problem 2] Train Iris (binary classification) with Keras

Rewrite the binary classification on the Iris dataset from Sprint14 in Keras.

In [1]:
import numpy as np
import pandas as pd
import tensorflow as tf
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import OneHotEncoder
from sklearn.metrics import r2_score
from keras.datasets import mnist

Using TensorFlow backend.


In [2]:
#Load the dataset
dataset_path = 'Iris.csv'
df = pd.read_csv(dataset_path)

#Keep only the two species used for binary classification
df = df[(df['Species'] == 'Iris-versicolor') | (df['Species'] == 'Iris-virginica')]
y = df['Species']
X = df.loc[:, ['SepalLengthCm', 'SepalWidthCm', 'PetalLengthCm', 'PetalWidthCm']]
y = np.array(y)
X = np.array(X)

#Convert labels to integers
y[y == 'Iris-versicolor'] = 0
y[y == 'Iris-virginica'] = 1

#Cast to int and reshape to a column vector of shape (n_samples, 1)
y = y.astype(int)[:, np.newaxis]

#Split into train and test
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)

#Split train further into train and validation
X_train, X_val, y_train, y_val = train_test_split(X_train, y_train, test_size=0.2, random_state=0)

In [3]:
#Input layer
input_data = tf.keras.layers.Input(shape=(4,))

#Hidden layers
x = tf.keras.layers.Dense(100, activation=tf.nn.relu, kernel_initializer='he_normal')(input_data)
x = tf.keras.layers.Dense(50, activation=tf.nn.relu, kernel_initializer='he_normal')(x)

#Output layer
output = tf.keras.layers.Dense(1, activation=tf.nn.sigmoid, kernel_initializer='he_normal')(x)

#Build the model from the input and output tensors
model = tf.keras.Model(inputs=input_data, outputs=output)



In [4]:
model.summary()

_________________________________________________________________
Layer (type)                 Output Shape              Param #   
input_1 (InputLayer)         (None, 4)                 0         
_________________________________________________________________
dense (Dense)                (None, 100)               500       
_________________________________________________________________
dense_1 (Dense)              (None, 50)                5050      
_________________________________________________________________
dense_2 (Dense)              (None, 1)                 51        
Total params: 5,601
Trainable params: 5,601
Non-trainable params: 0
_________________________________________________________________
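
The parameter counts in the summary above can be reproduced by hand: a Dense layer with `n_in` inputs and `n_out` units holds `n_in * n_out` weights plus `n_out` biases. The sketch below is only an illustration; `dense_params` is a hypothetical helper, not part of Keras.

```python
def dense_params(n_in, n_out):
    """Trainable parameters of a fully connected layer (weights + biases)."""
    return n_in * n_out + n_out

# Layer widths of the binary Iris model: 4 inputs -> 100 -> 50 -> 1 output
widths = [4, 100, 50, 1]
per_layer = [dense_params(a, b) for a, b in zip(widths, widths[1:])]
total = sum(per_layer)

print(per_layer)
print(total)
```

The same helper can be applied to the other models in this notebook by changing `widths`.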


In [5]:
#Compile
model.compile(loss='binary_crossentropy',
              optimizer=tf.train.AdamOptimizer(learning_rate=0.01),
              metrics=['accuracy'])

In [6]:
#Train
history = model.fit(X_train, y_train, batch_size=10, epochs=10, verbose=1, validation_data=(X_val, y_val))

Train on 64 samples, validate on 16 samples
Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10


In [7]:
#Predict
y_pred_proba = model.predict(X_test)[:, 0]
#Threshold the probabilities at 0.5 to get class labels
y_pred = np.where(y_pred_proba > 0.5, 1, 0)

print("y_pred_proba", y_pred_proba)
print("y_pred", y_pred)

y_pred_proba [0.05722302 0.94213897 0.05412406 0.9506238  0.7354522  0.94088006
 0.40747905 0.4921059  0.9653951  0.62999886 0.92520547 0.92174923
 0.9466534  0.23544744 0.03701541 0.05178449 0.41029733 0.02069745
 0.7864569  0.03622207]
y_pred [0 1 0 1 1 1 0 0 1 1 1 1 1 0 0 0 0 0 1 0]


In [8]:
#Evaluate
score = model.evaluate(X_test, y_test, verbose=0)
print('Test loss:', score[0])
print('Test accuracy:', score[1])

Test loss: 0.25129827857017517
Test accuracy: 0.9
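
`model.evaluate` returns the loss first and the accuracy second. As a rough sketch of what these two numbers measure, the example below computes binary cross-entropy and accuracy by hand on four made-up labels and probabilities. The values are illustrative and unrelated to the Iris results, and the clipping that Keras applies to probabilities for numerical stability is omitted.

```python
import math

# Made-up labels and predicted probabilities, for illustration only
y_true = [0, 1, 1, 0]
y_prob = [0.1, 0.9, 0.4, 0.2]

# Binary cross-entropy: mean of -[y*log(p) + (1-y)*log(1-p)]
bce = -sum(
    y * math.log(p) + (1 - y) * math.log(1 - p)
    for y, p in zip(y_true, y_prob)
) / len(y_true)

# Accuracy: fraction of samples whose thresholded prediction matches the label
y_hat = [1 if p > 0.5 else 0 for p in y_prob]
accuracy = sum(int(a == b) for a, b in zip(y_hat, y_true)) / len(y_true)

print(bce, accuracy)
```

Only the third sample is misclassified here (probability 0.4 for a positive label), and it also contributes the largest term to the loss.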


## [Problem 3] Train Iris (multi-class classification) with Keras

Rewrite the three-class classification on the Iris dataset from Sprint14 in Keras.

In [9]:
#Load the dataset
dataset_path = 'Iris.csv'
df = pd.read_csv(dataset_path)

#Convert to NumPy arrays
y = df['Species']
X = df.loc[:, ['SepalLengthCm', 'SepalWidthCm', 'PetalLengthCm', 'PetalWidthCm']]
y = np.array(y)
X = np.array(X)

#Convert labels to integers
y[y == 'Iris-setosa'] = 0
y[y == 'Iris-versicolor'] = 1
y[y == 'Iris-virginica'] = 2

#Cast to int and reshape to a column vector of shape (n_samples, 1)
y = y.astype(int)[:, np.newaxis]

#One-hot encode the labels
enc = OneHotEncoder(handle_unknown='ignore', sparse=False)
y = enc.fit_transform(y)

#Split into train and test
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)

#Split train further into train and validation
X_train, X_val, y_train, y_val = train_test_split(X_train, y_train, test_size=0.2, random_state=0)
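
The one-hot step above turns each integer label into a row containing a single 1. A minimal NumPy sketch of the same idea follows, using `np.eye` in place of `OneHotEncoder` and a few made-up labels, together with the `argmax` decoding that the prediction step relies on.

```python
import numpy as np

# Made-up integer labels for three classes
labels = np.array([0, 2, 1, 2])

# Row i of the identity matrix is the one-hot vector for class i
one_hot = np.eye(3)[labels]

# argmax along each row recovers the original integer labels
decoded = np.argmax(one_hot, axis=1)

print(one_hot)
print(decoded)
```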

In [10]:
#Input layer
input_data = tf.keras.layers.Input(shape=(4,))

#Hidden layers
x = tf.keras.layers.Dense(100, activation=tf.nn.relu, kernel_initializer='he_normal')(input_data)
x = tf.keras.layers.Dense(50, activation=tf.nn.relu, kernel_initializer='he_normal')(x)

#Output layer
output = tf.keras.layers.Dense(3, activation=tf.nn.softmax, kernel_initializer='he_normal')(x)

#Build the model from the input and output tensors
model = tf.keras.Model(inputs=input_data, outputs=output)

In [11]:
model.summary()

_________________________________________________________________
Layer (type)                 Output Shape              Param #   
input_2 (InputLayer)         (None, 4)                 0         
_________________________________________________________________
dense_3 (Dense)              (None, 100)               500       
_________________________________________________________________
dense_4 (Dense)              (None, 50)                5050      
_________________________________________________________________
dense_5 (Dense)              (None, 3)                 153       
Total params: 5,703
Trainable params: 5,703
Non-trainable params: 0
_________________________________________________________________


In [12]:
#Compile
model.compile(loss='categorical_crossentropy',
              optimizer=tf.train.AdamOptimizer(learning_rate=0.01),
              metrics=['accuracy'])

In [13]:
#Train
history = model.fit(X_train, y_train, batch_size=10, epochs=10, verbose=1, validation_data=(X_val, y_val))

Train on 96 samples, validate on 24 samples
Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10


In [14]:
#Predict
y_pred_proba = model.predict(X_test)
y_pred = np.argmax(y_pred_proba, axis=1)

print("y_pred_proba", y_pred_proba)
print("y_pred", y_pred)

y_pred_proba [[1.74333631e-10 2.35601421e-03 9.97644007e-01]
 [9.51593625e-04 9.96637464e-01 2.41102767e-03]
 [9.99867678e-01 1.32284447e-04 8.39121411e-11]
 [1.23960331e-09 6.68445304e-02 9.33155537e-01]
 [9.99254286e-01 7.45752535e-04 3.39551343e-09]
 [6.88471329e-12 7.95335043e-04 9.99204695e-01]
 [9.99468863e-01 5.31131634e-04 2.17699325e-09]
 [1.55048838e-04 9.96104240e-01 3.74066317e-03]
 [8.44952810e-05 9.94827211e-01 5.08832699e-03]
 [1.58672710e-03 9.96818662e-01 1.59457047e-03]
 [1.45725405e-08 7.76530653e-02 9.22346950e-01]
 [3.66350723e-04 9.96144295e-01 3.48929642e-03]
 [8.09532503e-05 9.83649611e-01 1.62695423e-02]
 [8.75574988e-05 9.88160372e-01 1.17521202e-02]
 [4.37615272e-05 9.58529890e-01 4.14263122e-02]
 [9.99029756e-01 9.70212626e-04 5.71235459e-09]
 [7.20438475e-05 9.57449079e-01 4.24788967e-02]
 [6.39596255e-05 9.19891655e-01 8.00444335e-02]
 [9.98539448e-01 1.46054302e-03 1.68144005e-08]
 [9.99753773e-01 2.46212643e-04 3.67579578e-10]
 [5.39855805e-09 1.39303263

In [15]:
#Evaluate
score = model.evaluate(X_test, y_test, verbose=0)
print('Test loss:', score[0])
print('Test accuracy:', score[1])

Test loss: 0.043709684163331985
Test accuracy: 0.96666664


## [Problem 4] Train House Prices with Keras

Rewrite the regression on the House Prices dataset from Sprint14 in Keras.

In [16]:
#Load the dataset
dataset_path = 'train.csv'
df = pd.read_csv(dataset_path)

#Convert to NumPy arrays
y = df['SalePrice']
X = df.loc[:, ['GrLivArea', 'YearBuilt']]
y = np.array(y)
X = np.array(X)

#Reshape the target to a column vector
y = y[:, np.newaxis]

#Log-transform the features and the target
X = np.log(X)
y = np.log(y)

#Split into train and test
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)

#Split train further into train and validation
X_train, X_val, y_train, y_val = train_test_split(X_train, y_train, test_size=0.2, random_state=0)
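
Because the target is log-transformed, the model is trained and evaluated in log space, and its predictions have to be passed through `np.exp` to get back to prices. A small sketch with made-up prices (not taken from the dataset):

```python
import numpy as np

# Made-up sale prices, for illustration only
prices = np.array([100000.0, 200000.0, 400000.0])

# Forward transform applied before training
log_prices = np.log(prices)

# Inverse transform to apply to predictions made in log space
recovered = np.exp(log_prices)

print(log_prices)
print(recovered)
```

In log space, equal ratios become equal differences: doubling a price always adds `log(2)` to the target.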

In [17]:
#Input layer
input_data = tf.keras.layers.Input(shape=(2,))

#Hidden layers
x = tf.keras.layers.Dense(200, activation=tf.nn.relu, kernel_initializer='he_normal')(input_data)
x = tf.keras.layers.Dense(100, activation=tf.nn.relu, kernel_initializer='he_normal')(x)

#Output layer (no activation, since this is regression)
output = tf.keras.layers.Dense(1, kernel_initializer='he_normal')(x)

#Build the model from the input and output tensors
model = tf.keras.Model(inputs=input_data, outputs=output)

In [18]:
model.summary()

_________________________________________________________________
Layer (type)                 Output Shape              Param #   
input_3 (InputLayer)         (None, 2)                 0         
_________________________________________________________________
dense_6 (Dense)              (None, 200)               600       
_________________________________________________________________
dense_7 (Dense)              (None, 100)               20100     
_________________________________________________________________
dense_8 (Dense)              (None, 1)                 101       
Total params: 20,801
Trainable params: 20,801
Non-trainable params: 0
_________________________________________________________________


In [19]:
#Compile
model.compile(loss='mean_squared_error',
              optimizer=tf.train.AdamOptimizer(learning_rate=0.001))



In [20]:
#Train
history = model.fit(X_train, y_train, batch_size=10, epochs=15, verbose=1, validation_data=(X_val, y_val))

Train on 934 samples, validate on 234 samples
Epoch 1/15
Epoch 2/15
Epoch 3/15
Epoch 4/15
Epoch 5/15
Epoch 6/15
Epoch 7/15
Epoch 8/15
Epoch 9/15
Epoch 10/15
Epoch 11/15
Epoch 12/15
Epoch 13/15
Epoch 14/15
Epoch 15/15


In [21]:
#Predict
y_pred = model.predict(X_test)

print("y_pred", y_pred)

y_pred [[12.552007 ]
 [12.124228 ]
 [11.869594 ]
 [12.366162 ]
 [11.809992 ]
 [11.7346115]
 [12.074779 ]
 [11.991236 ]
 [12.975975 ]
 [11.83131  ]
 [11.998376 ]
 [12.195977 ]
 [12.3569355]
 [11.670026 ]
 [11.792945 ]
 [11.954691 ]
 [12.327058 ]
 [11.75028  ]
 [11.89304  ]
 [12.252234 ]
 [12.003289 ]
 [11.640494 ]
 [11.592439 ]
 [12.081495 ]
 [12.298425 ]
 [12.197667 ]
 [12.141098 ]
 [11.354317 ]
 [12.216508 ]
 [11.877198 ]
 [12.242801 ]
 [12.25414  ]
 [11.684465 ]
 [12.448808 ]
 [12.388022 ]
 [12.059301 ]
 [12.135438 ]
 [11.657858 ]
 [12.406813 ]
 [12.600335 ]
 [12.4139805]
 [11.986899 ]
 [11.967882 ]
 [12.289934 ]
 [12.650417 ]
 [12.218814 ]
 [11.598187 ]
 [11.669319 ]
 [12.198355 ]
 [11.683875 ]
 [12.61772  ]
 [11.794726 ]
 [11.975888 ]
 [11.554793 ]
 [12.17963  ]
 [11.649259 ]
 [12.020728 ]
 [12.271979 ]
 [11.825083 ]
 [11.558705 ]
 [11.7810545]
 [11.773788 ]
 [11.896198 ]
 [11.845025 ]
 [12.201434 ]
 [12.067108 ]
 [11.686911 ]
 [12.213786 ]
 [11.890805 ]
 [12.234686 ]
 [12.052661 ]

In [22]:
#Evaluate
#MSE
score = model.evaluate(X_test, y_test, verbose=0)
print('Test mse:', score)

#R2
r2 = r2_score(y_test, y_pred)
print('Test R2:', r2)

Test mse: 0.06874046178713236
Test R2: 0.5459674125185714
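
`r2_score` compares the model's squared error with that of always predicting the mean: R² = 1 - SS_res / SS_tot. The sketch below computes it by hand on made-up values, and also converts the test MSE printed above into an RMSE in log space. The interpretation that follows is approximate.

```python
import math
import numpy as np

# Made-up targets and predictions, for illustration only
y_true = np.array([1.0, 2.0, 3.0, 4.0])
y_hat = np.array([1.5, 2.0, 2.5, 4.0])

ss_res = np.sum((y_true - y_hat) ** 2)          # residual sum of squares
ss_tot = np.sum((y_true - y_true.mean()) ** 2)  # total sum of squares
r2 = 1 - ss_res / ss_tot

# RMSE in log space, from the test MSE printed above
rmse_log = math.sqrt(0.06874046178713236)

print(r2, rmse_log)
```

An RMSE of about 0.26 in log space corresponds to a typical multiplicative error of roughly exp(0.26) ≈ 1.3, so predictions are off by around 30% in price.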


## [Problem 5] Train MNIST with Keras

Rewrite the multi-class image classification on the MNIST dataset from Sprint14 in Keras.

In [23]:
#Load the data
(X_train, y_train), (X_test, y_test) = mnist.load_data()

#Reshape the labels to column vectors
y_train = y_train.astype(int)[:, np.newaxis]
y_test = y_test.astype(int)[:, np.newaxis]

#One-hot encode (fit on the training labels, reuse the same encoder for test)
enc = OneHotEncoder(handle_unknown='ignore', sparse=False)
y_train = enc.fit_transform(y_train)
y_test = enc.transform(y_test)

#Flatten the images and scale pixel values to [0, 1]
X_train = X_train.reshape(-1, 784)
X_test = X_test.reshape(-1, 784)
X_train = X_train.astype(np.float64)
X_test = X_test.astype(np.float64)
X_train /= 255
X_test /= 255

#Split off a validation set
X_train, X_val, y_train, y_val = train_test_split(X_train, y_train, test_size=0.2)
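
The preprocessing above flattens each 28×28 image into a 784-dimensional vector and rescales pixel values from the 0 to 255 range into 0 to 1. A minimal sketch on a random stand-in array (not real MNIST data) shows the resulting shape and value range:

```python
import numpy as np

# Random stand-in for a batch of 5 grayscale images (not real MNIST data)
rng = np.random.default_rng(0)
images = rng.integers(0, 256, size=(5, 28, 28), dtype=np.uint8)

# Flatten each image to a vector, cast to float, and scale into [0, 1]
flat = images.reshape(-1, 784).astype(np.float64) / 255

print(flat.shape)
print(flat.min(), flat.max())
```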

In [24]:
#Input layer
input_data = tf.keras.layers.Input(shape=(784,))

#Hidden layers
x = tf.keras.layers.Dense(100, activation=tf.nn.relu, kernel_initializer='he_normal')(input_data)
x = tf.keras.layers.Dense(50, activation=tf.nn.relu, kernel_initializer='he_normal')(x)

#Output layer
output = tf.keras.layers.Dense(10, activation=tf.nn.softmax, kernel_initializer='he_normal')(x)

#Build the model from the input and output tensors
model = tf.keras.Model(inputs=input_data, outputs=output)

In [25]:
model.summary()

_________________________________________________________________
Layer (type)                 Output Shape              Param #   
input_4 (InputLayer)         (None, 784)               0         
_________________________________________________________________
dense_9 (Dense)              (None, 100)               78500     
_________________________________________________________________
dense_10 (Dense)             (None, 50)                5050      
_________________________________________________________________
dense_11 (Dense)             (None, 10)                510       
Total params: 84,060
Trainable params: 84,060
Non-trainable params: 0
_________________________________________________________________


In [26]:
#Compile
model.compile(loss='categorical_crossentropy',
              optimizer=tf.train.AdamOptimizer(learning_rate=0.01),
              metrics=['accuracy'])

In [27]:
#Train
history = model.fit(X_train, y_train, batch_size=10, epochs=10, verbose=1, validation_data=(X_val, y_val))

Train on 48000 samples, validate on 12000 samples
Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10


In [28]:
#Predict
y_pred_proba = model.predict(X_test)

y_pred = np.argmax(y_pred_proba, axis=1)

print("y_pred_proba", y_pred_proba)
print("y_pred", y_pred)

y_pred_proba [[3.49033591e-34 3.16157802e-08 3.65171028e-13 ... 1.00000000e+00
  7.72079122e-26 5.74775338e-09]
 [1.53312136e-17 1.96032506e-08 9.99561012e-01 ... 2.91927223e-04
  5.91441562e-08 0.00000000e+00]
 [0.00000000e+00 9.99999762e-01 1.33244322e-14 ... 4.16143548e-17
  2.34985350e-07 2.19536249e-28]
 ...
 [3.05605047e-28 7.24789880e-17 4.26016887e-25 ... 1.07663241e-13
  1.07337176e-14 1.99388211e-07]
 [0.00000000e+00 0.00000000e+00 0.00000000e+00 ... 0.00000000e+00
  0.00000000e+00 0.00000000e+00]
 [0.00000000e+00 0.00000000e+00 0.00000000e+00 ... 0.00000000e+00
  0.00000000e+00 0.00000000e+00]]
y_pred [7 2 1 ... 4 5 6]


In [29]:
#Evaluate
score = model.evaluate(X_test, y_test, verbose=0)
print('Test loss:', score[0])
print('Test accuracy:', score[1])

Test loss: 0.2697908725791796
Test accuracy: 0.9465
