## Train Keras : GPU

References:<br>
- Making full use of the TVM deep learning compiler and the major deep learning frameworks on Colaboratory<br>
https://qiita.com/stakemura/items/1761be70a06fa8ee853f

- Speed comparison of deep learning libraries using a simple CNN<br>
https://qiita.com/daigo0927/items/8092f3ff5276ffc4f088


## Enabling GPU mode

From the menu, select<br>
　　<strong>Runtime  ⇒  Change runtime type</strong> <br>
 and, in the dialog that appears, set<br>
- Runtime type  = <font color='red'><strong>Python3</strong></font>
- Hardware accelerator  = <font color='red'><strong>GPU</strong></font>
- Omit code cell output when saving this notebook = <font color='red'><strong>OFF</strong></font>

then click the [Save] button.

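## Mounting Google Drive

### <font color='red'>Note</font>
The first time the code below is run in a runtime, it displays a URL for obtaining an <font color='red'><strong>authorization code</strong></font>.<br>
Click the URL, select your account on the linked page and authorize access,<br>
then copy the authorization code that is displayed and paste it into the input field below to complete the mount.

### Reference:
　　How to use Google Drive<br>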
　　https://www.appsupport.jp/category/drive/

In [1]:
from google.colab import drive
drive.mount('/content/drive')

Go to this URL in a browser: https://accounts.google.com/o/oauth2/auth?client_id=947318989803-6bn6qk8qdgf4n4g3pfee6491hc0brc4i.apps.googleusercontent.com&redirect_uri=urn%3Aietf%3Awg%3Aoauth%3A2.0%3Aoob&scope=email%20https%3A%2F%2Fwww.googleapis.com%2Fauth%2Fdocs.test%20https%3A%2F%2Fwww.googleapis.com%2Fauth%2Fdrive%20https%3A%2F%2Fwww.googleapis.com%2Fauth%2Fdrive.photos.readonly%20https%3A%2F%2Fwww.googleapis.com%2Fauth%2Fpeopleapi.readonly&response_type=code

Enter your authorization code:
··········
Mounted at /content/drive


## Adding a path so that custom packages can be imported

In [0]:
import sys
sys.path.append('/content/drive/My Drive/compare_deeplibs')
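
If the cell above is re-run, the same entry is appended to `sys.path` again, and if Drive is not mounted the append succeeds silently and the later import fails. The following is a minimal sketch of a guarded variant; `add_package_dir` is a hypothetical helper that is not part of the original notebook.

```python
import os
import sys


def add_package_dir(path, search_path=None):
    """Append `path` to the module search path once, if it is a real directory.

    Hypothetical helper for illustration; returns True when the directory
    exists and is on the search path afterwards.
    """
    if search_path is None:
        search_path = sys.path
    if not os.path.isdir(path):
        # Typically means Google Drive has not been mounted yet.
        return False
    if path not in search_path:
        search_path.append(path)
    return True


# Path used in this notebook; adjust it to your own Drive layout.
ok = add_package_dir('/content/drive/My Drive/compare_deeplibs')
print('package directory available:', ok)
```

Returning a flag instead of raising keeps the cell safe to run before the mount step, at the cost of having to check the result.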

In [3]:
!ls -l /content/drive/'My Drive'/compare_deeplibs

total 1328
drwx------ 2 root root    4096 Aug 25 05:33 CIFAR10
-rw------- 1 root root   12286 Aug 25 09:55 lap_record.csv
-rw------- 1 root root 1172512 Aug 25 08:28 model_torch.pth
drwx------ 2 root root    4096 Aug 24 23:11 __pycache__
-rw------- 1 root root   19192 Aug 25 09:55 train_Chainer_GPU_Tesla-K80.ipynb
-rw------- 1 root root   19208 Aug 25 09:55 train_Chainer_GPU_Tesla-T4.ipynb
-rw------- 1 root root   20684 Aug 25 09:40 train_Keras_GPU_Tesla-K80.ipynb
-rw------- 1 root root   11569 Aug 25 09:56 train_Keras_GPU_Tesla-T4.ipynb
-rw------- 1 root root   30977 Aug 25 06:51 train_Keras_TPU.ipynb
-rw------- 1 root root   19410 Aug 25 09:29 train_PyTorch_GPU_Tesla-T4.ipynb
-rw------- 1 root root   19678 Aug 25 06:38 train_TensorFlow_GPU_Tesla-K80.ipynb
-rw------- 1 root root   19457 Aug 25 09:34 train_TensorFlow_GPU_Tesla-T4.ipynb
-rw------- 1 root root    3432 Aug 24 22:59 utils.py


In [4]:
from utils import load_cifar10, load_cifar100, show_progress

Using TensorFlow backend.


## Checking the TensorFlow and Keras versions

In [5]:
import tensorflow as tf
import keras

print("TensorFlow: ", tf.__version__)
print("Keras     : ", keras.__version__)

TensorFlow:  1.14.0
Keras     :  2.2.4


## Displaying GPU device information

In [6]:
from torch import cuda
assert cuda.is_available()
assert cuda.device_count() > 0
device_name = cuda.get_device_name(cuda.current_device())
print(device_name)

Tesla T4
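
The cell above relies on PyTorch only to read the device name. As an alternative that needs no deep learning framework, the sketch below queries `nvidia-smi` directly. It assumes the NVIDIA driver tools are on the `PATH`; `list_gpu_names` is a hypothetical helper, and it returns an empty list when no GPU tooling is found.

```python
import shutil
import subprocess


def list_gpu_names():
    """Return the GPU names reported by nvidia-smi, or [] if it is unavailable."""
    if shutil.which('nvidia-smi') is None:
        return []
    try:
        out = subprocess.run(
            ['nvidia-smi', '--query-gpu=name', '--format=csv,noheader'],
            stdout=subprocess.PIPE,
            stderr=subprocess.DEVNULL,
            universal_newlines=True,
            check=True,
        ).stdout
    except (subprocess.CalledProcessError, OSError):
        return []
    # One GPU name per non-empty output line.
    return [line.strip() for line in out.splitlines() if line.strip()]


gpu_names = list_gpu_names()
print(gpu_names if gpu_names else 'no NVIDIA GPU visible')
```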


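## Suppressing warnings

Warnings announce upcoming changes such as deprecations, so it is worth commenting out each statement in the cell below at least once and reading what is printed.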

In [0]:
import warnings
warnings.filterwarnings('ignore')

import tensorflow as tf
tf.logging.set_verbosity(tf.logging.ERROR)
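
`warnings.filterwarnings('ignore')` silences Python warnings for the rest of the session. The sketch below shows a scoped alternative using only the standard library: warnings are hidden inside one block, or recorded so they can be reviewed once. `noisy` is a hypothetical stand-in for a library call. This covers Python warnings only and does not affect TensorFlow's own log output.

```python
import warnings


def noisy():
    """Hypothetical stand-in for a library call that emits a deprecation warning."""
    warnings.warn('this API will change', DeprecationWarning)
    return 42


# Silence warnings only inside the block; the previous filters are restored on exit.
with warnings.catch_warnings():
    warnings.simplefilter('ignore')
    quiet_result = noisy()

# Record warnings instead of hiding them, so they can be reviewed once.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter('always')
    noisy()

for w in caught:
    print(w.category.__name__, ':', w.message)
```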

## Function that builds the model

In [0]:
from tensorflow.keras.layers import Input, Conv2D, Activation, BatchNormalization
from tensorflow.keras.layers import MaxPooling2D, Flatten, Dense
from tensorflow.keras.models import Model

def cnn(image_size=32, num_output=10):
    w = int(image_size)

    inputs = Input(shape=(w, w, 3))
    x = Conv2D(64, (5, 5), strides=(1, 1), padding='same')(inputs)
    x = BatchNormalization()(x)
    x = Activation('relu')(x)
    x = MaxPooling2D((2, 2))(x)

    x = Conv2D(128, (5, 5), strides=(1, 1), padding='same')(x)
    x = BatchNormalization()(x)
    x = Activation('relu')(x)
    x = MaxPooling2D((2, 2))(x)
    
    x = Flatten()(x)
    x = Dense(num_output)(x)
    outputs = Activation('softmax')(x)

    model = Model(inputs=inputs, outputs=outputs)
    model.summary()
    
    return model
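
The parameter counts reported by `model.summary()` can be checked by hand. The sketch below recomputes them for the default arguments (`image_size=32`, `num_output=10`). The helper functions are hypothetical, and the names `batch_normalization_1` and `dense` are assumed to follow Keras default naming.

```python
def conv2d_params(kernel, in_ch, out_ch):
    # kernel x kernel x in_ch weights per filter, plus one bias per filter
    return kernel * kernel * in_ch * out_ch + out_ch


def batchnorm_params(channels):
    # gamma, beta, moving mean and moving variance for each channel
    return 4 * channels


def dense_params(in_features, out_features):
    # full weight matrix plus one bias per output unit
    return in_features * out_features + out_features


image_size, num_output = 32, 10
side = image_size // 2 // 2  # two 2x2 max-poolings: 32 -> 16 -> 8

counts = {
    'conv2d': conv2d_params(5, 3, 64),
    'batch_normalization': batchnorm_params(64),
    'conv2d_1': conv2d_params(5, 64, 128),
    'batch_normalization_1': batchnorm_params(128),
    'dense': dense_params(side * side * 128, num_output),
}

for name, n in counts.items():
    print('{:24s}{:>10,d}'.format(name, n))
print('{:24s}{:>10,d}'.format('total', sum(counts.values())))
```

The first three values (4864, 256 and 204928) agree with the summary printed when training is run in this notebook.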

## Class that manages training

In [0]:
import time
import os
import numpy as np
from tensorflow.keras.optimizers import Adam

class Trainer(object):
    def __init__(self, num_epochs, batch_size):
        self.num_epochs = num_epochs
        self.batch_size = batch_size
        self.net = cnn()
        self.opt = Adam()
        self.load_cifar10()
        self.net.compile(loss='categorical_crossentropy', optimizer=self.opt)

    def load_cifar10(self):
        (self.x_train, self.y_train), (self.x_test, self.y_test) = load_cifar10()

    def load_cifar100(self):
        (self.x_train, self.y_train), (self.x_test, self.y_test) = load_cifar100()

    def train(self):
        num_batches = len(self.x_train) // self.batch_size
        print('epochs : {}, number of batches : {}'.format(self.num_epochs, num_batches))

        lap_times = []
        # training iteration
        for e in range(self.num_epochs):
            # shuffle the sample indices each epoch (size follows the loaded dataset)
            permute_idx = np.random.permutation(len(self.x_train))
            lap_time = []
            
            for b in range(num_batches):
                x_batch = self.x_train[permute_idx[b*self.batch_size:(b+1)*self.batch_size]]
                y_batch = self.y_train[permute_idx[b*self.batch_size:(b+1)*self.batch_size]]

                s_time = time.time()
                loss = self.net.train_on_batch(x_batch, y_batch)
                e_time = time.time()
                lap_time.append(e_time - s_time)

                if b % 10 == 0:
                    preds = self.net.predict(x_batch)
                    acc = np.mean(np.sum(preds*y_batch, axis = 1))
                    show_progress(e+1, b+1, num_batches, loss, acc)

            # record single epoch training lap-time
            lap_times.append(np.sum(lap_time))

            # validation
            accs_val = []
            for b in range(int(len(self.x_test) / self.batch_size)):
                x_val = self.x_test[b*self.batch_size:(b+1)*self.batch_size]
                y_val = self.y_test[b*self.batch_size:(b+1)*self.batch_size]
                
                preds_val = self.net.predict(x_val)
                acc_val = np.mean(np.sum(preds_val * y_val, axis=1))
                accs_val.append(acc_val)
            print('\n{} epoch validation accuracy {}'.format(e+1, np.mean(accs_val)))

            # save trained model
            #if not os.path.exists('/content/drive/My Drive/compare_deeplibs/model_keras'):
            #    os.mkdir('/content/drive/My Drive/compare_deeplibs/model_keras')

            #self.net.save_weights('/content/drive/My Drive/compare_deeplibs/model_keras/model_{}.h5'.format(e))

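        # append one row per run: framework, device, then one lap time per epoch
        # (device_name is the global defined in the GPU device information cell)
        with open('/content/drive/My Drive/compare_deeplibs/lap_record.csv', 'a') as f:
            f.write('Keras-GPU')
            f.write(',' + device_name)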
            for lap in lap_times:
                f.write(',' + str(lap))
            f.write('\n')
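
The value that `Trainer` reports as accuracy, `np.mean(np.sum(preds * y, axis=1))`, is the mean probability assigned to the true class, which is not the same as the usual argmax accuracy. The sketch below contrasts the two on made-up predictions. It assumes that `load_cifar10` returns one-hot labels, as the `categorical_crossentropy` loss implies.

```python
import numpy as np

# Hypothetical softmax outputs for 3 samples over 3 classes.
preds = np.array([[0.6, 0.3, 0.1],
                  [0.4, 0.5, 0.1],
                  [0.2, 0.3, 0.5]])
# One-hot labels: the true classes are 0, 0 and 2.
y = np.array([[1, 0, 0],
              [1, 0, 0],
              [0, 0, 1]])

# Metric used in Trainer: mean probability assigned to the true class.
soft_acc = np.mean(np.sum(preds * y, axis=1))

# Conventional accuracy: fraction of samples whose argmax matches the label.
hard_acc = np.mean(np.argmax(preds, axis=1) == np.argmax(y, axis=1))

print('mean true-class probability:', soft_acc)
print('argmax accuracy            :', hard_acc)
```

Here the first metric is about 0.5 while the argmax accuracy is 2/3, so the two numbers should not be compared directly with accuracies reported elsewhere.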

## Function that runs training

In [0]:
def train_keras(args):
    os.environ['CUDA_VISIBLE_DEVICES'] = str(args['gpu_id'])

    trainer = Trainer(num_epochs = args['epochs'],
                      batch_size = args['batch_size'])
    trainer.train()
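
Each call to `train_keras` ends with `Trainer.train()` appending one row to `lap_record.csv`: the framework label, the device name, then one lap time per epoch. The sketch below shows one way such rows could be read back and averaged. `summarize_laps` is a hypothetical helper, and the sample rows are made up to show the layout, not measured values.

```python
import csv
import io
from statistics import mean


def summarize_laps(lines):
    """Return (framework, device, mean seconds per epoch) for each record row."""
    summary = []
    for row in csv.reader(lines):
        if len(row) < 3:
            continue  # skip blank or incomplete rows
        framework, device = row[0], row[1]
        laps = [float(v) for v in row[2:]]
        summary.append((framework, device, mean(laps)))
    return summary


# Made-up rows in the layout written by Trainer.train(); not measured values.
sample = io.StringIO(
    'Keras-GPU,Tesla T4,6.0,4.0,5.0\n'
    'Keras-GPU,Tesla K80,12.0,10.0\n'
)
for framework, device, avg in summarize_laps(sample):
    print('{:10s} {:10s} {:.2f} s/epoch'.format(framework, device, avg))
```

To use it on the real file, pass an open file object instead of the `io.StringIO` sample.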

## Recording the computation start time

When running on Google Colaboratory, the time zone must be obtained explicitly in order to display times in Japan Standard Time.

In [11]:
import datetime
import pytz

start_time = datetime.datetime.now(pytz.timezone('Asia/Tokyo'))
print(start_time)

2019-08-25 18:56:13.142215+09:00
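
Japan Standard Time is a fixed UTC+09:00 offset with no daylight saving time, so the same result can be had from the standard library alone. The sketch below is an alternative to the `pytz` call above, not the method used in this notebook. The name `JST` and the sample UTC instant are illustrative assumptions.

```python
from datetime import datetime, timedelta, timezone

# Japan Standard Time as a fixed offset (no daylight saving time to handle).
JST = timezone(timedelta(hours=9), name='JST')

now_jst = datetime.now(JST)
print(now_jst)

# Converting a known UTC instant shows the 9-hour shift.
utc_instant = datetime(2019, 8, 25, 9, 56, 13, tzinfo=timezone.utc)
print(utc_instant.astimezone(JST))  # 2019-08-25 18:56:13+09:00
```

A fixed offset is enough for Japan. For zones with daylight saving time, a time zone database such as the one `pytz` provides is still required.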


## Running training

In [12]:
args={
    'epochs'     : 20,
    'batch_size' : 128,
    'gpu_id'     : 0
}

print(args)

for key, value in args.items():
    print('{:12s} : {}'.format(key, value))

train_keras(args)

{'epochs': 20, 'batch_size': 128, 'gpu_id': 0}
epochs       : 20
batch_size   : 128
gpu_id       : 0
Model: "model"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
input_1 (InputLayer)         [(None, 32, 32, 3)]       0         
_________________________________________________________________
conv2d (Conv2D)              (None, 32, 32, 64)        4864      
_________________________________________________________________
batch_normalization (BatchNo (None, 32, 32, 64)        256       
_________________________________________________________________
activation (Activation)      (None, 32, 32, 64)        0         
_________________________________________________________________
max_pooling2d (MaxPooling2D) (None, 16, 16, 64)        0         
_________________________________________________________________
conv2d_1 (Conv2D)            (None, 16, 16, 128)       204928    
__________________________

## Displaying the time required for training

In [13]:
end_time = datetime.datetime.now(pytz.timezone('Asia/Tokyo'))
print("\nStart   Time  : " + str(start_time))
print(  "End     Time  : " + str(end_time))
print(  "Elapsed Time  : " + str(end_time - start_time))


Start   Time  : 2019-08-25 18:56:13.142215+09:00
End     Time  : 2019-08-25 18:58:53.556727+09:00
Elapsed Time  : 0:02:40.414512


## Displaying the time elapsed since the Google Colaboratory session started

In [14]:
!cat /proc/uptime | awk '{print "Elapsed time : " ($1 / 3600) " hours (" $1 " sec)"}'

Elapsed time : 0.0684194 hours (246.31 sec)
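
The same value can be read from Python without a shell. The sketch below parses the first field of `/proc/uptime` (seconds since the machine booted) and formats it as `h:mm:ss`. `parse_uptime` and `format_uptime` are hypothetical helpers, and the file exists only on Linux.

```python
def parse_uptime(text):
    """Return the uptime in seconds from the contents of /proc/uptime."""
    # The first field is seconds since boot; the second is aggregate idle time.
    return float(text.split()[0])


def format_uptime(seconds):
    """Format a duration in seconds as h:mm:ss (fractions are truncated)."""
    hours, rem = divmod(int(seconds), 3600)
    minutes, secs = divmod(rem, 60)
    return '{:d}:{:02d}:{:02d}'.format(hours, minutes, secs)


try:
    with open('/proc/uptime') as f:
        print('uptime:', format_uptime(parse_uptime(f.read())))
except OSError:
    print('/proc/uptime is not available on this system')
```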
