# 第7回講義 宿題

## 課題

今Lessonで学んだことに工夫を加えて、CNNでより高精度なCIFAR10の分類器を実装してみましょう。精度上位者はリーダーボードに載ります。

### 目標値

Accuracy 78%

### ルール

- 訓練データはx_train、 t_train、テストデータはx_testで与えられます。
- 予測ラベルは one_hot表現ではなく0~9のクラスラベル で表してください。
- **下のセルで指定されているx_train、t_train以外の学習データは使わないでください。**
- ネットワークの形などは特に制限を設けません。
- 高レベルのAPI(tf.layers)を利用しても構いません。

### 提出方法

- 2つのファイルを提出していただきます。
  - テストデータ (x_test) に対する予測ラベルをcsvファイル (ファイル名: submission_pred.csv) で提出してください。
  - それに対応するpythonのコードをsubmission_code.pyとして提出してください (%%writefileコマンドなどを利用してください)。

### 評価方法

- 予測ラベルのt_testに対する精度 (Accuracy) で評価します。
- 毎日夜24時にテストデータの一部に対する精度でLeader Boardを更新します。
- 締切日の夜24時にテストデータ全体に対する精度でLeader Boardを更新します。これを最終的な評価とします。

### データの読み込み

- この部分は修正しないでください

In [1]:
import numpy as np
import pandas as pd

def load_cifar10():
    
    # 学習データ
    x_train = np.load('/root/userspace/public/chap07/data/x_train.npy')
    t_train = np.load('/root/userspace/public/chap07/data/t_train.npy')

    # テストデータ
    x_test = np.load('/root/userspace/public/chap07/data/x_test.npy')
    
    x_train = x_train.astype('float32') / 255
    x_test = x_test.astype('float32') / 255
    
    t_train = np.eye(10)[t_train.astype('int32').flatten()]
    
    return (x_train, x_test, t_train)

### 畳み込みニューラルネットワーク(CNN)の実装

### 使用するデータの可視化

In [2]:
# %%writefile /root/userspace/chap07/submission/submission_code_VGG.py

import tensorflow as tf
from sklearn.utils import shuffle
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

rng = np.random.RandomState(1234)
random_state = 42

def tf_log(x):
    return tf.log(tf.clip_by_value(x, 1e-10, x))

### layer定義 ###

class Conv:
    def __init__(self, filter_shape, function = lambda x: x, strides = [1,1,1,1], padding = 'VALID'):
        # He initializationを使う
        # filter_shape = Height * Width * Num of input_channels * Num of output_channels
        fun_in = np.prod(filter_shape[:3])
        fun_out = np.prod(filter_shape[:2]) * filter_shape[3]
        self.W = tf.Variable(rng.uniform(
                low = -np.sqrt(6/ fun_in),
                high = np.sqrt(6/ fun_out),
                size = filter_shape
                ).astype('float32'), name = 'W')
        self.b = tf.Variable(np.zeros((filter_shape[3]), dtype = 'float32'), name = 'b')
        self.function = function
        self.strides = strides
        self.padding = padding
    
    def __call__(self, x):
        u = tf.nn.conv2d(x, self.W, strides = self.strides, padding = self.padding) + self.b
        return self.function(u)

class BatchNorm:
    def __init__(self, shape, epsilon=np.float32(1e-5)):
        self.gamma = tf.Variable(np.ones(shape, dtype='float32'), name='gamma')
        self.beta  = tf.Variable(np.zeros(shape, dtype='float32'), name='beta')
        self.epsilon = epsilon

    def __call__(self, x):
        mean, var = tf.nn.moments(x, axes=(0,1,2), keep_dims=True)
        std = tf.sqrt(var + self.epsilon)
        x_normalized = (x - mean)/std
        return self.gamma * x_normalized + self.beta    
    
class Activation:
    def __init__(self, function=lambda x: x):
        self.function = function
    
    def __call__(self, x):
        return self.function(x)

class Pooling:
    def __init__(self, ksize = [1, 2, 2, 1] , strides = [1, 2, 2, 1], padding = 'VALID'):
        self.ksize = ksize
        self.strides = strides
        self.padding = padding
    
    def __call__(self, x):
        return tf.nn.max_pool(x, ksize = self.ksize, strides = self.strides, padding = self.padding)
    
    
class Flatten:
    def __call__(self, x):
        return tf.reshape(x, (-1, np.prod(x.get_shape().as_list()[1:])))
    
    
class Dense:
    def __init__(self, in_dim, out_dim, function = lambda x: x):
        # ここでも, He Initialization
        self.W = tf.Variable(rng.uniform(
                low = - np.sqrt(6/ in_dim),
                high = np.sqrt(6/ in_dim),
                size = [in_dim, out_dim]
                ).astype('float32'), name = 'W')
        self.b = tf.Variable(np.zeros((out_dim), dtype = 'float32'), name = 'b')
        self.function = function
        self.params = [self.W, self.b]
        
    def __call__(self, x):
        u = tf.matmul(x, self.W) + self.b
        return self.function(u)

class Dropout:
    def __init__(self, dropout_keep_prob=1.0):
        self.dropout_keep_prob = dropout_keep_prob
        self.params = []
    
    def __call__(self, x):
        # 訓練時のみdropoutを適用
        return tf.cond(
            pred=is_training,
            true_fn=lambda: tf.nn.dropout(x, keep_prob=self.dropout_keep_prob),
            false_fn=lambda: x
        )



### ネットワーク ###
tf.reset_default_graph()
is_training = tf.placeholder(tf.bool, shape=())

x = tf.placeholder(tf.float32, [None, 32, 32, 3])
t = tf.placeholder(tf.float32, [None, 10])

lmd = 0.0001
dropout_keep_prob = 0.75


# VGG16 network


# Block1
h = Conv(filter_shape = (3, 3, 3, 64), function = tf.nn.relu, padding = 'SAME')(x) # [None, 32, 32, 3] -> [None, 32, 32, 64]
h = Conv(filter_shape = (3, 3, 64, 64), function = tf.nn.relu, padding = 'SAME')(h) # [None, 32, 32, 64] -> [None, 32, 32, 64]
h = BatchNorm(shape = (32,32,64))(h) # [None, 32, 32, 64] -> [None, 32, 32, 64]
h = Pooling(ksize = (1, 2, 2, 1), strides = [1, 2, 2, 1])(h) # [None, 32, 32, 64] -> [None, 16, 16, 64]
h = Dropout(dropout_keep_prob)(h) # 1 of 4 inputs is randomly excluded

# Block2
h = Conv(filter_shape = (3, 3, 64, 128), function = tf.nn.relu, padding = 'SAME')(h) # [None, 16, 16, 64] -> [None, 16, 16, 128]
h = Conv(filter_shape = (3, 3, 128, 128), function = tf.nn.relu, padding = 'SAME')(h) # [None, 16, 16, 128] -> [None, 16, 16, 128]
h = BatchNorm(shape = (16, 16, 128))(h)
h = Pooling(ksize = (1, 2, 2, 1), strides = [1, 2, 2, 1])(h) # [None, 16, 16, 128] -> [None, 8, 8, 128]
h = Dropout(dropout_keep_prob)(h) # 1 of 4 inputs is randomly excluded

# Block3
h = Conv(filter_shape = (3, 3, 128, 256), function = tf.nn.relu, padding = 'SAME')(h) # [None, 8, 8, 128] -> [None, 8, 8, 256]
h = Conv(filter_shape = (3, 3, 256, 256), function = tf.nn.relu, padding = 'SAME')(h) # [None, 8, 8, 256] -> [None, 8, 8, 256]
h = Conv(filter_shape = (3, 3, 256, 256), function = tf.nn.relu, padding = 'SAME')(h) # [None, 8, 8, 256] -> [None, 8, 8, 256]
h = BatchNorm(shape = (8, 8, 256))(h)
h = Pooling(ksize = (1, 2, 2, 1), strides = [1, 2, 2, 1], padding = 'SAME')(h) # [None, 8, 8, 256] -> [None, 4, 4, 256]
h = Dropout(dropout_keep_prob)(h) # 1 of 4 inputs is randomly excluded

# Block4
h = Conv(filter_shape = (3, 3, 256, 512), function = tf.nn.relu, padding = 'SAME')(h) # [None, 4, 4, 256] -> [None, 4, 4, 512]
h = Conv(filter_shape = (3, 3, 512, 512), function = tf.nn.relu, padding = 'SAME')(h) # [None, 4, 4, 512] -> [None, 4, 4, 512]
h = Conv(filter_shape = (3, 3, 512, 512), function = tf.nn.relu, padding = 'SAME')(h) # [None, 4, 4, 512] -> [None, 4, 4, 512]
h = BatchNorm(shape = (4, 4, 512))(h)
h = Pooling(ksize = (1, 2, 2, 1), strides = [1, 2, 2, 1], padding = 'SAME')(h) # [None, 4, 4, 512] -> [None, 2, 2, 512]
h = Dropout(dropout_keep_prob)(h) # 1 of 4 inputs is randomly excluded

# Block5
h = Conv(filter_shape = (3, 3, 512, 512), function = tf.nn.relu, padding = 'SAME')(h) # [None, 2, 2, 512] -> [None, 2, 2, 512]
h = Conv(filter_shape = (3, 3, 512, 512), function = tf.nn.relu, padding = 'SAME')(h) # [None, 2, 2, 512] -> [None, 2, 2, 512]
h = Conv(filter_shape = (3, 3, 512, 512), function = tf.nn.relu, padding = 'SAME')(h) # [None, 2, 2, 512] -> [None, 2, 2, 512]
h = BatchNorm(shape = (2, 2, 512))(h)
h = Pooling(ksize = (1, 2, 2, 1), strides = [1, 2, 2, 1], padding = 'SAME')(h) # [None, 2, 2, 512] -> [None, 1, 1, 512]
h = Dropout(dropout_keep_prob)(h) # 1 of 4 inputs is randomly excluded



print("Before Flatten, the shape of h is:{}".format(h.shape))
h = Flatten()(h)
print("After Flatten, the shape of h is:{}".format(h.shape))

h = Dense(512, 200, tf.nn.relu)(h)
h = Dropout(0.5)(h)
y = Dense(200, 10, tf.nn.softmax)(h)


cost = - tf.reduce_mean(tf.reduce_sum(t * tf_log(y), axis=1))
optimizer = tf.train.AdamOptimizer(0.01)
# optimizer = tf.train.AdadeltaOptimizer()
extra_update_ops = tf.get_collection(tf.GraphKeys.UPDATE_OPS)
with tf.control_dependencies(extra_update_ops):
    train = optimizer.minimize(cost)


### 元々あった記述 ###
# cost = - tf.reduce_mean(tf.reduce_sum(t * tf_log(y), axis=1))
# update_ops = tf.get_collection(tf.GraphKeys.UPDATE_OPS)

# with tf.control_dependencies(update_ops):
#     optimizer = tf.train.AdamOptimizer(0.01).minimize(cost)

### 前処理 ###
def gcn(x, epsilon = 1e-8):
    mean = np.mean(x, axis = (1, 2, 3), keepdims = True)
    var = np.var(x, axis = (1, 2, 3), keepdims = True)
    return (x - mean)/np.sqrt(var + epsilon)


class ZCAWhitening:
    def __init__(self, epsilon = 1e-4):
        self.epsilon = epsilon
        self.mean = None
        self.ZCA_matrix = None
        
    def fit(self, x):
        x = x.reshape(x.shape[0], -1) # 1枚の画像を１つのベクトルにしてしまう
        self.mean = np.mean(x, axis = 0) # 全ての画像のchannel kの(i,j)成分毎に平均を取る。
        x -= self.mean # 標準化
        cov_matrix = np.matmul(x.T, x)/x.shape[0] # 共分散行列の推定値
        A, d, _ = np.linalg.svd(cov_matrix)
        self.ZCA_matrix = np.matmul(A, np.matmul(np.diag(1./ np.sqrt(d + self.epsilon)), A.T))
        
    def transform(self, x):
        shape = x.shape
        x = x.reshape(x.shape[0], -1)
        x -= self.mean
        x = np.dot(x, self.ZCA_matrix.T) # ここの積の形に注意。
        return x.reshape(shape)

    
x_train, x_test, t_train = load_cifar10()
x_train, x_valid, t_train, t_valid = train_test_split(x_train, t_train, test_size=0.1, random_state=random_state)
zca = ZCAWhitening()
zca.fit(x_train)
x_train_zca = zca.transform(gcn(x_train))
t_train_zca = t_train[:]
x_valid_zca = zca.transform(gcn(x_valid))
t_valid_zca = t_valid[:]
x_test_zca = zca.transform(gcn(x_test))

### 学習 ###
n_epochs = 100
batch_size = 100
n_batches = x_train.shape[0]//batch_size

# sess = tf.Session()
init = tf.global_variables_initializer()

with tf.Session() as sess:
    sess.run(init)
    for epoch in range(n_epochs):
        x_train, t_train = shuffle(x_train_zca, t_train_zca, random_state = random_state)
        for batch in range(n_batches):
            start = batch*batch_size
            finish = start + batch_size
            sess.run(train, feed_dict = {
                x:x_train_zca[start:finish], 
                t:t_train_zca[start:finish],
                is_training:True,
            })
        _pred, _cost = sess.run([y, cost], feed_dict = {
            x:x_valid_zca,
            t:t_valid_zca,
            is_training:False,
        })
        print("EPOCH:{}, Valid_Cost:{:.3f}, Valid_Accuracy:{:.3f}".format(
            epoch + 1,
            _cost,
            accuracy_score(t_valid_zca.argmax(axis = 1), _pred.argmax(axis = 1))
        ))
#         if (accuracy_score(t_valid_zca.argmax(axis = 1), _pred.argmax(axis = 1)) > 0.84):
#             print("The target accuracy is achieved! Learning process is stopping...")
#             break
    del x_train_zca, x_valid_zca, t_train_zca, t_valid_zca
    y_pred = []
    for i in range(10):
        start = i*1000
        finish = start + 1000
        _pred = sess.run(y, feed_dict = {x:x_test_zca[start:finish], is_training:False})
        y_pred = np.concatenate([y_pred,_pred.argmax(axis = 1)])
    submission = pd.Series(y_pred, name='label')
    submission.to_csv('/root/userspace/chap07/submission/submission_pred_VGG16_100epochs.csv', header=True, index_label='id')

Before Flatten, the shape of h is:(?, 1, 1, 512)
After Flatten, the shape of h is:(?, 512)
EPOCH:1, Valid_Cost:2.216, Valid_Accuracy:0.136
EPOCH:2, Valid_Cost:1.875, Valid_Accuracy:0.207
EPOCH:3, Valid_Cost:1.763, Valid_Accuracy:0.258
EPOCH:4, Valid_Cost:1.682, Valid_Accuracy:0.288
EPOCH:5, Valid_Cost:1.654, Valid_Accuracy:0.305
EPOCH:6, Valid_Cost:1.604, Valid_Accuracy:0.336
EPOCH:7, Valid_Cost:1.535, Valid_Accuracy:0.364
EPOCH:8, Valid_Cost:1.499, Valid_Accuracy:0.389
EPOCH:9, Valid_Cost:1.484, Valid_Accuracy:0.387
EPOCH:10, Valid_Cost:1.455, Valid_Accuracy:0.394
EPOCH:11, Valid_Cost:1.428, Valid_Accuracy:0.437
EPOCH:12, Valid_Cost:1.319, Valid_Accuracy:0.487
EPOCH:13, Valid_Cost:1.228, Valid_Accuracy:0.532
EPOCH:14, Valid_Cost:1.206, Valid_Accuracy:0.560
EPOCH:15, Valid_Cost:1.123, Valid_Accuracy:0.602
EPOCH:16, Valid_Cost:1.038, Valid_Accuracy:0.637
EPOCH:17, Valid_Cost:1.033, Valid_Accuracy:0.663
EPOCH:18, Valid_Cost:0.938, Valid_Accuracy:0.704
EPOCH:19, Valid_Cost:0.931, Valid_Ac

In [14]:
x_test_zca.shape

(10000, 32, 32, 3)