__Skip Connection / Bottleneck Skip Connection__

![image](https://img1.daumcdn.net/thumb/R720x0.q80/?scode=mtistory2&fname=http%3A%2F%2Fcfile7.uf.tistory.com%2Fimage%2F99F0453F5C47F1741338F0)

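- From ResNet50 onward, each Residual Block stacks 1x1, 3x3, and 1x1 convolutions to reduce the amount of computation. This is the same idea we saw in Inception: the first 1x1 convolution reduces the number of feature maps, the 3x3 convolution operates on the reduced maps, and the last 1x1 convolution expands the dimension again. Because this narrow-then-wide shape resembles a bottleneck, it is called a bottleneck layer.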
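To make the saving concrete, the following sketch counts convolution weights for a plain block (two 3x3 convolutions) and for a bottleneck block (1x1, 3x3, 1x1). The width of 256 channels and the reduction factor of 4 are illustrative assumptions, and the helper names are hypothetical; biases and batch-norm parameters are ignored.

```python
def conv_params(kernel, c_in, c_out):
    """Number of weights in a kernel x kernel convolution (biases ignored)."""
    return kernel * kernel * c_in * c_out


def plain_block_params(channels):
    # Two stacked 3x3 convolutions at full width.
    return 2 * conv_params(3, channels, channels)


def bottleneck_block_params(channels, reduction=4):
    # 1x1 reduce -> 3x3 at reduced width -> 1x1 expand.
    mid = channels // reduction
    return (conv_params(1, channels, mid)
            + conv_params(3, mid, mid)
            + conv_params(1, mid, channels))


plain = plain_block_params(256)
bottleneck = bottleneck_block_params(256)
print("plain:", plain, "bottleneck:", bottleneck)
```

Under these assumptions the bottleneck block needs roughly one seventeenth of the weights of the plain block, even though it is one layer deeper.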

__Residual Block / Identity Block__

![image](https://datascienceschool.net/upfiles/2e104ff279804e839cef46fc58ef16e7.png)

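-  When the image is halved in size, the Identity Block is used: the input is not added directly. Instead, a 1x1 convolution with stride 2 first matches the size and number of the feature maps, and the result is then added. This is called a projection shortcut.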
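The shape matching described above can be sketched in NumPy. A 1x1 convolution with stride 2 amounts to taking every second pixel and then mixing channels with a matrix. The tensor sizes, the random weights, and the helper name `conv1x1` are assumptions made only for this illustration; the residual branch output is a random stand-in.

```python
import numpy as np


def conv1x1(x, w, stride=1):
    """1x1 convolution on an NHWC tensor: subsample spatially, then mix channels."""
    x = x[:, ::stride, ::stride, :]
    return x @ w  # (N, H', W', C_in) @ (C_in, C_out) -> (N, H', W', C_out)


rng = np.random.default_rng(0)
x = rng.standard_normal((2, 24, 24, 16))     # block input
f_x = rng.standard_normal((2, 12, 12, 32))   # stand-in for the residual branch output
w_proj = rng.standard_normal((16, 32)) * 0.1

shortcut = conv1x1(x, w_proj, stride=2)      # now the same shape as f_x
out = np.maximum(shortcut + f_x, 0.0)        # add, then ReLU
print(x.shape, "->", shortcut.shape)
```

Without the projection, `x` and `f_x` could not be added because both the spatial size and the channel count differ.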

__ResNet Structure by layer__

![image](https://img1.daumcdn.net/thumb/R800x0/?scode=mtistory2&fname=https%3A%2F%2Ft1.daumcdn.net%2Fcfile%2Ftistory%2F99167C335C47F0E315)

__SENet (SE block)__

![image](https://i.imgur.com/9UFjxDA.png)

__SE block in ResNet__

![image](https://t1.daumcdn.net/cfile/tistory/9917F14A5D3EB6D535)
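As a rough companion to the figures, here is a NumPy sketch of what an SE block computes: squeeze (global average pooling per channel), excitation (two dense layers, ReLU then sigmoid), and channel-wise rescaling. It follows the same three steps as the `SE_block` method defined later in this notebook, but the weights are random and the sizes are assumptions chosen for illustration.

```python
import numpy as np


def se_block(x, w1, w2):
    """Squeeze-and-excitation on an NHWC tensor; returns the rescaled tensor and the gates."""
    squeeze = x.mean(axis=(1, 2))                   # (N, C): one average per channel
    hidden = np.maximum(squeeze @ w1, 0.0)          # (N, C // r): ReLU
    gate = 1.0 / (1.0 + np.exp(-(hidden @ w2)))     # (N, C): sigmoid, values in (0, 1)
    return x * gate[:, None, None, :], gate


rng = np.random.default_rng(0)
n, h, w, c, r = 2, 8, 8, 16, 4                      # r is the reduction ratio
x = rng.standard_normal((n, h, w, c))
w1 = rng.standard_normal((c, c // r)) * 0.1
w2 = rng.standard_normal((c // r, c)) * 0.1

scaled, gate = se_block(x, w1, w2)
print(scaled.shape, gate.shape)
```

Each channel of the input is multiplied by a single learned weight between 0 and 1, so the block changes the relative importance of channels without changing the tensor shape.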

In [1]:
# GPU setting
import os
os.environ["CUDA_DEVICE_ORDER"]="PCI_BUS_ID"
os.environ["CUDA_VISIBLE_DEVICES"]="3"

# modules setting

import numpy as np
import glob
import matplotlib.pyplot as plt
import tensorflow as tf
import time
import datetime
from utils import one_hot, train_valid_split, random_minibatch, shuffle, history
from utils import training_history

# Load Data

In [2]:
train_dir =  '/mnt/disk1/yunseob/courses/19-2_computer vision/data/STFT/v1/train'
npy_files = os.listdir(train_dir)
npy_files

['dball_7.npy',
 'dball_14.npy',
 'dball_21.npy',
 'dinner_7.npy',
 'dinner_14.npy',
 'dinner_21.npy',
 'douter_7.npy',
 'douter_14.npy',
 'douter_21.npy',
 'normal.npy']

In [3]:
normal = np.load(os.path.join(train_dir, str([i for i in npy_files if 'normal' in i][0])))
ball_7 = np.load(os.path.join(train_dir, str([i for i in npy_files if 'ball_7' in i][0])))
ball_14 = np.load(os.path.join(train_dir, str([i for i in npy_files if 'ball_14' in i][0])))
ball_21 = np.load(os.path.join(train_dir, str([i for i in npy_files if 'ball_21' in i][0])))
inner_7 = np.load(os.path.join(train_dir, str([i for i in npy_files if 'inner_7' in i][0])))
inner_14 = np.load(os.path.join(train_dir, str([i for i in npy_files if 'inner_14' in i][0])))
inner_21 = np.load(os.path.join(train_dir, str([i for i in npy_files if 'inner_21' in i][0])))
outer_7 = np.load(os.path.join(train_dir, str([i for i in npy_files if 'outer_7' in i][0])))
outer_14 = np.load(os.path.join(train_dir, str([i for i in npy_files if 'outer_14' in i][0])))
outer_21 = np.load(os.path.join(train_dir, str([i for i in npy_files if 'outer_21' in i][0])))

normal_y = one_hot(normal, 0, nb_classes = 10)
ball_7_y = one_hot(ball_7, 1, nb_classes = 10)
ball_14_y = one_hot(ball_14, 2, nb_classes = 10)
ball_21_y = one_hot(ball_21, 3, nb_classes = 10)
inner_7_y = one_hot(inner_7, 4, nb_classes = 10)
inner_14_y = one_hot(inner_14, 5, nb_classes = 10)
inner_21_y = one_hot(inner_21, 6, nb_classes = 10)
outer_7_y = one_hot(outer_7, 7, nb_classes = 10)
outer_14_y = one_hot(outer_14, 8, nb_classes = 10)
outer_21_y = one_hot(outer_21, 9, nb_classes = 10)

print("normal:", normal.shape, normal_y.shape)
print("ball_7:", ball_7.shape, ball_7_y.shape)
print("ball_14:", ball_14.shape, ball_14_y.shape)
print("ball_21:", ball_21.shape, ball_21_y.shape)
print("inner_7:", inner_7.shape, inner_7_y.shape)
print("inner_14:", inner_14.shape, inner_14_y.shape)
print("inner_21:", inner_21.shape, inner_21_y.shape)
print("outer_7:", outer_7.shape, outer_7_y.shape)
print("outer_14:", outer_14.shape, outer_14_y.shape)
print("outer_21:", outer_21.shape, outer_21_y.shape)

normal: (750, 100, 100) (750, 10)
ball_7: (750, 100, 100) (750, 10)
ball_14: (750, 100, 100) (750, 10)
ball_21: (750, 100, 100) (750, 10)
inner_7: (750, 100, 100) (750, 10)
inner_14: (750, 100, 100) (750, 10)
inner_21: (750, 100, 100) (750, 10)
outer_7: (750, 100, 100) (750, 10)
outer_14: (750, 100, 100) (750, 10)
outer_21: (750, 100, 100) (750, 10)
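The `one_hot` helper comes from the local `utils` module, whose source is not shown here. Judging only from the call signature and the printed `(750, 10)` label shapes, it might behave like the hypothetical `one_hot_sketch` below; this is a guess at its behavior, not the actual implementation.

```python
import numpy as np


def one_hot_sketch(data, label, nb_classes):
    """Hypothetical stand-in for utils.one_hot: one one-hot row per sample in `data`."""
    labels = np.zeros((len(data), nb_classes), dtype=np.float32)
    labels[:, label] = 1.0
    return labels


dummy = np.zeros((750, 100, 100), dtype=np.float32)   # same shape as one class array
ball_7_y_sketch = one_hot_sketch(dummy, 1, nb_classes=10)
print(ball_7_y_sketch.shape)
```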


# Data split

In [4]:
normal_train_x, normal_train_y, normal_valid_x, normal_valid_y = train_valid_split(normal, normal_y)
print("normal:", normal_train_x.shape, normal_train_y.shape, normal_valid_x.shape, normal_valid_y.shape)

ball_7_train_x, ball_7_train_y, ball_7_valid_x, ball_7_valid_y = train_valid_split(ball_7, ball_7_y)
ball_14_train_x, ball_14_train_y, ball_14_valid_x, ball_14_valid_y = train_valid_split(ball_14, ball_14_y)
ball_21_train_x, ball_21_train_y, ball_21_valid_x, ball_21_valid_y = train_valid_split(ball_21, ball_21_y)
print("ball_7:", ball_7_train_x.shape, ball_7_train_y.shape, ball_7_valid_x.shape, ball_7_valid_y.shape)
print("ball_14:", ball_14_train_x.shape, ball_14_train_y.shape, ball_14_valid_x.shape, ball_14_valid_y.shape)
print("ball_21:", ball_21_train_x.shape, ball_21_train_y.shape, ball_21_valid_x.shape, ball_21_valid_y.shape)

inner_7_train_x, inner_7_train_y, inner_7_valid_x, inner_7_valid_y = train_valid_split(inner_7, inner_7_y)
inner_14_train_x, inner_14_train_y, inner_14_valid_x, inner_14_valid_y = train_valid_split(inner_14, inner_14_y)
inner_21_train_x, inner_21_train_y, inner_21_valid_x, inner_21_valid_y = train_valid_split(inner_21, inner_21_y)
print("inner_7:", inner_7_train_x.shape, inner_7_train_y.shape, inner_7_valid_x.shape, inner_7_valid_y.shape)
print("inner_14:", inner_14_train_x.shape, inner_14_train_y.shape, inner_14_valid_x.shape, inner_14_valid_y.shape)
print("inner_21:", inner_21_train_x.shape, inner_21_train_y.shape, inner_21_valid_x.shape, inner_21_valid_y.shape)

outer_7_train_x, outer_7_train_y, outer_7_valid_x, outer_7_valid_y = train_valid_split(outer_7, outer_7_y)
outer_14_train_x, outer_14_train_y, outer_14_valid_x, outer_14_valid_y = train_valid_split(outer_14, outer_14_y)
outer_21_train_x, outer_21_train_y, outer_21_valid_x, outer_21_valid_y = train_valid_split(outer_21, outer_21_y)
print("outer_7:", outer_7_train_x.shape, outer_7_train_y.shape, outer_7_valid_x.shape, outer_7_valid_y.shape)
print("outer_14:", outer_14_train_x.shape, outer_14_train_y.shape, outer_14_valid_x.shape, outer_14_valid_y.shape)
print("outer_21:", outer_21_train_x.shape, outer_21_train_y.shape, outer_21_valid_x.shape, outer_21_valid_y.shape)

normal: (638, 100, 100) (638, 10) (112, 100, 100) (112, 10)
ball_7: (638, 100, 100) (638, 10) (112, 100, 100) (112, 10)
ball_14: (638, 100, 100) (638, 10) (112, 100, 100) (112, 10)
ball_21: (638, 100, 100) (638, 10) (112, 100, 100) (112, 10)
inner_7: (638, 100, 100) (638, 10) (112, 100, 100) (112, 10)
inner_14: (638, 100, 100) (638, 10) (112, 100, 100) (112, 10)
inner_21: (638, 100, 100) (638, 10) (112, 100, 100) (112, 10)
outer_7: (638, 100, 100) (638, 10) (112, 100, 100) (112, 10)
outer_14: (638, 100, 100) (638, 10) (112, 100, 100) (112, 10)
outer_21: (638, 100, 100) (638, 10) (112, 100, 100) (112, 10)
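`train_valid_split` is also imported from `utils` and not shown. The printed shapes (638 training and 112 validation samples out of 750) are consistent with holding out 15% of each class, so a split along those lines could look like the hypothetical sketch below. The ratio, the shuffling, and the seed are all assumptions.

```python
import numpy as np


def train_valid_split_sketch(x, y, valid_ratio=0.15, seed=0):
    """Hypothetical stand-in for utils.train_valid_split: shuffle, then hold out a fraction."""
    n_valid = int(len(x) * valid_ratio)
    idx = np.random.default_rng(seed).permutation(len(x))
    valid_idx, train_idx = idx[:n_valid], idx[n_valid:]
    return x[train_idx], y[train_idx], x[valid_idx], y[valid_idx]


x_demo = np.arange(750 * 4).reshape(750, 2, 2)        # small dummy inputs
y_demo = np.tile(np.eye(10)[0], (750, 1))             # every sample labeled class 0
tx, ty, vx, vy = train_valid_split_sketch(x_demo, y_demo)
print(tx.shape, ty.shape, vx.shape, vy.shape)
```

Splitting each class separately, as the cell above does, keeps the ten classes balanced in both the training and the validation set.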


In [5]:
train_X = np.vstack([normal_train_x, ball_7_train_x, ball_14_train_x, ball_21_train_x, 
                     inner_7_train_x, inner_14_train_x, inner_21_train_x,
                     outer_7_train_x, outer_14_train_x, outer_21_train_x, ])
train_Y = np.vstack([normal_train_y, ball_7_train_y, ball_14_train_y, ball_21_train_y, 
                     inner_7_train_y, inner_14_train_y, inner_21_train_y,
                     outer_7_train_y, outer_14_train_y, outer_21_train_y, ])
valid_X = np.vstack([normal_valid_x, ball_7_valid_x, ball_14_valid_x, ball_21_valid_x, 
                     inner_7_valid_x, inner_14_valid_x, inner_21_valid_x,
                     outer_7_valid_x, outer_14_valid_x, outer_21_valid_x, ])
valid_Y = np.vstack([normal_valid_y, ball_7_valid_y, ball_14_valid_y, ball_21_valid_y, 
                     inner_7_valid_y, inner_14_valid_y, inner_21_valid_y,
                     outer_7_valid_y, outer_14_valid_y, outer_21_valid_y, ])

print("Training set:", train_X.shape, train_Y.shape)
print("Validation set:", valid_X.shape, valid_Y.shape)

Training set: (6380, 100, 100) (6380, 10)
Validation set: (1120, 100, 100) (1120, 10)


In [6]:
train_X = np.expand_dims(train_X, axis = 3)
valid_X = np.expand_dims(valid_X, axis = 3)

print("Training set:", train_X.shape, train_Y.shape)
print("Validation set:", valid_X.shape, valid_Y.shape)

Training set: (6380, 100, 100, 1) (6380, 10)
Validation set: (1120, 100, 100, 1) (1120, 10)


# Model

In [7]:
input_h = 100
input_w = 100
input_ch = 1

ch = 16
# 50 50 16

r_ch_1 = 32
# 25 25 32

r_ch_2 = 32
# 13 13 32

r_ch_3 = 64
# 7 7 64

r_ch_4 = 128
# 4 4 128

n_output = 10
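The feature-map sizes at each stage can be checked with a little arithmetic. The sketch below assumes the settings used in the model cell that follows: a stride-2 convolution with `SAME` padding, a 2x2 max pooling with stride 2 and the default `VALID` padding, and stride-2 `SAME` convolutions at the start of stages 2 to 4. The helper names are hypothetical.

```python
import math


def same_out(size, stride):
    # 'SAME' padding: output = ceil(input / stride)
    return math.ceil(size / stride)


def valid_out(size, kernel, stride):
    # 'VALID' padding: output = floor((input - kernel) / stride) + 1
    return (size - kernel) // stride + 1


size = same_out(100, 2)                      # conv_1: stride 2, SAME
shapes = [("conv_1", size, 16)]
size = valid_out(size, 2, 2)                 # maxp_1: 2x2 pool, stride 2, VALID
shapes.append(("maxp_1", size, 16))
for name, channels, stride in [("se_1", 32, 1), ("se_2", 32, 2),
                               ("se_3", 64, 2), ("se_4", 128, 2)]:
    size = same_out(size, stride)
    shapes.append((name, size, channels))

for name, side, channels in shapes:
    print(name, side, side, channels)
```

Because `SAME` padding rounds up, an odd input size such as 25 becomes 13 rather than 12 after a stride-2 convolution.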

In [8]:
tf.reset_default_graph()

x = tf.placeholder(tf.float32, [None, input_h, input_w, input_ch], name = 'img')
y = tf.placeholder(tf.float32, [None, n_output], name = 'label')
batch_prob = tf.placeholder(tf.bool, name = 'bn_prob')

class SEResNet50:
    def __init__(self, ch, r_ch_1, r_ch_2, r_ch_3, r_ch_4):
        self.ch = ch
        self.r_ch_1 = r_ch_1
        self.r_ch_2 = r_ch_2
        self.r_ch_3 = r_ch_3
        self.r_ch_4 = r_ch_4
    def conv(self, x, channel, kernel_size = [3, 3], strides = (1, 1), activation = True):
        conv = tf.layers.conv2d(inputs = x, filters = channel, kernel_size = kernel_size, 
                                strides = strides, padding = "SAME")
        conv = tf.layers.batch_normalization(conv, center=True, scale=True, training=batch_prob)
        if activation == True:
            conv = tf.nn.relu(conv)
        return conv
    
    def maxp(self, conv):
        maxp = tf.layers.max_pooling2d(inputs = conv, pool_size = [2, 2], strides = 2)
        return maxp

    def SE_block(self, x, channel = None, reduction_ratio = 4):
        ch_reduced = channel // reduction_ratio  # integer width of the squeezed layer
        x_in = x
        squeeze = self.global_avg_pooling(x_in)
        excitation =  tf.layers.dense(inputs = squeeze, units = ch_reduced, 
                                      kernel_initializer = tf.contrib.layers.variance_scaling_initializer(uniform=False, factor=2.0, mode='FAN_IN', dtype=tf.float32),
                                      activation = tf.nn.relu, use_bias = False)
        excitation =  tf.layers.dense(inputs = excitation, units = channel, 
                                      kernel_initializer = tf.contrib.layers.variance_scaling_initializer(uniform=False, factor=2.0, mode='FAN_IN', dtype=tf.float32),
                                      activation = tf.nn.sigmoid, use_bias = False)
#         excitation =  tf.layers.dense(inputs = squeeze, units = ch_reduced, activation = tf.nn.relu, use_bias = False)
#         excitation =  tf.layers.dense(inputs = excitation, units = channel, activation = tf.nn.sigmoid, use_bias = False)
        excitation = tf.reshape(excitation, [-1, 1, 1, channel])
        scale = tf.multiply(x_in, excitation)
        
        return scale

    def SE_res_block(self, x, channel, strides = (1, 1), reduction_ratio = 4):
        # Bottleneck: 1x1 reduce -> 3x3 -> 1x1 expand, followed by SE recalibration.
        conv_a = self.conv(x, channel // 4, kernel_size = [1, 1], strides = strides)
        conv_b = self.conv(conv_a, channel // 4, kernel_size = [3, 3])
        conv_c = self.conv(conv_b, channel, kernel_size = [1, 1], activation = False)
        se = self.SE_block(conv_c, channel, reduction_ratio = reduction_ratio)

        # Projection shortcut: a 1x1 convolution matches the size and channel count of the main branch.
        proj_input = self.conv(x, channel, kernel_size = [1, 1], strides = strides, activation = False)
        
        return tf.nn.relu(proj_input + se)
    
    def SE_res_stage(self, x, target_ch, reduction_ratio = 4, downsample = False, n_rep = None):
        # Only the first block of a stage downsamples.
        strides = (2, 2) if downsample else (1, 1)
       
        x = self.SE_res_block(x, target_ch, strides, reduction_ratio)
        for _ in range(n_rep-1):
            x = self.SE_res_block(x, target_ch, reduction_ratio = reduction_ratio)
        return x

    def fc_layer(self, gap, n_output = None):
        flatten = tf.layers.flatten(gap)
        output = tf.layers.dense(inputs = flatten, units = n_output)
        return output

    def global_avg_pooling(self, x):
        gap = tf.reduce_mean(x, axis=[1, 2], keepdims=True)
        return gap

    def inf(self, x):
        """
        conv_1: 1
        id_~ + resnet_~: 16 x 3 = 48
        fc_lay: 1

        total: 50
        """
        conv_1 = self.conv(x, self.ch, strides = (2, 2))
        maxp_1 = self.maxp(conv_1)
        se_1 = self.SE_res_stage(maxp_1, self.r_ch_1, downsample = False, n_rep = 3)
        se_2 = self.SE_res_stage(se_1, self.r_ch_2, downsample = True, n_rep = 4)
        se_3 = self.SE_res_stage(se_2, self.r_ch_3, downsample = True, n_rep = 6)
        se_4 = self.SE_res_stage(se_3, self.r_ch_4, downsample = True, n_rep = 3)
        gap = self.global_avg_pooling(se_4)
        score = self.fc_layer(gap, n_output)
        return score

    
model = SEResNet50(ch, r_ch_1, r_ch_2, r_ch_3, r_ch_4)
score = model.inf(x)
loss = tf.losses.softmax_cross_entropy(onehot_labels=y, logits=score)
loss = tf.reduce_mean(loss)
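For reference, the loss used above can be written out in NumPy. This is a sketch of the standard softmax cross-entropy averaged over the batch, not TensorFlow's own implementation; the example logits are made up. With all-zero logits over 10 classes the loss equals ln 10 ≈ 2.303, which is a useful sanity check for the first values in a training log.

```python
import math

import numpy as np


def softmax_cross_entropy(onehot_labels, logits):
    """Mean softmax cross-entropy over a batch, computed in a numerically stable way."""
    shifted = logits - logits.max(axis=1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    return float(-(onehot_labels * log_probs).sum(axis=1).mean())


labels = np.eye(10)[[0, 3, 5, 9]]            # four one-hot labels
uniform_logits = np.zeros((4, 10))           # an untrained model with no preference
loss_uniform = softmax_cross_entropy(labels, uniform_logits)
print(loss_uniform, math.log(10))
```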

# Training

In [9]:
t_batch = 32
v_batch = 64
n_cal = 10
n_prt = 100

n_iter = 0

# LR = 1e-4 # 1e-4 ~ 5e-4 (xavier)
lr = 1e-4

update_ops = tf.get_collection(tf.GraphKeys.UPDATE_OPS)
with tf.control_dependencies(update_ops):
    optm = tf.train.AdamOptimizer(learning_rate=lr).minimize(loss)
# optm = tf.train.AdamOptimizer(lr).minimize(loss)

saver = tf.train.Saver()
sess = tf.Session()

init = tf.global_variables_initializer()
sess.run(init)
start_time = time.time() 

accr_train, accr_valid, loss_train, loss_valid = [], [], [], []
early_stopping = False

hist = training_history(n_iter, accr_train, accr_valid, loss_train, loss_valid)
hist.table()

while True:
    # Sample one training minibatch and take an optimizer step (batch norm in training mode).
    train_x, train_y = random_minibatch(train_X, train_Y, batch_size = t_batch)
    train_x, train_y = shuffle(train_x, train_y)
    
    sess.run(optm, feed_dict = {'img:0': train_x, 'label:0': train_y, 'bn_prob:0': True})
    n_iter += 1
    if n_iter % n_cal == 0:
        # Every n_cal iterations, evaluate in inference mode on the current training batch.
        c, p = sess.run([loss, score], feed_dict = {'img:0': train_x, 'label:0': train_y, 'bn_prob:0': False})

        p = np.argmax(p, axis = 1)
        l = np.argmax(train_y, axis = 1)
        a = np.mean(np.equal(p, l))
        
        # Evaluate on a random validation minibatch as well.
        valid_x, valid_y = random_minibatch(valid_X, valid_Y, batch_size = v_batch)
        c_valid, p_valid = sess.run([loss, score], feed_dict = {'img:0': valid_x, 'label:0': valid_y, 'bn_prob:0': False})

        p_valid = np.argmax(p_valid, axis = 1)
        l_valid = np.argmax(valid_y, axis = 1)
        a_valid = np.mean(np.equal(p_valid, l_valid))

        accr_valid.append(a_valid)
        loss_valid.append(c_valid)
        accr_train.append(a)
        loss_train.append(c)

        if n_iter % n_prt == 0:
            hist.prt_evl()
            
        if loss_valid[-1] == np.min(loss_valid):
            now = datetime.datetime.now()
            nowDatetime = now.strftime('%y%m%d%H%M')
            model_name = 'stft_v1_seres50_{0}_{1}_val_acc_{2:.2f}_val_loss_{3:.6f}'.format(nowDatetime, n_iter, accr_valid[-1], loss_valid[-1])
            saver.save(sess, './model/STFT/' + model_name)
        if n_iter == 20000:
            break
#         if n_iter > 1000:
#             if np.max(accr_train) < 0.9:
#                 if np.mean(loss_train[-50:-30]) <= np.mean(loss_train[-30:]) :
#                     hist.early_under()
#                     early_stopping = True
#                     break
#             if np.mean(accr_train[-50:]) >= 0.995:
#                 if (
#                     np.mean(loss_valid[-41:-21]) <= np.mean(loss_valid[-21:-1]) and
#                     loss_valid[-1] < loss_valid[-2] # np.min(loss_valid[-20:]) == loss_valid[-1]
#                     ):
#                     hist.early_over()
#                     early_stopping = True
#                     break          

train_time = int((time.time() - start_time)/60)  
hist.done(train_time, early_stopping)

np.save('/mnt/disk1/yunseob/courses/19-2_computer vision/history/SEResNet50_STFT_v1_accr', np.array(accr_train))
np.save('/mnt/disk1/yunseob/courses/19-2_computer vision/history/SEResNet50_STFT_v1_loss', np.array(loss_train))

history(save = False)   

# sess.close()

[Iter] || Train_accr || Valid_accr || Train_loss || Valid_loss
[***0] || 6.25 %    || 15.62 %    || 2.35977602 || 2.32383347
--------------------------------------------------------------
[***0] || 6.25 %    || 12.50 %    || 2.52446342 || 2.52716017
--------------------------------------------------------------
[***0] || 12.50 %    || 9.38 %    || 2.71002150 || 2.88409686
--------------------------------------------------------------
[***0] || 40.62 %    || 42.19 %    || 1.79045296 || 1.58849299
--------------------------------------------------------------


[***0] || 71.88 %    || 70.31 %    || 0.76060987 || 0.79256845
--------------------------------------------------------------
[***0] || 81.25 %    || 92.19 %    || 0.54688829 || 0.30733633
--------------------------------------------------------------
[***0] || 96.88 %    || 100.00 %    || 0.16003570 || 0.08579825
--------------------------------------------------------------
[***0] || 100.00 %    || 98.44 %    || 0.07928973 || 0.07799081
--------------------------------------------------------------
[***0] || 100.00 %    || 98.44 %    || 0.03030233 || 0.04934071
--------------------------------------------------------------
[***0] || 100.00 %    || 98.44 %    || 0.04229132 || 0.04804727
--------------------------------------------------------------
[***0] || 100.00 %    || 100.00 %    || 0.01496918 || 0.03843940
--------------------------------------------------------------
[***0] || 100.00 %    || 96.88 %    || 0.00443930 || 0.10319340
------------------------------------------------

[***0] || 100.00 %    || 100.00 %    || 0.00007134 || 0.00030367
--------------------------------------------------------------
[***0] || 100.00 %    || 100.00 %    || 0.00005576 || 0.00005996
--------------------------------------------------------------
[***0] || 100.00 %    || 100.00 %    || 0.00004788 || 0.00093934
--------------------------------------------------------------
[***0] || 100.00 %    || 100.00 %    || 0.00004024 || 0.00003091
--------------------------------------------------------------
[***0] || 100.00 %    || 100.00 %    || 0.00001968 || 0.00003739
--------------------------------------------------------------
[***0] || 100.00 %    || 100.00 %    || 0.00002494 || 0.00026230
--------------------------------------------------------------
[***0] || 100.00 %    || 100.00 %    || 0.00002524 || 0.00019071
--------------------------------------------------------------
[***0] || 100.00 %    || 100.00 %    || 0.00002254 || 0.00007964
---------------------------------------

[***0] || 100.00 %    || 100.00 %    || 0.00000655 || 0.00002104
--------------------------------------------------------------
[***0] || 100.00 %    || 100.00 %    || 0.00002417 || 0.00001442
--------------------------------------------------------------
[***0] || 100.00 %    || 100.00 %    || 0.00000895 || 0.00001970
--------------------------------------------------------------
[***0] || 100.00 %    || 100.00 %    || 0.00001569 || 0.00000733
--------------------------------------------------------------
[***0] || 100.00 %    || 100.00 %    || 0.00000817 || 0.00000856
--------------------------------------------------------------
[***0] || 100.00 %    || 100.00 %    || 0.00000652 || 0.00000943
--------------------------------------------------------------
[***0] || 100.00 %    || 100.00 %    || 0.00000668 || 0.00002270
--------------------------------------------------------------
[***0] || 100.00 %    || 100.00 %    || 0.00000444 || 0.00000679
---------------------------------------

[***0] || 100.00 %    || 100.00 %    || 0.00001046 || 0.00003878
--------------------------------------------------------------

