## MNIST
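<br>From here on we rely mainly on the higher-level API (tf.layers, etc.) and fall back on the low-level API covered earlier (tf.nn) when finer-grained model tuning is needed. (https://goo.gl/Rmy8qq)
<br>
<br><span style="color:red;"> - We use **tf.layers**, which makes it more convenient to build layers.
<br>- tf.layers.batch_normalization() makes it easy to apply **Batch Normalization**.</span>
<br>- Applying BN generally improves model performance and greatly reduces the need for careful parameter initialization, regularization, dropout, and so on.
<br>- Of course, as the network gets deeper and the problem more complex, the earlier optimization techniques can be applied together with BN to improve performance further.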
<br><br>
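
As a rough illustration of what Batch Normalization computes in training mode, the NumPy sketch below normalizes each feature with the mean and variance of the current batch, then applies a scale and a shift. The function name `batch_norm_train` and the toy data are assumptions made for this example; `eps=1e-3` mirrors the default `epsilon` of tf.layers.batch_normalization.

```python
import numpy as np

def batch_norm_train(x, gamma, beta, eps=1e-3):
    """Normalize each feature (column) of x with the statistics of the current batch."""
    mean = x.mean(axis=0)                    # per-feature batch mean
    var = x.var(axis=0)                      # per-feature batch variance
    x_hat = (x - mean) / np.sqrt(var + eps)  # zero mean, roughly unit variance
    return gamma * x_hat + beta              # learnable scale and shift

rng = np.random.default_rng(0)
toy_batch = rng.normal(loc=5.0, scale=3.0, size=(100, 4))
normalized = batch_norm_train(toy_batch, gamma=np.ones(4), beta=np.zeros(4))

print(normalized.mean(axis=0))  # close to 0 for every feature
print(normalized.std(axis=0))   # close to 1 for every feature
```

With `gamma` fixed at 1 and `beta` at 0 the output is simply the standardized batch; in the real layer both are trained along with the other weights.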

In [1]:
import numpy as np
import matplotlib.pyplot as plt

import tensorflow.compat.v1 as tf
tf.disable_v2_behavior()

import os, warnings
warnings.filterwarnings('ignore')
os.environ['TF_CPP_MIN_LOG_LEVEL'] = '2' # https://stackoverflow.com/questions/35911252/disable-tensorflow-debugging-information
tf.logging.set_verbosity(tf.logging.ERROR)

Instructions for updating:
non-resource variables are not supported in the long term


In [2]:
from tensorflow.keras import datasets, utils

(train_data, train_label), (test_data, test_label) = datasets.mnist.load_data()

train_data = train_data.reshape(60000, 784) / 255.0
test_data = test_data.reshape(10000, 784) / 255.0

train_label = utils.to_categorical(train_label) # 0~9 -> one-hot vector
test_label = utils.to_categorical(test_label) # 0~9 -> one-hot vector
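
For readers unfamiliar with one-hot encoding, the sketch below shows roughly what `utils.to_categorical` produces for integer labels. The helper `one_hot` and the sample labels are hypothetical and only serve the illustration.

```python
import numpy as np

def one_hot(labels, num_classes=10):
    """Turn integer class ids into one-hot row vectors."""
    # Row i of the identity matrix is the one-hot vector for class i.
    return np.eye(num_classes, dtype=np.float32)[labels]

sample_labels = np.array([5, 0, 4])
encoded = one_hot(sample_labels)

print(encoded.shape)           # one row per label, one column per class
print(encoded.argmax(axis=1))  # recovers the original integer labels
```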

In [3]:
# Declare the placeholders.

X = tf.placeholder(tf.float32, [None, 784])
Y = tf.placeholder(tf.float32, [None, 10])

bn_sign = tf.placeholder(tf.bool)

In [8]:
# BN order: linear combination -> BN -> activation function
# activation=None removes the activation from the dense layer so that BN can be applied first.

L1 = tf.layers.dense(X, 256, activation=None) 
L1 = tf.layers.batch_normalization(L1, training=bn_sign)
L1 = tf.nn.relu(L1)

L2 = tf.layers.dense(L1, 256, activation=None)
L2 = tf.layers.batch_normalization(L2, training=bn_sign)
L2 = tf.nn.relu(L2)

model = tf.layers.dense(L2, 10, activation=None)
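
The ordering used above (linear combination, then BN, then activation) can be sketched in NumPy as follows. The helper `dense_bn_relu` is hypothetical and fixes the BN scale at 1 and the shift at 0 for brevity. One consequence of this ordering is that, in training mode, the bias of the preceding dense layer is cancelled by the mean subtraction, which the sketch checks numerically.

```python
import numpy as np

def dense_bn_relu(x, w, b, eps=1e-3):
    z = x @ w + b                                                # 1) linear combination
    z_hat = (z - z.mean(axis=0)) / np.sqrt(z.var(axis=0) + eps)  # 2) BN (scale=1, shift=0)
    return np.maximum(z_hat, 0.0)                                # 3) activation (ReLU)

rng = np.random.default_rng(0)
inputs = rng.normal(size=(32, 8))
weights = rng.normal(size=(8, 4))

with_bias = dense_bn_relu(inputs, weights, b=rng.normal(size=4))
without_bias = dense_bn_relu(inputs, weights, b=np.zeros(4))

# Adding a constant to every row shifts the batch mean by the same constant,
# so the normalized values are unchanged up to floating-point error.
print(np.allclose(with_bias, without_bias))
```

This is why the dense bias is sometimes switched off (`use_bias=False`) when BN follows directly; the notebook keeps the default, which is harmless.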

In [9]:
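# Defined without the UPDATE_OPS dependency; the next cell redefines both cost and optimizer with it.
cost = tf.losses.softmax_cross_entropy(Y, model)
optimizer = tf.train.AdamOptimizer(1e-3).minimize(cost)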
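
To make the loss concrete, here is a NumPy sketch of softmax cross-entropy averaged over the batch. It assumes unit weights and no label smoothing, in which case the default reduction of tf.losses.softmax_cross_entropy should amount to a batch mean. The function name `softmax_cross_entropy_np` and the toy logits are made up for this example.

```python
import numpy as np

def softmax_cross_entropy_np(onehot_labels, logits):
    """Mean softmax cross-entropy over the batch (unit weights, no label smoothing)."""
    # Subtract the row-wise max before exponentiating for numerical stability.
    shifted = logits - logits.max(axis=1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    per_example = -(onehot_labels * log_probs).sum(axis=1)
    return float(per_example.mean())

toy_logits = np.array([[2.0, 1.0, 0.1],
                       [0.0, 0.0, 0.0]])
toy_labels = np.array([[1.0, 0.0, 0.0],
                       [0.0, 1.0, 0.0]])

loss = softmax_cross_entropy_np(toy_labels, toy_logits)
print(loss)
```

The second row has uniform logits, so its loss is log(3): the model is maximally unsure among three classes.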

In [10]:
cost = tf.losses.softmax_cross_entropy(Y, model) 

# Keep updating the moving_mean & moving_variance that BN computes during training.
update_ops = tf.get_collection(tf.GraphKeys.UPDATE_OPS)
with tf.control_dependencies(update_ops):
    optimizer = tf.train.AdamOptimizer(1e-3).minimize(cost) 
    
# * When training is True, moving_mean and moving_variance need to be updated.
# By default the update ops are placed in tf.GraphKeys.UPDATE_OPS,
# so they need to be added as a dependency of the train op.
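
What the update ops maintain can be sketched as an exponential moving average. The loop below is a simplified NumPy stand-in, not the TensorFlow implementation: `momentum = 0.99` mirrors the default of tf.layers.batch_normalization, the moving mean starts at zero as in the layer's default initializer, and the synthetic batches are an assumption of the example.

```python
import numpy as np

momentum = 0.99               # default `momentum` of tf.layers.batch_normalization
moving_mean = np.zeros(1)     # the layer initializes moving_mean to zeros
rng = np.random.default_rng(0)

for step in range(2000):
    # Synthetic batch whose true feature mean is 5.0.
    batch = rng.normal(loc=5.0, scale=1.0, size=(100, 1))
    batch_mean = batch.mean(axis=0)
    # One "update op": blend the old estimate with the current batch statistic.
    moving_mean = momentum * moving_mean + (1.0 - momentum) * batch_mean

print(moving_mean)  # drifts from 0 toward the true mean of 5
```

If the update never runs, `moving_mean` stays at its initial value, and inference-mode BN would normalize with the wrong statistics. That is the failure the `tf.control_dependencies(update_ops)` block guards against.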

In [11]:
is_correct = tf.equal(tf.argmax(model, 1), tf.argmax(Y, 1))
accuracy = tf.reduce_mean(tf.cast(is_correct, tf.float32))
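
The same accuracy computation, written in NumPy on a tiny made-up batch (the names `toy_logits`, `toy_labels`, `correct_np` and `accuracy_np` are hypothetical):

```python
import numpy as np

toy_logits = np.array([[0.1, 2.0, 0.3],
                       [1.5, 0.2, 0.1],
                       [0.0, 0.1, 3.0],
                       [0.9, 0.8, 0.1]])
toy_labels = np.array([[0, 1, 0],
                       [1, 0, 0],
                       [0, 1, 0],
                       [1, 0, 0]], dtype=np.float32)

# Predicted class = index of the largest logit; true class = index of the 1 in the one-hot row.
correct_np = toy_logits.argmax(axis=1) == toy_labels.argmax(axis=1)
accuracy_np = correct_np.astype(np.float32).mean()

print(correct_np)   # [ True  True False  True]
print(accuracy_np)  # 0.75
```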

In [12]:
init = tf.global_variables_initializer()
sess = tf.Session()
sess.run(init)

In [13]:
batch_size = 100
total_batch = int(len(train_data) / batch_size)
total_batch

600
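
The batch bookkeeping used in the training loop can be sketched as a small generator. `iter_batches` is a hypothetical helper; like `int(len(train_data) / batch_size)`, it drops any final partial batch (there is none for 60000 samples and a batch size of 100).

```python
def iter_batches(num_samples, batch_size):
    """Yield (start, stop) index pairs for consecutive full mini-batches."""
    for start in range(0, num_samples - batch_size + 1, batch_size):
        yield start, start + batch_size

spans = list(iter_batches(60000, 100))
print(len(spans))  # 600
print(spans[0])    # (0, 100)
print(spans[-1])   # (59900, 60000)
```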

In [14]:
# import tqdm 
# for epoch in tqdm.notebook.tqdm(range(15)):

for epoch in range(15):
    
    
    training_results = [] # Collect per-batch results so training accuracy can be printed as well.
    total_cost = 0
    batch_idx = 0
    
    for i in range(total_batch):
        
        batch_x = train_data[ batch_idx : batch_idx + batch_size ]
        batch_y = train_label[ batch_idx : batch_idx + batch_size ]
        
        # 1) Optimizer
        sess.run(optimizer, feed_dict={X: batch_x, 
                                       Y: batch_y, 
                                       bn_sign: True}) # Batch Normalization - Training mode
        
        # 2) Cost
        batch_cost = sess.run(cost, feed_dict={X: batch_x, 
                                               Y: batch_y, 
                                               bn_sign: True}) # Batch Normalization - Training mode
        total_cost = total_cost + batch_cost
        
        
        # 3) Collect per-batch correctness so training accuracy can be printed every epoch. (bn_sign is set to False so this runs in inference mode rather than training mode.)
        batch_results = sess.run([is_correct], feed_dict={X: batch_x, 
                                                           Y: batch_y, 
                                                           bn_sign: False}) 
        training_results = training_results + batch_results
        
        batch_idx += batch_size
    
    
    training_cost = total_cost / total_batch
    
    
    print('Epoch: {}'.format(epoch + 1), 
          '|| Avg. Training cost = {:.3f}'.format(training_cost),
          '|| Training accuracy : {:.3f}'.format(np.mean(training_results)))

print('Learning process is completed!')

Epoch: 1 || Avg. Training cost = 0.166 || Training accuracy : 0.939
Epoch: 2 || Avg. Training cost = 0.057 || Training accuracy : 0.981
Epoch: 3 || Avg. Training cost = 0.029 || Training accuracy : 0.988
Epoch: 4 || Avg. Training cost = 0.015 || Training accuracy : 0.991
Epoch: 5 || Avg. Training cost = 0.009 || Training accuracy : 0.993
Epoch: 6 || Avg. Training cost = 0.007 || Training accuracy : 0.993
Epoch: 7 || Avg. Training cost = 0.006 || Training accuracy : 0.994
Epoch: 8 || Avg. Training cost = 0.005 || Training accuracy : 0.995
Epoch: 9 || Avg. Training cost = 0.004 || Training accuracy : 0.996
Epoch: 10 || Avg. Training cost = 0.003 || Training accuracy : 0.997
Epoch: 11 || Avg. Training cost = 0.003 || Training accuracy : 0.996
Epoch: 12 || Avg. Training cost = 0.003 || Training accuracy : 0.996
Epoch: 13 || Avg. Training cost = 0.002 || Training accuracy : 0.997
Epoch: 14 || Avg. Training cost = 0.002 || Training accuracy : 0.997
Epoch: 15 || Avg. Training cost = 0.002 || 

In [15]:
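# Print the test accuracy.
# bn_sign must be set to False to switch from training mode to [ inference mode ].
# During training the data arrives in batches, so the batch mean and variance can be computed,
# but at test time per-batch statistics are hard to obtain, so the mean/variance accumulated during training (moving_mean / moving_variance) are stored and used instead.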

print('Test accuracy : {}'.format(sess.run(accuracy, 
                                           feed_dict={
                                               X: test_data,
                                               Y: test_label,
                                               bn_sign: False})))

Test accuracy : 0.9783999919891357
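
A minimal NumPy sketch of inference-mode BN, assuming the moving statistics have already been accumulated during training. The function name `batch_norm_inference` and all numbers are made up for illustration. Because the stored statistics are used instead of batch statistics, the output for a sample does not depend on what else is in the batch.

```python
import numpy as np

def batch_norm_inference(x, moving_mean, moving_var, gamma, beta, eps=1e-3):
    """Normalize with stored statistics instead of the statistics of the current batch."""
    return gamma * (x - moving_mean) / np.sqrt(moving_var + eps) + beta

# Pretend these values were accumulated during training.
stored_mean = np.array([5.0, -1.0])
stored_var = np.array([9.0, 4.0])
gamma, beta = np.ones(2), np.zeros(2)

sample = np.array([[8.0, 1.0]])
mixed_batch = np.vstack([sample, [[100.0, -50.0]]])  # same sample plus an outlier

alone = batch_norm_inference(sample, stored_mean, stored_var, gamma, beta)
in_batch = batch_norm_inference(mixed_batch, stored_mean, stored_var, gamma, beta)[:1]

print(alone)
print(np.allclose(alone, in_batch))  # the outlier does not change the sample's output
```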
