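# MNIST NN (Xavier)
> Reference
1. Naver Edwith

The goal of machine learning is to train the parameter values with an optimizer so that the cost function decreases. However, how training proceeds depends on the initial values of the parameters, so choosing good initial values is very important.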
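
To illustrate why the starting scale of the weights matters, here is a minimal pure-Python sketch (no TensorFlow) that pushes one random input through a few ReLU layers and measures how large the final activations are. The helper `relu_forward_rms`, the layer width of 64, the depth of 5, and the "tiny" standard deviation of 0.01 are all assumptions chosen for illustration; they are not taken from the model in this notebook.

```python
import math
import random


def relu_forward_rms(weight_stddev, width=64, depth=5, seed=0):
    """Push one random input through `depth` ReLU layers of size `width`
    and return the root-mean-square of the final activations."""
    rng = random.Random(seed)
    activations = [rng.gauss(0.0, 1.0) for _ in range(width)]
    for _ in range(depth):
        next_activations = []
        for _ in range(width):
            # One unit: weighted sum over the previous layer, followed by ReLU.
            pre_activation = sum(
                rng.gauss(0.0, weight_stddev) * a for a in activations
            )
            next_activations.append(max(0.0, pre_activation))
        activations = next_activations
    return math.sqrt(sum(a * a for a in activations) / width)


width = 64  # hypothetical layer width, for illustration only
tiny_rms = relu_forward_rms(weight_stddev=0.01, width=width)
# Xavier/Glorot-style scale: sqrt(2 / (fan_in + fan_out))
xavier_rms = relu_forward_rms(
    weight_stddev=math.sqrt(2.0 / (width + width)), width=width
)
print("tiny init   RMS:", tiny_rms)
print("xavier init RMS:", xavier_rms)
```

With the tiny initialization the signal shrinks by orders of magnitude before it reaches the last layer, which in turn makes the gradients small. The Xavier-style scale keeps the activations at a usable size.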

In [1]:
from __future__ import absolute_import, division, print_function, unicode_literals
import tensorflow as tf
import matplotlib.pyplot as plt
import numpy as np
print(tf.__version__)

2.1.0


## 1. Loading the data & simple preprocessing

In [2]:
# Load the data & preprocess it
def load_mnist():
    (train_x, train_y), (test_x, test_y) = tf.keras.datasets.mnist.load_data()
    # as_dtype & normalize
    train_x = tf.constant(train_x, dtype=tf.float32) / 255.0
    test_x = tf.constant(test_x, dtype=tf.float32) / 255.0

    # expand_dims ( image )
    train_x = tf.expand_dims(train_x, axis=-1)  # [N, 28, 28] -> [N, 28, 28, 1]
    test_x = tf.expand_dims(test_x, axis=-1)  # [N, 28, 28] -> [N, 28, 28, 1]

    # label one-hot encoding
    train_y = tf.one_hot(train_y, depth=10)
    test_y = tf.one_hot(test_y, depth=10)

    return train_x, train_y, test_x, test_y


train_X, train_Y, test_X, test_Y = load_mnist()
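
The two transformations in `load_mnist` that change values (normalization and one-hot encoding) can be sketched on single values in plain Python. The helpers `normalize_pixel` and `one_hot` below are hypothetical stand-ins written for illustration; the notebook itself uses tensor division and `tf.one_hot`.

```python
def normalize_pixel(value):
    """Scale an 8-bit pixel value (0..255) into the range [0.0, 1.0]."""
    return value / 255.0


def one_hot(label, depth=10):
    """Return a list of length `depth` with 1.0 at index `label`, 0.0 elsewhere."""
    return [1.0 if i == label else 0.0 for i in range(depth)]


print(normalize_pixel(0), normalize_pixel(255))  # darkest and brightest pixel
print(one_hot(3))  # label 3 becomes a 10-element vector
```

One-hot labels are what the `categorical_crossentropy` loss used later expects, which is why the integer labels are converted here.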

## 2. Hyperparameter settings

In [3]:
# Hyperparameter settings
learning_rate = 0.001
batch_size = 100
training_epochs = 5
num_classes = 10
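
As a rough sketch of what these settings imply, the number of weight updates can be worked out by hand. The training-set size of 60000 is taken from the training log further down; the variable names `num_train_samples`, `steps_per_epoch`, and `total_updates` are introduced here for illustration only.

```python
import math

num_train_samples = 60000  # MNIST training set size, as reported in the training log
batch_size = 100
training_epochs = 5

# Each epoch visits every sample once, one batch per optimizer step.
steps_per_epoch = math.ceil(num_train_samples / batch_size)
total_updates = steps_per_epoch * training_epochs
print(steps_per_epoch, total_updates)  # 600 3000
```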

## 3. Building and configuring the model
> Reference
1. https://www.edwith.org/boostcourse-dl-tensorflow/lecture/43736/
2. https://reniew.github.io/13/

* The initial values are set by passing kernel_initializer='glorot_normal' to the Dense layer.
    * glorot_normal = Xavier (normal) initialization
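
For reference, Keras documents `glorot_normal` as drawing weights from a truncated normal distribution centered on 0 with standard deviation `sqrt(2 / (fan_in + fan_out))`. The sketch below only evaluates that formula for the three Dense layers of this model; the helper name `glorot_normal_stddev` is an assumption made for illustration and is not a Keras function.

```python
import math


def glorot_normal_stddev(fan_in, fan_out):
    """Standard deviation used by Glorot/Xavier normal initialization."""
    return math.sqrt(2.0 / (fan_in + fan_out))


# (fan_in, fan_out) of the three Dense layers in this notebook
layer_shapes = [(784, 256), (256, 256), (256, 10)]
for fan_in, fan_out in layer_shapes:
    stddev = glorot_normal_stddev(fan_in, fan_out)
    print("{:>4} -> {:>4}: stddev = {:.4f}".format(fan_in, fan_out, stddev))
```

Wider layers get a smaller standard deviation, so the sum over many inputs does not blow up the scale of the activations.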

In [4]:
# Build and configure the model
model = tf.keras.Sequential()
model.add(tf.keras.layers.Flatten())
# Each Dense layer infers its input size from the previous layer, so input_shape is not needed here.
model.add(tf.keras.layers.Dense(units=256, kernel_initializer='glorot_normal', activation='relu'))
model.add(tf.keras.layers.Dense(units=256, kernel_initializer='glorot_normal', activation='relu'))
model.add(tf.keras.layers.Dense(units=10, kernel_initializer='glorot_normal', activation='softmax'))

# Prepare for training
model.compile(loss='categorical_crossentropy', optimizer=tf.keras.optimizers.Adam(learning_rate=learning_rate), metrics=['accuracy'])
model.build(input_shape=train_X.shape)
model.summary()

Model: "sequential"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
flatten (Flatten)            multiple                  0         
_________________________________________________________________
dense (Dense)                multiple                  200960    
_________________________________________________________________
dense_1 (Dense)              multiple                  65792     
_________________________________________________________________
dense_2 (Dense)              multiple                  2570      
Total params: 269,322
Trainable params: 269,322
Non-trainable params: 0
_________________________________________________________________
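
The parameter counts in the summary can be checked by hand: a Dense layer has one weight per input-unit pair plus one bias per unit. The helper `dense_param_count` below is a hypothetical name used only for this check.

```python
def dense_param_count(inputs, units):
    """Weights (inputs * units) plus one bias per unit."""
    return inputs * units + units


# (inputs, units) for dense, dense_1, dense_2; Flatten turns 28x28x1 into 784 values
layer_sizes = [(784, 256), (256, 256), (256, 10)]
counts = [dense_param_count(inputs, units) for inputs, units in layer_sizes]
print(counts, sum(counts))  # [200960, 65792, 2570] 269322
```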


## 4. Training and evaluation

In [8]:
# Start training and keep the training log
history = model.fit(train_X, train_Y, batch_size=batch_size, epochs=training_epochs)

# Evaluate on the test data.
cost, accuracy = model.evaluate(test_X, test_Y, batch_size=batch_size, verbose=0)
print("cost : {}, accuracy : {}".format(cost, accuracy))

Train on 60000 samples
Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5
cost : 0.09698255738759358, accuracy : 0.9805999994277954


### Result
Simply choosing good initial values can improve accuracy.
Accuracy: about 98% (0.9806 on the test data after 5 epochs)
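
To put the accuracy into concrete terms, the following sketch converts it into a count of misclassified images. It assumes the standard MNIST test split of 10000 images; the variable names are introduced for illustration.

```python
test_accuracy = 0.9805999994277954  # value printed by model.evaluate above
num_test_images = 10000  # assumption: standard MNIST test split

num_correct = round(test_accuracy * num_test_images)
num_wrong = num_test_images - num_correct
print("correct:", num_correct, "misclassified:", num_wrong)
```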