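Dropout randomly deactivates a fraction of a layer's units, given by the dropout rate, at each training step, so only part of the layer's weights takes part in any single update. This helps prevent overfitting. The accuracies below were measured for different dropout rates, with and without batch normalization added.<br><br>
dropout = 0.2: accuracy = 87.54%, accuracy (+ batch normalization) = 88.84% <br>
dropout = 0.5: accuracy = 87.00%, accuracy (+ batch normalization) = 88.29% <br>
dropout = 0.8: accuracy = 84.79%, accuracy (+ batch normalization) = 85.20% <br><br>
Accuracy decreases as the dropout rate increases. The larger the dropout rate, the fewer units are trained at each step, so accuracy drops; at a rate of 0.8 only about 20% of each hidden layer's units are active in any given step.<br>
Batch normalization normalizes the activations using the mean and variance of each batch, which also has a regularizing effect against overfitting. Dropout can be viewed as training an ensemble of thinned sub-networks, while batch normalization keeps the distribution of each layer's inputs well-conditioned so that the features are expressed clearly; both act as regularizers, although through different mechanisms. Dropout excludes units at the dropout rate during training (in Keras this is done by masking, so it does not actually reduce the amount of computation), whereas batch normalization removes no units: it normalizes the intermediate results and prevents the imbalance in weight updates caused by differences in the distribution of each component. This is the likely reason why the models with batch normalization added reach higher accuracy than the models with dropout alone.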
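
To make the mechanism concrete, here is a minimal sketch of inverted dropout, the variant that the Keras `Dropout` layer uses: during training each activation is zeroed with probability `rate` and the survivors are scaled by `1 / (1 - rate)`, while at inference the layer passes its input through unchanged. The function name `dropout_forward` and the toy input of ones are made up for illustration, and only the standard library is used so the sketch runs without TensorFlow.

```python
import random


def dropout_forward(activations, rate, training, rng):
    """Inverted dropout on a list of floats (illustrative helper, not the Keras API)."""
    if not training or rate == 0.0:
        return list(activations)           # inference: identity
    keep_prob = 1.0 - rate
    out = []
    for a in activations:
        if rng.random() < keep_prob:
            out.append(a / keep_prob)      # surviving unit, rescaled
        else:
            out.append(0.0)                # dropped unit
    return out


rng = random.Random(123)
activations = [1.0] * 10000                # toy layer output, assumed for the demo

for rate in (0.2, 0.5, 0.8):
    dropped = dropout_forward(activations, rate, training=True, rng=rng)
    kept_fraction = sum(1 for v in dropped if v != 0.0) / len(dropped)
    mean_activation = sum(dropped) / len(dropped)
    print(f"rate={rate}: kept {kept_fraction:.2f} of units, "
          f"mean activation {mean_activation:.2f}")

# At inference time nothing is dropped, whatever the rate.
inference = dropout_forward(activations, 0.8, training=False, rng=rng)
```

The kept fraction shrinks with the rate while the mean activation stays close to 1, which is the point of the rescaling: the next layer sees inputs of the same expected magnitude in training and at inference.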

In [9]:
import os, random
import keras
from keras.datasets import fashion_mnist
from keras.models import Sequential
from keras.layers import Dense, BatchNormalization, Dropout
import numpy as np
import tensorflow as tf
# Make '~' expand to the current directory, so Keras caches the downloaded
# dataset locally instead of in the home directory.
os.path.expanduser = lambda path: './'

In [10]:
batch_size = 128
num_classes = 10
epochs = 60

# the data, split between train and test sets
(x_train, y_train), (x_test, y_test) = fashion_mnist.load_data()

x_train = x_train.reshape(60000, 784)
x_test = x_test.reshape(10000, 784)
x_train = x_train.astype('float32')
x_test = x_test.astype('float32')
x_train /= 255
x_test /= 255
print(x_train.shape[0], 'train samples')
print(x_test.shape[0], 'test samples')

# convert class vectors to binary class matrices
y_train = keras.utils.to_categorical(y_train, num_classes)
y_test = keras.utils.to_categorical(y_test, num_classes)

60000 train samples
10000 test samples


# Define Model 0.2

In [23]:
# for reproducibility: fix all seeds and run single-threaded on one CPU
import random, os
os.environ['PYTHONHASHSEED'] = '0'  # only affects interpreters started after this point
random.seed(123)
np.random.seed(123)
tf.random.set_seed(123)
session_config = tf.compat.v1.ConfigProto(
    intra_op_parallelism_threads=1,
    inter_op_parallelism_threads=1,
    allow_soft_placement=True,
    device_count={'CPU': 1})
sess = tf.compat.v1.Session(config=session_config)
from tensorflow.python.keras import backend as K
K.set_session(sess)



kernel_initializer='glorot_uniform'
activation_function = 'relu'

with tf.device('/cpu:0'):
    model = Sequential()
    model.add(Dense(512, activation='relu', input_shape=(784,)))
    model.add(Dropout(0.2))
    #model.add(BatchNormalization())
    model.add(Dense(512, activation='relu'))
    model.add(Dropout(0.2))
    #model.add(BatchNormalization())
    model.add(Dense(num_classes, activation='softmax'))

    model.summary()

    model.compile(loss='categorical_crossentropy',
                  optimizer='sgd',
                  metrics=['accuracy'])

Model: "sequential_6"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 dense_18 (Dense)            (None, 512)               401920    
                                                                 
 dropout_12 (Dropout)        (None, 512)               0         
                                                                 
 dense_19 (Dense)            (None, 512)               262656    
                                                                 
 dropout_13 (Dropout)        (None, 512)               0         
                                                                 
 dense_20 (Dense)            (None, 10)                5130      
                                                                 
Total params: 669,706
Trainable params: 669,706
Non-trainable params: 0
_________________________________________________________________
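
The `Param #` column of the summary can be checked by hand: a `Dense` layer with `n_in` inputs and `n_out` units holds `n_in * n_out` weights plus `n_out` biases, and `Dropout` has no parameters of its own. The helper name `dense_params` exists only for this sketch.

```python
def dense_params(n_in, n_out):
    """Weights plus biases of a fully connected layer."""
    return n_in * n_out + n_out


# 784 inputs -> 512 -> 512 -> 10 classes, as in the model defined in this notebook
layer_params = [dense_params(784, 512), dense_params(512, 512), dense_params(512, 10)]
total_params = sum(layer_params)
print(layer_params, total_params)
```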


# Start Training

In [24]:
history = model.fit(x_train, y_train,
                    batch_size=batch_size,
                    epochs=epochs,
                    validation_split=0.2)

Epoch 1/60
...
Epoch 60/60


# Calculate accuracy

In [25]:
metrics = model.evaluate(x_test, y_test) #returns loss and accuracy
print(metrics[1])
print(f'Accuracy: {metrics[1]*100:.2f}%\n')

0.8754000067710876
Accuracy: 87.54%



--
# Define Model 0.2 with Batch_Normalization

In [26]:
# for reproducibility: fix all seeds and run single-threaded on one CPU
import random, os
os.environ['PYTHONHASHSEED'] = '0'  # only affects interpreters started after this point
random.seed(123)
np.random.seed(123)
tf.random.set_seed(123)
session_config = tf.compat.v1.ConfigProto(
    intra_op_parallelism_threads=1,
    inter_op_parallelism_threads=1,
    allow_soft_placement=True,
    device_count={'CPU': 1})
sess = tf.compat.v1.Session(config=session_config)
from tensorflow.python.keras import backend as K
K.set_session(sess)



kernel_initializer='glorot_uniform'
activation_function = 'relu'

with tf.device('/cpu:0'):
    model = Sequential()
    model.add(Dense(512, activation='relu', input_shape=(784,)))
    model.add(Dropout(0.2))
    model.add(BatchNormalization())
    model.add(Dense(512, activation='relu'))
    model.add(Dropout(0.2))
    model.add(BatchNormalization())
    model.add(Dense(num_classes, activation='softmax'))

    model.summary()

    model.compile(loss='categorical_crossentropy',
                  optimizer='sgd',
                  metrics=['accuracy'])

Model: "sequential_7"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 dense_21 (Dense)            (None, 512)               401920    
                                                                 
 dropout_14 (Dropout)        (None, 512)               0         
                                                                 
 batch_normalization_8 (Batc  (None, 512)              2048      
 hNormalization)                                                 
                                                                 
 dense_22 (Dense)            (None, 512)               262656    
                                                                 
 dropout_15 (Dropout)        (None, 512)               0         
                                                                 
 batch_normalization_9 (Batc  (None, 512)              2048      
 hNormalization)                                      
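
Each `BatchNormalization` row in the summary reports 2048 parameters because the layer keeps four vectors of length 512: a learned scale `gamma`, a learned shift `beta`, and running estimates of the mean and variance that are used at inference. The training-time computation itself can be sketched as follows; the function name `batch_norm_train` and the toy batch are assumptions for illustration, `eps` is set to the Keras default of `1e-3`, and the running statistics are left out to keep the sketch short.

```python
import math


def batch_norm_train(batch, gamma, beta, eps=1e-3):
    """Normalize each feature (column) over the batch, then scale and shift."""
    n = len(batch)
    n_features = len(batch[0])
    out = [[0.0] * n_features for _ in range(n)]
    for j in range(n_features):
        column = [row[j] for row in batch]
        mean = sum(column) / n
        var = sum((x - mean) ** 2 for x in column) / n   # biased batch variance
        for i in range(n):
            x_hat = (batch[i][j] - mean) / math.sqrt(var + eps)
            out[i][j] = gamma[j] * x_hat + beta[j]
    return out


# Two features on very different scales (toy data, assumed for the demo)
batch = [[1.0, 200.0], [2.0, 400.0], [3.0, 600.0], [4.0, 800.0]]
normalized = batch_norm_train(batch, gamma=[1.0, 1.0], beta=[0.0, 0.0])

col_means = [sum(row[j] for row in normalized) / len(normalized) for j in range(2)]
col_vars = [sum(row[j] ** 2 for row in normalized) / len(normalized) for j in range(2)]
print(col_means, col_vars)

bn_params_per_layer = 4 * 512   # gamma, beta, moving mean, moving variance
```

After normalization both columns have a mean near 0 and a variance near 1, regardless of their original scale, which is what keeps the following layer's inputs comparable across features.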

# Start Training

In [27]:
history = model.fit(x_train, y_train,
                    batch_size=batch_size,
                    epochs=epochs,
                    validation_split=0.2)

Epoch 1/60
...
Epoch 60/60


# Calculate accuracy 

In [28]:
metrics = model.evaluate(x_test, y_test) #returns loss and accuracy
print(metrics[1])
print(f'Accuracy: {metrics[1]*100:.2f}%\n')

0.8884000182151794
Accuracy: 88.84%



--
# Define Model 0.5

In [29]:
# for reproducibility: fix all seeds and run single-threaded on one CPU
import random, os
os.environ['PYTHONHASHSEED'] = '0'  # only affects interpreters started after this point
random.seed(123)
np.random.seed(123)
tf.random.set_seed(123)
session_config = tf.compat.v1.ConfigProto(
    intra_op_parallelism_threads=1,
    inter_op_parallelism_threads=1,
    allow_soft_placement=True,
    device_count={'CPU': 1})
sess = tf.compat.v1.Session(config=session_config)
from tensorflow.python.keras import backend as K
K.set_session(sess)



kernel_initializer='glorot_uniform'
activation_function = 'relu'

with tf.device('/cpu:0'):
    model = Sequential()
    model.add(Dense(512, activation='relu', input_shape=(784,)))
    model.add(Dropout(0.5))
    #model.add(BatchNormalization())
    model.add(Dense(512, activation='relu'))
    model.add(Dropout(0.5))
    #model.add(BatchNormalization())
    model.add(Dense(num_classes, activation='softmax'))

    model.summary()

    model.compile(loss='categorical_crossentropy',
                  optimizer='sgd',
                  metrics=['accuracy'])

Model: "sequential_8"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 dense_24 (Dense)            (None, 512)               401920    
                                                                 
 dropout_16 (Dropout)        (None, 512)               0         
                                                                 
 dense_25 (Dense)            (None, 512)               262656    
                                                                 
 dropout_17 (Dropout)        (None, 512)               0         
                                                                 
 dense_26 (Dense)            (None, 10)                5130      
                                                                 
Total params: 669,706
Trainable params: 669,706
Non-trainable params: 0
_________________________________________________________________


# Start Training

In [30]:
history = model.fit(x_train, y_train,
                    batch_size=batch_size,
                    epochs=epochs,
                    validation_split=0.2)

Epoch 1/60
...
Epoch 60/60


# Calculate accuracy 

In [31]:
metrics = model.evaluate(x_test, y_test) #returns loss and accuracy
print(metrics[1])
print(f'Accuracy: {metrics[1]*100:.2f}%\n')

0.8700000047683716
Accuracy: 87.00%



--
# Define Model 0.5 with Batch_Normalization

In [32]:
# for reproducibility: fix all seeds and run single-threaded on one CPU
import random, os
os.environ['PYTHONHASHSEED'] = '0'  # only affects interpreters started after this point
random.seed(123)
np.random.seed(123)
tf.random.set_seed(123)
session_config = tf.compat.v1.ConfigProto(
    intra_op_parallelism_threads=1,
    inter_op_parallelism_threads=1,
    allow_soft_placement=True,
    device_count={'CPU': 1})
sess = tf.compat.v1.Session(config=session_config)
from tensorflow.python.keras import backend as K
K.set_session(sess)



kernel_initializer='glorot_uniform'
activation_function = 'relu'

with tf.device('/cpu:0'):
    model = Sequential()
    model.add(Dense(512, activation='relu', input_shape=(784,)))
    model.add(Dropout(0.5))
    model.add(BatchNormalization())
    model.add(Dense(512, activation='relu'))
    model.add(Dropout(0.5))
    model.add(BatchNormalization())
    model.add(Dense(num_classes, activation='softmax'))

    model.summary()

    model.compile(loss='categorical_crossentropy',
                  optimizer='sgd',
                  metrics=['accuracy'])

Model: "sequential_9"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 dense_27 (Dense)            (None, 512)               401920    
                                                                 
 dropout_18 (Dropout)        (None, 512)               0         
                                                                 
 batch_normalization_10 (Bat  (None, 512)              2048      
 chNormalization)                                                
                                                                 
 dense_28 (Dense)            (None, 512)               262656    
                                                                 
 dropout_19 (Dropout)        (None, 512)               0         
                                                                 
 batch_normalization_11 (Bat  (None, 512)              2048      
 chNormalization)                                     

# Start Training

In [33]:
history = model.fit(x_train, y_train,
                    batch_size=batch_size,
                    epochs=epochs,
                    validation_split=0.2)

Epoch 1/60
...
Epoch 60/60


# Calculate accuracy

In [34]:
metrics = model.evaluate(x_test, y_test) #returns loss and accuracy
print(metrics[1])
print(f'Accuracy: {metrics[1]*100:.2f}%\n')

0.8828999996185303
Accuracy: 88.29%



--
# Define Model 0.8

In [35]:
# for reproducibility: fix all seeds and run single-threaded on one CPU
import random, os
os.environ['PYTHONHASHSEED'] = '0'  # only affects interpreters started after this point
random.seed(123)
np.random.seed(123)
tf.random.set_seed(123)
session_config = tf.compat.v1.ConfigProto(
    intra_op_parallelism_threads=1,
    inter_op_parallelism_threads=1,
    allow_soft_placement=True,
    device_count={'CPU': 1})
sess = tf.compat.v1.Session(config=session_config)
from tensorflow.python.keras import backend as K
K.set_session(sess)



kernel_initializer='glorot_uniform'
activation_function = 'relu'

with tf.device('/cpu:0'):
    model = Sequential()
    model.add(Dense(512, activation='relu', input_shape=(784,)))
    model.add(Dropout(0.8))
    #model.add(BatchNormalization())
    model.add(Dense(512, activation='relu'))
    model.add(Dropout(0.8))
    #model.add(BatchNormalization())
    model.add(Dense(num_classes, activation='softmax'))

    model.summary()

    model.compile(loss='categorical_crossentropy',
                  optimizer='sgd',
                  metrics=['accuracy'])

Model: "sequential_10"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 dense_30 (Dense)            (None, 512)               401920    
                                                                 
 dropout_20 (Dropout)        (None, 512)               0         
                                                                 
 dense_31 (Dense)            (None, 512)               262656    
                                                                 
 dropout_21 (Dropout)        (None, 512)               0         
                                                                 
 dense_32 (Dense)            (None, 10)                5130      
                                                                 
Total params: 669,706
Trainable params: 669,706
Non-trainable params: 0
_________________________________________________________________


# Start Training

In [36]:
history = model.fit(x_train, y_train,
                    batch_size=batch_size,
                    epochs=epochs,
                    validation_split=0.2)

Epoch 1/60
...
Epoch 60/60


# Calculate accuracy

In [37]:
metrics = model.evaluate(x_test, y_test) #returns loss and accuracy
print(metrics[1])
print(f'Accuracy: {metrics[1]*100:.2f}%\n')

0.8478999733924866
Accuracy: 84.79%



--
# Define Model 0.8 with Batch_Normalization

In [38]:
# for reproducibility: fix all seeds and run single-threaded on one CPU
import random, os
os.environ['PYTHONHASHSEED'] = '0'  # only affects interpreters started after this point
random.seed(123)
np.random.seed(123)
tf.random.set_seed(123)
session_config = tf.compat.v1.ConfigProto(
    intra_op_parallelism_threads=1,
    inter_op_parallelism_threads=1,
    allow_soft_placement=True,
    device_count={'CPU': 1})
sess = tf.compat.v1.Session(config=session_config)
from tensorflow.python.keras import backend as K
K.set_session(sess)



kernel_initializer='glorot_uniform'
activation_function = 'relu'

with tf.device('/cpu:0'):
    model = Sequential()
    model.add(Dense(512, activation='relu', input_shape=(784,)))
    model.add(Dropout(0.8))
    model.add(BatchNormalization())
    model.add(Dense(512, activation='relu'))
    model.add(Dropout(0.8))
    model.add(BatchNormalization())
    model.add(Dense(num_classes, activation='softmax'))

    model.summary()

    model.compile(loss='categorical_crossentropy',
                  optimizer='sgd',
                  metrics=['accuracy'])

Model: "sequential_11"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 dense_33 (Dense)            (None, 512)               401920    
                                                                 
 dropout_22 (Dropout)        (None, 512)               0         
                                                                 
 batch_normalization_12 (Bat  (None, 512)              2048      
 chNormalization)                                                
                                                                 
 dense_34 (Dense)            (None, 512)               262656    
                                                                 
 dropout_23 (Dropout)        (None, 512)               0         
                                                                 
 batch_normalization_13 (Bat  (None, 512)              2048      
 chNormalization)                                    

# Start Training

In [39]:
history = model.fit(x_train, y_train,
                    batch_size=batch_size,
                    epochs=epochs,
                    validation_split=0.2)

Epoch 1/60
...
Epoch 60/60


# Calculate accuracy

In [40]:
metrics = model.evaluate(x_test, y_test) #returns loss and accuracy
print(metrics[1])
print(f'Accuracy: {metrics[1]*100:.2f}%\n')

0.8519999980926514
Accuracy: 85.20%
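
With all six runs finished, the test accuracies can be collected in one place to read off the effect of batch normalization at each dropout rate. The numbers are copied from the evaluation outputs in this notebook; the dictionary layout and the names `results` and `gains` are just one convenient choice for this sketch.

```python
# Test accuracies (%) copied from the six runs in this notebook.
results = {
    0.2: {"dropout_only": 87.54, "with_batch_norm": 88.84},
    0.5: {"dropout_only": 87.00, "with_batch_norm": 88.29},
    0.8: {"dropout_only": 84.79, "with_batch_norm": 85.20},
}

gains = {}
for rate, acc in results.items():
    # Difference in percentage points, rounded to the precision of the inputs
    gains[rate] = round(acc["with_batch_norm"] - acc["dropout_only"], 2)
    print(f"dropout={rate}: {acc['dropout_only']:.2f}% -> "
          f"{acc['with_batch_norm']:.2f}% (+{gains[rate]:.2f} points)")
```

Since each configuration was trained once with a fixed seed, these differences describe this particular run rather than a statistically tested effect.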

