## 0. Practice Data

* Handwritten Digit Recognition
  * 28 * 28 images, 784 variables in total
  * 10 classes, the digits 0~9
  
  * training data : 60,000
  * testing data : 10,000
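
As a rough illustration of where the 784 variables come from, the sketch below flattens a 28 * 28 image into one row and scales the pixel intensities to the range 0 to 1. The image is synthetic (all zeros with a single bright pixel), an assumption made only for this example.

```python
# A synthetic 28 x 28 grayscale "image" (hypothetical data, not a real MNIST digit).
HEIGHT, WIDTH = 28, 28
image = [[0 for _ in range(WIDTH)] for _ in range(HEIGHT)]
image[14][14] = 255  # one bright pixel in the middle

# Flatten row by row: 28 * 28 = 784 input variables.
flat = [pixel for row in image for pixel in row]

# Scale 0..255 intensities to 0.0..1.0.
scaled = [pixel / 255 for pixel in flat]

print(len(scaled))               # 784
print(min(scaled), max(scaled))  # 0.0 1.0
```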
  
## 1. Building the Architecture

![](./img/01_example_01.png)

* Dense(500) : 500 outputs
* Several activation functions are available; sigmoid is used here

* 500 values go in and 500 come out: each value passes through the sigmoid before being sent on as output

* The final layer is Dense(10), with 10 outputs (0, 1, 2, ..., 9)
* Because the final layer performs classification, 'softmax' is used
  * for binary classification, 'sigmoid' is used
  * for two or more classes, 'softmax' is generally used
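
To make the difference between the two output activations concrete, here is a minimal pure-Python sketch of sigmoid and softmax. The helper functions and the `logits` values are hypothetical and written for illustration; they are not the Keras implementations.

```python
import math

def sigmoid(z):
    """Squash one score into (0, 1); usable as P(class = 1) in a binary problem."""
    return 1.0 / (1.0 + math.exp(-z))

def softmax(scores):
    """Turn a list of scores into probabilities that sum to 1."""
    shift = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical raw outputs of a Dense(10) layer for one image.
logits = [1.0, 0.5, 0.1, 3.0, 0.2, 0.0, -1.0, 0.3, 0.7, 0.4]
probs = softmax(logits)

print(round(sum(probs), 6))     # 1.0
print(probs.index(max(probs)))  # 3  (the predicted digit)
print(sigmoid(0.0))             # 0.5
```

Softmax makes the 10 outputs compete with each other, which is why it suits a single-label problem like digit recognition.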
  
## 2. Setting the Loss Function
> model.compile(loss = 'categorical_crossentropy', <br/>
>              optimizer = 'adam', <br/>
>              metrics = ['accuracy'])


![](./img/01_example_02.png)

* The loss and the optimizer determine the direction in which training proceeds
* Available loss functions
  * mean_squared_error
  * mean_absolute_error
  * mean_absolute_percentage_error
  ...
  * categorical_crossentropy
  * binary_crossentropy
  
  * Regression problems
    * MSE is widely used
    * MAPE (e.g., MAPE = 10 means a 10% error)
  
  * Classification problems : crossentropy


* Available optimizers
  * SGD, RMSprop, Adagrad, Adadelta, Adam, Adamax, Nadam
  * Adam is the most widely used so far
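
The regression and classification losses listed above can be computed by hand. The following is a sketch with hypothetical helper functions and made-up numbers, intended only to show what each loss measures; Keras computes the same quantities over batches of tensors.

```python
import math

def mean_squared_error(y_true, y_pred):
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

def mean_absolute_percentage_error(y_true, y_pred):
    # Reported in percent; this sketch assumes no true value is zero.
    return 100.0 * sum(abs((t - p) / t) for t, p in zip(y_true, y_pred)) / len(y_true)

def categorical_crossentropy(one_hot, probs, eps=1e-12):
    # Only the probability assigned to the true class contributes to the sum.
    return -sum(t * math.log(max(p, eps)) for t, p in zip(one_hot, probs))

# Regression: every prediction is off by 10% of the true value.
y_true = [100.0, 200.0, 50.0]
y_pred = [110.0, 180.0, 55.0]
mse = mean_squared_error(y_true, y_pred)
mape = mean_absolute_percentage_error(y_true, y_pred)
print(mse)             # 175.0
print(round(mape, 6))  # 10.0

# Classification: the true class is index 2 and the model gives it probability 0.7.
cce = categorical_crossentropy([0, 0, 1], [0.1, 0.2, 0.7])
print(round(cce, 4))   # 0.3567
```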
  

## 3. Modeling

> model.fit(x_train, y_train, batch_size = 100, epochs = 20)

![](./img/01_example_03.png)

![](./img/01_example_04.png)

* Because this is classification, y is converted to one-hot vectors
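
A minimal sketch of the one-hot conversion in plain Python follows. The function name `to_one_hot` is hypothetical; in Keras the same job is done by `to_categorical` from `keras.utils`.

```python
def to_one_hot(labels, num_classes):
    """Return one row per label with 1.0 at the label's index and 0.0 elsewhere."""
    rows = []
    for label in labels:
        row = [0.0] * num_classes
        row[label] = 1.0
        rows.append(row)
    return rows

encoded = to_one_hot([1, 3, 2, 0], num_classes=4)
for row in encoded:
    print(row)
# [0.0, 1.0, 0.0, 0.0]
# [0.0, 0.0, 0.0, 1.0]
# [0.0, 0.0, 1.0, 0.0]
# [1.0, 0.0, 0.0, 0.0]
```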
  

## 4. Saving and Loading the Model

```{python}
## import module
from keras.models import load_model

## save model
model.save('path/filename.h5') 

## load model
model = load_model('path/filename.h5')
```

## 5. Testing

```
## case1
score = model.evaluate(x_test, y_test)
print('Total loss on Testing set : ', score[0])
print('Accuracy of Testing set : ', score[1])

## case2
result = model.predict(x_test)
```


## 6. Setting Batch Size and Epochs

* As the data grows, reduce the batch size or otherwise adjust it to fit the data being used

![](./img/01_example_05.png)
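
The relationship between batch size, epochs, and the number of weight updates is simple arithmetic. The sketch below uses the 60,000 training samples and 20 epochs from these notes; the helper `updates_per_epoch` is a hypothetical name, and it assumes the last, smaller batch is still used.

```python
import math

def updates_per_epoch(num_samples, batch_size):
    """Number of weight updates in one pass over the data (last batch may be smaller)."""
    return math.ceil(num_samples / batch_size)

num_samples = 60000  # size of the training set
epochs = 20

for batch_size in (100, 128):
    steps = updates_per_epoch(num_samples, batch_size)
    print(batch_size, steps, steps * epochs)
# 100 600 12000
# 128 469 9380
```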


## DNN ver1 Practice

- When run in Spyder, variables can be inspected conveniently in the Variable Explorer
<br/>
![](./img/01_spyder_01.png)

In [4]:
# Load modules
import tensorflow as tf
import numpy as np
from keras.datasets import mnist
from keras.models import Model
from keras.layers import Input, Dense, Activation
from keras.optimizers import Adam
from keras.models import load_model
from keras.utils import to_categorical
#from keras.regularizers import # setup for regularizers

## parameter
batch_size = 128
num_classes = 10
epoch = 20

## Load mnist data
(x_train, y_train), (x_test, y_test) = mnist.load_data()
## cast uint8 -> float32 because a division follows

## preprocessing (scale)
x_train = x_train.reshape(60000, 784)
x_test = x_test.reshape(10000, 784)
x_train = x_train.astype('float32')
x_test = x_test.astype('float32')
x_train /= 255
x_test /= 255

print(np.min(x_train), ' ~ ', np.max(x_train))
print(x_train.shape[0], 'train samples')
print(x_test.shape[0], 'test samples')


print('x_train shape:', x_train.shape)
print('x_test shape:', x_test.shape)
print('y_train shape:', y_train.shape)
print('y_test shape:', y_test.shape)


0.0  ~  1.0
60000 train samples
10000 test samples
x_train shape: (60000, 784)
x_test shape: (10000, 784)
y_train shape: (60000,)
y_test shape: (10000,)


In [9]:
## convert class vectors
# Convert the categorical label values to one-hot form
# For example, converting [1, 3, 2, 0] to
# [[ 0., 1., 0., 0.], 
# [ 0., 0., 0., 1.], 
# [ 0., 0., 1., 0.], 
# [ 1., 0., 0., 0.]] 
# is called one-hot encoding
y_train_cat = to_categorical(y_train, num_classes)
y_test_cat = to_categorical(y_test, num_classes)


#### deep neural network modeling 
info_input = Input(shape=(28*28,))

layer1 = Dense(512)(info_input)
layer1_act = Activation('relu')(layer1)

layer2 = Dense(512)(layer1_act)
layer2_act = Activation('relu')(layer2)

layer3 = Dense(10)(layer2_act)
layer3_act = Activation('softmax')(layer3)

model = Model(inputs=[info_input], 
              outputs = [layer3_act])

### model structure
model.summary()


_________________________________________________________________
Layer (type)                 Output Shape              Param #   
input_2 (InputLayer)         (None, 784)               0         
_________________________________________________________________
dense_4 (Dense)              (None, 512)               401920    
_________________________________________________________________
activation_4 (Activation)    (None, 512)               0         
_________________________________________________________________
dense_5 (Dense)              (None, 512)               262656    
_________________________________________________________________
activation_5 (Activation)    (None, 512)               0         
_________________________________________________________________
dense_6 (Dense)              (None, 10)                5130      
_________________________________________________________________
activation_6 (Activation)    (None, 10)                0         
Total para
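
The `Param #` column in the summary can be checked by hand: a Dense layer has one weight per input-output pair plus one bias per output unit. The sketch below redoes that arithmetic for the three Dense layers; `dense_params` is a hypothetical helper, not a Keras function.

```python
def dense_params(n_inputs, n_units):
    """Weights (n_inputs * n_units) plus one bias per unit."""
    return n_inputs * n_units + n_units

layer_sizes = [784, 512, 512, 10]  # input, two hidden layers, output
counts = [dense_params(a, b) for a, b in zip(layer_sizes, layer_sizes[1:])]

print(counts)       # [401920, 262656, 5130]
print(sum(counts))  # 669706
```

Activation layers add no parameters, which matches the zeros in the summary.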

In [10]:
### model structure (continued)

model.compile(loss = 'categorical_crossentropy',
              optimizer = Adam(),
              metrics = ['accuracy'])

model.fit(x_train, y_train_cat, batch_size = batch_size, epochs = epoch,
          verbose = 1,
          validation_data = (x_test, y_test_cat)) ## validation loss


Train on 60000 samples, validate on 10000 samples
Epoch 1/20
Epoch 2/20
Epoch 3/20
Epoch 4/20
Epoch 5/20
Epoch 6/20
Epoch 7/20
Epoch 8/20
Epoch 9/20
Epoch 10/20
Epoch 11/20
Epoch 12/20
Epoch 13/20
Epoch 14/20
Epoch 15/20
Epoch 16/20
Epoch 17/20
Epoch 18/20
Epoch 19/20
Epoch 20/20


<keras.callbacks.History at 0x7fcf9fa50320>

In [13]:
## model Save & load 
model.save('./DLdata/dnn_ver1.h5') ## save model
model = load_model('./DLdata/dnn_ver1.h5') ## load model
         
          
## case1
score = model.evaluate(x_test, y_test_cat, verbose=0)
print('Total loss on Testing set:', score[0])
print('Accuracy of Testing set:', score[1])

## case2
pred = model.predict(x_test)
classify = np.argmax(pred, axis=1) ## predicted class = index of the highest probability

result = np.array([y_test, classify]) ## final table: row 0 = true labels, row 1 = predictions


Total loss on Testing set: 0.0899214999654
Accuracy of Testing set: 0.9825
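
The second case turns predicted probabilities into class labels by taking the index of the largest value, and accuracy is then the share of matching labels. A small sketch with made-up probabilities and labels (all values are hypothetical):

```python
def argmax(values):
    """Index of the largest value (the first one wins on ties)."""
    return max(range(len(values)), key=lambda i: values[i])

# Hypothetical softmax outputs for three samples over three classes.
pred = [
    [0.1, 0.7, 0.2],
    [0.8, 0.1, 0.1],
    [0.3, 0.3, 0.4],
]
y_true = [1, 0, 1]  # hypothetical true labels

classify = [argmax(row) for row in pred]
accuracy = sum(p == t for p, t in zip(classify, y_true)) / len(y_true)

print(classify)            # [1, 0, 2]
print(round(accuracy, 4))  # 0.6667
```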


## Option : Checkpoint

In [None]:
## checkpoint
from keras.callbacks import ModelCheckpoint
filepath = 'weights-improvement-{epoch:02d}-{val_acc:.2f}.hdf5' ### checkpoint every improvement of the neural network model

filepath = 'weight-best.hdf5' ## checkpoint the best NN model only (overrides the line above)

checkpoint = ModelCheckpoint(filepath, monitor = 'val_acc', verbose = 1, save_best_only = True, mode = 'max')
callbacks_list = [checkpoint]           

model.fit(x_train,y_train_cat, validation_split=0.2, epochs = epoch, batch_size = batch_size,
          callbacks = callbacks_list, verbose = 1 )

## load weights
model.load_weights(filepath) ## filepath as defined above

############ parameters .. 

mode type : max // auto // min 

if monitor = val_acc then mode = max
if monitor = val_loss then mode = min
if mode = auto then the direction is automatically inferred from the name of the monitored quantity
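
The effect of `mode` together with `save_best_only = True` can be sketched without Keras: a checkpoint is written only when the monitored value beats the best seen so far, where "beats" means higher for `max` and lower for `min`. This is a simplified illustration with hypothetical values, not the callback's actual implementation.

```python
def epochs_that_save(history, mode):
    """Return the 1-based epochs at which a 'save best only' checkpoint would be written."""
    if mode not in ('max', 'min'):
        raise ValueError("mode must be 'max' or 'min' in this sketch")
    best = None
    saved = []
    for epoch, value in enumerate(history, start=1):
        improved = (
            best is None
            or (mode == 'max' and value > best)
            or (mode == 'min' and value < best)
        )
        if improved:
            best = value
            saved.append(epoch)
    return saved

val_acc = [0.95, 0.97, 0.96, 0.98, 0.98]   # hypothetical validation accuracies
val_loss = [0.20, 0.12, 0.15, 0.09, 0.10]  # hypothetical validation losses

print(epochs_that_save(val_acc, 'max'))   # [1, 2, 4]
print(epochs_that_save(val_loss, 'min'))  # [1, 2, 4]
```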


## Option : Early Stopping

In [None]:
## early stopping
from keras.callbacks import EarlyStopping
earlystop = EarlyStopping(monitor = 'val_loss', patience = 2, verbose = 2, mode = 'auto')
callbacks_es = [earlystop]

model.fit(x_train,y_train_cat, validation_split=0.2, epochs = epoch, batch_size = batch_size,
          callbacks = callbacks_es, verbose = 1 )

#### patience 
when the loss on validation set doesn't improve for 2 epochs  ==> stop training
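
The patience rule can be simulated on a list of validation losses. The sketch below is a simplified model of the idea (count consecutive epochs without improvement, stop when the count reaches `patience`); the function name and loss values are hypothetical, and details such as a minimum improvement threshold are left out.

```python
def stopping_epoch(val_losses, patience):
    """Return the 1-based epoch at which training would stop, or None if it runs to the end."""
    best = float('inf')
    wait = 0
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best:
            best = loss
            wait = 0  # improvement: reset the counter
        else:
            wait += 1
            if wait >= patience:
                return epoch
    return None

val_losses = [0.30, 0.20, 0.15, 0.16, 0.17, 0.14]  # hypothetical values

print(stopping_epoch(val_losses, patience=2))  # 5
print(stopping_epoch(val_losses, patience=3))  # None
```

With `patience = 3` the run survives the two bad epochs and reaches the better loss at epoch 6, which shows the trade-off: a larger patience trains longer but is less likely to stop on a temporary plateau.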


## Option : Learning rate & weight decay & momentum

In [None]:
### learning rate & weight decay & momentum
from keras.optimizers import *

optim1 = SGD(lr = 0.01, momentum = 0.0, decay = 0.0)
optim2 = Adagrad(lr = 0.01, decay = 0.0)
optim3 = Adam(lr = 0.0001, decay = 0.0)
....
## reference site 
## http://keras.io/optimizers/ 
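
To see what the learning rate, momentum, and decay arguments do, here is a one-dimensional sketch of an SGD update. It assumes one common formulation (time-based decay `lr / (1 + decay * t)` and a velocity term scaled by `momentum`); the function and the toy objective are hypothetical, and the exact update used by a given Keras version may differ.

```python
def sgd_minimize(grad, w0, lr=0.01, momentum=0.0, decay=0.0, steps=100):
    """Minimize a 1-D function from its gradient using SGD with momentum and lr decay."""
    w, velocity = w0, 0.0
    for t in range(steps):
        lr_t = lr / (1.0 + decay * t)  # the learning rate shrinks as t grows
        velocity = momentum * velocity - lr_t * grad(w)
        w += velocity
    return w

def grad(w):
    """Gradient of f(w) = (w - 3)^2, whose minimum is at w = 3."""
    return 2.0 * (w - 3.0)

plain = sgd_minimize(grad, w0=0.0, lr=0.1)
with_momentum = sgd_minimize(grad, w0=0.0, lr=0.1, momentum=0.9)
with_decay = sgd_minimize(grad, w0=0.0, lr=0.1, decay=0.01)

# Distance from the minimum after 100 steps for each setting.
for name, w in (('plain', plain), ('momentum', with_momentum), ('decay', with_decay)):
    print(name, abs(w - 3.0))
```

On this toy problem decay slows the final approach to the minimum, because every later step is smaller than in the plain run.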


## Option : Dropout Example

In [None]:
## dropout example
#### deep neural network modeling  with dropout
#### helps prevent overfitting when there is a lot of data or many layers
#### generally placed above (before) the activation function (dense > dropout > activation > dense > dropout > activation)

from keras.layers import Dropout

info_input = Input(shape=(28*28,))

layer0 = Dropout(0.5)(info_input)
layer1 = Dense(512)(layer0)
layer1_act = Activation('relu')(layer1)

layer2 = Dropout(0.5)(layer1_act)

layer3 = Dense(512)(layer2)
layer3_act = Activation('relu')(layer3)

layer4 = Dense(10)(layer3_act)
layer4_act = Activation('softmax')(layer4)

model = Model(inputs=[info_input], 
              outputs = [layer4_act])

model.summary()

model.compile(loss = 'categorical_crossentropy',
              optimizer = optim1,
              metrics = ['accuracy'])
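
What a dropout layer does to its inputs during training can be sketched in a few lines. This assumes the "inverted dropout" scheme, in which surviving values are scaled by `1 / (1 - rate)` so the expected sum stays the same; the function and data are hypothetical and only illustrate the idea.

```python
import random

def dropout(values, rate, rng):
    """Inverted dropout: zero each value with probability `rate`, rescale the survivors."""
    keep_prob = 1.0 - rate
    return [v / keep_prob if rng.random() < keep_prob else 0.0 for v in values]

rng = random.Random(0)       # fixed seed so the run is repeatable
activations = [1.0] * 10000  # hypothetical layer outputs
dropped = dropout(activations, rate=0.5, rng=rng)

zero_fraction = dropped.count(0.0) / len(dropped)
mean_after = sum(dropped) / len(dropped)

# Roughly half the values are zeroed, and the mean stays close to the original 1.0.
print(round(zero_fraction, 2), round(mean_after, 2))
```

At prediction time dropout is switched off, so no values are zeroed and no rescaling is needed.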