<a href="https://colab.research.google.com/github/swarthyPig/SM13/blob/master/py_modules/py_modules_5_keras_mnist_3_DL.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

## Keras example: MNIST analysis by DL (Conv2D)

- Keras was designed to make well-known deep-learning frameworks such as TensorFlow and Theano easy to use.
- Keras provides an easy and convenient way to build deep learning models.

    - Keras is an open-source Python library that lets you easily build deep neural networks.
    - The library can run on top of TensorFlow, Theano, Microsoft Cognitive Toolkit, and MXNet.
    - TensorFlow and Theano are among the most widely used numerical platforms in Python for building deep learning algorithms, but they can be quite complex and difficult to use directly.
    
[Good intro to Keras](https://towardsdatascience.com/how-to-build-a-neural-network-with-keras-e8faa33d0ae4)

In [0]:
# use TensorFlow 1.x 
%tensorflow_version 1.x
import tensorflow as tf
print(tf.__version__)

In [0]:
# import numpy as np
# from keras.utils import to_categorical
# from keras import models
# from keras import layers

In [0]:
%%time
from keras.datasets import mnist
(X_train0, y_train0), (X_test0, y_test0) = mnist.load_data()

In [0]:
print(X_train0.shape, X_train0.dtype)
print(y_train0.shape, y_train0.dtype)
print(X_test0.shape, X_test0.dtype)
print(y_test0.shape, y_test0.dtype)

In [0]:
import matplotlib.pyplot as plt
import matplotlib as mpl
%matplotlib inline

In [0]:
X_train0[0]  # 0 <= value <= 255

In [0]:
y_train0[0]

In [0]:
# Plot X_train0[0]
plt.figure(figsize=(2, 2))
plt.imshow(X_train0[0], cmap=mpl.cm.bone_r)  # colormap
plt.grid(False)
plt.xticks([])
plt.yticks([])
plt.show()

### Show images of numbers

In [0]:
# Additional helper: plot a grid of digit images
import numpy as np
# import matplotlib as mpl
def plot_digits(instances, images_per_row=10, **options):
    size = 28
    images_per_row = min(len(instances), images_per_row)
    images = [instance.reshape(size,size) for instance in instances]
    n_rows = (len(instances) - 1) // images_per_row + 1
    row_images = []
    n_empty = n_rows * images_per_row - len(instances)
    images.append(np.zeros((size, size * n_empty)))
    for row in range(n_rows):
        rimages = images[row * images_per_row : (row + 1) * images_per_row]
        row_images.append(np.concatenate(rimages, axis=1))
    image = np.concatenate(row_images, axis=0)
    plt.imshow(image, cmap = mpl.cm.binary, **options)
    plt.axis("off")

In [0]:
plt.figure(figsize=(9,9))
example_images = np.r_[X_train0[:50]]
plot_digits(example_images, images_per_row=10)

plt.show()

### Convert the data to float type and scale it (GPU powered!)

In [0]:
X_train = X_train0.reshape(60000, 28, 28, 1).astype('float32') / 255.0
X_test = X_test0.reshape(10000, 28, 28, 1).astype('float32') / 255.0
print(X_train.shape, X_train.dtype)

# Note that the data layout differs from the simple NN: (784,) -> (28, 28, 1)
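
The difference between the two layouts can be sketched with NumPy alone. This is a minimal illustration that assumes a small random array (`fake_images`, a hypothetical stand-in for `X_train0`) rather than the real MNIST data.

```python
import numpy as np

# Hypothetical stand-in for X_train0: 5 random 28x28 uint8 "images".
rng = np.random.RandomState(0)
fake_images = rng.randint(0, 256, size=(5, 28, 28)).astype(np.uint8)

# Layout for a simple dense network: one flat vector of 784 pixels per image.
flat = fake_images.reshape(len(fake_images), 28 * 28).astype('float32') / 255.0

# Layout for Conv2D: height, width, and a trailing channel axis (1 = grayscale).
conv = fake_images.reshape(len(fake_images), 28, 28, 1).astype('float32') / 255.0

print(flat.shape, conv.shape)                # (5, 784) (5, 28, 28, 1)
print(conv.min() >= 0.0, conv.max() <= 1.0)  # True True
```

Both arrays hold the same pixel values; only the shape changes, so the convolution layers can see the 2D neighborhood of each pixel.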

### One-hot encode the y data (probabilistic labeling)

In [0]:
y_train0[:5]  # first 5 labels

In [0]:
# to_categorical()
from keras.utils import np_utils

y_train = np_utils.to_categorical(y_train0, 10)
y_test = np_utils.to_categorical(y_test0, 10)
y_train[:5]  # Probabilistic labeling
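
As a rough sketch of what one-hot encoding produces, the following NumPy-only version builds the same kind of array. The helper `one_hot` and the sample labels are hypothetical and are not part of Keras.

```python
import numpy as np

def one_hot(labels, num_classes):
    """Return a (len(labels), num_classes) float32 array with a single 1 per row."""
    labels = np.asarray(labels, dtype=int)
    encoded = np.zeros((labels.size, num_classes), dtype='float32')
    # Put a 1.0 in the column given by each label.
    encoded[np.arange(labels.size), labels] = 1.0
    return encoded

sample_labels = [5, 0, 4, 1, 9]   # hypothetical sample labels
encoded = one_hot(sample_labels, 10)
print(encoded[0])                 # 1.0 at index 5, 0.0 elsewhere
print(encoded.argmax(axis=1))     # recovers the integer labels
```

Each row can be read as a probability distribution that puts all its mass on the true class, which is the form the softmax output is compared against.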

***

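## Steps for implementing a neural network

### With Keras, a neural network can be built in the following order.

1. **Create the model object**: instantiate the Sequential model class.
2. **Build the network**: add layers with the add method.
    - The Dense layer is the most common kind of layer.
    - Add layers sequentially, starting from the input side.
    - Each layer takes the number of output neurons (or filters, for Conv2D) as its first argument.
    - The first layer must set the input size with the input_dim argument (or input_shape, as used for Conv2D in this notebook).
    - Set the activation function with the activation argument.
3. Finalize the model with the **compile** method.
    - Set the loss function with the loss argument.
    - Set the optimization algorithm with the optimizer argument.
    - Set the performance metrics to record during training with the metrics argument.
4. Train with the **fit** method.
    - Set the number of epochs with epochs (called nb_epoch in older Keras versions).
    - Set the mini-batch size with batch_size.
    - fit returns a History object that records the metrics set with the metrics argument.
    - In a Jupyter Notebook, verbose=1 shows a progress bar and verbose=2 prints one line per epoch without a progress bar.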

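### Preparing to display the model structure
> The internal structure of a built model can be inspected with the model_to_dot command or the summary command, which go through the model's internal list of layers.
- graphviz, pydot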

In [0]:
# Install graphviz, pydot in colab.
# https://laujohn.com/2018/09/24/Plot-Keras-Model-in-Colaboratory/
# Install dependencies
!apt install graphviz
!pip install pydot pydot-ng
!echo "Double check with Python 3"
!python -c "import pydot"

### Building the convolutional neural network model
- Conv2D()

In [0]:
from keras.optimizers import SGD  # Stochastic Gradient Descent
import numpy as np
from keras.datasets import mnist
from keras.utils import np_utils
from keras.models import Sequential
from keras.layers import Dense, Dropout, Flatten, Conv2D, MaxPooling2D
from keras.callbacks import ModelCheckpoint,EarlyStopping

In [0]:
# Deep Learning model
np.random.seed(0)

# Simple NN
# model = Sequential()
# model.add(Dense(15, input_dim=784, activation="sigmoid"))  # first layer
# model.add(Dense(10, activation="sigmoid")) # output layer
# model.compile(optimizer=SGD(lr=0.2), loss='mean_squared_error', metrics=["accuracy"])

# Convolutional neural network setup (Conv2D)
model = Sequential()
# model.add(Conv2D(32, kernel_size=(3, 3), input_dim=784, activation='relu'))
model.add(Conv2D(32, kernel_size=(3, 3), input_shape=(28, 28, 1), activation='relu')) # 1st layer with input
model.add(Conv2D(64, (3, 3), activation='relu')) # 2nd layer
model.add(MaxPooling2D(pool_size=2))  # Pooling layer
model.add(Dropout(0.25))  # Set dropout
model.add(Flatten())      # Flatten
model.add(Dense(128,  activation='relu'))  # Fully connected layer
model.add(Dropout(0.5))   # Set dropout
model.add(Dense(10, activation='softmax')) # Output layer with softmax activation

In [0]:
from IPython.display import SVG
from keras.utils.vis_utils import model_to_dot

SVG(model_to_dot(model, show_shapes=True, dpi=70).create(prog='dot', format='svg'))

In [0]:
from keras.utils import plot_model
plot_model(model, to_file='model_DL.png')

In [0]:
model.summary()

In [0]:
l1 = model.layers[0]  # 1st layer with input : Conv2D (filtering #1)
l2 = model.layers[1]  # 2nd layer : Conv2D (filtering #2)
l3 = model.layers[2]  # 3rd layer : max pooling (Extracting the dominant characteristics)
l4 = model.layers[3]  # 4th layer: dropout to avoid overfitting
l5 = model.layers[4]  # 5th layer: flatten
l6 = model.layers[5]  # 6th dense layer
l7 = model.layers[6]  # 7th layer: dropout to avoid overfitting
l8 = model.layers[7]  # last layer: output layer (softmax: probabilistic prediction of 0 to 9)

In [0]:
l1.name, type(l1), l1.output_shape, l1.activation.__name__, l1.count_params()  # 3*3*32 + 32 = 320

[link: moving gif - "How does a convolution work?"](http://machinelearninguru.com/_images/topics/computer_vision/basics/convolutional_layer_1/stride1.gif)
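
The sliding-window operation can be sketched in NumPy as follows. This is a simplified single-channel illustration, not the Keras implementation: the helpers `conv2d_valid` and `max_pool_2x2`, the input image, and the averaging kernel are all hypothetical. As in deep-learning libraries generally, the kernel is applied without flipping (strictly speaking, a cross-correlation).

```python
import numpy as np

def conv2d_valid(image, kernel):
    """Slide `kernel` over `image` with stride 1 and no padding."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w), dtype=float)
    for i in range(out_h):
        for j in range(out_w):
            # Element-wise product of the window and the kernel, then sum.
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

def max_pool_2x2(feature_map):
    """Keep the maximum of each non-overlapping 2x2 block."""
    h, w = feature_map.shape
    trimmed = feature_map[:h - h % 2, :w - w % 2]
    return trimmed.reshape(h // 2, 2, w // 2, 2).max(axis=(1, 3))

image = np.arange(28 * 28, dtype=float).reshape(28, 28)  # hypothetical input
kernel = np.ones((3, 3)) / 9.0                            # hypothetical averaging filter

fmap = conv2d_valid(image, kernel)
pooled = max_pool_2x2(fmap)
print(fmap.shape, pooled.shape)   # (26, 26) (13, 13)
```

A 3x3 kernel without padding shrinks each side by 2, which is why the model's feature maps go from 28 to 26 to 24 before pooling halves them to 12.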

In [0]:
l2.name, type(l2), l2.output_shape, l2.activation.__name__, l2.count_params()   # 3*3*32*64 + 64 = 18496

In [0]:
l3.name, type(l3), l3.output_shape, l3.count_params()

In [0]:
l4.name, type(l4), l4.output_shape, l4.count_params()

In [0]:
l5.name, type(l5), l5.output_shape, l5.count_params()  # 12*12*64 = 9216

In [0]:
l6.name, type(l6), l6.output_shape, l6.activation.__name__, l6.count_params()   # 9216*128+128 = 1179776

In [0]:
l7.name, type(l7), l7.output_shape, l7.count_params()

In [0]:
l8.name, type(l8), l8.output_shape, l8.activation.__name__, l8.count_params()  # 128*10 + 10 = 1290
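
The parameter counts noted in the comments above can be reproduced with plain arithmetic. The helper names `conv2d_params` and `dense_params` are hypothetical; the sketch assumes 3x3 kernels, no padding, and a 2x2 pooling window, matching the model defined in this notebook.

```python
def conv2d_params(kernel_h, kernel_w, in_channels, filters):
    """Weights of every filter plus one bias per filter."""
    return kernel_h * kernel_w * in_channels * filters + filters

def dense_params(inputs, units):
    """One weight per input-unit pair plus one bias per unit."""
    return inputs * units + units

size = 28
size = size - 3 + 1      # after 1st Conv2D (3x3, no padding): 26
size = size - 3 + 1      # after 2nd Conv2D: 24
size = size // 2         # after 2x2 max pooling: 12
flat = size * size * 64  # Flatten: 12 * 12 * 64 = 9216

counts = [
    conv2d_params(3, 3, 1, 32),    # 1st Conv2D
    conv2d_params(3, 3, 32, 64),   # 2nd Conv2D
    dense_params(flat, 128),       # Dense(128)
    dense_params(128, 10),         # output Dense(10)
]
print(counts, sum(counts))  # [320, 18496, 1179776, 1290] 1199882
```

The pooling, dropout, and flatten layers have no trainable parameters, so the total should match the figure reported by `model.summary()`.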

### Links to good introductions: convolution

- [CNN summary with an intuitive explanation (though the explanation of bias is lacking)](http://taewan.kim/post/cnn/)
- [Understanding Convolutional Layers in Convolutional Neural Networks (CNNs)](http://machinelearninguru.com/computer_vision/basics/convolution/convolution_layer.html)
- [Short Introduction to Convolutions and Pooling](https://medium.com/analytics-vidhya/deep-learning-methods-1700548a3093)

## Training with the fit method

In [0]:
# Compiling model
model.compile(loss='categorical_crossentropy',
              optimizer='adam',
              metrics=['accuracy'])

# loss function: categorical_crossentropy (applies to two or more classes)
# adam: Adaptive Moment Estimation
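
To make the loss concrete, here is a minimal NumPy sketch of softmax followed by categorical cross-entropy. The functions and the sample logits are hypothetical illustrations; Keras computes the same quantity internally, with its own numerical safeguards.

```python
import numpy as np

def softmax(logits):
    """Turn raw scores into probabilities that sum to 1 per row."""
    shifted = logits - logits.max(axis=-1, keepdims=True)  # for numerical stability
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)

def categorical_crossentropy(y_true, y_prob, eps=1e-7):
    """Per-sample loss: negative log-probability assigned to the true class."""
    y_prob = np.clip(y_prob, eps, 1.0 - eps)
    return -np.sum(y_true * np.log(y_prob), axis=-1)

logits = np.array([[2.0, 1.0, 0.1],     # hypothetical scores, true class 0
                   [0.1, 0.2, 3.0]])    # hypothetical scores, true class 1
y_true = np.array([[1, 0, 0],
                   [0, 1, 0]], dtype=float)

probs = softmax(logits)
losses = categorical_crossentropy(y_true, probs)
print(probs.sum(axis=1))  # each row sums to 1
print(losses)             # the second sample is penalized more heavily
```

The first sample puts most probability on its true class and gets a small loss; the second puts most probability on a wrong class and gets a large one.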

### Reload data and preprocess data

In [0]:
%%time
from keras.datasets import mnist
(X_train0, y_train0), (X_test0, y_test0) = mnist.load_data()

In [0]:
print(X_train0.shape, X_train0.dtype)
print(y_train0.shape, y_train0.dtype)
print(X_test0.shape, X_test0.dtype)
print(y_test0.shape, y_test0.dtype)

In [0]:
# Convert the data to float type and scale it.
X_train = X_train0.reshape(60000, 28, 28, 1).astype('float32') / 255.0
X_test = X_test0.reshape(10000, 28, 28, 1).astype('float32') / 255.0
print(X_train.shape, X_train.dtype)

In [0]:
# One-hot encoding: Probabilistic labeling
# to_categorical()
from keras.utils import np_utils

y_train = np_utils.to_categorical(y_train0, 10)
y_test = np_utils.to_categorical(y_test0, 10)

In [0]:
%%time
# Fitting model (the %%time cell magic must be the first line of the cell)
hist = model.fit(X_train, y_train, 
                 epochs=30, batch_size=100, 
                 validation_data=(X_test, y_test), 
                 verbose=1)

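# batch_size: 100 -> the 60,000 training samples are split into 600 batches of 100,
# and the parameters are adjusted by running forward and back propagation on each batch.
# epochs: one epoch ends once all 600 batches have been processed.
# validation_data: the test data is used as the validation set, evaluated at the end of each epoch.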
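
The batch and epoch arithmetic described above can be sketched as follows. The generator `iterate_minibatches` is a hypothetical illustration of how the sample indices are partitioned; Keras also shuffles the training data each epoch by default, which this sketch leaves out.

```python
import math

def iterate_minibatches(n_samples, batch_size):
    """Yield (start, stop) index pairs that cover all samples once."""
    for start in range(0, n_samples, batch_size):
        yield start, min(start + batch_size, n_samples)

n_samples, batch_size, epochs = 60000, 100, 30

steps_per_epoch = math.ceil(n_samples / batch_size)  # parameter updates per epoch
total_updates = steps_per_epoch * epochs             # updates over the whole run

batches = list(iterate_minibatches(n_samples, batch_size))
print(steps_per_epoch, total_updates)  # 600 18000
print(batches[0], batches[-1])         # (0, 100) (59900, 60000)
```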

### Plot results: performance and accuracy

In [0]:
import matplotlib.pyplot as plt
import matplotlib as mpl
%matplotlib inline

In [0]:
# Plot performance
plt.plot(hist.history['loss'])
plt.show()

In [0]:
plt.plot(hist.history['acc'], 'b-', label="training")
plt.plot(hist.history['val_acc'], 'r:', label="test")
plt.legend()
plt.show()
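
The history keys depend on the Keras version: the `'acc'` and `'val_acc'` keys used above match the older releases this notebook targets, while newer releases report `'accuracy'` and `'val_accuracy'` when the model is compiled with `metrics=['accuracy']`. A small helper can pick whichever key is present; `pick_history_key` and the two sample dictionaries below are hypothetical.

```python
def pick_history_key(history, candidates):
    """Return the first key from `candidates` that exists in `history`."""
    for key in candidates:
        if key in history:
            return key
    raise KeyError("none of {} found in {}".format(candidates, sorted(history)))

# Hypothetical dicts shaped like hist.history in older and newer Keras versions.
old_style = {'loss': [0.2], 'acc': [0.94], 'val_loss': [0.05], 'val_acc': [0.98]}
new_style = {'loss': [0.2], 'accuracy': [0.94], 'val_loss': [0.05], 'val_accuracy': [0.98]}

print(pick_history_key(old_style, ['acc', 'accuracy']))          # acc
print(pick_history_key(new_style, ['val_acc', 'val_accuracy']))  # val_accuracy
```

With such a helper, the plotting call could read `plt.plot(hist.history[pick_history_key(hist.history, ['acc', 'accuracy'])])`.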

## Weight information

> The weights of a trained model can be obtained with the get_weights method. This method returns the weights w and the biases b.

In [0]:
w1 = l1.get_weights()
w1[0].shape, w1[1].shape

In [0]:
w2 = l2.get_weights()
w2[0].shape, w2[1].shape

## Using the model

> A trained model can output y values with the predict method, or, treating the output y values as discriminant functions for each class, perform classification with the predict_classes method.

In [0]:
plt.figure(figsize=(2, 2))
plt.imshow(X_test0[0], cmap=mpl.cm.bone_r)
plt.grid(False)
plt.xticks([])
plt.yticks([])
plt.show()

In [0]:
model.predict(X_test[:1, :])

In [0]:
model.predict_classes(X_test[:1, :], verbose=0)
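
For a softmax output layer, the class returned by `predict_classes` is the index of the largest probability returned by `predict`. The sketch below shows that relationship on a hypothetical probability array; in newer Keras releases where `predict_classes` is not available, `np.argmax(model.predict(x), axis=-1)` is the usual substitute.

```python
import numpy as np

# Hypothetical softmax outputs for 3 images over 10 classes,
# shaped like the result of model.predict(...).
probs = np.full((3, 10), 0.01)
probs[0, 7] = 0.91
probs[1, 2] = 0.91
probs[2, 1] = 0.91

classes = np.argmax(probs, axis=-1)   # predicted digit per image
confidence = probs.max(axis=-1)       # probability assigned to that digit
print(classes)      # [7 2 1]
print(confidence)   # [0.91 0.91 0.91]
```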

### Computing prediction accuracy on the test data

In [0]:
y_pred0 = model.predict(X_test)
y_pred0[:10]

In [0]:
y_pred  = model.predict_classes(X_test, verbose=1)
y_pred[:10]

In [0]:
t_count = np.sum(y_pred == y_test0) # number of correct predictions
f_count = np.sum(y_pred != y_test0) # number of incorrect predictions
f_count==10000-t_count

In [0]:
t_count,f_count

In [0]:
accuracy = t_count/10000*100
accuracy
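
The same accuracy can be written as the mean of the element-wise comparison, and a confusion matrix shows which digits are mistaken for which. This is a minimal sketch on hypothetical labels and predictions; `confusion_matrix` here is a hand-written helper, not a library function.

```python
import numpy as np

def confusion_matrix(y_true, y_pred, num_classes):
    """Rows are true classes, columns are predicted classes."""
    matrix = np.zeros((num_classes, num_classes), dtype=int)
    # Add 1 at (true, predicted) for every sample, including repeated pairs.
    np.add.at(matrix, (y_true, y_pred), 1)
    return matrix

y_true = np.array([0, 1, 2, 2, 1, 0])   # hypothetical labels
y_pred = np.array([0, 1, 2, 1, 1, 0])   # hypothetical predictions

accuracy = np.mean(y_pred == y_true) * 100
cm = confusion_matrix(y_true, y_pred, 3)
print(accuracy)       # percentage of correct predictions
print(cm)
print(np.trace(cm))   # diagonal sum = number of correct predictions
```

With the notebook's variables, the call would be `confusion_matrix(y_test0, y_pred, 10)`.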

### Accuracy of predicting test numbers is around 99% in DL using Conv2D().

In [0]:
# see which we predicted correctly and which not
correct_indices = np.nonzero(y_pred == y_test0)[0]
incorrect_indices = np.nonzero(y_pred != y_test0)[0]
print()
print(len(correct_indices)," classified correctly")
print(len(incorrect_indices)," classified incorrectly")

In [0]:
# adapt figure size to accommodate 18 subplots
plt.rcParams['figure.figsize'] = (7,14)

figure_evaluation = plt.figure()

# plot 9 correct predictions
for i, correct in enumerate(correct_indices[:9]):
    plt.subplot(6,3,i+1)
    plt.imshow(X_test[correct].reshape(28,28), cmap='gray', interpolation='none')
    plt.title(
      "Predicted: {}, Truth: {}".format(y_pred[correct],
                                        y_test0[correct]))
    plt.xticks([])
    plt.yticks([])

# plot 9 incorrect predictions
for i, incorrect in enumerate(incorrect_indices[:9]):
    plt.subplot(6,3,i+10)
    plt.imshow(X_test[incorrect].reshape(28,28), cmap='gray', interpolation='none')
    plt.title(
      "Predicted: {}, Truth: {}".format(y_pred[incorrect],
                                        y_test0[incorrect]))
    plt.xticks([])
    plt.yticks([])

# figure_evaluation

## Saving the DL model

>  A trained model can be saved together with its weights in hdf5 format with the save method, and loaded again later with the load_model function.

In [0]:
model.save('my_model_dl.hdf5')
# del model

In [0]:
ls

In [0]:
!ls sample_data

In [0]:
from keras.models import load_model

model2 = load_model('my_model_dl.hdf5')
model2.predict_classes(X_test[:1, :], verbose=0)

In [0]:
model2.predict_classes(X_test[:10, :], verbose=0)

In [0]:
y_test0[:10]

### Computing prediction accuracy on the test data

In [0]:
# Correct prediction
model2.predict_classes(X_test[8:9, :], verbose=1)

In [0]:
y_test0[8]

In [0]:
# Predict on the entire test set
x_pred = model2.predict_classes(X_test, verbose=1)

In [0]:
t_count = np.sum(x_pred==y_test0) # number of correct predictions
f_count = np.sum(x_pred!=y_test0) # number of incorrect predictions
f_count==10000-t_count

In [0]:
t_count,f_count

In [0]:
accuracy = t_count/10000*100
accuracy

## DL is great!!!

#### An impressive simulation: MNIST
- [http://scs.ryerson.ca/~aharley/vis/conv/flat.html](http://scs.ryerson.ca/~aharley/vis/conv/flat.html)

### Good introduction to CNN
- [Image(Cat vs. dog) classifier with CNN](https://towardsdatascience.com/image-classifier-cats-vs-dogs-with-convolutional-neural-networks-cnns-and-google-colabs-4e9af21ae7a8)

- [Full CNN overview](https://cdn-images-1.medium.com/max/1100/1*qsbsCVyu376kqdnNcdxmmw.png)
- [Process of CNN](https://cdn-images-1.medium.com/max/1100/1*yZQjaMKHjm1HzDF4t4juzg.png)