God wants all people to be saved and to come to a knowledge of the truth. (1 Timothy 2:4)
<center><img src="https://github.com/idebtor/KMOOC-ML/blob/master/ipynb/images/MLwithPython.png?raw=true" width=1000></center>

__NOTE:__ The following materials have been compiled and adapted from numerous sources, including my own. Please help me keep this tutorial up to date by reporting any issues or questions. Send any comments or criticisms to `idebtor@gmail.com`. Your assistance and comments will be appreciated.

# Lecture 13-2: Deep Neural Net

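## Learning Objectives
- Survey the open-source frameworks available for machine learning.
- Understand what TensorFlow, Keras, and PyTorch are.
- Understand how a CNN is trained on the MNIST data with each of the three frameworks.

## Contents
- Open-source frameworks for machine learning
- TensorFlow
- Keras
- PyTorch
- Analysis of the MNIST dataset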


In [None]:
# Package imports
import importlib

import numpy as np
import matplotlib.pyplot as plt
%matplotlib inline

import sklearn
import sklearn.datasets
import sklearn.linear_model

# Our own private imports
import joy
importlib.reload(joy)   # the `imp` module was removed in Python 3.12; importlib replaces it

np.random.seed(1)   # a good practice for reproducibility and debugging

# Uncomment the following lines to hide warnings and
# keep the notebook output clean.
#import warnings
#warnings.filterwarnings('ignore')

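## Implementing a Deep Neural Net

So far, we have developed the signal processing of a neural network using matrix notation. Building on that notation, we can implement the multilayer neural network needed for deep learning without much difficulty.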
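
As a minimal sketch of what the matrix notation buys us, the forward pass through any number of layers reduces to a single loop of matrix products. The layer sizes, the seed, and the helper name `forward` below are assumptions made for illustration only. Samples are stored as columns and no bias terms are used, which mirrors the class defined later in this lecture.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def forward(weights, A0):
    """Propagate A0 (features x samples) through every weight matrix."""
    A = A0
    for W in weights:
        A = sigmoid(W @ A)      # z = W a, then a = g(z)
    return A

rng = np.random.default_rng(1)           # hypothetical seed
net_arch = [2, 4, 2, 1]                  # hypothetical layer sizes
# One weight matrix per pair of adjacent layers, shape (n_next, n_prev),
# with values drawn uniformly from [-1, 1)
weights = [2 * rng.random((n_out, n_in)) - 1
           for n_in, n_out in zip(net_arch[:-1], net_arch[1:])]

X = np.array([[0, 0, 1, 1], [0, 1, 0, 1]])   # four XOR inputs as columns
out = forward(weights, X)
print([w.shape for w in weights])
print(out.shape)
```

Adding a layer only adds one entry to `net_arch`; the loop itself does not change.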

In [None]:
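def tanh(x):
    # np.tanh is numerically stable; the explicit formula overflows for large negative x
    return np.tanh(x)

def tanh_d(x):
    return (1 + tanh(x)) * (1 - tanh(x))

def sigmoid(x): 
    #x = np.clip(x, -500, 500)  
    return 1 / (1 + np.exp(-x))

def sigmoid_d(x):
    return sigmoid(x) * (1 - sigmoid(x))

def relu(x):
    return np.maximum(x, 0)

def relu_d(x):
    # Return a new array; assigning into x would overwrite the caller's data
    return (x > 0).astype(float)

def softmax(x):
    # Normalize along axis 0 so that each column (one sample) sums to 1
    e_x = np.exp(x - np.max(x, axis=0, keepdims=True))
    return e_x / e_x.sum(axis=0, keepdims=True)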
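
A quick way to gain confidence in derivative helpers like the ones above is to compare them with a central finite difference. The sketch below restates the functions so that it runs on its own; the helper name `numeric_d`, the step size, and the sample points are assumptions chosen for illustration. The sample points avoid 0, where ReLU is not differentiable.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def sigmoid_d(x):
    return sigmoid(x) * (1.0 - sigmoid(x))

def tanh_d(x):
    return 1.0 - np.tanh(x) ** 2

def relu(x):
    return np.maximum(x, 0)

def relu_d(x):
    return (x > 0).astype(float)

def numeric_d(f, x, h=1e-6):
    """Central finite-difference approximation of f'(x)."""
    return (f(x + h) - f(x - h)) / (2 * h)

x = np.array([-2.0, -0.5, 0.3, 1.7])      # sample points away from 0
max_err = {
    'sigmoid': np.max(np.abs(sigmoid_d(x) - numeric_d(sigmoid, x))),
    'tanh':    np.max(np.abs(tanh_d(x) - numeric_d(np.tanh, x))),
    'relu':    np.max(np.abs(relu_d(x) - numeric_d(relu, x))),
}
print(max_err)   # every entry should be close to zero
```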

In [None]:
#%%writefile code/mnistDeepNet.py
#%load code/mnistDeepNet.py
# deep neural net
# version 0.1
# author: idebtor@gmail.com 

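class DeepNeuralNet(object):
    """ Implements a deep neural net. 
        Users may specify any number of layers.
        net_arch -- a list with the number of neurons in each layer 
        activate -- optional flat list [g1, g1_prime, g2, g2_prime, ...]
                    for layers 1, 2, ...; sigmoid is the default
    """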
    def __init__(self, net_arch, activate = None, eta = 1.0, epochs = 100, random_seed = 1):
        self.eta = eta
        self.epochs = epochs
        self.net_arch = net_arch
        self.layers = len(net_arch)
        self.W = []
        self.random_seed = random_seed
        
        self.g       = [lambda x: sigmoid(x)   for _ in range(self.layers)]
        self.g_prime = [lambda x: sigmoid_d(x) for _ in range(self.layers)]
        
        if activate is not None:
            for i, (g, g_prime) in enumerate(zip(activate[::2], activate[1::2])):
                self.g[i+1] = g
                self.g_prime[i+1] = g_prime
                
        # Random initialization with range of weight values (-1,1)
        np.random.seed(self.random_seed)
        
        # A placeholder [None] is used to indicate an "unused place".
        self.W = [[None]]    ## the first W0 is not used.
        for layer in range(self.layers - 1):
            w = 2 * np.random.rand(self.net_arch[layer+1], 
                                   self.net_arch[layer]) - 1
            self.W.append(w)  
            
    def forpass(self, A0):     
        Z = [[None]]   # Z0 is not used.
        A = []       # A0 = X0 is used. 
        A.append(A0)
        for i in range(1, len(self.W)):
            z = np.dot(self.W[i], A[i-1])
            Z.append(z)
            a = self.g[i](z)
            A.append(a)
        return Z, A
    
    def backprop(self, Z, A, Y):
        # Initialize empty lists to save E and dZ.
        # A placeholder None is used to indicate an "unused place".
        E  = [None for x in range(self.layers)]
        dZ = [None for x in range(self.layers)]
        
        # Get error at the output layer or the last layer
        ll = self.layers - 1
        error = Y - A[ll]
        E[ll] = error   
        dZ[ll] = error * self.g_prime[ll](Z[ll]) 
        
        # Begin from the back, from the next to last layer
        for i in range(self.layers-2, 0, -1):
            E[i]  = np.dot(self.W[i+1].T, E[i+1])
            dZ[i] = E[i] * self.g_prime[i](Z[i])
       
        # Adjust the weights, using the backpropagation rules
        m = Y.shape[-1] # number of samples; samples lie along the last axis
        for i in range(ll, 0, -1):
            self.W[i] += self.eta * np.dot(dZ[i], A[i-1].T) / m
        return error
         
    def fit(self, X, y):
        self.cost_ = []        
        for epoch in range(self.epochs):          
            Z, A = self.forpass(X)        
            cost = self.backprop(Z, A, y)   
            # store the L2 norm of the output error for this epoch
            self.cost_.append(
                 np.sqrt(np.sum(cost * cost)))    
        return self

    def predict(self, X):
        A0 = np.array(X, ndmin=2).T         # A0: inputs, one sample per column
        Z, A = self.forpass(A0)     # forpass
        return A[-1]                                       
   
    def evaluate(self, Xtest, ytest):       # fully vectorized calculation
        m_samples = len(ytest)
        A_out = self.predict(Xtest)
        yhat = np.argmax(A_out, axis = 0)   # predicted class per sample
        scores = np.sum(yhat == ytest)
        return scores/m_samples * 100       # accuracy in percent
    

In [None]:
def __init__(self, net_arch, activate = None, 
             eta = 1.0, epochs = 100, random_seed = 1):
    self.eta = eta
    self.epochs = epochs
    self.net_arch = net_arch
    self.layers = len(net_arch)
    self.W = []

    self.g       = [lambda x: sigmoid(x)   for _ in range(self.layers)]
    self.g_prime = [lambda x: sigmoid_d(x) for _ in range(self.layers)]

    if activate is not None:
        for i, (g, g_prime) in enumerate(zip(activate[::2], activate[1::2])):
            self.g[i+1] = g
            self.g_prime[i+1] = g_prime

    np.random.seed(random_seed)
    self.W = [[None]]    ## the first W0 is not used.
    for layer in range(self.layers - 1):
        w = 2 * np.random.rand(self.net_arch[layer+1], 
                               self.net_arch[layer]) - 1
        self.W.append(w)    

In [None]:
class DeepNeuralNet():
    """ implements a deep neural net. 
        Users may specify any number of layers.
        net_arch -- consists of a number of neurons in each layer 
    """
    def __init__(self, net_arch, activate = None, 
                 eta = 1.0, epochs = 100, random_seed = 1):
        pass
  
    def forpass(self, A0):     
        pass
    
    def backprop(self, Z, A, Y):
        pass
    
    def fit(self, X, y):
        pass   

    def predict(self, X):
        pass                                     
   
    def evaluate(self, Xtest, ytest):      
        pass

In [None]:
def forpass(self, A0):     
    Z = [[None]] # Z0 is not used.
    A = []       # A0 = X0 is used. 
    A.append(A0)
    for i in range(1, len(self.W)):
        z = np.dot(self.W[i], A[i-1])
        Z.append(z)
        a = self.g[i](z)
        A.append(a)
    return Z, A

In [None]:
def backprop(self, Z, A, Y):
    E  = [None for x in range(self.layers)]
    dZ = [None for x in range(self.layers)]

    ll = self.layers - 1
    error = Y - A[ll]
    E[ll] = error   
    dZ[ll] = error * self.g_prime[ll](Z[ll]) 

    for i in range(self.layers-2, 0, -1):
        E[i]  = np.dot(self.W[i+1].T, E[i+1])
        dZ[i] = E[i] * self.g_prime[i](Z[i])

    m = Y.shape[-1] # number of samples; samples lie along the last axis
    for i in range(ll, 0, -1):
        self.W[i] += self.eta * np.dot(dZ[i], A[i-1].T) / m
    return error
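
Read as equations, the `backprop` code above appears to implement the following update rule. The notation is introduced here for illustration and is not taken from the original notes: $L$ is the index of the last layer, $\odot$ is the element-wise product, and $m$ is the number of samples.

$$
\begin{aligned}
E^{[L]} &= Y - A^{[L]}, \\
E^{[i]} &= \left(W^{[i+1]}\right)^{T} E^{[i+1]}, \qquad i = L-1, \dots, 1, \\
dZ^{[i]} &= E^{[i]} \odot g'^{[i]}\!\left(Z^{[i]}\right), \qquad i = L, \dots, 1, \\
W^{[i]} &\leftarrow W^{[i]} + \frac{\eta}{m}\, dZ^{[i]} \left(A^{[i-1]}\right)^{T}, \qquad i = L, \dots, 1.
\end{aligned}
$$

Note that the hidden-layer error is obtained by passing $E^{[i+1]}$ back through the weights, exactly as the code does. Textbook gradient-descent backpropagation passes $dZ^{[i+1]}$ back instead, so the two coincide only at the output layer.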

In [None]:
import joy
import matplotlib.pyplot as plt 
import numpy as np
%matplotlib inline

# Set the input data and labels for XOR
X = np.array([ [0, 0, 1, 1], [0, 1, 0, 1] ])
y = np.array([0, 1, 1, 0])
print(X, "\n", y)

# Initialize the deep neural net with a 2-4-2-1 architecture
dnn = DeepNeuralNet([2, 4, 2, 1], eta = 0.9, epochs = 10000)  

# Train the deep neural net object with X, y
dnn.fit(X, y)             
    
Ao = dnn.predict(X.T)
for x, yhat in zip(X.T, Ao.T):
    print(x, np.round(yhat, 3))

joy.plot_decision_regions(X.T, y, dnn)   
plt.xlabel('x-axis')
plt.ylabel('y-axis')
plt.legend(loc='best')
plt.show()

In [None]:
import joy
X = np.array([ [0, 0, 1, 1], [0, 1, 0, 1] ])
y = np.array([0, 1, 1, 0])
dnn = DeepNeuralNet([2, 4, 3, 1], eta = 0.5, epochs = 5000).fit(X, y)   

joy.plot_decision_regions(X.T, y, dnn)   
plt.xlabel('x-axis')
plt.ylabel('y-axis')
plt.legend(loc='best')
plt.show()

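### 6.3 Visualizing the Error (self.cost_)

The error (loss) produced while training the network is stored in the `cost_` attribute of the `DeepNeuralNet` object. By visualizing it, we can analyze how the network learned and whether it converged toward a minimum of the loss. Run the code in the next cell.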
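
Alongside the plot, the stored values can be inspected numerically. The sketch below first reproduces the quantity that `fit` appends each epoch, which is the square root of the summed squared error, and then applies a simple plateau test. The helper names `error_norm` and `has_converged`, the window and tolerance, and the exponentially decaying cost curve are all made up for illustration and are not part of the class.

```python
import numpy as np

def error_norm(y, yhat):
    """Square root of the summed squared error, as stored in cost_."""
    error = y - yhat
    return np.sqrt(np.sum(error * error))

def has_converged(costs, window=100, tol=1e-3):
    """Hypothetical helper: True if the cost changed by less than tol
    over the last `window` epochs."""
    if len(costs) <= window:
        return False
    return abs(costs[-1 - window] - costs[-1]) < tol

y    = np.array([0, 1, 1, 0])
yhat = np.array([0.5, 0.5, 0.5, 0.5])     # a net that always outputs 0.5
print(error_norm(y, yhat))                # 1.0

# Synthetic, made-up cost curve that decays exponentially toward zero
costs = list(np.exp(-np.arange(5000) / 300.0))
print(has_converged(costs[:200]), has_converged(costs))   # False True
```

With a real model, `dnn.cost_` would take the place of the synthetic `costs` list.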

In [None]:
import joy
X = np.array([ [0, 0, 1, 1], [0, 1, 0, 1] ])
y = np.array([0, 1, 1, 0])
dnn = DeepNeuralNet([2, 4, 1], eta = 0.5, epochs = 5000).fit(X, y)   

joy.plot_decision_regions(X.T, y, dnn)   
plt.xlabel('x-axis')
plt.ylabel('y-axis')
plt.legend(loc='best')
plt.show()

In [None]:
plt.plot(range(len(dnn.cost_)), dnn.cost_)
plt.xlabel('Epochs')
plt.ylabel('Squared Sum of Errors')
plt.title('DeepNeuralNet:{}'.format(dnn.net_arch))
plt.show()

In [None]:
dnn = DeepNeuralNet([2, 4, 1], 
                    eta = 0.5, epochs = 5000).fit(X, y) 
plt.plot(range(len(dnn.cost_)), dnn.cost_)
plt.xlabel('Epochs')
plt.ylabel('Squared Sum of Errors')
plt.title('DeepNeuralNet:{}'.format(dnn.net_arch))
plt.show()

In [None]:
import numpy as np
import matplotlib.pyplot as plt
%matplotlib inline

X = np.array([ [0, 0, 1, 1], [0, 1, 0, 1] ])
y = np.array([0, 1, 1, 0])

dnn1 = DeepNeuralNet([2,4,1], eta = 0.5, epochs = 5000).fit(X, y) 

g = [sigmoid, sigmoid_d, sigmoid, sigmoid_d, sigmoid, sigmoid_d]
dnn2 = DeepNeuralNet([2,4,2,1], activate=g, eta = 0.5, epochs = 5000).fit(X, y) 
plt.plot(range(len(dnn1.cost_)), dnn1.cost_, label='{}'.format(dnn1.net_arch))
plt.plot(range(len(dnn2.cost_)), dnn2.cost_, label='{}'.format(dnn2.net_arch))
plt.title('DeepNeuralNet for XOR')
plt.xlabel('Epochs')
plt.ylabel('Squared Sum of Errors')
plt.legend(loc='best')
plt.show()

## Specifying an Activation Function for Each Layer
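
The constructor accepts `activate` as a flat list of alternating function and derivative entries, and assigns the pairs to layers 1, 2, ... in order. Layer 0 is the input layer, so its entry is never used. The sketch below reproduces only that pairing step outside the class; the helper name `assign_activations` is hypothetical, and the activation functions are restated so that the block runs on its own.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def sigmoid_d(x):
    return sigmoid(x) * (1.0 - sigmoid(x))

def tanh(x):
    return np.tanh(x)

def tanh_d(x):
    return 1.0 - np.tanh(x) ** 2

def relu(x):
    return np.maximum(x, 0)

def relu_d(x):
    return (x > 0).astype(float)

def assign_activations(n_layers, activate=None):
    """Return per-layer lists (g, g_prime); sigmoid is the default."""
    g       = [sigmoid]   * n_layers
    g_prime = [sigmoid_d] * n_layers
    if activate is not None:
        # even positions hold functions, odd positions hold derivatives
        pairs = zip(activate[::2], activate[1::2])
        for i, (f, f_d) in enumerate(pairs):
            g[i + 1], g_prime[i + 1] = f, f_d   # index 0 is the input layer
    return g, g_prime

activate = [tanh, tanh_d, relu, relu_d, sigmoid, sigmoid_d]
g, g_prime = assign_activations(4, activate)
print([f.__name__ for f in g])         # ['sigmoid', 'tanh', 'relu', 'sigmoid']
print([f.__name__ for f in g_prime])   # ['sigmoid_d', 'tanh_d', 'relu_d', 'sigmoid_d']
```

A four-layer net such as `[2, 4, 2, 1]` therefore needs at most three pairs. Supplying more pairs than there are non-input layers would index past the end of the list.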

In [None]:
# Set the input data and labels for XOR
X = np.array([ [0, 0, 1, 1], [0, 1, 0, 1] ])
y = np.array([0, 1, 1, 0])
print(X, "\n", y)

fig, ax = plt.subplots(nrows=1, ncols=2, figsize=(10, 4))

g = [tanh, tanh_d, sigmoid, sigmoid_d, sigmoid, sigmoid_d]
dnn1 = DeepNeuralNet([2, 4, 2, 1], activate = g, eta = 0.5, epochs = 2000).fit(X,y)
ax[0].plot(range(1, len(dnn1.cost_) + 1), dnn1.cost_)
ax[0].set_xlabel('Epochs')
ax[0].set_ylabel('Sum-squared-error')   # no logarithm is applied to cost_
ax[0].set_ylim([0.0, 1.1])
ax[0].set_title('DeepNeuralNet:{}'.format(dnn1.net_arch))

g = [tanh, tanh_d, relu, relu_d, tanh, tanh_d]
dnn2 = DeepNeuralNet([2, 4, 2, 1], activate = g, eta=0.5, epochs=2000).fit(X, y)
ax[1].plot(range(1, len(dnn2.cost_) + 1), dnn2.cost_)
ax[1].set_xlabel('Epochs')
ax[1].set_ylabel('Sum-squared-error')
ax[1].set_ylim([0.0, 1.1])
ax[1].set_title('DeepNeuralNet:{}'.format(dnn2.net_arch))
plt.show()

In [None]:
# Set the input data and labels for XOR
X = np.array([ [0, 0, 1, 1], [0, 1, 0, 1] ])
y = np.array([0, 1, 1, 0])
print(X, "\n", y)

fig, ax = plt.subplots(nrows=1, ncols=2, figsize=(10, 4))

g = [tanh, tanh_d, sigmoid, sigmoid_d, sigmoid, sigmoid_d]
dnn1 = DeepNeuralNet([2, 18, 4, 1], activate = g, eta = 0.5, epochs = 2000).fit(X,y)
ax[0].plot(range(1, len(dnn1.cost_) + 1), dnn1.cost_)
ax[0].set_xlabel('Epochs')
ax[0].set_ylabel('Sum-squared-error')   # no logarithm is applied to cost_
ax[0].set_ylim([0.0, 1.1])
ax[0].set_title('DeepNeuralNet:{}'.format(dnn1.net_arch))

g = [tanh, tanh_d, relu, relu_d, tanh, tanh_d]
dnn2 = DeepNeuralNet([2, 18, 4, 1], activate = g, eta=0.5, epochs=2000).fit(X, y)
ax[1].plot(range(1, len(dnn2.cost_) + 1), dnn2.cost_)
ax[1].set_xlabel('Epochs')
ax[1].set_ylabel('Sum-squared-error')
ax[1].set_ylim([0.0, 1.1])
ax[1].set_title('DeepNeuralNet:{}'.format(dnn2.net_arch))
plt.show()

## Summary
- Surveying the open-source frameworks available for machine learning.
- Understanding what TensorFlow, Keras, and PyTorch are.
- Understanding how a CNN is trained on the MNIST data with each of the three frameworks.


----------
Rejoice in the Lord always. I will say it again: Rejoice! (Ph4:4)