<a href="https://colab.research.google.com/github/Redwoods/Py/blob/master/py-basic/ann/DL4_FeedForward.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

[original link: DL-FeedForward](https://github.com/alfredessa/lasi2018/blob/master/notebooks/slideversion/deeplearning/DL-FeedForward.ipynb)

In [0]:
from IPython.display import HTML
css_file = 'style.css'
# read the stylesheet and close the file before rendering it
with open(css_file, 'r') as f:
    css = f.read()
HTML(css)


# Deep Learning 

## Feed Forward

Alfred Essa, Ani Aghababyan, Shirin Mojarad

# Forward Propagation

**1:** State the definition of a *feedforward* network

**2:** Describe the data structures, in terms of *matrix algebra*, for representing a feedforward network

**3:** Understand the end-to-end *computation*, or forward propagation, for a feedforward network

**4:** Write a Python class to compute the forward propagation steps resulting in the final output

# Feedforward Networks

<img src="https://github.com/alfredessa/lasi2018/raw/8140f2ad8d9998a02d7ec559d0ddceb5f41c6a18/notebooks/slideversion/deeplearning/images/feedforward1.png" width="60%" height="60%" />


> Feedforward networks are the quintessential neural network. They are referred to as feedforward because information flows sequentially through each layer without feedback or loops.

- In a feedforward network, the input passes through the layers in one direction only, and that single pass determines the final output.

# Forward Propagation Computation

<img src="https://github.com/alfredessa/lasi2018/raw/8140f2ad8d9998a02d7ec559d0ddceb5f41c6a18/notebooks/slideversion/deeplearning/images/feedforward2.png" width="75%" height="75%" />

- A feedforward neural network composes together a series of functions: $f(x) = f^{(3)}(f^{(2)}(f^{(1)}(x)))$

- $f^{(1)}$ is the first layer of the network, $f^{(2)}$ is the second layer, and so on

- Each layer processes its input (x) and passes the resulting output to the next layer as that layer's new input.
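The composition described above can be sketched in plain Python. The three functions `f1`, `f2`, and `f3` are hypothetical stand-ins chosen only to make the arithmetic easy to follow; real layers would apply weights, biases, and an activation function.

```python
from functools import reduce

def compose(*layers):
    """Return a function that applies the given layers from left to right."""
    return lambda x: reduce(lambda value, layer: layer(value), layers, x)

# Hypothetical stand-ins for the three layers
f1 = lambda x: 2 * x      # first layer
f2 = lambda x: x + 1      # second layer
f3 = lambda x: x * x      # third layer

f = compose(f1, f2, f3)   # f(x) = f3(f2(f1(x)))
result = f(3)             # (2 * 3 + 1) ** 2
print(result)
```

Each function only ever sees the output of the one before it, which is the sense in which information flows forward without loops.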



---



### Network Class

In [0]:
import numpy as np

In [0]:
# activation functions
def relu(z):
    return np.maximum(0, z)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def heaviside(z):
    # np.heaviside works element-wise on arrays; the second argument (0)
    # is the value returned where z == 0 (updated by Redwoods Yi, 181120)
    return np.heaviside(z, 0)
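As a quick sanity check, the three activation functions can be applied to one small array containing a negative value, zero, and a positive value. This is a minimal sketch; the sample array `z` is an assumption chosen for illustration, and the functions are repeated so the block runs on its own.

```python
import numpy as np

def relu(z):
    return np.maximum(0, z)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def heaviside(z):
    return np.heaviside(z, 0)

z = np.array([-2.0, 0.0, 3.0])   # hypothetical sample values
relu_out = relu(z)               # negative values are clipped to 0
sigmoid_out = sigmoid(z)         # values between 0 and 1, exactly 0.5 at z = 0
step_out = heaviside(z)          # 0 for z < 0, 1 for z > 0, and 0 at z = 0 here
print(relu_out, sigmoid_out, step_out)
```

All three functions operate element-wise, so they accept a whole layer's weighted sums at once.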

In [0]:
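class Network(object):
    def __init__(self, sizes):
        # sizes lists the neurons per layer, e.g. [5, 3, 2]: input, hidden, output
        self.num_layers = len(sizes)
        self.sizes = sizes
        # one (y, 1) bias column for every layer after the input
        self.biases = [np.random.randn(y, 1) for y in sizes[1:]]
        # one (y, x) weight matrix between each pair of consecutive layers
        self.weights = [np.random.randn(y, x)
                        for x, y in zip(sizes[:-1], sizes[1:])]

    def feedforward(self, a, phi):
        # a: 1-D input of shape (n,); phi: activation applied at every layer
        for b, w in zip(self.biases, self.weights):
            # np.dot(w, a) has shape (k,) but b has shape (k, 1); adding them
            # directly would broadcast to (k, k), so flatten b first
            # (same values as np.diag(np.dot(w, a) + b))
            z = np.dot(w, a) + b.ravel()
            print(z)
            a = phi(z)  # new input to the next layer
        print(a)
        return a

    def set_biases(self, newbiases):
        # replaces all biases with those of a single layer
        self.biases = [np.array(newbiases)]

    def set_weights(self, newweights):
        # replaces all weights with those of a single layer
        self.weights = [np.array([newweights])]

    def show_parameters(self):
        print("biases: =", self.biases)
        print("weights: =", self.weights)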

## Checking the Network class

In [0]:
# Set a seed for random number generation
np.random.seed(0)

In [5]:
# test
sizes=[5,3,2]  # input, layer1, output
b1 = [np.random.randn(y,1) for y in sizes[1:]]
b1

[array([[1.76405235],
        [0.40015721],
        [0.97873798]]), array([[2.2408932 ],
        [1.86755799]])]

In [7]:
b1[0],b1[1]

(array([[1.76405235],
        [0.40015721],
        [0.97873798]]), array([[2.2408932 ],
        [1.86755799]]))

In [8]:
w1 = [np.random.randn(y,x) for x,y in zip(sizes[:-1], sizes[1:])]
w1 

[array([[-0.97727788,  0.95008842, -0.15135721, -0.10321885,  0.4105985 ],
        [ 0.14404357,  1.45427351,  0.76103773,  0.12167502,  0.44386323],
        [ 0.33367433,  1.49407907, -0.20515826,  0.3130677 , -0.85409574]]),
 array([[-2.55298982,  0.6536186 ,  0.8644362 ],
        [-0.74216502,  2.26975462, -1.45436567]])]

In [9]:
w1[0]

array([[-0.97727788,  0.95008842, -0.15135721, -0.10321885,  0.4105985 ],
       [ 0.14404357,  1.45427351,  0.76103773,  0.12167502,  0.44386323],
       [ 0.33367433,  1.49407907, -0.20515826,  0.3130677 , -0.85409574]])

In [10]:
w1[1]

array([[-2.55298982,  0.6536186 ,  0.8644362 ],
       [-0.74216502,  2.26975462, -1.45436567]])

In [11]:
# input
x0 = np.arange(5)
x0,x0.shape

(array([0, 1, 2, 3, 4]), (5,))

### Hidden layer

In [12]:
# hidden layer: transfer function : linear sum + bias
z0 = np.dot(w1[0], x0)  # linear sum
z0,b1[0],z0+b1[0].T

(array([ 1.98011145,  5.11682694, -1.39341731]), array([[1.76405235],
        [0.40015721],
        [0.97873798]]), array([[ 3.7441638 ,  5.51698415, -0.41467932]]))

### Fixing the problem that arises when adding a row vector and a column vector

In [13]:
t1=np.array([ 1.98011145,  5.11682694, -1.39341731])
t2=np.array([[1.76405235],[0.40015721],[0.97873798]])
t1+t2

array([[ 3.7441638 ,  6.88087929,  0.37063504],
       [ 2.38026866,  5.51698415, -0.9932601 ],
       [ 2.95884943,  6.09556492, -0.41467933]])

In [14]:
np.diag(t1+t2)

array([ 3.7441638 ,  5.51698415, -0.41467933])

In [15]:
t1+t2.T  # correct result!

array([[ 3.7441638 ,  5.51698415, -0.41467933]])

In [16]:
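# in the layer: z0 + b1[0] (no transpose) would broadcast to shape (3, 3), and its
# first row [3.7441638 6.88087928 0.37063504] is wrong; transposing b1[0] fixes it
z0+b1[0].T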

array([[ 3.7441638 ,  5.51698415, -0.41467932]])
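The shape mismatch shown above comes from NumPy broadcasting: a `(3,)` array plus a `(3, 1)` array yields a `(3, 3)` array. The sketch below reproduces the effect with small hypothetical values (`v` and `b` are illustrative names, not variables from the cells above) and compares three ways to obtain the element-wise sum.

```python
import numpy as np

v = np.array([1.0, 2.0, 3.0])            # shape (3,), like np.dot(w, x)
b = np.array([[10.0], [20.0], [30.0]])   # shape (3, 1), like a bias column

wrong = v + b                 # broadcasts to shape (3, 3)
fix_transpose = v + b.T       # shape (1, 3)
fix_ravel = v + b.ravel()     # shape (3,)
fix_diag = np.diag(v + b)     # shape (3,), diagonal of the (3, 3) result

print(wrong.shape, fix_transpose.shape, fix_ravel.shape, fix_diag.shape)
```

All three fixes give the same values. Flattening the bias avoids building the intermediate `(k, k)` array, which matters more as layers grow.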

In [17]:
# output from the hidden layer
heaviside(z0+b1[0].T),relu(z0+b1[0].T),sigmoid(z0+b1[0].T)  # output

(array([[1., 1., 0.]]),
 array([[3.7441638 , 5.51698415, 0.        ]]),
 array([[0.97689125, 0.99599813, 0.39779064]]))

### Output layer
> The output from the hidden layer becomes the input to the output layer.

In [18]:
# output layer: transfer function : linear sum + bias
z1 = np.dot(w1[1], heaviside(z0+b1[0].T).T)  # linear sum only; the bias b1[1] is added on the next line
z1,b1[1],z1+b1[1]

(array([[-1.89937122],
        [ 1.5275896 ]]), array([[2.2408932 ],
        [1.86755799]]), array([[0.34152198],
        [3.39514759]]))

In [0]:
# (linear sum + bias) in output layer
z1h=z1+b1[1]

In [20]:
# output layer: heaviside activation (z1h already includes the bias b1[1])
heaviside(z1h)

array([[1.],
       [1.]])

In [0]:
# (linear sum + bias) in output layer
z1r = np.dot(w1[1], relu(z0+b1[0].T).T)  + b1[1]

In [22]:
# output layer: relu activation
relu(z1r)

array([[ 0.        ],
       [11.61097086]])

In [0]:
# (linear sum + bias) in output layer
z1s = np.dot(w1[1], sigmoid(z0+b1[0].T).T)  + b1[1]

In [24]:
# output layer: sigmoid activation
sigmoid(z1s)

array([[0.6773822],
       [0.9439951]])

### Checking feedforward operation
> N0 network with sizes=[5,3,2]  # input, layer1, output

In [25]:
np.random.seed(0)
N0=Network(sizes)
sizes

[5, 3, 2]

In [26]:
N0.show_parameters()

biases: = [array([[1.76405235],
       [0.40015721],
       [0.97873798]]), array([[2.2408932 ],
       [1.86755799]])]
weights: = [array([[-0.97727788,  0.95008842, -0.15135721, -0.10321885,  0.4105985 ],
       [ 0.14404357,  1.45427351,  0.76103773,  0.12167502,  0.44386323],
       [ 0.33367433,  1.49407907, -0.20515826,  0.3130677 , -0.85409574]]), array([[-2.55298982,  0.6536186 ,  0.8644362 ],
       [-0.74216502,  2.26975462, -1.45436567]])]


### heaviside activation

In [27]:
N0.feedforward(x0, heaviside)

[ 3.7441638   5.51698415 -0.41467932]
[0.34152198 3.39514759]
[1. 1.]


array([1., 1.])

In [28]:
N0.feedforward(x0, heaviside)==heaviside(z1h).T

[ 3.7441638   5.51698415 -0.41467932]
[0.34152198 3.39514759]
[1. 1.]


array([[ True,  True]])

### relu activation

In [29]:
N0.feedforward(x0, relu)

[ 3.7441638   5.51698415 -0.41467932]
[-3.71191542 11.61097086]
[ 0.         11.61097086]


array([ 0.        , 11.61097086])

In [30]:
N0.feedforward(x0, relu) == relu(z1r).T

[ 3.7441638   5.51698415 -0.41467932]
[-3.71191542 11.61097086]
[ 0.         11.61097086]


array([[ True,  True]])

### sigmoid activation

In [31]:
N0.feedforward(x0, sigmoid)

[ 3.7441638   5.51698415 -0.41467932]
[0.74176733 2.82468179]
[0.6773822 0.9439951]


array([0.6773822, 0.9439951])

In [32]:
N0.feedforward(x0, sigmoid)==sigmoid(z1s).T

[ 3.7441638   5.51698415 -0.41467932]
[0.74176733 2.82468179]
[0.6773822 0.9439951]


array([[ True,  True]])
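The checks above compare floating-point arrays with `==`, which works here because both sides perform the same operations in the same order. When two results are computed along different paths, rounding can make an exact comparison fail, so a tolerance-based check such as `np.allclose` is often safer. A minimal sketch with hypothetical values:

```python
import numpy as np

a = np.array([0.1 + 0.2, 0.5])   # 0.1 + 0.2 is not exactly 0.3 in floating point
b = np.array([0.3, 0.5])

exact = (a == b)                 # element-wise exact comparison
close = np.allclose(a, b)        # comparison within a small tolerance
print(exact, close)
```

The exact comparison rejects the first element even though the two numbers differ only by rounding error, while `np.allclose` accepts the arrays as equal.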

### Well done! Great!

***

### Neural network 1: single layer (3 -> 2 -> 1)

> input (3), layer (2) , and output (1) 

In [33]:
# inputs (should be numpy array)
x1 = np.array([2.3,4.5,1.3])
x2 = np.array([1.3,2.5,4.3])
x1,x2

(array([2.3, 4.5, 1.3]), array([1.3, 2.5, 4.3]))

In [34]:
np.random.seed(0)
N1 = Network([3,2,1])  # 3X2X1 NN
N1.num_layers # input (3), layer (2) , and output (1) 

3

In [35]:
N1.num_layers,N1.sizes

(3, [3, 2, 1])

In [36]:
N1.show_parameters()

biases: = [array([[1.76405235],
       [0.40015721]]), array([[0.97873798]])]
weights: = [array([[ 2.2408932 ,  1.86755799, -0.97727788],
       [ 0.95008842, -0.15135721, -0.10321885]]), array([[0.4105985 , 0.14404357]])]


In [37]:
N1.weights,N1.biases,x1

([array([[ 2.2408932 ,  1.86755799, -0.97727788],
         [ 0.95008842, -0.15135721, -0.10321885]]),
  array([[0.4105985 , 0.14404357]])],
 [array([[1.76405235],
         [0.40015721]]), array([[0.97873798]])],
 array([2.3, 4.5, 1.3]))

In [38]:
# z = np.dot(w,a)+b
z0 = np.dot(N1.weights[0], x1)  # linear sum only; the bias is added on the next line
z0,N1.biases[0],z0+N1.biases[0].T

(array([12.28760407,  1.36991142]), array([[1.76405235],
        [0.40015721]]), array([[14.05165642,  1.77006862]]))

In [39]:
N1.biases[0],N1.biases[0].T

(array([[1.76405235],
        [0.40015721]]), array([[1.76405235, 0.40015721]]))

In [40]:
z0 = np.dot(N1.weights[0], x1)  + N1.biases[0].T
z0

array([[14.05165642,  1.77006862]])

In [41]:
heaviside(z0),relu(z0),sigmoid(z0)

(array([[1., 1.]]),
 array([[14.05165642,  1.77006862]]),
 array([[0.99999921, 0.8544662 ]]))

In [42]:
N1.weights[1],np.dot(N1.weights[1], heaviside(z0).T) 

(array([[0.4105985 , 0.14404357]]), array([[0.55464207]]))

In [43]:
z1h = np.dot(N1.weights[1], heaviside(z0).T)  + N1.biases[1].T
z1h

array([[1.53338006]])

In [44]:
z1r = np.dot(N1.weights[1], relu(z0).T)  + N1.biases[1].T
z1r

array([[7.00329406]])

In [45]:
z1s = np.dot(N1.weights[1], sigmoid(z0).T)  + N1.biases[1].T
z1s

array([[1.51241653]])

In [46]:
heaviside(z1h),relu(z1r),sigmoid(z1s)

(array([[1.]]), array([[7.00329406]]), array([[0.81941906]]))

In [47]:
N1.feedforward(x1, heaviside),N1.feedforward(x1, relu),N1.feedforward(x1, sigmoid)

[14.05165642  1.77006862]
[1.53338006]
[1.]
[14.05165642  1.77006862]
[7.00329406]
[7.00329406]
[14.05165642  1.77006862]
[1.51241653]
[0.81941906]


(array([1.]), array([7.00329406]), array([0.81941906]))

In [48]:
N1.feedforward(x2, heaviside),N1.feedforward(x2, relu),N1.feedforward(x2, sigmoid)

[5.1438136  0.81303807]
[1.53338006]
[1.]
[5.1438136  0.81303807]
[3.20789305]
[3.20789305]
[5.1438136  0.81303807]
[1.48674151]
[0.81558869]


(array([1.]), array([3.20789305]), array([0.81558869]))

***

### Neural network 2: single layer (5 -> 3 -> 2)

> input (5), layer (3) , and output (2) 

In [0]:
# inputs (should be numpy array)
x1 = np.array([2.3,4.5,1.3,0.7,2.8])
x2 = np.array([1.3,2.5,4.3,5.1,3.9])
x1,x2

(array([2.3, 4.5, 1.3, 0.7, 2.8]), array([1.3, 2.5, 4.3, 5.1, 3.9]))

In [0]:
np.random.seed(0)
N2 = Network([5,3,2])  # 5X3X2 NN
N2.num_layers # input (5), layer (3) , and output (2) 

3

In [0]:
N2.num_layers,N2.sizes

(3, [5, 3, 2])

In [0]:
N2.show_parameters()

biases: = [array([[1.76405235],
       [0.40015721],
       [0.97873798]]), array([[2.2408932 ],
       [1.86755799]])]
weights: = [array([[-0.97727788,  0.95008842, -0.15135721, -0.10321885,  0.4105985 ],
       [ 0.14404357,  1.45427351,  0.76103773,  0.12167502,  0.44386323],
       [ 0.33367433,  1.49407907, -0.20515826,  0.3130677 , -0.85409574]]), array([[-2.55298982,  0.6536186 ,  0.8644362 ],
       [-0.74216502,  2.26975462, -1.45436567]])]


In [0]:
N2.weights,N2.biases,x1

([array([[-0.97727788,  0.95008842, -0.15135721, -0.10321885,  0.4105985 ],
         [ 0.14404357,  1.45427351,  0.76103773,  0.12167502,  0.44386323],
         [ 0.33367433,  1.49407907, -0.20515826,  0.3130677 , -0.85409574]]),
  array([[-2.55298982,  0.6536186 ,  0.8644362 ],
         [-0.74216502,  2.26975462, -1.45436567]])],
 [array([[1.76405235],
         [0.40015721],
         [0.97873798]]), array([[2.2408932 ],
         [1.86755799]])],
 array([2.3, 4.5, 1.3, 0.7, 2.8]))

In [0]:
N2.weights[0].shape,N2.weights[1].shape,N2.biases[0].shape,N2.biases[1].shape

((3, 5), (2, 3), (3, 1), (2, 1))

In [0]:
# z = np.dot(w,a)+b
z0 = np.dot(N2.weights[0], x1)  # linear sum only; the bias is added on the next line
z0,N2.biases[0],z0+N2.biases[0].T

(array([2.90831699, 9.1928696 , 5.05178036]), array([[1.76405235],
        [0.40015721],
        [0.97873798]]), array([[4.67236934, 9.59302681, 6.03051834]]))

In [0]:
N2.biases[0],N2.biases[0].T

(array([[1.76405235],
        [0.40015721],
        [0.97873798]]), array([[1.76405235, 0.40015721, 0.97873798]]))

In [0]:
z0 = np.dot(N2.weights[0], x1)  + N2.biases[0].T
z0

array([[4.67236934, 9.59302681, 6.03051834]])

In [0]:
heaviside(z0),relu(z0),sigmoid(z0)

(array([[1., 1., 1.]]),
 array([[4.67236934, 9.59302681, 6.03051834]]),
 array([[0.99073652, 0.9999318 , 0.99760152]]))

In [0]:
N2.weights[1],np.dot(N2.weights[1], heaviside(z0).T) 

(array([[-2.55298982,  0.6536186 ,  0.8644362 ],
        [-0.74216502,  2.26975462, -1.45436567]]), array([[-1.03493502],
        [ 0.07322393]]))

In [0]:
z1h = np.dot(N2.weights[1], heaviside(z0).T)  + N2.biases[1]
z1h

array([[1.20595818],
       [1.94078192]])

In [0]:
z1r = np.dot(N2.weights[1], relu(z0).T)  + N2.biases[1]
z1r

array([[ 1.79556092],
       [11.40312698]])

In [0]:
z1s = np.dot(N2.weights[1], sigmoid(z0).T)  + N2.biases[1]
z1s

array([[1.22748983],
       [1.95099042]])

In [0]:
heaviside(z1h),relu(z1r),sigmoid(z1s)

(array([[1.],
        [1.]]), array([[ 1.79556092],
        [11.40312698]]), array([[0.77337893],
        [0.8755546 ]]))

In [0]:
N2.feedforward(x1, heaviside),N2.feedforward(x1, relu),N2.feedforward(x1, sigmoid)

[4.67236934 9.59302681 6.03051834]
[1.20595818 1.94078192]
[1. 1.]
[4.67236934 9.59302681 6.03051834]
[ 1.79556092 11.40312698]
[ 1.79556092 11.40312698]
[4.67236934 9.59302681 6.03051834]
[1.22748983 1.95099042]
[0.77337893 0.8755546 ]


(array([1., 1.]),
 array([ 1.79556092, 11.40312698]),
 array([0.77337893, 0.8755546 ]))

In [0]:
N2.feedforward(x2, heaviside),N2.feedforward(x2, relu),N2.feedforward(x2, sigmoid)

[3.29289416 9.84716903 2.53120365]
[1.20595818 1.94078192]
[1. 1.]
[3.29289416 9.84716903 2.53120365]
[ 2.45852479 18.09304885]
[ 2.45852479 18.09304885]
[3.29289416 9.84716903 2.53120365]
[1.23365244 2.074429  ]
[0.7744572  0.88839286]


(array([1., 1.]),
 array([ 2.45852479, 18.09304885]),
 array([0.7744572 , 0.88839286]))

***

### Neural network 3: deep layers (5 -> [3, 3] -> 1)

> input (5), layer (3,3) , and output (1) 

In [0]:
# inputs (should be numpy array)
x1 = np.array([2.3,4.5,1.3,0.7,2.8])
x2 = np.array([1.3,2.5,4.3,5.1,3.9])
x1,x2

(array([2.3, 4.5, 1.3, 0.7, 2.8]), array([1.3, 2.5, 4.3, 5.1, 3.9]))

In [0]:
np.random.seed(0)
N3 = Network([5,3,3,1])  # 5X[3X3]X1 NN
N3.num_layers # input (5), layer (3,3) , and output (1) 

4

In [0]:
N3.num_layers,N3.sizes

(4, [5, 3, 3, 1])

In [0]:
N3.show_parameters()

biases: = [array([[1.76405235],
       [0.40015721],
       [0.97873798]]), array([[ 2.2408932 ],
       [ 1.86755799],
       [-0.97727788]]), array([[0.95008842]])]
weights: = [array([[-0.15135721, -0.10321885,  0.4105985 ,  0.14404357,  1.45427351],
       [ 0.76103773,  0.12167502,  0.44386323,  0.33367433,  1.49407907],
       [-0.20515826,  0.3130677 , -0.85409574, -2.55298982,  0.6536186 ]]), array([[ 0.8644362 , -0.74216502,  2.26975462],
       [-1.45436567,  0.04575852, -0.18718385],
       [ 1.53277921,  1.46935877,  0.15494743]]), array([[ 0.37816252, -0.88778575, -1.98079647]])]


In [0]:
N3.weights,N3.biases,x1

([array([[-0.15135721, -0.10321885,  0.4105985 ,  0.14404357,  1.45427351],
         [ 0.76103773,  0.12167502,  0.44386323,  0.33367433,  1.49407907],
         [-0.20515826,  0.3130677 , -0.85409574, -2.55298982,  0.6536186 ]]),
  array([[ 0.8644362 , -0.74216502,  2.26975462],
         [-1.45436567,  0.04575852, -0.18718385],
         [ 1.53277921,  1.46935877,  0.15494743]]),
  array([[ 0.37816252, -0.88778575, -1.98079647]])],
 [array([[1.76405235],
         [0.40015721],
         [0.97873798]]), array([[ 2.2408932 ],
         [ 1.86755799],
         [-0.97727788]]), array([[0.95008842]])],
 array([2.3, 4.5, 1.3, 0.7, 2.8]))

In [0]:
# weights
N3.weights[0].shape,N3.weights[1].shape,N3.weights[2].shape

((3, 5), (3, 3), (1, 3))

In [0]:
# biases
N3.biases[0].shape,N3.biases[1].shape,N3.biases[2].shape

((3, 1), (3, 1), (1, 1))

In [0]:
# z = np.dot(w,a) + b (bias transposed to a row so the shapes match)
z0 = np.dot(N3.weights[0], x1)  + N3.biases[0].T
z0

array([[5.65802031, 7.69209719, 0.84839337]])

In [0]:
heaviside(z0),relu(z0),sigmoid(z0)

(array([[1., 1., 1.]]),
 array([[5.65802031, 7.69209719, 0.84839337]]),
 array([[0.99652272, 0.99954379, 0.70023001]]))

In [0]:
N3.weights[1],np.dot(N3.weights[1], heaviside(z0).T) 

(array([[ 0.8644362 , -0.74216502,  2.26975462],
        [-1.45436567,  0.04575852, -0.18718385],
        [ 1.53277921,  1.46935877,  0.15494743]]), array([[ 2.3920258 ],
        [-1.59579101],
        [ 3.15708541]]))

In [0]:
z1h = np.dot(N3.weights[1], heaviside(z0).T)  + N3.biases[1]
z1h

array([[4.632919  ],
       [0.27176698],
       [2.17980753]])

In [0]:
z1r = np.dot(N3.weights[1], relu(z0).T)  + N3.biases[1]
z1r

array([[ 3.34873007],
       [-6.1680991 ],
       [19.12912487]])

In [0]:
z1s = np.dot(N3.weights[1], sigmoid(z0).T)  + N3.biases[1]
z1s

array([[3.94984737],
       [0.33291545],
       [2.12735869]])

In [0]:
z2h = np.dot(N3.weights[2], heaviside(z1h))  + N3.biases[2].T
z2h

array([[-1.54033128]])

In [0]:
z2r = np.dot(N3.weights[2], relu(z1r))  + N3.biases[2].T
z2r

array([[-35.67445036]])

In [0]:
z2s = np.dot(N3.weights[2], sigmoid(z1s))  + N3.biases[2].T
z2s

array([[-0.96591028]])

In [0]:
sigmoid(z1s)

array([[0.98110621],
       [0.58246858],
       [0.893534  ]])

In [0]:
heaviside(z2h),relu(z2r),sigmoid(z2s)

(array([[0.]]), array([[0.]]), array([[0.27569642]]))

In [0]:
N3.feedforward(x1, heaviside),N3.feedforward(x1, relu),N3.feedforward(x1, sigmoid)

[5.65802031 7.69209719 0.84839337]
[4.632919   0.27176698 2.17980753]
[-1.54033128]
[0.]
[5.65802031 7.69209719 0.84839337]
[ 3.34873007 -6.1680991  19.12912487]
[-35.67445036]
[0.]
[5.65802031 7.69209719 0.84839337]
[3.94984737 0.33291545 2.12735869]
[-0.96591028]
[0.27569642]


(array([0.]), array([0.]), array([0.27569642]))

In [0]:
N3.feedforward(x2, heaviside),N3.feedforward(x2, relu),N3.feedforward(x2, sigmoid)

[  9.48110329  11.13095315 -12.64904572]
[2.36316438 0.45895083 2.0248601 ]
[-1.54033128]
[0.]
[  9.48110329  11.13095315 -12.64904572]
[  2.17569802 -11.41209729  29.9105238 ]
[-57.47380405]
[0.]
[  9.48110329  11.13095315 -12.64904572]
[2.3631166  0.45906049 2.02472216]
[-0.99808232]
[0.26931863]


(array([0.]), array([0.]), array([0.26931863]))

***

### [Challenge] Build a network that takes 10 inputs, processes them with two deep (hidden) layers, and produces two final outputs.
> Input: 10, first layer: 5 neurons, second layer: 3 neurons, output: 2

In [0]:
np.random.seed(0)
N4 = Network([10,5,3,2])

In [0]:
N4.sizes

[10, 5, 3, 2]

In [0]:
N4.biases  # 

[array([[1.76405235],
        [0.40015721],
        [0.97873798],
        [2.2408932 ],
        [1.86755799]]), array([[-0.97727788],
        [ 0.95008842],
        [-0.15135721]]), array([[-0.10321885],
        [ 0.4105985 ]])]

In [0]:
N4.weights  # shapes: (5, 10), (3, 5), (2, 3)

[array([[ 0.14404357,  1.45427351,  0.76103773,  0.12167502,  0.44386323,
          0.33367433,  1.49407907, -0.20515826,  0.3130677 , -0.85409574],
        [-2.55298982,  0.6536186 ,  0.8644362 , -0.74216502,  2.26975462,
         -1.45436567,  0.04575852, -0.18718385,  1.53277921,  1.46935877],
        [ 0.15494743,  0.37816252, -0.88778575, -1.98079647, -0.34791215,
          0.15634897,  1.23029068,  1.20237985, -0.38732682, -0.30230275],
        [-1.04855297, -1.42001794, -1.70627019,  1.9507754 , -0.50965218,
         -0.4380743 , -1.25279536,  0.77749036, -1.61389785, -0.21274028],
        [-0.89546656,  0.3869025 , -0.51080514, -1.18063218, -0.02818223,
          0.42833187,  0.06651722,  0.3024719 , -0.63432209, -0.36274117]]),
 array([[-0.67246045, -0.35955316, -0.81314628, -1.7262826 ,  0.17742614],
        [-0.40178094, -1.63019835,  0.46278226, -0.90729836,  0.0519454 ],
        [ 0.72909056,  0.12898291,  1.13940068, -1.23482582,  0.40234164]]),
 array([[-0.68481009, -0.8

In [0]:
N4.show_parameters()

biases: = [array([[1.76405235],
       [0.40015721],
       [0.97873798],
       [2.2408932 ],
       [1.86755799]]), array([[-0.97727788],
       [ 0.95008842],
       [-0.15135721]]), array([[-0.10321885],
       [ 0.4105985 ]])]
weights: = [array([[ 0.14404357,  1.45427351,  0.76103773,  0.12167502,  0.44386323,
         0.33367433,  1.49407907, -0.20515826,  0.3130677 , -0.85409574],
       [-2.55298982,  0.6536186 ,  0.8644362 , -0.74216502,  2.26975462,
        -1.45436567,  0.04575852, -0.18718385,  1.53277921,  1.46935877],
       [ 0.15494743,  0.37816252, -0.88778575, -1.98079647, -0.34791215,
         0.15634897,  1.23029068,  1.20237985, -0.38732682, -0.30230275],
       [-1.04855297, -1.42001794, -1.70627019,  1.9507754 , -0.50965218,
        -0.4380743 , -1.25279536,  0.77749036, -1.61389785, -0.21274028],
       [-0.89546656,  0.3869025 , -0.51080514, -1.18063218, -0.02818223,
         0.42833187,  0.06651722,  0.3024719 , -0.63432209, -0.36274117]]), array([[-0.67246045

### Input data: X
> X : (1,10) -> (10,5) -> (5,3) -> (3,2) -> (1,2)

In [0]:
# np.random.seed(0)
xin1 = np.random.randn(10)
xin2 = np.array([2.3,4.5,1.3,0.7,2.8,3.3,2.2,4.4,6.6,5.5])
xin1,xin2

(array([ 0.90082649,  0.46566244, -1.53624369,  1.48825219,  1.89588918,
         1.17877957, -0.17992484, -1.07075262,  1.05445173, -0.40317695]),
 array([2.3, 4.5, 1.3, 0.7, 2.8, 3.3, 2.2, 4.4, 6.6, 5.5]))

In [0]:
xin1.shape,xin2.shape

((10,), (10,))

In [0]:
N4.feedforward(xin1,heaviside)

[ 3.44311716 -0.22294009 -2.56030369  2.45383914 -0.13826867]
[-3.37602093 -0.35899088 -0.65709247]
[-0.10321885  0.4105985 ]
[-0.10321885  0.4105985 ]
[0. 1.]


array([0., 1.])

In [0]:
N4.feedforward(xin1,relu)

[ 3.44311716 -0.22294009 -2.56030369  2.45383914 -0.13826867]
[-7.5286578  -2.65965466 -0.67107692]
[-0.10321885  0.4105985 ]
[-0.10321885  0.4105985 ]
[0.        0.4105985]


array([0.       , 0.4105985])

In [0]:
N4.feedforward(xin1,sigmoid)

[ 3.44311716 -0.22294009 -2.56030369  2.45383914 -0.13826867]
[-3.35410479 -0.94196144 -0.25557434]
[-0.62324114 -0.09269711]
[-0.62324114 -0.09269711]
[0.34904466 0.4768423 ]


array([0.34904466, 0.4768423 ])

In [0]:
# xin2   # second input
N4.feedforward(xin2,heaviside)

[ 11.81104509  17.10459792   3.81605569 -21.44312867  -3.31124613]
[-2.82243777 -0.61910861  1.84611695]
[-0.68206852 -0.75455134]
[-0.68206852 -0.75455134]
[0. 0.]


array([0., 0.])

In [0]:
N4.feedforward(xin2,relu)

[ 11.81104509  17.10459792   3.81605569 -21.44312867  -3.31124613]
[-18.17276231 -29.91324874  15.01418159]
[ -8.79417283 -17.08317279]
[ -8.79417283 -17.08317279]
[0. 0.]


array([0., 0.])

In [0]:
N4.feedforward(xin2,sigmoid)

[ 11.81104509  17.10459792   3.81605569 -21.44312867  -3.31124613]
[-2.79867422 -0.6272462   1.83572589]
[-0.94490378 -0.59260485]
[-0.94490378 -0.59260485]
[0.27991087 0.3560374 ]


array([0.27991087, 0.3560374 ])

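### [Challenge 2] Build a network that takes 784 inputs, processes them with four deep (hidden) layers, and produces one final output.
> Input: 784, first layer: 100 neurons, second layer: 28 neurons, third layer: 28 neurons, fourth layer: 10 neurons, output: 1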

In [0]:
28*28

784

In [0]:
np.random.seed(0)
N5 = Network([784,100,28,28,10,1])

In [0]:
N5.sizes

[784, 100, 28, 28, 10, 1]

In [0]:
N5.biases  # 

In [0]:
N5.weights  # 

In [0]:
N5.show_parameters()
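Printing every parameter of a network this size produces a very long output. As an alternative, the sketch below lists only the shapes and the total parameter count implied by the `sizes` list, following the same `(y, x)` weight and `(y, 1)` bias layout the `Network` class uses. The names `weight_shapes`, `bias_shapes`, and `n_params` are introduced here for illustration.

```python
sizes = [784, 100, 28, 28, 10, 1]

# (neurons in this layer, neurons in previous layer) for each weight matrix
weight_shapes = [(y, x) for x, y in zip(sizes[:-1], sizes[1:])]
# one bias per neuron in every layer after the input
bias_shapes = [(y, 1) for y in sizes[1:]]
# weights plus biases, summed over all layers
n_params = sum(y * x + y for x, y in zip(sizes[:-1], sizes[1:]))

print(weight_shapes)
print(bias_shapes)
print(n_params)
```

Most of the parameters sit in the first weight matrix, because it connects the 784 inputs to the 100 neurons of the first layer.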

### Input data: X
> X : (1,784) -> (784,100) -> (100,28) -> (28,28) -> (28,10) -> (10,1) -> (1,1)

In [0]:
np.random.seed(0)
xin = np.random.randn(784)
xin

In [0]:
xin.shape

In [0]:
N5.feedforward(xin,heaviside)

In [0]:
N5.feedforward(xin,relu)

In [0]:
N5.feedforward(xin,sigmoid)

***

### Data Representation ((1,n) input, a layer with k neurons)
> (1,n) -> (n,k) -> (1,k)

We represent the input $\mathbf{x}$ as a column vector. $\mathbf{x} =\begin{bmatrix}
    x_1\\
    x_2\\
    \vdots \\
    x_n
\end{bmatrix}$

We represent each layer's weights as a matrix. $$\mathbf{W}^\top =  \begin{bmatrix} { w }_{ 11 } & { w}_{ 12 } & \dots & {w}_{1n}\\ { w }_{ 21 } & { w }_{ 22 } & \dots &{w}_{2n} \\ \vdots & \vdots & \dots & \vdots \\ { w }_{ k1 } & { w }_{ k2 }& \dots & {w}_{kn} \end{bmatrix}$$

For a k-neuron layer, we represent the biases $\mathbf{b}$ as a column vector. $\mathbf{b} =\begin{bmatrix}
    b_1\\
    b_2\\
    \vdots \\
    b_k
\end{bmatrix}$

The weighted sum $\mathbf{z}$ is the matrix product of the weights and the input, plus the biases:

$$ \mathbf{z} = \mathbf{W}^\top \mathbf{x} + \mathbf{b} = \begin{bmatrix} { w }_{ 11 } & { w}_{ 12 } & \dots & {w}_{1n}\\ { w }_{ 21 } & { w }_{ 22 } & \dots &{w}_{2n} \\ \vdots & \vdots & \dots & \vdots \\ { w }_{ k1 } & { w }_{ k2 }& \dots & {w}_{kn} \end{bmatrix} \begin{bmatrix}
    x_1\\
    x_2\\
    \vdots \\
    x_n
\end{bmatrix} + \begin{bmatrix}
    b_1\\
    b_2\\
    \vdots \\
    b_k
\end{bmatrix} $$

Applying the activation function $\phi$ element-wise gives the layer's output:

$$ \mathbf{a} = \phi (\mathbf{W}^\top \mathbf{x} + \mathbf{b}) $$
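The formula can be checked numerically with explicit column vectors. The sketch below uses hypothetical values with n = 3 inputs and k = 2 neurons, chosen so that both weighted sums come out to zero; `W_T` stands for the k-by-n matrix written above.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

x = np.array([[1.0], [2.0], [3.0]])   # input, shape (n, 1) with n = 3
W_T = np.array([[1.0, 0.0, -1.0],
                [0.5, 0.5, 0.5]])     # weights, shape (k, n) with k = 2
b = np.array([[2.0], [-3.0]])         # biases, shape (k, 1)

z = W_T @ x + b                       # weighted sum, shape (k, 1)
a = sigmoid(z)                        # layer output, shape (k, 1)
print(z.shape, a.ravel())
```

Because the input and the biases are both kept as columns here, the sum needs no transpose or flattening, unlike the 1-D inputs used in the cells earlier in this notebook.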

## Data Representation ((1,n) input, deep layer with [k,s,t] neurons, and (1,r) output)
> (1,n) -> (n,k) -> (k,s) -> (s,t) -> (t,r) -> (1,r)
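The shape chain above can be traced in code. This is a sketch under stated assumptions: the sizes `n, k, s, t, r` are hypothetical, the input is a `(1, n)` row vector, and each weight matrix is stored as `(inputs, outputs)` to match the chain as written. The `Network` class stores the transposed layout `(outputs, inputs)`.

```python
import numpy as np

n, k, s, t, r = 4, 3, 5, 2, 1          # hypothetical layer sizes
sizes = [n, k, s, t, r]
rng = np.random.default_rng(0)

a = rng.standard_normal((1, n))        # row-vector input, shape (1, n)
shapes = [a.shape]
for n_in, n_out in zip(sizes[:-1], sizes[1:]):
    W = rng.standard_normal((n_in, n_out))   # weights, shape (n_in, n_out)
    b = rng.standard_normal((1, n_out))      # biases, shape (1, n_out)
    a = np.maximum(0, a @ W + b)             # relu activation
    shapes.append(a.shape)

print(shapes)
```

Each multiplication consumes the previous layer's width and produces the next one, so the activation stays a single row whose length follows the layer sizes.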