# Deep Neural Network

### Table of Contents

* [1. Deep L-layer Neural Network](#chapter1)
    * [1.1 What is a Deep Neural Network?](#section_1_1)
    * [1.2 Deep Neural Network Notation](#section_1_2)
    * [1.3 Matrix Representation](#section_1_3)
        * [1.3.1 m = 1 Example](#section_1_3_1)
        * [1.3.2 Vectorized (m examples)](#section_1_3_2)
* [2. Building a Deep Neural Network](#chapter2)
    * [2.1 Forward Propagation](#section_2_1)
    * [2.2 Backward Propagation](#section_2_2)
    * [2.3 Building Blocks of a Deep Neural Network](#section_2_3)
* [3. Parameters vs Hyperparameters](#chapter3)
   


# 1. Deep L-layer Neural Network <a class="anchor" id="chapter1"></a>

## 1.1 What is a Deep Neural Network? <a class="anchor" id="section_1_1"></a>

<center><img src="images/05-Deep Neural network/deep-neural-network.PNG" width = "500px"></center>

> Logistic regression:

- shallow neural network
- Neural Network with 1 layer :  Output layer

> 1 hidden layer:

- shallow neural network
- 2 Layers 
    - 1 hidden layer
    - 1 output layer

Shallow versus deep is a matter of degree: the more hidden layers a network has, the deeper it is.

## 1.2 Deep Neural Network Notation <a class="anchor" id="section_1_2"></a>

<center><img src="images/05-Deep Neural network/notation-deep-neural.PNG" width = "500px"></center>

> In this example we have a 4-layer neural network:

- L : number of layers
- n<sup>[l]</sup> : number of nodes/units in layer l
- a<sup>[l]</sup> : activations in layer l
- W<sup>[l]</sup> : weights used to compute Z<sup>[l]</sup>
- b<sup>[l]</sup> : bias used to compute Z<sup>[l]</sup>
- g<sup>[l]</sup> : activation function of layer l

$$ a^{[l]} = g^{[l]}(Z^{[l]})$$

> In our example:

- n<sup>[0]</sup> = n<sub>X</sub> = 3 | n<sup>[1]</sup> = 5 | n<sup>[2]</sup> = 5 | n<sup>[3]</sup> = 3 | n<sup>[4]</sup> = 1
- L = 4
- X = a<sup>[0]</sup>



## 1.3 Matrix Representation <a class="anchor" id="section_1_3"></a>

### 1.3.1 m = 1 Example <a class="anchor" id="section_1_3_1"></a>

<center><img src="images/05-Deep Neural network/matrix-example.PNG" width = "400px"></center>

> Neural Network with m=1 example:

- L = 5
- n<sup>[0]</sup> = n<sub>X</sub> = 2
- n<sup>[1]</sup> = 3
- n<sup>[2]</sup> = 5 
- n<sup>[3]</sup> = 4 
- n<sup>[4]</sup> = 2
- n<sup>[5]</sup> = 1

$$X = X^{(1)} = \begin{bmatrix} x_{1}\\x_{2}\end{bmatrix} \in (n_X \times 1) = (2 \times 1)$$

> Layer 1 :

$$\begin{cases}
    Z^{[1]} = W^{[1]} X + b^{[1]} \\
    (3,1) = (3,2) (2,1) + (3,1)   \\
    (n^{[1]},1) = (n^{[1]},n^{[0]}) (n^{[0]},1) + (n^{[1]},1)
\end{cases}
$$


> Layer 2 : 

$$\begin{cases}
    Z^{[2]} = W^{[2]} A^{[1]} + b^{[2]} \\
    (5,1) = (5,3) (3,1) + (5,1)   \\
    (n^{[2]},1) = (n^{[2]},n^{[1]}) (n^{[1]},1) + (n^{[2]},1)
\end{cases}
$$


> Layer 3 :

$$\begin{cases}
    Z^{[3]} = W^{[3]} A^{[2]} + b^{[3]} \\
    (4,1) = (4,5) (5,1) + (4,1)   \\
    (n^{[3]},1) = (n^{[3]},n^{[2]}) (n^{[2]},1) + (n^{[3]},1)
\end{cases}
$$


##### Recap m=1 example

> Forward propagation parameters (shapes) :

$$\begin{cases}
    W^{[l]} : (n^{[l]},n^{[l-1]})\\
    b^{[l]} : (n^{[l]},1)
\end{cases}
$$


> Backward propagation gradients (same shapes as the parameters) :

$$\begin{cases}
    dW^{[l]} : (n^{[l]},n^{[l-1]})\\
    db^{[l]} : (n^{[l]},1)
\end{cases}
$$
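The shape rules above can be checked with a short NumPy sketch. The helper name `initialize_parameters`, the `W1`, `b1`, ... dictionary keys, and the 0.01 scaling are assumptions made for illustration, not something these notes prescribe.

```python
import numpy as np

def initialize_parameters(layer_dims, seed=0):
    """Return a dict with W1..WL and b1..bL for the given layer sizes."""
    rng = np.random.default_rng(seed)
    parameters = {}
    for l in range(1, len(layer_dims)):
        # W[l] has shape (n[l], n[l-1]); small random values break symmetry.
        parameters[f"W{l}"] = rng.standard_normal((layer_dims[l], layer_dims[l - 1])) * 0.01
        # b[l] has shape (n[l], 1).
        parameters[f"b{l}"] = np.zeros((layer_dims[l], 1))
    return parameters

# n[0]..n[5] from the m = 1 example above.
layer_dims = [2, 3, 5, 4, 2, 1]
parameters = initialize_parameters(layer_dims)

for l in range(1, len(layer_dims)):
    print(f"layer {l}: W {parameters[f'W{l}'].shape}, b {parameters[f'b{l}'].shape}")
```

Printing the shapes layer by layer is a cheap way to catch a transposed weight matrix before running any training.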


### 1.3.2 Vectorized (m examples) <a class="anchor" id="section_1_3_2"></a>

$$X=\begin{bmatrix} .. & .. & .. & ..\\ .. & .. & .. & .. \\ X^{(1)} & X^{(2)} & .. & X^{(m)}  \\.. & .. & .. & ..\\ .. & .. & .. & .. \end{bmatrix} \in (n_X \times m)$$

> Layer 1 :

$$Z^{[1]} = W^{[1]} X + b^{[1]}$$

$$Z^{[1]} = W^{[1]} X + b^{[1]} =\begin{bmatrix} .. & .. & .. & ..\\ .. & .. & .. & .. \\ Z^{[1](1)} & Z^{[1](2)} & .. & Z^{[1](m)}   \\.. & .. & .. & ..\\ .. & .. & .. & .. \end{bmatrix} \in (n^{[1]} \times m)$$

$$\begin{cases}
    Z^{[1]} = W^{[1]} X + b^{[1]} \\
    (3,m) = (3,2) (2,m) + (3,1)   \\
    (n^{[1]},m) = (n^{[1]},n^{[0]}) (n^{[0]},m) + (n^{[1]},1)
\end{cases}
$$


##### Recap m examples:

$$\begin{cases}
    Z^{[l]},A^{[l]} : (n^{[l]},m) \\
    W^{[l]}: (n^{[l]},n^{[l-1]})   \\
    b^{[l]}: (n^{[l]},1)  \\
    dZ^{[l]},dA^{[l]} : (n^{[l]},m) \\
    dW^{[l]} : (n^{[l]},n^{[l-1]})   \\
    db^{[l]}: (n^{[l]},1)
\end{cases}
$$
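In the vectorized form, b<sup>[1]</sup> keeps its (n<sup>[1]</sup>, 1) shape and is added to every column of W<sup>[1]</sup>X. A minimal NumPy sketch, assuming random values and m = 4 purely for illustration, shows that broadcasting gives the same result as processing one example at a time:

```python
import numpy as np

m = 4  # number of examples (arbitrary choice for this sketch)
rng = np.random.default_rng(1)

X = rng.standard_normal((2, m))   # (n[0], m)
W1 = rng.standard_normal((3, 2))  # (n[1], n[0])
b1 = rng.standard_normal((3, 1))  # (n[1], 1)

# Vectorized: b1 is broadcast across the m columns.
Z1 = W1 @ X + b1                  # (n[1], m)

# Same computation for the first example only, kept as a column vector.
z_first = W1 @ X[:, [0]] + b1     # (n[1], 1)

print(Z1.shape)
print(np.allclose(Z1[:, [0]], z_first))
```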


# 2. Building Deep Neural Network <a class="anchor" id="chapter2"></a>

## 2.1 Forward Propagation <a class="anchor" id="section_2_1"></a>

<center><img src="images/05-Deep Neural network/forward-prop.png" width = "600px"></center>

$$
\begin{cases}
    A^{[0]} = X \\
    Z^{[l]} = W^{[l]} A^{[l-1]} + b^{[l]} \\
    A^{[l]} = g^{[l]}(Z^{[l]})
\end{cases}
$$
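The forward recurrence can be sketched as a loop over the layers. This is only an illustration: the choice of ReLU for hidden layers and sigmoid for the output layer, the function name `forward_propagation`, and the random inputs are assumptions, since the notes leave g<sup>[l]</sup> unspecified.

```python
import numpy as np

def sigmoid(Z):
    return 1.0 / (1.0 + np.exp(-Z))

def relu(Z):
    return np.maximum(0.0, Z)

def forward_propagation(X, weights, biases):
    """Apply Z[l] = W[l] A[l-1] + b[l], A[l] = g[l](Z[l]) for l = 1..L."""
    A = X  # A[0] = X
    num_layers = len(weights)
    for l, (W, b) in enumerate(zip(weights, biases), start=1):
        Z = W @ A + b
        # Assumption: sigmoid on the output layer, ReLU on hidden layers.
        A = sigmoid(Z) if l == num_layers else relu(Z)
    return A

# Layer sizes of the 4-layer network from section 1.2.
layer_dims = [3, 5, 5, 3, 1]
rng = np.random.default_rng(0)
weights = [rng.standard_normal((layer_dims[l], layer_dims[l - 1])) * 0.01
           for l in range(1, len(layer_dims))]
biases = [np.zeros((layer_dims[l], 1)) for l in range(1, len(layer_dims))]

X = rng.standard_normal((3, 10))  # m = 10 examples
AL = forward_propagation(X, weights, biases)
print(AL.shape)
```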


## 2.2 Backward Propagation <a class="anchor" id="section_2_2"></a>

<center><img src="images/05-Deep Neural network/backward-prop.png" width = "600px"></center>

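> Output layer L (sigmoid output with cross-entropy loss), then any layer l. The sum runs over the m examples and * is the element-wise product:

$$
\begin{cases}
    dZ^{[L]} =  A^{[L]} - Y \\
    dW^{[l]} = \frac{1}{m} dZ^{[l]}A^{[l-1]T} \\
    db^{[l]} = \frac{1}{m} \sum_{i=1}^{m} dZ^{[l](i)}    \\
    dZ^{[l-1]} = W^{[l]T}dZ^{[l]} * g^{[l-1]'}(Z^{[l-1]})
\end{cases}
$$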

$$
\begin{cases}
    dZ^{[1]} = W^{[2]T}dZ^{[2]} * g^{[1]'}(Z^{[1]}) \\
    dW^{[1]} = \frac{1}{m} dZ^{[1]} X^T \\
    db^{[1]} = \frac{1}{m} \sum  dZ^{[1]}
\end{cases}
$$
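As a sanity check, the backward formulas can be written out for a small 2-layer network. This sketch assumes a tanh hidden layer (so g<sup>[1]'</sup>(Z<sup>[1]</sup>) = 1 - A<sup>[1]</sup>²), a sigmoid output, and the cross-entropy cost; the function names and the random data are illustrative only.

```python
import numpy as np

def sigmoid(Z):
    return 1.0 / (1.0 + np.exp(-Z))

def forward(X, W1, b1, W2, b2):
    A1 = np.tanh(W1 @ X + b1)
    A2 = sigmoid(W2 @ A1 + b2)
    return A1, A2

def cost(A2, Y):
    """Cross-entropy cost averaged over the m examples."""
    m = Y.shape[1]
    return float(-np.sum(Y * np.log(A2) + (1 - Y) * np.log(1 - A2)) / m)

def backward(X, Y, W2, A1, A2):
    m = X.shape[1]
    dZ2 = A2 - Y                                  # output layer
    dW2 = dZ2 @ A1.T / m
    db2 = np.sum(dZ2, axis=1, keepdims=True) / m  # sum over examples
    dZ1 = (W2.T @ dZ2) * (1 - A1 ** 2)            # tanh'(Z1) = 1 - A1^2
    dW1 = dZ1 @ X.T / m
    db1 = np.sum(dZ1, axis=1, keepdims=True) / m
    return dW1, db1, dW2, db2

rng = np.random.default_rng(0)
X = rng.standard_normal((2, 5))
Y = (rng.random((1, 5)) > 0.5).astype(float)
W1, b1 = rng.standard_normal((3, 2)) * 0.5, np.zeros((3, 1))
W2, b2 = rng.standard_normal((1, 3)) * 0.5, np.zeros((1, 1))

A1, A2 = forward(X, W1, b1, W2, b2)
dW1, db1, dW2, db2 = backward(X, Y, W2, A1, A2)
print(dW1.shape, db1.shape, dW2.shape, db2.shape)
```

Note that `keepdims=True` keeps db<sup>[l]</sup> as an (n<sup>[l]</sup>, 1) column, matching the shape of b<sup>[l]</sup>.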


## 2.3 Building Blocks of a Deep Neural Network <a class="anchor" id="section_2_3"></a>

Each layer has a forward propagation step and a corresponding backward propagation step. The forward step stores a cache (for example Z<sup>[l]</sup>) that passes information to the backward step.

<center><img src="images/05-Deep Neural network/forw-backw.png" width = "600px"></center>
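The forward/backward pairing with a cache might look like the sketch below. It assumes ReLU hidden layers, a sigmoid output with cross-entropy loss, and a cache holding (A<sup>[l-1]</sup>, W<sup>[l]</sup>, Z<sup>[l]</sup>); all function names and the cache layout are hypothetical choices for illustration.

```python
import numpy as np

def sigmoid(Z):
    return 1.0 / (1.0 + np.exp(-Z))

def layer_forward(A_prev, W, b, activation):
    """Forward block: returns A[l] and the cache needed by the backward block."""
    Z = W @ A_prev + b
    A = sigmoid(Z) if activation == "sigmoid" else np.maximum(0.0, Z)
    return A, (A_prev, W, Z)

def layer_backward(dZ, cache):
    """Backward block: returns dA[l-1], dW[l], db[l] from dZ[l] and the cache."""
    A_prev, W, _ = cache
    m = A_prev.shape[1]
    dW = dZ @ A_prev.T / m
    db = np.sum(dZ, axis=1, keepdims=True) / m
    return W.T @ dZ, dW, db

def model_forward(X, parameters):
    L = len(parameters) // 2
    A, caches = X, []
    for l in range(1, L + 1):
        activation = "sigmoid" if l == L else "relu"
        A, cache = layer_forward(A, parameters[f"W{l}"], parameters[f"b{l}"], activation)
        caches.append(cache)
    return A, caches

def model_backward(AL, Y, caches):
    grads = {}
    dZ = AL - Y  # sigmoid output with cross-entropy loss
    for l in range(len(caches), 0, -1):
        dA_prev, grads[f"dW{l}"], grads[f"db{l}"] = layer_backward(dZ, caches[l - 1])
        if l > 1:
            Z_prev = caches[l - 2][2]     # Z[l-1] stored during the forward pass
            dZ = dA_prev * (Z_prev > 0)   # ReLU derivative
    return grads

layer_dims = [3, 5, 5, 3, 1]
rng = np.random.default_rng(0)
parameters = {}
for l in range(1, len(layer_dims)):
    parameters[f"W{l}"] = rng.standard_normal((layer_dims[l], layer_dims[l - 1])) * 0.1
    parameters[f"b{l}"] = np.zeros((layer_dims[l], 1))

X = rng.standard_normal((3, 6))
Y = rng.integers(0, 2, size=(1, 6)).astype(float)
AL, caches = model_forward(X, parameters)
grads = model_backward(AL, Y, caches)
print(sorted(grads))
```

Storing Z<sup>[l]</sup> during the forward pass avoids recomputing it when the backward pass needs g<sup>[l]'</sup>(Z<sup>[l]</sup>).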

# 3. Parameters vs Hyperparameters <a class="anchor" id="chapter3"></a>

Hyperparameters are settings chosen before training. They control how the parameters W and b are learned.

> Parameters:

- W<sup>[1]</sup>, b<sup>[1]</sup>
- W<sup>[2]</sup>, b<sup>[2]</sup>
- ...
- W<sup>[L]</sup>, b<sup>[L]</sup>

> Hyperparameters:

- learning rate
- number of iterations
- number of hidden layers
- number of nodes in each hidden layer
- choice of activation function
- ...
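The distinction can be sketched in code: hyperparameters are fixed by hand, while parameters change at every gradient descent step. The dictionary layout, the helper `update_parameters`, and all numeric values below are made up for illustration.

```python
import numpy as np

# Hyperparameters: chosen by hand before training (values are arbitrary here).
hyperparameters = {
    "learning_rate": 0.1,
    "layer_dims": [2, 2],
}

def update_parameters(parameters, grads, learning_rate):
    """One gradient descent step: W := W - alpha * dW, b := b - alpha * db."""
    return {name: value - learning_rate * grads["d" + name]
            for name, value in parameters.items()}

# Parameters: learned from data. The gradients are placeholders, not real ones.
parameters = {"W1": np.ones((2, 2)), "b1": np.zeros((2, 1))}
grads = {"dW1": np.full((2, 2), 0.5), "db1": np.ones((2, 1))}

updated = update_parameters(parameters, grads, hyperparameters["learning_rate"])
print(updated["W1"])
print(updated["b1"])
```

Changing `learning_rate` changes how far W and b move in a single step, which is the sense in which hyperparameters control the parameters.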