# C1_Notes_W3 - Deep Neural Networks

> Understand the key computations underlying deep learning, use them to build and train deep neural networks, and apply them to computer vision.

## Table of contents
  * [1. Deep L-layer neural network](#deep-l-layer-neural-network)
  * [2. Forward Propagation in a Deep Network](#forward-propagation-in-a-deep-network)
  * [3. Getting your matrix dimensions right](#getting-your-matrix-dimensions-right)
  * [4. Why deep representations?](#why-deep-representations)
  * [5. Building blocks of deep neural networks](#building-blocks-of-deep-neural-networks)
  * [6. Forward and Backward Propagation](#forward-and-backward-propagation)
  * [7. Parameters vs Hyperparameters](#parameters-vs-hyperparameters)
  * [8. What does this have to do with the brain](#what-does-this-have-to-do-with-the-brain)
  * [9. Extra: Ian Goodfellow interview](#extra-ian-goodfellow-interview)


# 1. Deep L-layer neural network

- Shallow NN is a NN with one or two layers.
- Deep NN is a NN with three or more layers.
- We will use the notation `L` to denote the number of layers in a NN.
- `n[l]` is the number of neurons in a specific layer `l`.
- `n[0]` denotes the number of inputs or features, `n_x`.
- `n[L]` denotes the number of neurons in the output layer.
- `g[l]` is the activation function for layer `l`.
- `a[l] = g[l](z[l])`
- `W[l]` and `b[l]` are the weights and bias used to compute `z[l]`.
- `x = a[0]`, `a[L] = y'`
- This is the notation we will use for deep neural networks.
- So we have:
  - A vector `n` of shape `(1, NoOfLayers+1)`
  - A vector `g` of shape `(1, NoOfLayers)`
  - A list of different shapes `w` based on the number of neurons on the previous and the current layer.
  - A list of different shapes `b` based on the number of neurons on the current layer.
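
The notation above can be sketched in NumPy as follows. This is a minimal illustration, not the course's reference code: the function name `initialize_parameters`, the `layer_dims` list (standing in for the vector `n`), the `0.01` scaling, and the dictionary keys `W1`, `b1`, ... are all assumptions made for this example.

```python
import numpy as np

def initialize_parameters(layer_dims, seed=0):
    """Build W[l] and b[l] for l = 1..L.

    layer_dims[l] plays the role of n[l]; layer_dims[0] is n[0] = n_x.
    (Hypothetical helper, named here for illustration only.)
    """
    rng = np.random.default_rng(seed)
    L = len(layer_dims) - 1  # number of layers; the input layer is not counted
    params = {}
    for l in range(1, L + 1):
        # W[l] has shape (n[l], n[l-1]); small random values break symmetry
        params[f"W{l}"] = rng.standard_normal((layer_dims[l], layer_dims[l - 1])) * 0.01
        # b[l] has shape (n[l], 1)
        params[f"b{l}"] = np.zeros((layer_dims[l], 1))
    return params

layer_dims = [5, 4, 3, 1]  # n[0]=5 inputs, two hidden layers, n[L]=1 output
params = initialize_parameters(layer_dims)
for name, value in params.items():
    print(name, value.shape)
```

Note that `layer_dims` has `L + 1` entries, matching the shape `(1, NoOfLayers+1)` given for `n`.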

# 2. Forward Propagation in a Deep Network

- Forward propagation general rule for one input:

  ```
  z[l] = W[l]a[l-1] + b[l]
  a[l] = g[l](z[l])
  ```

- Forward propagation general rule for `m` inputs:

  ```
  Z[l] = W[l]A[l-1] + B[l]
  A[l] = g[l](Z[l])
  ```

- We can't compute forward propagation across all layers without a for loop over `l`, **so it's OK to have a for loop here.**
- The matrix dimensions are important; work them out for every layer.
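
A minimal sketch of the vectorized rule with the explicit loop over layers. The names `forward`, `relu`, `sigmoid`, and the parameter dictionary layout are assumptions for this example, and the choice of ReLU for hidden layers and sigmoid for the output is only one possible configuration.

```python
import numpy as np

def relu(Z):
    return np.maximum(0, Z)

def sigmoid(Z):
    return 1.0 / (1.0 + np.exp(-Z))

def forward(X, params, activations):
    """Run Z[l] = W[l]A[l-1] + b[l], A[l] = g[l](Z[l]) for l = 1..L.

    activations[l-1] is g[l]. Returns A[L] and a list of (A[l-1], Z[l]) pairs.
    """
    A = X  # A[0] = X
    caches = []
    for l in range(1, len(activations) + 1):  # the explicit loop over layers
        W, b = params[f"W{l}"], params[f"b{l}"]
        Z = W @ A + b          # b of shape (n[l], 1) broadcasts across the m columns
        caches.append((A, Z))  # keep A[l-1] and Z[l] for the backward pass
        A = activations[l - 1](Z)
    return A, caches

# Toy setup (assumed sizes): n = [5, 4, 3, 1], m = 7 examples
rng = np.random.default_rng(0)
dims = [5, 4, 3, 1]
params = {}
for l in range(1, len(dims)):
    params[f"W{l}"] = rng.standard_normal((dims[l], dims[l - 1])) * 0.01
    params[f"b{l}"] = np.zeros((dims[l], 1))

X = rng.standard_normal((5, 7))
AL, caches = forward(X, params, [relu, relu, sigmoid])
print(AL.shape)  # (1, 7)
```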

![](images/c1w2n_basicnn8.png)

# 3. Getting your matrix dimensions right

- The best way to debug your matrix dimensions is with pencil and paper.
- Dimension of `W[l]` is `(n[l], n[l-1])`. Read it right to left: the layer maps `n[l-1]` inputs to `n[l]` outputs.
- Dimension of `b[l]` is `(n[l], 1)`.
- `dW[l]` has the same shape as `W[l]`, and `db[l]` has the same shape as `b[l]`.
- VECTORIZATION: Dimension of `Z[l]`, `A[l]`, `dZ[l]`, and `dA[l]` is `(n[l], m)`.
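
The same checks can be done in code with assertions. The sketch below is illustrative only: `check_shapes` is a hypothetical helper, the layer sizes are made up, and `tanh` is an arbitrary stand-in for `g[l]`, since any element-wise activation preserves the shape.

```python
import numpy as np

def check_shapes(layer_dims, m):
    """Walk through the layers and assert the shapes listed above."""
    rng = np.random.default_rng(0)
    A = rng.standard_normal((layer_dims[0], m))  # A[0] = X, shape (n[0], m)
    shapes = []
    for l in range(1, len(layer_dims)):
        W = rng.standard_normal((layer_dims[l], layer_dims[l - 1]))
        b = np.zeros((layer_dims[l], 1))
        assert W.shape == (layer_dims[l], layer_dims[l - 1])
        assert b.shape == (layer_dims[l], 1)
        Z = W @ A + b                        # (n[l], n[l-1]) @ (n[l-1], m) -> (n[l], m)
        assert Z.shape == (layer_dims[l], m)
        A = np.tanh(Z)                       # element-wise, so A[l] keeps the shape of Z[l]
        assert A.shape == Z.shape
        shapes.append(A.shape)
    return shapes

shapes = check_shapes([5, 4, 3, 1], m=10)
print(shapes)  # [(4, 10), (3, 10), (1, 10)]
```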

# 4. Why deep representations?

- This section discusses why deep NNs work well.
- A deep NN builds relations in the data from simple to complex. Each layer builds on the relations found by the previous layer. E.g.:
  - 1) Face recognition application:
      - Image ==> Edges ==> Face parts ==> Faces ==> desired face
  - 2) Audio recognition application:
      - Audio ==> Low level sound features like (sss,bb) ==> Phonemes ==> Words ==> Sentences
![](images/c1w3n_deeprnn1.png)
- Intuitively, you can think of the earlier layers of the neural network learning simpler functions, and the later layers building on these simple patterns to learn more complex functions.
- Some researchers think that deep neural networks "think" like brains (simple ==> complex).
- Circuit theory and deep learning:
  - ![](Images/07.png)
  - In the above example, you are computing the XOR (parity) of n inputs: the output is 1 when an odd number of the inputs `x1` to `xn` are 1.
  - If you only have a single hidden layer (on the right), you need exponentially many hidden units, on the order of `2^(n-1)`, to enumerate the input combinations. With multiple layers (on the left), a tree of XOR gates of depth `O(log n)` does the job with far fewer units.
- When starting on an application, don't start directly with dozens of hidden layers. Try the simplest solutions first (e.g. logistic regression), then a shallow neural network, and so on.
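
To make the circuit-theory argument concrete, the sketch below computes the parity of `n` bits with a balanced tree of two-input XOR gates and counts the gates and depth. For comparison it counts the odd-parity input patterns, on the assumption that a single-hidden-layer construction needs one unit per such pattern. The function name `parity_tree` and the sample bits are made up for illustration.

```python
from itertools import product

def parity_tree(bits):
    """XOR all bits with a balanced tree of 2-input XOR gates.

    Returns (result, depth, gate_count).
    """
    level = list(bits)
    depth = 0
    gates = 0
    while len(level) > 1:
        # Pair up neighbours; each pair costs one XOR gate
        nxt = [level[i] ^ level[i + 1] for i in range(0, len(level) - 1, 2)]
        gates += len(nxt)
        if len(level) % 2 == 1:
            nxt.append(level[-1])  # odd element passes through to the next level
        level = nxt
        depth += 1
    return level[0], depth, gates

n = 8
bits = [1, 0, 1, 1, 0, 0, 1, 0]
result, depth, gates = parity_tree(bits)

# Shallow alternative (assumed construction): one unit per odd-parity pattern
odd_patterns = sum(1 for p in product([0, 1], repeat=n) if sum(p) % 2 == 1)

print(result, depth, gates, odd_patterns)  # 0 3 7 128
```

For 8 inputs the tree needs 7 gates across 3 levels, while the enumeration grows as `2^(n-1)`.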

# 5. Building blocks of deep neural networks

- Forward and back propagation for a layer l:
  - ![Untitled](Images/10.png)
- One iteration of training is as follows:
  - ![](Images/08.png)

# 6. Forward and Backward Propagation

- Pseudo code for forward propagation for layer l:

  ```
  Input  A[l-1]
  Z[l] = W[l]A[l-1] + b[l]
  A[l] = g[l](Z[l])
  Output A[l], cache(Z[l])
  ```

- Pseudo code for back propagation for layer l:

  ```
  Input dA[l], Caches
  dZ[l] = dA[l] * g'[l](Z[l])         # Multiplication here (*) is element-wise multiplication!
  dW[l] = (dZ[l]A[l-1].T) / m
  db[l] = sum(dZ[l])/m                # Don't forget axis=1, keepdims=True
  dA[l-1] = W[l].T * dZ[l]            # Multiplication here (*) is a matrix (dot) product!
  Output dA[l-1], dW[l], db[l]
  ```
- If the loss is the logistic (cross-entropy) loss, the backward pass starts from:

  ```
  dA[L] = (-(y/a) + ((1-y)/(1-a)))
  ```
![](images/c1w3n_deeprnn10.png)  
![](images/c1w3n_deeprnn9.png)
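
The backward pseudo code for one layer could look like this in NumPy. It is a sketch under assumptions: the cache is taken to hold `(A_prev, W, Z)` (the pseudo code only lists `Z[l]`, but `A[l-1]` and `W[l]` are also needed), ReLU is used as the example activation, and the names `layer_backward` and `relu_grad` are hypothetical.

```python
import numpy as np

def relu_grad(Z):
    """Derivative of ReLU, g'(Z): 1 where Z > 0, else 0."""
    return (Z > 0).astype(float)

def layer_backward(dA, cache, g_prime):
    """Backward step for one layer.

    cache is assumed to hold (A_prev, W, Z) saved during the forward pass.
    """
    A_prev, W, Z = cache
    m = A_prev.shape[1]
    dZ = dA * g_prime(Z)                           # element-wise product
    dW = (dZ @ A_prev.T) / m                       # same shape as W
    db = np.sum(dZ, axis=1, keepdims=True) / m     # same shape as b: (n[l], 1)
    dA_prev = W.T @ dZ                             # matrix product
    return dA_prev, dW, db

# Toy sizes (assumed): n[l-1] = 4, n[l] = 3, m = 6
rng = np.random.default_rng(0)
n_prev, n_l, m = 4, 3, 6
A_prev = rng.standard_normal((n_prev, m))
W = rng.standard_normal((n_l, n_prev))
b = np.zeros((n_l, 1))
Z = W @ A_prev + b
dA = rng.standard_normal((n_l, m))

dA_prev, dW, db = layer_backward(dA, (A_prev, W, Z), relu_grad)
print(dA_prev.shape, dW.shape, db.shape)  # (4, 6) (3, 4) (3, 1)
```

Using `@` for the matrix products and `*` for the element-wise one keeps the two kinds of multiplication visibly distinct.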


# 7. Parameters vs Hyperparameters

- The main parameters of the NN are `W` and `b`.
- Hyperparameters (parameters that control the algorithm) include:
  - Learning rate.
  - Number of iterations.
  - Number of hidden layers `L`.
  - Number of hidden units `n`.
  - Choice of activation functions.
- You have to try out hyperparameter values yourself.
- In the earlier days of DL and ML, the learning rate was often called a parameter, but it really is (and now everybody calls it) a hyperparameter.
- The next course covers how to tune hyperparameters.
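
The distinction can be seen in a short training loop: the hyperparameters are fixed by hand before training, while `W` and `b` are updated by gradient descent. This is a toy sketch, assuming a single sigmoid unit (no hidden layers) and synthetic data; the `hyperparams` dictionary and its values are made up for illustration.

```python
import numpy as np

# Hyperparameters: chosen by us, not learned
hyperparams = {"learning_rate": 0.1, "num_iterations": 200}

# Synthetic data (assumed): 2 features, 100 examples, label is 1 when x1 + x2 > 0
rng = np.random.default_rng(0)
X = rng.standard_normal((2, 100))
Y = (X[0:1] + X[1:2] > 0).astype(float)
m = X.shape[1]

# Parameters: learned from the data
W = np.zeros((1, 2))
b = np.zeros((1, 1))

costs = []
for _ in range(hyperparams["num_iterations"]):
    A = 1.0 / (1.0 + np.exp(-(W @ X + b)))                    # forward pass
    costs.append(float(-np.mean(Y * np.log(A) + (1 - Y) * np.log(1 - A))))
    dZ = A - Y                                                # sigmoid + cross-entropy gradient
    dW = (dZ @ X.T) / m
    db = np.sum(dZ, axis=1, keepdims=True) / m
    W = W - hyperparams["learning_rate"] * dW                 # update the parameters
    b = b - hyperparams["learning_rate"] * db

print(costs[0], costs[-1])
```

Changing `learning_rate` or `num_iterations` changes how training proceeds, but neither value is ever touched by the update step itself.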

# 8. What does this have to do with the brain

- The analogy "it is like the brain" has become an oversimplified explanation.
- There is a very simplistic analogy between a single logistic unit and a single neuron in the brain.
- Nobody today fully understands how a neuron in the human brain works.
- Nobody today knows exactly how many neurons are in the brain.
- In Andrew's opinion, deep learning is very good at learning very flexible, complex functions that map X to Y, i.e. input-output mappings (supervised learning).
- The field of computer vision has taken a bit more inspiration from the human brain than other disciplines that also apply deep learning.
- A NN is a small representation of how the brain works. The closest model to the human brain is in computer vision (CNN).

# 9. Extra: Ian Goodfellow interview

- Ian is one of the world's most visible deep learning researchers.
- Ian is mainly working with generative models. He is the creator of GANs.
- We need to stabilize GANs. Stabilized GANs can become the best generative models.
- Ian wrote the first textbook on the modern version of deep learning with Yoshua Bengio and Aaron Courville.
- Ian worked with [OpenAI.com](OpenAI.com) and Google on ML and NN applications.
- Ian advises everyone who wants to get into AI to get a Ph.D. or to post their code on GitHub, where companies will find them.
- Ian thinks that we need to start anticipating security problems with ML now and make sure that these algorithms are secure from the start, instead of trying to patch security in retroactively years later.