# Deep Convolutional GANs (DCGANs)

## Activation Functions

In a neural network, the $i$-th node of layer $l$ computes its activation $a_i^{(l)}$ as

$$
a_i^{(l)} = g^{(l)}(z_i^{(l)})
$$

with the **activation function** $g^{(l)}$ and

$$
z_i^{(l)} = \sum_j W_{ij}^{(l)} a_j^{(l-1)}
$$
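As a rough illustration, the weighted sum and the activation can be sketched in plain Python. The function name `dense_forward` and the example weights are made up for this sketch, and the bias term is omitted to match the formula.

```python
import math

def dense_forward(W, a_prev, g):
    """Compute one layer's activations from the previous layer's activations.

    W      : list of rows, W[i][j] is the weight from node j (layer l-1) to node i (layer l)
    a_prev : activations of layer l-1
    g      : activation function applied element-wise
    """
    # z_i = sum over j of W[i][j] * a_prev[j]  (no bias, as in the formula)
    z = [sum(w_ij * a_j for w_ij, a_j in zip(row, a_prev)) for row in W]
    # a_i = g(z_i)
    return [g(z_i) for z_i in z]

# Hypothetical 2x2 weight matrix and input, chosen only for illustration
W = [[0.5, -1.0], [2.0, 1.0]]
a_prev = [1.0, 2.0]
a = dense_forward(W, a_prev, math.tanh)
```

Passing a different `g` (for example a ReLU) changes only the last step; the weighted sum stays the same.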

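Activation functions should be:

1. **Non-linear**, so that the network can approximate complex functions (a stack of purely linear layers collapses into a single linear map)
2. **Differentiable** (at least almost everywhere), so that gradients can be computed for backpropagation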

## Common Activation Functions

### ReLU

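ReLU (Rectified Linear Unit) is defined as

$$
g(z) = \max(0, z)
$$

It suffers from the **dying ReLU problem**: for $z < 0$ both the output and the gradient are zero, so a node whose inputs stay negative stops receiving weight updates.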
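A minimal sketch of ReLU and its gradient makes the zero-gradient region visible. The helper names are illustrative, and the gradient at exactly $z = 0$ is set to 0 by convention (an assumption, since the derivative is undefined there).

```python
def relu(z):
    """ReLU: passes positive inputs through, clamps negative inputs to 0."""
    return max(0.0, z)

def relu_grad(z):
    """Derivative of ReLU: 1 for z > 0, 0 otherwise (0 at z == 0 by convention)."""
    return 1.0 if z > 0 else 0.0

# For a negative pre-activation, output and gradient are both zero,
# so no learning signal flows back through this node.
dead_output = relu(-2.0)
dead_grad = relu_grad(-2.0)
```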

### Leaky ReLU

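Leaky ReLU addresses the dying ReLU problem by giving negative inputs a small slope:

$$
g(z) = \max(\alpha z, z)
$$

The slope $\alpha$, with $0 \le \alpha \le 1$, is typically small (for example 0.01), so negative inputs still receive a small non-zero gradient.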
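The following sketch shows Leaky ReLU and its gradient. The default `alpha=0.01` is an assumed example value, not one prescribed by these notes.

```python
def leaky_relu(z, alpha=0.01):
    """Leaky ReLU: identity for z >= 0, slope alpha for z < 0 (assumes 0 <= alpha <= 1)."""
    return max(alpha * z, z)

def leaky_relu_grad(z, alpha=0.01):
    """Derivative of Leaky ReLU: 1 for z > 0, alpha otherwise."""
    return 1.0 if z > 0 else alpha

# Unlike plain ReLU, a negative input still yields a non-zero gradient.
neg_grad = leaky_relu_grad(-2.0, alpha=0.1)
```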

### Sigmoid

Sigmoid squashes its input to values between **0** and **1**:

$$
g(z) = \frac{1}{1 + e^{-z}}
$$

Sigmoid **saturates** for large $|z|$: the output flattens toward 0 or 1 and the gradient approaches zero, which causes the **vanishing gradient** problem.
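To see the saturation numerically, here is a small sketch of sigmoid and its derivative $g'(z) = g(z)(1 - g(z))$. The branch on the sign of `z` is a common numerical-stability trick, added here as an implementation choice.

```python
import math

def sigmoid(z):
    """Sigmoid, written so that math.exp never receives a large positive argument."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def sigmoid_grad(z):
    """Derivative of sigmoid: g(z) * (1 - g(z)); its maximum is 0.25 at z = 0."""
    s = sigmoid(z)
    return s * (1.0 - s)

# The gradient peaks at z = 0 and shrinks rapidly as |z| grows.
grad_center = sigmoid_grad(0.0)
grad_saturated = sigmoid_grad(10.0)
```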

### Tanh

Tanh outputs values between **-1** and **1** and, unlike sigmoid, keeps the sign of the input:

$$
g(z) = \tanh(z)
$$

Like sigmoid, tanh saturates for large $|z|$ and therefore also suffers from vanishing gradients.
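A short sketch of the tanh derivative, $g'(z) = 1 - \tanh^2(z)$, illustrates both properties. The helper name `tanh_grad` is chosen for this example.

```python
import math

def tanh_grad(z):
    """Derivative of tanh: 1 - tanh(z)^2; its maximum is 1 at z = 0."""
    t = math.tanh(z)
    return 1.0 - t * t

# The sign of the input is preserved in the output ...
neg_out = math.tanh(-2.0)
pos_out = math.tanh(2.0)
# ... but the gradient shrinks toward zero once the input is large.
grad_saturated = tanh_grad(5.0)
```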


## Batch Normalization

Batch normalization:

1. **Smooths** the cost function
2. **Reduces** the internal **covariate shift**, i.e. the change in the distribution of a layer's inputs during training
3. **Speeds up learning**
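
These notes do not spell out the computation, so the following is only a sketch of the standard training-time formula for a single feature: subtract the mini-batch mean, divide by the mini-batch standard deviation, then scale and shift with learnable parameters. The names `gamma`, `beta`, and `eps` follow common convention and are assumptions here; running statistics for inference are left out.

```python
import math

def batch_norm(batch, gamma=1.0, beta=0.0, eps=1e-5):
    """Normalize one feature across a mini-batch using the batch's own statistics.

    gamma, beta : learnable scale and shift (fixed here for illustration)
    eps         : small constant that avoids division by zero
    """
    n = len(batch)
    mean = sum(batch) / n
    var = sum((x - mean) ** 2 for x in batch) / n  # biased (population) variance
    return [gamma * (x - mean) / math.sqrt(var + eps) + beta for x in batch]

# Hypothetical pre-activations of one node over a mini-batch of four samples
out = batch_norm([1.0, 2.0, 3.0, 4.0])
out_mean = sum(out) / len(out)
out_var = sum((o - out_mean) ** 2 for o in out) / len(out)
```

With `gamma=1.0` and `beta=0.0` the output has mean close to 0 and variance close to 1, regardless of the scale of the input batch.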