How weight initialization affects forward and backward passes of a deep neural network

From Andrej Karpathy's course cs231n: CNNs for Visual Recognition

How does weight initialization affect the forward and backward passes of a deep neural network?

All plots were generated from one full forward pass across all 10 layers of the network, using the same activation function at every layer.


There are 10 layers, with 500 units each; a minimal sketch of the setup follows below.
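The following is a minimal NumPy sketch of this setup, not the repository's actual code: one forward pass through 10 layers of 500 units, recording the mean and standard deviation of the activations at each layer. The function name `forward_stats` and the default tanh/unit-Gaussian choices are illustrative placeholders; both are varied further down.

```python
import numpy as np

num_layers = 10   # hidden layers
layer_size = 500  # units per layer

def forward_stats(X, act=np.tanh, weight_scale=1.0):
    """Run one full forward pass, returning per-layer activation (means, stds)."""
    means, stds = [], []
    H = X
    for _ in range(num_layers):
        fan_in = H.shape[1]
        # weights drawn from a unit Gaussian, then scaled
        W = np.random.randn(fan_in, layer_size) * weight_scale
        H = act(H.dot(W))  # affine transform followed by the nonlinearity
        means.append(H.mean())
        stds.append(H.std())
    return means, stds
```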

Activation Functions

Tanh, ReLU, and Sigmoid were used.
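As a rough sketch, the three nonlinearities can be written as plain NumPy functions and passed to `forward_stats` above. These are the standard definitions; the repository's own code may differ.

```python
import numpy as np

def tanh(x):
    # hyperbolic tangent, squashes to (-1, 1)
    return np.tanh(x)

def relu(x):
    # rectified linear unit, zeroes out negative inputs
    return np.maximum(0.0, x)

def sigmoid(x):
    # logistic sigmoid, squashes to (0, 1)
    return 1.0 / (1.0 + np.exp(-x))
```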


1000 random training examples are drawn from a univariate normal (Gaussian) distribution with mean 0 and variance 1. The weights of each layer are initially drawn from the same distribution, and are then varied to produce the different plots.
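A sketch of the data generation and the weight-scale sweep, reusing `forward_stats` from above. The input dimensionality of 500 (chosen here to match the layer width) and the particular scales swept (unit Gaussian, a small 0.01 scale, and a Xavier-style 1/sqrt(fan_in)) are assumptions for illustration, not necessarily the exact values behind the plots.

```python
import numpy as np

# 1000 training examples drawn from N(0, 1); the 500-dim input is assumed
X = np.random.randn(1000, 500)

for scale in (1.0, 0.01, 1.0 / np.sqrt(500)):
    means, stds = forward_stats(X, act=np.tanh, weight_scale=scale)
    print(f"scale={scale:.4f}  last-layer std={stds[-1]:.4f}")
```

With a unit-Gaussian scale, tanh activations saturate near ±1; with a very small scale such as 0.01, they collapse toward zero; the Xavier-style scale roughly preserves the activation variance across layers, which is the effect the plots visualize.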