How weight initialization affects forward and backward passes of a deep neural network
From Andrej Karpathy's Stanford course CS231n: Convolutional Neural Networks for Visual Recognition.

How does weight initialization affect the forward and backward passes of a deep neural network?

All plots were generated with a single full forward pass through all 10 layers of the network, using the same activation function at every layer.

Architecture

The network has 10 layers, each with 500 units.

Activation Functions

Tanh, ReLU, and Sigmoid were used.
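
As a rough sketch, the three nonlinearities can be written in NumPy as follows (these helper names are illustrative and not taken from weight_init.py):

```python
import numpy as np

def tanh(x):
    # squashes inputs to (-1, 1); zero-centered
    return np.tanh(x)

def relu(x):
    # zeroes out negative inputs; unbounded above
    return np.maximum(0, x)

def sigmoid(x):
    # squashes inputs to (0, 1); not zero-centered
    return 1.0 / (1.0 + np.exp(-x))
```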

Data

1000 training examples are drawn from a univariate standard normal (Gaussian) distribution with mean 0 and variance 1. The weights of each layer are initially drawn from the same distribution, then rescaled in different ways to produce the different plots.
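
A minimal sketch of the experiment, assuming NumPy and the tanh activation (the `scale` factor below is the quantity varied between plots; all names are illustrative rather than taken from weight_init.py):

```python
import numpy as np

num_layers, hidden_size = 10, 500
scale = 0.01  # weight scale; varying this value produces the different plots

# 1000 training examples drawn from a unit Gaussian, N(0, 1)
X = np.random.randn(1000, hidden_size)

h = X
layer_stats = []
for layer in range(num_layers):
    # weights drawn from the same unit Gaussian as the data, then rescaled
    W = np.random.randn(hidden_size, hidden_size) * scale
    h = np.tanh(h.dot(W))  # one full layer of the forward pass
    layer_stats.append((h.mean(), h.std()))

# per-layer activation statistics of the kind the plots visualize
for layer, (mean, std) in enumerate(layer_stats):
    print('layer %2d: mean %+.6f, std %.6f' % (layer + 1, mean, std))
```

With a small scale such as 0.01, the tanh activations collapse toward zero in the deeper layers, while a large scale saturates them at ±1; both failure modes show up clearly in per-layer activation histograms.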