# Batch Normalization for a Simple Neural Network 

In this notebook, we implement batch normalization from scratch for a simple neural network. 

We start with forward propagation.
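
For reference, the forward pass can be written out as follows. The notation is an assumption of this note rather than something fixed by the notebook: $N$ is the batch size, $i$ indexes examples in the batch, $j$ indexes the hidden units of the layer, and $\hat{Z}$ is the normalized value called `Z_norm` in the code.

$$
\mu_j = \frac{1}{N}\sum_{i=1}^{N} Z_{ij}, \qquad
\sigma_j^2 = \frac{1}{N}\sum_{i=1}^{N} \left(Z_{ij} - \mu_j\right)^2
$$

$$
\hat{Z}_{ij} = \frac{Z_{ij} - \mu_j}{\sqrt{\sigma_j^2 + \epsilon}}, \qquad
Z^{\text{out}}_{ij} = \gamma_j \hat{Z}_{ij} + \beta_j
$$

Both statistics are taken over the batch dimension, separately for each hidden unit.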

In [1]:
import numpy as np


def batch_norm_forwardprop(Z, gamma, beta, epsilon=1e-8):
    
    '''
    Performs the batch normalization forward pass for one hidden layer of a neural network.
    
    Inputs:
    Z - the linear output of the layer to be normalized, with the batch examples along axis 0
    gamma - the learnable scale parameter
    beta - the learnable shift parameter
    epsilon - small constant added to the variance for numerical stability
    
    Outputs:
    Z_out - the normalized, scaled, and shifted Z
    cache - a tuple of intermediate values needed for backward propagation
    
    '''
    
    #Compute the mean of each unit over the batch
    mean = np.mean(Z, axis=0)
    
    #Compute the variance of each unit over the batch (same axis as the mean)
    variance = np.mean((Z - mean)**2, axis=0)
    
    #Normalize Z with the mean and standard deviation
    Z_norm = (Z - mean) / np.sqrt(variance + epsilon)
    
    #Scale and shift the normalized Z with the learnable parameters gamma and beta
    Z_out = (gamma * Z_norm) + beta
    
    #Store the intermediate values in the cache
    cache = (Z, Z_norm, Z_out, mean, variance, gamma, beta)
    
    return Z_out, cache
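
A quick sanity check of the forward pass: with `gamma` set to ones and `beta` set to zeros, every column of the output should have a mean close to 0 and a standard deviation close to 1. The sketch below restates the function compactly so that it runs on its own. It assumes `Z` has shape `(N, D)` with the batch along axis 0 and takes both statistics per unit; the batch size, number of units, and random seed are illustrative choices only.

```python
import numpy as np


def batch_norm_forwardprop(Z, gamma, beta, epsilon=1e-8):
    # Compact version of the forward pass, with per-unit statistics (axis=0)
    mean = np.mean(Z, axis=0)
    variance = np.mean((Z - mean) ** 2, axis=0)
    Z_norm = (Z - mean) / np.sqrt(variance + epsilon)
    Z_out = gamma * Z_norm + beta
    return Z_out, (Z, Z_norm, Z_out, mean, variance, gamma, beta)


# Illustrative input: 64 examples, 4 hidden units, far from zero mean and unit variance
rng = np.random.default_rng(0)
Z = rng.normal(loc=5.0, scale=3.0, size=(64, 4))

# Identity scale and shift, so the output is just the normalized Z
gamma = np.ones(4)
beta = np.zeros(4)

Z_out, cache = batch_norm_forwardprop(Z, gamma, beta)

col_mean = Z_out.mean(axis=0)
col_std = Z_out.std(axis=0)
print("per-unit mean:", col_mean)
print("per-unit std: ", col_std)
```

With other values of `gamma` and `beta`, the per-unit standard deviation and mean of `Z_out` move to roughly `|gamma|` and `beta`, which is the purpose of the two learnable parameters.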


Next, we implement backward propagation.
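
The gradients follow from applying the chain rule to the forward pass. As a reference, and using notation that is an assumption of this note ($L$ is the loss, $N$ the batch size, $i$ indexes examples, $j$ indexes hidden units, $\hat{Z}$ is `Z_norm`), they can be written as:

$$
\frac{\partial L}{\partial \hat{Z}_{ij}} = \frac{\partial L}{\partial Z^{\text{out}}_{ij}}\,\gamma_j
$$

$$
\frac{\partial L}{\partial \sigma_j^2} = -\frac{1}{2}\left(\sigma_j^2 + \epsilon\right)^{-3/2}\sum_{i=1}^{N} \frac{\partial L}{\partial \hat{Z}_{ij}}\left(Z_{ij} - \mu_j\right)
$$

$$
\frac{\partial L}{\partial \mu_j} = -\frac{1}{\sqrt{\sigma_j^2 + \epsilon}}\sum_{i=1}^{N} \frac{\partial L}{\partial \hat{Z}_{ij}} + \frac{\partial L}{\partial \sigma_j^2}\cdot\frac{1}{N}\sum_{i=1}^{N} -2\left(Z_{ij} - \mu_j\right)
$$

$$
\frac{\partial L}{\partial Z_{ij}} = \frac{\partial L}{\partial \hat{Z}_{ij}}\,\frac{1}{\sqrt{\sigma_j^2 + \epsilon}} + \frac{\partial L}{\partial \sigma_j^2}\cdot\frac{2\left(Z_{ij} - \mu_j\right)}{N} + \frac{\partial L}{\partial \mu_j}\cdot\frac{1}{N}
$$

$$
\frac{\partial L}{\partial \gamma_j} = \sum_{i=1}^{N} \frac{\partial L}{\partial Z^{\text{out}}_{ij}}\,\hat{Z}_{ij}, \qquad
\frac{\partial L}{\partial \beta_j} = \sum_{i=1}^{N} \frac{\partial L}{\partial Z^{\text{out}}_{ij}}
$$

Note that every gradient depends on $\partial L / \partial Z^{\text{out}}$, the gradient arriving from the layers after batch normalization, not on $Z^{\text{out}}$ itself.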

In [2]:
def batch_norm_backwardprop(dZ_out, cache, epsilon=1e-8):
    
    '''
    Performs the batch normalization backward pass for one hidden layer of a neural network.
    
    Inputs:
    dZ_out - gradient of the loss with respect to Z_out, the output of the forward propagation
    cache - the values stored during forward propagation
    epsilon - the same stability constant used in forward propagation
    
    Outputs:
    dZ - gradient of Z
    dgamma - gradient of gamma
    dbeta - gradient of beta
    
    '''
    
    #Retrieve the stored values from cache
    Z, Z_norm, Z_out, mean, variance, gamma, beta = cache
    
    #Retrieve the batch size N and the number of units D
    N, D = Z.shape
    
    #We proceed to calculate the gradients. The formulas can be obtained from the original
    #batch normalization paper, or derived by hand with the chain rule.
    
    #Center Z and compute the inverse standard deviation
    Z_mean = Z - mean
    inverted_sd = 1. / np.sqrt(variance + epsilon)

    #Gradient with respect to the normalized Z
    dZ_norm = dZ_out * gamma
    #Gradients with respect to the batch variance and the batch mean
    dvar = np.sum(dZ_norm * Z_mean, axis=0) * -.5 * inverted_sd**3
    dmean = np.sum(dZ_norm * -inverted_sd, axis=0) + dvar * np.mean(-2. * Z_mean, axis=0)

    #Gradients with respect to the input Z and the parameters gamma and beta
    dZ = (dZ_norm * inverted_sd) + (dvar * 2 * Z_mean / N) + (dmean / N)
    dgamma = np.sum(dZ_out * Z_norm, axis=0)
    dbeta = np.sum(dZ_out, axis=0)

    return dZ, dgamma, dbeta
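
One way to gain confidence in a hand-written backward pass is a numerical gradient check. The sketch below is an illustration under stated assumptions, not part of the original notebook: it restates both functions compactly so it runs on its own, takes the upstream gradient as a first argument named `dZ_out`, and uses a made-up scalar loss `sum(Z_out * W)` whose gradient with respect to `Z_out` is simply `W`. The helper `numerical_grad` is hypothetical and uses central differences.

```python
import numpy as np


def batch_norm_forwardprop(Z, gamma, beta, epsilon=1e-8):
    # Compact forward pass with per-unit statistics (axis=0)
    mean = np.mean(Z, axis=0)
    variance = np.mean((Z - mean) ** 2, axis=0)
    Z_norm = (Z - mean) / np.sqrt(variance + epsilon)
    Z_out = gamma * Z_norm + beta
    return Z_out, (Z, Z_norm, Z_out, mean, variance, gamma, beta)


def batch_norm_backwardprop(dZ_out, cache, epsilon=1e-8):
    # Compact backward pass; dZ_out is the gradient of the loss w.r.t. Z_out
    Z, Z_norm, Z_out, mean, variance, gamma, beta = cache
    N = Z.shape[0]
    Z_mean = Z - mean
    inverted_sd = 1.0 / np.sqrt(variance + epsilon)
    dZ_norm = dZ_out * gamma
    dvar = np.sum(dZ_norm * Z_mean, axis=0) * -0.5 * inverted_sd ** 3
    dmean = np.sum(dZ_norm * -inverted_sd, axis=0) + dvar * np.mean(-2.0 * Z_mean, axis=0)
    dZ = dZ_norm * inverted_sd + dvar * 2 * Z_mean / N + dmean / N
    dgamma = np.sum(dZ_out * Z_norm, axis=0)
    dbeta = np.sum(dZ_out, axis=0)
    return dZ, dgamma, dbeta


def numerical_grad(loss_fn, x, h=1e-5):
    # Hypothetical helper: central-difference gradient of loss_fn() w.r.t. array x,
    # perturbing x in place one entry at a time and restoring it afterwards
    grad = np.zeros_like(x)
    for idx in np.ndindex(x.shape):
        old = x[idx]
        x[idx] = old + h
        loss_plus = loss_fn()
        x[idx] = old - h
        loss_minus = loss_fn()
        x[idx] = old
        grad[idx] = (loss_plus - loss_minus) / (2 * h)
    return grad


# Illustrative sizes: 8 examples, 3 hidden units
rng = np.random.default_rng(0)
Z = rng.normal(size=(8, 3))
gamma = rng.normal(size=3)
beta = rng.normal(size=3)
W = rng.normal(size=(8, 3))  # fixed weights defining the toy loss


def loss():
    # Toy scalar loss; its gradient with respect to Z_out is W
    Z_out, _ = batch_norm_forwardprop(Z, gamma, beta)
    return np.sum(Z_out * W)


# Analytic gradients from the backward pass
_, cache = batch_norm_forwardprop(Z, gamma, beta)
dZ, dgamma, dbeta = batch_norm_backwardprop(W, cache)

# Numerical gradients for comparison
dZ_num = numerical_grad(loss, Z)
dgamma_num = numerical_grad(loss, gamma)
dbeta_num = numerical_grad(loss, beta)

print("max |dZ - dZ_num|:        ", np.max(np.abs(dZ - dZ_num)))
print("max |dgamma - dgamma_num|:", np.max(np.abs(dgamma - dgamma_num)))
print("max |dbeta - dbeta_num|:  ", np.max(np.abs(dbeta - dbeta_num)))
```

If the analytic and numerical gradients agree to within a small tolerance, the backward pass is consistent with the forward pass. A large mismatch usually points to a wrong axis in one of the sums or to using `Z_out` where the upstream gradient is needed.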

There you have it - Batch normalization implemented from scratch! 