We will use gradient descent to train a network on graduate school admissions data found at [UCLA's Home Page](http://www.ats.ucla.edu/stat/data/binary.csv).<br><br>
This dataset has three input features: GRE score, GPA, and the rank of the undergraduate school (numbered 1 through 4).<br><br>
Institutions with rank 1 have the highest prestige, those with rank 4 have the lowest.<br><br>
The goal is to train the network until the mean squared error (MSE) on the training set reaches a minimum.

You will need to implement:

- The network output: **output**.
- The output error: **error**.
- The error term: **error_term**.
- The weight step update: **del_w +=**.
- The weight update: **weights +=**.
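
As a rough guide, the five quantities above can be written as update rules for a single-layer network with a sigmoid output. This is a sketch rather than a full derivation, and the symbols $\hat{y}$ (output), $\delta$ (error term), $\eta$ (learning rate) and $m$ (number of records) are notation assumed here, not names used in the code.

$$
\begin{aligned}
\hat{y} &= \sigma(\mathbf{w} \cdot \mathbf{x}) && \text{(output)} \\
e &= y - \hat{y} && \text{(error)} \\
\delta &= e \, \hat{y} \, (1 - \hat{y}) && \text{(error\_term)} \\
\Delta \mathbf{w} &\leftarrow \Delta \mathbf{w} + \delta \, \mathbf{x} && \text{(del\_w, accumulated over all records)} \\
\mathbf{w} &\leftarrow \mathbf{w} + \frac{\eta}{m} \, \Delta \mathbf{w} && \text{(weights, once per epoch)}
\end{aligned}
$$

The factor $\hat{y}(1 - \hat{y})$ is the derivative of the sigmoid evaluated at the current output.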

In [1]:
# Importing
import numpy as np
from data_prep import features, targets, features_test, targets_test

Defining the sigmoid activation function

In [2]:
def sigmoid(x):
    return 1/(1 + np.exp(-x))
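
The error term in the training loop further down multiplies the error by `output * (1 - output)`, which is the derivative of the sigmoid. The sketch below checks that identity numerically with a central difference. The helper name `sigmoid_prime`, the sample points and the step size `h` are assumptions made for this illustration only.

```python
import numpy as np

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

def sigmoid_prime(x):
    # Analytic derivative: sigma(x) * (1 - sigma(x))
    s = sigmoid(x)
    return s * (1 - s)

# Sample points and step size are arbitrary choices for the check
x = np.linspace(-5, 5, 11)
h = 1e-5

# Central-difference approximation of the derivative
numeric = (sigmoid(x + h) - sigmoid(x - h)) / (2 * h)

# Largest disagreement between the numeric and analytic derivative
max_gap = np.max(np.abs(numeric - sigmoid_prime(x)))
print(max_gap)
```

The printed gap should be very small, which suggests the analytic form is safe to use in place of a numerical derivative.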

Using a fixed random seed to make debugging easier

In [3]:
np.random.seed(42)

# Defining number of records and features
n_records, n_features = features.shape
last_loss = None

Initializing the weights.
- The weights should be small so that the input to the sigmoid stays in the roughly linear region near 0 and is not squashed at the high and low ends.
- They should also be initialized randomly so that they all start from different values and diverge during training, breaking symmetry.
- So, we'll initialize the weights from a normal distribution centered at 0.
- A good value for the scale is 1/√n, where n is the number of input units.
- This keeps the input to the sigmoid small as the number of input units increases.
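
To illustrate the scaling argument above, the following sketch compares the spread of the sigmoid input for a scale of 1 and a scale of 1/√n. It assumes standardized inputs (zero mean, unit variance) drawn at random, which is a simplification and not the admissions data. The function name `input_std` and the trial count are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)  # seed chosen arbitrarily for this sketch

def input_std(n_inputs, scale, n_trials=2000):
    """Standard deviation of the weighted sum x . w over many random draws."""
    # Assumption: inputs are standardized (zero mean, unit variance)
    x = rng.normal(size=(n_trials, n_inputs))
    w = rng.normal(scale=scale, size=(n_trials, n_inputs))
    h = np.sum(x * w, axis=1)  # one sigmoid input per trial
    return h.std()

for n in (4, 64, 1024):
    # With scale 1 the spread grows like sqrt(n);
    # with scale 1/sqrt(n) it stays close to 1
    print(n, input_std(n, 1.0), input_std(n, 1 / n**0.5))
```

Under these assumptions, the unscaled weights push the sigmoid input further into the saturated regions as n grows, while the 1/√n scale keeps it in a similar range regardless of n.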

In [4]:
# Initializing weights
weights = np.random.normal(scale = 1/n_features**.5, size = n_features)

Setting hyperparameters:
- Number of epochs
- Learning rate

In [6]:
epochs = 1000
learnrate = 0.5

Neural Network

In [7]:
for e in range(epochs):
    del_w = np.zeros(weights.shape)
    for x, y in zip(features.values, targets):
        # Calculate the output
        output = sigmoid(np.dot(x, weights))
        
        # Calculate the error
        error = y - output
        
        # Calculate error term
        error_term = error * output * (1 - output)
        
        # Calculate the change in weights
        del_w += error_term * x
        
    # Update weights
    weights += learnrate * del_w / n_records
    
    # Printing out the mean square error on the training set
    if e % (epochs / 10) == 0:
        out = sigmoid(np.dot(features, weights))
        loss = np.mean((out - targets) ** 2)
        if last_loss and last_loss < loss:
            print("Train loss: ", loss, " Warning - Loss Increasing")
        else:
            print("Train loss: ", loss)
        last_loss = loss

('Train loss: ', 0.66914315930396695)
('Train loss: ', 0.32499833553900465)
('Train loss: ', 0.32499833326578603)
('Train loss: ', 0.32499833099293657)
('Train loss: ', 0.32499832872045081)
('Train loss: ', 0.32499832644832349)
('Train loss: ', 0.32499832417654906)
('Train loss: ', 0.32499832190512223)
('Train loss: ', 0.32499831963403758)
('Train loss: ', 0.32499831736328971)
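
The inner loop over records accumulates `del_w` one row at a time. As a possible alternative, the same sum can be computed with a single matrix product. The sketch below compares the two on toy data. The arrays `X`, `y` and `w` are random stand-ins for `features.values`, `targets` and `weights`, not the admissions data.

```python
import numpy as np

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

rng = np.random.default_rng(0)            # arbitrary seed for the toy data
X = rng.normal(size=(8, 3))               # stand-in for features.values
y = rng.integers(0, 2, size=8)            # stand-in for targets (0 or 1)
w = rng.normal(scale=1 / 3**0.5, size=3)  # stand-in for weights

# Per-record accumulation, as in the training loop
del_w_loop = np.zeros(w.shape)
for x_i, y_i in zip(X, y):
    out = sigmoid(np.dot(x_i, w))
    del_w_loop += (y_i - out) * out * (1 - out) * x_i

# Vectorized form: one error term per record, then a weighted sum of rows
out = sigmoid(X @ w)
error_term = (y - out) * out * (1 - out)
del_w_vec = error_term @ X

same = np.allclose(del_w_loop, del_w_vec)
print(same)
```

The two forms agree up to floating-point rounding. The vectorized one avoids the Python-level loop, which tends to matter more as the number of records grows.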


Calculating accuracy

In [8]:
# Calculate accuracy on test data
test_out = sigmoid(np.dot(features_test, weights))
# Classify as admitted when the predicted probability exceeds 0.5
predictions = test_out > 0.5
accuracy = np.mean(predictions == targets_test)
print("Predictions accuracy: {:.3f}".format(accuracy))

Predictions accuracy: 0.750


  
