
### How a Neural Network and Its Components Work

Artificial neural networks (ANNs) are computational models inspired by neural networks in the brain.  ANNs are composed of three types of layers: input, hidden and output layers.  The input layer is the only layer exposed directly to the dataset and receives its values.  The hidden layer is not exposed directly to the data and consists of weights and calculations used to analyze the data.  The output layer represents the results or output of the model.  Each layer is composed of nodes.  The complete architecture of an ANN can be visualized in a Node Map.

The links between nodes in different layers represent weights.  In a feed-forward NN, each layer affects the next layer through a weighted sum of its inputs plus a bias term.  Good weights and biases for a NN can be found with gradient descent, provided there is a loss function that scores the predictions against the y values of the training data.
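As a minimal sketch of the "weighted sum of inputs plus a bias" idea, the snippet below computes the values one layer hands to the next. The function name `dense_forward` and all numbers are illustrative assumptions, not taken from the notebook.

```python
import numpy as np

def dense_forward(x, W, b):
    """Weighted sum of the inputs plus a bias for one layer (no activation yet)."""
    return x @ W + b

# Illustrative values only: 3 inputs feeding a layer of 2 nodes.
x = np.array([1.0, 2.0, 3.0])
W = np.array([[0.1, 0.2],
              [0.3, 0.4],
              [0.5, 0.6]])   # one column of weights per receiving node
b = np.array([0.5, -0.5])    # one bias per receiving node

z = dense_forward(x, W, b)
print(z)
```

Each column of `W` holds the weights on the links arriving at one node, so the result has one value per node in the receiving layer.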

In a NN, each node has an activation function.  The activation function decides how much signal to pass to the next layer.
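To make "how much signal to pass" concrete, here is a small sketch of two common activation functions. The sigmoid matches the one used later in this notebook; ReLU is included only for comparison and is an assumption, not something the notebook uses.

```python
import numpy as np

def sigmoid(z):
    """Squashes any real number into the range (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

def relu(z):
    """Passes positive values unchanged and blocks negative ones."""
    return np.maximum(0.0, z)

z = np.array([-2.0, 0.0, 2.0])   # example weighted sums
sig_out = sigmoid(z)
relu_out = relu(z)
print(sig_out)
print(relu_out)
```

A strongly negative weighted sum produces a sigmoid output near 0 (little signal passed), while a strongly positive one produces an output near 1.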

A perceptron is a simple neural network that takes input values, multiplies them by weights, sums the products, adds a bias, passes the result through an activation function and outputs a final value.
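The description above can be sketched in a few lines. This example assumes a step activation and uses hand-picked (not learned) weights and bias that make the perceptron behave like logical AND; the names `step` and `perceptron` are hypothetical.

```python
import numpy as np

def step(z):
    """Step activation: 1 when the weighted sum is non-negative, else 0."""
    return np.where(z >= 0, 1, 0)

def perceptron(x, weights, bias):
    """Weighted sum of the inputs plus a bias, passed through the activation."""
    return step(np.dot(x, weights) + bias)

# Hand-picked values (an assumption for illustration) that implement AND.
weights = np.array([1.0, 1.0])
bias = -1.5

inputs = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
outputs = perceptron(inputs, weights, bias)
print(outputs)  # [0 0 0 1]
```

Only the input `[1, 1]` produces a weighted sum large enough to overcome the negative bias, so it is the only one that outputs 1.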


### Basics of Backpropagation

In order to evaluate a NN's performance, data is "fed forward" until predictions are obtained and then the "loss" or "error" for a given observation is ascertained by looking at what the network predicted for that observation and comparing it to what it should have predicted.

The error for a given observation is calculated by taking the square of the difference between the predicted value and the actual value.  The overall quality of a network's predictions can then be measured by averaging the error across all observations. This gives us the "Mean Squared Error."
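A short sketch of that calculation, using made-up predictions rather than values from the notebook:

```python
import numpy as np

# Illustrative values only.
y_true = np.array([0.0, 1.0, 1.0])   # what the network should have predicted
y_pred = np.array([0.5, 0.5, 1.0])   # what the network actually predicted

squared_errors = (y_true - y_pred) ** 2   # per-observation error
mse = squared_errors.mean()               # average across all observations
print(squared_errors)
print(mse)
```

The per-observation errors here are 0.25, 0.25 and 0, so the mean squared error is one sixth.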

An "epoch" is one cycle of passing our data forward through the network, measuring the error given our specified cost function, and then, via gradient descent, updating weights within our network to hopefully improve the quality of our predictions on the next iteration.
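The loop below is a minimal sketch of that cycle on the simplest possible "network": a single weight with no activation function, fitted to data generated by `y = 2 * x`. The learning rate, starting weight and epoch count are assumptions chosen for illustration.

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0])
y = np.array([2.0, 4.0, 6.0])   # generated by y = 2 * x

w = 0.0                # single weight, arbitrary starting value
learning_rate = 0.05   # assumed step size
losses = []

for epoch in range(20):
    pred = w * x                         # pass the data forward
    loss = np.mean((pred - y) ** 2)      # measure error with the cost function (MSE)
    grad = np.mean(2 * (pred - y) * x)   # slope of the loss with respect to w
    w -= learning_rate * grad            # gradient descent update
    losses.append(loss)

print(w)
print(losses[0], losses[-1])
```

Each pass through the loop is one epoch: the loss shrinks as `w` moves toward 2.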

Backpropagation refers to a specific algorithm for how weights in a neural network are updated in reverse order at the end of each training epoch.

4 steps for backpropagation:

1) Calculate the error for each observation.

2) Does the error indicate that I'm overestimating or underestimating in my prediction?

3) Look at the final layer weights to get an idea of which weights are helping pass desirable signals and which are stifling desirable signals.

4) Also go to the previous layer and see what can be done to boost activations that are associated with helpful weights, and limit activations that are associated with unhelpful weights.
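As a rough sketch, the first three steps can be traced on a single sigmoid neuron. Step 4 repeats the same idea one layer further back, which a one-neuron example cannot show. All values below are made up for illustration.

```python
import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

# Illustrative values: one observation, one sigmoid neuron, no bias.
x = np.array([1.0, 0.0, 1.0])
w = np.array([0.2, -0.4, 0.1])
y = 1.0

o = sigmoid(np.dot(x, w))        # feed forward

error = y - o                    # step 1: error for this observation
direction = np.sign(error)       # step 2: +1 means underestimating, -1 overestimating
delta = error * o * (1 - o)      # scale the error by the sigmoid's slope at the output
w_new = w + x * delta            # step 3: adjust each weight in proportion to its input

new_o = sigmoid(np.dot(x, w_new))
print(error, y - new_o)
```

Weights attached to inputs of 0 are left untouched, since they contributed nothing to the prediction, and the error after the update is smaller than before.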

### Build and Train a Perceptron Using Numpy

In [0]:
import numpy as np

In [0]:
# I want activations that correspond to negative weights to be lower
# and activations that correspond to positive weights to be higher

class NeuralNetwork:
    def __init__(self):
        # Set up Architecture of Neural Network
        self.inputs = 3
        self.hiddenNodes = 4
        self.outputNodes = 1

        # Initial Weights
        # 3x4 Matrix Array for the First Layer
        self.weights1 = np.random.rand(self.inputs, self.hiddenNodes)
       
        # 4x1 Matrix Array for Hidden to Output
        self.weights2 = np.random.rand(self.hiddenNodes, self.outputNodes)
        
    def sigmoid(self, s):
        return 1 / (1+np.exp(-s))
    
    def sigmoidPrime(self, s):
        # Derivative of the sigmoid, where s is already a sigmoid output (an activation)
        return s * (1 - s)
    
    def feed_forward(self, X):
        """
        Calculate the NN inference using feed forward.
        aka "predict"
        """
        
        # Weighted sum of inputs => hidden layer
        self.hidden_sum = np.dot(X, self.weights1)
        
        # Activations of weighted sum
        self.activated_hidden = self.sigmoid(self.hidden_sum)
        
        # Weight sum between hidden and output
        self.output_sum = np.dot(self.activated_hidden, self.weights2)
        
        # Final activation of output
        self.activated_output = self.sigmoid(self.output_sum)
        
        return self.activated_output
        
    def backward(self, X,y,o):
        """
        Backward propagate through the network
        """
        
        # Error in Output
        self.o_error = y - o
        
        # Apply Derivative of Sigmoid to error
        # How far off are we in relation to the Sigmoid f(x) of the output
        # ^- aka hidden => output
        self.o_delta = self.o_error * self.sigmoidPrime(o)
        
        # z2 error: push the output delta back through the hidden => output weights
        self.z2_error = self.o_delta.dot(self.weights2.T)
        
        # How much of that "far off" can be explained by the input => hidden
        self.z2_delta = self.z2_error * self.sigmoidPrime(self.activated_hidden)
        
        # Adjustment to first set of weights (input => hidden)
        self.weights1 += X.T.dot(self.z2_delta)
        # Adjustment to second set of weights (hidden => output)
        self.weights2 += self.activated_hidden.T.dot(self.o_delta)
        

    def train(self, X, y):
        o = self.feed_forward(X)
        self.backward(X,y,o)
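
One detail in the class above is easy to misread: `sigmoidPrime` returns `s * (1 - s)`, which is the sigmoid's derivative only when `s` is already a sigmoid output. The sketch below checks that identity numerically with a central difference; the standalone function names are hypothetical and not part of the class.

```python
import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

def sigmoid_prime_from_activation(a):
    """Sigmoid derivative expressed in terms of the activation a = sigmoid(z)."""
    return a * (1 - a)

z = np.array([-1.0, 0.0, 2.0])   # example weighted sums
a = sigmoid(z)                   # activations, as stored by feed_forward

eps = 1e-6
numeric = (sigmoid(z + eps) - sigmoid(z - eps)) / (2 * eps)   # central difference
analytic = sigmoid_prime_from_activation(a)

print(np.allclose(numeric, analytic))  # True
```

This is why `backward` passes `o` and `self.activated_hidden` (both activations) to `sigmoidPrime` rather than the raw weighted sums.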

In [0]:
X = np.array([
              [0, 0, 1],
              [0, 1, 1],
              [1, 0, 1],
              [0, 1, 0],
              [1, 0, 0],
              [1, 1, 1],
              [0, 0, 0]

])

y = np.array([
              [0],
              [1],
              [1],
              [1],
              [1],
              [0],
              [0]

])

In [5]:
nn = NeuralNetwork()

# Number of Epochs / Iterations
for i in range(10000):
    if (i+1 in [1,2,3,4,5]) or ((i+1) % 1000 ==0):
        print('+' + '---' * 3 + f'EPOCH {i+1}' + '---'*3 + '+')
        print('Input: \n', X)
        print('Actual Output: \n', y)
        print('Predicted Output: \n', str(nn.feed_forward(X)))
        print("Loss: \n", str(np.mean(np.square(y - nn.feed_forward(X)))))
    nn.train(X,y)

+---------EPOCH 1---------+
Input: 
 [[0 0 1]
 [0 1 1]
 [1 0 1]
 [0 1 0]
 [1 0 0]
 [1 1 1]
 [0 0 0]]
Actual Output: 
 [[0]
 [1]
 [1]
 [1]
 [1]
 [0]
 [0]]
Predicted Output: 
 [[0.76172105]
 [0.80179543]
 [0.78105107]
 [0.77092938]
 [0.74170987]
 [0.81581361]
 [0.7184885 ]]
Loss: 
 0.2812010514425749
+---------EPOCH 2---------+
Input: 
 [[0 0 1]
 [0 1 1]
 [1 0 1]
 [0 1 0]
 [1 0 0]
 [1 1 1]
 [0 0 0]]
Actual Output: 
 [[0]
 [1]
 [1]
 [1]
 [1]
 [0]
 [0]]
Predicted Output: 
 [[0.6817056 ]
 [0.71363969]
 [0.69914086]
 [0.69026655]
 [0.67113871]
 [0.72752208]
 [0.65172613]]
Loss: 
 0.25648012229283385
+---------EPOCH 3---------+
Input: 
 [[0 0 1]
 [0 1 1]
 [1 0 1]
 [0 1 0]
 [1 0 0]
 [1 1 1]
 [0 0 0]]
Actual Output: 
 [[0]
 [1]
 [1]
 [1]
 [1]
 [0]
 [0]]
Predicted Output: 
 [[0.61714745]
 [0.63850265]
 [0.63166578]
 [0.62399969]
 [0.61516414]
 [0.65076503]
 [0.59964887]]
Loss: 
 0.24568145243752057
+---------EPOCH 4---------+
Input: 
 [[0 0 1]
 [0 1 1]
 [1 0 1]
 [0 1 0]
 [1 0 0]
 [1 1 1]
 [0 0 0

### Build, Train and Hyperparameter Tune an MLP with Keras

In [0]:
# See Assignment