<img align="left" src="https://lever-client-logos.s3.amazonaws.com/864372b1-534c-480e-acd5-9711f850815c-1524247202159.png" width=200>
<br></br>
<br></br>

# Neural Networks

## *Data Science Unit 4 Sprint 2 Assignment 1*

## Define the Following:
You can add images, diagrams, or whatever else you need to ensure that you understand the concepts below.

### Input Layer:

The input layer represents the data (source dataset) used to predict or generate the desired outputs of an artificial neural network. In practice, the input layer accepts rows or vectors of numerical or numerically encoded values.

### Hidden Layer:

A hidden layer is a set or vector of interim values that result from a numerical transformation of the values in the previous layer (the input layer or an earlier hidden layer). A neural network can have zero or more hidden layers as input data is transformed into a set of output values. Hidden layers can't be accessed or manipulated directly - only through the application of the network's numerical transformations.

### Output Layer:

The output layer contains the final results produced by transforming the inputs through each layer of the network.

### Neuron:

A neuron is a node in the network that holds a numerical value. That value is the result of transforming a set of inputs or node values from the previous layer in the network. The transformation is typically a weighted sum of the input values plus a bias value, with the result passed through an activation function.
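
As a rough illustration of that description, here is a minimal sketch of a single neuron in plain Python. The function name `neuron_output` and the input, weight, and bias values are made up for the example, and the sigmoid is just one possible activation function.

```python
import math

def neuron_output(inputs, weights, bias):
    """Weighted sum of the inputs plus a bias, passed through a sigmoid."""
    weighted_sum = sum(x * w for x, w in zip(inputs, weights)) + bias
    return 1.0 / (1.0 + math.exp(-weighted_sum))

# Illustrative values only, not taken from the assignment
value = neuron_output(inputs=[1.0, 0.0], weights=[0.5, -0.3], bias=0.1)
print(round(value, 4))
```

The weighted sum here is 0.6, so the neuron's value is the sigmoid of 0.6, a number between 0.5 and 1.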

### Weight:

A weight is a factor by which an input is multiplied, whether that input comes from a dataset vector (model input) or from an interim node. It serves as a tuning factor used to transform input data into an output that can be used as a prediction or model result. Weights are adjusted, or tuned, in order to minimize a cost or loss function.
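
To make "tuned in order to minimize a loss" concrete, the sketch below adjusts a single weight by gradient descent on a squared-error loss. Everything in it (one input, one weight, the values of `x`, `target`, and `learning_rate`) is an assumption chosen for illustration, not part of the assignment.

```python
# One input, one weight, squared-error loss; all values are illustrative
x, target = 2.0, 1.0
weight = 0.1
learning_rate = 0.1

losses = []
for step in range(20):
    prediction = weight * x
    error = prediction - target
    losses.append(error ** 2)
    # Gradient of the squared error with respect to the weight: 2 * error * x
    weight -= learning_rate * 2 * error * x

print(losses[0], losses[-1], weight)
```

Each step moves the weight in the direction that lowers the loss, so the weight settles near 0.5, the value for which `weight * x` equals the target.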

### Activation Function:

An activation function is applied to a node's value, which has been generated by transforming the network inputs or the outputs of a previous hidden layer in the network. The activation function determines whether, and how strongly, the node's value is passed on, either as a model output or as an input to the next hidden layer in the network (if applicable).
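
For comparison, here is a small sketch of three commonly used activation functions applied to the same values. The helper names `step`, `sigmoid`, and `relu` are chosen for this example only.

```python
import numpy as np

def step(z):
    # Unit step: 1 where the value is positive, otherwise 0
    return (z > 0).astype(int)

def sigmoid(z):
    # Squashes any real value into the open interval (0, 1)
    return 1.0 / (1.0 + np.exp(-z))

def relu(z):
    # Passes positive values through unchanged and zeroes out the rest
    return np.maximum(0.0, z)

z = np.array([-2.0, 0.0, 2.0])
print(step(z))
print(sigmoid(z))
print(relu(z))
```

The step function makes a hard on/off decision, while the sigmoid and ReLU pass along a graded value.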

### Node Map:

A node map is a visual depiction of a neural network that represents how data is transformed from model inputs to model outputs via zero or more hidden layers.

### Perceptron:

A perceptron is a simple neural network that takes one or more inputs, combines them as a weighted sum, and passes that sum through an activation function to produce a model output.
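
A minimal sketch of that idea, assuming a unit step activation. The weights and bias below are hand-picked rather than learned, and they happen to reproduce the NAND truth table used later in this assignment.

```python
def perceptron(inputs, weights, bias):
    """Weighted sum of the inputs plus a bias, thresholded by a unit step."""
    weighted_sum = sum(x * w for x, w in zip(inputs, weights)) + bias
    return 1 if weighted_sum > 0 else 0

# Hand-picked (not learned) parameters, assumed for illustration
nand_weights, nand_bias = [-1.0, -1.0], 1.5

rows = [(0, 0), (1, 0), (0, 1), (1, 1)]
outputs = [perceptron(row, nand_weights, nand_bias) for row in rows]
print(outputs)  # [1, 1, 1, 0]
```

Only the row (1, 1) drives the weighted sum below zero, so it is the only row that outputs 0.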


## Inputs -> Outputs

### Explain the flow of information through a neural network from inputs to outputs. Be sure to include: inputs, weights, bias, and activation functions. How does it all flow from beginning to end?

In a neural network, information flows from a set of inputs and is transformed into a set of outputs. The transformation algorithm operates in such a way as to minimize a cost or loss function - often the mean squared error between the predicted outputs and the actual values.

The transformation process comes in the form of a sum of weighted inputs (a sum of products) plus a bias value. This result is passed to an activation function, which determines the value that is handed to the next step in the process (either a transformation into a subsequent hidden layer or the resulting set of output values).
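
The flow described above can be sketched as a single forward pass through one hidden layer. The layer sizes, the random starting weights, the zero biases, and the choice of sigmoid for both layers are all assumptions made for illustration. Nothing is trained here.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)

X = np.array([[0, 0], [1, 0], [0, 1], [1, 1]], dtype=float)  # inputs
y = np.array([[1], [1], [1], [0]], dtype=float)               # targets

# Illustrative parameters: 2 inputs -> 3 hidden nodes -> 1 output
W_hidden = rng.uniform(-1, 1, size=(2, 3))
b_hidden = np.zeros(3)
W_out = rng.uniform(-1, 1, size=(3, 1))
b_out = np.zeros(1)

# Inputs -> weighted sum plus bias -> activation -> hidden layer
hidden = sigmoid(X @ W_hidden + b_hidden)

# Hidden layer -> weighted sum plus bias -> activation -> outputs
output = sigmoid(hidden @ W_out + b_out)

# Mean squared error between predictions and targets
mse = np.mean((y - output) ** 2)
print(output.shape, mse)
```

Training would then adjust `W_hidden`, `b_hidden`, `W_out`, and `b_out` to push `mse` down.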

## Write your own perceptron code that can correctly classify (99.0% accuracy) a NAND gate. 

| x1 | x2 | y |
|----|----|---|
| 0  | 0  | 1 |
| 1  | 0  | 1 |
| 0  | 1  | 1 |
| 1  | 1  | 0 |

In [109]:
import pandas as pd
data = { 'x1': [0,1,0,1],
         'x2': [0,0,1,1],
         'y':  [1,1,1,0]
       }

df = pd.DataFrame.from_dict(data).astype('int')
df

|   | x1 | x2 | y |
|---|----|----|---|
| 0 | 0  | 0  | 1 |
| 1 | 1  | 0  | 1 |
| 2 | 0  | 1  | 1 |
| 3 | 1  | 1  | 0 |


In [110]:
# Create a copy of the dataframe
df_wrk = df.copy()
df_wrk.drop("y", axis=1, inplace=True)

# Generate an array of input arrays (an array of virtual data rows)
arr_inputs = df_wrk.to_numpy()

# Generate an array of outputs (targets)
df_wrk = df.copy()
df_wrk.drop(["x1", "x2"], axis=1, inplace=True)

arr_outputs = df_wrk.to_numpy()

In [111]:
arr_inputs

array([[0, 0],
       [1, 0],
       [0, 1],
       [1, 1]])

In [112]:
arr_outputs

array([[1],
       [1],
       [1],
       [0]])

In [113]:
# Generate an initial set of weights to begin network processing
import math
import random
import numpy as np
random.seed(33)

# gen_weights is a function that generates initial weights for our neural network processing
def gen_weights():
    ret_wgts = []
    
    for i in range(2):
        ret_wgts.append([random.uniform(-1.0, 1.0)])
        
    return np.array(ret_wgts) 

# sigmoid is a function that applies an activation (sigmoid) function to a weighted sum
def sigmoid(x):
    sgmd = 1 / (1 + np.exp(-x))
    return sgmd

# deriv_sigmoid calculates the derivative of sigmoid at a passed value (rate of change at point x)
def deriv_sigmoid(x):
    sgmd = sigmoid(x)
    dtv  = sgmd * (1 - sgmd)
    return dtv

# general_activate applies a unit step activation: 1 where the value is positive, otherwise 0
def general_activate(val):
    # Vectorized threshold; keeps the input's shape so it lines up with arr_outputs
    return (np.asarray(val) > 0).astype(int)

In [114]:
weights = gen_weights()
weights

array([[0.14065685],
       [0.26446599]])

In [115]:
bias = random.uniform(.01, .33)
bias

0.2714412351626475

In [119]:
# learning rate
r = .25

In [120]:
# Iterate repeatedly over the training rows, refining the network weights.
# Note: the bias stays fixed at its initial value; only the weights are updated.
for iteration in range(50000):
    
    # Weighted sum of inputs and weights
    weighted_sum = np.dot(arr_inputs, weights)
    
    # Apply the step activation to switch each output on (1) or off (0)
    activated_output = general_activate(weighted_sum + bias)
    
    # Calculate error (target minus prediction)
    error = arr_outputs - activated_output
    
    # Scale the error by the learning rate to get the weight adjustments
    adjustments = error * r 
    
    # Generate adjusted weights
    weights += np.dot(arr_inputs.T, adjustments)
    
print("Weights after training")
print(weights)

print("Output after training")
print(activated_output)

Weights after training
[[-0.10934315]
 [-0.23553401]]
Output after training
[[1]
 [1]
 [1]
 [0]]
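
A variant worth comparing is sketched below: it thresholds the weighted sum with a unit step, trains the bias alongside the weights, and updates after every row rather than once per pass. The zero starting values, the `learning_rate` of 0.25, and the early exit once a full pass makes no mistakes are assumptions for this example.

```python
import numpy as np

X = np.array([[0, 0], [1, 0], [0, 1], [1, 1]])
y = np.array([1, 1, 1, 0])

weights = np.zeros(2)
bias = 0.0
learning_rate = 0.25

for epoch in range(100):
    mistakes = 0
    for xi, target in zip(X, y):
        # Unit step on the weighted sum plus bias
        prediction = int(xi @ weights + bias > 0)
        # Perceptron rule: the update is zero when the prediction is correct
        update = learning_rate * (target - prediction)
        weights += update * xi
        bias += update
        mistakes += int(update != 0)
    if mistakes == 0:
        break  # every row classified correctly in this pass

predictions = (X @ weights + bias > 0).astype(int)
accuracy = (predictions == y).mean()
print(predictions, accuracy)
```

Because NAND is linearly separable, this rule stops after a handful of passes with every row classified correctly.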


## Implement your own Perceptron Class and use it to classify a binary dataset: 
- [The Pima Indians Diabetes dataset](https://raw.githubusercontent.com/ryanleeallred/datasets/master/diabetes.csv) 

You may need to search for others' implementations in order to get inspiration for your own. There are *lots* of perceptron implementations on the internet with varying levels of sophistication and complexity. Whatever your approach, make sure you understand **every** line of your implementation and what its purpose is.

In [4]:
diabetes = pd.read_csv('https://raw.githubusercontent.com/ryanleeallred/datasets/master/diabetes.csv')
diabetes.head()

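|   | Pregnancies | Glucose | BloodPressure | SkinThickness | Insulin | BMI  | DiabetesPedigreeFunction | Age | Outcome |
|---|-------------|---------|---------------|---------------|---------|------|--------------------------|-----|---------|
| 0 | 6           | 148     | 72            | 35            | 0       | 33.6 | 0.627                    | 50  | 1       |
| 1 | 1           | 85      | 66            | 29            | 0       | 26.6 | 0.351                    | 31  | 0       |
| 2 | 8           | 183     | 64            | 0             | 0       | 23.3 | 0.672                    | 32  | 1       |
| 3 | 1           | 89      | 66            | 23            | 94      | 28.1 | 0.167                    | 21  | 0       |
| 4 | 0           | 137     | 40            | 35            | 168     | 43.1 | 2.288                    | 33  | 1       |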


Although neural networks can handle non-normalized data, scaling or normalizing your data typically improves your neural network's learning speed. Try applying the sklearn `MinMaxScaler` or `Normalizer` to your diabetes dataset.

In [10]:
from sklearn.preprocessing import MinMaxScaler, Normalizer

feats = list(diabetes)[:-1]

X = ...
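
One way to approach the scaling step is sketched here with plain NumPy on the five rows printed by `diabetes.head()`, so that it runs without downloading anything. It uses the same per-column formula that `MinMaxScaler` applies with its default `feature_range=(0, 1)`. The names `X_raw`, `col_min`, `col_max`, and `X_scaled` are illustrative, and a real run would scale every row of `diabetes[feats]` rather than this sample.

```python
import numpy as np

# Feature columns of the five rows shown by diabetes.head() (Outcome dropped)
X_raw = np.array([
    [6, 148, 72, 35,   0, 33.6, 0.627, 50],
    [1,  85, 66, 29,   0, 26.6, 0.351, 31],
    [8, 183, 64,  0,   0, 23.3, 0.672, 32],
    [1,  89, 66, 23,  94, 28.1, 0.167, 21],
    [0, 137, 40, 35, 168, 43.1, 2.288, 33],
])

# Min-max scaling: map each column onto the range [0, 1]
col_min = X_raw.min(axis=0)
col_max = X_raw.max(axis=0)
X_scaled = (X_raw - col_min) / (col_max - col_min)

print(X_scaled.min(axis=0), X_scaled.max(axis=0))
```

With scikit-learn, `MinMaxScaler().fit_transform(diabetes[feats])` would be the usual call. Unlike this sketch, the library version also copes with constant columns, where `col_max - col_min` is zero.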

In [None]:
##### Update this Class #####

class Perceptron:
    
    def __init__(self, niter = 10):
        self.niter = niter
    
    def __sigmoid(self, x):
        return None
    
    def __sigmoid_derivative(self, x):
        return None

    def fit(self, X, y):
        """Fit training data
        X : Training vectors, X.shape : [#samples, #features]
        y : Target values, y.shape : [#samples]
        """

        # Randomly Initialize Weights
        weights = ...

        for i in range(self.niter):
            # Weighted sum of inputs / weights

            # Activate!

            # Calc error

            # Update the Weights
            pass


    def predict(self, X):
        """Return class label after unit step"""
        return None
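
One possible way to fill in the template is sketched below. It is only one of many reasonable implementations: the `learning_rate` and `seed` parameters, the separate `bias` attribute, the sigmoid-based update, and the 0.5 cutoff in `predict` are assumptions that go beyond what the template specifies.

```python
import numpy as np

class Perceptron:

    def __init__(self, niter=10, learning_rate=0.1, seed=0):
        self.niter = niter
        self.learning_rate = learning_rate  # hypothetical, not in the template
        self.seed = seed                    # hypothetical, for reproducible weights

    def __sigmoid(self, x):
        return 1.0 / (1.0 + np.exp(-x))

    def __sigmoid_derivative(self, x):
        s = self.__sigmoid(x)
        return s * (1.0 - s)

    def fit(self, X, y):
        """Fit training data
        X : Training vectors, X.shape : [#samples, #features]
        y : Target values, y.shape : [#samples]
        """
        X = np.asarray(X, dtype=float)
        y = np.asarray(y, dtype=float).reshape(-1, 1)

        # Randomly initialize one weight per feature; start the bias at zero
        rng = np.random.default_rng(self.seed)
        self.weights = rng.uniform(-1.0, 1.0, size=(X.shape[1], 1))
        self.bias = 0.0

        for _ in range(self.niter):
            # Weighted sum of inputs and weights, plus bias
            weighted_sum = X @ self.weights + self.bias
            # Activate
            activated = self.__sigmoid(weighted_sum)
            # Error, scaled by the slope of the sigmoid at each point
            error = y - activated
            adjustments = error * self.__sigmoid_derivative(weighted_sum)
            # Update the weights and the bias
            self.weights += self.learning_rate * (X.T @ adjustments)
            self.bias += self.learning_rate * adjustments.sum()
        return self

    def predict(self, X):
        """Return class label (0 or 1) using a 0.5 cutoff on the sigmoid output"""
        X = np.asarray(X, dtype=float)
        probabilities = self.__sigmoid(X @ self.weights + self.bias)
        return (probabilities >= 0.5).astype(int).ravel()


# Quick check on the NAND rows from earlier in the assignment
nand_X = np.array([[0, 0], [1, 0], [0, 1], [1, 1]])
nand_y = np.array([1, 1, 1, 0])
model = Perceptron(niter=10000, learning_rate=0.5).fit(nand_X, nand_y)
print(model.predict(nand_X))
```

For the diabetes data, the same class would take the scaled feature matrix as `X` and the `Outcome` column as `y`. A single perceptron can only draw a linear boundary, so perfect accuracy should not be expected there.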

## Stretch Goals:

- Research "backpropagation" to learn how weights get updated in neural networks (tomorrow's lecture). 
- Implement a multi-layer perceptron. (for non-linearly separable classes)
- Try and implement your own backpropagation algorithm.
- What are the pros and cons of the different activation functions? How should you decide between them for the different layers of a neural network?