<img align="left" src="https://lever-client-logos.s3.amazonaws.com/864372b1-534c-480e-acd5-9711f850815c-1524247202159.png" width=200>
<br></br>
<br></br>

# Neural Networks

## *Data Science Unit 4 Sprint 2 Assignment 1*

## Define the Following:
You can add images, diagrams, or whatever else you need to make sure you understand the concepts below.

### Input Layer:
The input layer is where our data enters the model. It has one node per input feature and passes the raw values
on to the next layer, so it's the layer we interact with most directly.
### Hidden Layer:
The hidden layer(s) are where the magic happens. They sit between the input and output layers, and the weighted
sums and activation functions computed there are what allow ANNs to identify relationships that would otherwise
be nearly impossible to see.
### Output Layer:
The output layer is where we see the results of our model's calculations! This is another very visible layer
where our model outputs the predictions it's made based on the relationships it identified.
### Neuron:
In biology, a neuron is a nerve cell, one of the basic building blocks of our brain. It fires action potentials,
and networks of these cells form our own biological NN, which allows us to experience all our senses. In an ANN,
a neuron is a single node in one of our layers.
### Weight:
A weight is a number attached to the connection between an input (or feature) and a node. It scales how much that
input contributes to the node's output. Backpropagation is part of the process we use to improve our weights and
make the best possible model.
### Activation Function:
The activation function is like the action potential in a biological neuron. It takes a node's weighted sum plus
bias and decides whether, or how strongly, that artificial neuron "fires". The nonlinearity it adds is what lets
the model begin to make sense of more than straight-line relationships.
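
To make the idea concrete, here is a minimal sketch of three common activation functions applied to the same weighted sums. The function names and the sample values in `z` are illustrative assumptions, not part of the assignment.

```python
import numpy as np

def step(z):
    """Unit step: outputs 1 when the weighted sum is >= 0, otherwise 0."""
    return np.where(z >= 0, 1, 0)

def sigmoid(z):
    """Squashes any real number into the open interval (0, 1)."""
    return 1 / (1 + np.exp(-z))

def relu(z):
    """Passes positive values through unchanged and clips negatives to 0."""
    return np.maximum(0, z)

# Hypothetical weighted sums arriving at three different nodes
z = np.array([-2.0, 0.0, 2.0])

step_out = step(z)
sigmoid_out = sigmoid(z)
relu_out = relu(z)
```

The step function gives a hard fire/no-fire decision, while sigmoid and ReLU give graded outputs.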
### Node Map:
A node map is a basic, visual representation of a neural network.
### Perceptron:
A perceptron is the simplest kind of ANN we can make! It consists of a single layer of weights feeding one output
node, so it can only separate classes that are linearly separable, much like the linear models we've used so far.

## Inputs -> Outputs

### Explain the flow of information through a neural network from inputs to outputs. Be sure to include: inputs, weights, bias, and activation functions. How does it all flow from beginning to end?

#### Your Answer Here
First, we take our inputs and multiply them by their weights, which are typically random at the start. Each node in the next layer sums its weighted inputs and adds its bias, a learned constant that shifts the sum. That total is then passed through the node's activation function, and the result becomes the node's output. From there, the signal may travel through one or many more hidden layers, each applying its own weights, biases, and activation functions in the same way. It's important to note that a bias belongs to the node (or layer) receiving the weighted inputs, and that it is added before the activation function is applied. After the last hidden layer, the output layer repeats the same calculation one more time, and its activations are our output.
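
The flow described above can be sketched as a forward pass through one hidden layer. This is only an illustration: the layer sizes, the input values, and the names `W1`, `b1`, `W2`, and `b2` are assumptions made for the example.

```python
import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

rng = np.random.default_rng(0)

# One sample with three features (hypothetical values)
x = np.array([[0.5, -1.0, 2.0]])

# Randomly initialized weights; biases start at zero
W1 = rng.normal(size=(3, 4))   # input layer (3 features) -> hidden layer (4 nodes)
b1 = np.zeros((1, 4))
W2 = rng.normal(size=(4, 1))   # hidden layer (4 nodes) -> output layer (1 node)
b2 = np.zeros((1, 1))

# Hidden layer: weighted sum, plus bias, then activation
hidden = sigmoid(x @ W1 + b1)

# Output layer: the same three steps applied to the hidden activations
output = sigmoid(hidden @ W2 + b2)
```

Adding more hidden layers just repeats the weighted sum, bias, activation pattern with another weight matrix and bias vector.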

## Write your own perceptron code that can correctly classify (99.0% accuracy) a NAND gate. 

| x1 | x2 | y |
|----|----|---|
| 0  | 0  | 1 |
| 1  | 0  | 1 |
| 0  | 1  | 1 |
| 1  | 1  | 0 |

In [1]:
import pandas as pd
data = { 'x1': [0,1,0,1],
         'x2': [0,0,1,1],
         'y':  [1,1,1,0]
       }

df = pd.DataFrame.from_dict(data).astype('int')

In [2]:
#Defining a sigmoid function for my perceptron
def sigmoid(x):
    return 1/ (1 + np.exp(-x))

In [3]:
import numpy as np
#Set random seed for reproducibility
np.random.seed(42)

In [14]:
#Establish a set of random weights in [-1, 0): one each for x1, x2, and the constant bias input
weights = np.random.random((3,1)) - 1
weights

array([[-0.26800606],
       [-0.40134152],
       [-0.84398136]])

In [15]:
#Define the sigmoid derivative, which backprop uses to scale each weight update
def sigmoid_derivative(x):
    sx = sigmoid(x)
    return sx * (1 - sx)

In [16]:
df.head()

|   | x1 | x2 | y |
|---|----|----|---|
| 0 | 0  | 0  | 1 |
| 1 | 1  | 0  | 1 |
| 2 | 0  | 1  | 1 |
| 3 | 1  | 1  | 0 |


In [17]:
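#Each row is [x1, x2, bias input]; the constant 1 in the last column lets the third weight act as the bias
#NOTE: the first row is [0,1,1], which repeats the third row; the NAND table's (0,0) case would be [0,0,1]
#The outputs shown in the following cells come from the rows exactly as written here
inputs = np.array([[0,1,1],
                   [1,0,1],
                   [0,1,1],
                   [1,1,1]])
correct_outputs = [[1],
                   [1],
                   [1],
                   [0]]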

In [18]:
inputs

array([[0, 1, 1],
       [1, 0, 1],
       [0, 1, 1],
       [1, 1, 1]])

In [19]:
weighted_sum = np.dot(inputs, weights)
activated_output = sigmoid(weighted_sum)
print(weighted_sum)
print(activated_output)

[[-1.24532288]
 [-1.11198742]
 [-1.24532288]
 [-1.51332893]]
[[0.22351082]
 [0.24750056]
 [0.22351082]
 [0.18044597]]


In [20]:
#Loop through 10k epochs to allow the perceptron to backprop and correct its weights
for iteration in range(10000):
    #print('\n\n', iteration)
    # Weighted sum of inputs & weights
    weighted_sum = np.dot(inputs, weights)
    #print('Weighted Sum ', weighted_sum)
    # Activation func
    activated_output = sigmoid(weighted_sum)
    #print('Activated Output ', activated_output)
    # Calculate the error
    error = correct_outputs - activated_output
    #print('Error ', error)
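    #NOTE: activated_output has already been through sigmoid, so this line applies sigmoid a second time;
    #the exact gradient would use sigmoid_derivative(weighted_sum). The results below use the line as written.
    adjustments = error * sigmoid_derivative(activated_output)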
    #print('Adjustments ', adjustments)
    # Update the Weights
    weights += np.dot(inputs.T, adjustments)
    #print('Weights ', weights)
print("Weights after training")
print(weights)

print("Output after training")
print(activated_output)

Weights after training
[[-12.52723547]
 [-11.83322351]
 [ 18.49267484]]
Output after training
[[0.99871966]
 [0.99744039]
 [0.99871966]
 [0.00282144]]
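
As a cross-check, here is a hedged sketch of the same training loop with two changes: the first input row is `[0, 0, 1]` so that all four rows of the NAND table are covered, and the derivative is taken at the weighted sum rather than at the activated output. The `nand_` variable names, the `default_rng` seed, and the `[-1, 1)` weight range are assumptions, so the numbers it produces will differ from the output shown above.

```python
import numpy as np

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

def sigmoid_derivative(x):
    sx = sigmoid(x)
    return sx * (1 - sx)

# Columns: x1, x2, constant bias input
nand_inputs = np.array([[0, 0, 1],
                        [1, 0, 1],
                        [0, 1, 1],
                        [1, 1, 1]])
nand_targets = np.array([[1], [1], [1], [0]])

rng = np.random.default_rng(42)
nand_weights = 2 * rng.random((3, 1)) - 1   # uniform in [-1, 1)

for _ in range(10000):
    z = nand_inputs @ nand_weights           # weighted sum
    a = sigmoid(z)                           # activation
    error = nand_targets - a                 # how far off each prediction is
    # Scale the error by the slope of sigmoid at z, then update the weights
    nand_weights += nand_inputs.T @ (error * sigmoid_derivative(z))

# Threshold the activations at 0.5 to get class labels, then score them
predictions = (sigmoid(nand_inputs @ nand_weights) >= 0.5).astype(int)
accuracy = (predictions == nand_targets).mean()
```

Thresholding at 0.5 turns the sigmoid activations into 0/1 labels, which is what makes an accuracy figure meaningful for the gate.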


## Implement your own Perceptron Class and use it to classify a binary dataset: 
- [The Pima Indians Diabetes dataset](https://raw.githubusercontent.com/ryanleeallred/datasets/master/diabetes.csv) 

You may need to search for others' implementations in order to get inspiration for your own. There are *lots* of perceptron implementations on the internet with varying levels of sophistication and complexity. Whatever your approach, make sure you understand **every** line of your implementation and what its purpose is.

In [4]:
diabetes = pd.read_csv('https://raw.githubusercontent.com/ryanleeallred/datasets/master/diabetes.csv')
diabetes.head()

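|   | Pregnancies | Glucose | BloodPressure | SkinThickness | Insulin | BMI  | DiabetesPedigreeFunction | Age | Outcome |
|---|-------------|---------|---------------|---------------|---------|------|--------------------------|-----|---------|
| 0 | 6           | 148     | 72            | 35            | 0       | 33.6 | 0.627                    | 50  | 1       |
| 1 | 1           | 85      | 66            | 29            | 0       | 26.6 | 0.351                    | 31  | 0       |
| 2 | 8           | 183     | 64            | 0             | 0       | 23.3 | 0.672                    | 32  | 1       |
| 3 | 1           | 89      | 66            | 23            | 94      | 28.1 | 0.167                    | 21  | 0       |
| 4 | 0           | 137     | 40            | 35            | 168     | 43.1 | 2.288                    | 33  | 1       |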


Although neural networks can handle non-normalized data, scaling or normalizing your data will improve your neural network's learning speed. Try to apply the sklearn `MinMaxScaler` or `Normalizer` to your diabetes dataset. 

In [10]:
from sklearn.preprocessing import MinMaxScaler, Normalizer

feats = list(diabetes)[:-1]

X = ...
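
One possible way to apply `MinMaxScaler` is sketched below on a tiny stand-in array, so it runs without downloading anything. The values are the Glucose, BMI, and Age entries from the five rows of `diabetes.head()`; the names `sample_X`, `scaler`, and `manual` are assumptions made for this example.

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler

# Glucose, BMI, and Age for the first five rows of the dataset
sample_X = np.array([[148, 33.6, 50],
                     [ 85, 26.6, 31],
                     [183, 23.3, 32],
                     [ 89, 28.1, 21],
                     [137, 43.1, 33]])

scaler = MinMaxScaler()                       # default feature_range is (0, 1)
sample_X_scaled = scaler.fit_transform(sample_X)

# The same rescaling written out by hand, column by column
col_min = sample_X.min(axis=0)
col_max = sample_X.max(axis=0)
manual = (sample_X - col_min) / (col_max - col_min)
```

On the full dataset the same `fit_transform` call could be given `diabetes[feats]` instead of the stand-in array. After scaling, every column runs from 0 to 1, so no single feature dominates the weighted sum just because of its units.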

In [None]:
##### Update this Class #####

class Perceptron(object):

    def __init__(self, niter=10):
        self.niter = niter

    def __sigmoid(self, x):
        return None

    def __sigmoid_derivative(self, x):
        return None

    def fit(self, X, y):
        """Fit training data
        X : Training vectors, X.shape : [#samples, #features]
        y : Target values, y.shape : [#samples]
        """

        # Randomly Initialize Weights
        weights = ...

        for i in range(self.niter):
            # Weighted sum of inputs / weights

            # Activate!

            # Calc error

            # Update the Weights
            pass  # placeholder so the empty loop body is valid Python

    def predict(self, X):
        """Return class label after unit step"""
        return None
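
One way the skeleton could be filled in is sketched below. It is not the reference solution: the class name `SigmoidPerceptron`, the `learning_rate` and `seed` parameters, and the choice to append a bias column inside the class are all assumptions. It is demonstrated on the NAND gate rather than the diabetes data so the example stays self-contained.

```python
import numpy as np

class SigmoidPerceptron:
    """Single-layer perceptron with a sigmoid activation (illustrative sketch)."""

    def __init__(self, niter=10, learning_rate=0.1, seed=42):
        self.niter = niter
        self.learning_rate = learning_rate
        self.seed = seed

    @staticmethod
    def _sigmoid(x):
        return 1 / (1 + np.exp(-x))

    @staticmethod
    def _add_bias(X):
        # Append a column of ones so the last weight acts as the bias
        return np.hstack([X, np.ones((X.shape[0], 1))])

    def fit(self, X, y):
        X = self._add_bias(np.asarray(X, dtype=float))
        y = np.asarray(y, dtype=float).reshape(-1, 1)
        rng = np.random.default_rng(self.seed)
        self.weights = 2 * rng.random((X.shape[1], 1)) - 1   # uniform in [-1, 1)

        for _ in range(self.niter):
            a = self._sigmoid(X @ self.weights)    # weighted sum, then activate
            error = y - a                          # calculate the error
            # a * (1 - a) is the sigmoid derivative at the weighted sum
            self.weights += self.learning_rate * X.T @ (error * a * (1 - a))
        return self

    def predict(self, X):
        X = self._add_bias(np.asarray(X, dtype=float))
        # Sigmoid >= 0.5 is the same as a unit step on the weighted sum at 0
        return (self._sigmoid(X @ self.weights) >= 0.5).astype(int).ravel()

# Usage on the NAND gate from earlier in the notebook
gate_X = np.array([[0, 0], [1, 0], [0, 1], [1, 1]])
gate_y = np.array([1, 1, 1, 0])
model = SigmoidPerceptron(niter=10000, learning_rate=1.0).fit(gate_X, gate_y)
gate_preds = model.predict(gate_X)
```

For the diabetes data, the same `fit` and `predict` calls would take the scaled feature matrix and the `Outcome` column. Since that dataset is not guaranteed to be linearly separable, a single perceptron may not classify it perfectly.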

## Stretch Goals:

- Research "backpropagation" to learn how weights get updated in neural networks (tomorrow's lecture). 
- Implement a multi-layer perceptron (for non-linearly separable classes).
- Try to implement your own backpropagation algorithm.
- What are the pros and cons of the different activation functions? How should you decide between them for the different layers of a neural network?
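
As a starting point for the multi-layer perceptron and backpropagation goals, here is a rough sketch of a network with one hidden layer trained on XOR, a classic non-linearly separable problem. The layer size, learning rate, epoch count, and all variable names are assumptions, and whether it fully solves XOR depends on the random initialization.

```python
import numpy as np

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

# XOR cannot be separated by a single straight line
X_xor = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y_xor = np.array([[0], [1], [1], [0]], dtype=float)

rng = np.random.default_rng(0)
W1 = rng.normal(size=(2, 4)); b1 = np.zeros((1, 4))   # input -> hidden
W2 = rng.normal(size=(4, 1)); b2 = np.zeros((1, 1))   # hidden -> output
learning_rate = 0.5

losses = []
for _ in range(5000):
    # Forward pass
    hidden = sigmoid(X_xor @ W1 + b1)
    output = sigmoid(hidden @ W2 + b2)
    losses.append(float(np.mean((y_xor - output) ** 2)))

    # Backward pass: apply the chain rule layer by layer
    delta_out = (output - y_xor) * output * (1 - output)
    delta_hidden = (delta_out @ W2.T) * hidden * (1 - hidden)

    # Gradient descent step on every weight and bias
    W2 -= learning_rate * hidden.T @ delta_out
    b2 -= learning_rate * delta_out.sum(axis=0, keepdims=True)
    W1 -= learning_rate * X_xor.T @ delta_hidden
    b1 -= learning_rate * delta_hidden.sum(axis=0, keepdims=True)
```

Plotting `losses` against the epoch number is a quick way to see whether training is making progress.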