<img align="left" src="https://lever-client-logos.s3.amazonaws.com/864372b1-534c-480e-acd5-9711f850815c-1524247202159.png" width=200>
<br></br>
<br></br>

# Neural Networks

## *Data Science Unit 4 Sprint 2 Assignment 1*

## Define the Following:
You can add images, diagrams, or whatever else you need to ensure that you understand the concepts below.

### Input Layer:
The first layer of the neural network. It reads in the input data and has one node per input feature.
### Hidden Layer:
Sits between the input and output layers. Each hidden neuron takes a weighted sum of the previous layer's outputs and applies an activation function; this is the behind-the-scenes computation of the neural network.
### Output Layer:
The final layer, which produces the network's predictions (y) from the outputs of the preceding layers.
### Neuron:
Activation nodes. A series of neurons makes up a layer. The number of neurons depends on the network structure.
### Weight:
A number associated with each connection between neurons within a network.
### Activation Function:
A function, sigmoid or otherwise, applied to a neuron's weighted sum of inputs plus its bias. Its output determines how strongly that neuron is activated.
### Node Map:
A diagram of the network's structure: its neurons and how they are connected. Neurons may be connected in different ways for different purposes.
### Perceptron:
A simple type of neural network: a single neuron with one weight per input feature (plus a bias) and a binary output.
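
To make the definitions above concrete, here is a minimal sketch of a single neuron: it multiplies each input by its weight, adds a bias, and passes the sum through a sigmoid activation. The input values, weights, and bias are made-up numbers chosen only for illustration.

```python
import numpy as np

def sigmoid(x):
    """Squash any real number into the range (0, 1)."""
    return 1 / (1 + np.exp(-x))

# Hypothetical values, chosen only for illustration
inputs = np.array([0.5, 0.2, 0.1])    # one value per input feature
weights = np.array([0.4, -0.6, 0.2])  # one weight per connection
bias = 0.1

# Weighted sum: 0.5*0.4 + 0.2*(-0.6) + 0.1*0.2 + 0.1
weighted_sum = np.dot(inputs, weights) + bias
# The neuron's output after the activation function
activation = sigmoid(weighted_sum)
print(weighted_sum, activation)
```

A layer is just many of these neurons evaluated side by side, which is why the later cells use matrix products instead of a single dot product.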

## Inputs -> Outputs

### Explain the flow of information through a neural network from inputs to outputs. Be sure to include: inputs, weights, bias, and activation functions. How does it all flow from beginning to end?

Inputs > input layer > hidden layers > output layer > output. Inputs are read in and often normalized to between 0 and 1. Each connection starts with a random weight, and each neuron has a bias. At each neuron, the inputs are multiplied by their weights and summed with the bias to get a weighted sum. This sum is run through the activation function, and the result is passed on to the next layer until the output layer produces a "prediction". During training, the prediction is scored against the correct output, and the resulting error is used to update the weights. This process is repeated iteratively: multiply inputs by weights, activate, score the prediction, update the weights. Once our network is trained, we get an output (for example, a probability) for any combination of inputs.
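
The flow described above can be sketched as one forward pass through a small network. This is only an illustration under assumed shapes (3 input features, 4 hidden neurons, 1 output); the names `W_hidden`, `b_hidden`, `W_output`, and `b_output` are hypothetical, and the data is random.

```python
import numpy as np

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

rng = np.random.default_rng(0)  # fixed seed so the sketch is repeatable

# Hypothetical data: 4 samples, 3 features, already scaled to [0, 1)
X = rng.random((4, 3))
y = np.array([[0], [1], [1], [0]])

# Randomly initialized weights, zero-initialized biases
W_hidden = rng.normal(size=(3, 4))  # input layer -> hidden layer
b_hidden = np.zeros((1, 4))
W_output = rng.normal(size=(4, 1))  # hidden layer -> output layer
b_output = np.zeros((1, 1))

# Forward pass: weighted sum plus bias, then activation, layer by layer
hidden = sigmoid(X @ W_hidden + b_hidden)
output = sigmoid(hidden @ W_output + b_output)

# Training would use this error to adjust the weights and biases
error = y - output
print(output.shape, error.shape)
```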

## Write your own perceptron code that can correctly classify (99.0% accuracy) a NAND gate. 

| x1 | x2 | y |
|----|----|---|
| 0  | 0  | 1 |
| 1  | 0  | 1 |
| 0  | 1  | 1 |
| 1  | 1  | 0 |

In [1]:
import numpy as np
import pandas as pd

In [163]:
data = { 'x1': [0,1,0,1],
         'x2': [0,0,1,1],
         'y':  [1,1,1,0]
       }

df = pd.DataFrame.from_dict(data).astype('int')

In [164]:
def sigmoid(x):
    return 1 / (1 + np.exp(-x))

def sigmoid_derivative(x):
    sx = sigmoid(x)
    return sx * (1 - sx)

In [165]:
X = df[['x1', 'x2']]
y = df[['y']]

In [166]:
# Random initial weights, one per input feature (this model has no bias term)
weights = np.random.random((X.shape[1], 1))

In [167]:
for i in range(1000):
    # input * weights
    weighted_sum = np.dot(X, weights)
    # activate sigmoid function
    activated_output = sigmoid(weighted_sum)
    # calculate correct - calculated (error)
    error = y - activated_output
    # calculate adjustments to weight
    adjustments = error * sigmoid_derivative(weighted_sum)
    # adjust weights
    weights += np.dot(X.T, adjustments)

In [168]:
print('Weights after training')
print(weights)

Weights after training
[[-3.05311332e-16]
 [ 1.38777878e-16]]


In [169]:
print('Output after training')
print(activated_output)

Output after training
[[0.5]
 [0.5]
 [0.5]
 [0.5]]
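
Without a bias term, the input (0, 0) always gives a weighted sum of 0, and sigmoid(0) = 0.5, so the model above cannot output 1 for that row. As a hedged alternative, the sketch below swaps in the classic perceptron learning rule (unit step activation, plus a learned bias) instead of the sigmoid updates used above. The names `X_nand`, `y_nand`, `w`, `b`, and `learning_rate` are introduced here for illustration.

```python
import numpy as np

X_nand = np.array([[0, 0], [1, 0], [0, 1], [1, 1]])
y_nand = np.array([1, 1, 1, 0])

w = np.zeros(2)        # one weight per input
b = 0.0                # bias term, learned alongside the weights
learning_rate = 1.0

def step(z):
    """Unit step activation: 1 if z > 0, else 0."""
    return (z > 0).astype(int)

for epoch in range(100):
    mistakes = 0
    for xi, target in zip(X_nand, y_nand):
        prediction = int(np.dot(xi, w) + b > 0)
        # The update is zero when the prediction is already correct
        update = learning_rate * (target - prediction)
        w += update * xi
        b += update
        mistakes += int(update != 0)
    if mistakes == 0:   # stop once a full pass makes no errors
        break

predictions = step(X_nand @ w + b)
accuracy = np.mean(predictions == y_nand)
print(predictions, accuracy)
```

Because NAND is linearly separable, this rule is guaranteed to stop making mistakes after a finite number of passes, which is why the loop can exit early.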


## Implement your own Perceptron Class and use it to classify a binary dataset: 
- [The Pima Indians Diabetes dataset](https://raw.githubusercontent.com/ryanleeallred/datasets/master/diabetes.csv) 

You may need to search for others' implementations in order to get inspiration for your own. There are *lots* of perceptron implementations on the internet with varying levels of sophistication and complexity. Whatever your approach, make sure you understand **every** line of your implementation and what its purpose is.

In [225]:
diabetes = pd.read_csv('https://raw.githubusercontent.com/ryanleeallred/datasets/master/diabetes.csv')
diabetes.head()

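|   | Pregnancies | Glucose | BloodPressure | SkinThickness | Insulin | BMI  | DiabetesPedigreeFunction | Age | Outcome |
|---|-------------|---------|---------------|---------------|---------|------|--------------------------|-----|---------|
| 0 | 6           | 148     | 72            | 35            | 0       | 33.6 | 0.627                    | 50  | 1       |
| 1 | 1           | 85      | 66            | 29            | 0       | 26.6 | 0.351                    | 31  | 0       |
| 2 | 8           | 183     | 64            | 0             | 0       | 23.3 | 0.672                    | 32  | 1       |
| 3 | 1           | 89      | 66            | 23            | 94      | 28.1 | 0.167                    | 21  | 0       |
| 4 | 0           | 137     | 40            | 35            | 168     | 43.1 | 2.288                    | 33  | 1       |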


Although neural networks can handle non-normalized data, scaling or normalizing your data will improve your neural network's learning speed. Try to apply the sklearn `MinMaxScaler` or `Normalizer` to your diabetes dataset. 

In [226]:
from sklearn.preprocessing import MinMaxScaler, Normalizer

feats = list(diabetes)[:-1]
transformer = Normalizer()
X = diabetes[feats]
X = transformer.fit_transform(X)
y = diabetes['Outcome']
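
The two scalers mentioned above behave differently, which a small comparison can show. This is a sketch on a hypothetical toy array (`toy`), not the diabetes data: `MinMaxScaler` rescales each column to [0, 1], while `Normalizer` rescales each row to unit length.

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler, Normalizer

# Hypothetical toy data: 3 samples (rows), 2 features (columns)
toy = np.array([[3.0, 4.0],
                [6.0, 8.0],
                [0.0, 10.0]])

# Column-wise: each feature is mapped to the range [0, 1]
col_scaled = MinMaxScaler().fit_transform(toy)
# Row-wise: each sample is divided by its own L2 norm
row_scaled = Normalizer().fit_transform(toy)

print(col_scaled)
print(row_scaled)
```

Note that the first two rows become identical after `Normalizer`, because they point in the same direction and differ only in magnitude. If the magnitude of a feature matters for the prediction, column-wise scaling may be the safer choice.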

In [239]:
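class Perceptron:

    def __init__(self, rate=0.01, niter=10):
        self.rate = rate      # learning rate applied to each update
        self.niter = niter    # number of passes over the training data

    def __sigmoid(self, x):
        return 1 / (1 + np.exp(-x))

    def __sigmoid_derivative(self, x):
        sx = self.__sigmoid(x)
        return sx * (1 - sx)

    def fit(self, X, y):
        '''
        Fit training data
        X : Training vectors, X.shape : [#samples, #features]
        y : Target values, y.shape : [#samples]
        '''
        # Accept pandas or NumPy inputs; make y a column vector
        X = np.asarray(X, dtype=float)
        y = np.asarray(y, dtype=float).reshape(-1, 1)

        # Randomly initialize weights in [-1, 1); start the bias at zero
        self.weights = 2 * np.random.random((X.shape[1], 1)) - 1
        self.bias = 0.0

        for i in range(self.niter):
            # Weighted sum of inputs and weights, plus bias
            weighted_sum = np.dot(X, self.weights) + self.bias
            # Activate!
            activated_output = self.__sigmoid(weighted_sum)
            # Calc error
            error = y - activated_output
            # Update the weights and bias, scaled by the learning rate
            adjustments = error * self.__sigmoid_derivative(weighted_sum)
            self.weights += self.rate * np.dot(X.T, adjustments)
            self.bias += self.rate * adjustments.sum()
        return self

    def predict(self, X):
        '''Return class label (0 or 1) by thresholding the sigmoid output at 0.5'''
        X = np.asarray(X, dtype=float)
        weighted_sum = np.dot(X, self.weights) + self.bias
        return (self.__sigmoid(weighted_sum) >= 0.5).astype(int).ravel()

In [240]:
# Pass niter by keyword; a positional 1000 would set the learning rate instead
perceptron = Perceptron(niter=1000)
perceptron.fit(X, y)
y_pred = perceptron.predict(X)
y_pred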

## Stretch Goals:

- Research "backpropagation" to learn how weights get updated in neural networks (tomorrow's lecture). 
- Implement a multi-layer perceptron. (for non-linearly separable classes)
- Try and implement your own backpropagation algorithm.
- What are the pros and cons of the different activation functions? How should you decide between them for the different layers of a neural network?
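
As a starting point for the last question, here is a minimal sketch of three common activation functions evaluated on the same inputs. It only illustrates their output ranges: sigmoid maps to (0, 1), tanh maps to (-1, 1) and is zero-centered, and ReLU passes positive values through unchanged while zeroing out negatives. The sample array `z` is arbitrary.

```python
import numpy as np

def sigmoid(x):
    """Output in (0, 1); often used for binary output layers."""
    return 1 / (1 + np.exp(-x))

def tanh(x):
    """Output in (-1, 1); zero-centered."""
    return np.tanh(x)

def relu(x):
    """Zero for negative inputs, identity for positive inputs."""
    return np.maximum(0, x)

z = np.array([-2.0, 0.0, 2.0])  # arbitrary sample inputs
for name, fn in [("sigmoid", sigmoid), ("tanh", tanh), ("relu", relu)]:
    print(name, fn(z))
```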