<img align="left" src="https://lever-client-logos.s3.amazonaws.com/864372b1-534c-480e-acd5-9711f850815c-1524247202159.png" width=200>
<br></br>
<br></br>

# Neural Networks

## *Data Science Unit 4 Sprint 2 Assignment 1*

## Define the Following:
You can add images, diagrams, or whatever else you need to make sure you understand the concepts below.

### Input Layer: layer to a network where external values are input
### Hidden Layer: layer between the input and output layers
### Output Layer: final layer of a network
### Neuron: node in a neural network with associated activation and connecting weights
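### Weight: strength of the connection between two neurons; it scales the signal passed from one to the other
### Activation Function: function applied to a neuron's weighted input sum (plus bias) to produce its output value
### Node Map: diagrammatic representation of a neural network, showing its neurons and the connections between them
### Perceptron: the simplest kind of neural network, a single neuron that maps weighted inputs directly to an output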


## Inputs -> Outputs

### Explain the flow of information through a neural network from inputs to outputs. Be sure to include: inputs, weights, bias, and activation functions. How does it all flow from beginning to end?

#### Your Answer Here
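
A minimal sketch of that flow for a single neuron, using the four NAND input rows from the next section. The weights and bias here are hand-picked for illustration (an assumption, not trained values): each input is multiplied by its weight, the products are summed together with the bias, and the sum is passed through a sigmoid activation to produce the output.

```python
import numpy as np

def sigmoid(z):
    """Logistic activation: maps any real number into (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

# Four input rows, one per (x1, x2) combination
inputs = np.array([[0, 0], [1, 0], [0, 1], [1, 1]])

# Hand-picked (hypothetical) parameters, chosen so the neuron behaves like NAND
weights = np.array([-2.0, -2.0])
bias = 3.0

# Step 1: weighted sum of the inputs, plus the bias
weighted_sum = inputs @ weights + bias

# Step 2: the activation function turns each sum into an output in (0, 1)
activated_output = sigmoid(weighted_sum)

# Step 3: threshold at 0.5 to read the output as a class label
labels = (activated_output >= 0.5).astype(int)
print(labels)
```

Training replaces the hand-picked numbers: the weights and bias are adjusted repeatedly so the activated output moves closer to the target.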

## Write your own perceptron code that can correctly classify (99.0% accuracy) a NAND gate. 

| x1 | x2 | y |
|----|----|---|
| 0  | 0  | 1 |
| 1  | 0  | 1 |
| 0  | 1  | 1 |
| 1  | 1  | 0 |

In [33]:
import numpy as np
import pandas as pd
data = { 'x1': [0,1,0,1],
         'x2': [0,0,1,1],
         'y':  [1,1,1,0]
       }

df = pd.DataFrame.from_dict(data).astype('int')

In [34]:
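def sigmoid(x):
    # Logistic function: squashes any real value into the range (0, 1)
    return 1 / (1 + np.exp(-x))

def sigmoid_derivate(x):
    # Derivative of the sigmoid with respect to its input x
    sx = sigmoid(x)
    return sx * (1 - sx)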

In [35]:
weights = 2 * np.random.random((2,1)) - 1
weights

array([[ 0.0950294 ],
       [-0.93664987]])

In [37]:
inputs = df[['x1','x2']].values
inputs

array([[0, 0],
       [1, 0],
       [0, 1],
       [1, 1]])

In [38]:
correct_outputs = df['y'].values.reshape(-1,1).tolist()
correct_outputs

[[1], [1], [1], [0]]

In [39]:
weighted_sum = np.dot(inputs, weights)
weighted_sum

array([[ 0.        ],
       [ 0.0950294 ],
       [-0.93664987],
       [-0.84162047]])

In [40]:
activated_output = sigmoid(weighted_sum)
activated_output

array([[0.5       ],
       [0.52373949],
       [0.28157755],
       [0.3011936 ]])

In [41]:
error = correct_outputs - activated_output
error

array([[ 0.5       ],
       [ 0.47626051],
       [ 0.71842245],
       [-0.3011936 ]])

In [44]:
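# Note: activated_output is already a sigmoid output, so the usual delta-rule
# slope is activated_output * (1 - activated_output); sigmoid_derivate() re-applies sigmoid.
adjusted = error * sigmoid_derivate(activated_output)
adjusted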

array([[ 0.11750186],
       [ 0.11125942],
       [ 0.17609208],
       [-0.07361617]])

In [45]:
weights += np.dot(inputs.T, adjusted)
weights

array([[ 0.13267265],
       [-0.83417396]])

In [46]:
for iteration in range(10000):

    # Weighted sum of inputs and weights (no bias term in this version)
    weighted_sum = np.dot(inputs, weights)

    # Activate!
    activated_output = sigmoid(weighted_sum)

    # Calculate the error
    error = correct_outputs - activated_output

    # Scale the error by the sigmoid slope
    adjustments = error * sigmoid_derivate(activated_output)

    # Update the weights
    weights += np.dot(inputs.T, adjustments)
    
print("Weights after training")
print(weights)

print("Output after training")
print(activated_output)

Weights after training
[[ 1.38777878e-16]
 [-2.91433544e-16]]
Output after training
[[0.5]
 [0.5]
 [0.5]
 [0.5]]
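
One observation that may help explain this result: without a bias term, the input (0, 0) always has a weighted sum of 0, so its output is sigmoid(0) = 0.5 no matter what the weights are, while its NAND target is 1. The sketch below checks this for a few randomly drawn weight vectors (the seed and the number of trials are arbitrary choices for illustration).

```python
import numpy as np

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

rng = np.random.default_rng(0)  # arbitrary seed, for reproducibility only
zero_input = np.array([[0, 0]])

outputs = []
for _ in range(5):
    # Draw weights in [-1, 1), mirroring the initialization used above
    trial_weights = 2 * rng.random((2, 1)) - 1
    outputs.append(sigmoid(zero_input @ trial_weights).item())

print(outputs)  # every entry is 0.5, regardless of the weights drawn
```

A bias term shifts the weighted sum away from 0 for that row, which is what the class in the next cell adds.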


In [47]:
class Perceptron(object):

    def __init__(self, niter = 1000):
        # Number of passes over the full training set
        self.niter = niter

    def sigmoid(self, x):
        return 1 / (1 + np.exp(-x))

    def sigmoid_derivative(self, x):
        sx = self.sigmoid(x)
        return sx * (1 - sx)

    def fit(self, X, y):
        """Fit training data
        X : Training vectors, X.shape : [#samples, #features]
        y : Target values, y.shape : [#samples]
        """

        inputs = X
        correct_outputs = y.reshape(-1,1)
        # Randomly initialize weights in [-1, 1)
        weights = 2 * np.random.random((X.shape[1],1)) - 1
        # Note: this creates one bias per training row, not one shared bias,
        # so predict() only works on inputs with the same number of rows
        bias = 2 * np.random.random((len(X),1)) - 1

        for i in range(self.niter):

            # Weighted sum of inputs and weights, plus bias
            weighted_sum = np.dot(inputs, weights) + bias

            # Activate!
            activated_output = self.sigmoid(weighted_sum)

            # Calculate the error
            error = correct_outputs - activated_output

            # Scale the error by the sigmoid slope (evaluated, as in the cells
            # above, on the already-activated output)
            adjustments = error * self.sigmoid_derivative(activated_output)

            # Update the bias and the weights
            bias += error
            weights += np.dot(inputs.T, adjustments)
        self.activated_output = activated_output
        self.weights = weights
        self.bias = bias


    def predict(self, X):
        """Return sigmoid activations in (0, 1); threshold at 0.5 for class labels"""
        weighted_sum = np.dot(X, self.weights) + self.bias

        return self.sigmoid(weighted_sum)

In [49]:
nl = Perceptron()
nl.fit(df[['x1','x2']].values, df['y'].values)
nl.predict(df[['x1','x2']].values)

array([[9.98999631e-01],
       [9.98967676e-01],
       [9.98967118e-01],
       [9.38385420e-04]])

In [50]:
nl.activated_output

array([[9.98998630e-01],
       [9.98966643e-01],
       [9.98966083e-01],
       [9.39325592e-04]])

In [51]:
bias = 2 * np.random.random((4,1)) - 1
bias

array([[0.3811215 ],
       [0.55366477],
       [0.33317554],
       [0.42946973]])

In [52]:
nl.predict(df[['x1','x2']].values)

array([[9.98999631e-01],
       [9.98967676e-01],
       [9.98967118e-01],
       [9.38385420e-04]])
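
The class above keeps one bias value per training row, so `predict` only accepts inputs with as many rows as the training set. Below is a minimal sketch of a common alternative, assuming a single shared bias and the slope a * (1 - a) computed directly from the activated output. The class name `ScalarBiasPerceptron`, the `seed` parameter, the `predict_proba` helper, and the 0.5 threshold in `predict` are choices made for this illustration, not part of the original notebook.

```python
import numpy as np

class ScalarBiasPerceptron:
    """Single sigmoid neuron with one shared bias (illustrative sketch)."""

    def __init__(self, niter=10000, seed=0):
        self.niter = niter
        self.rng = np.random.default_rng(seed)  # seed is an arbitrary choice

    @staticmethod
    def sigmoid(x):
        return 1 / (1 + np.exp(-x))

    def fit(self, X, y):
        targets = y.reshape(-1, 1)
        # One weight per feature, drawn from [-1, 1); a single bias starting at 0
        self.weights = 2 * self.rng.random((X.shape[1], 1)) - 1
        self.bias = 0.0
        for _ in range(self.niter):
            activated = self.sigmoid(X @ self.weights + self.bias)
            error = targets - activated
            # Delta rule: error scaled by the sigmoid slope a * (1 - a)
            delta = error * activated * (1 - activated)
            self.weights += X.T @ delta
            self.bias += delta.sum()
        return self

    def predict_proba(self, X):
        return self.sigmoid(X @ self.weights + self.bias)

    def predict(self, X):
        # Threshold the activation at 0.5 to get a 0/1 class label
        return (self.predict_proba(X) >= 0.5).astype(int).ravel()

X = np.array([[0, 0], [1, 0], [0, 1], [1, 1]])
y = np.array([1, 1, 1, 0])

model = ScalarBiasPerceptron().fit(X, y)
print(model.predict(X))
# Because the bias is shared, prediction also works on a single row
print(model.predict(np.array([[1, 1]])))
```

With a shared bias the model has only three parameters for NAND (two weights and one bias), so it has to learn a separating line rather than store one offset per training row.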

## Implement your own Perceptron Class and use it to classify a binary dataset: 
- [The Pima Indians Diabetes dataset](https://raw.githubusercontent.com/ryanleeallred/datasets/master/diabetes.csv) 

You may need to search for others' implementations to get inspiration for your own. There are *lots* of perceptron implementations on the internet with varying levels of sophistication and complexity. Whatever your approach, make sure you understand **every** line of your implementation and what its purpose is.

In [53]:
diabetes = pd.read_csv('https://raw.githubusercontent.com/ryanleeallred/datasets/master/diabetes.csv')
diabetes.head()

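|   | Pregnancies | Glucose | BloodPressure | SkinThickness | Insulin | BMI | DiabetesPedigreeFunction | Age | Outcome |
|---|-------------|---------|---------------|---------------|---------|-----|--------------------------|-----|---------|
| 0 | 6 | 148 | 72 | 35 | 0 | 33.6 | 0.627 | 50 | 1 |
| 1 | 1 | 85 | 66 | 29 | 0 | 26.6 | 0.351 | 31 | 0 |
| 2 | 8 | 183 | 64 | 0 | 0 | 23.3 | 0.672 | 32 | 1 |
| 3 | 1 | 89 | 66 | 23 | 94 | 28.1 | 0.167 | 21 | 0 |
| 4 | 0 | 137 | 40 | 35 | 168 | 43.1 | 2.288 | 33 | 1 |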


Although neural networks can handle non-normalized data, scaling or normalizing your data usually speeds up training. Try applying the sklearn `MinMaxScaler` or `Normalizer` to your diabetes dataset.
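
For reference, `MinMaxScaler` with its default settings rescales each feature column to the range [0, 1] via (x - min) / (max - min). Here is a small NumPy sketch of that formula, using the Glucose and BMI values from the five rows shown above purely as illustrative inputs (the helper name `min_max_scale` is made up for this example):

```python
import numpy as np

def min_max_scale(X):
    """Rescale each column of X to [0, 1]; assumes no column is constant."""
    col_min = X.min(axis=0)
    col_max = X.max(axis=0)
    return (X - col_min) / (col_max - col_min)

# Glucose and BMI for the first five rows of the dataset
sample = np.array([
    [148, 33.6],
    [85, 26.6],
    [183, 23.3],
    [89, 28.1],
    [137, 43.1],
])

scaled = min_max_scale(sample)
print(scaled)
```

After scaling, both columns span the same range, so neither dominates the weighted sum just because of its units.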

In [56]:
from sklearn.preprocessing import MinMaxScaler, Normalizer

feats = list(diabetes)[:-1]

X = diabetes[feats].values
y = diabetes['Outcome'].values

scaler = MinMaxScaler()
X_scaled = scaler.fit_transform(X)

per = Perceptron()
per.fit(X_scaled, y)

pred = per.predict(X_scaled)
pred_in = (pred >= 0.5).astype(int).ravel()  # threshold activations at 0.5

res = pd.DataFrame(diabetes['Outcome'])
res['res'] = pred_in
res.head()

|   | Outcome | res |
|---|---------|-----|
| 0 | 1 | 1 |
| 1 | 0 | 0 |
| 2 | 1 | 1 |
| 3 | 0 | 0 |
| 4 | 1 | 1 |
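
To summarize predictions like these as a single number, one option is plain accuracy: the fraction of rows where the prediction equals the true label. Here is a short sketch using only the five rows displayed above (the function name `accuracy` is an assumption for this example, and agreement on five training rows says little about performance on unseen data):

```python
import numpy as np

def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    return float(np.mean(y_true == y_pred))

# The Outcome and res columns from the five rows shown above
outcome = [1, 0, 1, 0, 1]
res_head = [1, 0, 1, 0, 1]

print(accuracy(outcome, res_head))  # 1.0 for these five rows
```

The same function could be applied to the full `Outcome` and `res` columns, ideally on rows held out from training.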


## Stretch Goals:

- Research "backpropagation" to learn how weights get updated in neural networks (tomorrow's lecture). 
- Implement a multi-layer perceptron (for non-linearly separable classes).
- Try to implement your own backpropagation algorithm.
- What are the pros and cons of the different activation functions? How should you decide between them for the different layers of a neural network?
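
As a starting point for the backpropagation and multi-layer perceptron goals, here is a hedged sketch of a tiny network with one hidden layer, set up on XOR (a classic non-linearly separable problem). The layer sizes, learning rate, seed, squared-error loss, and all function names are assumptions chosen for illustration. The finite-difference comparison is a common way to check that hand-written gradients are right.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def forward(X, params):
    W1, b1, W2, b2 = params
    hidden = sigmoid(X @ W1 + b1)       # hidden-layer activations
    output = sigmoid(hidden @ W2 + b2)  # network output
    return hidden, output

def loss(X, y, params):
    _, output = forward(X, params)
    return 0.5 * np.sum((output - y) ** 2)

def backprop(X, y, params):
    """Gradients of the loss with respect to W1, b1, W2, b2."""
    W1, b1, W2, b2 = params
    hidden, output = forward(X, params)
    # Error signal at the output layer (chain rule through the sigmoid)
    delta_out = (output - y) * output * (1 - output)
    # Propagate the error signal back to the hidden layer
    delta_hidden = (delta_out @ W2.T) * hidden * (1 - hidden)
    return [X.T @ delta_hidden, delta_hidden.sum(axis=0),
            hidden.T @ delta_out, delta_out.sum(axis=0)]

rng = np.random.default_rng(0)  # arbitrary seed
X = np.array([[0, 0], [1, 0], [0, 1], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)  # XOR targets
params = [rng.normal(size=(2, 3)), np.zeros(3),
          rng.normal(size=(3, 1)), np.zeros(1)]

# Gradient check: compare backprop with a central finite difference on W1[0, 0]
eps = 1e-6
analytic = backprop(X, y, params)[0][0, 0]
params[0][0, 0] += eps
loss_plus = loss(X, y, params)
params[0][0, 0] -= 2 * eps
loss_minus = loss(X, y, params)
params[0][0, 0] += eps  # restore the original value
numeric = (loss_plus - loss_minus) / (2 * eps)
print(analytic, numeric)

# Plain gradient descent: step each parameter against its gradient
learning_rate = 0.5
initial_loss = loss(X, y, params)
for _ in range(5000):
    grads = backprop(X, y, params)
    params = [p - learning_rate * g for p, g in zip(params, grads)]
print(initial_loss, loss(X, y, params))
```

Whether training reaches a good XOR solution depends on the initialization and learning rate, so it is worth trying a few seeds and comparing the printed losses.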