<img align="left" src="https://lever-client-logos.s3.amazonaws.com/864372b1-534c-480e-acd5-9711f850815c-1524247202159.png" width=200>
<br></br>
<br></br>

# Neural Networks

## *Data Science Unit 4 Sprint 2 Assignment 1*

## Define the Following:
You can add images, diagrams, or whatever else you need to ensure that you understand the concepts below.

### Input Layer:

The input layer is the layer of the neural network that receives your data directly.
Each feature you want to feed into the network is assigned to its own input node in the input layer.

### Hidden Layer:

A hidden layer is any layer between the input and output layers. The values that pass through
its nodes come from other nodes rather than directly from user input or data.

### Output Layer:

The output layer is where the final value or class is produced and made visible.

### Neuron:

A neuron takes input values from other nodes, multiplies each by its weight, sums the results, and passes the sum through an activation function.
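
As a minimal sketch of that computation, the snippet below evaluates one neuron by hand. The input values, weights, and bias are made-up numbers chosen for illustration, the helper name `neuron` is hypothetical, and sigmoid is only one possible activation function. The bias term is the one described in the Inputs -> Outputs section.

```python
import numpy as np

def sigmoid(x):
    """Squash any real number into the range (0, 1)."""
    return 1 / (1 + np.exp(-x))

def neuron(inputs, weights, bias):
    """Weighted sum of the inputs plus a bias, passed through sigmoid."""
    weighted_sum = np.dot(inputs, weights) + bias
    return sigmoid(weighted_sum)

# Hypothetical values: three inputs arriving from the previous layer.
inputs = np.array([0.5, 0.1, 0.9])
weights = np.array([0.4, -0.6, 0.2])
bias = 0.1

output = neuron(inputs, weights, bias)
print(output)
```

Changing the sign of a weight flips the direction in which its input moves the output, which is the role of the weight described in the next definition.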

### Weight:

A weight is a value that determines the importance of a particular feature.
More important features have larger weights (in absolute value), and the sign of the weight indicates the direction of the correlation.

### Activation Function:

The activation function transforms the weighted sum computed by a neuron, often squashing it into a fixed range such as 0 to 1. It also determines how much of the signal is sent to the next layer.
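
To make the idea concrete, the sketch below applies three commonly used activation functions to the same weighted sums. The sample values in `z` are arbitrary, and the choice of sigmoid, tanh, and ReLU is an assumption about which functions are worth comparing, not a list taken from this assignment.

```python
import numpy as np

def sigmoid(x):
    """Output in (0, 1); often read as a probability."""
    return 1 / (1 + np.exp(-x))

def tanh(x):
    """Output in (-1, 1); centered on zero."""
    return np.tanh(x)

def relu(x):
    """Passes positive values unchanged and blocks negative ones."""
    return np.maximum(0, x)

# Arbitrary weighted sums: one negative, one zero, one positive.
z = np.array([-2.0, 0.0, 2.0])

for name, fn in [("sigmoid", sigmoid), ("tanh", tanh), ("relu", relu)]:
    print(name, fn(z))
```

Sigmoid and tanh bound the output, while ReLU only cuts off the negative side, so each one sends a different amount of signal to the next layer for the same input.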

### Node Map:

A node map shows all of the individual nodes, their type, and which other nodes they are connected to. 

### Perceptron:

A perceptron is the simplest neural network: a single neuron with a given number of input nodes that produces one output after applying its weights and an activation function.

## Inputs -> Outputs

### Explain the flow of information through a neural network from inputs to outputs. Be sure to include: inputs, weights, bias, and activation functions. How does it all flow from beginning to end?

Each feature used in the neural network enters through its own input node. The value of each node is multiplied by its weight, and the products are summed.
The bias is then added to that sum for each row, and the result is passed through an activation function and returned as the output.
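
The flow described above can be sketched in a few lines of NumPy. The inputs are the four rows of the NAND truth table from the next section; the weights and bias are hand-picked for illustration (they are not learned here), and the 0.5 threshold used to turn the activation into a class is an assumption.

```python
import numpy as np

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

# Inputs: each row is one observation, each column one input node.
X = np.array([[0, 0],
              [1, 0],
              [0, 1],
              [1, 1]])

# Hand-picked (not learned) parameters, used only to illustrate the flow.
weights = np.array([[-2.0],
                    [-2.0]])
bias = 3.0

# Multiply each input by its weight, sum per row, then add the bias.
weighted_sum = X @ weights + bias

# Pass the result through the activation function.
activated = sigmoid(weighted_sum)

# Turn the activation into a class label with a 0.5 threshold.
predicted = (activated > 0.5).astype(int)

print(predicted.ravel())
```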

## Write your own perceptron code that can correctly classify (99.0% accuracy) a NAND gate. 

| x1 | x2 | y |
|----|----|---|
| 0  | 0  | 1 |
| 1  | 0  | 1 |
| 0  | 1  | 1 |
| 1  | 1  | 0 |

In [1]:
import pandas as pd
import numpy as np

In [335]:
data = { 'x1': [0,1,0,1],
         'x2': [0,0,1,1],
         'y':  [1,1,1,0]
       }

df = pd.DataFrame.from_dict(data).astype('int')
df

|   | x1 | x2 | y |
|---|----|----|---|
| 0 | 0  | 0  | 1 |
| 1 | 1  | 0  | 1 |
| 2 | 0  | 1  | 1 |
| 3 | 1  | 1  | 0 |


In [233]:
def sigmoid(x):
    return 1 / (1 + np.exp(-x))

def sigmoid_derivative(x):
    sx = sigmoid(x)
    return sx * (1-sx)

In [427]:
# setting initial weights
weights = 2 * np.random.random((2,1)) - 1
print(weights)

# setting bias
bias = 1

[[ 0.45554027]
 [-0.09488714]]


In [450]:
# creates weighted sum for each row
df['w_sum'] = (df['x1'] * weights[0] + df['x2'] * weights[1]) + bias

# creates activated value for each row
df['a_value'] = df['w_sum'].apply(sigmoid)

# gives error for each row
df['error'] = df['y'] - df['a_value']

# gets adjustment value
df['adjustment'] = df['error']*df['a_value'].apply(sigmoid_derivative)

# creating adjustment values to add to weights
adj1 = np.dot(df['adjustment'], df['x1'])
adj2 = np.dot(df['adjustment'], df['x2'])

# adjusting weights
weights[0] += adj1
weights[1] += adj2

df.head()

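|   | x1 | x2 | y | w_sum | a_value | error | adjustment |
|---|----|----|---|-------|---------|-------|------------|
| 0 | 0 | 0 | 1 | 1.0 | 0.731059 | 0.268941 | 0.058995 |
| 1 | 1 | 0 | 1 | 0.402723 | 0.599342 | 0.400658 | 0.091682 |
| 2 | 0 | 1 | 1 | 0.253363 | 0.563004 | 0.436996 | 0.101029 |
| 3 | 1 | 1 | 0 | -0.343914 | 0.414859 | -0.414859 | -0.099377 |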


In [461]:
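def train_network(df, *, bias=1, iters=100):
    """Train a single sigmoid neuron on the x1, x2, and y columns of df.

    The bias is held fixed; only the two weights are updated.
    The intermediate columns are written back to df on every iteration.
    """
    # random initial weights in [-1, 1)
    weights = 2 * np.random.random((2,1)) - 1

    for i in range(iters):
        # weighted sum for each row
        df['w_sum'] = (df['x1'] * weights[0] + df['x2'] * weights[1]) + bias

        # activated value for each row
        df['a_value'] = df['w_sum'].apply(sigmoid)

        # error for each row
        df['error'] = df['y'] - df['a_value']

        # error scaled by sigmoid_derivative evaluated at the activated value
        df['adjustment'] = df['error']*df['a_value'].apply(sigmoid_derivative)

        # add each input's share of the adjustment to its weight
        weights[0] += np.dot(df['adjustment'], df['x1'])
        weights[1] += np.dot(df['adjustment'], df['x2'])

    return df, weights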

In [472]:
trained_df, weights = train_network(df, bias= -1, iters=1000)

print(f'weights =  \n {weights}')

trained_df.head()

weights =  
 [[0.70177547]
 [0.70177547]]


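|   | x1 | x2 | y | w_sum | a_value | error | adjustment |
|---|----|----|---|-------|---------|-------|------------|
| 0 | 0 | 0 | 1 | -1.0 | 0.268941 | 0.731059 | 0.179499 |
| 1 | 1 | 0 | 1 | -0.298225 | 0.425992 | 0.574008 | 0.137184 |
| 2 | 0 | 1 | 1 | -0.298225 | 0.425992 | 0.574008 | 0.137184 |
| 3 | 1 | 1 | 0 | 0.403551 | 0.599541 | -0.599541 | -0.137184 |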
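
The run recorded above does not reproduce the `y` column. With the bias held at -1, the activated value for the (0, 0) row is sigmoid(-1), about 0.27, whatever the weights are, while its target is 1. One possible adjustment, sketched below as an assumption rather than as the intended solution, is to train the bias alongside the weights and to evaluate `sigmoid_derivative` at the weighted sum instead of at the activated value. The function name `train_nand`, the seed, the iteration count, and the 0.5 threshold are illustrative choices.

```python
import numpy as np

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

def sigmoid_derivative(x):
    sx = sigmoid(x)
    return sx * (1 - sx)

def train_nand(iters=10000, seed=0):
    """Train one sigmoid neuron on the NAND table, bias included."""
    rng = np.random.default_rng(seed)
    X = np.array([[0, 0], [1, 0], [0, 1], [1, 1]], dtype=float)
    y = np.array([[1], [1], [1], [0]], dtype=float)

    # Random initial weights in [-1, 1); the bias starts at zero.
    weights = 2 * rng.random((2, 1)) - 1
    bias = 0.0

    for _ in range(iters):
        w_sum = X @ weights + bias          # weighted sum per row
        a_value = sigmoid(w_sum)            # activation per row
        error = y - a_value                 # error per row
        # Derivative is evaluated at the weighted sum, not the activation.
        adjustment = error * sigmoid_derivative(w_sum)
        weights += X.T @ adjustment         # update both weights
        bias += adjustment.sum()            # update the bias as well

    return weights, bias, sigmoid(X @ weights + bias)

weights, bias, a_value = train_nand()
predictions = (a_value > 0.5).astype(int).ravel()
print(predictions)
```

Because the (0, 0) row contributes nothing through the weights, the bias is the only parameter that can move its output, which is why this sketch updates it.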


## Implement your own Perceptron Class and use it to classify a binary dataset: 
- [The Pima Indians Diabetes dataset](https://raw.githubusercontent.com/ryanleeallred/datasets/master/diabetes.csv) 

You may need to search for others' implementations in order to get inspiration for your own. There are *lots* of perceptron implementations on the internet with varying levels of sophistication and complexity. Whatever your approach, make sure you understand **every** line of your implementation and what its purpose is.

In [104]:
diabetes = pd.read_csv('https://raw.githubusercontent.com/ryanleeallred/datasets/master/diabetes.csv')
diabetes.head()

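|   | Pregnancies | Glucose | BloodPressure | SkinThickness | Insulin | BMI | DiabetesPedigreeFunction | Age | Outcome |
|---|-------------|---------|---------------|---------------|---------|-----|--------------------------|-----|---------|
| 0 | 6 | 148 | 72 | 35 | 0 | 33.6 | 0.627 | 50 | 1 |
| 1 | 1 | 85 | 66 | 29 | 0 | 26.6 | 0.351 | 31 | 0 |
| 2 | 8 | 183 | 64 | 0 | 0 | 23.3 | 0.672 | 32 | 1 |
| 3 | 1 | 89 | 66 | 23 | 94 | 28.1 | 0.167 | 21 | 0 |
| 4 | 0 | 137 | 40 | 35 | 168 | 43.1 | 2.288 | 33 | 1 |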


Although neural networks can handle non-normalized data, scaling or normalizing your data will improve your neural network's learning speed. Try to apply the sklearn `MinMaxScaler` or `Normalizer` to your diabetes dataset. 
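
For intuition, min-max scaling maps each value to `(x - min) / (max - min)`, so the smallest value becomes 0 and the largest becomes 1. The sketch below applies that formula by hand to the five Glucose values shown in the preview above. Since the minimum and maximum are taken from these five rows only, the results will differ from scaling the full dataset.

```python
import numpy as np

# The five Glucose values from the preview rows (not the full column).
glucose = np.array([148.0, 85.0, 183.0, 89.0, 137.0])

# Min-max scaling: shift by the minimum, divide by the range.
scaled = (glucose - glucose.min()) / (glucose.max() - glucose.min())

print(scaled)
```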

In [139]:
from sklearn.preprocessing import MinMaxScaler, Normalizer
scaler = MinMaxScaler()

# every column except the last ('Outcome') is a feature
feats = list(diabetes)[:-1]

# cast the feature columns to float before scaling
diabetes[feats] = diabetes[feats].astype(float)

X = diabetes[feats]

# rescale each feature to the range [0, 1]
df1 = pd.DataFrame(scaler.fit_transform(X), columns=feats)

In [148]:
df2 = pd.concat([df1, diabetes['Outcome']], axis=1)
df2.head()

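|   | Pregnancies | Glucose | BloodPressure | SkinThickness | Insulin | BMI | DiabetesPedigreeFunction | Age | Outcome |
|---|-------------|---------|---------------|---------------|---------|-----|--------------------------|-----|---------|
| 0 | 0.352941 | 0.743719 | 0.590164 | 0.353535 | 0.0 | 0.500745 | 0.234415 | 0.483333 | 1 |
| 1 | 0.058824 | 0.427136 | 0.540984 | 0.292929 | 0.0 | 0.396423 | 0.116567 | 0.166667 | 0 |
| 2 | 0.470588 | 0.919598 | 0.52459 | 0.0 | 0.0 | 0.347243 | 0.253629 | 0.183333 | 1 |
| 3 | 0.058824 | 0.447236 | 0.540984 | 0.232323 | 0.111111 | 0.418778 | 0.038002 | 0.0 | 0 |
| 4 | 0.0 | 0.688442 | 0.327869 | 0.353535 | 0.198582 | 0.642325 | 0.943638 | 0.2 | 1 |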


In [None]:
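##### Update this Class #####

class Perceptron(object):
    """Single sigmoid neuron trained with batch updates.

    One possible completion of the template. The bias term and the
    learning rate `lr` are assumptions; the template does not specify them.
    """

    def __init__(self, niter = 10, lr = 0.01):
        self.niter = niter
        self.lr = lr

    def __sigmoid(self, x):
        return 1 / (1 + np.exp(-x))

    def __sigmoid_derivative(self, x):
        sx = self.__sigmoid(x)
        return sx * (1-sx)

    def fit(self, X, y):
        """Fit training data
        X : Training vectors, X.shape : [#samples, #features]
        y : Target values, y.shape : [#samples]
        """
        X = np.asarray(X, dtype=float)
        y = np.asarray(y, dtype=float).reshape(-1, 1)

        # Randomly Initialize Weights in [-1, 1); the bias starts at zero
        self.weights = 2 * np.random.random((X.shape[1], 1)) - 1
        self.bias = 0.0

        for i in range(self.niter):
            # Weighted sum of inputs / weights
            weighted_sum = np.dot(X, self.weights) + self.bias

            # Activate!
            activated = self.__sigmoid(weighted_sum)

            # Calc error
            error = y - activated

            # Update the Weights (and the bias)
            adjustment = error * self.__sigmoid_derivative(weighted_sum)
            self.weights += self.lr * np.dot(X.T, adjustment)
            self.bias += self.lr * adjustment.sum()

        return self

    def predict(self, X):
        """Return class label after unit step"""
        weighted_sum = np.dot(np.asarray(X, dtype=float), self.weights) + self.bias
        # label 1 when the activation reaches 0.5, otherwise 0
        return np.where(self.__sigmoid(weighted_sum) >= 0.5, 1, 0).ravel()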

## Stretch Goals:

- Research "backpropagation" to learn how weights get updated in neural networks (tomorrow's lecture). 
- Implement a multi-layer perceptron. (for non-linearly separable classes)
- Try to implement your own backpropagation algorithm.
- What are the pros and cons of the different activation functions? How should you decide between them for the different layers of a neural network?
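
For the multi-layer perceptron and backpropagation goals, one possible starting point is sketched below. It is an illustration under assumptions, not a reference solution: the XOR data, the 2-4-1 layer sizes, the learning rate `lr`, the seed, and the iteration count are all arbitrary choices.

```python
import numpy as np

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

rng = np.random.default_rng(42)

# XOR is not linearly separable, so a single perceptron cannot learn it.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)

# Hypothetical architecture: 2 inputs -> 4 hidden neurons -> 1 output.
W1 = 2 * rng.random((2, 4)) - 1
b1 = np.zeros((1, 4))
W2 = 2 * rng.random((4, 1)) - 1
b2 = np.zeros((1, 1))
lr = 0.5

losses = []
for _ in range(5000):
    # Forward pass: inputs -> hidden layer -> output layer.
    hidden = sigmoid(X @ W1 + b1)
    output = sigmoid(hidden @ W2 + b2)
    error = y - output
    losses.append(float(np.mean(error ** 2)))

    # Backward pass: for a sigmoid output s, the derivative is s * (1 - s).
    delta_out = error * output * (1 - output)
    delta_hidden = (delta_out @ W2.T) * hidden * (1 - hidden)

    # Move each parameter in the direction that reduces the error.
    W2 += lr * hidden.T @ delta_out
    b2 += lr * delta_out.sum(axis=0, keepdims=True)
    W1 += lr * X.T @ delta_hidden
    b1 += lr * delta_hidden.sum(axis=0, keepdims=True)

print(f"mean squared error: {losses[0]:.4f} -> {losses[-1]:.4f}")
```

The printed loss shows whether training moved in the right direction. Whether the network fully separates XOR depends on the initial weights and the number of iterations.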