<img align="left" src="https://lever-client-logos.s3.amazonaws.com/864372b1-534c-480e-acd5-9711f850815c-1524247202159.png" width=200>
<br></br>
<br></br>

# Neural Networks

## *Data Science Unit 4 Sprint 2 Assignment 1*

## Define the Following:
You can add images, diagrams, or whatever else you need to ensure that you understand the concepts below.

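### Input Layer: The visible layer that receives the raw feature values from the dataset and passes them into the network, one node per feature.
### Hidden Layer: Any layer between the input and output layers. Its values are not observed directly; they are computed from the previous layer's outputs.
### Output Layer: The final layer, which outputs the results in a form suitable for the problem (e.g. a probability for binary classification).
### Neuron: A single node that computes a weighted sum of its inputs plus a bias and passes it through an activation function.
### Weight: The value that an input is multiplied by; a learned scaling factor on the connection between two nodes.
### Activation Function: The function applied to a node's weighted sum (plus bias) to produce that node's output. It typically introduces non-linearity, and in the output layer it converts results into a form that is suitable for the problem.
### Node Map: A diagram of a neural network's topology detailing the nodes and their connections.
### Perceptron: The simplest form of neural network: a single neuron that takes inputs, computes their weighted sum plus a bias, and passes the result through an activation function to produce an output.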


## Inputs -> Outputs

### Explain the flow of information through a neural network from inputs to outputs. Be sure to include: inputs, weights, bias, and activation functions. How does it all flow from beginning to end?


The neural net receives inputs through the input layer, which has one node per feature. Each input is passed along weighted connections to the nodes of the first hidden layer, where every node computes the weighted sum of its inputs, adds a bias, and passes the result through an activation function. The activated outputs are then passed into additional hidden layers, if there are any, which repeat the same steps. Finally, the output layer applies its own weights, bias, and activation function, which formats the results into a response that suits the problem (e.g. squashing values into the range (0, 1) for probabilities), and returns the result.
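The flow described above can be sketched in a few lines of NumPy. This is only an illustration: the layer sizes, the random weights, and the helper name `forward` are assumptions made for the example, not part of the assignment.

```python
import numpy as np

def sigmoid(z):
    """Squash any real value into the open interval (0, 1)."""
    return 1 / (1 + np.exp(-z))

def forward(x, layers):
    """Pass x through each (weights, bias) pair in order."""
    a = x
    for W, b in layers:
        z = a @ W + b      # weighted sum of the inputs plus the bias
        a = sigmoid(z)     # activation applied at every layer
    return a

# Hypothetical network: 2 inputs -> 3 hidden nodes -> 1 output
rng = np.random.default_rng(0)
layers = [
    (rng.normal(size=(2, 3)), np.zeros(3)),
    (rng.normal(size=(3, 1)), np.zeros(1)),
]

x = np.array([[0.0, 1.0], [1.0, 1.0]])  # two samples, two features
out = forward(x, layers)
print(out.shape)  # (2, 1): one output value per sample
```

Because the weights here are random and untrained, the output values themselves are meaningless; the point is only the order of operations at each layer.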

## Write your own perceptron code that can correctly classify (99.0% accuracy) a NAND gate. 

| x1 | x2 | y |
|----|----|---|
| 0  | 0  | 1 |
| 1  | 0  | 1 |
| 0  | 1  | 1 |
| 1  | 1  | 0 |

In [167]:
import pandas as pd
import numpy as np
import random

data = { 'x1': [0,1,0,1],
         'x2': [0,0,1,1],
         'y':  [1,1,1,0]
       }

df = pd.DataFrame.from_dict(data).astype('int')
df

|   | x1 | x2 | y |
|---|----|----|---|
| 0 | 0  | 0  | 1 |
| 1 | 1  | 0  | 1 |
| 2 | 0  | 1  | 1 |
| 3 | 1  | 1  | 0 |


In [14]:
##### Your Code Here #####
# split data set into inputs and desired outputs
# add a column of 1's to act as our bias 

X = df[['x1', 'x2']].copy()  # copy to avoid pandas' SettingWithCopyWarning
X['bias'] = 1
y = df['y']

X.head()

|   | x1 | x2 | bias |
|---|----|----|------|
| 0 | 0  | 0  | 1    |
| 1 | 1  | 0  | 1    |
| 2 | 0  | 1  | 1    |
| 3 | 1  | 1  | 1    |


In [16]:
# define our activation function and its derivative

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

def sigmoid_derivative(x):
    sx = sigmoid(x)
    return sx * (1-sx)

In [64]:
np.random.seed(42)

weights = np.random.random((3,1))
weights

array([[0.37454012],
       [0.95071431],
       [0.73199394]])

In [65]:
weighted_sum = np.dot(X, weights)
weighted_sum

array([[0.73199394],
       [1.10653406],
       [1.68270825],
       [2.05724837]])

In [66]:
activated_output = sigmoid(weighted_sum)
activated_output

array([[0.67524268],
       [0.75148239],
       [0.84326281],
       [0.88667798]])

In [67]:
error = y.to_list() - activated_output.T
error

array([[ 0.32475732,  0.24851761,  0.15673719, -0.88667798]])

In [68]:
adjustments = np.multiply(error, sigmoid_derivative(weighted_sum).T)
adjustments

array([[ 0.07121603,  0.04641231,  0.02071605, -0.08909353]])

In [69]:
weights += np.dot(X.T, adjustments.T)
weights

array([[0.3318589 ],
       [0.88233683],
       [0.78124481]])
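The single update carried out in the cells above can be written compactly. Here $z$ corresponds to `weighted_sum`, $a$ to `activated_output`, $y - a$ to `error`, and $\delta$ to `adjustments`; the symbol names are chosen for this note and do not appear in the notebook.

```latex
\begin{aligned}
z &= X w, \qquad a = \sigma(z) = \frac{1}{1 + e^{-z}} \\
\delta &= (y - a) \odot \sigma'(z), \qquad \sigma'(z) = \sigma(z)\,\bigl(1 - \sigma(z)\bigr) \\
w &\leftarrow w + X^{\top} \delta
\end{aligned}
```

One way to read this (an interpretation, since the notebook does not state it) is as a gradient-descent step on the squared error $E = \tfrac{1}{2}\sum_i (y_i - a_i)^2$ with an implicit learning rate of 1, because $\nabla_w E = -X^{\top}\delta$.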

In [70]:
for iteration in range(10000):
    
    # Weighted sum of inputs / weights
    weighted_sum = np.dot(X, weights)
    
    # Activate!
    activated_output = sigmoid(weighted_sum)
    
    # Calc error
    error = y.to_list() - activated_output.T
    
    adjustments = np.multiply(error, sigmoid_derivative(weighted_sum).T)
    
    # Update the Weights
    weights += np.dot(X.T, adjustments.T)
    
print("Weights after training")
print(weights)

print("Output after training")
print(activated_output)

Weights after training
[[-8.01846465]
 [-8.01846465]
 [12.1141634 ]]
Output after training
[[0.99999452]
 [0.98362753]
 [0.98362753]
 [0.0194034 ]]
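As a quick sanity check, the trained weights printed above can be plugged back into the truth table by hand. The sketch below copies the weight values from that output and assumes the same column order as `X` (x1, x2, bias).

```python
import numpy as np

# Trained weights copied from the output above: [w_x1, w_x2, w_bias]
w = np.array([-8.01846465, -8.01846465, 12.1141634])

# NAND truth table with a trailing bias column of ones
X = np.array([[0, 0, 1],
              [1, 0, 1],
              [0, 1, 1],
              [1, 1, 1]])

z = X @ w                       # weighted sums before activation
probs = 1 / (1 + np.exp(-z))    # sigmoid activation
labels = (z >= 0).astype(int)   # sigmoid(z) >= 0.5 exactly when z >= 0

print(z)
print(labels)  # [1 1 1 0]
```

The bias alone is large enough to push the output toward 1, a single active input is not enough to overcome it, and only both inputs together make the weighted sum negative, which is the NAND behavior.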


In [72]:
# write a function to convert our outputs to binary values

def binary_cutoff(value, cutoff=0.5):
    if value >= cutoff:
        return 1
    else:
        return 0

In [82]:
# test our binary cutoff with most recent outputs
results = pd.DataFrame(activated_output)[0].apply(binary_cutoff)
results

0    1
1    1
2    1
3    0
Name: 0, dtype: int64

In [180]:
# Generate some fake data for a NAND gate

num_rows = 1000

# Sample rows (with replacement) from the NAND truth table.
# DataFrame.append was removed in pandas 2.0, so select all rows in one step.
row_idx = np.random.randint(4, size=num_rows)
test_data = df.iloc[row_idx].reset_index(drop=True)

test_data.shape

(1000, 3)

In [170]:
test_data

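|     | x1  | x2  | y   |
|-----|-----|-----|-----|
| 0   | 1   | 0   | 1   |
| 1   | 1   | 1   | 0   |
| 2   | 1   | 0   | 1   |
| 3   | 1   | 0   | 1   |
| 4   | 1   | 0   | 1   |
| ... | ... | ... | ... |
| 995 | 1   | 0   | 1   |
| 996 | 1   | 1   | 0   |
| 997 | 0   | 1   | 1   |
| 998 | 0   | 0   | 1   |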


In [171]:
# write model to generate predictions

def model(input_df, weights):
    
    # create our input dataframe (only x values and bias)
    X = input_df[['x1', 'x2']].astype(float)
    X['bias'] = [1] * len(input_df)
    
    # Weighted sum of inputs / weights
    weighted_sum = np.dot(X, weights)
    
    # Activate!
    activated_output = sigmoid(weighted_sum)
    
    # Convert to binary outputs
    binary_output = pd.DataFrame(activated_output)[0].apply(binary_cutoff)
    
    return binary_output

In [172]:
predictions = model(test_data, weights)
predictions

0      1
1      0
2      1
3      1
4      1
      ..
995    1
996    0
997    1
998    1
999    1
Name: 0, Length: 1000, dtype: int64

In [173]:
from sklearn.metrics import accuracy_score

print("Number of observations:", len(test_data))
print("Accuracy score:", accuracy_score(test_data['y'].astype(int), predictions))

Number of observations: 1000
Accuracy score: 1.0


In [182]:
obs = 1000
choices = [1, 0]

def nand(val1, val2):
    return 0 if (val1 == 1 and val2 == 1) else 1
    
x1 = [random.choice(choices) for i in range(0,obs)]
x2 = [random.choice(choices) for i in range(0,obs)]
y = []

for i in range(0, obs): 
    y_true = nand(x1[i], x2[i])
    y.append(y_true)
    
df_test = pd.DataFrame({'x1': x1, 'x2': x2, 'y':y})
#df_test["bias"] = pd.Series(np.ones(obs,), dtype=int)
df_test

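|     | x1  | x2  | y   |
|-----|-----|-----|-----|
| 0   | 1   | 0   | 1   |
| 1   | 0   | 0   | 1   |
| 2   | 0   | 0   | 1   |
| 3   | 1   | 0   | 1   |
| 4   | 0   | 1   | 1   |
| ... | ... | ... | ... |
| 995 | 1   | 1   | 0   |
| 996 | 0   | 1   | 1   |
| 997 | 1   | 0   | 1   |
| 998 | 0   | 0   | 1   |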


In [178]:
predictions = model(df_test, weights)
predictions

0      1
1      1
2      1
3      0
4      1
      ..
995    1
996    0
997    0
998    1
999    0
Name: 0, Length: 1000, dtype: int64

In [179]:
from sklearn.metrics import accuracy_score

print("Number of observations:", len(df_test))
print("Accuracy score:", accuracy_score(df_test['y'].astype(int), predictions))

Number of observations: 1000
Accuracy score: 1.0


## Implement your own Perceptron Class and use it to classify a binary dataset: 
- [The Pima Indians Diabetes dataset](https://raw.githubusercontent.com/ryanleeallred/datasets/master/diabetes.csv) 

You may need to search for others' implementations in order to get inspiration for your own. There are *lots* of perceptron implementations on the internet with varying levels of sophistication and complexity. Whatever your approach, make sure you understand **every** line of your implementation and what its purpose is.

In [4]:
diabetes = pd.read_csv('https://raw.githubusercontent.com/ryanleeallred/datasets/master/diabetes.csv')
diabetes.head()

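|   | Pregnancies | Glucose | BloodPressure | SkinThickness | Insulin | BMI  | DiabetesPedigreeFunction | Age | Outcome |
|---|-------------|---------|---------------|---------------|---------|------|--------------------------|-----|---------|
| 0 | 6           | 148     | 72            | 35            | 0       | 33.6 | 0.627                    | 50  | 1       |
| 1 | 1           | 85      | 66            | 29            | 0       | 26.6 | 0.351                    | 31  | 0       |
| 2 | 8           | 183     | 64            | 0             | 0       | 23.3 | 0.672                    | 32  | 1       |
| 3 | 1           | 89      | 66            | 23            | 94      | 28.1 | 0.167                    | 21  | 0       |
| 4 | 0           | 137     | 40            | 35            | 168     | 43.1 | 2.288                    | 33  | 1       |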


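Although neural networks can handle non-normalized data, scaling or normalizing your data will usually improve your neural network's learning speed, because features on very different scales (e.g. Insulin vs. DiabetesPedigreeFunction) otherwise dominate the weighted sum unevenly. Try to apply the sklearn `MinMaxScaler` or `Normalizer` to your diabetes dataset.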
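As a rough illustration of what min-max scaling does, the sketch below applies `MinMaxScaler` to two columns (Glucose and BMI) hard-coded from the five rows shown by `diabetes.head()`. The array name `sample` and the choice of columns are assumptions for the example; the assignment itself asks for the full feature set.

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler

# Glucose and BMI for the five rows shown by diabetes.head() above
sample = np.array([[148, 33.6],
                   [85, 26.6],
                   [183, 23.3],
                   [89, 28.1],
                   [137, 43.1]])

scaler = MinMaxScaler()                 # rescales each column to [0, 1]
scaled = scaler.fit_transform(sample)   # learn per-column min/max, then transform

print(scaled)
```

Note the difference between the two suggested tools: `MinMaxScaler` works column by column (per feature), while `Normalizer` rescales each row (per sample) to unit norm, so they are not interchangeable.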

In [10]:
from sklearn.preprocessing import MinMaxScaler, Normalizer

feats = list(diabetes)[:-1]

X = ...

In [None]:
##### Update this Class #####

class Perceptron:
    
    def __init__(self, niter = 10):
        self.niter = niter
    
    def __sigmoid(self, x):
        return None
    
    def __sigmoid_derivative(self, x):
        return None

    def fit(self, X, y):
        """Fit training data
        X : Training vectors, X.shape : [#samples, #features]
        y : Target values, y.shape : [#samples]
        """

        # Randomly Initialize Weights
        weights = ...

        for i in range(self.niter):
            # Weighted sum of inputs / weights

            # Activate!

            # Calc error

            # Update the Weights
            pass  # placeholder so the empty loop body is valid Python


    def predict(self, X):
        """Return class label after unit step"""
        return None
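One possible way to fill in the skeleton is sketched below. It is not a reference solution: it simply wraps the same sigmoid update used in the NAND cells into a class. The class name `SigmoidPerceptron`, the `lr` and `seed` parameters, and the `_add_bias` helper are all hypothetical additions, and the sketch assumes the features have already been scaled to a small range.

```python
import numpy as np

class SigmoidPerceptron:
    """Single sigmoid neuron trained with the update rule from the NAND cells."""

    def __init__(self, niter=10, lr=1.0, seed=42):
        self.niter = niter
        self.lr = lr        # learning rate (assumption; the NAND cells use 1)
        self.seed = seed    # seed for the initial weights (assumption)

    @staticmethod
    def _sigmoid(x):
        return 1 / (1 + np.exp(-x))

    @staticmethod
    def _add_bias(X):
        # Append a column of ones so the last weight acts as the bias
        X = np.asarray(X, dtype=float)
        return np.hstack([X, np.ones((X.shape[0], 1))])

    def fit(self, X, y):
        Xb = self._add_bias(X)
        y = np.asarray(y, dtype=float).reshape(-1, 1)

        # Randomly initialize one weight per feature plus one for the bias
        rng = np.random.RandomState(self.seed)
        self.weights = rng.random_sample((Xb.shape[1], 1))

        for _ in range(self.niter):
            activated = self._sigmoid(Xb @ self.weights)      # forward pass
            error = y - activated                             # signed error
            adjustments = error * activated * (1 - activated) # error * sigmoid'
            self.weights += self.lr * (Xb.T @ adjustments)    # weight update
        return self

    def predict(self, X):
        probs = self._sigmoid(self._add_bias(X) @ self.weights)
        return (probs >= 0.5).astype(int).ravel()

# Usage on the NAND truth table from earlier in the notebook
X_nand = np.array([[0, 0], [1, 0], [0, 1], [1, 1]])
y_nand = np.array([1, 1, 1, 0])
clf = SigmoidPerceptron(niter=10000).fit(X_nand, y_nand)
print(clf.predict(X_nand))
```

On unscaled data such as the raw diabetes features, the weighted sums can become very large and saturate the sigmoid, so scaling the inputs first and trying a smaller `lr` is a reasonable starting point.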

## Stretch Goals:

- Research "backpropagation" to learn how weights get updated in neural networks (tomorrow's lecture). 
- Implement a multi-layer perceptron. (for non-linearly separable classes)
- Try and implement your own backpropagation algorithm.
- What are the pros and cons of the different activation functions? How should you decide between them for the different layers of a neural network?