<img align="left" src="https://lever-client-logos.s3.amazonaws.com/864372b1-534c-480e-acd5-9711f850815c-1524247202159.png" width=200>
<br></br>
<br></br>

# Neural Networks

## *Data Science Unit 4 Sprint 2 Assignment 1*

Source: https://machinelearningmastery.com/neural-networks-crash-course/

## Define the Following:
You can add image, diagrams, whatever you need to ensure that you understand the concepts below.

### Input Layer:

> Input layer, or the visible layer receives input from a dataset. This layer simply pass the input value though to the next layer.

### Hidden Layer:

> Not directly exposed to the input layer, in which they cannot be accessed except through the input layer.  We do not interact with them as they perform their functions. 

### Output Layer:

> The output layer is the final layer. The purpose is to output a value or vector in respect to the problem being solved. This layer is transformed by the "activation function" into a readable format for the context presented. 

### Neuron:

> The building blocks for NN. Neurons are simple computational units that have weighted input signals and produces an output signal using an activation function.

### Weight:

> Weight is the paramater within a NN that transforms input data within the networks hidden layer. For example, a single node may take the input data and multiply it by an assigned weight value, then add a bias before passing data to the next layer. 

### Activation Function:

> An activation function is a mapping of summed weighted input to the output of the neuron. This function governs the threshold on which the neuron is activated.

### Node Map:

> A visual representation of the internal architecture of a neural network; i.e. a map of the nodes (neurons) in the network and how they are connected to and share information with one another.

### Perceptron:

> The idea of the Perceptron is inspired by the information processing of a single neural cell called a neuron. 
  A neuron accepts input signals via its dentrites, which pass the electrical signal down to the cell body. 
  In a similar way, the Perceptron receives input signals from examples of training data that we weight and combined in a linear equation called the activator. 


## Inputs -> Outputs

### Explain the flow of information through a neural network from inputs to outputs. Be sure to include: inputs, weights, bias, and activation functions. How does it all flow from beginning to end?






#### Your Answer Here


> Input units receive various forms of information from the outside world. This in turn, triggers the hidden units(layers). The hidden layers in turn multiplies the inputs by the weights of the connection they travel along, and each unit is summed up along the way. If the sum reaches a certain threshold value, the unit 'fires' and triggers the units its connected to (going right). Finaly the information arrives to the output units. 

## Write your own perceptron code that can correctly classify (99.0% accuracy) a NAND gate. 

>  a NAND gate (NOT-AND) is a logic gate which produces an output which is false only if all its inputs are true; thus its output is complement to that of an AND gate. 

The NAND gate (negated AND) gives an output of 0 if both inputs are 1, it gives 1 otherwise.

| x1 | x2 | y |
|----|----|---|
| 0  | 0  | 1 |
| 1  | 0  | 1 |
| 0  | 1  | 1 |
| 1  | 1  | 0 |

In [25]:
import pandas as pd
import numpy as np

data = { 'x1': [0,1,0,1],
         'x2': [0,0,1,1],
         'y':  [1,1,1,0]
       }

df = pd.DataFrame.from_dict(data).astype('int')

In [26]:
df.head()

Unnamed: 0,x1,x2,y
0,0,0,1
1,1,0,1
2,0,1,1
3,1,1,0


In [27]:
# Establish training data

np.random.seed(42)

# NAND gate features
# note: x0 is a dummy variable for the bias term
X = np.array([
    [0,0],
    [1,0],
    [0,1],
    [1,1]
])

# Desired outputs
y = [[1], [1], [1], [0]]

In [28]:
X

array([[0, 0],
       [1, 0],
       [0, 1],
       [1, 1]])

In [29]:
y

[[1], [1], [1], [0]]

In [None]:
# Sigmoid activation function and its derivative for updating weights


def sigmoid(X):
    return 1 / (1 + np.exp(-X))

def sigmoid_derivative(x):
    return sigmoid(x) * (1 - sigmoid(X))

In [None]:
# Initialize random weights

weights = 2 * np.random.random((2,1)) - 1

In [None]:
# Calculate weighted sum of inputs and weights

weighted_sum = np.dot(X, weights)

In [None]:
# Output the activated value for the end of 1 training ep

activated_output = sigmoid(weighted_sum)

In [None]:
# take difference of output and true values to calculate error

error = y - activated_output

In [None]:
# Gradient descent/backprop

adjustments = error * sigmoid_derivative(weighted_sum)

In [None]:
weights += np.dot(X.T, adjustments)

In [None]:
print("Weights after training")
print(weights)

print("Output after training")
print(activated_output)

In [None]:
# Update our weights 10,000 times - (fingers crossed that this process reduces error)
for iteration in range(10000):
    
    # Weighted sum of inputs / weights
    weighted_sum = np.dot(X, weights)
    
    # Activate!
    activated_output = sigmoid(weighted_sum)
    
    # Calc error
    error = y - activated_output
    
    adjustments = error * sigmoid_derivative(weighted_sum)
    
    # Update the Weights
    weights += np.dot(X.T, adjustments)
    
print("Weights after training")
print(weights)

print("Output after training")
print(activated_output)

In [None]:
class Perceptron(object):
    
    def __init__(self, rate = 0.01, niter = 10):
        self.rate = rate
        self.niter = niter
        
    def fit(self, X, y):
        """Fit training data
        X : Training vectors, X.shape : [#samples, #features]
        y : Target values, y.shape : [#samples]
        """

        # weights
        self.weight = np.zeros(1 + X.shape[1])

        # Number of misclassifications
        self.errors = []  # Number of misclassifications

        for i in range(self.niter):
          err = 0
          for xi, target in zip(X, y):
            delta_w = self.rate * (target - self.predict(xi))
            self.weight[1:] += delta_w * xi
            self.weight[0] += delta_w
            err += int(delta_w != 0.0)
          self.errors.append(err)
        return self

    def net_input(self, X):
        """Calculate net input"""
        return np.dot(X, self.weight[1:]) + self.weight[0]

    def predict(self, X):
        """Return class label after unit step"""
        """ Default Step Function"""
        return np.where(self.net_input(X) >= 0.0, 1, -1)

### Refactored:

In [30]:
class Perceptron:
    def __init__(self, eta: float = 0.01, epochs: int = 10):
        self.eta = eta
        self.epochs = epochs

    def fit(self, X, y):
        """Fit Perceptron to training dataset."""
        # === Initialize weights === #
        self.weight = np.zeros(1 + X.shape[1])
        self.errors = []  # Number of misclassifications

        for i in range(self.epochs):
            err = 0
            print(f"=== Epoch {i + 1} ===")
            for xi, target in zip(X, y):
                # === Visual indiciation of results === #
                print(f"Expected: {target} | Predicted: {self.predict(xi)}")
                
                delta_w = self.eta * (target - self.predict(xi))
                self.weight[1:] += delta_w * xi
                self.weight[0] += delta_w
                err += int(delta_w != 0.0)

            self.errors.append(err)

        return "End"

    def net_input(self, X):
        """Calculate net input."""
        return np.dot(X, self.weight[1:]) + self.weight[0]

    def predict(self, X):
        """Return target label after unit step."""
        return np.where(self.net_input(X) >= 0.0, 1, 0)

In [31]:
sn = Perceptron()
sn.fit(X, y)

=== Epoch 1 ===
Expected: [1] | Predicted: 1
Expected: [1] | Predicted: 1
Expected: [1] | Predicted: 1
Expected: [0] | Predicted: 1
=== Epoch 2 ===
Expected: [1] | Predicted: 0
Expected: [1] | Predicted: 0
Expected: [1] | Predicted: 1
Expected: [0] | Predicted: 1
=== Epoch 3 ===
Expected: [1] | Predicted: 1
Expected: [1] | Predicted: 0
Expected: [1] | Predicted: 0
Expected: [0] | Predicted: 1
=== Epoch 4 ===
Expected: [1] | Predicted: 1
Expected: [1] | Predicted: 1
Expected: [1] | Predicted: 0
Expected: [0] | Predicted: 1
=== Epoch 5 ===
Expected: [1] | Predicted: 1
Expected: [1] | Predicted: 0
Expected: [1] | Predicted: 1
Expected: [0] | Predicted: 0
=== Epoch 6 ===
Expected: [1] | Predicted: 1
Expected: [1] | Predicted: 1
Expected: [1] | Predicted: 1
Expected: [0] | Predicted: 0
=== Epoch 7 ===
Expected: [1] | Predicted: 1
Expected: [1] | Predicted: 1
Expected: [1] | Predicted: 1
Expected: [0] | Predicted: 0
=== Epoch 8 ===
Expected: [1] | Predicted: 1
Expected: [1] | Predicted: 1
Ex

'End'

## Implement your own Perceptron Class and use it to classify a binary dataset: 
- [The Pima Indians Diabetes dataset](https://raw.githubusercontent.com/ryanleeallred/datasets/master/diabetes.csv) 

You may need to search for other's implementations in order to get inspiration for your own. There are *lots* of perceptron implementations on the internet with varying levels of sophistication and complexity. Whatever your approach, make sure you understand **every** line of your implementation and what its purpose is.

In [20]:
diabetes = pd.read_csv('https://raw.githubusercontent.com/ryanleeallred/datasets/master/diabetes.csv')
diabetes.head()

Unnamed: 0,Pregnancies,Glucose,BloodPressure,SkinThickness,Insulin,BMI,DiabetesPedigreeFunction,Age,Outcome
0,6,148,72,35,0,33.6,0.627,50,1
1,1,85,66,29,0,26.6,0.351,31,0
2,8,183,64,0,0,23.3,0.672,32,1
3,1,89,66,23,94,28.1,0.167,21,0
4,0,137,40,35,168,43.1,2.288,33,1


Although neural networks can handle non-normalized data, scaling or normalizing your data will improve your neural network's learning speed. Try to apply the sklearn `MinMaxScaler` or `Normalizer` to your diabetes dataset. 

In [36]:
from sklearn.preprocessing import MinMaxScaler, Normalizer

feats = list(diabetes)[:-1]

X = diabetes.drop(columns='Outcome').values
y = np.where(y == 'Outcome', -1, 1)
no_of_inputs = 

In [35]:
X

array([[  6.   , 148.   ,  72.   , ...,  33.6  ,   0.627,  50.   ],
       [  1.   ,  85.   ,  66.   , ...,  26.6  ,   0.351,  31.   ],
       [  8.   , 183.   ,  64.   , ...,  23.3  ,   0.672,  32.   ],
       ...,
       [  5.   , 121.   ,  72.   , ...,  26.2  ,   0.245,  30.   ],
       [  1.   , 126.   ,  60.   , ...,  30.1  ,   0.349,  47.   ],
       [  1.   ,  93.   ,  70.   , ...,  30.4  ,   0.315,  23.   ]])

In [37]:
y

(array([], dtype=int64),)

In [23]:
feats = list(diabetes)[:-1]
feats

['Pregnancies',
 'Glucose',
 'BloodPressure',
 'SkinThickness',
 'Insulin',
 'BMI',
 'DiabetesPedigreeFunction',
 'Age']

In [38]:
class Perceptron(object):

    def __init__(self, no_of_inputs, threshold=100, learning_rate=0.01):
        self.threshold = threshold
        self.learning_rate = learning_rate
        self.weights = np.zeros(no_of_inputs + 1)
           
    def predict(self, inputs):
        summation = np.dot(inputs, self.weights[1:]) + self.weights[0]
        if summation > 0:
            activation = 1
        else:
            activation = 0            
        return activation

    def train(self, training_inputs, labels):
        for _ in range(self.threshold):
            for inputs, label in zip(training_inputs, labels):
                prediction = self.predict(inputs)
                self.weights[1:] += self.learning_rate * (label - prediction) * inputs
                self.weights[0] += self.learning_rate * (label - prediction)

In [39]:
# Accuracy after 10 iterations
pn = Perceptron(no_of_inputs=no_of_inputs, threshold=10, learning_rate=0.01)
pn.train(X, y)
y_pred = [pn.predict(row) for row in X]
print(f'weights: {pn.weights}')
print(f'Accuracy: {accuracy_score(y, y_pred)}')

NameError: name 'no_of_inputs' is not defined

In [None]:
##### Update this Class #####

class Perceptron:
    
    def __init__(self, niter = 10):
        self.niter = niter
    
    def __sigmoid(self, x):
        return None
    
    def __sigmoid_derivative(self, x):
        return None

    def fit(self, X, y):
    """Fit training data
    X : Training vectors, X.shape : [#samples, #features]
    y : Target values, y.shape : [#samples]
    """

        # Randomly Initialize Weights
        weights = ...

        for i in range(self.niter):
            # Weighted sum of inputs / weights

            # Activate!

            # Cac error

            # Update the Weights


    def predict(self, X):
    """Return class label after unit step"""
        return None

## Stretch Goals:

- Research "backpropagation" to learn how weights get updated in neural networks (tomorrow's lecture). 
- Implement a multi-layer perceptron. (for non-linearly separable classes)
- Try and implement your own backpropagation algorithm.
- What are the pros and cons of the different activation functions? How should you decide between them for the different layers of a neural network?