<img align="left" src="https://lever-client-logos.s3.amazonaws.com/864372b1-534c-480e-acd5-9711f850815c-1524247202159.png" width=200>
<br></br>
<br></br>

# Neural Networks

## *Data Science Unit 4 Sprint 2 Assignment 1*

## Define the Following:
You can add images, diagrams, or whatever you need to make sure you understand the concepts below.

### Perceptron: The perceptron is the oldest and most 'vanilla' form of neural network, and it forms a foundation for understanding the rest. It takes in a set of inputs, multiplies each by a weight, adds a bias, and passes the sum through a step function to produce a single binary output. A single perceptron has no hidden layers, so it can only separate classes that are linearly separable.
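A minimal sketch of a single perceptron as a weighted sum, a bias, and a step function. The function name `perceptron_predict` and the hand-picked weights and bias are assumptions chosen for illustration (they are not learned); they happen to reproduce a NAND gate.

```python
import numpy as np

def perceptron_predict(x, weights, bias):
    """Return 1 if the weighted sum plus bias is positive, else 0."""
    return int(np.dot(weights, x) + bias > 0)

# Hand-picked (not learned) parameters, assumed for illustration only
weights = np.array([-2.0, -2.0])
bias = 3.0

# Evaluate all four combinations of two binary inputs
for x in [(0, 0), (1, 0), (0, 1), (1, 1)]:
    print(x, perceptron_predict(np.array(x), weights, bias))
```

Only the input (1, 1) pushes the sum below zero, so it is the only case that outputs 0.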

### Neuron: The primary unit of structure in a neural network, loosely modeled on biological neurons. Each neuron is a discrete unit that holds a mathematical function. It takes the values output by the previous layer, multiplies each one by its connection weight, and sums the results. The bias is then added to that weighted sum, and the total is fed through an activation function.
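As a rough sketch, one neuron's computation could look like the following. The helper `neuron_output` and the example numbers are hypothetical, and `np.tanh` is just one possible activation.

```python
import numpy as np

def neuron_output(inputs, weights, bias, activation):
    """Weighted sum of the inputs, plus the bias, passed through an activation."""
    z = np.dot(inputs, weights) + bias
    return activation(z)

# Hypothetical activations coming from three neurons in the previous layer
inputs = np.array([0.5, 0.1, 0.9])
# Hypothetical connection weights and bias for this neuron
weights = np.array([0.4, -0.6, 0.2])
bias = 0.1

output = neuron_output(inputs, weights, bias, np.tanh)
print(output)
```

Here the weighted sum is 0.32, the bias brings it to 0.42, and the activation maps that value into the range (-1, 1).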

### Weight: A weight is the strength of the connection between two neurons. The higher the magnitude of the weight, the more influence that connection has on the receiving neuron, and the sign decides whether the influence pushes the activation up or down.

### Activation Function: A function applied to a neuron's weighted sum plus bias that often 'squishifies' the result. The sum can be any real number, and the activation function converts it to something easier to work with, like a number between 0 and +1 (sigmoid) or between -1 and +1 (tanh). It decides whether, and how strongly, a neuron lights up.
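To make the 'squishing' concrete, here is a small sketch comparing a few common activation functions on the same inputs. The function names `step`, `sigmoid`, and `relu` are defined here for illustration, and the sample values in `z` are arbitrary.

```python
import numpy as np

def step(z):
    """Hard threshold: 1 where z >= 0, otherwise 0."""
    return np.where(z >= 0, 1, 0)

def sigmoid(z):
    """Squashes any real number into the open interval (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

def relu(z):
    """Passes positive values through and clips negatives to 0."""
    return np.maximum(0, z)

# Arbitrary weighted sums, from strongly negative to strongly positive
z = np.array([-5.0, -1.0, 0.0, 1.0, 5.0])

for name, fn in [("step", step), ("sigmoid", sigmoid), ("tanh", np.tanh), ("relu", relu)]:
    print(name, np.round(fn(z), 3))
```

Sigmoid and tanh are bounded, the step function is strictly binary, and ReLU is unbounded above, which is part of why different layers tend to use different activations.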

### Node Map: A node map is a visual representation of a neural network's structure. It displays the input, hidden, and output layers, the connections (weights) between their neurons, and the bias.

### Input Layer: The initial set of neurons representing the input data. The input layer matches the shape of the primary data.

### Hidden Layer: Hidden layers are the sets of neurons in between the input and output layers. They represent the model's attempts to learn relationships between the input and output values. A hidden layer transforms input into hidden, hidden into hidden, or hidden into output.

### Output Layer: The output layer is the final set of neurons, and it takes the shape of our output data. Each output neuron receives signal from the last hidden layer (or directly from the inputs if there are no hidden layers) and passes it through an activation function. The output neuron lights up if the result of the activation function meets some threshold.

## Inputs -> Outputs

### Explain the flow of information through a neural network from inputs to outputs. Be sure to include: inputs, weights, bias, and activation functions. How does it all flow from beginning to end?

#### 1. Information starts flowing into the model as input. In the classic handwriting example, the image is turned into a series of numbers representing the brightness of every pixel ordered first to last.

#### 2. Once the input neurons have their activations (a number between 0 and 1), information is fed forward through the model to subsequent neurons. Each subsequent neuron's activation is based on the previous layer's activations, each multiplied by the weight of its connection. The weight is how strongly the neuron considers the previous neuron. If the weight is large and positive, the subsequent neuron will tend to "light up" when the previous one does. If the weight is close to zero, the following neuron barely takes that previous connection/neuron into consideration.

#### 3. After the activations are multiplied by their weights and summed up, the bias is added. The bias is added on at the end to shift how easily that node lights up. If the bias is strongly negative, it will take a strong weighted sum to make the node light up. If the bias is positive, even a mild signal will send a high value into the activation function.

#### 4. In the end, though, we want an actual answer. Having a bunch of final neurons light up because of any amount of signal is not decisive. Feeding each weighted sum plus bias into the activation function squishifies the unbounded value into a bounded range, usually between 0 and 1 or between -1 and 1. At the output layer, comparing that value to a threshold turns the continuous uncertainty into a "final" decision from the network.
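The four steps above can be sketched as a short forward pass. The layer sizes (4 inputs, 3 hidden neurons, 2 outputs), the random weights, and the helper `forward` are all assumptions made for illustration; nothing here is trained.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def forward(x, layers):
    """Feed x through each (weights, bias) pair in turn."""
    activation = x
    for W, b in layers:
        # Weighted sum of the previous activations, plus bias, then squish
        activation = sigmoid(activation @ W + b)
    return activation

rng = np.random.default_rng(0)
layers = [
    (rng.normal(size=(4, 3)), np.zeros(3)),  # input -> hidden
    (rng.normal(size=(3, 2)), np.zeros(2)),  # hidden -> output
]

# Hypothetical input activations, e.g. four pixel brightness values
x = np.array([0.0, 0.25, 0.5, 1.0])
output = forward(x, layers)
print(output)
```

Each layer repeats the same pattern: multiply by weights, add the bias, apply the activation, and hand the result to the next layer.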

## Write your own perceptron code that can correctly classify (99.0% accuracy) a NAND gate. 

| x1 | x2 | y |
|----|----|---|
| 0  | 0  | 1 |
| 1  | 0  | 1 |
| 0  | 1  | 1 |
| 1  | 1  | 0 |

In [1]:
import pandas as pd
data = { 'x1': [0,1,0,1],
         'x2': [0,0,1,1],
         'y':  [1,1,1,0]
       }

df = pd.DataFrame.from_dict(data).astype('int')

In [2]:
# Correct inputs and outputs
correct_outputs = [[1], [1], [1], [0]]
# df holds x1, x2 AND y, so the third weight below multiplies the target itself
inputs = df

# Randomize weights
import numpy as np
weights = 2 * np.random.random((3,1)) - 1

# Define the sigmoid function and its derivative

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

def sigmoid_derivative(x):
    sx = sigmoid(x)
    return sx * (1 - sx)

In [3]:
# Update weights 100,000 times.
for iteration in range(100000):
    
    # Weighted sum of inputs and weights (dot product)
    weighted_sum = np.dot(inputs, weights)
    
    # Activate the output using sigmoid
    activated_output = sigmoid(weighted_sum)
    
    # Calculate error & adjust
    error = correct_outputs - activated_output
    adjustments = error * sigmoid_derivative(weighted_sum)
    
    # Update the weights with new adjustments
    weights += np.dot(inputs.T, adjustments)
    
print("Weights post training")
print(weights)

print("Outputs post training")
print(activated_output)

Weights post training
[[-2.99208663]
 [-2.99317349]
 [ 9.23457976]]
Outputs post training
[[0.9999024 ]
 [0.99805877]
 [0.99805666]
 [0.00250926]]
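The training loop above feeds all three columns of `df`, including `y`, as inputs, so the large third weight is really a weight on the answer. A hedged sketch of the same loop without that leak, assuming a constant column of ones stands in for the bias, might look like this (the seed and iteration count are arbitrary choices):

```python
import numpy as np

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

# Inputs: x1, x2, and a constant 1 acting as the bias input (no target column)
X = np.array([[0, 0, 1],
              [1, 0, 1],
              [0, 1, 1],
              [1, 1, 1]], dtype=float)
y = np.array([[1], [1], [1], [0]], dtype=float)

# Random starting weights in [-1, 1); the seed is an arbitrary assumption
rng = np.random.default_rng(42)
weights = 2 * rng.random((3, 1)) - 1

for _ in range(10000):
    weighted_sum = X @ weights
    activated_output = sigmoid(weighted_sum)
    error = y - activated_output
    # The sigmoid derivative expressed through the output: s * (1 - s)
    adjustments = error * activated_output * (1 - activated_output)
    weights += X.T @ adjustments

# Threshold the continuous outputs at 0.5 to get class labels
predictions = (activated_output > 0.5).astype(int)
print(predictions.ravel())
```

With this setup the two input weights should come out negative and the bias weight positive, which is the shape a NAND gate needs.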


## Implement your own Perceptron Class and use it to classify a binary dataset: 
- [The Pima Indians Diabetes dataset](https://raw.githubusercontent.com/ryanleeallred/datasets/master/diabetes.csv) 

You may need to search for other's implementations in order to get inspiration for your own. There are *lots* of perceptron implementations on the internet with varying levels of sophistication and complexity. Whatever your approach, make sure you understand **every** line of your implementation and what its purpose is.

In [4]:
diabetes = pd.read_csv('https://raw.githubusercontent.com/ryanleeallred/datasets/master/diabetes.csv')
diabetes.head()

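|   | Pregnancies | Glucose | BloodPressure | SkinThickness | Insulin | BMI | DiabetesPedigreeFunction | Age | Outcome |
|---|-------------|---------|---------------|---------------|---------|-----|--------------------------|-----|---------|
| 0 | 6 | 148 | 72 | 35 | 0 | 33.6 | 0.627 | 50 | 1 |
| 1 | 1 | 85 | 66 | 29 | 0 | 26.6 | 0.351 | 31 | 0 |
| 2 | 8 | 183 | 64 | 0 | 0 | 23.3 | 0.672 | 32 | 1 |
| 3 | 1 | 89 | 66 | 23 | 94 | 28.1 | 0.167 | 21 | 0 |
| 4 | 0 | 137 | 40 | 35 | 168 | 43.1 | 2.288 | 33 | 1 |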


Although neural networks can handle non-normalized data, scaling or normalizing your data will improve your neural network's learning speed. Try to apply the sklearn `MinMaxScaler` or `Normalizer` to your diabetes dataset. 

In [5]:
from sklearn.preprocessing import MinMaxScaler, Normalizer

# Feature column names: everything except the last column ('Outcome')
features = list(diabetes)[:-1]

# Feature matrix, shape (n_samples, 8)
X = diabetes[
             ['Pregnancies',
              'Glucose',
              'BloodPressure',
              'SkinThickness',
              'Insulin',
              'BMI',
              'DiabetesPedigreeFunction',
              'Age']
            ].values

# Target as a column vector, shape (n_samples, 1)
y = diabetes[['Outcome']].values

In [6]:
# Check shapes of X and y
print(X.shape)
y.shape

(768, 8)


(768, 1)
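The scalers imported above are not applied in these cells. As a minimal sketch of what applying `MinMaxScaler` could look like, the example below uses a small stand-in array (`X_demo`, built from the Pregnancies, Glucose, and BMI values of the first three rows shown earlier) rather than the full dataset:

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler

# Stand-in for a few rows and columns of the feature matrix X
X_demo = np.array([[6, 148, 33.6],
                   [1,  85, 26.6],
                   [8, 183, 23.3]], dtype=float)

# Rescale each column independently to the range [0, 1]
scaler = MinMaxScaler()
X_scaled = scaler.fit_transform(X_demo)

print(X_scaled.min(axis=0), X_scaled.max(axis=0))
```

The same two lines (`MinMaxScaler()` and `fit_transform`) would apply to the full `X`; scaling keeps large-valued columns such as Glucose from dominating the weight updates.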

In [9]:
# Create perceptron class from class lecture
class Perceptron(object):
    
    def __init__(self, rate = 0.01, num_iter = 10):
        self.rate = rate          # learning rate
        self.num_iter = num_iter  # number of passes over the training data
        
    def fit(self, X, y):
        """Fit training data
        X : Training vectors, X.shape : [#samples, #features]
        y : Target values, y.shape : [#samples]
        """
        # Flatten y so a column vector of shape [#samples, 1] also works
        y = np.ravel(y)

        # Weights start as zeros; weight[0] is the bias
        self.weight = np.zeros(1 + X.shape[1])

        # Number of misclassifications in each pass
        self.errors = []

        for i in range(self.num_iter):
            err = 0
            for xi, target in zip(X, y):
                # Zero when the prediction is correct, otherwise +/- rate
                delta_w = self.rate * (target - self.predict(xi))
                self.weight[1:] += delta_w * xi
                self.weight[0] += delta_w
                err += int(delta_w != 0.0)
            self.errors.append(err)
        return self

    def net_input(self, X):
        """Calculate net input"""
        return np.dot(X, self.weight[1:]) + self.weight[0]

    def predict(self, X):
        """Return class label after a step at a net input of 0.5"""
        return np.where(self.net_input(X) >= 0.5, 1, 0)

In [10]:
# Instantiate my perceptron
percept = Perceptron()

# Fit it to X,y and check
percept.fit(X, y)

<__main__.Perceptron at 0x1119c9220>

In [11]:
from sklearn.metrics import accuracy_score

# Check model accuracy (true labels first, then predictions)
accuracy_score(y, percept.predict(X))

0.53515625

## Stretch Goals:

- Research "backpropagation" to learn how weights get updated in neural networks (tomorrow's lecture). 
- Implement a multi-layer perceptron. (for non-linearly separable classes)
- Try and implement your own backpropagation algorithm.
- What are the pros and cons of the different activation functions? How should you decide between them for the different layers of a neural network?