# Perceptron Lab

This notebook walks through building a single-layer perceptron and training it.

First, import the libraries we need.


In [None]:
import numpy as np

## Perceptron Training Algorithm

Rosenblatt proposed a simple learning algorithm based on his intuition about biological neurons. His idea was to run each example input through the perceptron and compare its output with the desired one. If the perceptron is correct, its weights are left alone. If it is incorrect, its weights are either "excited" (it should have fired but did not) or "inhibited" (it fired but should not have).

How do we inhibit or excite? We change the weight vector and bias. The weight vector is a parameter to the perceptron. We need to keep changing it until we can correctly classify each of our training examples. With this intuition in mind, we need to write an update rule for our weight vector so that we can appropriately change it.

First, we can define an error function as the difference between the desired output **d** and the predicted output **y**.

`e = d - y`

Notice that when **d** and **y** are the same (both 0 or both 1), the error is 0. When they differ, the error is 1 (**d** = 1, **y** = 0) or -1 (**d** = 0, **y** = 1), which corresponds directly to exciting or inhibiting our perceptron. We multiply the error by the input so that the weight vector changes in proportion to the input.

`w' = w + lr * e * x`
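
As a rough illustration of the error and the update rule, the sketch below lists the error for each combination of **d** and **y**, then applies one update to a single example. The concrete values and the convention of a leading `1` as the bias input are assumptions made for the example.

```python
import numpy as np

# Error e = d - y for every combination of desired (d) and predicted (y) output.
for d in (0, 1):
    for y in (0, 1):
        print(f"d={d}, y={y} -> e={d - y}")

# One update step on a single example (all values here are illustrative).
w = np.array([0.0, 0.0, 0.0])   # [bias, w_1, w_2]
x = np.array([1.0, 1.0, 1.0])   # assumed layout: bias input of 1, then x_1, x_2
lr = 1.0
d, y = 1, 0                     # the perceptron should have fired but did not
e = d - y                       # e = 1, so the weights are "excited"
w_new = w + lr * e * x          # the weights move in the direction of x
print(w_new)
```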

The hyperparameter `lr` is called the learning rate. It is a scaling factor that determines how large each weight vector update is. It is a _hyperparameter_ because it is not learned by the perceptron (notice there’s no update rule for `lr`); instead we, the data scientists, choose it.

Recall that the Perceptron Convergence Theorem says a perceptron will converge whenever the classes are linearly separable, for any positive learning rate. For other learning algorithms, though, the learning rate is a critical parameter! In our example, `lr` mainly controls the scale of each weight update.
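
As a quick experiment with the learning rate, the sketch below trains on the AND gate for a few values of `lr`. It assumes zero-initialized weights, a bias stored as the first weight, and a step activation that outputs 1 when the pre-activation is at least 0; `train_and_gate` is a hypothetical helper, not part of the lab's class. Under those assumptions, `lr` rescales the learned weights while the number of updates stays the same; a different initialization or activation could behave differently.

```python
import numpy as np

def train_and_gate(lr, epochs=100):
    """Train on the AND gate; return the weights and the number of updates made."""
    X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
    d = np.array([0, 0, 0, 1])
    W = np.zeros(3)                         # [bias, w_1, w_2], zero-initialized
    updates = 0
    for _ in range(epochs):
        for x_i, d_i in zip(X, d):
            x_i = np.insert(x_i, 0, 1)      # prepend the constant bias input
            y = 1 if W @ x_i >= 0 else 0    # assumed step activation
            e = d_i - y
            if e != 0:
                W = W + lr * e * x_i        # the update rule from above
                updates += 1
    return W, updates

for lr in (0.5, 1.0, 2.0):
    W, updates = train_and_gate(lr)
    print(f"lr={lr}: W={W}, updates={updates}")
```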

**TL;DR:** When the error is 0, i.e., the output is what we expect, we don’t change the weight vector at all. When the error is nonzero, we update the weight vector accordingly.


In [None]:
class Perceptron(object):
    """Implements a single-layer perceptron."""

    def __init__(self, input_size, epochs=100, lr=1):
        # initialize the weights vector parameter, adding one for bias
        self.W = np.zeros(input_size + 1)

        # Each pass through the training data, applying the learning rule to
        # every example, is called an epoch
        self.epochs = epochs

        # Learning rate: a hyperparameter that is supplied by us rather than
        # learned from the training set.
        self.lr = lr

    def sigma(self, x):
        """This is the activation function. """
        pass

    def train(self, X, d):
        """Iterate through the training data (X), score the perceptron's
        performance against the labels (d), and adjust the weights accordingly."""
        pass

    def predict(self, x):
        """y = sigma(z) """
        pass
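
If you get stuck, here is one possible way to fill in the stubs. It is a sketch rather than the reference solution: the class name `SketchPerceptron`, the step activation that fires when `z >= 0`, and the choice to prepend the bias input inside `train` are all assumptions.

```python
import numpy as np

class SketchPerceptron:
    """One possible way to fill in the stubs (a sketch, not the reference solution)."""

    def __init__(self, input_size, epochs=100, lr=1):
        self.W = np.zeros(input_size + 1)   # W[0] is the bias
        self.epochs = epochs
        self.lr = lr

    def sigma(self, z):
        # Assumed activation: a step function that fires when z >= 0.
        return 1 if z >= 0 else 0

    def predict(self, x):
        # x is expected to already include the leading bias input of 1.
        return self.sigma(self.W @ x)

    def train(self, X, d):
        for _ in range(self.epochs):
            for x_i, d_i in zip(X, d):
                x_i = np.insert(x_i, 0, 1)           # prepend the bias input
                e = d_i - self.predict(x_i)          # error: e = d - y
                self.W = self.W + self.lr * e * x_i  # w' = w + lr * e * x

# Usage on the AND gate
X_and = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
d_and = np.array([0, 0, 0, 1])
perceptron = SketchPerceptron(input_size=2)
perceptron.train(X_and, d_and)
print(perceptron.W)  # [-3.  2.  1.]
```

Keeping the bias as `W[0]` lets a single update rule adjust the bias and the weights together.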

## Test Data

In [None]:
# Binary AND gate: each row of X is an input pair, labels holds the expected output
X = np.array([
        [0, 0],
        [0, 1],
        [1, 0],
        [1, 1]
    ])
labels = np.array([0, 0, 0, 1])

perceptron = Perceptron(input_size=2)
perceptron.train(X, labels)
print(perceptron.W)

`perceptron.W` should be `[-3, 2, 1]`, which means that the bias is -3 and the weights for `x_1` and `x_2` are 2 and 1, respectively. If both inputs are 0, the pre-activation is `-3 + 0*2 + 0*1 = -3`. Applying the activation function then gives `0`, which is exactly `0 AND 0`. Try this for the other three inputs, and for other gates as well.
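
The same check can be run for all four inputs at once. This sketch hard-codes the weights quoted above and assumes a step activation that outputs 1 when the pre-activation is at least 0.

```python
import numpy as np

W = np.array([-3, 2, 1])                    # [bias, w_1, w_2] from the text above
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])

outputs = []
for x_1, x_2 in X:
    z = W[0] + W[1] * x_1 + W[2] * x_2      # pre-activation
    y = 1 if z >= 0 else 0                  # assumed step activation
    outputs.append(y)
    print(f"{x_1} AND {x_2}: z = {z}, y = {y}")
```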

Note that this is not the only correct weight vector. If a single weight vector separates the classes, then infinitely many do. Which one we get depends on how we initialize the weight vector, as well as on the learning rate and the order of the training examples.
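
To make this concrete, the sketch below checks a few hand-picked weight vectors against the AND gate. The candidates and the helper `implements_and` are illustrative, and the step activation that fires when `z >= 0` is again an assumption.

```python
import numpy as np

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
labels = np.array([0, 0, 0, 1])
X_b = np.hstack([np.ones((4, 1)), X])       # add a column of 1s for the bias

def implements_and(W):
    """Return True if weights W = [bias, w_1, w_2] reproduce the AND labels."""
    y = (X_b @ W >= 0).astype(int)          # assumed step activation
    return bool(np.array_equal(y, labels))

candidates = [
    np.array([-3, 2, 1]),       # the vector from the text
    np.array([-1.5, 1, 1]),     # a different separating line
    np.array([-30, 20, 10]),    # a positive rescaling of the first vector
]
for W in candidates:
    print(W, implements_and(W))
```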

## Additional Data

scikit-learn Wisconsin Breast Cancer dataset: <https://scikit-learn.org/stable/datasets/toy_dataset.html#breast-cancer-dataset>



In [None]:
from sklearn import datasets
from sklearn.model_selection import train_test_split

# Load the data set
bc = datasets.load_breast_cancer()
X = bc.data
y = bc.target

# Create training and test split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42, stratify=y)

# input_size is the number of features (columns), not the number of samples
perceptron = Perceptron(input_size=X_train.shape[1])
perceptron.train(X_train, y_train)

# TODO write a scoring method that takes X_test and y_test
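
One way to approach the scoring TODO is a plain accuracy function. The sketch below is an assumption about what "scoring" should mean here; `accuracy` and its `predict_fn` argument are hypothetical names, and the toy data stands in for `X_test` and `y_test`.

```python
import numpy as np

def accuracy(predict_fn, X_test, y_test):
    """Fraction of test examples whose predicted label matches the true label."""
    predictions = np.array([predict_fn(x) for x in X_test])
    return float(np.mean(predictions == np.asarray(y_test)))

# Toy check with a hypothetical stand-in predictor (threshold on one feature).
toy_X = np.array([[0.2], [0.8], [0.6], [0.1]])
toy_y = np.array([0, 1, 0, 0])
print(accuracy(lambda x: int(x[0] > 0.5), toy_X, toy_y))  # 0.75
```

With a trained perceptron, `predict_fn` would wrap its `predict` method. If `predict` expects the bias input to be included, the wrapper has to prepend the `1` first.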