# The (Rosenblatt) Perceptron

This version of the Perceptron can be found [here](https://colab.research.google.com/drive/1VfBv9H9Se-30OUMFfjEaq9BL2rKKo-Gy). To see a compact version of the Perceptron, check out [TinyPerceptron](https://colab.research.google.com/drive/1LmTL4WFZA1nKT6K8X2h2BYHb6S-PggEy).

Please send bug reports or updates to gabriele underscore fariello at harvard dot edu.

## Prerequisites

Before you go on, you may want to make sure you're comfortable with the material covered in:
- [Colab Basics & Mounting Google Drive](https://colab.research.google.com/drive/1vCv_PGeDZBvdsMUHrj_O8wScp0uHi2WL)
- [Loading Data & SKLearn](https://colab.research.google.com/drive/1waXfYJevdf9kyhasvUlzw6vf8RvZfv_V)
- Ability to read Python code


## Background

The original Rosenblatt [Perceptron](https://en.wikipedia.org/wiki/Perceptron) was invented by [Frank Rosenblatt](https://en.wikipedia.org/wiki/Frank_Rosenblatt) (1928-1971) in 1957 and published in 1958 in the journal "Psychological Review":

    Rosenblatt, Frank. "The perceptron: a probabilistic model for
    information storage and organization in the brain." Psychological
    review 65.6 (1958): 386.

The publication itself did not contain the code used and, as far as we know, the original code is not accessible. Luckily, the methods section contains enough information to approximate the original code.

"[Towards Data Science](https://towardsdatascience.com/)" has a [nice brief article](https://towardsdatascience.com/what-the-hell-is-perceptron-626217814f53) by Sagar Sharma explaining the Perceptron further. In case you forgot, the original Perceptron is a *Binary Linear Classifier*: it attempts to draw a line (for 2 features), a plane (for 3 features), etc., which creates a decision boundary splitting the data into two regions. Samples with features in one region are labelled one way, and all others the other way.

This code is a heavily modified version, used with permission, of the code originally published in "[Python Machine Learning by Sebastian Raschka](https://www.amazon.com/Python-Machine-Learning-scikit-learn-TensorFlow-ebook/dp/B0742K7HYF)" (from the first edition, but I recommend the 2nd edition for those interested: ISBN-13: 978-1787125933 or ISBN-10: 1787125939).

Sebastian Raschka also has [a good, more in-depth explanation](https://sebastianraschka.com/Articles/2015_singlelayer_neurons.html#frank-rosenblatts-perceptron) on his site.
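
To make the idea of a binary linear classifier from the paragraphs above a little more concrete, here is a rough sketch for two features: points are labelled by which side of a line they fall on. The weights `w0`, `w1`, `w2` and the helper `side_of_line` are made up for this illustration; they are not values learned from the iris data.

```python
# Hypothetical weights describing the line  w0 + w1*x1 + w2*x2 = 0
w0, w1, w2 = -3.0, 1.0, 1.0

def side_of_line(x1, x2):
    """Return +1 for points on or above the line, -1 for points below it."""
    return 1 if (w0 + w1 * x1 + w2 * x2) >= 0 else -1

print(side_of_line(1.0, 1.0))  # 1 + 1 - 3 = -1, which is < 0, so the label is -1
print(side_of_line(2.5, 2.0))  # 2.5 + 2 - 3 = 1.5, which is >= 0, so the label is +1
```

Training a Perceptron amounts to finding weights like these automatically from labelled samples.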

# Load the Data

In [None]:
from sklearn.datasets import load_iris
import numpy as np
import pandas as pd

iris = load_iris()
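dataset = pd.DataFrame(data=np.c_[iris['data'], iris['target']],
                       columns=iris['feature_names'] + ['target'])
# Keep the first 100 rows (targets 0 and 1) and relabel target 0 as -1, everything else as 1
labels = np.where(dataset.iloc[0:100, 4].values == 0, -1, 1)
# Use columns 0 and 2 (sepal length and petal length) as the two features
samples = dataset.iloc[0:100, [0, 2]].values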

In [None]:
dataset

# The Perceptron Class (big)


Below we begin the `Perceptron` class. Remember that we split it across cells using Python's ability to add methods to a class after it has been declared, but you normally don't want to do that.
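
In case the trick of adding methods after the class declaration is unfamiliar, here is a minimal sketch of it. The `Greeter` class and `greet` function are toy names invented for this illustration and have nothing to do with the Perceptron itself.

```python
class Greeter(object):
    """A toy class, used only to show the trick."""
    def __init__(self, name):
        self.name = name

# Define a plain function whose first parameter is `self` ...
def greet(self):
    return "Hello, %s!" % self.name

# ... and attach it to the class. Existing and future instances both get it.
Greeter.greet = greet

print(Greeter("Frank").greet())  # Hello, Frank!
```

Every `Perceptron.some_method = some_method` line in this notebook follows the same pattern.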

Typically you document the class and the `__init__` function in the docstring (the string immediately following the `class` declaration). There are several formats in wide use, but here we'll try to stick to the [NumPy recommendation](https://numpydoc.readthedocs.io/en/latest/format.html) if and when I get around to finishing the documentation.

## Class Declaration

### Perceptron & __init__()

Below is the declaration of the `Perceptron` class with embedded documentation. I've decided to implement it such that `Perceptron` objects are instantiated (i.e. created) with the samples and labels already passed to them. This is not the traditional way to do this, since there are times when it would be preferable to re-use the Perceptron with different inputs (e.g., the Perceptrons in a multi-layer configuration). But we're not going to do that.

This way, we can quickly try different settings, like changing the number of iterations during training, the values of the initial weights, and the learning rate ($\eta$), without passing around the inputs.

In [None]:
class Perceptron(object):
    """
    Parameters
    ----------
        samples : numpy.ndarray
            An array of arrays of features (those variables we will use to train
            on during training and then use when trying to "guess" the species
            of the flower). Each element of the array contains an array of
            features for that sample. For example:
            samples = [[sepal_length_1, petal_length_1],
                       [sepal_length_2, petal_length_2],
                       ...
                       [sepal_length_n, petal_length_n],
                       ]
        labels : numpy.ndarray
            An array of the same length (same size and shape of the first
            dimension) as the `samples` array where each element is the
            "correct" label in number format for each sample. If species_1 = 0
            and species_2 = 1 it might look like this for the above sample:
            labels = [1,0,... 1]
    """
    def __init__(self, samples, labels):
        # Store the samples after adding a 1.0 "hidden" feature to each
        # this is the "bias value". Many implementations separate the
        # bias weight (calling it just bias), but this way we treat
        # all weights the same and simplify the code (see TinyPerceptron)
        self.samples = np.insert(samples,0,np.ones(samples.shape[0]), axis=1) # add 1.0 to each sample
        # Store the labels
        self.labels = labels
        # This is a binary classifier, so if there are not two labels, we have a problem.
        self.unique_labels = np.unique(labels)
        if len(self.unique_labels) != 2:
            raise ValueError("We need exactly two categories/labels. We received %d." %(len(self.unique_labels)))
        self.a_label, self.b_label = self.unique_labels[0], self.unique_labels[1]
        # Set the threshold to halfway between the two
        self.threshold = self.unique_labels.mean()
        if self.samples.shape[0] != labels.shape[0]:
            raise ValueError("We need there to be as many samples (%d) as there are labels (%d)."
                            %(self.samples.shape[0], labels.shape[0]))
        # We want to keep track of the weights as they progress
        self.bias_history = []
        self.weights_history = []
        self.iteration_count = None
        self.num_iterations = None
        self.learning_rate = None
        self.misclassifications = None
        # Initialize the weights
        self.initialize_weights()
        pass
    pass

### set_all_weights()

Since we'll be doing the same thing many times, it's good to create a function. This convenience function takes a NumPy ndarray and sets all the weights (bias included) based on the values in the array.

In [None]:
def set_all_weights(self,arr):
    """
    Set all weights (including the bias) using the np.ndarray arr.

    You probably don't want to call this directly. See other methods
    below.
    
    Parameters
    ----------
    
    arr : numpy.ndarray
        An array of the same length as the number of features per sample plus one
        containing all the values for all the weights.
    """
    if not isinstance(arr, np.ndarray):
        raise ValueError("We need an np.ndarray object as input not a %s" %type(arr))
        pass
    if arr.shape[0] != self.samples.shape[1]:
        raise ValueError("Array must have %d elements, it had %d (%s)" %(
            self.samples.shape[1],arr.shape[0], arr))
        pass
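    # Store a float copy so the in-place updates during training work for
    # integer arrays too and never modify the caller's array
    self.weights = arr.astype(float)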
    self.weights_history = [arr.copy()]
    return self

Perceptron.set_all_weights = set_all_weights

### set_random_weights()

This method sets the bias weight and the weights for each feature to some random value between -1.0 and 1.0 using some of the neat convenience tricks afforded to us through NumPy.

In [None]:
def set_random_weights(self):
    """
    Set weights to random values from -1.0 to 1.0
    """
    return self.set_all_weights(np.random.rand(self.samples.shape[1]) * 2 - 1.0)

Perceptron.set_random_weights = set_random_weights
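
The scaling trick used in `set_random_weights()` can be seen in isolation below. `np.random.rand` returns values in the half-open interval [0.0, 1.0), so multiplying by 2 and subtracting 1.0 stretches and shifts them to [-1.0, 1.0). The fixed seed is an addition for this sketch only, so that the output is repeatable; the method above does not set one.

```python
import numpy as np

np.random.seed(0)                 # fixed seed only so the example is repeatable
unit = np.random.rand(3)          # three values in [0.0, 1.0)
weights = unit * 2 - 1.0          # stretched and shifted to [-1.0, 1.0)

print(unit)
print(weights)
```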

### initialize_weights()

Initialize all the weights (bias included) to some values depending on input.
* If value is `None`, initialize to random values between -1 and 1.
* If value is a NumPy ndarray, set the values to the content of the array.
* Otherwise, assume it's a number, convert it to a float, and set all values to that.

In [None]:
def initialize_weights(self,value=None):
    """
    Initialize the bias and weights.
    
    Parameters
    ----------
    value : None, numpy.ndarray, or float
        If None, all weights will be initialized randomly with values ranging from -1.0 to 1.0
        If np.ndarray, set weights with values in the array
        Otherwise set all weights to value
    """
    if value is None:
        return self.set_random_weights()
    # If it's an np.ndarray, assign
    if isinstance(value, np.ndarray):
        return self.set_all_weights(value)
    return self.set_all_weights(np.full(self.samples.shape[1], float(value)))

Perceptron.initialize_weights = initialize_weights

### get_predicted_float()

The predicted value (before thresholding) is the result of the summation function discussed in class:

\begin{equation*}
\sum_{i=0}^{n} x_i w_i = ( x_0 w_0 + x_1 w_1 + \dots + x_n w_n ) = \begin{bmatrix}
x_0 \\
x_1 \\
\vdots \\
x_n
\end{bmatrix}
\bullet
\begin{bmatrix}
w_0 \\
w_1 \\
\vdots \\
w_n
\end{bmatrix}
\end{equation*}

Where $x_0$ = `1.0` and $w_0$ = `self.weights[0]` are the bias value and the bias weight respectively. Since the bias value is always 1.0, multiplying by it changes nothing, so many implementations keep $w_0$ out as a separate bias term and simply add it to the sum. We don't.

The sum is the same thing as a dot product (the expression after the second equals sign), so it can be computed with `np.dot` instead of a loop, which is better. The `np.dot` function runs in compiled code, typically on top of an optimized linear algebra (BLAS) library, so it is much faster than a Python loop. Note that NumPy itself runs on the CPU; GPU acceleration requires a different array library. With two features per sample, this does not matter too much, but keep that in mind.

**Remember** we added a `1.0` to each sample, so we can get the results of the dot product of BOTH the normal weights ( $w_1,\ w_2\ ...\ w_n$ ) per feature AND the bias ( $w_0$ ) with one `np.dot` call since the features per sample are

```python
    sample = [1.0, feature1, feature2]
```

and the weights are

```python
    self.weights = [bias_weight, feature1_weight, feature2_weight]
```

The tiny extra work done by having to multiply by `1.0` each sample is often dwarfed by the ability to use one single `np.dot` rather than multiple operations, especially when you have many features or, like we will be doing, are retraining on the same data set over and over.

In [None]:
def get_predicted_float(self, features):
    """
    "Predict" the un-thresholded value and return it as a float.
    """
    return np.dot(features,self.weights)

Perceptron.get_predicted_float = get_predicted_float
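
As a quick check that folding the bias into the weights really gives the same number, the sketch below computes the sum once with an explicit loop and once with `np.dot`. The sample and weight values are made up for the illustration.

```python
import numpy as np

# Made-up numbers for one sample with two features
sample = np.array([1.0, 5.1, 1.4])        # [bias value, feature1, feature2]
weights = np.array([-0.4, 0.2, -0.5])     # [bias weight, feature1 weight, feature2 weight]

# Explicit loop over the terms x_i * w_i
looped = 0.0
for x_i, w_i in zip(sample, weights):
    looped += x_i * w_i

# The same sum with a single call
dotted = np.dot(sample, weights)

print(looped, dotted)   # both are -0.4 + 1.02 - 0.7, up to floating-point rounding
```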

### predict()

Now you may recall from lecture that artificial neurons generally have a summation function (here `get_predicted_float()`) and a transformation function that maps the summed value to an output, usually between 0 and 1 or between -1 and 1, sometimes using a sigmoidal function or a rectifier. The transformation function used by the Perceptron is a simple thresholding (aka step) function:

\begin{equation*}
y = \begin{cases}
1 & \text{if } z \geq \theta \\
0 & \text{if } z \lt \theta
\end{cases}
\end{equation*}

OR

\begin{equation*}
y = \begin{cases}
1 & \text{if } z \geq \theta \\
-1 & \text{if } z \lt \theta
\end{cases}
\end{equation*}

Where $z$ is the output of the summation function and theta ($\theta$) is the threshold.

Remember that we pulled `self.a_label` and `self.b_label` from the data in the `__init__()` method and set the threshold (`self.threshold`) to halfway between the two by using the `np.mean()` method.

In [None]:
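def predict(self, features):
    """
    "Predict" the category and return it as one of the two labels.

    Note that values below the threshold map to `self.b_label` (the larger
    label) and all others to `self.a_label`, the reverse of the step
    function shown above. The sign of `error_amount` in `train_once()`
    (predicted minus correct) compensates for this.
    """
    # Get the float predicted value and threshold it to make this binary
    return self.b_label if self.get_predicted_float(features) < self.threshold else self.a_label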

Perceptron.predict = predict
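
For reference, the textbook step function from the equations above can be sketched on its own as follows. The `labels` array and the `step` helper are made up for this illustration. Note that `predict()` as written returns `self.b_label` for values below the threshold, the reverse of this sketch; that still trains correctly because `train_once()` subtracts the correct label from the predicted one.

```python
import numpy as np

labels = np.array([-1, 1, 1, -1])
low_label, high_label = np.unique(labels)      # sorted unique labels: -1, 1
threshold = np.unique(labels).mean()           # halfway between them: 0.0

def step(z):
    """Textbook step function: the higher label when z >= threshold."""
    return high_label if z >= threshold else low_label

print(threshold, step(0.3), step(-0.2), step(0.0))
```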

### update_weights()

This method updates the weights in the direction expected to minimize the error by the calculated "error_amount" and the "learning_rate".

In [None]:
def update_weights(self,error_amount, features):
    if error_amount == 0.0:
        return False
    # Now "nudge" the weights by the error amount multiplied
    # by the learning rate for each feature.
    self.weights +=  self.learning_rate * error_amount * features
    # Store a copy for later visualization
    self.weights_history.append(self.weights.copy())
    return True

Perceptron.update_weights = update_weights
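
A single update can be worked through by hand. The sketch below uses illustrative numbers (not taken from a real run) and the same sign convention as this notebook: the error is the predicted label minus the correct label, and `predict()` returns the larger label when the summed value is below the threshold.

```python
import numpy as np

learning_rate = 0.01
weights = np.zeros(3)                      # [bias weight, w1, w2]
features = np.array([1.0, 7.0, 4.7])       # [bias value, feature1, feature2], illustrative
predicted_label, correct_label = -1, 1     # a misclassified sample

# Same sign convention as train_once(): predicted minus correct
error_amount = predicted_label - correct_label          # -2
weights += learning_rate * error_amount * features

print(weights)                     # each weight moved by 0.01 * -2 * its feature value
print(np.dot(features, weights))   # the summed value is now below 0.0
```

With a threshold of 0.0, the summed value for this sample has moved from 0.0 to below the threshold, which is where `predict()` returns the label 1, so the sample would now be classified correctly.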

### train_once()

This is the method that does one iteration over the training samples and tries to improve the weights.

In [None]:
def train_once(self):
    # Number of errors encountered (misclassifications)
    error_count = 0
    # Iterate over each sample in the training set
    for features, correct_label in zip(self.samples, self.labels):

        # Based on the current weights and bias, what would we predict
        # this sample to be?
        predicted_label = self.predict(features)
        # The difference between the correct and predicted
        # tells us how far off we were. This could be positive
        # or negative. Note that which you subtract from which is
        # very important since the sign matters.
        error_amount = predicted_label - correct_label

        # If there was a difference, nudge the weights.
        if self.update_weights(error_amount,features):
            # Since we were wrong, count the error
            error_count += 1
            pass
        pass
    return error_count

Perceptron.train_once = train_once

### train()

Here is the train function. The learning rate is sometimes called `eta` after the Greek letter $\eta$, which is used in the math discussions of gradient descent.

In [None]:
def train(self, learning_rate=0.01, num_iterations=50, weight_values=None):
    # We store the learning rate so that we can view it later
    # otherwise it disappears after this method is finished.
    self.learning_rate = learning_rate
    # Same for num_iterations
    self.num_iterations = num_iterations
    # Set iteration_count to 0, since we're starting fresh
    self.iteration_count = 0
    # Initialize the weights (bias included)
    self.initialize_weights(weight_values)
    self.misclassifications = []
    for _ in range(self.num_iterations):
        self.iteration_count += 1
        misclassifications = self.train_once()
        self.misclassifications.append(misclassifications)
        if misclassifications <= 0.0:
            return self
        pass
    return self
 
Perceptron.train = train
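
To see the whole training loop in one place, here is a compact standalone sketch. It is not the class above: it uses the textbook convention (error is correct minus predicted, and the higher label is returned when the sum is at or above a threshold of 0.0), and the toy data, the learning rate of 0.5, and all variable names are assumptions made for this example.

```python
import numpy as np

# Toy, linearly separable data (made up for this sketch)
X = np.array([[1, 1], [2, 1], [1, 2], [4, 4], [5, 4], [4, 5]], dtype=float)
y = np.array([-1, -1, -1, 1, 1, 1])

X1 = np.insert(X, 0, 1.0, axis=1)   # prepend the constant bias value 1.0
w = np.zeros(X1.shape[1])           # [bias weight, w1, w2]
learning_rate, max_epochs = 0.5, 50
errors_per_epoch = []

def predict(features):
    # Textbook step function with a threshold of 0.0
    return 1 if np.dot(features, w) >= 0.0 else -1

for _ in range(max_epochs):
    errors = 0
    for features, target in zip(X1, y):
        error_amount = target - predict(features)   # correct minus predicted
        if error_amount != 0:
            w += learning_rate * error_amount * features
            errors += 1
    errors_per_epoch.append(errors)
    if errors == 0:                 # a clean pass over the data: stop early
        break

print(errors_per_epoch, w)
```

The early exit mirrors the `misclassifications <= 0.0` check in `train()`: once a full pass produces no errors, further passes cannot change the weights.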

# Display Methods


Below we add some display methods that allow us to "look" at what is going on in the Perceptron. We do this using a trick that allows us to add methods to a class after the class definition. This is normally not the way you would go about it, but I wanted to have the class definition block only contain the essential class definitions.

## Print Info

Here we just add a convenience function that allows us to "see" the current Perceptron settings.

In [None]:
def print_info(self):
    print("Perceptron:")
    print("  Samples: ................ %d" % self.samples.shape[0])
    print("  Features per Sample: .... %d" % (self.samples.shape[1] - 1))
    print("  Labels: ................. %s" % self.unique_labels)
    print("  Threshold: .............. %s" % self.threshold)
    print("  Learning Rate (eta): .... %s" % self.learning_rate)
    print("  Number of Iterations: ... %s" % self.num_iterations)
    print("  Original Bias: .......... %s" % self.weights_history[0][0])
    print("  Original Weights: ....... %s" % self.weights_history[0][1:])
    print("  Current Bias: ........... %s" % self.weights[0])
    print("  Current Weights: ........ %s" % self.weights[1:])
    print("  Iterations Completed: ... %s" % self.iteration_count)
    if self.misclassifications is None:
        print("  Misclassifications: ..... %s" % self.misclassifications)
    else:
        print("  Misclassifications: ..... %s" % np.sum(self.misclassifications))
        print("  Res: %s" % self.misclassifications)
        pass
    return

Perceptron.print_info = print_info

## Drawing the Decision Boundary Line

To draw a line from one end of the plot to the other we just need to calculate $y$ when $x$ is the smallest value visible on the x-axis ($x_{min}$) and when it is the maximum value on the x-axis ($x_{max}$). We'll call these $y_{x_{min}}$ and $y_{x_{max}}$ respectively. These equations give us what we want:

\begin{equation}
y_{x_{min}} = \frac{-w_1x_{min}}{w_2}+\frac{-w_0}{w_2}=\frac{-(w_1x_{min}+w_0)}{w_2} \qquad  \text{and} \qquad  y_{x_{max}} = \frac{-w_1x_{max}}{w_2}+\frac{-w_0}{w_2}=\frac{-(w_1x_{max}+w_0)}{w_2}
\end{equation}

We then just need to draw a line from the point $\left(x_{min},y_{x_{min}}\right)$ to $\left(x_{max},y_{x_{max}}\right)$ and that's the decision boundary based on the current weights and bias.

**Note**: if $w_2=0$ we have a problem. Also, this assumes we found a boundary. If we did not, the last set of weights may be way off the plot at $x_{min}$ and $x_{max}$.
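
The two endpoint equations, together with the $w_2=0$ caveat, can be sketched as a small standalone function. The name `boundary_endpoints`, the weights, and the axis limits are hypothetical and chosen only for this illustration; the notebook's own methods further down do not include the guard.

```python
def boundary_endpoints(weights, x_min, x_max):
    """Return ((x_min, y), (x_max, y)) on the line w0 + w1*x + w2*y = 0."""
    w0, w1, w2 = weights
    if w2 == 0:
        # The line is vertical (or undefined), so y cannot be written as a function of x
        raise ValueError("w2 is 0, so y cannot be solved for")
    y_at_min = -(w1 * x_min + w0) / w2
    y_at_max = -(w1 * x_max + w0) / w2
    return (x_min, y_at_min), (x_max, y_at_max)

# Hypothetical weights and axis limits
print(boundary_endpoints([-4.0, 0.5, 1.0], 4.0, 7.0))  # ((4.0, 2.0), (7.0, 0.5))
```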

### Detailed Derivation of the Equations

Hopefully you all recall from back in algebra the standard equation for a line:

\begin{equation*}
ax + by = c\  \Longleftrightarrow\  ax + by -c = 0
\end{equation*}

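The slope-intercept form of the line, $y = mx + b$, is just this rearranged (dividing through by the coefficient $b$ of the standard form):

\begin{equation*}
ax + by = c \ \Longleftrightarrow\  by = -ax + c \ \Longleftrightarrow\  y = \left( \frac{-a}{b} \right) x + \frac{c}{b}
\end{equation*}

Where $\left( \frac{-a}{b} \right)$ corresponds to $m$ (the slope, aka "rise over run") and $\frac{c}{b}$ corresponds to the intercept (the $b$ of the slope-intercept form, which is not the same $b$ as in the standard form).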

Well, we can use that to draw the boundary line. When we're looking at how we "predict" a label, we do so by using this equation from above:

\begin{equation*}
\sum_{i=0}^{n} x_i w_i = ( x_0 w_0 + x_1 w_1 + \dots + x_n w_n ) = \begin{bmatrix}
x_0 \\
x_1 \\
\vdots \\
x_n
\end{bmatrix}
\bullet
\begin{bmatrix}
w_0 \\
w_1 \\
\vdots \\
w_n
\end{bmatrix}
\end{equation*}

But since the $x_0$ value is always `1`, this reduces to:

\begin{equation*}
w_0 + ( w_1 \times x_1 ) + ( w_2 \times x_2 )
\end{equation*}

> or, more simply put


\begin{equation*}
x_1 w_1 + x_2 w_2 + w_0
\end{equation*}

Remember that we plot the features $x_1$ on the x-axis and $x_2$ on the y-axis. We can see that they are just like $x$ and $y$ respectively from the $ax + by = c$ equation. In fact

\begin{equation*}
x_1 w_1 + x_2 w_2 + w_0
\end{equation*}


now looks exactly like the left side of the $ax + by - c = 0$ equation. By setting the above expression to zero, we can solve for all parts of the line equation. Remember that

\begin{equation}
\begin{gathered}
y = x_2 \ ;\  x = x_1 \ ;\  a = w_1 \ ;\  b = w_2 \ ; \ c = -w_0 \\
x_1 w_1 + x_2 w_2 + w_0 = 0
\end{gathered}
\end{equation}

The $x$ intercept is defined as the value of $x$ when $y=0$ (it's where the line crosses the $x$ axis) and conversely the $y$ intercept is the value of $y$ when $x=0$. Rearranging the standard equation to solve for $x$ and $y$ individually we have:

\begin{equation}
\begin{aligned}
&x = \frac{c - by}{a} \ &\to \text{when}\ y=0 &\to &x = \frac{c}{a} \ &\to \text{substituting} &\to &x_1=\frac{-w_0}{w_1} \\
&y = \frac{c - ax}{b} &\to \text{when}\ x=0 &\to &y = \frac{c}{b} \ &\to \text{substituting} &\to &x_2=\frac{-w_0}{w_2}
\end{aligned}
\end{equation}

> To calculate the slope

\begin{equation}
m=\frac{-a}{b} \ \to \text{substituting} \to\ m = \frac{-w_1}{w_2} 
\end{equation}

> plugging values in to get $y$ for the maximum and minimum displayed values of $x$:

\begin{equation}
\begin{aligned}
y_{x_{min}}&=\frac{c-ax_{min}}{b} \ \to \text{substituting} \to\  y_{x_{min}} = \frac{-w_0-w_1x_{min}}{w_2} \\
y_{x_{max}}&=\frac{c-ax_{max}}{b} \ \to \text{substituting} \to\  y_{x_{max}} = \frac{-w_0-w_1x_{max}}{w_2}
\end{aligned}
\end{equation}

> or more simply using the $y = mx + b$ version of the equation for a line, where $m = \frac{-w_1}{w_2}$ and $b$ is the y-intercept $b=y_{x=0}=\frac{-w_0}{w_2}$:

\begin{equation}
y_{x_{min}}=mx_{min} + b \ \to \text{substituting} \to\  y_{x_{min}} = \frac{-w_1x_{min}}{w_2}+\frac{-w_0}{w_2} \\
y_{x_{max}}=mx_{max} + b \ \to \text{substituting} \to\  y_{x_{max}} = \frac{-w_1x_{max}}{w_2}+\frac{-w_0}{w_2}
\end{equation}

As expected, the two are the same.
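
The agreement between the two forms can also be checked numerically. The weights and the value of `x` below are hypothetical, chosen so that the arithmetic is exact in floating point.

```python
w0, w1, w2 = 1.5, -2.0, 4.0     # hypothetical weights
x = 3.0

# Standard form solved for y, with a = w1, b = w2, c = -w0
y_standard = (-w0 - w1 * x) / w2
# Slope-intercept form, with slope -w1/w2 and y-intercept -w0/w2
y_slope = (-w1 / w2) * x + (-w0 / w2)

print(y_standard, y_slope)              # 1.125 1.125
# Any point on the boundary makes the weighted sum zero
print(w0 + w1 * x + w2 * y_standard)    # 0.0
```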

### get_slope() - Get the slope for the given weights

In [None]:
def get_slope(self,weights=None):
    if weights is None:
        weights = self.weights
        pass
    return - (weights[1] / weights[2])

Perceptron.get_slope = get_slope

### get_y_at_x() - Get a $y$ value for a given $x$ value with the current or provided weights

In [None]:
def get_y_at_x(self,x,weights=None):
    if weights is None:
        weights = self.weights
        pass
    return -(x * (weights[1] / weights[2])) - (weights[0]/weights[2])

Perceptron.get_y_at_x = get_y_at_x

### get_x_at_y() - Get a $x$ value for a given $y$ value with the current or provided weights

In [None]:
def get_x_at_y(self,y,weights=None):
    if weights is None:
        weights = self.weights
        pass
    return -(y * (weights[2] / weights[1])) - (weights[0]/weights[1])

Perceptron.get_x_at_y = get_x_at_y

### plot_boundary_line() - Plot the boundary line and shade the regions

In [None]:
import matplotlib.pyplot as plt

def plot_boundary_line(self,ax):
    """
    Calculate the y values for the boundary line at x_min and x_max and draw it on the provided canvas
    """
    # And we can get the first and last x values from the canvas
    x_min, x_max = ax.get_xlim()
    # get y1 at x_min
    y1 = self.get_y_at_x(x_min)
    # get y2 at x_max
    y2 = self.get_y_at_x(x_max)
    #print("slope=%0.3f" % self.get_slope(),"(x1,y1)=(%0.3f,%0.3f)"%(x_min,y1),"(x2,y2)=(%0.3f,%0.3f)"%(x_max,y2))
    # Plotting a line from x_min, y_at_x_min to x_max, y_at_x_max will
    # draw the line from one end of the plot to the other that shows
    # the decision boundary at the time that this method was called.
    ax.plot([x_min,x_max],[y1,y2],'k-',label='decision boundary', alpha=0.5)
    # Shade the regions
    # Get the min and max values on the y-axis, since that's all that
    # is displayed
    y_min,y_max = ax.get_ylim()
    # Between the line and y_min, color it reddish
    ax.fill_between([x_min,x_max],y_min,[y1,y2],facecolor='r', alpha= 0.1)
    # Between the line and y_max color it blueish
    ax.fill_between([x_min,x_max],y_max,[y1,y2],facecolor='b', alpha= 0.1)
    # Add a title
    ax.set_title("Iris Data with Decision Boundary")
    return

Perceptron.plot_boundary_line = plot_boundary_line

## Plotting the Data with the Decision Boundary

In [None]:
def plot_data(self,ax):
    # Remove the first column (bias values of 1.0)
    samples = self.samples[:,1:3]
    # Calculate the max and min making them 0.1 wider for the plot so the
    # markers fit visibly.
    x_min, x_max = samples[:, 0].min() - 0.1, samples[:, 0].max() + 0.1
    y_min, y_max = samples[:, 1].min() - 0.1, samples[:, 1].max() + 0.1
    # Size plot to fit data tightly
    ax.set_xlim(x_min, x_max)
    ax.set_ylim(y_min, y_max)
    # Setosa Samples
    setosa = samples[self.labels == self.a_label]
    x_s,y_s = setosa[:,0],setosa[:,1]
    # Versicolor Samples
    versicolor = samples[self.labels == self.b_label]
    x_v,y_v = versicolor[:,0],versicolor[:,1]
    # Draw the decision boundary
    self.plot_boundary_line(ax)
    # Plot the feature samples. We use transparency because some of the samples
    # have identical feature values and it's nice to see when there are two or
    # more "dots" on the plot.
    ax.scatter(x_s, y_s, color='red', marker='o', label='setosa', alpha=0.33)
    ax.scatter(x_v, y_v, color='blue', marker='p', label='versicolor', alpha=0.33)
    ax.set_title("Iris Data")
    # Feature column 0 is sepal length and feature column 1 is petal length
    ax.set_xlabel('sepal length')
    ax.set_ylabel('petal length')
    ax.legend(loc='upper left')
    return

Perceptron.plot_data = plot_data

## Plotting the Errors (Misclassifications) for each Iteration (Epoch)

In [None]:
def plot_errors(self,ax, bar=True):
    # Titles are nice.
    ax.set_title("Errors per Iteration (Epoch)")
    ax.set_xlabel('Iterations (Epochs)')
    ax.set_ylabel('Number of Errors')
    if bar:
        x = [x + 1 for x in range(len(self.misclassifications))]
        y = self.misclassifications
        ax.bar(x,y,color=(0.2, 0.4, 0.6, 0.6))
        # Set the tick positions first, then label them
        ax.set_xticks(x)
        ax.set_xticklabels(x)
        return
    # Set the x axis limits
    ax.set_xlim(0, len(self.misclassifications))
    # Set the y axis limits to one more than the max value so that it
    # does not look cut-off
    ax.set_ylim(0, np.max(self.misclassifications) + 1)
    # We really only want integer tick marks
    ax.set_yticks(range(np.max(self.misclassifications) + 1))
    ax.set_xticks(range(len(self.misclassifications)))
    ax.plot(range(1, len(self.misclassifications) + 1), self.misclassifications, marker='o')
    return 

# Add to class
Perceptron.plot_errors = plot_errors

## Put Table of Attributes on Plot

### plot_info_table()

In [None]:
def plot_info_table(self,ax):
    cellText = []
    cellText.append(["Samples", "%d" % self.samples.shape[0]])
    cellText.append(["Setosa", "%d" % np.sum(self.labels==self.a_label)])
    cellText.append(["Versicolor", "%d" % np.sum(self.labels==self.b_label)])
    cellText.append(["Features", "%d" % (self.samples.shape[1] - 1)])
    cellText.append(["Labels", "%s" % self.unique_labels.shape[0]])
    for i,l in enumerate(self.unique_labels):
        cellText.append([" - Label %d" %(i+1), "%0.1f" %l])
        pass
    cellText.append(["Threshold", "%0.2f" % self.threshold])
    cellText.append([r"Learning Rate ($\eta$)", "%0.2f" % self.learning_rate])
    cellText.append(["Max Iterations", "%s" % self.num_iterations])
    for i,w in enumerate(self.weights_history[0]):
        cellText.append([r"$w_{}$ at $t_0$".format(i), "%0.2f" % w])
        pass
    t=len(self.misclassifications)
    for i,w in enumerate(self.weights):
        cellText.append([r"$w_{}$ at $t_{{{}}}$".format(i,t), "%0.2f" % w])
        pass
    cellText.append(["Slope", "%0.2f" % self.get_slope()])
    cellText.append(["Iterations", "%s" % self.iteration_count])
    cellText.append(["Total Errors", "%s" % np.sum(self.misclassifications)])
    header_color = (0.4, 0.6, 0.8, 0.25)
    table = ax.table(
        cellText=cellText,
        colLabels=["Attribute","Value"],
        colColours=[header_color,header_color],
        colWidths=[0.5,0.15],
        cellLoc='left',
        loc='center right',
    )
    self.align_col(table,1,"right")
    table.set_fontsize(12)
    table.scale(1.24,1.75)
    ax.axis("off")
    pass

Perceptron.plot_info_table = plot_info_table

### align_col() - Change the alignment of a matplotlib table column.

Could not find a better way to do this.

In [None]:
def align_col(self,table, col, align="left"):
    cells = [key for key in table._cells if key[1] == col]
    for cell in cells:
        table._cells[cell]._loc = align
        pass
    pass

Perceptron.align_col = align_col

## Plotting All Decision Boundaries We Tried

### fit_vals() - Make sure that all the lines fit nicely in a plot.

Fit all the attempts on a plot by making all lines go from one side of the plot box to the other, wherever that might be.

In [None]:
def fit_vals(self,y0,x_min,x_max,y_min,y_max,ws):
    x = self.get_x_at_y(y0,ws)
    if x > x_max:
        x = self.get_x_at_y(self.get_y_at_x(x_max,ws),ws)
    elif x < x_min:
        x = self.get_x_at_y(self.get_y_at_x(x_min,ws),ws)
        pass
    y = self.get_y_at_x(x,ws)
    if y > y_max:
        y = self.get_y_at_x(self.get_x_at_y(y_max,ws),ws)
    elif y < y_min:
        y = self.get_y_at_x(self.get_x_at_y(y_min,ws),ws)
        pass
    return x,y

Perceptron.fit_vals = fit_vals
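
One alternative way to make a line span the plot box, sketched here as a standalone function rather than as a replacement for `fit_vals()`, is to intersect the line with each of the four edges of the box and keep the intersections that fall inside it. The function name `clip_line_to_box` and the example values are hypothetical. A side effect of this approach is that vertical lines ($w_2=0$) need no special handling.

```python
def clip_line_to_box(weights, x_min, x_max, y_min, y_max):
    """Return the points where w0 + w1*x + w2*y = 0 crosses the box edges."""
    w0, w1, w2 = weights
    points = []
    if w2 != 0:
        # Left and right edges: solve for y at a fixed x
        for x in (x_min, x_max):
            y = -(w0 + w1 * x) / w2
            if y_min <= y <= y_max:
                points.append((x, y))
    if w1 != 0:
        # Bottom and top edges: solve for x at a fixed y
        for y in (y_min, y_max):
            x = -(w0 + w2 * y) / w1
            if x_min <= x <= x_max:
                points.append((x, y))
    # A line through a corner is found twice; keep each point once
    return sorted(set(points))

# The line x + y = 4 inside the box [0, 3] x [0, 3]
print(clip_line_to_box([-4.0, 1.0, 1.0], 0.0, 3.0, 0.0, 3.0))  # [(1.0, 3.0), (3.0, 1.0)]
```

An empty list means the line does not cross the box at all, which a caller would need to check before plotting.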

### plot_attempts() - Make a plot with all the decision boundaries we tried.

Create a plot with all the decision boundaries we've tried.

**Notes**:

- we only keep track of changes to the weights, not the number of times those weights were used, so the number of lines will almost never match the number of iterations or even the number of iterations times the number of samples.
- we don't (yet) handle vertical lines (where $w_2$ = 0) since the quick-and-dirty methods we have divide by $w_2$ to get our values. Some day, I may change that, but this is an issue if you set all weights to `0.0`, which we do. In that case, we just drop the resulting `NaN` values, but NumPy will still emit warnings.
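
If those warnings become distracting, one option is NumPy's `np.errstate` context manager, which silences them for a single block of code. This is a suggestion, not something the methods in this notebook do. The sketch repeats the arithmetic of `get_y_at_x()` at `x = 0` with all weights set to `0.0`.

```python
import numpy as np

weights = np.zeros(3)            # all weights 0.0

# Silence the divide-by-zero / invalid-value warnings for this block only
with np.errstate(divide="ignore", invalid="ignore"):
    y_at_zero = -(0 * (weights[1] / weights[2])) - (weights[0] / weights[2])

print(y_at_zero, np.isnan(y_at_zero))   # nan True
```

The result is still `NaN`, so the filtering step in `plot_attempts()` is needed either way; only the warning output changes.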

In [None]:
def plot_attempts(self,ax):
    # get all the values of y where x = 0
    y_int = np.array([self.get_y_at_x(0,weights) for weights in self.weights_history])
    # get all the values of x where y = 0
    x_int = np.array([self.get_x_at_y(0,weights) for weights in self.weights_history])
    # remove any pairs with NaN values
    idxs = []
    wsh = []
    for i, y in enumerate(y_int):
        if np.isnan(y) or np.isnan(x_int[i]):
            idxs.append(False)
        else:
            idxs.append(True)
            wsh.append(self.weights_history[i])
            pass
        pass
    y_int = y_int[idxs]
    x_int = x_int[idxs]
    # Find the min and max for x and y axes
    y_min, y_max, x_min, x_max = y_int.min(), y_int.max(), x_int.min(), x_int.max()
    # Shrink canvas
    ax.set_xlim(x_min,x_max)
    ax.set_ylim(y_min,y_max)
    # Now loop through all the non-NaN weights and draw the longest line you can
    for ws in wsh:
        x1,y1 = self.fit_vals(y_min,x_min,x_max,y_min,y_max,ws)
        x2,y2 = self.fit_vals(y_max,x_min,x_max,y_min,y_max,ws)
        ax.plot([x1,x2],[y1,y2],'k-', alpha=0.33)
        pass
    ax.set_title("Boundaries Tried")
    return

Perceptron.plot_attempts = plot_attempts

## Create Comprehensive Information Plot

### info() - Plot the comprehensive visualization of the Perceptron

In [None]:
import matplotlib.gridspec as gridspec

def info(self, save=False, show=True):
    plt.close()
    # Create four subplots in a 4 x 3 grid
    grid = (4,3)
    # Create a decent sized figure
    fig = plt.figure(figsize=(12, 12), dpi=96, facecolor=None)
    # Top left for misclassifications per epoch
    err_ax = plt.subplot2grid(grid,(0,0), colspan=2)
    self.plot_errors(err_ax)
    # Top Right for text info
    inf_ax = plt.subplot2grid(grid,(0,2), rowspan=4)
    self.plot_info_table(inf_ax)
    # Middle for data and boundary
    dat_ax = plt.subplot2grid(grid, (1,0), colspan=2, rowspan=2)
    self.plot_data(dat_ax)
    # Bottom for attempts
    att_ax = plt.subplot2grid(grid, (3,0), colspan=2)
    self.plot_attempts(att_ax)
    fig.tight_layout()
    if save:
        import time
        timestr = time.strftime("%Y%m%d-%H%M%S")
        filename = "Perceptron-%s.svg" % timestr
        print("Saving '%s'" %(filename))
        plt.savefig(filename)
        pass
    if show:
        plt.show()
        pass
    return self

# Add to class
Perceptron.info = info

# Running the Perceptron

In [None]:
# Hopefully it looks good at this point. So let's run a trial and plot the
# misclassifications each iteration (epoch)
perc = Perceptron(samples,labels)
_ = perc.train(learning_rate=0.01, num_iterations=50, weight_values=0.0).info(save=True,show=True)

In [None]:
# Now try with random weights


In [None]:
# Now try with random weights & changing the learning rate = 100


In [None]:
# Now try with random weights & learning rate = 0.1 & changing the num_iterations to 3 
