# The Perceptron algorithm at work

In this notebook, we will look in detail at the Perceptron algorithm for learning a linear classifier in the case of binary labels.

### Import

In [None]:
%matplotlib inline

import numpy as np
import matplotlib
import matplotlib.pyplot as plt

matplotlib.rc('xtick', labelsize=14) 
matplotlib.rc('ytick', labelsize=14)

## The algorithm

This first procedure, **evaluate_classifier**, takes as input the parameters of a linear classifier (`w,b`) as well as a data point (`x`) and returns the prediction of that classifier at `x`.

The prediction is:
* `1`  if `w.x+b > 0`
* `0`  if `w.x+b = 0`
* `-1` if `w.x+b < 0`

In [None]:
np.sign(-1), np.sign(0), np.sign(0.5)

In [None]:
def evaluate_classifier(w, b, x):
    return np.sign(w @ x + b)
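
As a quick check of the three cases listed above, here is a minimal sketch that evaluates the classifier at one point on each side of the boundary and one point exactly on it. The values of `w` and `b` are made up for illustration; they are not learned from any data set.

```python
import numpy as np

def evaluate_classifier(w, b, x):
    # Same rule as above: the sign of the affine score w.x + b
    return np.sign(w @ x + b)

# Hypothetical parameters: the decision boundary is the line x1 + x2 = 1
w = np.array([1.0, 1.0])
b = -1.0

above = evaluate_classifier(w, b, np.array([2.0, 2.0]))  # score  3.0
on = evaluate_classifier(w, b, np.array([0.5, 0.5]))     # score  0.0
below = evaluate_classifier(w, b, np.array([0.0, 0.0]))  # score -1.0
print(above, on, below)
```

Note that a prediction of `0` never equals a label of `1` or `-1`, so a point lying exactly on the boundary is treated as misclassified.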

Here is the Perceptron training procedure. It is invoked as follows:
* `w,b,converged = train_perceptron(x,y,n_iters)`

where
* `x`: n-by-d numpy array with n data points, each d-dimensional
* `y`: n-dimensional numpy array with the labels (each 1 or -1)
* `n_iters`: the training procedure will run through the data at most this many times (default: 100)
* `w,b`: parameters for the final linear classifier
* `converged`: flag (True/False) indicating whether the algorithm converged within the prescribed number of iterations

If the data is not linearly separable, then the training procedure will not converge.

In [None]:
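def train_perceptron(x, y, n_iters=100):
    n, d = x.shape
    w, b = np.zeros((d,)), 0.0
    converged = False
    np.random.seed(0)  # fixed seed, so every call visits the points in the same order

    for itr in range(n_iters):
        mistakes = 0
        for j in np.random.permutation(n):
            # Update only on misclassified points (a prediction of 0 also counts)
            if evaluate_classifier(w, b, x[j,:]) != y[j]:
                w += y[j] * x[j,:]
                b += y[j]
                mistakes += 1
        # A full pass without any update means every point is classified correctly
        if mistakes == 0:
            converged = True
            break

    if converged:
        print("Perceptron algorithm converged at iteration: {}/{} iterations".format(itr+1, n_iters))
    else:
        print("Perceptron algorithm did not converge within {} iterations".format(n_iters))
    return w, b, converged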
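
To see what a single update does, the sketch below traces one step by hand. The starting parameters and the data point are made up for illustration. After an update on a misclassified point `(x, y)`, the quantity `y * (w.x + b)` grows by exactly `||x||^2 + 1`, which pushes the score for that point toward the correct sign.

```python
import numpy as np

# Hypothetical current parameters and a point they misclassify
w, b = np.array([1.0, 1.0]), 0.0
xj, yj = np.array([2.0, 1.0]), -1

before = yj * (w @ xj + b)   # negative: the point is on the wrong side

# The Perceptron update used in the training loop
w = w + yj * xj
b = b + yj

after = yj * (w @ xj + b)    # positive here: this point is now classified correctly
print(before, after, after - before)
```

A single update is not guaranteed to fix the point in general; it only moves the score in the right direction by `||x||^2 + 1`.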

## The Perceptron at work

The `../../_data/` directory (relative to this notebook) should contain the two-dimensional data files, `data_1.txt` and `data_2.txt`. These files contain one data point per line, along with a label, like:
* `3 8 1` (meaning that point `x=(3,8)` has label `y=1`)
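
The sketch below shows how this format splits into points and labels. It reads from an in-memory string rather than a file, and the three sample lines are invented for illustration.

```python
import io
import numpy as np

# Hypothetical file contents in the format described above: "x1 x2 label"
sample = io.StringIO("3 8 1\n1 2 -1\n4 7 1\n")

data = np.loadtxt(sample)        # one row per line, whitespace-separated
x, y = data[:, :2], data[:, 2]   # first two columns are the point, last is the label
print(x.shape, y)
```

By default `np.loadtxt` skips anything after a `#` character, so a data file may carry comment lines.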

The next procedure, **run_perceptron**, loads one of these data sets, learns a linear classifier using the Perceptron algorithm, and then displays the data as well as the boundary.

In [None]:
def run_perceptron(datafile, iterations=100):
    data = np.loadtxt(datafile)
    n,d = data.shape
    
    # Create training set x and labels y
    x = data[:,0:2]
    y = data[:,2]
    
    # Run the Perceptron algorithm for at most `iterations` passes over the data
    w, b, converged = train_perceptron(x, y, iterations)
    
    # Determine the x1- and x2- limits of the plot
    x1min, x1max = min(x[:, 0]) - 1, max(x[:, 0]) + 1
    x2min, x2max = min(x[:, 1]) - 1, max(x[:, 1]) + 1
    plt.xlim(x1min, x1max)
    plt.ylim(x2min, x2max)
    
    # Plot the data points
    plt.plot(x[(y==1), 0], x[(y==1), 1], 'ro')
    plt.plot(x[(y==-1), 0], x[(y==-1), 1], 'k^')
    
    # If training converged, construct a grid of points at which to evaluate the classifier
    if converged:
        density = 0.05
        xx1, xx2 = np.meshgrid(np.arange(x1min, x1max+density, density), np.arange(x2min, x2max+density, density))
        grid = np.c_[xx1.ravel(), xx2.ravel()]
        Z = np.array([evaluate_classifier(w, b, pt) for pt in grid])
        
        # Show the classifier's boundary using a color plot
        Z = Z.reshape(xx1.shape)
        plt.contourf(xx1, xx2, Z, cmap=plt.cm.PRGn, vmin=-3, vmax=3)
    plt.show()
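
In two dimensions the boundary is the line `w[0]*x1 + w[1]*x2 + b = 0`, so it can also be drawn directly instead of coloring a grid. The sketch below computes two points on that line. `boundary_x2` is a hypothetical helper, not part of the notebook, and the values of `w` and `b` are illustrative rather than learned.

```python
import numpy as np

def boundary_x2(w, b, x1):
    """Hypothetical helper: x2 on the line w[0]*x1 + w[1]*x2 + b = 0 (requires w[1] != 0)."""
    return -(w[0] * x1 + b) / w[1]

w, b = np.array([1.0, 2.0]), -4.0   # illustrative values, not learned from data
x1 = np.array([0.0, 4.0])           # e.g. the left and right limits of a plot
x2 = boundary_x2(w, b, x1)
print(x1, x2)
# Passing x1 and x2 to plt.plot would draw the boundary as a straight line.
```

The helper fails for a vertical boundary (`w[1] == 0`), which is one reason the grid-based plot above is the more robust choice.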

Let's run this on `data_1.txt`. Note that `train_perceptron` calls `np.random.seed(0)`, so repeated runs visit the points in the same order and give the same outcome; remove that line to see the slightly different outcomes produced by the random ordering.

In [None]:
run_perceptron('../../_data/data_1.txt')

And now, let's try running it on `data_2.txt`. *What's going on here?*

In [None]:
run_perceptron('../../_data/data_2.txt', 1000)

### Corner cases

<font color="magenta">Design a data set</font> with the following specifications:
* there are just two data points, with labels -1 and 1
* the two points are distinct, with coordinate values in the range [-1,1]
* the Perceptron algorithm requires more than 1000 iterations to converge
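
One way to explore this is to count how many passes a two-point problem needs as the points move closer together. The sketch below is an assumption-laden experiment, not the notebook's own procedure: `count_passes` is a hypothetical helper that cycles through the points in a fixed order (no shuffling), and the point coordinates and the cap of 5000 passes are arbitrary choices.

```python
import numpy as np

def count_passes(x, y, max_passes=5000):
    """Hypothetical helper: passes used by an in-order Perceptron (no shuffling)."""
    w, b = np.zeros(x.shape[1]), 0.0
    for p in range(1, max_passes + 1):
        mistakes = 0
        for xj, yj in zip(x, y):
            if np.sign(w @ xj + b) != yj:
                w += yj * xj
                b += yj
                mistakes += 1
        if mistakes == 0:
            return p, True       # first pass with no updates
    return max_passes, False     # gave up after max_passes

y = np.array([1, -1])
for gap in (0.5, 0.1, 0.02):
    # Illustrative points: both at x2 = 1, separated by `gap` along x1
    x = np.array([[1.0, 1.0], [1.0 - gap, 1.0]])
    passes, converged = count_passes(x, y)
    print("gap={:<5} passes={:<6} converged={}".format(gap, passes, converged))
```

The classical mistake bound for the Perceptron, at most `(R/γ)^2` updates where `R` bounds the norm of the points and `γ` is the margin, suggests why a small gap can be costly.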

In [None]:
%%writefile ../../_data/data_0.txt
# A linearly separable problem
# Left-hand group: -1
# Right-hand group: +1
1.0 5 1
0.97 5 -1

In [None]:
!head ../../_data/data_0.txt

In [None]:
run_perceptron('../../_data/data_0.txt', 1500)