# Classification with a Perceptron
In this notebook, we will investigate a classification problem using a Perceptron.

### Import modules
Begin by importing the modules used in this notebook:

In [None]:
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.patches import Polygon
from ipywidgets import interact, interactive, fixed, interact_manual
import ipywidgets as widgets

### A Classification Problem
Consider a labeled dataset. In other words, each data point has two measured parameters, $x_{data}$ and $y_{data}$, along with a classification $c_{data}$. For example, suppose you had an upward-looking camera deployed in the shallow waters of Manresa State Beach. From images captured by the camera, you can measure the length of sharks passing by ($x_{data}$, in feet) and the distance between the tips of the pectoral fins ($y_{data}$, in feet). The classifications might then correspond to juvenile sharks (class 1) or mature sharks (class 2).

One such dataset is as follows:

In [None]:
data = np.genfromtxt('classification_scatter.csv',delimiter=',')
x_data = data[:,0]
y_data = data[:,1]
classifications_data = data[:,2]

Since we're going to be plotting the same dataset a few times, let's define a common set of bounds for our axes:

In [None]:
# define some bounds to be used in the plots below
min_x = -1
max_x = 12
min_y = -1
max_y = 9

Next, let's plot the data:

In [None]:
plt.plot(x_data[classifications_data==1],y_data[classifications_data==1],'b.',label='Class 1 (Juvenile)')
plt.plot(x_data[classifications_data==2],y_data[classifications_data==2],'g.',label='Class 2 (Mature)')
plt.gca().set_xlim([min_x,max_x])
plt.gca().set_ylim([min_y,max_y])
plt.xlabel('parameter 1 (shark length, ft)')
plt.ylabel('parameter 2 (pectoral fin width, ft)')
plt.legend(loc=2)
plt.show()

In this example, we want to find a line that separates these two classes. We could start with an initial guess as follows:

In [None]:
slope = 3.0
intercept = -10.0

Plotting the classification "model" (i.e. the dividing line) would give the following:

In [None]:
plt.plot(x_data[classifications_data==1],y_data[classifications_data==1],'b.',label='Class 1')
plt.plot(x_data[classifications_data==2],y_data[classifications_data==2],'g.',label='Class 2')
plot_x = np.linspace(min_x,max_x,100)
plt.plot(plot_x, intercept + slope*plot_x, 'k-')
plt.gca().set_xlim([min_x,max_x])
plt.gca().set_ylim([min_y,max_y])
plt.legend(loc=2)
plt.show()

How many classifications did we get right with this model? We can compute this as follows: if a data point lies above our line, we classify it as 2; otherwise, we classify it as 1. Looking ahead, we'll call this rule an activation function.
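One way to write this rule is sketched below. It is only a possible definition, not necessarily the intended one: it assumes that points lying exactly on the line count as class 1, and it uses the argument order `(intercept, slope, x, y)` that the later cells call it with.

```python
import numpy as np

def activation_function(intercept, slope, x, y):
    """Return 2 where a point lies strictly above the line, otherwise 1."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # height of the dividing line at each x
    line_y = intercept + slope * x
    # assumption: points exactly on the line are treated as class 1
    return np.where(y > line_y, 2, 1)

# quick check with the initial guess from above (intercept -10, slope 3)
example_classes = activation_function(-10.0, 3.0,
                                      np.array([0.0, 5.0]),
                                      np.array([0.0, 2.0]))
```

The first point, (0, 0), sits above the line (which is at -10 when x = 0), while the second, (5, 2), sits below it (the line is at 5 when x = 5).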

In [None]:
# define the activation_function here


Given this activation function, we can compare side by side how our model is working:

In [None]:
classifications_model = activation_function(intercept, slope, x_data, y_data)

plt.figure(figsize=(11,4))

plt.subplot(1,2,1)
plt.plot(x_data[classifications_data==1],y_data[classifications_data==1],'b.',label='Class 1')
plt.plot(x_data[classifications_data==2],y_data[classifications_data==2],'g.',label='Class 2')
plot_x = np.linspace(min_x,max_x,100)
plt.plot(plot_x, intercept + slope*plot_x, 'k-')
plt.gca().set_xlim([min_x,max_x])
plt.gca().set_ylim([min_y,max_y])
plt.legend(loc=2)
plt.title('Data Classifications')

plt.subplot(1,2,2)
plt.plot(x_data[classifications_model==1],y_data[classifications_model==1],'b.',label='Class 1')
plt.plot(x_data[classifications_model==2],y_data[classifications_model==2],'g.',label='Class 2')
plot_x = np.linspace(min_x,max_x,100)
plt.plot(plot_x, intercept + slope*plot_x, 'k-')
plt.gca().set_xlim([min_x,max_x])
plt.gca().set_ylim([min_y,max_y])
plt.legend(loc=2)
plt.title('Modeled Classifications')

plt.show()

Given this model guess, we can define the cost as the number of points the model classifies incorrectly.
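A minimal sketch of such a cost function follows. It assumes both inputs are arrays of class labels (1 or 2) of the same length and simply counts the positions where they disagree.

```python
import numpy as np

def cost_function(classifications_data, classifications_model):
    """Count the points whose modeled class differs from the labeled class."""
    data = np.asarray(classifications_data)
    model = np.asarray(classifications_model)
    # True where the model disagrees with the label; summing counts the Trues
    return np.sum(data != model)

# two of these four made-up points are misclassified
example_cost = cost_function(np.array([1.0, 2.0, 2.0, 1.0]),
                             np.array([1, 1, 2, 2]))
```

Because the comparison works element by element, labels loaded as floats (1.0, 2.0) can be compared directly with integer model output.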

In [None]:
# define the cost function as the number of incorrectly classified points


In [None]:
print('Cost: '+str(cost_function(classifications_data, classifications_model))+' incorrect classifications')

Further, we can compute the error space, i.e. the cost for each combination of intercept and slope on a grid:

In [None]:
intercept_space = np.linspace(-50,50,100)
slope_space = np.linspace(-4,4,100)
I, S = np.meshgrid(intercept_space, slope_space)
Error = np.zeros((100,100))

# fill in the error matrix
for row in range(np.shape(I)[0]):
    for col in range(np.shape(S)[1]):
        classifications_model = activation_function(I[row,col], S[row,col], x_data, y_data)
        Error[row,col] = cost_function(classifications_data, classifications_model)

We can then plot the error space:

In [None]:
C = plt.pcolormesh(intercept_space,slope_space, Error)
plt.contour(intercept_space,slope_space, Error,colors='white',linewidths=0.7)
plt.plot(intercept, slope, 'wo')
plt.colorbar(C, label='cost (# of incorrect classifications)')
plt.title('Error space')
plt.ylabel('slope ($m$)')
plt.xlabel('intercept ($b$)')
plt.show()

Depending on the initial guess for the intercept and slope, we probably don't have a very good model. The idea here is to move through the error space to determine how we should update the parameters and get a better classification model.

Similar to gradient descent in optimization, we can define a gradient:
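One possible definition is sketched here, with a caveat: the count of misclassified points is piecewise constant, so its true gradient is zero almost everywhere. As a stand-in, this sketch assumes the classic perceptron-style direction, in which only misclassified points contribute. The sign is chosen so that the result is meant to be subtracted from the weights (`weights -= learning_rate*weight_gradient`); the `y` argument is kept only to match the call signature used in this notebook.

```python
import numpy as np

def cost_function_gradient(x, y, classifications_data, classifications_model):
    """Perceptron-style gradient with respect to [intercept, slope].

    Assumption: this is a surrogate for the gradient of the misclassification
    count, intended to be subtracted from the weights.
    """
    x = np.asarray(x, dtype=float)
    # +1 where a class-2 point was modeled as class 1 (line is too high there),
    # -1 where a class-1 point was modeled as class 2 (line is too low there),
    #  0 where the model is correct
    error = (np.asarray(classifications_data, dtype=float)
             - np.asarray(classifications_model, dtype=float))
    d_intercept = np.sum(error)
    d_slope = np.sum(error * x)
    return np.array([d_intercept, d_slope])

# made-up example: only the last point (x = 4) is misclassified
example_gradient = cost_function_gradient(
    np.array([1.0, 2.0, 3.0, 4.0]), np.array([1.0, 1.0, 5.0, 5.0]),
    np.array([1, 1, 2, 2]), np.array([1, 1, 2, 1]))
```

In the example, a class-2 point at x = 4 falls below the line, so subtracting the gradient lowers both the intercept and the slope, which moves the line down at that point.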

In [None]:
# define the cost_function_gradient here


This gradient can then be used to improve on an initial guess. First, define a learning rate and a starting guess:

In [None]:
learning_rate = 0.002
intercept = 1.0 # starting intercept guess
slope = 1.0 # starting slope guess
weights = np.array([intercept, slope])

One iteration consists of computing the modeled classifications, evaluating the gradient, and updating the weights based on the gradient:

In [None]:
# compute the modeled values
classifications_model = activation_function(intercept, slope, x_data, y_data)
weight_gradient = cost_function_gradient(x_data, y_data, classifications_data, classifications_model)
weights -= learning_rate*weight_gradient
print(weights)
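Repeating this update in a loop gives a simple training routine. The sketch below is self-contained so it can run on its own: the three helper functions are assumed definitions (this notebook leaves them for the reader to write), `train` is a hypothetical helper name, and the toy arrays are made up rather than taken from the shark dataset.

```python
import numpy as np

# Assumed helper definitions; the notebook leaves these as exercises.
def activation_function(intercept, slope, x, y):
    # class 2 strictly above the line y = intercept + slope*x, class 1 otherwise
    return np.where(y > intercept + slope * x, 2, 1)

def cost_function(classifications_data, classifications_model):
    # number of misclassified points
    return np.sum(classifications_data != classifications_model)

def cost_function_gradient(x, y, classifications_data, classifications_model):
    # perceptron-style direction, meant to be subtracted from the weights
    error = classifications_data - classifications_model
    return np.array([np.sum(error), np.sum(error * x)], dtype=float)

def train(x, y, classifications, initial_weights, learning_rate, n_iterations):
    """Hypothetical helper: repeat the single-iteration update n_iterations times."""
    weights = np.array(initial_weights, dtype=float)  # copy, so the guess is untouched
    cost_history = []
    for _ in range(n_iterations):
        model = activation_function(weights[0], weights[1], x, y)
        # record the cost before applying this iteration's update
        cost_history.append(int(cost_function(classifications, model)))
        weights -= learning_rate * cost_function_gradient(x, y, classifications, model)
    return weights, cost_history

# Made-up toy data (not the shark dataset): two points per class.
toy_x = np.array([1.0, 2.0, 3.0, 4.0])
toy_y = np.array([1.0, 1.0, 5.0, 5.0])
toy_classes = np.array([1, 1, 2, 2])

toy_weights, toy_costs = train(toy_x, toy_y, toy_classes, [1.0, 1.0], 0.002, 5)
print(toy_weights, toy_costs)
```

Recording the cost at every iteration makes it easy to see whether the updates are actually reducing the number of misclassified points.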

We can build a slider to examine how this looks over many iterations:

In [None]:
def plot_fit_and_cost(initial_guess, n_iterations):

    weights = np.copy(initial_guess)
    classifications_model = activation_function(weights[0], weights[1], x_data, y_data)
    for n in range(n_iterations):
        weight_gradient = cost_function_gradient(x_data, y_data, classifications_data, classifications_model)
        weights -= learning_rate*weight_gradient
        classifications_model = activation_function(weights[0], weights[1], x_data, y_data)
    
    fig = plt.figure(figsize=(11,5))
    
    plt.subplot(1,2,1)
    plt.plot(x_data[classifications_model==1],y_data[classifications_model==1],'b.',label='Class 1')
    plt.plot(x_data[classifications_model==2],y_data[classifications_model==2],'g.',label='Class 2')
    plot_x = np.linspace(min_x,max_x,100)
    plt.plot(plot_x, weights[0] + weights[1]*plot_x, 'k-')
    plt.gca().set_xlim([min_x,max_x])
    plt.gca().set_ylim([min_y,max_y])
    plt.title('Fit after '+str(n_iterations)+' iteration(s)')
    plt.legend(loc=2)
    plt.ylabel('y')
    plt.xlabel('x')
    
    plt.subplot(1,2,2)
    C = plt.pcolormesh(intercept_space,slope_space, Error)
    plt.contour(intercept_space,slope_space, Error,colors='white',linewidths=0.7)
    plt.plot(initial_guess[0], initial_guess[1], 'wo')
    plt.plot(weights[0], weights[1], 'ko')
    plt.text(initial_guess[0]+2, initial_guess[1], r'$\leftarrow$ Initial',color='white',va='center')
    classifications_model = activation_function(weights[0], weights[1], x_data, y_data)
    misclassified_points = int(cost_function(classifications_data, classifications_model))
    if n_iterations>0:
        plt.text(weights[0]+2, weights[1], r'$\leftarrow$ Final ('+str(misclassified_points)+')',color='white',va='center')
    plt.colorbar(C, label='misclassified points')
    plt.title('Error space')
    plt.ylabel('slope ($m$)')
    plt.xlabel('intercept ($b$)')
    plt.show()


In [None]:
interact(plot_fit_and_cost, initial_guess=fixed(np.array([intercept, slope])),
         n_iterations=widgets.IntSlider(min=0, max=500));

Here, we have "trained" a model based on available data.