# Perceptron

This notebook shows an implementation of a perceptron in Python, whose structure is illustrated below.

<center><img src="resources/img/perceptron.png" alt="Perceptron" width="500px"/></center>

Several datasets are evaluated and the results are discussed in the following cells. The main idea is to show, in a 2D figure, how a perceptron distinguishes between classes by drawing the line that separates two classes of a dataset, as illustrated below with a 3D surface.

<center><img src="resources/img/linear_classifier.png" alt="Separation line of a linear classifier" width="500px"/></center>
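
For reference, the decision rule of a two-input perceptron can be written as follows. The notation is an assumption chosen to match the code in this notebook, where the bias is treated as an extra weight $w_0$ attached to a constant input of $-1$, and $c_0$, $c_1$ stand for the two class labels:

$$
u = w_1 x_1 + w_2 x_2 - w_0, \qquad
y = \begin{cases} c_0 & \text{if } u < 0 \\ c_1 & \text{otherwise} \end{cases}
$$

The separation line is the set of points where $u = 0$, that is, $x_2 = (w_0 - w_1 x_1) / w_2$ whenever $w_2 \neq 0$.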

Before we start, it is useful to define some functions first.

### Useful functions
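
As a minimal sketch of the seeded uniform sampling used by the helpers in this section: samples drawn from [0, 1) are rescaled to [a, b) with `(b - a) * r + a`. The function name `uniform_between` is a hypothetical stand-in, and a fresh generator is created on every call only to make the reproducibility visible.

```python
from numpy.random import MT19937, RandomState, SeedSequence

def uniform_between(a, b, size, seed=123):
    """Draw `size` samples uniformly from [a, b) with a reproducible seed."""
    rng = RandomState(MT19937(SeedSequence(seed)))
    r = rng.rand(size)          # uniform samples in [0, 1)
    return (b - a) * r + a      # rescale to [a, b)

samples = uniform_between(-1, 1, 5)
same_samples = uniform_between(-1, 1, 5)  # same seed, so the same values
```

Because the generator is seeded, both calls return identical arrays, which is what makes the weight initialization repeatable between runs.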

In [1]:
from numpy        import inner, mean, shape
from numpy.random import MT19937, RandomState, SeedSequence

def new_random_state(seed=123):
    return RandomState(MT19937(SeedSequence(seed)))
random_state = new_random_state()

# Uniform random between a and b
def urand(a, b, *args, **kwargs):
    r   = random_state.rand(*args, **kwargs)
    rab = (b - a) * r + a
    return rab

ndims = lambda a: shape(a)[1]       # number of dimensions (columns)
msqr  = lambda a: mean(inner(a,a))  # for a 1-D array, inner(a,a) is already a scalar: the sum of squares

### Load data function
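
The loading step can be illustrated on an in-memory file. This is only a sketch under assumptions: the layout of the dataset files is not documented here, so the example assumes three whitespace-separated columns (`x1`, `x2` and a 1-based class label), and the sample values are made up.

```python
from io import StringIO
from numpy import loadtxt

# Hypothetical file content: columns are x1, x2 and a 1-based class label
sample_file = StringIO(
    "0.5 1.0 1\n"
    "2.0 3.5 2\n"
    "1.5 0.2 1\n"
)

data   = loadtxt(sample_file)   # array of shape (3, 3)
inputs = data[:, 0:2]           # first two columns: the 2D input points
labels = data[:, 2] - 1         # shift labels from {1, 2} to {0, 1}
```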

In [2]:
from numpy import loadtxt, transpose

def load_data(filename):
    data   = loadtxt(filename)
    inputs = data[:,0:2]
    labels = data[:,2] - 1
    return inputs, labels

### 2D-Output Neuron base class
This cell implements a base neuron class that can be used with either a perceptron, an ADALINE, or any other similar structure with a single output.
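
The adaptation rule shared by this family of neurons can be sketched in isolation. The snippet below is a simplified illustration, not the class defined in this notebook: it assumes labels in {0, 1}, zero initial weights, a learning rate of 1, and hypothetical names (`train_sketch`, `X`, `y`). The bias is handled by prepending a constant input of -1 to every sample.

```python
import numpy as np

def train_sketch(X, y, learning_rate=1.0, max_epochs=100):
    """Online perceptron training; returns the weights and the epochs used."""
    Xb = np.insert(X, 0, -1.0, axis=1)        # prepend the bias input of -1
    w = np.zeros(Xb.shape[1])
    for epoch in range(max_epochs):
        outputs = np.where(Xb @ w < 0, 0, 1)  # step activation
        if np.all(outputs == y):              # no misclassified samples left
            return w, epoch
        for x_i, y_i in zip(Xb, y):
            out = 0 if x_i @ w < 0 else 1
            w = w + learning_rate * (y_i - out) * x_i  # adaptation rule
    return w, max_epochs

# Tiny linearly separable example: logical AND
X = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]])
y = np.array([0, 0, 0, 1])
w, epochs = train_sketch(X, y)
predictions = np.where(np.insert(X, 0, -1.0, axis=1) @ w < 0, 0, 1)
```

The loop stops as soon as every sample is classified correctly, which the perceptron convergence theorem guarantees for linearly separable data.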

In [3]:
from numpy        import append, full, inner, insert, isscalar, mean, ones, ones_like, shape, sign, unique, where
from numpy.random import permutation

class Neuron():
    def __init__(self, input_length, learning_rate=0.01, max_epochs=1e2, min_loss=1e-2):
        self.input_length  = int(input_length)
        self.learning_rate = learning_rate
        self.max_epochs    = int(max_epochs)
        self.min_loss      = min_loss
        self.weights       = self.__initialize_weight__(length=input_length+1)

        # Properties defined elsewhere
        self.epochs        = None
        self.inputs        = None
        self.labels        = None
        self.losses        = None
        self.unique_labels = None

    def __adaptat__(self, input, error):
        w = self.weights[-1]
        n = self.learning_rate
        new_weight   = w + n*input*error # adaptation rule
        self.weights = append(self.weights, [new_weight], axis=0)

    def __calc_cost__(self, errors):
        cost = msqr(errors) # sum of squared errors: inner() already reduces a 1-D array to a scalar
        return cost

    def __calc_error__(self, labels, outputs):
        diff   = labels - outputs
        errors = sign(diff)
        return errors

    def __expand_inputs__(self, inputs):
        num_inputs      = shape(inputs)[0]
        biases          = -ones((1, num_inputs))
        expanded_inputs = insert(inputs, 0, biases, axis=1)
        return expanded_inputs

    def __has_converged__(self, errors, epoch=-1):
        self.losses[epoch] = self.__calc_cost__(errors)
        converged          = self.losses[epoch] <= self.min_loss # convergence rule
        return converged

    def __initialize_weight__(self, length):
        w0 = urand(-1, 1, length)
#         w0 = [-0.6270591 ,  1.96598375, -1.89546286]
        return [w0]

    def __predict__(self, input, epoch=-1):
        weights     = self.weights[epoch]
        inner_state = inner(weights, input) # induced local field
        output      = self.__activation_function__(inner_state)
        return output.tolist()

    def predict(self, input, epoch=-1):
        expanded_input = self.__expand_inputs__(input)
        output         = self.__predict__(expanded_input, epoch)
        return output

    def train(self, inputs, labels, shuffle_data=False):
        def prepare_training(inputs, labels, shuffle_data):
            self.inputs        = inputs
            self.labels        = labels
            self.unique_labels = unique(labels)
            self.num_inputs    = shape(inputs)[0]
            self.losses        = full(self.max_epochs + 1, float('nan'))
            inputs             = self.__expand_inputs__(inputs)
            if shuffle_data:
                permutations   = permutation(self.num_inputs)
                inputs         = inputs[permutations]
                labels         = labels[permutations]
            return inputs, labels

        def finish_training(epoch):
            self.epochs  = epoch
            self.losses  = self.losses[:epoch + 1] # keep only data until convergence reached
            self.weights = self.weights[::self.num_inputs] # keep only epoch updates

        inputs, labels = prepare_training(inputs, labels, shuffle_data)
        for epoch in range(self.max_epochs + 1):
            outputs = self.__predict__(inputs)
            errors  = self.__calc_error__(labels, outputs)
            if self.__has_converged__(errors, epoch): break
            for input, label in zip(inputs, labels):
                output = self.__predict__(input)
                error  = self.__calc_error__(label, output)
                self.__adaptat__(input, error)
        finish_training(epoch)

### Perceptron class
The Perceptron class inherits from the base Neuron class and implements its own activation function.

The function `kernel` is used to determine the separation line between two classes.

The function `predict` is overridden to preprocess the input when any adaptation is required beforehand.
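
To make the role of `kernel` concrete, the sketch below computes points on the separation line directly from a weight vector. It assumes the convention used in this notebook (weights ordered as `[w0, w1, w2]` with a constant bias input of -1), so the boundary satisfies `w1*x1 + w2*x2 - w0 = 0`. The function name `boundary_x2` and the weight values are hypothetical.

```python
import numpy as np

def boundary_x2(weights, x1):
    """Return x2 on the separation line for each x1 (requires w2 != 0)."""
    w0, w1, w2 = weights
    return (w0 - w1 * np.asarray(x1, dtype=float)) / w2

weights = np.array([1.0, 2.0, 4.0])   # illustrative values only
x1 = np.array([-1.0, 0.0, 1.0])
x2 = boundary_x2(weights, x1)

# Every point on the boundary has a zero induced local field
points = np.column_stack([-np.ones_like(x1), x1, x2])
local_field = points @ weights
```

Two such points are enough to draw the line in a 2D figure, and the sign of the local field on either side tells which class each half-plane belongs to.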

In [4]:
from numpy import ndim

class Perceptron(Neuron):
    def __activation_function__(self, u):
        C = self.unique_labels
        act_fcn = where(u < 0, C[0], C[1])
        return act_fcn

    def kernel(self, input_class1=None, input_class2=None, epoch=-1):
        # Solve the boundary equation w1*x1 + w2*x2 - w0 = 0 for the missing coordinate
        weight = self.weights[epoch]
        if input_class1 is None:
            weight = -weight/weight[1] # normalize once, so that x1 = inner(weight, [-1, 0, x2])
            input_class1 = []
            for x in input_class2:
                input = self.__expand_inputs__([[0, x]])
                input_class1.append(inner(weight, input)[0])
            return input_class1
        elif input_class2 is None:
            weight = -weight/weight[2] # normalize once, so that x2 = inner(weight, [-1, x1, 0])
            input_class2 = []
            for x in input_class1:
                input = self.__expand_inputs__([[x, 0]])
                input_class2.append(inner(weight, input)[0])
            return input_class2
        else:
            return None

    def predict(self, input, epoch=-1):
        if ndim(input) == 1:
            input = [input]
        return super().predict(input, epoch)

### Figure class
The following class may look complex, but it is composed of a set of functions that process the neuron output and convergence, illustrating the results over the epochs.

For the reader interested in the perceptron itself, this cell can be skipped.

In [5]:
from bokeh.io       import output_notebook
from bokeh.layouts  import column, row
from bokeh.models   import ColumnDataSource, CustomJS, Range1d, Slider, tickers
from bokeh.plotting import Figure, output_file, show
from numpy          import argsort, array, concatenate, sort

output_notebook()

class Figure_neuron():
    def __init__(self, neuron, width=400, height=400):
        self.neuron = neuron
        self.__init_plot_params__()

        [fig1, area_left, area_right] = self.__create_figure1__()
        fig2   = self.__create_figure2__()
        slider = self.__create_slider__(area_left, area_right)
        if slider is None:
            self.layout = row(fig1, fig2)
        else:
            self.layout = column(slider, row(fig1, fig2))

    def __create_figure1__(self, width=400, height=400):
        colors_lr  = self.source1.tags[2][-1]
        figure     = Figure(plot_width=width, plot_height=height)
        area_left  = figure.varea(x='x', y1='y1_left',  y2='y2_left',  source=self.source1, fill_color=colors_lr[0], fill_alpha=0.2)
        area_right = figure.varea(x='x', y1='y1_right', y2='y2_right', source=self.source1, fill_color=colors_lr[1], fill_alpha=0.2)
        for class_input, legend_label, color in zip(self.class_input, self.legend_labels, self.colors):
            x = class_input[:,0]
            y = class_input[:,1]
            c = figure.circle(x, y, size=10, fill_color=color, fill_alpha=0.6, line_color=None, legend_label=legend_label)
        figure.line('x', 'y', source=self.source1, line_width=3, line_alpha=0.6, line_color='black')
        figure.x_range = self.x_range
        figure.y_range = self.y_range
        figure.legend.click_policy = 'hide'
        return figure, area_left, area_right

    def __create_figure2__(self, width=400, height=400):
        x = self.source2.data['x']
        y = self.source2.data['y']
        figure = Figure(plot_width=width, plot_height=height, x_axis_label='Epoch', y_axis_label='Loss')
        figure.line(x, y, line_width=3, line_color='black', line_alpha=0.1)
        figure.line('x', 'y', source=self.source2, line_width=3, line_color='black')
        figure.circle('x', 'y', source=self.source2, size=3, fill_color='black', line_color=None)
        if min(x) != max(x):
            figure.x_range = Range1d(min(x), max(x), bounds="auto")
        else:
            figure.x_range = Range1d(min(x), max(x) + 1, bounds="auto")
        figure.y_range = Range1d(0, max(y) + 1, bounds="auto")
        figure.xaxis.ticker.min_interval = 1
        figure.yaxis.ticker.min_interval = 1
        figure.xaxis.ticker.num_minor_ticks = 0
        figure.yaxis.ticker.num_minor_ticks = 0
        return figure

    def __create_slider__(self, area_left, area_right):
        if self.neuron.epochs == 0:
            return None
        else:
            callback1 = CustomJS(args=dict(area_left=area_left, area_right=area_right, source=self.source1), code="""
                var i         = cb_obj.value
                var x1_span   = source.tags[0][i]
                var x2_span   = source.tags[1][i]
                var colors_lr = source.tags[2][i]
                var x2_span_s = x2_span.slice().sort()
                source.data['x']            = x1_span
                source.data['y']            = x2_span
                source.data['y1_left']      = [x2_span_s[0], x2_span[1]]
                source.data['y2_left']      = [x2_span_s[1], x2_span[1]]
                source.data['y1_right']     = [x2_span[0],   x2_span_s[0]]
                source.data['y2_right']     = [x2_span[0],   x2_span_s[1]]
                area_left.glyph.fill_color  = colors_lr[0]
                area_right.glyph.fill_color = colors_lr[1]
                source.change.emit()
            """)
            callback2 = CustomJS(args=dict(source=self.source2), code="""
                var i = cb_obj.value
                var x = source.tags[0].slice(0, i + 1)
                var y = source.tags[1].slice(0, i + 1)
                source.data['x'] = x
                source.data['y'] = y
                source.change.emit()
            """)
            slider = Slider(start=0, end=self.neuron.epochs, value=self.neuron.epochs, step=1, title='Epoch')
            slider.js_on_change('value', callback1)
            slider.js_on_change('value', callback2)
            return slider

    def __init_plot_params__(self):
        n = self.neuron.input_length
        X = self.neuron.inputs
        Y = self.neuron.labels
        C = self.neuron.unique_labels
        W = self.neuron.weights
        I = range(0, self.neuron.epochs+1)
        L = self.neuron.losses

        self.colors        = ['blue', 'red']
        self.legend_labels = ['Class ' + str(int(label)) for label in self.neuron.unique_labels]

        x1min, x1max = min(X[:,0]), max(X[:,0])
        x2min, x2max = min(X[:,1]), max(X[:,1])
        dx1,   dx2   = x1max - x1min, x2max - x2min
        x1_range     = [x1min - 0.1*dx1, x1max + 0.1*dx1]
        x2_range     = [x2min - 0.1*dx2, x2max + 0.1*dx2]
        X1_span      = []
        X2_span      = []
        colors_lr    = []
        for epoch in I:
            x1_span = array(x1_range)
            x2_span = array(self.neuron.kernel(input_class1=x1_span, epoch=epoch))
            i       = argsort(x2_span)
            if x2_span[i[0]] > x2_range[0] or x2_span[i[1]] < x2_range[1]:
                if x2_span[i[0]] > x2_range[0]:
                    x2_span[i[0]] = x2_range[0]
                if x2_span[i[1]] < x2_range[1]:
                    x2_span[i[1]] = x2_range[1]
                x1_span = array(self.neuron.kernel(input_class2=x2_span, epoch=epoch))
            X1_span.append(x1_span)
            X2_span.append(x2_span)


            input_left   = [x1_span[0] - 1, x2_span[0]]
            input_right  = [x1_span[0] + 1, x2_span[0]]
            output_left  = self.neuron.predict(input=input_left,  epoch=epoch)
            output_right = self.neuron.predict(input=input_right, epoch=epoch)

            if output_left < output_right:
                color_left  = self.colors[0]
                color_right = self.colors[1]
            else:
                color_left  = self.colors[1]
                color_right = self.colors[0]
            colors_lr.append([color_left, color_right])
        x2_span_s = sort(x2_span)

        self.class_input = (array([x for x, y in zip(X, Y) if y == c]) for c in C)
        self.source1     = ColumnDataSource(data=dict(x       =x1_span,                    y       =x2_span,
                                                      y1_left =[x2_span_s[0], x2_span[1]], y2_left =[x2_span_s[1], x2_span[1]],
                                                      y1_right=[x2_span[0], x2_span_s[0]], y2_right=[x2_span[0], x2_span_s[1]]),
                                                      tags=[X1_span, X2_span, colors_lr])
        self.source2     = ColumnDataSource(data=dict(x=I, y=L), tags=[[i for i in I], L])
        self.x_range     = Range1d(*x1_range, bounds="auto")
        self.y_range     = Range1d(*x2_range, bounds="auto")

    def show(self):
        show(self.layout)

# Train perceptron and plot results
## Test dataset
On this dataset, the perceptron converged quickly.
The adaptation rule led to the global minimum of the loss because the classes are clearly separated from each other.

One can play with the parameters, such as the learning rate and the maximum number of epochs, to check different results.

In [6]:
inputs1, labels1 = load_data('./datasets/dataset1.txt')

perceptron1 = Perceptron(input_length=ndims(inputs1), learning_rate=0.001, max_epochs=5e2)
perceptron1.train(inputs1, labels1)

figure1 = Figure_neuron(perceptron1)
figure1.show()

## Dataset 2
This time, the class distributions overlap, which means that points from one class are mixed with points from the other class.
Since the perceptron can only separate classes linearly, the algorithm was not able to converge.
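
The effect of non-separable data can be reproduced on a much smaller example than the dataset used here. The sketch below is an illustration under assumptions, not the notebook's training code: it runs a plain online perceptron (zero initial weights, learning rate 1) on the XOR problem, which no single line can separate, and records the number of misclassified points per epoch. All names are hypothetical.

```python
import numpy as np

X = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]])
y = np.array([0, 1, 1, 0])                 # XOR labels
Xb = np.insert(X, 0, -1.0, axis=1)         # prepend the bias input of -1
w = np.zeros(3)

misclassified_per_epoch = []
for epoch in range(50):
    outputs = np.where(Xb @ w < 0, 0, 1)
    misclassified_per_epoch.append(int(np.sum(outputs != y)))
    for x_i, y_i in zip(Xb, y):
        out = 0 if x_i @ w < 0 else 1
        w = w + (y_i - out) * x_i          # adaptation rule, learning rate 1

# The error count never reaches zero, so a loss-based stop is never triggered
never_converged = min(misclassified_per_epoch) > 0
```

Since at least one point is always on the wrong side of the line, training only ends when the epoch limit is reached.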

In [7]:
inputs2, labels2 = load_data('./datasets/dataset2.txt')

perceptron2 = Perceptron(input_length=ndims(inputs2), learning_rate=0.0001, max_epochs=2000)
perceptron2.train(inputs2, labels2)

figure2 = Figure_neuron(perceptron2)
figure2.show()

## Dataset 3
The classes in this dataset have concurrent mean positions and variances. The overlap and non-linearity of the data lead to oscillating results, preventing the perceptron from converging.

In [8]:
inputs3, labels3 = load_data('./datasets/dataset3.txt')

perceptron3 = Perceptron(input_length=ndims(inputs3), learning_rate=0.001)
perceptron3.train(inputs3, labels3)

figure3 = Figure_neuron(perceptron3)
figure3.show()

## Dataset 4
This dataset has a layout similar to the previous one, but with higher variance. Again, the overlap and non-linearity of the data lead to oscillating results, preventing the perceptron from converging.

In [9]:
inputs4, labels4 = load_data('./datasets/dataset4.txt')

perceptron4 = Perceptron(input_length=ndims(inputs4), learning_rate=0.001)
perceptron4.train(inputs4, labels4)

figure4 = Figure_neuron(perceptron4)
figure4.show()

## Dataset 5
The result obtained from this dataset is rather curious. Instead of finding a possible separation of the classes, the algorithm led to a separation line that connects the centers of both classes.
As a consequence, this local minimum prevents the perceptron from progressing toward convergence, leaving it stuck in a non-optimal solution.

In [10]:
inputs5, labels5 = load_data('./datasets/dataset5.txt')

perceptron5 = Perceptron(input_length=ndims(inputs5), learning_rate=0.001, max_epochs=5e2)
perceptron5.train(inputs5, labels5, shuffle_data=True)

figure5 = Figure_neuron(perceptron5)
figure5.show()

## Dataset 6
The following data has a layout similar to the last one, but with a higher spread. The separation found by the perceptron was also similar: it got stuck in a local minimum.

In [11]:
inputs6, labels6 = load_data('./datasets/dataset6.txt')

perceptron6 = Perceptron(input_length=ndims(inputs6), learning_rate=0.001, max_epochs=5e2)
perceptron6.train(inputs6, labels6, shuffle_data=True)

figure6 = Figure_neuron(perceptron6)
figure6.show()